[review][autoindex] AutoIndex review extended

Hello all,
The AutoIndex review has been extended for another week and now runs until Saturday 21st May. This will hopefully give people who have been busy preparing for BoostCon a chance to review it in a quiet moment.
I'm also interested in hearing from people who won't be using the tool directly, but might be using the indexes it generates. Or perhaps already are, since it's already been used for Boost.Spirit, Boost.Type Traits and Boost.Math:
http://www.boost.org/libs/spirit/
http://www.boost.org/libs/type_traits/
http://www.boost.org/libs/math/doc/sf_and_dist/html/
It doesn't need to be a long or in-depth review to be useful.
The documentation is online at: http://svn.boost.org/svn/boost/sandbox/tools/auto_index/doc/html/index.html
AutoIndex can be downloaded from the vault: http://www.boostpro.com/vault/index.php?action=downloadfile&filename=auto_index-0.9.zip
Thanks,
Daniel James
AutoIndex review manager

Hi John,
I'm also interested in hearing from people who won't be using the tool directly, but might be using the indexes it generates. Or perhaps already are, since it's already been used for Boost.Spirit, Boost.Type Traits and Boost.Math:
http://www.boost.org/libs/spirit/ http://www.boost.org/libs/type_traits/ http://www.boost.org/libs/math/doc/sf_and_dist/html/
It doesn't need to be a long or in depth review to be useful.
It is working well and generates an impressive index. But... it is actually working a little too well.
This temporary site <http://www.xs4all.nl/%7Ebarend/boost.geometry/qbk/libs/geometry/doc/html_ai/index.html> contains the updated Boost.Geometry doc including the "index" entry. This is the rough form; I haven't fine-tuned the index yet.
I've noticed in the docs how to exclude a term from a section, or limit it to a section. But in my case I would probably have to exclude each term from a section, and more...
1) I want to exclude a whole section (the reference matrix, for obvious reasons). Can that be done?
2) I also want to exclude all examples (nearly every page in Boost.Geometry will have one; these examples of course also include other "terms", which should not be indexed on that page). Is there a way to mark a qbk-block to be excluded from being indexed?
Thanks, Barend

1) I want to exclude a whole section (the reference matrix, for obvious reasons). Can that be done?
No, I guess that's a feature request then ;-)
2) I also want to exclude all examples (nearly every page in Boost.Geometry will have one; these examples of course also include other "terms"; which should not be indexed on that page). Is there a way to mark a qbk-block to be excluded from being indexed?
Not at present no.... I'm not really sure how one would even do that, there would have to be some kind of docbook XML container that was used to represent "don't index this block". Can you provide an example of a spurious entry? It could be that tweaking the scanning regular expressions used could fix this. John.

Hi John, On 14-5-2011 10:27, John Maddock wrote:
1) I want to exclude a whole section (the reference matrix, for obvious reasons). Can that be done?
No, I guess that's a feature request then ;-)
Right ;-)
2) I also want to exclude all examples (nearly every page in Boost.Geometry will have one; these examples of course also include other "terms"; which should not be indexed on that page). Is there a way to mark a qbk-block to be excluded from being indexed?
Not at present no.... I'm not really sure how one would even do that, there would have to be some kind of docbook XML container that was used to represent "don't index this block". Can you provide an example of a spurious entry? It could be that tweaking the scanning regular expressions used could fix this.
I understand. After more thought, maybe it is not so bad that the samples are indexed, because the indexed terms are really shown there in their context.
There were some spurious index terms such as "p", "r", "for", probably because of the samples, and they could be turned off as documented.
Then there are spurious terms such as "point_type": in the examples I often typedef a point as "point_type", but that term also exists as a regular, indexable entry. They either have to be turned off manually, or turned on in specific sections, or I have to rename them to e.g. ptype in the examples (which might actually be better).
But turning them off for specific sections does not work for me; probably I'm doing something wrong. I define:
point_type "" "(?!geometry.reference.adapted.register.*).*"
to omit it from all sections starting with reference.adapted.register, but point_type still appears there. I copied and pasted it from the doc. I added !debug regular-expression, but I don't see anything in the log. I'm not a regex expert and don't see what is wrong here.
Another question about this: is it possible to exclude a term twice? So e.g.
point_type "" "(?!geometry.reference.adapted.register.*).*"
point_type "" "(?!geometry.reference.exclude_also_from_this.*).*"
Or do I have to write a more complex regex for this?
Thanks, Barend

On 14-5-2011 12:53, Barend Gehrels wrote:
I added !debug regular-expression , I don't see anything in the log.
Oops, I misunderstood this, sorry. I thought that any regex would be debugged. But after re-reading, I noticed that of course you have to add a regex there. So I added
!debug point_type
and it gives me:
Debug term found, in block with ID: geometry.reference.adapted.register.boost_geometry_register_box_2d_4values
Current section title is: BOOST_GEOMETRY_REGISTER_BOX_2D_4VALUES
The main index entry will be : BOOST_GEOMETRY_REGISTER_BOX_2D_4VALUES
The indexed term is: point_type
The search regex is: \<point_type\>
The section constraint is: (?!geometry.reference.adapted.register.*).*
The index type for this entry is:
But this block is exactly the same as for sections where the term should show up:
Debug term found, in block with ID: geometry.design
Current section title is: Design Rationale
The main index entry will be : Design Rationale
The indexed term is: point_type
The search regex is: \<point_type\>
The section constraint is: (?!geometry.reference.adapted.register.*).*
The index type for this entry is:
Barend
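
For illustration, the two debug blocks above can be checked by hand. The following is a minimal sketch in Python, not AutoIndex's own code. It assumes that the section constraint must match the whole block ID, and it substitutes Python's \b for the \< and \> word anchors, which Python's re module does not support. The function name should_index is hypothetical.

```python
import re

# Search regex from the debug output; \b stands in for \< and \>,
# which Python's re module does not support.
TERM = re.compile(r"\bpoint_type\b")

# Section constraint exactly as written in the index script.
CONSTRAINT = re.compile(r"(?!geometry.reference.adapted.register.*).*")


def should_index(section_id: str, text: str) -> bool:
    """Hypothetical check: the term occurs in the text AND the section
    constraint matches the complete section ID (an assumption)."""
    return bool(TERM.search(text)) and CONSTRAINT.fullmatch(section_id) is not None


sample = "typedef model::point point_type;"
register_id = ("geometry.reference.adapted.register."
               "boost_geometry_register_box_2d_4values")

print(should_index(register_id, sample))        # constraint rejects this ID
print(should_index("geometry.design", sample))  # constraint accepts this ID
```

Under those assumptions the constraint rejects the register section and accepts geometry.design, which is the behaviour the script was written to get.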

Not at present no.... I'm not really sure how one would even do that, there would have to be some kind of docbook XML container that was used to represent "don't index this block". Can you provide an example of a spurious entry? It could be that tweaking the scanning regular expressions used could fix this.
I understand. After more thought, maybe it is not so bad that the samples are indexed. Because the indexed terms are really shown there in their context.
There were some spurious index terms such as "p", "r", "for", probably because of the samples, and they could be turned off as documented.
Then there are spurious terms such as "point_type": in the examples I often typedef a point as "point_type", but that term also exists as a regular, indexable entry. They either have to be turned off manually, or turned on in specific sections, or I have to rename them to e.g. ptype in the examples (which might actually be better).
Hmmm, the trouble is, assuming it's searching for something along the lines of
typedef something point_type;
Then that will occur both where you want it to be indexed (class definitions) and where you don't (example code).
So that leaves you with two options - either a section constraint (see my comments below), or exclude that term altogether and add a manual index entry for it by escaping to XML and adding the necessary <indexterm>'s. I accept that's a touch hardcore though!
But turning them off for specific sections does not work for me, probably I do something wrong. I define:
point_type "" "(?!geometry.reference.adapted.register.*).*"
to omit it from all sections starting with reference.adapted.register, but the point_type still appears there. I copied and pasted it from the doc. I added !debug regular-expression , I don't see anything in the log. I'm not a regex-expert and don't see what is wrong here.
Nor do I, the debug info in the other mail suggests it should not be indexed, so I can't see what's wrong.... can you let me have your index-script file so I can try it here?
Another question about this: is it possible to exclude a term twice? So e.g.
point_type "" "(?!geometry.reference.adapted.register.*).*"
point_type "" "(?!geometry.reference.exclude_also_from_this.*).*"
Ah, you can do that, but it takes the union (logical or) of the two regexes, and that's not what you want here, which is more akin to a logical and.
Or do I have to write a more complex regex for this?
Nod, something like:
"(?!geometry.reference.adapted.register.*|geometry.reference.exclude_also_from_this.*).*"
I guess I should make this clearer in the docs...
HTH, John.
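
To make the union-versus-and point concrete, here is a minimal sketch in Python rather than Boost.Regex. It assumes each constraint must match the complete section ID, and that a term listed twice is indexed when any one of its constraints matches, as described above. The helper allowed_by_any is hypothetical and is not part of AutoIndex.

```python
import re

A = r"(?!geometry.reference.adapted.register.*).*"
B = r"(?!geometry.reference.exclude_also_from_this.*).*"
# Single constraint with both prefixes inside one negative lookahead.
COMBINED = (r"(?!geometry.reference.adapted.register.*"
            r"|geometry.reference.exclude_also_from_this.*).*")


def allowed_by_any(section_id, constraints):
    """Hypothetical model of the union: index the term if ANY of the
    constraints matches the whole section ID."""
    return any(re.fullmatch(c, section_id) for c in constraints)


register_id = "geometry.reference.adapted.register.some_macro"
other_id = "geometry.reference.exclude_also_from_this.page"

# Two separate entries: B accepts the register section, so the union
# lets it through even though A rejects it.
print(allowed_by_any(register_id, [A, B]))
# One combined entry: both prefixes are rejected, everything else passes.
print(allowed_by_any(register_id, [COMBINED]))
print(allowed_by_any(other_id, [COMBINED]))
print(allowed_by_any("geometry.design", [COMBINED]))
```

Each separate pattern excludes only its own prefix and accepts everything else, so taking the union re-admits exactly the sections the other pattern was meant to drop. Putting the alternation inside a single lookahead avoids that.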

Hi John, On 15-5-2011 11:43, John Maddock wrote:
Not at present no.... I'm not really sure how one would even do that, there would have to be some kind of docbook XML container that was used to represent "don't index this block". Can you provide an example of a spurious entry? It could be that tweaking the scanning regular expressions used could fix this.
I understand. After more thought, maybe it is not so bad that the samples are indexed. Because the indexed terms are really shown there in their context.
There were some spurious index terms such as "p", "r", "for", probably because of the samples, and they could be turned off as documented.
Then there are spurious terms such as "point_type": in the examples I often typedef a point as "point_type", but that term also exists as a regular, indexable entry. They either have to be turned off manually, or turned on in specific sections, or I have to rename them to e.g. ptype in the examples (which might actually be better).
Hmmm, the trouble is, assuming it's searching for something along the lines of
typedef something point_type;
Then that will occur both where you want it to be indexed (class definitions) and where you don't (example code).
So that leaves you with two options - either a section constraint (see my comments below), or exclude that term altogether and add a manual index entry for it by escaping to XML and adding the necessary <indexterm>'s. I accept that's a touch hardcore though!
I would not prefer the last option indeed...
But turning them off for specific sections does not work for me, probably I do something wrong. I define:
point_type "" "(?!geometry.reference.adapted.register.*).*"
to omit it from all sections starting with reference.adapted.register, but the point_type still appears there. I copied and pasted it from the doc. I added !debug regular-expression , I don't see anything in the log. I'm not a regex-expert and don't see what is wrong here.
Nor do I, the debug info in the other mail suggests it should not be indexed, so I can't see what's wrong.... can you let me have your index-script file so I can try it here?
It is attached. I tried some other things and can't get it working, either by including those sections or by excluding them. Note that to build our documentation you will need some additional steps (we convert from Doxygen to Qbk, and the converted files are not in SVN). If you want, I can send a zip with the generated content off-list, so that calling bjam is sufficient.
Another question about this: is it possible to exclude a term twice? So e.g.
point_type "" "(?!geometry.reference.adapted.register.*).*"
point_type "" "(?!geometry.reference.exclude_also_from_this.*).*"
Ah, you can do that, but it takes the union (logical or) of the two regexes, and that's not what you want here, which is more akin to a logical and.
Or do I have to write a more complex regex for this?
Nod, something like:
"(?!geometry.reference.adapted.register.*|geometry.reference.exclude_also_from_this.*).*"
I guess I should make this clearer in the docs...
That would be useful. Regards, Barend

to omit it from all sections starting with reference.adapted.register, but the point_type still appears there. I copied and pasted it from the doc. I added !debug regular-expression , I don't see anything in the log. I'm not a regex-expert and don't see what is wrong here.
Nor do I, the debug info in the other mail suggests it should not be indexed, so I can't see what's wrong.... can you let me have your index-script file so I can try it here?
It is attached. I tried some other things, and don't get it working, either by including those sections, or by excluding them.
Reproduced - it's a bug, and turned out to be a simple regular expression usage error, much to my embarrassment!! :-( Fixed in the sandbox, also updated the tests. HTH, John.

Hi John, On 14-5-2011 10:27, John Maddock wrote:
2) I also want to exclude all examples (nearly every page in Boost.Geometry will have one; these examples of course also include other "terms"; which should not be indexed on that page). Is there a way to mark a qbk-block to be excluded from being indexed?
Not at present no.... I'm not really sure how one would even do that, there would have to be some kind of docbook XML container that was used to represent "don't index this block".
A bit of hacking could enable this. If I add literal docbook-entries in QuickBook, like this:
''' <para role="auto-index-skip-begin" /> '''
[heading Example] [box_view] [box_view_output]
''' <para role="auto-index-skip-end" /> '''
the para entries are included in my final docbook XML, in the right places, like this: <para><para role="auto-index-skip-begin"/></para>
They are empty and have no visual effect, so they could be used for skipping. Of course, it is not really beautiful, but it would work. I tried several other things (begin para, end para, or an invisible section) but those do not work. Using some other entries (I tried "markup" the same way) will work.
So with this, we could create non-indexable parts in QuickBook, for examples or otherwise.
Using QuickBook I could create a section with an id "box_view_example_skip_autoindex" and that would work as well, but it shows up in the page differently, because the section will have an entry in the hierarchy. I don't know if there are other containers available.
I will respond to your recent reply later.
Regards, Barend
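
As a rough illustration of how a scanner could honour such markers, here is a minimal sketch in Python using xml.dom.minidom. It is not AutoIndex's implementation. The sample document, its element contents, and the function text_outside_markers are made up for the example; only the two role values come from the message above.

```python
from xml.dom import minidom, Node

# Hypothetical DocBook fragment with the marker paragraphs as QuickBook
# would emit them (an empty <para> nested inside a <para>).
DOC = """<section id="geometry.reference.views.box_view">
  <para>box_view makes a box behave like a ring.</para>
  <para><para role="auto-index-skip-begin"/></para>
  <programlisting>typedef model::point point_type;</programlisting>
  <para><para role="auto-index-skip-end"/></para>
  <para>See also segment_view.</para>
</section>"""


def text_outside_markers(node, state=None):
    """Collect text in document order, dropping text between the markers."""
    if state is None:
        state = {"skipping": False, "chunks": []}
    for child in node.childNodes:
        if child.nodeType == Node.ELEMENT_NODE:
            role = child.getAttribute("role")  # "" when the attribute is absent
            if role == "auto-index-skip-begin":
                state["skipping"] = True
            elif role == "auto-index-skip-end":
                state["skipping"] = False
            else:
                text_outside_markers(child, state)
        elif child.nodeType == Node.TEXT_NODE and not state["skipping"]:
            state["chunks"].append(child.data)
    return "".join(state["chunks"])


doc = minidom.parseString(DOC)
text = text_outside_markers(doc)
print(text)
```

Because the skip flag is carried across sibling elements, the markers work even though they do not enclose the example in the XML tree. A begin marker with no matching end would silently suppress the rest of the document.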

2) I also want to exclude all examples (nearly every page in Boost.Geometry will have one; these examples of course also include other "terms"; which should not be indexed on that page). Is there a way to mark a qbk-block to be excluded from being indexed?
Not at present no.... I'm not really sure how one would even do that, there would have to be some kind of docbook XML container that was used to represent "don't index this block".
A bit of hacking could enable this.
If I add literal docbook-entries in QuickBook, like this:
''' <para role="auto-index-skip-begin" /> '''
[heading Example] [box_view] [box_view_output]
''' <para role="auto-index-skip-end" /> '''
The para entries are included in my final docbook XML, in the right places, like this: <para><para role="auto-index-skip-begin"/></para>
They are empty and have no visual effect, so they could be used for skipping. Of course, it is not really beautiful, but it would work. I tried several other things (begin para, end para, or an invisible section) but those do not work. Using some other entries (I tried "markup" the same way) will work.
So with this, we could create non-indexable parts in QuickBook, for examples or otherwise.
Using QuickBook I could create a section with an id "box_view_example_skip_autoindex" and that would work as well, but it shows up in the page differently, because the section will have an entry in the hierarchy. I don't know if there are other containers available.
If I go this route, I'd rather use docbook processing instructions: http://www.sagehill.net/docbookxsl/ProcessingInstructions.html
I guess these could be arbitrarily complex, but I'd prefer to keep it simple if possible, something like:
<?boost.ai exclude-enclosing ?>
To exclude the enclosing XML container from indexing.
I guess we could have start/stop instructions as well, but you can imagine the trouble they could cause if mis-matched or at different levels in the XML - I guess we'd have to say that they apply only until the enclosing XML scope is exited.
I think these are neater than abusing the XML structure, but would likely require modification to the Boostbook stylesheets (to pass them through unchanged), and also to the XML parser which I think currently ignores these...
John.
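
A minimal sketch of the "exclude enclosing container" idea, written in Python with xml.dom.minidom. This is only an illustration under stated assumptions and not how AutoIndex is implemented. The processing instruction is the one proposed above, while the sample document, the nested example section, and the helper functions are hypothetical.

```python
from xml.dom import minidom, Node

# Hypothetical DocBook fragment: the example lives in its own nested
# section, which carries the proposed processing instruction.
DOC = """<section id="geometry.box_view">
  <para>The box_view adapts a box as a ring.</para>
  <section id="geometry.box_view.example">
    <?boost.ai exclude-enclosing ?>
    <programlisting>typedef point point_type;</programlisting>
  </section>
</section>"""


def excluded_elements(root):
    """Return the nodes that directly contain the exclusion instruction."""
    excluded = set()

    def walk(node):
        for child in node.childNodes:
            if (child.nodeType == Node.PROCESSING_INSTRUCTION_NODE
                    and child.target == "boost.ai"
                    and child.data.strip() == "exclude-enclosing"):
                excluded.add(node)  # the enclosing container
            walk(child)

    walk(root)
    return excluded


def indexable_text(node, excluded):
    """Concatenate text nodes, skipping every excluded subtree."""
    if node in excluded:
        return ""
    if node.nodeType == Node.TEXT_NODE:
        return node.data
    return "".join(indexable_text(c, excluded) for c in node.childNodes)


doc = minidom.parseString(DOC)
text = indexable_text(doc, excluded_elements(doc))
print(text)
```

Tying the exclusion to the enclosing element means it ends automatically when that element closes, so there is no start/stop pair to mismatch. The cost is that the content to skip needs a container of its own.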

On 15-5-2011 17:07, John Maddock wrote:
2) I also want to exclude all examples (nearly every page in Boost.Geometry will have one; these examples of course also include other "terms"; which should not be indexed on that page). Is there a way to mark a qbk-block to be excluded from being indexed?
Not at present no.... I'm not really sure how one would even do that, there would have to be some kind of docbook XML container that was used to represent "don't index this block".
A bit of hacking could enable this.
If I add literal docbook-entries in QuickBook, like this:
''' <para role="auto-index-skip-begin" /> '''
[heading Example] [box_view] [box_view_output]
''' <para role="auto-index-skip-end" /> '''
The para entries are included in my final docbook XML, in the right places, like this: <para><para role="auto-index-skip-begin"/></para>
They are empty and have no visual effect, so they could be used for skipping. Of course, it is not really beautiful, but it would work. I tried several other things (begin para, end para, or an invisible section) but those do not work. Using some other entries (I tried "markup" the same way) will work.
So with this, we could create non-indexable parts in QuickBook, for examples or otherwise.
Using QuickBook I could create a section with an id "box_view_example_skip_autoindex" and that would work as well, but it shows up in the page differently, because the section will have an entry in the hierarchy. I don't know if there are other containers available.
If I go this route, I'd rather use docbook processing instructions: http://www.sagehill.net/docbookxsl/ProcessingInstructions.html
I guess these could be arbitrarily complex, but I'd prefer to keep it simple if possible, something like:
<?boost.ai exclude-enclosing ?>
To exclude the enclosing XML container from indexing.
I don't get this: how would that container be marked? The whole section? That is the only XML container there currently is... The sample is not contained in any container right now.
I guess we could have start/stop instructions as well, but you can imagine the trouble they could cause if mis-matched or at different levels in the XML - I guess we'd have to say that they apply only until the enclosing XML scope is exited.
The markers I gave were in a closed <para />, so different levels in XML is no problem. Besides that, there is also [section] [endsect] which should match as well. If auto-index is commonplace, QuickBook could be extended by something like [exclude_from_ai] ... [end_exclude_from_ai]
I think these are neater than abusing the XML structure, but would likely require modification to the Boostbook stylesheets (to pass them through unchanged), and also to the XML parser which I think currently ignores these...
I agree that it is not neat; it is a hack and an abuse of the XML structure. So if processing instructions work, I would like that to be included. Regards, Barend

On 15 May 2011 16:07, John Maddock <boost.regex@virgin.net> wrote:
I think these are neater than abusing the XML structure, but would likely require modification to the Boostbook stylesheets (to pass them through unchanged)
Sorry, I'd forgotten about that, just checked in the patch I wrote. Not sure how well it will work, so I won't merge to release for a while.

I think these are neater than abusing the XML structure, but would likely require modification to the Boostbook stylesheets (to pass them through unchanged)
Sorry, I'd forgotten about that, just checked in the patch I wrote. Not sure how well it will work, so I won't merge to release for a while.
Nod. Many thanks for committing that, will no doubt test it out post-review when I try and implement markup support for AutoIndex. John.
Participants (3): Barend Gehrels, Daniel James, John Maddock