Re: [boost] Formal Review: Indexed Set

30 Mar 2004

      Hello Rob, thank you for reviewing indexed_set.

Rob Stewart ha escrito:

[snipped stuffed not needing further comments]
...
* Do you think the library should be accepted as a Boost library?
Yes, Boost should accept this library.
Fine! I'm glad you like the library. Thank you.
...
----------------------
Documentation Issues.
The main page mentions "STL-like indices," but I don't think of
the Standard Library as providing indices, so this doesn't mean
anything to me.
Well the meaning is that indices (italicized as the term is first
introduced) have STL-like characteristics. I'd be happy to rewrite this
in a cleaner way if only you provided me with a different wording.
...
While it will complicate matters somewhat, I'd prefer to see the
examples written as if "namespace is = boost::indexed_sets" (or
similar) were in effect.  That would highlight the functionality
not already in namespace boost.
I don't quite get it. The docs are written as if the library is
already in Boost. Other tutorials in Boost also apply
the implicit using policy.
...
"Lookup" is only one word when it is used as a noun.  When used
as a verb, it is spelled "look up."
Yep, right. Fixed, thank you.
...
The initial discussion of "regular indices" in the tutorial gave
me no idea what they were.  Even after the example, I was left
wondering what that "regular index" thing actually was.  I didn't
see anything called "regular index" in the example code.  It
wasn't until the "Index types" section that I find the definition
of a "regular index."
I'm not sure what I could do to improve this part. Any specific
suggestion. I thought it'd be clear indices are wht one specifies
with unique<> etc.
...
Your discussion of modify_key doesn't mention any impact on the
Index Set should the new key change the sort order or violate a
unique index.
OK. Added a phrase stating this. In case it wasn't clear to you,
modify_key has the same semantics as modify
(collision->element is erased).
...
I didn't find the use of "far pointer" helpful.  It connotes the
pointer types germane to segmented memory architectures (old
Intel, for example).  Since the only thing you're trying to
convey is that the pointer-like type indexed_set may see chains
to another pointer-like type which may chain to yet another, ad
infinitum, may I suggest "chained pointer" instead?
Pavel also hated the name for the same connotation problems.
Well, I'll add it to my list, will change it if I no one else
complains.
...
Your "safe mode" name is inconsistent with the
"invariant-checking mode."  The latter clearly indicates what is
being checked, whereas the other doesn't.
I retained the name "safe mode" because users are likely
familiar with the name as given in STLport.
...
I expected that "safe
mode" would be a superset, but it is orthogonal.  How about
naming it "precondition-checking mode?"  A related issue is that
you say that invariant-checking can be turned on separately from
"safe mode" and that the former indicates an internal error.
However, I don't see how you can assert that the failure to
maintain an invariant is an internal error if you don't ensure
the preconditions were met.  IOW, shouldn't turning on
invariant-checking mode imply turning on precondition-checking?
I'd like to give maximum flexibility to the user. Your reasoning
is correct, but then again you can follow your own advice in
your program and never enable invariant-checking without
enabling safe mode also. I guess you're for stricter enforcing
of this.
...
In Debugging support:Invariant-checking mode, in the advanced
topics page, you write:
...
It is recommended that users of Boost.IndexedSet always set the
invariant-checking mode. By default (i.e. if the user has not
provided a definition for BOOST_INDEXED_SET_INVARIANT_ASSERT),
this mode will not perform any check at all in NDEBUG builds,
but it does not have a significant impact in the compiled
binary either, so you need not set off invariant-checking for
release compilations.
I have two questions.  First, why is this information buried in
the advanced topics page?  The mode should be highlighted in the
main documentation path.
I thought the tutorial was hard enough as it was and decided to
move all the non-essential stuff to a different page. Admittedly,
other topics covered in this section are really much more advanced.
...
Second, if the overhead is so small,
why not leave it on by default in all builds and then give users
a means to disable it?  Users can then decide whether to disable
it in all builds or only release builds.
I think I'll remove this piece of advice: invariant-checking can be
left on in release builds, but it won't do anything unless
BOOST_INDEXED_SET_INVARIANT_ASSERT has been
defined by the user). To sum it up, it's a pretty useless recommendation:
withouth redefining the assert macro, users won't get anything by
leaving the mode set on in release mode.
...
Both the tutorial and the advanced topics pages claim
indexed_set's simulation of std::set, for example, is as
efficient as the real thing, yet the performance page shows a
10-20% overhead.  You should qualify your claims of equal
performance such that you say indexed_set has "nearly" as good
performance.
You're right. Added the "nearly" qualifier in the tutorial bit.
In the advanced topics section, the efficient is said to be
"comparable", which is IMHO fair enough.
...
----------------------
Naming Issues.
I, too, find "mix" too cute, so "multi_container" is better, but
that is rather indicative of implementation and doesn't clearly
convey the idea of multiply indexing data.  Thus, I'm in favor
of "multi_index," but the most recent "multi-index container"
notion may be the best course to remove all confusion.
I don't like the "non_unique" stuff.  I wonder whether it could
be dropped altogether, leaving "unique" as a modifier when that
property is present.  Otherwise, how about "dup" (short for
duplicate, in case it wasn't obvious) as a modifier?  (You could
also use "multi" to mirror the idea behind multiset and
multimap.)
I don't like the idea of having "_unique" and no counterpart
for non-unique indices (this was discussed here months ago,
and the (IMHO correct) conclusion is that both variants
should have a qualifying suffix. I'm for descriptive names,
so dup seems just too obscure. non_unique resonates well
with people into RDBMS stuff.
...
I think that "get<>" should be named "get_index<>" as that it
will read better in use:
... = es.get_index<1>();
"member" and "const_mem_fun" seem to overlap based soley on the
names.  I suggest that "member" be renamed "data_member" to avoid
confusion regarding the use of "member" to invoke a member
function.
Thanks for all comments regarding naming. I'll try to balance everybody's
opinions and come up with some final proposal to post to the list.
get_index<> I like, though get<> was chosen for sympathy with Boost.Tuple
(similar names for similar concepts.)

[...]
...
----------------------
Design Issues
I don't really like that index 0 is the default behavior.  I'd
rather an Index Set not behave like any of its indexes.  Yes that
means you must say get(_index)<0>() when you want the first
index, but I like the idea of that being explicit.  There's no
opportunity to change a typedef from std::list, for example, to
boost::indexed_set and silently use the wrong index or silently
change the complexities.  I don't see indexed_set as a drop-in
replacement for existing types, so I don't want it to be easy to
do that.
Darren also disliked this approach. To me it has some value:
in a sense, the first index is treated as the primary one (which
in many situations really is). Put another way, what disadvantages
do you find in having the default behavior?
...
I don't like the assymmetry between get<> and the pair
nth_indexed_type<> and indexed_type<>.  I'd prefer that
indexed_type<> be used, like get<>, for either a number or a tag.
I'd prefer it also, but AFAIK it cannot be done. You just cannot
"overload" a template to accept type and non-type arguments.
...
Why does tag<> accept multiple "names?"  Is there a motivating
example for why a single name isn't sufficient?  I ask because
the class is more complicated than it otherwise needs to be.
That translates into more documentation to explain it and longer
-- by however much -- compilation times.
No movitating example that I know of. It was implemented like that
cause having multiple tags was no harder than having just one.
...
----------------------
Documentation Nitpicks
[...]

Man, you did a *thorough* proofreading. Thanks so much for the
nitpicking, I'll validate and apply your changes.

Joaquín M López Muñoz
Telefónica, Investigación y Desarrollo

Re: [boost] Formal Review: Indexed Set

Joaquín Mª López Muñoz