
Hello David, thanks for the effort of reviewing indexed_set (and seems you made a thorough review indeed!) "David B. Held" ha escrito:
*** What is your evaluation of the design?
[...]
It's too bad that you have to specify the type of the member when using member<>, but I don't see a way to infer it without having measurable runtime overhead.
This is a recurrent request, but afaik noone has come up with a solution.
The range() member is very clever and is a good use case for Boost.Lambda, IMO.
Credit here goes entirely to Pavel, who pushed me to include this memfun --I was reluctant to add non-standard features.
It seems to me that modify() need not have the destructive policy that it does. It appears that modify() erases the node from the index, then reinserts it into the appropriate place. However, it seems to me that modify() could first look to see if the modified node already exists, save the iterator to the correct insertion point, and then only unlink the node if the modification would be succesful.
So the strategy would be like this:
1) Change the modifier function object to require a value() member that gives the new value to be used. 2) Instead of calling mod() right away, call something like lower_bound(), using mod.value(). Save the iterator if there isn't already an element there that conflicts. 3) Perform the erase as usual, and then do mod(), link2(), etc.
Ummm... If I followed your approach I would be incurring two extra copies which is precisely what modify() is meant to avoid. For non-destructive updating you can use update(), that's the idea. If I'm not making myself clear please tell me so.
One thing that I think should be added (though not affecting my vote) is ranked indexes. So just as an index might be unique or non-unique, it can be ranked or not ranked. Ranking gives an index an O(log n) ordinal search. That is, you can search for the ith item according to a certain index in O(log n) time, given an ordered index (obviously, the sequenced indexes would have an O(n) search time). This does change the structure of the nodes, and necessarily makes them bigger for ranked indexes. So it is not a trivial change. However, I am sure that it is doable and would add value to the library.
Yep. The idea of ranked containers came up recently, as introduced by Peter Palotas. I'll add this to the future work section. The most wanted new features seem to be (in this order) serialization, hashed indices and ranked indices.
*** What is your evaluation of the implementation?
[...]
Some of the lines are > 80 columns (e.g.: the copyright notice).
Is this strictly required? If so I'll go thru the code and make the changes. If not, I'd rather leave it like that.
I think that BOOST_NO_MEMBER_TEMPLATES should be deprecated. I don't think there's a single supported compiler in the Boost compiler configs that defines this any more. I think checking against BOOST_MSVC6_MEMBER_TEMPLATES is sufficient, but it would be nice to not even have to check against that.
<off topic> It'd be nice if Boost.Config provided a (possibly automatically generated) reference table showing which compilers do have a specific defect macro.</off topic> Would it be OK to you if we delayed the remotion of this macro till the next dev cycle? I fear I could be inadvertently missing an otherwise supportable compiler.
I do believe the parts that come from other code need to retain the original copyright. The parts that were inspired by, but not directly derived from, another implementation probably don't need a copyright notice, but should probably give a mention in the copyright, at the least as a matter of professional courtesy.
No problem. Apart from stl_set.h I borrowed nothing from other sources. Will add the (c) where appropriate.
*** What is your evaluation of the documentation?
[...]
Here is a somewhat minor nitpick. In the tutorial, the index_list is specified as so:
(type of index)<[(tag)[,(key extractor)[,(comparison predicate)]]]> (type of index)<[(key extractor)[,(comparison predicate)]]>
I would change it to thus:
(unique | non_unique)<[(tag)[,(key extractor)[,(comparison predicate)]]]> (unique | non_unique)<[(key extractor)[,(comparison predicate)]]>
Yes, why not. Consider it changed.
It would be nice to be able to read through the documentation like a manual. Thus, next links on each page would be appropriate.
On the performance page, you state that the tests were performed on a system with 260 KB "RAM memory", running W2K. I'm absolutely certain that is not correct. ;) "256 MB RAM" sounds a lot more likely.
You're absolutely correct. Fixed. Thanks.
Something that I think should be added to the documentation is a concept page that lists the main concepts in IndexedSet and a concept interface a la the SGI STL reference (www.sgi.com/tech/stl).
More on this later down the post.
*** What is your evaluation of the potential usefulness of the library?
[...]
*** Did you try to use the library? With what compiler?
[...]
*** How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
[...]
*** Are you knowledgeable about the problem domain?
[...]
*** Do you think the library should be accepted as a Boost library?
Yes, most definitely.
Great! The score is 9-0 so far.
*** Naming issues:
Here's my take on names, but should not be construed as affecting my vote for inclusion...
Namespace: I think too many namespace levels are not helpful. A separate 'container' namespace seems a bit superfluous to me. There is no danger in name clashes within Boost between container libraries and other kinds of libraries that I can see, and I don't see that it provides a significant documentation benefit. I think additional namespace depth is most appropriate for collections of algorithms, where a proliferation of names is more likely to lead to name collisions.
Well, I stated previously that I'll do what people agree on. I don't know if the final decision should be made by Pavel as the review manager. IMHO, qualified identifiers will be so long anyway that having an additional ::container:: won't hurt much.
Library name: I don't think IndexedSet is so bad, but an alternative that I like is CompositeSet. Then there could be a Composite Sorted Associative Container concept which would support the notion of containers with multiple keys. Then CompositeSet would be a model of Composite Sorted Associative Container. A synonym for Sorted Associative Container could be Simple Sorted Associative Container or Trivial Sorted Associative Container.
The main problem with "Set" slipping into the name of the container is that, for instance, the following indexed_set<int,index_list<sequenced<> > > does not have anything to do with sets (it resembles a std::list of constant ints.) For this reason I like Multicontainer better (or something along this line.) <off topic>Reviewers have showed little or no interest in sequenced indices. Seems weird to me, as they can be used to construct a seemingly useful list-like container with fast lookup.</off topic> As for the Composite Sorted Associative Container concept, we have the same problem. Technically, the requirements imposed to current or future index types by indexed_set are much weaker than being some sort of associative container. I cannot even state global complexity bounds (please check the reference, page "Index reference", section "Complexity signature"). I agree with you a concept section would be highly desirable, but here I definitely need some help. What you request is not trivial.
Indexes: According to the American Heritage Dictionary, Fourth Edition, as channeled by www.dictionary.com, "indexes" is preferred to "indices". Another spelling metric I like to use is Google. According to Google, "indexes" returns 5.8 million hits, while "indices" returns 5.5 million hits. So that's only a 5% difference, but it's statistically significant. ;) I personally think the preferred spelling should be used throughout the documentation, but that's more of a nitpick than a criticism.
My Latin (I speak Spanish) heritage makes me favor "indices" to "indexes" (and "minuscule" to "miniscule", etc.) The Google metric could be biased by the fact that "indexes" is not only the plural of "index" but also the 3rd singular formal of the Present tense of the verb "index". May people allow me to retain "indices"? If only as a small concession to those of use who studied and love the olde Latin language.
Anyway, I prefer "ordered index" to "regular index". As far as index types go, I prefer "ordered_unique" to "set_like", due to consistency. I think Ordered Index should be a concept for this library.
Others do also like ordered_unique etc. I do too.
nth_index_type: I don't like the word "nth", personally. I suggest either of the following naming schemes:
nth_index_type -> ordinal_index index_type -> named_index | tagged_index
nth_index_type -> index_byval | index_bynum index_type -> index_byname | index_bytag
Generalize appropriately to the iterator types.
I recently proposed the following template<int N> struct nth_index; template<typename Tag> struct index; template<int N> struct nth_index_iterator; template<int N> struct nth_index_const_iterator; template<typename Tag> struct index_iterator; template<typename Tag> struct index_const_iterator; which seems to me good enough. Don't you like it? The nth_ prefix has some tradition in C++, why not use it here too?
update(): Actually, I do prefer "replace()" over "update()", because it does more accurately reflect the operation.
Me too.
As far as header organization goes, I would think of it this way: What are the major features and concepts of the library, and how will they be used separately or together? Organize the headers in a way that allows users to include feature sets as modules. If you have to, create dummy headers that just include sets of headers that are logically dependent so that users can activate a certain feature by including that header. Also, consider the cost of each header. It is worth the extra effort to isolate expensive headers than cheap ones. Other than that, I don't have any particular suggestions.
Well, I'm gathering all the feedback from the reviewers wrt to naming and header organization and will try to come up with some consistent proposal today or tomorrow. Thank you again for taking the effort to review the library. I hope you can find it useful in the future. (Here goes a challenge: what about writing a multiply indexed map on top of indexed_set? You're the map expert here at Boost ;) Please check the Future work section for a preliminary discussion on the subject of indexed maps.) Best, Joaquín M López Muñoz Telefónica, Investigación y Desarrollo