[multi_index][hash][rfc] Hashed indices preview + Boost.Hash

Hello all, You can find a preview version of Boost.MultiIndex including hashed indices plus Daniel James' Boost.Hash library at the Sandbox Files area: multi_index_140205_plus_hash_part_1.zip multi_index_140205_plus_hash_part_2.zip (split in two for maximum size reasons.) There are a number of things we'd like to ask you feedback about: 1. Hashed indices are modeled according to the latest TR1 draft, with some significant diversions: 1.1. erase() does not throw ever, even it the user-provided Hash object throws. This is imposed by the general framework of Boost.MultiIndex. This goodie does not come for free: if the Hash object throws, erase() decays to a linear search. Not that there's any alternative, but I'd like to point it out. 1.2. insert() has strong exception safety (TR1 only requires basic.) Again, the way Boost.MultiIndex is designed forces this. 1.3. hash indices iterators are bidirectional. 1.4. Usual implementations use one pointer per bucket, whereas Boost.MultiIndex has two pointers per bucket. This addresses potential performance problems of insert() and erase() under low load conditions, as discussed in http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2005/n1756.pdf issue 6.14. Is this convenient? 2. The material is complete except for a section in the tutorial and possibly one more example (example 8 already exercises hashed indices.) Tests fully sport hashed indices. The reference is also updated. 3. All in all, I'd greatly appreciate your comments on this: suitability, documentation, performance, etc. I plan to commit this to the CVS if nobody objects in a reasonable amount of time (a couple of weeks.) I'm open to all kind of suggestions --I'm eager to recieve them, as it happens :) 4. As for Boost.Hash, this library by Daniel James' implements the ideas for std::tr1::hash (albeit in boost namespace) proposed by Peter Dimov in the aforementioned paper, issue 6.18. The idea is to provide a general purpose hash<> utility for use by * Boost.MultiIndex * TR1-based unordered containers (Daniel is working on this.) * Eventually, wrapping by Boost.TR1. 5. The library comes fully documented and tested, and in our view well prepared for uploading to the CVS. 6. Yet, this hasn't undergone any formal review, so some of you might (with reason) object to its being commited to the CVS. From our point of view, we have three valid alternatives: * Boost members agree to have it in CVS without more ado. * As this is used by Boost.MultiIndex, Boost.Hash is suitable for fasttrack review. * This is untolerable and the library should be push_back()'ed to the review queue. Meanwhile, Boost.Hash should live as an impl detail of Boost.MultiIndex. We hope you find the material useful and look forward to knowing your opinions about the issues brought forward. Best, Joaquín M López Muñoz Daniel James

1. Hashed indices are modeled according to the latest TR1 draft, with some significant diversions: 1.1. erase() does not throw ever, even it the user-provided Hash object throws. This is imposed by the general framework of Boost.MultiIndex. This goodie does not come for free: if the Hash object throws, erase() decays to a linear search. Not that there's any alternative, but I'd like to point it out.
Sounds reasonable, I wouldn't normally want erase to ever throw anyway, and there has been some discussion on the C++ committee list about requiring function hash object construct/copy/swap/invoke to not throw anyway, as otherwise some things are unreasonably hard to implement.
1.2. insert() has strong exception safety (TR1 only requires basic.) Again, the way Boost.MultiIndex is designed forces this.
Good, IMO.
1.3. hash indices iterators are bidirectional.
Good.
1.4. Usual implementations use one pointer per bucket, whereas Boost.MultiIndex has two pointers per bucket. This addresses potential performance problems of insert() and erase() under low load conditions, as discussed in http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2005/n1756.pdf issue 6.14. Is this convenient?
Well it's not unreasonable, you need what you need.
2. The material is complete except for a section in the tutorial and possibly one more example (example 8 already exercises hashed indices.) Tests fully sport hashed indices. The reference is also updated. 3. All in all, I'd greatly appreciate your comments on this: suitability, documentation, performance, etc. I plan to commit this to the CVS if nobody objects in a reasonable amount of time (a couple of weeks.) I'm open to all kind of suggestions --I'm eager to recieve them, as it happens :)
Sounds reasonable, no one normally objects to library updates - and actually since some folks have asked for a release soon, it's better done sooner rather than later, if you're going to do it at all.
6. Yet, this hasn't undergone any formal review, so some of you might (with reason) object to its being commited to the CVS. From our point of view, we have three valid alternatives: * Boost members agree to have it in CVS without more ado. * As this is used by Boost.MultiIndex, Boost.Hash is suitable for fasttrack review. * This is untolerable and the library should be push_back()'ed to the review queue. Meanwhile, Boost.Hash should live as an impl detail of Boost.MultiIndex.
I don't see how you can place it in a detail namespace if users are going to be expected to (possibly) provide their own specialisations of the hash function object. I've no objection to either a fast track review, or just adding to cvs, John.
participants (2)
-
Joaquín Mª López Muñoz
-
John Maddock