
On Tue, Jan 8, 2013 at 7:31 AM, Gottlob Frege <gottlobfrege@gmail.com>wrote: Basically both an order statistics tree and boost multi-index are examples
of containers with multiple indices. I suspect there may be interesting results from a compare/contrast type of study.
Boost MultiIndex allows indexing of a container via any number/type of indices. So you could have a container indexed via key, and also indexed via integer. Now the difference, I think, is that MultiIndex uses independent indexes - changing a entry's key from "alpha" to "beta" doesn't change its integer index from 1 to 2.
So that is a significant difference. Might be worth mentioning in docs or somewhere, just to avoid confusion.
But furthermore, I could probably implement the same behaviour as a order statistics tree by using Boost.MultiIndex internally (and maintaining the interrelationships of the indices as needed). Would that be drastically less efficient?
+1. This is a very good comment and I think there is no simple answer. I was asked similar questions during Boost discussion of augmented dynamically allocated B+ trees, this is why I would like to make a few comments based on my analysis. The fact that creates an overlap with Boost.MultIndex and can cause confusion is that augmented data structures support multiple augmenting. For example, basic balanced tree provides efficient search of elements by keys. This tree requires a comparison object. Augmenting of such tree by counter of elements adds efficient access to elements by rank, which is equivalent to index value in the subscript operator of an array or std::vector. This variant of tree has extended functionality and can work both with and without a comparison object. In other words, it offers equally efficient operations for both ordered and unordered containers. Yet another augmenting by the sum of values in children nodes improves computational complexity of calculation of sum of any number of consecutive elements from linear to logarithmic. As a result of the double augmenting, the calculation of the mean value and standard deviation can be done in O(log N) time only. Such variants of augmented data structures support STL sequences and associative containers with improved efficiency of specific member functions. In applications that require multiple views of a data set these augmented data structures and containers cannot replace Boost.MultiIndex functionality. On the other hand, replacing functionality of augmented data structures with Boost.MultiIndex should be generally possible. However, this solution will be less efficient in the case when an augmented data structure offers a set of more efficient algorithms compared to data structures of Boost.MultiIndex. The disadvantage of this approach is that one less efficient operation can cause a performance bottleneck. Normally, augmented data structures offer the improvement of the running time from O(N) down to O(log N) for a single operation or an algorithm. This performance benefit of augmented data structures certainly should not be ignored by Boost.MultiIndex. Unfortunately, at this stage it is not possible for a user algorithm to take advantage of Boost.MultiIndex with advanced data structures. This example shows declaration of a multiset: typedef multi_index_container<int, indexed_by<ordered_non_unique<identity<int>>>> multi_set_int ; The obvious idea of a parameterization of containers used by views of Boost.MultiIndex is potentially useful, but deeper analysis shows that it is quite difficult and costly to implement. Boost.MultiIndex has special non-standard requirements to the view containers. For example, std::vector cannot be used in place of random access indices, since it does not remain stable after insertion or erasure operations. Hope that the discussion helps better understand the complexity of these issues. Regards, Vadim Stadnik