
On Sun, Mar 3, 2013 at 9:00 PM, Martin Knoblauch Revuelta < mkrevuelta@gmail.com> wrote:
On Sun, Mar 3, 2013 at 11:57 AM, Vadim Stadnik <vadimstdk@gmail.com> wrote:
...
Do you know any other reasons and/or concerns why augmented data structures should NOT be added to STL?
The main concern is the balance between the value they add and their cost.
On one hand, which are the real-life applications where these containers would really help? In my opinion, the programming community has learned to solve zillions of problems with arrays, lists, search trees, heaps and hash tables. Now it's not easy to find a real-life problem where augmented trees provide a significantly faster solution. Though, they may allow amazingly simple, non-tuning, easier-to-code, solutions.
On the other hand, in addition to the usual bloat, there is a deep concern regarding the possible incorrect use of these sequence containers. Lazy unexperienced programmers might misuse them as an all-purpose replacement of vector and list.
If the standard library adopted them, it would be somehow advocating their use. If there were no real-life applications, then it would be advocating inefficiency.
I agree with your concerns. In practice, on Windows systems an algorithm using N<1000 elements of small size normally does not have to use advanced data structures at all. It performs quite well with the default STL container std::vector. The performance problems arise when the number of elements increases and the algorithms uses inefficient linear cost operations of std::vector. To some extent the risk of misuse is reduced by the array based augmented B+ trees. As I mentioned before they have good locality of reference and guarantee at least quite efficient sequential processing. The representation of strings and string algorithms is one of the most practically important application areas for augmented trees. The B+ trees proposed for the Boost review support not only O(log N) insertion and erasure for a single element, but also O(log N) time for splice and split operations. This advantage is not achievable with basic data structures and STL containers.
[..] The discussion of augmented trees by Boost community is particularly interesting. Do you know if there was a formal Boost review of any variant of augmented data structures and/or containers?
There was an interesting discussion when I proposed it:
http://boost.2283326.n4.nabble.com/proposal-Sequence-container-with-O-log-n-...
http://boost.2283326.n4.nabble.com/proposal-Rank-List-old-name-AVL-Array-in-...
[..] I proposed for the Boost review array based augmented B+ trees and decided to postpone the review of dynamically allocated augmented B+ trees.
If you only plan to augment the tree with an integer, then this is perfectly reasonable.
Thank you for the links, I will analyze the discussions. I have already implemented 5 variants of basic and augmented B+ trees and STL containers using these trees. Three variants of dynamically allocated B+ trees: https://github.com/vstadnik/stl_ext_adv and two variants of array based B+ trees: https://github.com/vstadnik/stl_ext_adv_review In combination with augmented red-black trees they should help find the solution to the problem how to use augmented and advanced data structures inside or alongside STL.
Though, if you plan to augment the trees with a data type that doesn't support subtraction ---like the interval trees of "Introduction to Algorithms", or something more complex--- it might be better to stick to a few children per node. Otherwise, deleting an element might require a lot of [potentially expensive] additions of a user-defined type.
The development of augmented interval B+ trees is not impossible, but at this stage my main priority is data structures that support STL interfaces. Regards, Vadim Stadnik