
Hi Bruce 2010/12/14 Bruce Adams <tortoise_74@yahoo.co.uk>:
Hi, I have been working on a problem for which a suitable solution is an efficient container of intervals.
The two main candidates would be something like to boost::split_interval_set or a centred interval tree (along the lines of http://en.wikipedia.org/wiki/Interval_tree). It seems to me that the characteristics of an interval tree differ sufficiently from the containers defined in ICL that they might make a useful addition.
No doubt, interval_trees would be a useful addition to the library. For certain use cases they perform better than the interval container currently available in the ICL.
For the case where there is a potentially large number of overlapping intervals a split_interval_set will be less efficient than an interval tree. In the worst case intervals would be split down to unit size.
That depends what you intend to do with your interval container. One of the fundamental ideas for containers of the ICL was that of a compact and minimal representation of sets or maps using intervals. An icl::interval_set for example instantly joins inserted intervals if they overlap and thus always stays in a minimal form. Contrary to an interval tree, it forgets the intervals that have been inserted into it and it can not answer stabbing queries anymore.
Searching the archives I see there was some discussion of interval trees in the context of ICL but the additional of an interval tree container did not, as far as I can tell, come up (perhaps because it should first appear in a boost release before there is any consideration of adding to it?).
There was an agreement among the reviewers that the library in the current from is very useful, and efficient in many use cases, so it should be accepted into boost. I have agreed to integrate an interval tree implementation in the further evolution of the library. But it might also be contributed by others.
I have a few questions:
1. Is there a specific name for the data structures provided by the ICL?
As already mentioned, a basic idea for the library was to implement just sets and maps in a compact form exploiting the the fact, that elements occur in contiguous chunks: intervals. So you could say the ICL implements sets and maps. They are called interval_sets and interval_maps just as with bitsets the prefix bit... points to the way or technique the sets are implemented. I am not aware of a more specific name for this class of data structures other than interval_set and interval_map which I find quite natural.
I have searched the literature and cannot find any mention of sets that split or combine intervals this way. It is either a genuinely novel concept, which I find difficult to believe, or I haven't used the correct search terms (interval, extent or segment). Either way, I would like to update wikipedia appropriately to help other lost souls.
I admit, I never did an in-depth literature search to find out the "right" name for what I am doing. If you find out what it is, let me know ;-)
2. Are there any compelling reasons why interval trees are less useful than the containers provided by the ICL?
I don't say interval trees are less useful than icl containers. They are different data structures with different strenghts and weaknesses.
3. Perhaps based on the answer to 2. Would interval trees be a sensible addition to the ICL?
Yes.
4. Are there any other classes of interval container worthy of mention?
Roughly... (1) A flat_interval_set/map that is implemented using a vector can be more efficient, if there are mainly lookups and less inserts/deletes on the object, after is has been created. (2) Another interesting implementation of interval_sets/maps is a tree that only stores the beginning points of intervals. The interval_set/map is always infinite starting with (-inf, +inf) and gets only split by the insertion of elements. Cheers Joachim -- Interval Container Library [Boost.Icl] http://www.joachim-faulhaber.de/