[Review] ITL review starts today, February 18th

Hi all, The formal review of the Interval Template Library (ITL) starts today, February 18th, 2010 and will end February 27th, 2010. ITL is being developed by Joachim Faulhaber. ------------------------------------------------ About the library: The Interval Template Library (ITL) is a library of interval containers that facilitates computations on collections of intervals and associated values. Use cases of such computations typically occur in the date and time problem domain, when collections of time intervals with associated data are analyzed, combined and aggregated. But the scope of applications for interval containers is more general and not limited to date and time problems. Interval containers allow to access and manipulate collections of intervals on an abstract level as sets and maps (of elements), but also to exploit segmental information that is given by the interval borders. Interval containers are fairly generic. They can be instantiated with discrete domain (or key) types like integral numeric types or boost::date or time types. But they also work with continuous domain types like double, rational or string. Using the latter allows to represent infinite sets and maps. The interface of the interval containers strives to be very intuitive and in line with the stl and other boost libraries. * The set theoretic operations union, difference, intersection and symmetric difference are available as operators for all interval containers and for many useful overloads between them. * There is iterator support on the level of segments and interval containers can be combined with stl containers of intervals via stl algorithms. * For discrete domain types, element iterators are available which makes interval sets and maps models of SortedAssociative Containers so they can be used with many stl algorithms. All interval containers are addable and subtractable. The addability and subtractability concept of the ITL leads to a general mechanism of aggregation on interval_maps which is called 'aggregate on overlap'. Exploiting 'aggregate on overlap' allows to compute many useful aggregation results in a simple and abstract manner. The ITL's Set and Map concepts are defined by a signature and collections of laws or axioms. Those semantical constraints have been validated using automatic law based tests. This gives the library both a well founded formal specification and an unusual thorough test base. The ITL emerged out of real world use cases and is successfully used at Cortex Software GmbH since a couple of years in the production code of various applications. On the boost developer's list it has been proposed in four previews and a review request since May 2008 and has been refactored and refined according to boost standards and suggestions from the list. The ITL is available from the boost vault: http://www.boostpro.com/vault/index.php?action=downloadfile&filename=itl_3_2 _0.zip&directory=Containers Documentation is online available at: http://www.herold-faulhaber.de/boost_itl/doc/libs/itl/doc/html/index.html The current sources are in the sandbox: https://svn.boost.org/svn/boost/sandbox/itl/ ------------------------------------------------ Everybody on this list is invited to participate in this formal review. I hope to see your review of this library, including your vote, and I welcome your participation in the discussions on the Boost mailing list. Please always state in your review, whether you think the library should be accepted as a Boost library. Additionally, please consider giving feedback on the following general topics: - What is your evaluation of the design? - What is your evaluation of the implementation? - What is your evaluation of the documentation? - What is your evaluation of the potential usefulness of the library? - Did you try to use the library? With what compiler? Did you have any problems? - How much effort did you put into your evaluation? A glance? A quick reading? In-depth study? - Are you knowledgeable about the problem domain? Regards Hartmut Review Manager --------------- Meet me at BoostCon www.boostcon.com

Hartmut Kaiser wrote:
Interval containers allow to access and manipulate collections of intervals on an abstract level as sets and maps (of elements), but also to exploit segmental information that is given by the interval borders. Interval containers are fairly generic. They can be instantiated with discrete domain (or key) types like integral numeric types or boost::date or time types. But they also work with continuous domain types like double, rational or string. Using the latter allows to represent infinite sets and maps.
The interface of the interval containers strives to be very intuitive and in line with the stl and other boost libraries.
* The set theoretic operations union, difference, intersection and symmetric difference are available as operators for all interval containers and for many useful overloads between them.
I have no knowledge of the problem domain (nor did I actually look at the library yet, sorry about that), but I simply have a question: how does this compare to geometry libraries or spatial indexes that provide that kind of facilities for arbitrary dimensions? I certainly wouldn't want to force over-generalization to the authors, but I am wondering how much is gained by limiting things to one dimension.

2010/2/23 Mathias Gaunard
Hartmut Kaiser wrote:
Interval containers allow to access and manipulate collections of intervals on an abstract level as sets and maps (of elements), but also to exploit segmental information that is given by the interval borders. Interval containers are fairly generic. They can be instantiated with discrete domain (or key) types like integral numeric types or boost::date or time types. But they also work with continuous domain types like double, rational or string. Using the latter allows to represent infinite sets and maps.
The interface of the interval containers strives to be very intuitive and in line with the stl and other boost libraries.
* The set theoretic operations union, difference, intersection and symmetric difference are available as operators for all interval containers and for many useful overloads between them.
I have no knowledge of the problem domain (nor did I actually look at the library yet, sorry about that), but I simply have a question: how does this compare to geometry libraries or spatial indexes that provide that kind of facilities for arbitrary dimensions?
I certainly wouldn't want to force over-generalization to the authors, but I am wondering how much is gained by limiting things to one dimension.
So can this library survive in a field of power called boost of an ever abstracting ever generalizing and ever optimizing bunch of thinkers? A few points: * Geometry usually is based on a numeric coordinate type that comes with a distance measure. Many algorithms in geometry libraries rely on that. Intervals and interval containers of the ITL, similar to SortedAssociativeContainers, only require a strict weak ordering on their key type. So interval containers are less restricted and allow for a larger class of instantiations. You may for example want to work with interval sets of strings. * Interval containers implement interval combining styles that can be useful in providing and manipulating certain time grids. E.g. http://www.herold-faulhaber.de/boost_itl/doc/libs/itl/doc/html/boost_itl/exa... I'm not aware of something like that in geometry. * There are not only interval sets but also interval maps. itl::interval_map is the real workhorse in those real world applications that already exist. interval_maps provide a general idiom of aggregation of associated values. This idiom is very useful for computing statistical data through aggregation that occur in time. * Sometimes a more simple and basic class of data structures is advantageous. It may be simpler to implement optimizations. Like Lukes Manhattan integral polygons compared to more general GGL polygons. * Transfer of geometry knowledge to the more simple models of interval sets and maps may be a step users may not come up with. * The basic set and map concepts of the ITL allow for a variety of applications that may go beyond the scope of geometry. One such example is the self compressing large bitset, which not really has the look and feel of geometry ... http://www.herold-faulhaber.de/boost_itl/doc/libs/itl/doc/html/boost_itl/pro... ... or the example of a system of authorisations. See the example user_groups http://www.herold-faulhaber.de/boost_itl/doc/libs/itl/doc/html/boost_itl/exa... * Well known data structures from geometry like interval_tree or spatial indices do have different focuses and different applications. An interval_tree is more space efficient than an interval_map that can computes all overlaps at once. This may be desired if you need all of them. If you don't and you have reoccurring arbitrary stabbing queries interval_trees are the better choice. * itl::Maps are defining a generalized addition, which I think is a quite powerful feature that accounts for most of the goodies in the applications that we have done. This addition function is a generalisation of the common insert function on SortedAssociativeContainers and it introduces the aggregating properties of itl::Maps. The aggregate on overlap idiom is independent of the question, if the domain_type of the map or interval_map is a scalar or a tuple (has dimensions) or something else. There are already interesting applications of aggregations using itl::maps of tuples, but this exceeds the scope of the core library. Cheers, Joachim

Hartmut Kaiser wrote:
The ITL is available from the boost vault: http://www.boostpro.com/vault/index.php?action=downloadfile&filename=itl_3_2 _0.zip&directory=Containers I have to say first I have been evaluating the version of ITL called itl_plus_3_2_1 rather than the version in the boost vault. Perhaps someone can tell me whether that is a big mistake or not.
Please always state in your review, whether you think the library should be accepted as a Boost library.
I think it should be accepted subject to the inclusion of an interval tree implementation.
- What is your evaluation of the design?
- What is your evaluation of the implementation? I would have liked to see interval trees in the implementation. For what I've needed the library for so far this has not been a problem as most of the intervals don't overlap. In the future I can see use cases where
I suspect that making the open/closedness of an interval a template parameter would suit more users although I don't have a strong opinion on this one. the lack of interval trees would make the library impractical. I don't know how much impact adding an interval tree implementation would have on the existing interface.
- What is your evaluation of the documentation? Good. A tutorial would be a welcome addition. Some functions were undocumented, I'm assuming they're documented in the review version.
- What is your evaluation of the potential usefulness of the library? Very useful. It seems to have many applications in several different domains.
- Did you try to use the library? With what compiler? Did you have any problems? I used gcc 4.4.1 and boost.python to make a python module for most of the main elements of the library. I used this module successfully in an application in genomics. However so far I have only had use for the SplitIntervalMap so I have not used most of the library.
- How much effort did you put into your evaluation? A glance? A quick reading? In-depth study? A reasonable amount - enough to get up to speed and use the library.
- Are you knowledgeable about the problem domain? Not especially.

Please always state in your review, whether you think the library should be accepted as a Boost library.
Yes, I think the library (with the interface improvements brought out during the review) should be accepted into boost. I've already successfully used the library with it's existing interface in a commercial product.
- What is your evaluation of the design?
As brought up during the course of the review period my main concerns for the library's interface are: 1) the bloated size of the interval class with it's dynamic bound data member. This is being addressed by Joachim providing an interface to allow use of static bound interval classes. 2) the fat interface on the interval class. In my view the myriad of mutators would best be replaced with free functions in terms of the interval's extents and return a interval passing the calculated extents to an interval constructor which enforces the interval's invariant(s). There should also be a method to construct an interval given unordered extents that effectively does interval(min(arg1, arg2), max(arg1, arg2)). 3) better integration of std and user defined data types en lieu of itl's specific types. Joachim has agreed to improvements in this area. 4) I'd also like to see std::inserter "just work" with the containers. I shouldn't have to expect to look for the itl specific adder to do this. This was the most common usage my applications had for populating interval containers using std::copy/transform from my applications existing data structures.
- What is your evaluation of the implementation?
Seems adequate. I was able to port to the portions I needed to
CodeWarrior 9.4 (from 2003?). IIRC, Joachim integrated the
changes(simplifications) that allowed itl to work in this environment.
Performance for the scale of problems I've been dealing with were fine,
but as has been discussed scaling to larger sizes of millions of items
may require some specialized containers. I don't see this as a problem,
it's analogous to std::map and std::unordered_map.
Hmm, It just occurs to me that for my application I could void the N^2
size issue of a split_interval_map
- What is your evaluation of the documentation?
I was able to get up and running fairly quickly. As discussed some of the colorful terminology got in the way rather than helping. Also I had expected to just be able to populate with std::inserter, and had trouble finding docs related to this. The example for copy/transform helped, but even during this review I have trouble finding the docs for itl::adder/inserter.
- What is your evaluation of the potential usefulness of the library?
Very, I've come across needs for this capability in 3 different jobs in 3 very different industries. I've had domain specific solutions and was looking to create just such a generic library when Joachim submitted his library.
- Did you try to use the library? With what compiler? Did you have any problems?
Yes, the library is being used in a commercial application built with CodeWarrior 9.4 on Mac/Windows and with VC8 and XCode 3.x. The library required some modifications to the template declarations for CodeWarrior.
- How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
I read through the documentation, integrated the library with production code, dug through compilation error messages and headers on CodeWarrior to port a portion of the library. I've profiled code using itl to understand performance implications.
- Are you knowledgeable about the problem domain?
Yes, I've designed/implemented/optimized integral open-closed Interval and IntervalSet classes for use in previous projects. Jeff

Hi Jeff,
thank you for your review of the ITL. You have been supportive in the
course of the development at different points giving feed back about
shortcomings and helping with code patches and useful suggestions. I
am glad to hear that you are actually using the library for real world
applications already.
2010/3/8 Jeff Flinn
Please always state in your review, whether you think the library should be accepted as a Boost library.
Yes, I think the library (with the interface improvements brought out during the review) should be accepted into boost. I've already successfully used the library with it's existing interface in a commercial product.
- What is your evaluation of the design?
As brought up during the course of the review period my main concerns for the library's interface are:
1) the bloated size of the interval class with it's dynamic bound data member. This is being addressed by Joachim providing an interface to allow use of static bound interval classes.
2) the fat interface on the interval class. In my view the myriad of mutators would best be replaced with free functions in terms of the interval's extents and return a interval passing the calculated extents to an interval constructor which enforces the interval's invariant(s). There should also be a method to construct an interval given unordered extents that effectively does interval(min(arg1, arg2), max(arg1, arg2)).
3) better integration of std and user defined data types en lieu of itl's specific types. Joachim has agreed to improvements in this area.
I think we have agreed in the various points involved.
4) I'd also like to see std::inserter "just work" with the containers.
I discovered that it does not just work, because for std::insert_iterator there is an invariant involved that expects the inserted to be a non empty element of the container, that the element is inserted into. This invariant is broken with interval containers because an interval/segment is not an element of an interval container.
I shouldn't have to expect to look for the itl specific adder to do this. This was the most common usage my applications had for populating interval containers using std::copy/transform from my applications existing data structures.
- What is your evaluation of the implementation?
Seems adequate. I was able to port to the portions I needed to CodeWarrior 9.4 (from 2003?). IIRC, Joachim integrated the changes(simplifications) that allowed itl to work in this environment.
I added all the changes suggested, but I remember that important parts of the code, itl global operators in particular, still did not compile with CW. I tried to set up CW myself, but encountered too many problems and gave up :(
Performance for the scale of problems I've been dealing with were fine, but as has been discussed scaling to larger sizes of millions of items may require some specialized containers. I don't see this as a problem, it's analogous to std::map and std::unordered_map.
I share this view.
Hmm, It just occurs to me that for my application I could void the N^2 size issue of a split_interval_map
> with a non_combining_map that allowed me to iterate through all items that overlap with an arbitrary interval in and ordered fashion. That's not much different from a std::map , but with some more interval specific wrapping of the upper/lower_bound/equal_range methods. - What is your evaluation of the documentation?
I was able to get up and running fairly quickly. As discussed some of the colorful terminology got in the way rather than helping. Also I had expected to just be able to populate with std::inserter, and had trouble finding docs related to this. The example for copy/transform helped, but even during this review I have trouble finding the docs for itl::adder/inserter.
True, escaped my notice. Has to be added.
- What is your evaluation of the potential usefulness of the library?
Very, I've come across needs for this capability in 3 different jobs in 3 very different industries. I've had domain specific solutions and was looking to create just such a generic library when Joachim submitted his library.
That's interesting. I had the basic ideas and first implementation in 1999 / 2000 and I had the immediate impression that it should be made available to the public domain. But due to other projects and commercial copyright of the first version, I did not have the opportunity to make it open source for years. I always thought someone else would be faster than me ;)
- Did you try to use the library? With what compiler? Did you have any problems?
Yes, the library is being used in a commercial application built with CodeWarrior 9.4 on Mac/Windows and with VC8 and XCode 3.x. The library required some modifications to the template declarations for CodeWarrior.
Is it now fully or still only partially compilable with CW?
- How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
I read through the documentation, integrated the library with production code, dug through compilation error messages and headers on CodeWarrior to port a portion of the library. I've profiled code using itl to understand performance implications.
- Are you knowledgeable about the problem domain?
Yes, I've designed/implemented/optimized integral open-closed Interval and IntervalSet classes for use in previous projects.
Thanks again for your review and the support that you provided for the ITL library. Cheers, Joachim
participants (5)
-
Hartmut Kaiser
-
Jeff Flinn
-
Joachim Faulhaber
-
John Reid
-
Mathias Gaunard