
2008/5/16, vicente.botet <vicente.botet@wanadoo.fr>:
Hello,
I'm mainly interested in an implementation of set<T> using set<interval<T> and on how it comapres to the stl::set in space and time, when the set are dense or scattered.
I have just take a look to the documentation and the implementation, and I have some questions?
Thank you for looking at my code. I am quite thrilled to get so many good responses. Let me note, before answering your questions, that I followed the boost advice from the web page to post suggestions *before* investing work on adjusting or producing library code to conform the boost standards and requirements, in order to avoid unnecessary work. I know the library might not conform some boost requirements yet, but I am of course happy to adjust the code and design if there is significant interest in my work. To keep the text simple I will talk mostly of sets and interval_sets, when I want to say things about sets/maps and interval_sets/interval_maps because most statements apply to both.
* Why not follow the STL set and map interfaces as far as possible?
I intended to follow the STL and there is a large set of functions and operators common to std::set/map and itl::interval_sets/maps. But there are differences also. (1) One very interesting point is that interval_sets in the design presented do have a twofold character. An interval_set is intended to implement set. Yet at the same time it is a set of intervals. So the question how a function like e.g. find should be implemented has no obvious answer. (2) In my design I am exploiting concepts (Inplace)Addable and (Inplace)Subtractable in many ways. That is I do rely on the existence of a += operator for almost all and -= for many datatypes in my library. So I have added +=, -= and also *= (intersection) to interval_set/map. A propagation of += and -= through container operations is a core element of my design and a base of the useful 'aggregate on overlap' (aggrovering;) mechanics of interval_maps. (see examples party and history from http://sourceforge.net/projects/itl.) I have defined o= operators as memberfunctions because this seemed natural to me. But I think the whole library can be easily refactored, making them global function templates.
* Why do you use virtual functions? Are the classes sets, and interval_base_set public? I think this can and should be avoided.
The original design was classical OO-design, defining an interface class that allows different implementations being used uniformly by a weak polymophic interfacepointer. Furthermore there is a body of baseclass functions that are common to interval_set and split_interval_set. They only vary in the handling of interval borders. This has been expressed by basecalss interval_base_set, interitence and a virtual function 'handle_neighbours'. All those virtual functions can be devirtualized without problems. The interface class templates can be omitted and handle_neighbours might be expressed via a policy template parameter.
* Does it works with boost::interval?
Unfortunately not. Interval containers rely on the notion of open interval bounds. This was necessary to express operations for flotingpoint numbers: interval_set<double> is; is.insert([1,pi)); is.insert([pi,4]); // is=={[1,4]} merging INTENDED is.insert([1,pi)); is.insert((pi,4]); // is=={[1,pi), (pi,4]} must not merge As I've read in the archives an extension for open interval bounds have been suggested for boost::interval in the past.
The benefits of interval sets and maps are 1. The implementation of sets/maps via interval_set/map is very space and also time efficient.
Have you done some messures, in space and time? In the worst case {1, 11, 21...}?
Not yet. Space can be reckoned easily. Wost case is that every interval inserted has exactly one value element x per interval [x,x]. In this case you have double data content and one byte extra for borderinformation so roughly 2 times the space of stl::set. The 'nice' case is that you can represent a set of uncountable values like {[0.0, pi)} quite compact ;)
Do you plan to boostify your code?
I would very much like to boostify the code. I admit thou that I am not yet familiar with all of the boostification ;) I hope guidance will be provided.
I'm no sure interval_set is a model of Unique Associative Container.
Given the following : interval_set<int> is; //not quite: interval_set<interval<int>> is; is.insert([1,3)); is.insert([3,5));
What is the result of assert(find([1,3))!= end); And of assert(find([1,5))== end);
I thought about find more than twice and decided not to provide it. You can say interval_set<int> found; is.intersect(found, rightopen_interval([1,5]); // Intersect to find contents As interval_set<T> implements set<T> if find is implemented it should deliver an iterator pointing to a value, but such iterators are not available. The closest one can provide is an iterator that points to the interval containing the value or end(), if such interval does not exist.
I'm no sure interval_set is a model of Unique Associative Container. No, I'm afraid it is not.
Which operator would you use for set_symmetric_difference? This is a detail but it will be better to provided the function version of this operators as well. don't know, probably function name. Are there boost suggestions for that?
split_interval_map: A split_interval_map implements a map as a map of intervals and associated values. For dealing with the insertion of overlapping intervals the principle of "aggregation on overlap" has proved to be very useful: It is shown by example:
typedef set<string> guests; split_interval_map<time, guests> party; guests mary; mary.insert("Mary"); guests harry; harry.insert("Harry"); party.insert(make_pair(rightopen_interval(20:00, 22:00), mary)); party.insert(make_pair(rightopen_interval(21:00, 23:00), harry)); // party now contains [20:00, 21:00)->{"Mary"} [21:00, 22:00)->{"Harry","Mary"} //guest sets aggregated on overlap [22:00, 23:00)->{"Harry"}
As can be seen from the example a split_interval_map has both a decompositional behavior (on the time dimension) as well as a accumulative one (on the associated values).
Quite interesting!
Quality, validation, portability:
The library code of the itl has been validated with a law based test automaton (LaBatea;) which is included in the current release. It includes a testcase generator and law tester that checks laws like e.g. commutativity<interval_set<double>, +=>: the commutativity of interval_set<double> w.r.t. the operation +=.
The code has been compiled and tested with three compilers (gcc 3.4.4, conceptgcc-boostcon, msvc 8.0 sp1) and with three implementations of the stl (linux/gnu, stlport, msvc8) under suse linux 10.1. and ms-windows xp.
I'd be happy if you found interval conainers as presented in this post an interesting contribution to look at.
I'm sure that this library will be a good candidate to boost.
If so, I am looking forward to an inspiring discussion.
Best regards
Vicente Botet
Thank you for your good questions and valuable feedback on my work. Best regards Joachim Faulhaber