
Hi Zach, 2009/6/11 Zachary Turner <divisortheory@gmail.com>
I know there's over a zillion bitset implementations, but often times I have found myself in need of a bitset that is efficient (space-wise) at storing millions, billions, or even trillions (no joke) of bits. For this I have developed an in-house class which stores bits in a structure vaguely reminiscent of a "T-Tree" (http://en.wikipedia.org/wiki/T_tree). You can think of it basically as an std::map<boost::uint64_t, boost::uint64_t> where <key, value> represents the *interval* of bits [key, key+value-1]. So it's basically run-length encoded, with logarithmic lookups instead of linear lookups.
I have developed a very similar class coming form a different background. Working on problems of date and time, I came up with a collection of interval set and map class templates, that I have called Interval Template Library (ITL). http://www.herold-faulhaber.de/boost_itl/doc/libs/itl/doc/html/ It is interesting to see that the same useful data types occur and reoccur from different problem domains. Fortunately boost is a good place to collect, discuss and develop those useful data types. The compressed_bitset that you are presenting seems to be similar to the ITL's interval_set<T>. As with your class the idea is to represent large sets in a compact way using intervals that are efficiently organized using a balanced tree. May be, a look at my library can be helpful for your project. enjoy Joachim --- The current sources of the ITL are in the boost sandbox: https://svn.boost.org/svn/boost/sandbox/itl/ The latest release (3.0.0) is in the boost vault: http://sourceforge.net/projects/itl (section containers). Release 2.0.1 is available form sourceforge: http://sourceforge.net/projects/itl Online documentaiton at: http://www.herold-faulhaber.de/boost_itl/doc/libs/itl/doc/html/