[btree] In-memory B-tree experimental work

There was quite a bit of speculation at BoostCon about the performance of an in-memory B-tree based ordered associative container. I've done enough experimental work to report some real numbers for an in-memory B-tree map implementation vs std::map: inserting 10000000 elements... btree: 1985589.4 per sec, 81.7% faster than stl stl: 1092714.3 per sec, 45.0% slower than btree iterating over 6321389 elements... btree: 588200694.1 per sec, 2323.5% faster than stl stl: 24270489.8 per sec, 95.9% slower than btree finding 10000000 elements... btree: 2468384.9 per sec, 109.3% faster than stl stl: 1179177.8 per sec, 52.2% slower than btree erasing 10000000 elements... btree: 2444250.3 per sec, 150.7% faster than stl stl: 974793.1 per sec, 60.1% slower than btree Those timings are a simple map<long, int>, and are probably about the best case for the B-tree implementation. In a similar test of a map<char*, int>, i.e. an indirect map using pointers to strings containing the key, the memory B-tree was only slightly faster than the STL. A typical red-black container has an overhead of something like 3 pointers per node, IIRC, so memory use for the B-tree container is presumably a lot less that the STL container, even if the B-tree container has not been packed. Unlike the disk-resident B-tree implementation, the in-memory tree is much closer to the STL interface. The only significant differences are that insert/emplace/erase always invalidates iterators and Key must be Assignable, not just CopyConstructible. The code can be viewed or downloaded at https://github.com/Beman/memory-btree It is currently C++0x only, and tested only with VC++ 10 and GCC 4.6. Since this is the first time I've used move semantics, I may not be getting full mileage out of r-value references yet. There is no documentation yet. I need to set this work aside for awhile to get on with other projects. But I thought there might be interest in the timings. It would also be interesting to push the same set of random numbers into a std::vector, sort it, then run find, iterate, and erase, then look at those timings. --Beman
participants (1)
-
Beman Dawes