Interest in B-tree library for Boost?
A Boost B-tree library would provide disk-based associative containers that scale all the way from really, really, small to really, really, large. B-trees perform well on hardware ranging from ancient floppy disk drives all the way up to humongous disk arrays. They are the technology behind most high-performance disk file systems and databases. If you don't know what a B-tree is, or think that "B-tree" is an abbreviation for "Binary tree", you might want to read the Wikipedia B-Tree article: http://en.wikipedia.org/wiki/B-tree. Knuth and other computer science texts also supply descriptions. See http://mysite.verizon.net/beman/btree/index.html for a more complete description of the proposed library. A preliminary implementation is available. Download http://mysite.verizon.net/beman/btree/btree-preview-1.zip, and unpack into a current Boost distribution. A few existing Jamfiles will be overwritten. The only compilers tested and working so far are VC++ 10 and GCC/MinGW 4.5. You can view the library online at http://github.com/Beman/Boost-Btree There is no detailed documentation yet, but if you know how to use std::map and other standard library associative containers, you already know most of what you need to know. Critical differences are documented in the Library Characteristics section of the index.html page referenced above. Initial timing tests have been very encouraging. The time test compares the btree_map implementation to std::map. As expected, a btree_map is many times slower than a std::map, given a cold operating system file cache and constraining the btree_map's caching to a minimum. But at the other extreme, when the O/S file cache is warm and the btree_map's cache is as large as the file, the btree_map is very similar in speed to std::map.The btree library's code is still very new, so timings and everything else may be subject to lots of revision. Any interest? --Beman
On 09/15/2010 06:04 PM, Beman Dawes wrote:
A Boost B-tree library would provide disk-based associative containers that scale all the way from really, really, small to really, really, large. B-trees perform well on hardware ranging from ancient floppy disk drives all the way up to humongous disk arrays. They are the technology behind most high-performance disk file systems and databases.
Any interest?
--Beman _______________________________________________
The documentation says "Types supplied by the |Key| and |T| template
parameters must be memcpyable".
So I guess I could not use a nested btree like this
btree
Hello,
On Wed, Sep 15, 2010 at 10:41 PM, Roland Bock
On 09/15/2010 06:04 PM, Beman Dawes wrote:
A Boost B-tree library would provide disk-based associative containers that scale all the way from really, really, small to really, really, large. B-trees perform well on hardware ranging from ancient floppy disk drives all the way up to humongous disk arrays. They are the technology behind most high-performance disk file systems and databases.
Any interest?
--Beman _______________________________________________
The documentation says "Types supplied by the Key and T template parameters must be memcpyable".
So I guess I could not use a nested btree like this
btree
>
How about using regular b-tree and an allocator that gives you memory from a memory mapped file. More like boost::intrusive. For ACID properties, we are using BerkeleyDB 4.8+ which has an STL interface to hash/btree/recno/queue data structures on disk. Since BerkeleyDB is not free for commercial use, there is Tokyo Cabinet but lacks STL interface. IMHO, a good GSoc project would be to provide an STL interface to Tokyo Cabinet. With that, there will be a good alternative to BerkeleyDB based on LGPL (as Tokyo Cabinet is LGPL). -dhruva PS: I do not have any personal stake in either BerkeleyDB or Tokyo Cabinet. Just looking for a good alternative that I can develop expertise on and use it.
On Wed, Sep 15, 2010 at 9:04 AM, Beman Dawes
A Boost B-tree library would provide disk-based associative containers that scale all the way from really, really, small to really, really, large. B-trees perform well on hardware ranging from ancient floppy disk drives all the way up to humongous disk arrays. They are the technology behind most high-performance disk file systems and databases.
...
Any interest?
Absolutely! Some questions: - Is concurrent access okay if it's opened as read-only? - Right now it's using synchronous I/O. Can async support (via asio) be added? How about memory-mapped I/O for small B-trees or large address spaces? - Will this support multiple B-trees in one file? - Will this implement ACID properties? - Can this be adapted for in-memory use as well, with full non-POD support? Thanks! -- Cory Nelson http://int64.org
On Wed, Sep 15, 2010 at 7:26 PM, Cory Nelson
- Is concurrent access okay if it's opened as read-only?
Yes, as long as the file isn't being written by a different instance.
- Right now it's using synchronous I/O. Can async support (via asio) be added?
I don't know the answer to that. I would need help from someone who understands async I/O far better than I do.
How about memory-mapped I/O for small B-trees or large address spaces?
I've did some preliminary investigation about five years ago, and memory-mapped I/O looked like it might be cost effective. But I'd give it a lower priority than adding support for variable length keys/mapped_types or concurrency.
- Will this support multiple B-trees in one file?
Not currently. That would be relatively easy to support. What is the motivating use case?
- Will this implement ACID properties?
No. The model is a standard library associative container, not a database. ACID properties can be built on top of B-trees, but the B-trees themselves don't have ACID properties, other than a few aspects like atomic writes.
- Can this be adapted for in-memory use as well, with full non-POD support?
No current plans for that. Why wouldn't you just a standard library associative container for that? Cheers, --Beman
On Wed, Sep 15, 2010 at 6:08 PM, Beman Dawes
On Wed, Sep 15, 2010 at 7:26 PM, Cory Nelson
wrote: - Will this support multiple B-trees in one file?
Not currently. That would be relatively easy to support. What is the motivating use case?
Just ease of distribution, really. I'm currently using a SQLite database to distribute a few tables of read-only data that doesn't need SQL, relational, or ACID properties -- it's just the best thing available in terms of ease of distribution and portability. I'd love to remove all the unneeded layers and just use a B-tree directly.
- Will this implement ACID properties?
No. The model is a standard library associative container, not a database. ACID properties can be built on top of B-trees, but the B-trees themselves don't have ACID properties, other than a few aspects like atomic writes.
Alright, that clears up my understanding about the scope of this. Still sounds great!
- Can this be adapted for in-memory use as well, with full non-POD support?
No current plans for that. Why wouldn't you just a standard library associative container for that?
Storing keys together improves cache usage a lot for small keys, and allows the CPU to detect ordered iteration to prefetch the next cache line. It also reduces fragmentation and the maximum number of pages that must be touched to find a result, which in turn reduces TLB misses and page faults. The tests here indicate that this can be very significant with large collections: http://idlebox.net/2007/stx-btree/stx-btree-0.8.3/doxygen-html/speedtest.htm... -- Cory Nelson http://int64.org
On Wed, Sep 15, 2010 at 10:40 PM, Cory Nelson
On Wed, Sep 15, 2010 at 6:08 PM, Beman Dawes
wrote: On Wed, Sep 15, 2010 at 7:26 PM, Cory Nelson
wrote: - Will this support multiple B-trees in one file?
Not currently. That would be relatively easy to support. What is the motivating use case?
Just ease of distribution, really.
I'm currently using a SQLite database to distribute a few tables of read-only data that doesn't need SQL, relational, or ACID properties -- it's just the best thing available in terms of ease of distribution and portability. I'd love to remove all the unneeded layers and just use a B-tree directly.
Sounds quite useful. I can think of several past apps that would have benefited. I've added this to the Wish List: Multiple B-trees in one file. Requested by Cory Nelson. Possible design: Initial btree_map in the file is an index of its children. Key: name, Data: page id of the child's header page. Recursion allowed; the header page pointed could itself be an index header. Current interface might gain additional constructor and open() overload taking a ref to the parent index and the desired name.
- Will this implement ACID properties?
No. The model is a standard library associative container, not a database. ACID properties can be built on top of B-trees, but the B-trees themselves don't have ACID properties, other than a few aspects like atomic writes.
Alright, that clears up my understanding about the scope of this. Still sounds great!
- Can this be adapted for in-memory use as well, with full non-POD support?
No current plans for that. Why wouldn't you just a standard library associative container for that?
Storing keys together improves cache usage a lot for small keys, and allows the CPU to detect ordered iteration to prefetch the next cache line. It also reduces fragmentation and the maximum number of pages that must be touched to find a result, which in turn reduces TLB misses and page faults. The tests here indicate that this can be very significant with large collections:
That's also the point made by the article Thorsten referenced, http://queue.acm.org/detail.cfm?id=1814327, although it is claiming much greater speedups IIRC.
http://idlebox.net/2007/stx-btree/stx-btree-0.8.3/doxygen-html/speedtest.htm...
Interesting. So someone has already done the work for drop in STL associative container replacements. The timings only covered small trees. 16,000 was the number of elements mentioned. I'm testing with up to 100 million elements, and plan to expand that to a few billion elements. Thanks, --Beman
El 16/09/2010 3:08, Beman Dawes escribió:
- Can this be adapted for in-memory use as well, with full non-POD support?
No current plans for that. Why wouldn't you just a standard library associative container for that?
I think it's about performance/node overhead (less rebalances, you allocate arrays and not individual nodes). But for memory, T-Trees are the way to go. They are used by many in-memory DBs. http://en.wikipedia.org/wiki/T-tree "In computer science a T-tree is a type of binary tree data structure that is used by main-memory databases, such as Datablitz, eXtremeDB, MySQL Cluster, Oracle TimesTen and KairosMobileLite" Best, Ion
2010/9/16 Ion Gaztañaga
El 16/09/2010 3:08, Beman Dawes escribió:
- Can this be adapted for in-memory use as well, with full non-POD support?
No current plans for that. Why wouldn't you just a standard library associative container for that?
I think it's about performance/node overhead (less rebalances, you allocate arrays and not individual nodes). But for memory, T-Trees are the way to go. They are used by many in-memory DBs.
http://en.wikipedia.org/wiki/T-tree
"In computer science a T-tree is a type of binary tree data structure that is used by main-memory databases, such as Datablitz, eXtremeDB, MySQL Cluster, Oracle TimesTen and KairosMobileLite"
Correct me if I'm wrong, but it looks like a T-tree is just a plain binary tree where the data is referenced with pointers and not stored in the tree -- so it's designed to be memory-efficient when the key is larger than a pointer and is already stored elsewhere. Useful for databases that want low-overhead (in terms of key size, not tree implementation) indexes into tables. If that's the case, then it is not applicable here. -- Cory Nelson http://int64.org
El 16/09/2010 13:21, Cory Nelson escribió:
Correct me if I'm wrong, but it looks like a T-tree is just a plain binary tree where the data is referenced with pointers and not stored in the tree -- so it's designed to be memory-efficient when the key is larger than a pointer and is already stored elsewhere. Useful for databases that want low-overhead (in terms of key size, not tree implementation) indexes into tables.
If that's the case, then it is not applicable here.
Yes, you are right, but you could just store values and move semantics would make data movement quite fast (after all, std containers dont' store pointers, but values). Node overhead would be better than with std containers and searches faster (I think you need less comparisons). Insertion and erasure would be slower, I guess. Ion
On Wed, Sep 15, 2010 at 9:04 AM, Beman Dawes
A Boost B-tree library would provide disk-based associative containers that scale all the way from really, really, small to really, really, large. B-trees perform well on hardware ranging from ancient floppy disk drives all the way up to humongous disk arrays. They are the technology behind most high-performance disk file systems and databases.
Hi Beman, any updates on this? Or are you too busy with the coming Boost release? -- Cory Nelson http://int64.org
On Tue, Nov 16, 2010 at 7:04 AM, Cory Nelson
On Wed, Sep 15, 2010 at 9:04 AM, Beman Dawes
wrote: A Boost B-tree library would provide disk-based associative containers that scale all the way from really, really, small to really, really, large. B-trees perform well on hardware ranging from ancient floppy disk drives all the way up to humongous disk arrays. They are the technology behind most high-performance disk file systems and databases.
Hi Beman, any updates on this? Or are you too busy with the coming Boost release?
Haven't done anything in the last month, due mostly to a major house renovation in-process, and helping with the Library Working Group organization during the C++ committee meeting last week. You can see earlier progress by taking a look at https://github.com/Beman/Boost-Btree/commits/master. A lot of effort has gone into test and timing code, and it has been awhile now since any bugs surfaced. My next work item is support for variable length data. It will probably be another month before there is much to report on that. In the meantime, see boost/btree/detail/fixstr.hpp for a little string class that has a fixed maximum length. Although it wastes space compared to true variable length I/O-able data, it would be fine for loads of applications. --Beman
participants (5)
-
Beman Dawes
-
Cory Nelson
-
dhruva
-
Ion Gaztañaga
-
Roland Bock