
Has any disk Btree library been proposed to Boost ? Can you point out which Btree libraries do you like/use . Is there interest for adding such a library ? thanks jose

Hi, I was wondering the same thing a while ago, and began working on one based on a Btree class I had wrote several years ago. I was considering using the boost serialization library to store objects along with a key. I had it basically working but it needs more work and have been bust lately (moving, new job, etc). Joe On 10/24/06, Jose <jmalv04@gmail.com> wrote:
Has any disk Btree library been proposed to Boost ? Can you point out which Btree libraries do you like/use .
Is there interest for adding such a library ?
thanks jose _______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

OK, yesterday the request for quad and oct trees. Today, binary trees. ;-) There was a generic tree lib created during the SOC. You can find it at http://boost-consulting.com/vault/index.php?directory=Containers Beware, this is not for binary tree only. Christian

What do you mean by disk one? On 10/24/06, Jose <jmalv04@gmail.com> wrote:
On 10/24/06, Christian Henning <chhenning@gmail.com> wrote:
OK, yesterday the request for quad and oct trees. Today, binary trees. ;-)
The SOC project is for a memory container not a disk one _______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

I think he is talking about BTrees, as in http://en.wikipedia.org/wiki/B-tree, and not normal binary trees. These are used for example by RDBMS engines for index storage. They are self-balancing in that each node can contain a variable number of keys between a given range. Nodes split when they become "too full" (likewise they merge when sufficient space is available through removal of keys). Joe On 10/24/06, Christian Henning <chhenning@gmail.com> wrote:
What do you mean by disk one?
On 10/24/06, Jose <jmalv04@gmail.com> wrote:
On 10/24/06, Christian Henning <chhenning@gmail.com> wrote:
OK, yesterday the request for quad and oct trees. Today, binary trees. ;-)
The SOC project is for a memory container not a disk one _______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

Yes, Btree as in http://en.wikipedia.org/wiki/B-tree I think a Btree would be a nice addition to Boost. I am looking for a generic B-tree that I can use to store any C++ object, with variable length key and data fields On 10/24/06, Joe Drumm <joe.drumm@gmail.com> wrote:
I think he is talking about BTrees, as in http://en.wikipedia.org/wiki/B-tree, and not normal binary trees. These are used for example by RDBMS engines for index storage. They are self-balancing in that each node can contain a variable number of keys between a given range. Nodes split when they become "too full" (likewise they merge when sufficient space is available through removal of keys).
Joe

Jose wrote:
Yes, Btree as in http://en.wikipedia.org/wiki/B-tree I think a Btree would be a nice addition to Boost.
I am looking for a generic B-tree that I can use to store any C++ object, with variable length key and data fields
I considered submitting a Boost B-tree a long time ago, and did a trial implementation. But I ran into a lot of issues, so scraped it. At the beginning of the past summer I looked at B-trees again, this time using the Lehman and Yao locking algorithm. It looks doable, but would take more time than I have available. I also came to the conclusion that C++0x Concepts would be a great help in the design. There are a bunch of types involved, and I was having a trouble getting the interrelationships right. With Concepts, the compiler will tell me if I mess up, speeding the development process. Because my interest is high performance B-trees with binary data formats portable to any system (regardless of compiler, endianness, alignment, or other constraints), I was not trying to handle any random C++ object. I was, however, designing in the ability to handle variable length key and data fields. If you have experience implementing C++ B-trees and are interested in submitting something to Boost, I'd like to hear you ideas. --Beman

Beman Dawes <bdawes@acm.org> writes:
Jose wrote:
Yes, Btree as in http://en.wikipedia.org/wiki/B-tree I think a Btree would be a nice addition to Boost.
I am looking for a generic B-tree that I can use to store any C++ object, with variable length key and data fields
I considered submitting a Boost B-tree a long time ago, and did a trial implementation. But I ran into a lot of issues, so scraped it.
At the beginning of the past summer I looked at B-trees again, this time using the Lehman and Yao locking algorithm. It looks doable, but would take more time than I have available.
I also came to the conclusion that C++0x Concepts would be a great help in the design. There are a bunch of types involved, and I was having a trouble getting the interrelationships right. With Concepts, the compiler will tell me if I mess up, speeding the development process.
You can get pretty close using the concept checking library today. It's terribly underused, is capable of uncovering nearly all the same kinds of problems, and the syntax of use has recently become lots more like C++0x (not documented yet, though). -- Dave Abrahams Boost Consulting www.boost-consulting.com

David Abrahams wrote:
Beman Dawes <bdawes@acm.org> writes:
Jose wrote:
Yes, Btree as in http://en.wikipedia.org/wiki/B-tree I think a Btree would be a nice addition to Boost.
I am looking for a generic B-tree that I can use to store any C++ object, with variable length key and data fields I considered submitting a Boost B-tree a long time ago, and did a trial implementation. But I ran into a lot of issues, so scraped it.
At the beginning of the past summer I looked at B-trees again, this time using the Lehman and Yao locking algorithm. It looks doable, but would take more time than I have available.
I also came to the conclusion that C++0x Concepts would be a great help in the design. There are a bunch of types involved, and I was having a trouble getting the interrelationships right. With Concepts, the compiler will tell me if I mess up, speeding the development process.
You can get pretty close using the concept checking library today. It's terribly underused, is capable of uncovering nearly all the same kinds of problems, and the syntax of use has recently become lots more like C++0x (not documented yet, though).
Good point. Now when the docs get updated...:) --Beman

On 10/25/06, Beman Dawes <bdawes@acm.org> wrote:
Because my interest is high performance B-trees with binary data formats portable to any system (regardless of compiler, endianness, alignment, or other constraints), I was not trying to handle any random C++ object. I was, however, designing in the ability to handle variable length key and data fields.
I have the same interests/requirements. As for defining the classes, I like the approach followed by GigaBASE, where you use type descriptors for all classes used in the database (see example below or http://www.garret.ru/~knizhnik/gigabase/GigaBASE.htm ) class Contract { public: dbDateTime delivery; int4 quantity; int8 price; dbReference<Detail> detail; dbReference<Supplier> supplier; TYPE_DESCRIPTOR((KEY(delivery, INDEXED), KEY(quantity, INDEXED), KEY(price, INDEXED), RELATION(detail, contracts), RELATION(supplier, contracts))); }; If you have experience implementing C++ B-trees and are interested in
submitting something to Boost, I'd like to hear you ideas.
Although my experience is limited, I would be interested to contribute/test any new implementation you or others propose. I know of one good C++ implementation that requires XFS filesystem but I am looking for one implementation that works with any filesystem so that it is "portable to any system". Let me know if you plan to continue your work! regards jose

What do you mean by disk one?
I guess the OP is thinking of a particular use of B-tree's where each node of the B-tree is ~one disk cluster. I think this is commonly used to optimize disk access for data in databases, filesystems etc when the B-tree is very large and cannot completely be stored in-memory.

On 10/24/06, Christian Henning <chhenning@gmail.com> wrote:
What do you mean by disk one?
A B-tree is not your typical binary tree - it is a variation on that is usually only seen on disk (databases, filesystems, etc). It doesn't see much use in memory. See http://en.wikipedia.org/wiki/B-tree I would be interested in a nice B-tree implementation, though I'm not sure if it would belong in Boost.
On 10/24/06, Jose <jmalv04@gmail.com> wrote:
On 10/24/06, Christian Henning <chhenning@gmail.com> wrote:
OK, yesterday the request for quad and oct trees. Today, binary trees. ;-)
The SOC project is for a memory container not a disk one _______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
-- Cory Nelson http://www.int64.org

Jose wrote:
On 10/24/06, Christian Henning <chhenning@gmail.com> wrote:
OK, yesterday the request for quad and oct trees. Today, binary trees. ;-)
The SOC project is for a memory container not a disk one
Not strictly true. The SoC project is for consistent algorithms and concepts for any kind of tree. The idea is the same as for the current STL algorithms, of working with any form of container. -- -- Grafik - Don't Assume Anything -- Redshift Software, Inc. - http://redshift-software.com -- rrivera/acm.org - grafik/redshift-software.com -- 102708583/icq - grafikrobot/aim - grafikrobot/yahoo

Am Dienstag, den 24.10.2006, 20:40 -0500 schrieb Rene Rivera:
Jose wrote:
The SOC project is for a memory container not a disk one
Not strictly true. The SoC project is for consistent algorithms and concepts for any kind of tree. The idea is the same as for the current STL algorithms, of working with any form of container.
Admittedly, there isn't much B-tree related code in the repository yet; to the not-so involved, the current strategy is having a b_tree adaptor wrap around a multiway_tree adaptor wrapping around an nary_tree, and all that's there as of yet are some experiments implementing the latter two. The idea is, however, providing common bases to different applications (a B-tree, like a B*-tree, both being multiway trees) such that this framework maps relations between different trees somewhat consistently. That said, any contributions re the "disk" part would be very welcome, as I don't have any experience with all the locking/invalidating necessary. I was hoping this was/could be made a pure allocator issue, but I'd be glad if anybody enlightened me if there're obstructions to that approach. Bernhard

Bernhard Reiter wrote:
Am Dienstag, den 24.10.2006, 20:40 -0500 schrieb Rene Rivera:
Jose wrote:
The SOC project is for a memory container not a disk one Not strictly true. The SoC project is for consistent algorithms and concepts for any kind of tree. The idea is the same as for the current STL algorithms, of working with any form of container.
Admittedly, there isn't much B-tree related code in the repository yet; to the not-so involved, the current strategy is having a b_tree adaptor wrap around a multiway_tree adaptor wrapping around an nary_tree, and all that's there as of yet are some experiments implementing the latter two. The idea is, however, providing common bases to different applications (a B-tree, like a B*-tree, both being multiway trees) such that this framework maps relations between different trees somewhat consistently.
That said, any contributions re the "disk" part would be very welcome, as I don't have any experience with all the locking/invalidating necessary. I was hoping this was/could be made a pure allocator issue, but I'd be glad if anybody enlightened me if there're obstructions to that approach.
Managing the b-tree page cache is one of the prime difficulties with trying to reuse code developed for in-memory uses. Iterators have to be constant iterators and a separate update() member provided to change the contents of the tree. Otherwise there is no way to know when the page cache is dirty. And if the implementation doesn't know when the cache is dirty, all touched pages have to be rewritten to disk; that is a showstopper performance wise. An implementation using memory-mapped files could avoid that problem, but would be useless for a significant number of real-world apps on 32-bit platforms, because the file size would be limited. Multiple window memory-mapping might work, but that is a lot to bite off. --Beman

Christian Henning wrote:
OK, yesterday the request for quad and oct trees. Today, binary trees. ;-)
IIUC, the original poster didn't ask for a binary trees, but for B-trees (see http://en.wikipedia.org/wiki/B-tree ). Regards, m Send instant messages to your online friends http://au.messenger.yahoo.com

Sorry, I was missreading it. On 10/24/06, Martin Wille <mw8329@yahoo.com.au> wrote:
Christian Henning wrote:
OK, yesterday the request for quad and oct trees. Today, binary trees. ;-)
IIUC, the original poster didn't ask for a binary trees, but for B-trees (see http://en.wikipedia.org/wiki/B-tree ).
Regards, m Send instant messages to your online friends http://au.messenger.yahoo.com _______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
participants (10)
-
Beman Dawes
-
Bernhard Reiter
-
Christian Henning
-
Cory Nelson
-
David Abrahams
-
Joe Drumm
-
Jose
-
Martin Wille
-
Rene Rivera
-
Vijai Kalyan