
Beman et al,

Is the Boost B-tree Library still on track to be proposed? The reception at BoostCON 2011 was quite warm, IIRC.
http://github.com/boostcon/2011_presentations/raw/master/tue/proposed_b_tree...

Activity on github seems to have dried up.
https://github.com/Beman/Boost-Btree

Also, how well do you think it would work with large keys and small payloads (512-byte keys, a few integers for payload)?

Thanks!

Joel

On Tue, Feb 21, 2012 at 8:15 PM, Joel <jdy@cryregarder.com> wrote:
> Beman et al,
>
> Is the Boost B-tree Library still on track to be proposed? The reception at BoostCON 2011 was quite warm, IIRC.
> http://github.com/boostcon/2011_presentations/raw/master/tue/proposed_b_tree...
>
> Activity on github seems to have dried up.
I'm totally distracted by other priorities. The hangup is variable length data, particularly strings. I still don't think I've got the right solution to that problem.
> Also, how well do you think it would work with large keys and small payloads (512-byte keys, a few integers for payload)?
I haven't tested that particular case with the current implementation, but in general large keys relative to payload aren't a problem. Page sizes need to be set realistically, of course.

Cheers,

--Beman

Beman Dawes <bdawes <at> acm.org> writes:
> The hangup is variable length data, particularly strings. I still don't think I've got the right solution to that problem.
Is that really a critical shortfall for b-trees? I know it is important for general-purpose containers. Also, do you see fixing variable-length data storage as changing the interface? Could the library go in using just fixed-length records?

Thanks for the update! I hope it bubbles up to the top of your queue again some year :-)

Joel

Beman Dawes <bdawes <at> acm.org> writes:
> I'm totally distracted by other priorities.
No problem. I've been using the btree library heavily for the past week or so with boost 1.47 on a linux machine with 24G RAM. I'm using 128-bit keys and a 196-bit payload. I've successfully built files of 100,000,000 records. Past that point it slows down dramatically because RAM is exhausted and the system spends all its time writing to disk. I don't view this as a problem with the library.

What I am looking at doing is building the database in chunks on multiple machines, and then merging the chunks (and then perhaps sharding the results into smaller pieces again). Do you have any suggestions on the most efficient way of merging two btree files? In your expectation, is it better to merge big into small, small into big, or both into new? Or should I write a low-level merge that operates at the node level? The goal is to be able to support 1-10 gig of records.

Also, is there a way to destroy the b-tree WITHOUT flushing the buffers to disk? There are times when I've built a temporary b-tree, finished exploiting it, and don't need to get it written out.

I am enjoying the library. Thanks!

Joel
participants (3)
- Beman Dawes
- Joel
- Joel Young