On Wed, Sep 15, 2010 at 10:40 PM, Cory Nelson
On Wed, Sep 15, 2010 at 6:08 PM, Beman Dawes
wrote: On Wed, Sep 15, 2010 at 7:26 PM, Cory Nelson
wrote: - Will this support multiple B-trees in one file?
Not currently. That would be relatively easy to support. What is the motivating use case?
Just ease of distribution, really.
I'm currently using a SQLite database to distribute a few tables of read-only data that doesn't need SQL, relational, or ACID properties -- it's just the best thing available in terms of ease of distribution and portability. I'd love to remove all the unneeded layers and just use a B-tree directly.
Sounds quite useful. I can think of several past apps that would have benefited. I've added this to the Wish List: Multiple B-trees in one file. Requested by Cory Nelson. Possible design: Initial btree_map in the file is an index of its children. Key: name, Data: page id of the child's header page. Recursion allowed; the header page pointed could itself be an index header. Current interface might gain additional constructor and open() overload taking a ref to the parent index and the desired name.
- Will this implement ACID properties?
No. The model is a standard library associative container, not a database. ACID properties can be built on top of B-trees, but the B-trees themselves don't have ACID properties, other than a few aspects like atomic writes.
Alright, that clears up my understanding about the scope of this. Still sounds great!
- Can this be adapted for in-memory use as well, with full non-POD support?
No current plans for that. Why wouldn't you just a standard library associative container for that?
Storing keys together improves cache usage a lot for small keys, and allows the CPU to detect ordered iteration to prefetch the next cache line. It also reduces fragmentation and the maximum number of pages that must be touched to find a result, which in turn reduces TLB misses and page faults. The tests here indicate that this can be very significant with large collections:
That's also the point made by the article Thorsten referenced, http://queue.acm.org/detail.cfm?id=1814327, although it is claiming much greater speedups IIRC.
http://idlebox.net/2007/stx-btree/stx-btree-0.8.3/doxygen-html/speedtest.htm...
Interesting. So someone has already done the work for drop in STL associative container replacements. The timings only covered small trees. 16,000 was the number of elements mentioned. I'm testing with up to 100 million elements, and plan to expand that to a few billion elements. Thanks, --Beman