
On 2012-12-20 16:45, Beman Dawes wrote:
2) Data compression for pages (smaller file sizes, less memory usage; ordered data compresses very well)
I'm strongly against that. The full rationale for not doing compression is a lengthy research paper, but the bottom line is what Rudolf Bayer said so many years ago with regard to prefix and suffix key compression: the increased complexity and reduced reliability make compression very unattractive.
The problems of compression are closely related to the problems of variable length data. If the application is willing to tolerate sequential (rather than binary) search at the page level, or is willing to tolerate an indexed organization on pages, or even (gasp!) an additional disk access per comparison, these problems aren't necessarily showstoppers. But if some applications won't tolerate even a dispatch (either a virtual function or a hand-rolled dispatch) to select the approach being employed, then the only choice I can see is to provide essentially multiple sets of classes, and that gets complex and messy.
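To make the dispatch point concrete, here is a minimal sketch of what selecting a page-level search strategy through a virtual function might look like. The names (`page_search`, `binary_search`, `sequential_search`) are illustrative only, not from any proposed library; the cost being debated is the indirect call per lookup.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical page-search policies: fixed-size entries permit binary
// search, while variable-length (e.g. compressed) entries force a
// sequential scan. The virtual call is the "dispatch" under discussion.
struct page_search {
    virtual ~page_search() = default;
    // Returns the index of key in the page, or -1 if absent.
    virtual int find(const std::vector<std::uint32_t>& keys,
                     std::uint32_t key) const = 0;
};

struct binary_search : page_search {
    int find(const std::vector<std::uint32_t>& keys,
             std::uint32_t key) const override {
        auto it = std::lower_bound(keys.begin(), keys.end(), key);
        return (it != keys.end() && *it == key)
                   ? static_cast<int>(it - keys.begin())
                   : -1;
    }
};

struct sequential_search : page_search {
    int find(const std::vector<std::uint32_t>& keys,
             std::uint32_t key) const override {
        for (std::size_t i = 0; i < keys.size(); ++i)
            if (keys[i] == key) return static_cast<int>(i);
        return -1;
    }
};
```

Applications unwilling to pay for the virtual call would instead need the strategy baked in at compile time, which is where the "multiple sets of classes" complexity comes from.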
At what point do you expect serialization to be executed? Do pages need to be kept in a serialized state in memory? Is it wrong to, say, represent a page with a std::map in memory, and serialize to a binary page when writing? In the latter case, compression seems to be less difficult.
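The suggestion above can be sketched as follows. This is a hedged illustration, not a proposed design: a page lives in memory as a `std::map` and is flattened to a binary buffer only at write time, which is the point where compression could be applied. The fixed-width key/value layout and the `serialize`/`deserialize` names are assumptions for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <map>
#include <vector>

// Hypothetical in-memory page: a sorted map of fixed-width keys and
// values, flattened to a binary buffer only when written to disk.
using page = std::map<std::uint32_t, std::uint32_t>;

std::vector<char> serialize(const page& p) {
    std::vector<char> buf;
    buf.reserve(p.size() * 8);  // 4 bytes key + 4 bytes value per entry
    for (const auto& kv : p) {
        const char* k = reinterpret_cast<const char*>(&kv.first);
        const char* v = reinterpret_cast<const char*>(&kv.second);
        buf.insert(buf.end(), k, k + 4);
        buf.insert(buf.end(), v, v + 4);
    }
    return buf;  // a compression pass could be applied here
}

page deserialize(const std::vector<char>& buf) {
    page p;
    for (std::size_t i = 0; i + 8 <= buf.size(); i += 8) {
        std::uint32_t k, v;
        std::memcpy(&k, buf.data() + i, 4);
        std::memcpy(&v, buf.data() + i + 4, 4);
        p[k] = v;
    }
    return p;
}
```

The trade-off, of course, is that every page load now pays a deserialization cost and the in-memory representation no longer matches the on-disk one.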
3) Ability for user to provide custom read/write mutexes (fake mutex, interprocess mutex, std::mutex)
There is a spectrum of needs. I've seen various designs that are optimal for various points in that spectrum. Can you point to any design that is optimal across the spectrum from single thread, single process, single machine, on up through multi-thread, multi-process, multi-machine?
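For concreteness, the pluggable-mutex idea in 3) is usually expressed as a policy template parameter. A minimal sketch, with illustrative names (`null_mutex`, `btree_stub`) that are not from any proposed library: a no-op mutex for the single-thread end of the spectrum, `std::mutex` for multi-thread, and in principle an interprocess mutex slotting in the same way.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical "fake mutex" satisfying BasicLockable: single-threaded
// users pay nothing, since lock()/unlock() compile to no-ops.
struct null_mutex {
    void lock() {}
    void unlock() {}
};

// Container stub parameterized on the locking policy.
template <class Mutex = null_mutex>
class btree_stub {
public:
    void insert(int key) {
        std::lock_guard<Mutex> guard(mutex_);  // no-op for null_mutex
        (void)key;  // real insertion logic omitted
        ++size_;
    }
    int size() const { return size_; }

private:
    Mutex mutex_;
    int size_ = 0;
};
```

The open question Beman raises still stands: a single policy parameter handles the thread dimension, but nothing in this shape addresses multi-process or multi-machine coordination.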
MVCC for b-trees comes close. See, e.g., http://guide.couchdb.org/draft/btree.html

Cheers,
Rutger