
On Fri, Dec 21, 2012 at 3:26 AM, Rutger ter Borg <rutger@terborg.net> wrote:
On 2012-12-20 16:45, Beman Dawes wrote:
2) Data compression for pages (smaller file sizes, less memory usage; ordered data compresses very well)
I'm strongly against that. The full rationale for not doing compression would fill a lengthy research paper, but the bottom line is what Rudolf Bayer said so many years ago with regard to prefix and suffix key compression: the increased complexity and reduced reliability make compression very unattractive.
The problems of compression are closely related to the problems of variable-length data. If the application is willing to tolerate sequential (rather than binary) search at the page level, or an indexed organization on pages, or even (gasp!) an additional disk access per comparison, these problems aren't necessarily showstoppers. But if some applications won't tolerate even a dispatch (either a virtual function or a hand-rolled equivalent) to select the approach being employed, then the only choice I can see is to provide what amounts to multiple sets of classes, and that gets complex and messy.
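For concreteness, here is a minimal sketch of the kind of page-level dispatch being described: a common search interface, one implementation per on-page layout, selected at run time at the cost of an indirection per page visited. The class and function names are invented for illustration and are not part of the proposed library.

    #include <cstdint>
    #include <cstring>

    struct page_view { const char* data; std::uint32_t size; };

    struct page_search {
        virtual ~page_search() = default;
        // return byte offset of the first entry >= key, or size if none
        virtual std::uint32_t lower_bound(const page_view& pg,
                                          const char* key, std::uint32_t klen) const = 0;
    };

    // Variable-length entries stored as [u32 length][bytes]...: sequential scan.
    struct variable_entry_scan : page_search {
        std::uint32_t lower_bound(const page_view& pg,
                                  const char* key, std::uint32_t klen) const override {
            std::uint32_t pos = 0;
            while (pos < pg.size) {
                std::uint32_t len;
                std::memcpy(&len, pg.data + pos, sizeof len);
                const char* entry = pg.data + pos + sizeof len;
                int cmp = std::memcmp(entry, key, len < klen ? len : klen);
                if (cmp > 0 || (cmp == 0 && len >= klen)) return pos;
                pos += sizeof len + len;
            }
            return pg.size;
        }
    };
    // A fixed_entry_binary_search could implement the same interface; applications
    // that cannot tolerate even this indirection would need separate class
    // templates per layout, which is the "multiple sets of classes" complexity
    // referred to above.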
At what point do you expect serialization to be executed?
It is a btree. That's a very specific data structure that is the same in memory and on disk. There is no serialization.
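To illustrate what "the same in memory and on disk" means, here is a minimal sketch assuming a fixed-size page with fixed-length entries; the field names and layout are illustrative assumptions, not the library's actual page format.

    #include <cstddef>
    #include <cstdint>
    #include <cstdio>

    constexpr std::size_t page_size = 4096;

    struct btree_page {
        std::uint32_t level;        // 0 for leaf pages
        std::uint32_t entry_count;
        char entries[page_size - 2 * sizeof(std::uint32_t)];  // packed fixed-length entries
    };
    static_assert(sizeof(btree_page) == page_size, "page must match the disk block size");

    // Reading a page is a single positioned read into the in-memory object;
    // there is no per-field serialization or deserialization step.
    bool read_page(std::FILE* f, std::uint64_t page_id, btree_page& pg) {
        if (std::fseek(f, static_cast<long>(page_id * page_size), SEEK_SET) != 0)
            return false;
        return std::fread(&pg, page_size, 1, f) == 1;
    }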
Do pages need to be kept in a serialized state in memory?
It makes no sense to talk about serializing a btree. If you serialized it, it would no longer be a btree.
Is it wrong to, say, represent a page with a std::map in memory, and serialize to a binary page when writing?
That would be very, very slow because of the serialization. Since Bayer invented both the btree and the red-black tree used to implement maps, presumably he would have combined them if there were any advantage.
In the latter case, compression seems to be less difficult.
One commonly used approach to dealing with large volumes of variable-length data, such as compressed data, is to put the data in a sequential file that can be accessed randomly, and put the key into a btree whose payload is only the position (and possibly the length) of the actual data in the sequential file. In other words, use the btree as an index to the data rather than as a container for the data itself.
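A rough sketch of that index-plus-sequential-file arrangement, with invented names (data_locator, append_record) and a std::map standing in for the disk-based btree:

    #include <cstdint>
    #include <cstdio>
    #include <map>
    #include <string>
    #include <vector>

    struct data_locator {          // the only payload stored in the btree
        std::uint64_t offset;      // position of the record in the sequential file
        std::uint32_t length;      // length of the (possibly compressed) record
    };

    // Stand-in for the btree: key -> locator.  A disk-based btree would hold the
    // same small, fixed-size payload per key.
    using index_type = std::map<std::string, data_locator>;

    // Append variable-length (or compressed) data to the sequential file and
    // record where it went.
    data_locator append_record(std::FILE* data_file, const std::vector<char>& bytes) {
        std::fseek(data_file, 0, SEEK_END);
        data_locator loc{static_cast<std::uint64_t>(std::ftell(data_file)),
                         static_cast<std::uint32_t>(bytes.size())};
        std::fwrite(bytes.data(), 1, bytes.size(), data_file);
        return loc;
    }

    std::vector<char> read_record(std::FILE* data_file, const data_locator& loc) {
        std::vector<char> bytes(loc.length);
        std::fseek(data_file, static_cast<long>(loc.offset), SEEK_SET);
        std::fread(bytes.data(), 1, bytes.size(), data_file);
        return bytes;
    }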
3) Ability for user to provide custom read/write mutexes (fake mutex, interprocess mutex, std::mutex)
There is a spectrum of needs. I've seen various designs that are optimal for various points in that spectrum. Can you point to any design that is optimal across the spectrum from single thread, single process, single machine, on up through multi-thread, multi-process, multi-machine?
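For reference, one conventional way to expose that choice is a mutex policy template parameter, roughly as sketched below; the btree_map name and interface here are illustrative assumptions, not the proposed library's API.

    #include <mutex>

    struct null_mutex {                 // satisfies the Lockable requirements, does nothing
        void lock() {}
        void unlock() {}
        bool try_lock() { return true; }
    };

    template <class Key, class T, class Mutex = null_mutex>
    class btree_map {
    public:
        void insert(const Key& k, const T& v) {
            std::lock_guard<Mutex> guard(mutex_);   // compiles away for null_mutex
            // ... actual page-level insert would go here ...
            (void)k; (void)v;
        }
    private:
        Mutex mutex_;
    };

    // Multi-threaded, single process:  btree_map<int, int, std::mutex> m;
    // Single-threaded:                 btree_map<int, int> m;   // no locking overhead
    // An interprocess mutex with the same interface could be substituted for
    // multi-process use, but that still does not cover the whole spectrum
    // described above, e.g. the multi-machine end.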
MVCC for b-trees comes close. See, e.g., http://guide.couchdb.org/draft/btree.html
That looks like the usual design for lock avoidance, but it exacts a performance penalty from apps that don't need locking at all. Also, lock-avoidance designs often require that writes be atomic, so they don't work in apps that must rely on non-atomic writes unless some additional mechanism is added. In other words, that design works well in part of the middle of the spectrum, but not across the whole spectrum.

Thanks,

--Beman