
Stefan Strasser wrote:
[...]
with very poor performance, but that's mostly due to the current storage backend, using one file for each object, so you end up with thousands of files, a few bytes each.
I think it is OK for a prototype to be slow. Postgres uses a hierarchy of caches for all the data stored in its tables. And when it does read from the disk, it reads a "page" that may contain data belonging to other tuples in addition to the data for the target tuple. Other databases probably do something similar. The data read from disk is not transformed unless some logic requires the value of a specific attribute of a tuple. To get an attribute value you ask the cache subsystem for it. During this call the value may be transformed from the on-disk representation, or it may simply already be in the cache. I know that writes are also done in pages, but I do not know the exact details.

What I'm trying to say is that the performance can be improved considerably without changing the user interface, but a fast implementation will probably require a lot of effort. Though unless you track individual fields, you will have to load objects as a whole. And you will probably have to write your own serialization layer, as the boost::serialization library will not allow this kind of optimization.
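
To make the read path concrete, here is a rough sketch of what page-level reads with lazy attribute decoding might look like. All of the names and the single-file layout are hypothetical, and the 8 KB page size just mirrors Postgres' default; a real buffer manager is of course much more involved:

// Toy page cache: reads fixed-size pages from one data file and
// decodes an attribute only when it is actually requested.
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <fstream>
#include <unordered_map>

constexpr std::size_t kPageSize = 8192;  // same default as Postgres
using Page = std::array<char, kPageSize>;

class PageCache {
public:
    explicit PageCache(const char* path) : file_(path, std::ios::binary) {}

    // Return the page with the given page number, reading it from
    // disk at most once.
    const Page& get_page(std::uint64_t page_no) {
        auto it = cache_.find(page_no);
        if (it != cache_.end())
            return it->second;  // cache hit: no I/O, no decoding
        Page page{};
        file_.seekg(static_cast<std::streamoff>(page_no * kPageSize));
        file_.read(page.data(), kPageSize);  // one read, many tuples
        return cache_.emplace(page_no, page).first->second;
    }

    // Decode a single 4-byte attribute at an absolute file offset.
    // The rest of the page stays in its raw on-disk representation.
    // (Assumes the value does not straddle a page boundary.)
    std::uint32_t read_u32_at(std::uint64_t offset) {
        const Page& page = get_page(offset / kPageSize);
        std::uint32_t value;
        std::memcpy(&value, page.data() + offset % kPageSize, sizeof value);
        return value;
    }

private:
    std::ifstream file_;
    std::unordered_map<std::uint64_t, Page> cache_;
};

The point is that a single seek-and-read brings in a whole page worth of tuples, and a later request for a neighbouring tuple's attribute is served from memory without touching the disk again.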
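
And to illustrate the field-tracking point: something like the toy wrapper below (again, purely hypothetical names) is needed so that on commit only the dirty fields get serialized back instead of the whole object. Every mutation has to go through a hook, and boost::serialization gives you no such hook, hence the custom serialization layer:

// Toy per-field dirty tracking.
#include <utility>

template <typename T>
class tracked {
public:
    explicit tracked(T value = T()) : value_(std::move(value)) {}

    const T& get() const { return value_; }

    void set(T value) {            // every write goes through here...
        value_ = std::move(value);
        dirty_ = true;             // ...so the store knows what changed
    }

    bool dirty() const { return dirty_; }
    void clear_dirty() { dirty_ = false; }

private:
    T value_;
    bool dirty_ = false;
};

struct Person {
    tracked<int> age;
    tracked<long> salary;
};

int main() {
    Person p;
    p.salary.set(1000);
    // Only p.salary is dirty; on commit p.age need not be rewritten.
}

Ilya Bobir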