
Robert Ramey wrote:
I guess I have two questions: 1. Won't serialization fail in some way if I just seek the stream to the position found in the index and try reading? 2. For random access I need to make sure that all saved objects have an export key. How do I do that? Not necessarily out of the box, but where can I plug in the check?
Random access into an archive would require some thought. First, it's not clear how it would be used. In general archives are nested structures; one could de-serialize an inner piece - but to where? The original data structure was embedded in something else which is no longer there. So this can really only be considered in the context of a specific application. Here are a couple of plausible scenarios.
a) A general purpose archive browser - this could browse the archive in a random manner but wouldn't actually de-serialize the data. I don't see any real problem here. One would need to create an index either as a side effect of archive creation or with a second pass over the final archive.
I'd rather like to deserialize the data when needed.
b) Using serialization for data-state logging:
Log(Archive ar, statedata) {
    // save seek point to index
    // append to archive
    ar << statedata;
}

Recover(Archive ar, statedata &, seekpoint) {
    // set stream seek point
    ar >> statedata;
}
I could envision something like this being made to work.
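One way it might be made to work - purely a sketch, assuming a fresh binary archive per log entry with archive headers suppressed so each entry is self-contained at its seek point (the seek_index vector and the per-entry granularity are illustrative, not library facilities):

    #include <cstddef>
    #include <fstream>
    #include <vector>
    #include <boost/archive/binary_oarchive.hpp>
    #include <boost/archive/binary_iarchive.hpp>

    std::vector<std::streampos> seek_index;  // seek point of each logged entry

    template<class StateData>
    void Log(std::ofstream &os, const StateData &statedata) {
        seek_index.push_back(os.tellp());    // save seek point to index
        // a fresh archive per entry, headers suppressed, so each entry can
        // be read in isolation (object tracking will not span entries)
        boost::archive::binary_oarchive ar(os, boost::archive::no_header);
        ar << statedata;                     // append to archive
    }

    template<class StateData>
    void Recover(std::ifstream &is, StateData &statedata, std::size_t entry) {
        is.seekg(seek_index[entry]);         // set stream seek point
        boost::archive::binary_iarchive ar(is, boost::archive::no_header);
        ar >> statedata;
    }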
Yes, I'm looking for something like this.
So I would say that generally, serialization would be expected to be a serial operation (surprise!). On the other hand, in certain special situations it might be possible/convenient to provide for some random access, but I would expect that to be application specific.
Hmm... gotta look into this. What about my second question:
2. For random access I need to make sure that all saved objects have an export key. How do I do that? Not necessarily out of the box, but where can I plug in the check?
one for dynamic arrays with element-wise save
I would envision one using:

ar << element_count;
for(i = 0; i < element_count; i++)
    ar << element[i];
I'm not sure that it's worth adding and documenting such an obvious and trivial thing as a separate wrapper.
But on loading you need to add a 'new[]' call. So load and save become non-symmetric, and you need to split serialize, which is rather inconvenient.
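To make the asymmetry concrete, here is a sketch using the library's split save/load machinery, assuming the array is held as a (pointer, count) pair in a made-up Buffer struct:

    #include <boost/serialization/split_free.hpp>

    struct Buffer {            // hypothetical holder for a dynamic array
        int *element;
        unsigned element_count;
    };

    namespace boost { namespace serialization {

    template<class Archive>
    void save(Archive &ar, const Buffer &b, const unsigned int /*version*/) {
        ar << b.element_count;
        for(unsigned i = 0; i < b.element_count; i++)
            ar << b.element[i];
    }

    template<class Archive>
    void load(Archive &ar, Buffer &b, const unsigned int /*version*/) {
        ar >> b.element_count;
        b.element = new int[b.element_count];  // the step save doesn't have
        for(unsigned i = 0; i < b.element_count; i++)
            ar >> b.element[i];
    }

    }} // namespace boost::serialization

    BOOST_SERIALIZATION_SPLIT_FREE(Buffer)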
and another for dynamic arrays with binary save.
I see the binary object as filling that role.
ar << make_binary_object(dynamic_array_address, element_size * element_count);
I once did propose a manual section named "Serialization Wrappers" which would have a couple of examples and highlight the fact that nvp and binary_object are instances of this concept. The idea was received unenthusiastically at the time, but now I'm becoming convinced it's more effective to explain and illustrate the concept than to try to anticipate all the wrappers that users might need. Actually, I think having such a section would have avoided the confusion surrounding the intended usage and purpose of binary_object.
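For what it's worth, such a section would presumably open by showing the two existing wrappers side by side - something along these lines (the surrounding names are just illustrative):

    #include <boost/serialization/nvp.hpp>
    #include <boost/serialization/binary_object.hpp>

    template<class Archive>
    void save_settings(Archive &ar, double &threshold, char (&raw)[256]) {
        // nvp: attaches a name, so XML archives can emit <threshold>...</threshold>
        ar << boost::serialization::make_nvp("threshold", threshold);
        // binary_object: treats the memory as an opaque block of bytes
        ar << boost::serialization::make_binary_object(raw, sizeof(raw));
    }

Both are thin reference-holding structs whose serialize() changes how the wrapped data is rendered, not what is stored - which is the concept the section would spell out.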
Sure, docs rarely hurt.
I agree. I actually have a *crazy* but cute idea that one can use the file offset for the object id. How are object ids assigned, and can I customize that process? That would keep overhead at an absolute minimum.
Object ids are assigned sequentially starting with 0. They are only used for classes which require tracking (e.g. when instances are serialized through pointers). They are used as indices into a vector, so using these indices keeps overhead to a minimum. There are cases when an object id is assigned but not written to the archive. I don't see these as having much utility outside of serialization.
So it's not easily possible to plug in a different algorithm?
Ah, I've missed that. Do I need to provide both 'type' and 'value'? Can't the serialization library work with just one?
Could be. I just provided both so that I could interoperate with mpl without having to think about each specific case.
But for the user this can be inconvenient.
Actually, my archive initially had only one (non-templated) 'save', for unsigned. I got a compile error until I declared a 'save' for const char*. I'm not sure why.
It's probably because of the above (pointer to a primitive type). Attempts to serialize instances of types whose implementation level is set to not-serializable will result in compile-time assertions. (These assertions are sometimes the reason for the deeply nested mpl errors.)
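Concretely, the implementation level is a per-type trait set with a macro, and the pointer-to-primitive trap looks like this (telemetry is a made-up type):

    #include <boost/serialization/level.hpp>

    class telemetry { /* ... */ };

    // implementation level is a compile-time trait; not_serializable turns
    // any attempt to serialize a telemetry instance into a static assertion
    BOOST_CLASS_IMPLEMENTATION(telemetry, boost::serialization::not_serializable)

    // likewise, char is a primitive type, so serializing a char* - a pointer
    // to a primitive - trips a compile-time assertion:
    //     const char *s = "abc";
    //     ar << s;   // static assertion fires here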
In fact, it looked like the library tried to save a char* somewhere... I'll take a second look.
Right. I think this problem can be addressed with a wrapper for dynamic arrays:
char* str(0);
ar >> make_dynarray_wrapper(str);
so that the library allocates the string itself.
How does it know what size to make the array? Maybe you mean
ar >> make_dynarray_wrapper(str, number_of_elements * element_size)
I actually meant:

int size;
char* str;
ar & make_dynarray_wrapper(str, size);

On load, 'size' is initialized to the size of the data and 'str' is new[]-ed.
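A sketch of what that (so far hypothetical) wrapper could look like, with the split hidden inside the wrapper so calling code keeps a single '&' line; as with nvp, a production version would also have to set the library's wrapper/tracking traits so the temporary is accepted on save:

    #include <boost/serialization/split_member.hpp>

    template<class T>
    struct dynarray_wrapper {
        T *&ptr;
        int &size;
        dynarray_wrapper(T *&p, int &s) : ptr(p), size(s) {}

        template<class Archive>
        void save(Archive &ar, const unsigned int /*version*/) const {
            ar << size;
            for(int i = 0; i < size; i++)
                ar << ptr[i];
        }
        template<class Archive>
        void load(Archive &ar, const unsigned int /*version*/) {
            ar >> size;
            ptr = new T[size];          // the library allocates for the user
            for(int i = 0; i < size; i++)
                ar >> ptr[i];
        }
        BOOST_SERIALIZATION_SPLIT_MEMBER()
    };

    template<class T>
    dynarray_wrapper<T> make_dynarray_wrapper(T *&p, int &s) {
        return dynarray_wrapper<T>(p, s);
    }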
Of course for char arrays one could use
ar << binary_object(str, element_count * element_size)
And it would still be portable.
Again, how would I load the data? I'd need to new[] the array myself, and this leads to split serialization.
I have recently got the polymorphic archive working on my machine. Only a couple of really small changes to the library code were required. For the list of primitive types I included all portable C++ primitive types (that is, no long long, __int64, etc.). I'm not sure how much interest the polymorphic archive will engender, and it's a little hard to understand until you've actually spent enough time with the library to appreciate the limitations of templated code in certain scenarios. So although it's clean (and clever), it's not going to be easy to explain.
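To make the payoff concrete: with a polymorphic archive, serialization code can be written once against the abstract archive interface and compiled in a single translation unit, rather than being re-instantiated as a template for every archive type. A minimal sketch:

    #include <fstream>
    #include <boost/archive/polymorphic_oarchive.hpp>
    #include <boost/archive/polymorphic_text_oarchive.hpp>

    // compiled once; no archive template parameter leaks into the interface
    void save_count(boost::archive::polymorphic_oarchive &ar, const int &count) {
        ar << count;   // dispatched through virtual functions
    }

    int main() {
        std::ofstream os("log.txt");
        boost::archive::polymorphic_text_oarchive oa(os);
        save_count(oa, 42);
    }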
This is great news. Is it possible to make BOOST_EXPORT always register classes with the polymorphic archive and use the polymorphic archive as a fallback for serializing classes? - Volodya