
Vladimir Prus wrote:
Robert Ramey wrote:
-- (critical) The primary issue with the library is lack of reference documentation ...
OK, I don't see a problem with this - thanks to Dave for the hint on how to handle it.
For those stubs, the requirements docs that Dave mentioned would be nice. However, there's code that the user calls, and I'd like to see docs for that, too.
For example:
binary_object make_binary_object(void* ptr, unsigned size);
    Returns: binary_object(ptr, size);
binary_object::save(Archive& ar, unsigned version)
    Effects: Calls ar.save_binary(m_ptr, m_size);
binary_object::load(Archive& ar, unsigned version)
    Effects: Calls ar.load_binary(m_ptr, m_size);
Aside - I wouldn't expect load/save/load_binary/save_binary to ever be invoked by a library user.
It would be nice to avoid asking the user to do BOOST_CLASS_EXPORT for all possible argument types. What's desirable is:
    template<class T>
    void register_rpc_function(const char* name, void (*f)(const T&))
    {
        functions[name] = ... ;
        boost::serialization::export_class< function_call_1<T> >::instantiate();
    }
I can't put BOOST_CLASS_EXPORT inside register_rpc_function now.
This situation is discussed in the documentation under the heading "Template Serialization Traits". There is an example that assigns traits to the nvp<T> template. BOOST_CLASS_EXPORT is just a syntactic shorthand for the specialization above. Uh-oh - I just looked at the definition of BOOST_CLASS_EXPORT and I see it's not as obvious as it is for the other traits. I'll take a look at this.
I'm afraid that no matter how smart the XML reader is, it would still have to scan the entire file.
So what's wrong with that? I would guess that such an XML reader already exists somewhere and you can get everything for free. Assuming this doesn't address your need and you want to do some programming, one could create another XML archive version. This would create two output files: one exactly as it is now, along with another "index" file. When I was considering the options regarding XML output I briefly considered the possibility of creating two files - one like we have now - along with an optional parallel file containing the corresponding XML schema describing the XML archive. I decided to keep things simpler. But I think you see the idea.
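Something along these lines - just a sketch, the helper name is made up, and it assumes that offsets taken from the underlying stream are actually usable for seeking later:

    #include <fstream>
    #include <boost/archive/xml_oarchive.hpp>
    #include <boost/serialization/nvp.hpp>

    // hypothetical helper: save one record to the XML archive and note
    // the stream offset it begins at in a separate "index" file
    template<class T>
    void save_indexed(std::ofstream & os, boost::archive::xml_oarchive & oa,
                      std::ofstream & index, const T & t)
    {
        index << static_cast<std::streamoff>(os.tellp()) << '\n'; // where this record starts
        oa << boost::serialization::make_nvp("record", t);        // the record itself
    }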
I can expand the doc a bit. binary_object is just a wrapper around a size and a pointer to permit them to be handled as a pair. It presumes the pointer already points to allocated storage.
This last sentence is exactly what I'd like added to docs.
OK that's easy. Sorry for the confusion.
It also presumes that the size of the thing it's pointing to is the same on save and load. It's very lightweight.
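For what it's worth, usage would look something like this (just a sketch; the struct and member names are made up):

    #include <boost/serialization/binary_object.hpp>

    struct packet
    {
        unsigned char buffer[256];   // already-allocated storage

        template<class Archive>
        void serialize(Archive & ar, const unsigned int /* version */)
        {
            // treat the buffer as one opaque blob of bytes
            ar & boost::serialization::make_binary_object(buffer, sizeof(buffer));
        }
    };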
I don't see a need for the others, but of course any user can make the wrappers he needs.
The reason I think the other is important is that it's actually saving/loading support for plain C++ arrays -- which is a rather basic thing.
Hmmm - the library already implements serialization of plain C++ arrays by serializing each element. This is the more general solution as it calls serialization for each element. The only time one might want to do save/load for a whole array is for a non-portable binary file.

Re: save/load asymmetry for polymorphic pointers.
As I've said previously, this is a bug which is easy to make and very hard to debug. I think we can only wait for others to express an opinion, as we fail to convince each other.
Well, THAT we can agree on.
BTW, one usage of XML archives did occur to me. By checking the name tag on input, we can implement a crude check that save and load operations are synchronized. This would effectively be a debug mode for archives and might be useful.
But you won't catch a case where you save one type and load another.
True, but I think it would help a lot - and practically free to implement.
So, to get zero overhead I need to tweak the base class and disable tracking of pointers. Let me try that... yes, the results are nice. Only one extra element (class id) per saved item.
If you're not serializing pointers, the class_id isn't written to the archive even once. The object id is required for tracking, the class id is required for pointers. If versioning is used, the class id is used once. I do not believe that there is any information stored in an archive which is not used.
BTW, how do I set the tracking level and implementation level for a template class? I think I can partially specialize 'tracking_level', but it should be mentioned in the docs.
See docs section "Template Serialization Traits"
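Roughly, the partial specialization would look like this (a sketch following the pattern in the library headers; my_wrapper stands in for your own template):

    #include <boost/mpl/int.hpp>
    #include <boost/serialization/tracking.hpp>

    // my_wrapper is a stand-in for your own template
    template<class T> class my_wrapper;

    namespace boost { namespace serialization {

    // never track instances of my_wrapper<T>, whatever T is
    template<class T>
    struct tracking_level< my_wrapper<T> >
    {
        typedef mpl::integral_c_tag tag;
        typedef mpl::int_<track_never> type;
        BOOST_STATIC_CONSTANT(int, value = type::value);
    };

    }} // namespace boost::serialization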
- The documentation should really state the minimal set of 'load'/'save' overloads which will make an archive usable. For example, it's probably not necessary to provide a separate overload for 'bool', right?
The document states:
" However, we're not quite done. The above code addresses serialization of all non-primitive types. To be complete, each primitive type must either be covered by a definition of template<typename T> void load(T & t); or an overload of the >> operator"
It's also necessary to provide an overload for char*. Doesn't it count as a primitive type?
The default implementation for char * will work as it does for other pointers - which is probably not what one has in mind. I have code in there for handling it as a C string (it's commented out). I tested it and it worked, but I came to conclude it presented a big security risk. The problem is the following:

    char * str = "abc";
    ar << str;    // no problem

The (text) archive looks like:

    3 abc

Later:

    char str[MAX_STRING_SIZE];
    ar >> static_cast<char *>(str);  // to avoid str being treated as an array

Suppose the text archive gets corrupted to:

    3000 abc............

The archive will load with a buffer overrun - a security risk. So then one should dynamically allocate the storage according to the size - that is, one should be using std::string. So I decided to comment out the code that handles char *.
" If all primitive types have been accounted for, any program with serialization defined should work with the new archive."
Maybe I can expand upon that a little to something like:
"Any program with serialization defined should work with the new archive as long as every primitive type has a matching save/load function prototype or template."
Can I define only one 'load' for unsigned int?
As opposed to? The way I implemented the included archives, I specified load for those types requiring special treatment and used a template as a fallback for the rest. BTW this provided a huge benefit. In the original version of last year I got into a never-ending battle to specify virtual functions, which were dependent on the compiler - long long, etc. It was hopeless - moving to templates solved that.
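In other words, something like this (a sketch only - the class name and the string handling are made up):

    #include <cstddef>
    #include <istream>
    #include <string>

    class my_text_iarchive
    {
        std::istream & m_is;
    public:
        explicit my_text_iarchive(std::istream & is) : m_is(is) {}

        // special treatment: strings may contain spaces, so read length then data
        void load(std::string & s)
        {
            std::size_t size;
            m_is >> size;
            m_is.get();               // skip the separator
            s.resize(size);
            if(size)
                m_is.read(&s[0], static_cast<std::streamsize>(size));
        }

        // everything else - int, unsigned, double, long long, ... - falls through here
        template<class T>
        void load(T & t){ m_is >> t; }
    };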
I've concluded this myself. It's pretty easy given the bjam setup.
Maybe it can be made even easier with a bjam argument or environment variable. It just needs to be explained.
Right.
BTW, as a bjam expert, you might want to suggest how to do this in a cool way. I would love to have a shell script which lets me do

    test_archive <toolset> <archive>

but I don't see how to fix up the bjam files to support this - maybe you could look into it.
Do you mean you plan to always serialize char* as arrays and remove 'load'/'save' overloads for char*? Again, I'm not sure I understand the motivation.
See above.

Robert Ramey

Robert Ramey wrote:
For example:
binary_object make_binary_object(void* ptr, unsigned size);
    Returns: binary_object(ptr, size);
binary_object::save(Archive& ar, unsigned version)
    Effects: Calls ar.save_binary(m_ptr, m_size);
Aside - I wouldn't expect load/save/load_binary/save_binary to ever be invoked by a library user.
But make_binary_object is likely to be called, and to understand what it does you need docs for binary_object::save.
It would be nice to avoid asking the user to do BOOST_CLASS_EXPORT for all possible argument types. What's desirable is:
    template<class T>
    void register_rpc_function(const char* name, void (*f)(const T&))
    {
        functions[name] = ... ;
        boost::serialization::export_class< function_call_1<T> >::instantiate();
    }
I can't put BOOST_CLASS_EXPORT inside register_rpc_function now.
This situation is discussed in the documentation under the heading "Template Serialization Traits". There is an example that assigns traits to the nvp<T> template. BOOST_CLASS_EXPORT is just a syntactic shorthand for the specialization above. Uh-oh - I just looked at the definition of BOOST_CLASS_EXPORT and I see it's not as obvious as it is for the other traits.
Exactly. Not to mention that BOOST_CLASS_EXPORT:
1. Instantiates a function
2. Instantiates a helper class
which is considerably more complex than specializing a trait class and cannot be done in an arbitrary place.
I'm afraid that no matter how smart the XML reader is, it would still have to scan the entire file.
So what's wrong with that? I would guess that such an XML reader already exists somewhere and you can get everything for free. Assuming this doesn't address your need and you want to do some programming, one could create another XML archive version. This would create two output files.
I guess I have two questions:
1. Won't serialization fail in some way if I just seek the stream to the position found in the index and try reading?
2. For random access I need to make sure that all saved objects have an export key. How do I do that? Not necessarily out-of-the-box, but where can I plug in the check?
The reason I think the other is important is that it's actually saving/loading support for plain C++ arrays -- which is a rather basic thing.
Hmmm - the library already implements serialization of plain C++ arrays by serializing each element.
For *fixed-size* arrays. But not for dynamically allocated arrays. BTW, it seems we need two wrappers for completeness: one for dynamic arrays with element-wise save and another for dynamic arrays with binary save.
But you won't catch a case where you save one type and load another.
True, but I think it would help a lot - and practically free to implement.
That would be good.
So, to get zero overhead I need to tweak the base class and disable tracking of pointers. Let me try that... yes, the results are nice. Only one extra element (class id) per saved item.
If you're not serializing pointers, the class_id isn't written to the archive even once. The object id is required for tracking, the class id is required for pointers. If versioning is used, the class id is used once. I do not believe that there is any information stored in an archive which is not used.
I agree. I actually have a *crazy* but cute idea that one could use the file offset as the object id. How are object ids assigned, and can I customize that process? That would keep the overhead at an absolute minimum.
BTW, how do I set the tracking level and implementation level for a template class? I think I can partially specialize 'tracking_level', but it should be mentioned in the docs.
See docs section "Template Serialization Traits"
Ah, I've missed that. Do I need to provide both 'type' and 'value'? Can't the serialization library work with just one?
It's also necessary to provide an overload for char*. Doesn't it count as a primitive type?
The default implementation for char * will work as it does for other pointers.
Eh... I thought that serialization of pointers to builtin types is just not allowed. Actually, my archive initially had only one (non-templated) 'save' for unsigned. I got a compile error until I declared 'save' for const char*. I'm not sure why.
I came to conclude it presented a big security risk. The problem is the following:
Later:

    char str[MAX_STRING_SIZE];
    ar >> static_cast<char *>(str);  // to avoid str being treated as an array

Suppose the text archive gets corrupted to:

    3000 abc............

The archive will load with a buffer overrun - a security risk.
Right. I think this problem can be addressed with a wrapper for dynamic arrays:

    char* str(0);
    ar >> make_dynarray_wrapper(str);

so that the library allocates the string itself.
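Just to illustrate the allocation policy such a wrapper would use on the load side (make_dynarray_wrapper and this helper are hypothetical, not library code):

    #include <cstddef>

    // hypothetical: the load side reads the size first and allocates to fit
    template<class Archive>
    void load_dynarray(Archive & ar, char * & str)
    {
        std::size_t size = 0;
        ar >> size;                   // element count written by the save side
        str = new char[size + 1];     // allocated by the library, not a fixed buffer
        ar.load_binary(str, size);    // raw characters follow the count
        str[size] = '\0';
    }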
Can I define only one 'load' for unsigned int?
As opposed to?
As opposed to overloads for all builtin types. For a polymorphic_archive we can't have a templated function which falls back anywhere. We need to have a closed set of virtual functions, and I wonder what the minimal set is.
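To make the question concrete, the interface would have to look something like this (purely hypothetical, not the library's actual polymorphic archive):

    #include <string>

    // a closed set of virtuals - no template fallback is possible here
    class polymorphic_oarchive_interface
    {
    public:
        virtual ~polymorphic_oarchive_interface() {}
        virtual void save(bool t) = 0;
        virtual void save(char t) = 0;
        virtual void save(int t) = 0;
        virtual void save(unsigned int t) = 0;
        virtual void save(long t) = 0;
        virtual void save(unsigned long t) = 0;
        virtual void save(float t) = 0;
        virtual void save(double t) = 0;
        virtual void save(const std::string & t) = 0;
        // ... the open question is which of these are actually required
    };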
... used a template as a fallback for the rest. BTW this provided a huge benefit. In the original version of last year I got into a never-ending battle to specify virtual functions, which were dependent on the compiler - long long, etc. It was hopeless - moving to templates solved that.
:-( I guess we're back to those problems.
I've concluded this myself. It's pretty easy given the bjam setup. Maybe it can be made even easier with a bjam argument or environment variable. It just needs to be explained.
Right.
BTW, as a bjam expert, you might want to suggest how to do this in a cool way. I would love to have a shell script which lets me do

    test_archive <toolset> <archive>

but I don't see how to fix up the bjam files to support this - maybe you could look into it.
Maybe it could be possible.

- Volodya