
Matthias Troyer wrote:
On Oct 10, 2005, at 6:48 PM, Robert Ramey wrote:
That sounds very good to me. Maybe I spoke too soon. I don't see how this would require and changes at all to the serialization library. I don't see why has_fast_array_serialization has to be part of the serialization library. Maybe all the code can be included in boost/serialization/ fast_arrary.hpp ? This header would be included by all the classes that use it and no others.
[snip]
If has_fast_array_serialization<Archive,Type> is defined boost/serialization/fast_arrary.hpp I'm still OK with it.
That's where it is defined in my proposal.
The only thing I'm missing here is why the serialization library itself has to be modified to support all this. It seems that all this could easily be encapulated in one (or more) separate optional headers. This would be the best of all possible worlds.
There are actually only very few modifications:
1. boost/archive/detail/oserializer.hpp and iserializer.hpp require modifications for the serialization of C-arrays of fixed length. In my version, the class save_array_type is modified to dispatch to save_array when fast_array_serialization is possible. The underlying problem here is that oserializer.hpp implements the serialization of a type here (the C array!). The optimal solution to this problem would be to move the array serialization to a separate header boost/ serialization/array.hpp, as is done for all C++ classes.
My intention was to include all types "built-in" to the C++ language in ?serializer.hpp so that's why it includes C++ arrays. Separating this into another header would break this "concept" and would result in a small additional *.hpp file that would have to be included explicitly. So I would be against it in this case. On the other hand, I do appreciate ideas that remove pieces from the "core" of the library and turn them into more independent modules so don't get discouraged here.
2. boost/serialization/vector.hpp is also modified to dispatch to save_array and load_array where possible. I don't think that this is a problem?
That would mean that users are including save/load_array even if they don't want them or want to use their own versions. Oh - then documentation has to be enhanced to explain all this internal behavior. I would prefer something like the following: class my_class { stl::vector<int> m_vi; ... }; template<class Archive> my_class::serialize(Archive &ar, const unsigned int version){ // standard way ar & m_vi; // or fast way which defaults to standard way in appropriate cases. save_array(ar, m_vi); } This a) keeps the stl portion of the library smaller b) leave the user in control of what's going on c) permits development of save/load array to be on an independent parallel track with everyting else. If we eventually discover that that everyone is always using save/load array then we can study whether we want to just enhance stl::vector, etc to include it as default functionality.
3. I had to introduce a new strong typedef in basic_archive.hpp:
BOOST_STRONG_TYPEDEF(std::size_t, container_size_type) BOOST_CLASS_IMPLEMENTATION(boost::archive::container_size_type, primitive_type)
...
I don't think that you will object to this.
... I looked at this and right away I noticed that, as written this would make all existing binary archives unreadable. The real fix for is: a) bump up the library version from 3 to 4 b) alter save/load collections to use unsigned int or size_t depending upon he library version. It has to be done this way since stl collections are marked as non-versioned. I was sort of reluctant to do this as it seemed to me that it would be yet a while before anyone starts to save more than 2,000,000,000 objects with the serialization library and it seemed to me wasteful for the native binary archive to included an extra 4 bytes containing 0's for every collection serialzed. But maybe now (I mean 1.34) is the time to bite the bullet on this. I don't have a strong prefererence for either changing it or leaving it the same. If you need it, I would say fine - we'll do it.
4. boost/archive/basic_binary_[io]archive.hpp serialize container_size_type as an unsigned int as done till now. It might be better to bump the file version and serialize them as std::size_t.
agreed - no point in using size_t in the collection code just to throw it away.
All the other changes were to modify the binary archives and the polymorphic archive to support fast array serialization. In contrast to the above points this is optional. Instead we could provide fast_binary_[io]archive and fast_polymporphic_[io]archive, that differ from their normal versions just by supporting fast array serialization. I could live with this as well, although it makes more sense in my opinion to just addd the save_array/load_array features to the existing archives.
sounds like you can go along with me on this.
Of all the points above, I believe that you will not have anything against points 3 and 4 since you proposed something similar already in the past if I remember correctly.
I don't see a problem here.
Issue 2 should also be noncontroversial, and the main discussion should thus be on issue 1, on how one can improve the design of [io] serializer to move the the implementation of array serialization into a separate header.
looks like we'll have to arm wrestle a little more here. In a way you've got me over a barrel. I continually advocate keeping the "core" of the library small by factoring out everything that can be factor out. Now you've got me on C++ array. sort of. I'll re-iterate my view that the "core" library address serialization of all "built-in" language types - and no others. The other part of your proposal really is an enhancement of the stl serializations which are not part of the "core" part of the library. I'm confident that your enhancement will be quite useful to many people and I'm very happy to see that you've done it. That's not the same as forcing it on all users of the library. So I prefer to see your enhancement as an option explicitly invoked by the user. This has a number of advantages: a) the user can see what he is doing. No hidden complex behavior. b) some users might want just minimal code since thier collections are small c) your enhancement will require significant documentation. This will be much easier if it is an optional add-on to the serialization library. d) parallel or layered developement is facilitated. So to summarize the issues a) should C++ array serialization be in a separate header? I say no, you say yes. b) should save/load array be incorporated into stl collection serialization to make its usage oblicatory? I say no, you say yes. Regardless of how these questions are answered, its clear to me that your enhancements to stl containers will be available to users. Assuming that this is packaged as an optional add-on as I would hope, the only questions remaining will be: a) Should this be a totally separate library with its own documentation/tests/directory tree etc? It should be separate but not totally so: Maybe a files or maybe directory within serialization for save/load array and within archive for fast...archive adaptor. A group of its own tests - just like we have tests for all other combinations of serializations and archives - I can hear the howling already. We'll have to see what to do about this. A separate documenation section in the documenation of the serialization library. Similar to the miscelleneas. But miscellaneas holds things that are really separate so we'll find a good place for it. Maybe a section titled something like "Special Considerations When Serializing Collections" (but shorter). Note that your fast...archive will really be a new creature - an "archive adaptor". This merits an new section in the documentation in any case. This is a cool and useful technique and will encourage future great ideas which enhance the serialization library without making the core bigger. b) Should such an optional enhancement be subject to some sort of review? I'm agonostic on this. I would be happy to just be the gate keeper and accept it or make you alter it to my taste. Of course, no guarentee that anyone else would be happy with this. I'm willing to go along with whatever the consensus is regarding if and/or how this thing should be reviewed. =============== A final note to others wishing to participate in the serialization library. Welcome and good luck. Having said that, I will give a little advice on how to be most successful in getting my cooperation. (encouragement you get for free) a) Notice that I make huge efforts to keep the "core" library from growing. Anything that adds to it makes my job bigger - I have to make this job smaller. b) I am out of the business of writing serializations for specific classes. Its up to those who need it to get it done. Of course I'm always full of advice - sorry about that. c) I am out of the business of writing new archive types. Instead, I want to improve the documentation to make this easier. d) My efforts will be focused on implemenation aspects of the "core" library. This includes: i) pending issue regarding dynamic loading/unloading of code that contains serialization for specific types. This includes testing of support for plug-ins. ii) making a good performance test and correcting any performance bottlenecks in the core library. iii) making testing more efficient. iv) fixing bugs I am getting a little "burned out" on the serialization library. Its only my obsessive nature that makes me continue to the bitter end which I'm hoping I am approaching - at least asymptotically. I am extremely gratified by your efforts and those of others to enhance the library and will do everything (subject to the above reservations) I can to encourage and support them. Nothing would make me happier to see people spinning off their own improved serializations and archive classes and making the package the default choice for C++ object serializaition. Robert Ramey