
I was thinking about adding serialization to some types I've been
working on in the sandbox.  First I tried to recall how Mr. Ramey said
serialization can be tested.  I couldn't find the specific post I was
thinking of, but others that the search turned up gave me the answer.
Reading those posts prompted me to ask more questions.

I could reduce the classes I'm working with to:

//=============================================
class computer;

class context
{
public:
    typedef boost::array<boost::uint_least32_t, 4>  value_type;

    context();  // use auto copy-ctr, copy-=, dtr

    void        operator ()( bool );                     // consumer
    bool        operator ==( context const & ) const;    // equals
    bool        operator !=( context const & ) const;    // not-equals
    value_type  operator ()() const;                     // producer

private:
    friend class computer;

    boost::uint_fast64_t                   length;
    boost::array<boost::uint_fast32_t, 4>  buffer;
    boost::array<bool, 512>                queue;

    template < class Archive >
    void  serialize( Archive &ar, const unsigned int version );
};

class computer
    : public convenience_methods_base<context>
{
    // An object of type "context" is incorporated in this object
    // due to the base class.  A mutable/const pair of non-static
    // member functions named "context()" gives access to the inner
    // context object.

public:
    typedef context::value_type  value_type;

    // Put various access member functions here that forward to the
    // internals of the "context" type, which work because of the
    // friend declaration.

private:
    template < class Archive >
    void  serialize( Archive &ar, const unsigned int version );
};
//=============================================

I initially planned to have serialization functions for these two
classes, the "convenience_methods_base" base class template, plus two
other class templates (a base class and a support class) that
"convenience_methods_base" uses.
But the e-mail search I mentioned found a thread from May 2007 (on the
main Boost list) that suggested that the serialization of a
non-primitive should match the user's external representation of the
type, and not the type's particular internal structure.  So I decided
to keep the serialization protocol just for the two public-facing
classes, "context" and "computer."  I figured that the "computer"
object can be serialized like:

//=============================================
template < class Archive >
inline
void  computer::serialize( Archive &ar, const unsigned int version )
{
    ar & boost::serialization::make_nvp( "context", this->context() );
}
//=============================================

Which leaves how "context" objects are serialized.  After thinking
about it for hours, I decided to just whip out something quick & dirty
and refine it later.  So:

//=============================================
template < class Archive >
inline
void  context::serialize( Archive &ar, const unsigned int version )
{
    ar & BOOST_SERIALIZATION_NVP( length )
       & BOOST_SERIALIZATION_NVP( buffer )
       & BOOST_SERIALIZATION_NVP( queue );
}
//=============================================

would give a final serialization, in my test file, of:

//=============================================
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE boost_serialization>
<boost_serialization signature="serialization::archive" version="5">
<test class_id="0" tracking_level="0" version="0">
    <context class_id="1" tracking_level="0" version="0">
        <length>1</length>
        <buffer class_id="2" tracking_level="0" version="0">
            <elems>
                <count>4</count>
                <item>1732584193</item>
                <item>4023233417</item>
                <item>2562383102</item>
                <item>271733878</item>
            </elems>
        </buffer>
        <queue class_id="3" tracking_level="0" version="0">
            <elems>
                <count>512</count>
                <item>1</item>
                <item>0</item>
                <!-- I'll spare you, and the mail server, of 509 more
                     "<item>0</item>" lines -->
                <item>0</item>
            </elems>
        </queue>
    </context>
</test>
</boost_serialization>
//=============================================

Now I started refining, keeping the principle of not leaking
implementation details in mind.  The problem here is the array counts,
which I don't need since they'll never change.  The first one I can
fix by writing each element separately:

//=============================================
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE boost_serialization>
<boost_serialization signature="serialization::archive" version="5">
<test class_id="0" tracking_level="0" version="0">
    <context class_id="1" tracking_level="0" version="0">
        <length>1</length>
        <buffer-A>1732584193</buffer-A>
        <buffer-B>4023233417</buffer-B>
        <buffer-C>2562383102</buffer-C>
        <buffer-D>271733878</buffer-D>
        <message-tail class_id="2" tracking_level="0" version="0">
            <elems>
                <count>512</count>
                <item>1</item>
                <item>0</item>
                <!-- 509 more "<item>0</item>" lines -->
                <item>0</item>
            </elems>
        </message-tail>
    </context>
</test>
</boost_serialization>
//=============================================

I've always wanted to use something like a base-64 string encoding of
the bit array, because it's cool and it'd save space.  I added
conversion functions between the bit array and a std::string, and then
(de)serialized the string.  I also had to separate "serialize" into
"save" and "load", since the two conversions are complementary, not
identical.
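To make the conversion concrete, here is a minimal sketch of what such
a pair of functions could look like.  It is not the code in the
sandbox: the names "encode_bits", "decode_bits" and "encoded_size" are
made up for illustration, and the packing rule (most-significant bit
first, zero-padded last sextet, URL-safe alphabet ending in "-_") is
an assumption read off the message-tail values in the archives in this
post.  The decoder skips characters outside the alphabet and stops once
it has enough bits.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// URL-safe base-64 alphabet (assumed from the archives in this post).
static const char  alphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

// Number of base-64 characters needed to hold "bit_count" bits.
std::size_t  encoded_size( std::size_t bit_count )
{
    return ( bit_count + 5u ) / 6u;  // integer form of ceil(bit_count / 6)
}

// Pack bits into sextets, most-significant bit first.  The final
// partial sextet, if any, is padded with zero bits.
std::string  encode_bits( std::vector<bool> const &bits )
{
    std::string  result;

    for ( std::size_t i = 0; i < bits.size(); i += 6 )
    {
        unsigned  sextet = 0;

        for ( std::size_t j = 0; j < 6; ++j )
        {
            sextet <<= 1;
            if ( i + j < bits.size() && bits[i + j] )
                sextet |= 1u;
        }
        result += alphabet[ sextet ];
    }
    return result;
}

// Recover exactly "count" bits (fewer if the text runs out).
// Characters outside the alphabet are skipped, and reading stops as
// soon as enough bits have been collected.
std::vector<bool>  decode_bits( std::string const &text, std::size_t count )
{
    std::string const  digits( alphabet );
    std::vector<bool>  bits;

    for ( std::size_t k = 0; k < text.size() && bits.size() < count; ++k )
    {
        std::string::size_type const  value = digits.find( text[k] );

        if ( value == std::string::npos )
            continue;  // illegal character: ignore it

        for ( int shift = 5; shift >= 0 && bits.size() < count; --shift )
            bits.push_back( ((value >> shift) & 1u) != 0 );
    }
    return bits;
}
```

With this packing, a lone set bit encodes to "g" (100000 is index 32)
and the 14 bits 10101001000010 encode to "qQg", which agrees with the
message-tail values for lengths 1 and 14 in my test file.  Since one
function reads bits and writes characters while the other does the
reverse, a single "serialize" body can't drive both, hence the
save/load split.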
So now I have:

//=============================================
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE boost_serialization>
<boost_serialization signature="serialization::archive" version="5">
<test class_id="0" tracking_level="0" version="0">
    <context class_id="1" tracking_level="0" version="0">
        <length>1</length>
        <buffer-A>1732584193</buffer-A>
        <buffer-B>4023233417</buffer-B>
        <buffer-C>2562383102</buffer-C>
        <buffer-D>271733878</buffer-D>
        <message-tail>g</message-tail>
    </context>
</test>
</boost_serialization>
//=============================================

Then I added tests for: exactly 6 bits (i.e. one base-64 letter); a
sextet (actually two) and a partial sextet together; filling a queue
to capacity (actually one short of that, since a full queue
automatically activates a turnover); and going past capacity,
resulting in a new hash buffer and an empty message-tail.

//=============================================
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE boost_serialization>
<boost_serialization signature="serialization::archive" version="5">
<test class_id="0" tracking_level="0" version="0">
    <context class_id="1" tracking_level="0" version="0">
        <length>1</length>
        <buffer-A>1732584193</buffer-A>
        <buffer-B>4023233417</buffer-B>
        <buffer-C>2562383102</buffer-C>
        <buffer-D>271733878</buffer-D>
        <message-tail>g</message-tail>
    </context>
</test>
</boost_serialization>

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE boost_serialization>
<boost_serialization signature="serialization::archive" version="5">
<test class_id="0" tracking_level="0" version="0">
    <context class_id="1" tracking_level="0" version="0">
        <length>6</length>
        <buffer-A>1732584193</buffer-A>
        <buffer-B>4023233417</buffer-B>
        <buffer-C>2562383102</buffer-C>
        <buffer-D>271733878</buffer-D>
        <message-tail>q</message-tail>
    </context>
</test>
</boost_serialization>

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE boost_serialization>
<boost_serialization signature="serialization::archive" version="5">
<test class_id="0" tracking_level="0" version="0">
    <context class_id="1" tracking_level="0" version="0">
        <length>14</length>
        <buffer-A>1732584193</buffer-A>
        <buffer-B>4023233417</buffer-B>
        <buffer-C>2562383102</buffer-C>
        <buffer-D>271733878</buffer-D>
        <message-tail>qQg</message-tail>
    </context>
</test>
</boost_serialization>

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE boost_serialization>
<boost_serialization signature="serialization::archive" version="5">
<test class_id="0" tracking_level="0" version="0">
    <context class_id="1" tracking_level="0" version="0">
        <length>511</length>
        <buffer-A>1732584193</buffer-A>
        <buffer-B>4023233417</buffer-B>
        <buffer-C>2562383102</buffer-C>
        <buffer-D>271733878</buffer-D>
        <message-tail>ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_AAAAAAAAAAH__________g</message-tail>
    </context>
</test>
</boost_serialization>

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE boost_serialization>
<boost_serialization signature="serialization::archive" version="5">
<test class_id="0" tracking_level="0" version="0">
    <context class_id="1" tracking_level="0" version="0">
        <length>512</length>
        <buffer-A>2631642121</buffer-A>
        <buffer-B>80961853</buffer-B>
        <buffer-C>4033330630</buffer-C>
        <buffer-D>497373075</buffer-D>
        <message-tail></message-tail>
    </context>
</test>
</boost_serialization>
//=============================================

If you want to see the actual work, look at revision/change-set #48131
in Boost's Subversion set-up.  Now to the actual questions:

1.  If there's only one sub-object, base or member, that has any
    significant data, could someone call the "serialize" member
    function of that sub-object directly in the wrapping class's
    "serialize"?  (This assumes that friendship is set up.)  This
    would make the wrapping class's archive look identical to the
    sub-object's, right?  Is this a good idea?

2.  Before actually trying to serialize a string, I was worried that
    the string's serialization would include a length count.  This
    would be unnecessary because the object's "length" attribute
    already implies the length of the string
    (int( ceil( double( length % 512 ) / 6.0 ) )).  Here, we see that
    the string's length isn't explicitly included in the XML archive,
    so I have no worries.  But what about non-XML archives?  Will the
    string's length be directly serialized, wasting space?  If so, how
    can I fix that?

3.  Having to add std::string to support serialization makes my class
    header heavier.  My class uses fixed-size arrays, so is there any
    way that I can avoid allocating a string?  For writing out, could
    I set up a char-array with the encoding and write that out?  For
    reading in, can I read the string piecemeal into a char-array,
    just in case someone added more characters than required?  My
    converter currently ignores illegal characters and stops when
    enough legal characters have been read.  If what I ask is
    possible, would the reading routine have to seek to the end of the
    entry so further serialization isn't messed up?

-- 
Daryle Walker
Mac, Internet, and Video Game Junkie
darylew AT hotmail DOT com