[Serialization] Questions / Comments
After having used the serialization library for quite a while, it seems that I'm approaching a point where I am seeing lots of duplicate code which could be inside a serialization function. In turn, this has led me to some questions / comments.
The use of serialization in my case is both file and network oriented. This means reading and writing a great deal of short-lived archives. This works fine between C++ programs, but outside that it's quickly becoming an infeasible task. Creating a custom archive could be an option; however, I would need to spend serious time on understanding the code due to the heavy use of macros.
The question to be answered first is whether the current serialization architecture is able to fulfill my needs, or whether I should consider using other options. Consider the following two scenarios / questions:
1) Suppose I would like to create a custom XML archive which can both read shuffled XML data (its attributes/values in arbitrary order), and serialize to child attributes as well as named child nodes.
Currently, name-value pairs are in the Serializable concept. I think they belong in the Archive concept. Otherwise, it may be impossible to read XML-formatted data where the nodes are not in the same order as in the C++ code?
In addition, suppose the program structure being serialized is like a tree or graph; then name-value pairs do not allow for any other nesting than that of the C++ code (i.e., it assumes everything is a child node). To allow for making a distinction between nesting levels, one could think of, e.g.,
    template< typename Archive >
    void serialize( Archive& ar, const unsigned int version ) {
        ar.serialize_node( "subtree", m_contained_class );
        ar.serialize_attribute( "use_count", m_int );
    }
Is something like this possible? To this end, would it be possible to plug in another parser, e.g., RapidXML?
2) I would like to create an archive that speaks the Action Message Format (AMF). Although I do have trouble finding a clear distinct parsing layer in the archives, I think this is possible to do. However, perhaps you think archive/serialization should not be used in conjunction with these kinds of uses.
Many thanks in advance, Kind regards, Rutger ter Borg
Rutger ter Borg wrote:
After having used the serialization library for quite a while, it seems that I'm approaching a point where I am seeing lots of duplicate code which could be inside a serialization function. In turn, this has led me to some questions / comments.
The use of serialization in my case is both file and network oriented. This means reading and writing a great deal of short-lived archives. This works fine between C++ programs, but outside that it's quickly becoming an infeasible task. Creating a custom archive could be an option; however, I would need to spend serious time on understanding the code due to the heavy use of macros.
The question to be answered first is whether the current serialization architecture is able to fulfill my needs, or whether I should consider using other options. Consider the following two scenarios / questions:
1) Suppose I would like to create a custom XML archive which can both read shuffled XML data (its attributes/values in arbitrary order), and serialize to child attributes as well as named child nodes.
Currently, name-value pairs are in the Serializable concept. I think they belong in the Archive concept. Otherwise, it may be impossible to read XML-formatted data where the nodes are not in the same order as in the C++ code?
Name-value pairs depend only upon <utility>. A name-value pair is a "wrapper". In the library, a "wrapper" is a way of attaching extra data to a datum - in this case it attaches the variable name, which C++ doesn't provide a mechanism to determine. It is easy to make special-purpose wrappers.
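To make the "wrapper" idea concrete, here is a minimal sketch of a name-attaching wrapper and a toy archive that understands it. Everything in it (`named_value`, `make_named`, `toy_text_oarchive`, `counter`) is a hypothetical name invented for illustration, not part of Boost.Serialization, and `serialize` omits the `version` parameter for brevity. The library's own wrapper is the one created by `boost::serialization::make_nvp`.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical wrapper: pairs a name with a reference to the datum.
template <typename T>
struct named_value {
    const char* name;
    T& value;
};

template <typename T>
named_value<T> make_named(const char* name, T& value) {
    assert(name != nullptr);  // a wrapper without a name would be pointless
    return named_value<T>{name, value};
}

// Toy output archive (not a Boost archive) that knows about the wrapper.
class toy_text_oarchive {
public:
    template <typename T>
    toy_text_oarchive& operator&(const named_value<T>& nv) {
        out_ << nv.name << '=' << nv.value << ';';
        return *this;
    }
    std::string str() const { return out_.str(); }

private:
    std::ostringstream out_;
};

struct counter {
    int use_count = 3;
    double weight = 1.5;

    // The class only says which name goes with which member;
    // what to do with the name is left to the archive.
    template <typename Archive>
    void serialize(Archive& ar) {
        ar & make_named("use_count", use_count)
           & make_named("weight", weight);
    }
};

// Serializes a default-constructed counter and returns the archive text.
std::string save_counter() {
    counter c;
    toy_text_oarchive ar;
    c.serialize(ar);
    return ar.str();
}
```

Calling `save_counter()` yields the members in the order `serialize` lists them, each prefixed with the name the wrapper carried.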
In addition, suppose the program structure being serialized is like a tree or graph; then name-value pairs do not allow for any other nesting than that of the C++ code (i.e., it assumes everything is a child node). To allow for making a distinction between nesting levels, one could think of, e.g.,
    template< typename Archive >
    void serialize( Archive& ar, const unsigned int version ) {
        ar.serialize_node( "subtree", m_contained_class );
        ar.serialize_attribute( "use_count", m_int );
    }
Is something like this possible? To this end, would it be possible to plug in another parser, e.g., RapidXML?
I'm not exactly sure what you have in mind but I can say the following: The word "parser" is a red flag in this context. The "grammar" of an archive is determined by the C++ data structure being serialized. This is the source of its power and simplicity of usage. The idea of an independently defined "grammar" conflicts with the concept and implementation of the serialization library in a fundamental way.
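The point that the data structure itself fixes the archive "grammar" can be illustrated with a small sketch. The two toy archives below (`toy_oarchive`, `toy_iarchive`) and the `point` type are hypothetical and only mimic the idea; they are not the library's classes. One `serialize` function drives both saving and loading, so the reader consumes values in exactly the order the writer produced them, with no separate grammar anywhere.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Toy output archive: writes each value followed by a space.
class toy_oarchive {
public:
    template <typename T>
    toy_oarchive& operator&(const T& v) {
        out_ << v << ' ';
        return *this;
    }
    std::string str() const { return out_.str(); }

private:
    std::ostringstream out_;
};

// Toy input archive: reads values back in the order they are requested.
class toy_iarchive {
public:
    explicit toy_iarchive(const std::string& data) : in_(data) {}

    template <typename T>
    toy_iarchive& operator&(T& v) {
        in_ >> v;
        assert(!in_.fail() && "archive does not match the data structure");
        return *this;
    }

private:
    std::istringstream in_;
};

struct point {
    int x = 0;
    int y = 0;

    // The member order here is the whole "grammar" of the archive.
    template <typename Archive>
    void serialize(Archive& ar) {
        ar & x & y;
    }
};

// Saves p into a toy archive and loads it back into a fresh object.
point round_trip(point p) {
    toy_oarchive out;
    p.serialize(out);

    toy_iarchive in(out.str());
    point q;
    q.serialize(in);
    return q;
}
```

If the members were swapped in the file, this reader would silently assign them to the wrong fields, which is the limitation the XML discussion in this thread is about.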
2) I would like to create an archive that speaks the Action Message Format (AMF). Although I do have trouble finding a clear distinct parsing layer in the archives, I think this is possible to do. However, perhaps you think archive/serialization should not be used in conjunction with these kinds of uses.
XML serialization is an interesting case to look at. It does use Spirit to "parse" little bits of the archive. But it can't read any arbitrary XML archive into a predefined C++ structure. Of course it never could. However, we were able to generate a file following XML grammar and read back THAT SAME FILE. So the serialization library uses XML in only a very limited sense.
I suppose it would be possible to make the XML archive more elaborate so that it could adjust to some limited editing of an XML archive - and many people have requested that. I haven't done it for a couple of reasons.
a) Personally, I think XML is a dead end and it doesn't hold much interest for me.
b) I don't think that those who desire this facility would be satisfied with the result. I think they expect some sort of module which will permit a lot of editing to be read back into the C++ structure. If you think about this, this effectively means that serialization would have to handle an arbitrary XML file - and that just is not doable.
c) It would be a lot of work - with poor results in my opinion.
Having said that, it's still a free country and anyone who thinks differently about this is free to take a shot at it. Good Luck.
Perhaps what you have in mind is one of the following:
a) A program which takes an XML schema and generates equivalent C++ data structures and code which implements the transformation between them. I believe there are commercial products which do this.
b) There are Spirit Karma and Qi - or whatever they are called. Given a grammar - one of these generates a parser while the other will generate C++ code which will write data in accordance with that grammar. I don't know the current state of these libraries.
Given how simple it is to explain what the serialization library does - it turns out to be quite a rich subject.
Many thanks in advance, Kind regards,
Rutger ter Borg
Robert Ramey wrote:
Name-value pairs depend only upon <utility>. A name-value pair is a "wrapper". In the library, a "wrapper" is a way of attaching extra data to a datum - in this case it attaches the variable name, which C++ doesn't provide a mechanism to determine. It is easy to make special-purpose wrappers.
Thanks for the direction. So, given the wrappers it is possible to instantiate different code paths of the archive, to make the serialization operator behave differently? I'll look into it.
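One way to read "different code paths" is plain overload resolution: if each wrapper is its own type, the archive can overload its operator on those types. The sketch below assumes two hypothetical wrappers, `attribute_ref` and `node_ref`, and a toy archive that merely records a trace of events instead of producing real XML. None of these names come from the library.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Two hypothetical wrappers; the type alone tells the archive what to do.
template <typename T>
struct attribute_ref { const char* name; T& value; };

template <typename T>
struct node_ref { const char* name; T& value; };

template <typename T>
attribute_ref<T> as_attribute(const char* name, T& value) {
    assert(name != nullptr);
    return attribute_ref<T>{name, value};
}

template <typename T>
node_ref<T> as_node(const char* name, T& value) {
    assert(name != nullptr);
    return node_ref<T>{name, value};
}

// Toy archive that records what it was asked to do, one event per line.
class toy_trace_oarchive {
public:
    // Code path 1: an attribute is written as a single name=value line.
    template <typename T>
    toy_trace_oarchive& operator&(const attribute_ref<T>& a) {
        out_ << "attr " << a.name << '=' << a.value << '\n';
        return *this;
    }

    // Code path 2: a node opens a nesting level and recurses into the object.
    template <typename T>
    toy_trace_oarchive& operator&(const node_ref<T>& n) {
        out_ << "begin " << n.name << '\n';
        n.value.serialize(*this);
        out_ << "end " << n.name << '\n';
        return *this;
    }

    std::string str() const { return out_.str(); }

private:
    std::ostringstream out_;
};

struct leaf {
    int depth = 1;
    template <typename Archive>
    void serialize(Archive& ar) { ar & as_attribute("depth", depth); }
};

struct holder {
    leaf subtree;
    int use_count = 2;
    template <typename Archive>
    void serialize(Archive& ar) {
        ar & as_node("subtree", subtree) & as_attribute("use_count", use_count);
    }
};

// Serializes a default-constructed holder and returns the recorded trace.
std::string save_holder() {
    holder h;
    toy_trace_oarchive ar;
    h.serialize(ar);
    return ar.str();
}
```

The `holder::serialize` body corresponds to the `serialize_node` / `serialize_attribute` calls proposed earlier in the thread, expressed through wrappers rather than new archive member functions.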
2) I would like to create an archive that speaks the Action Message Format (AMF). Although I do have trouble finding a clear distinct parsing layer in the archives, I think this is possible to do. However, perhaps you think archive/serialization should not be used in conjunction with these kinds of uses.
XML serialization is an interesting case to look at. It does use Spirit to "parse" little bits of the archive. But it can't read any arbitrary XML archive into a predefined C++ structure. Of course it never could. However, we were able to generate a file following XML grammar and read back THAT SAME FILE. So the serialization library uses XML in only a very limited sense.
I see. I didn't mean arbitrary, I meant with the nodes and attributes shuffled. I.e., with a map-like interface to the serialized data vs. a stack-based interface, in such a way that the order of elements does not matter. The name in the name-value pair is then used as a key for lookup.
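The map-like interface described here can be sketched as follows. This is only an illustration under assumptions: the flat `key=value;` input format, the `keyed_iarchive` class and the `named_value` wrapper are all made up for the example and are not how the library's XML archive works. The archive first indexes all fields by name, then each name-value pair is resolved by lookup rather than by position.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical name-carrying wrapper.
template <typename T>
struct named_value { const char* name; T& value; };

template <typename T>
named_value<T> make_named(const char* name, T& value) {
    assert(name != nullptr);
    return named_value<T>{name, value};
}

// Toy input archive with map-like access: fields are looked up by name.
class keyed_iarchive {
public:
    // Input is a list of "key=value" items separated by ';', in any order.
    explicit keyed_iarchive(const std::string& data) {
        std::istringstream in(data);
        std::string item;
        while (std::getline(in, item, ';')) {
            const std::string::size_type eq = item.find('=');
            if (eq == std::string::npos) continue;  // skip empty or malformed items
            fields_[item.substr(0, eq)] = item.substr(eq + 1);
        }
    }

    template <typename T>
    keyed_iarchive& operator&(const named_value<T>& nv) {
        const auto it = fields_.find(nv.name);
        if (it == fields_.end())
            throw std::runtime_error(std::string("missing field: ") + nv.name);
        std::istringstream field(it->second);
        field >> nv.value;  // the name was the key; position played no role
        return *this;
    }

private:
    std::map<std::string, std::string> fields_;
};

struct settings {
    int use_count = 0;
    double weight = 0.0;

    template <typename Archive>
    void serialize(Archive& ar) {
        ar & make_named("use_count", use_count) & make_named("weight", weight);
    }
};

// Loads a settings object from the toy key=value format.
settings load_settings(const std::string& data) {
    keyed_iarchive ar(data);
    settings s;
    s.serialize(ar);
    return s;
}
```

With this design `load_settings("weight=2.5;use_count=4")` and `load_settings("use_count=4;weight=2.5")` produce the same object. The cost is that the whole input has to be indexed before any member can be read, which a purely sequential archive avoids.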
b) I don't think that those who desire this facility would be satisfied with the result. I think they expect some sort of module which will permit a lot of editing to be read back into the C++ structure. If you think about this, this effectively means that serialization would have to handle an arbitrary XML file - and that just is not doable.
I guess I'm not looking for the arbitrary case. Serialization can already be used as a working mechanism for reading a well-crafted file that exploits exported class definitions to create a vector of arbitrary objects (which could be requests, commands, active objects, etc.). [snip] Thanks for the examples, but these were probably not what I meant. I'm looking for a bit more flexibility and customization for the archive formats. The second example I mentioned, the action message format, is more a binary serialization format.
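As a rough sketch of what the output side of such a binary archive might look like, the toy class below encodes two AMF0 value types: a Number (type marker 0x00 followed by an 8-byte big-endian IEEE-754 double) and a short String (marker 0x02, a 16-bit big-endian length, then the bytes). The class and function names are hypothetical, the host is assumed to use 64-bit IEEE-754 doubles, and object framing, references and the remaining AMF types are left out entirely.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Toy output archive emitting a small subset of AMF0 (hypothetical class).
class toy_amf0_oarchive {
public:
    // AMF0 Number: marker 0x00, then the double in big-endian byte order.
    toy_amf0_oarchive& operator&(double v) {
        static_assert(sizeof(double) == sizeof(std::uint64_t),
                      "sketch assumes a 64-bit double");
        std::uint64_t bits = 0;
        std::memcpy(&bits, &v, sizeof bits);
        bytes_.push_back(0x00);
        for (int shift = 56; shift >= 0; shift -= 8)
            bytes_.push_back(static_cast<std::uint8_t>(bits >> shift));
        return *this;
    }

    // AMF0 String: marker 0x02, 16-bit big-endian length, then the bytes.
    toy_amf0_oarchive& operator&(const std::string& s) {
        assert(s.size() <= 0xFFFF && "longer strings need the long-string type");
        bytes_.push_back(0x02);
        bytes_.push_back(static_cast<std::uint8_t>(s.size() >> 8));
        bytes_.push_back(static_cast<std::uint8_t>(s.size() & 0xFF));
        for (char c : s)
            bytes_.push_back(static_cast<std::uint8_t>(c));
        return *this;
    }

    const std::vector<std::uint8_t>& bytes() const { return bytes_; }

private:
    std::vector<std::uint8_t> bytes_;
};

struct message {
    double price = 1.0;
    std::string name = "hi";

    template <typename Archive>
    void serialize(Archive& ar) {
        ar & price & name;
    }
};

// Encodes a default-constructed message and returns the raw bytes.
std::vector<std::uint8_t> encode_message() {
    message m;
    toy_amf0_oarchive ar;
    m.serialize(ar);
    return ar.bytes();
}
```

Because every AMF0 value carries its own type marker, the format is self-describing, which is what makes it tempting to read it independently of the C++ structure. The sketch only covers writing driven by `serialize`.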
Given how simple it is to explain what the serialization library does - it turns out to be quite a rich subject.
Indeed it is. I think the line between serialization and program options is also a thin one. Cheers, Rutger
Robert Ramey wrote:
The word "parser" is a red flag in this context. The "grammar" of an archive is determined by the C++ data structure being serialized. This is the source of its power and simplicity of usage. The idea of an independently defined "grammar" conflicts with the concept and implementation of the serialization library in a fundamental way.
Out of curiosity, why is this a red flag, given that "archives are not streams"? Kind regards, Rutger ter Borg