[serialization] customize xml archive
Hello,

I am trying to create an XML archive that fits our needs. The default XML serializer does not produce the format I would like to implement. Some of the differences are:

- primitive types (int, float, ...) should be written as XML attributes
- array types should not get a parent XML element of their own; all array elements are serialized as children of the element that contains the array
- tag names should always contain the class name
- parent classes should create a new XML child element
- and some more, but this will do as an example

The default XML output looks like this:

    <foo class_name="Class1" class_id="1">
      <name>example</name>
      <container class_id="2">
        <count>2</count>
        <item class_id_reference="2">
          <a>123</a>
        </item>
        <item class_id_reference="3" class_name="Class3">
          <b>456</b>
        </item>
      </container>
    </foo>

I would like this output:

    <Class1 role="foo" name="example">
      <Class2 role="container" a="123" />
      <Class3 role="container" b="456" />
    </Class1>

Is that possible with the current boost::serialization and boost::archive libraries? I tried deriving from all the base classes of xml_oarchive and reimplementing them, without success.

Maybe a first question: why don't I get a class name for each serialized object in common_oarchive::save_override(const class_name_type & t, int)? Sometimes I get a class_id, sometimes a class_id_reference, and so on. Class names are rarely provided.

Regards,
Jabe
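For illustration, the target layout requested above can be sketched with a tiny stand-alone writer. This is plain standard C++, not a Boost archive and not the poster's code: `Element`, `render` and `desired_output` are hypothetical names, and attribute values are written without XML escaping to keep the sketch short.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory element: the tag is the class name and
// primitive members are stored as attributes.
struct Element {
    std::string class_name;                                  // becomes the tag name
    std::vector<std::pair<std::string, std::string>> attrs;  // primitives, in insertion order
    std::vector<Element> children;                           // nested objects, array items included
};

// Render an element with two-space indentation.
// Childless elements use the self-closing form.
// Assumption: values need no XML escaping.
std::string render(const Element& e, int depth = 0) {
    const std::string pad(depth * 2, ' ');
    std::string out = pad + "<" + e.class_name;
    for (const auto& a : e.attrs) {
        out += " " + a.first + "=\"" + a.second + "\"";
    }
    if (e.children.empty()) {
        return out + " />\n";
    }
    out += ">\n";
    for (const auto& c : e.children) {
        out += render(c, depth + 1);
    }
    return out + pad + "</" + e.class_name + ">\n";
}

// Builds the example from the post: array items hang directly under
// the owning element, with no <container>, <count> or <item> wrapper.
std::string desired_output() {
    Element foo{"Class1", {{"role", "foo"}, {"name", "example"}}, {}};
    foo.children.push_back({"Class2", {{"role", "container"}, {"a", "123"}}, {}});
    foo.children.push_back({"Class3", {{"role", "container"}, {"b", "456"}}, {}});
    return render(foo);
}
```

Calling `desired_output()` yields the four-line document shown in the post. A real archive would have to build such a tree (or stream it) from the save calls it receives, which is the hard part discussed in the rest of the thread.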
Your idea of what an XML archive should look like is quite different from the one I implemented, so I don't think you would want to derive from xml_[i/o]archive. Take basic_xml_oarchive<Archive> as a starting point and make your own version.

Note that my version was very much focused on permitting XML archives of arbitrary size without having to load the whole XML tree. I would hope that it's possible to implement your own with the facilities that the library includes for creating archives. Of course no one can know until you try.

Robert Ramey

Jabe wrote:
maybe a first question: why don't i get a classname for each serialized object in common_oarchive::save_override(const class_name_type & t, int)? sometimes i get a class_id, sometimes a class_reference_id and so on. classnames are offered rarely.
In general, archives include only that which is necessary for de-serialization. There is a type "class_name_optional" which is serialized with the class name for every type. The default implementation is to ignore it - and none of the included archives override the default. class_id etc are only applicable to classes - not primitives. They might not be available for all types - depending on the implementation level - I would have to investigate that. Robert Ramey
Robert Ramey wrote:
Your idea of what an XML archive should look like is quite different than the one I implemented. So I don't think you would want to derive from xml_[i/o]archive. Take basic_xml_oarchive<Archive> as a starting point and make your own version.
Note that my version was very much focused on permitting XML archives of arbitrary size without having to load the whole XML tree. I would hope that it's possible to implement your own with the facilities that the library includes for creating archives. Of course no one can know until you try.
I'm trying, but at the moment I'm quite confused :)
maybe a first question: why don't i get a classname for each serialized object in common_oarchive::save_override(const class_name_type & t, int)? sometimes i get a class_id, sometimes a class_reference_id and so on. classnames are offered rarely.
In general, archives include only that which is necessary for de-serialization.
There is a type "class_name_optional" which is serialized with the class name for every type. The default implementation is to ignore it - and none of the included archives override the default.
Where is this type defined and which method do i have to override?
class_id etc are only applicable to classes - not primitives. They might not be available for all types - depending on the implementation level - I would have to investigate that.
All classes i use are exported by BOOST_CLASS_EXPORT_GUID. Is there a chance
to fetch the classname for a given class_id, class_id_reference or
class_id_optional on the level of archive derivation?
Until now I've derived three classes from boost::archive:
class my_basic_xml_oarchive : public detail::common_oarchive<Archive>
class my_xml_oarchive_impl : public basic_text_oprimitive<std::ostream>,
public my_basic_xml_oarchive<Archive>
class my_xml_oarchive : public my_xml_oarchive_impl
There is a type "class_name_optional" which is serialized with the class name for every type. The default implementation is to ignore it - and none of the included archives override the default.
Where is this type defined and which method do i have to override?
I double checked - I was wrong about this. There is no class_name_optional - it's class_id_optional.
class_id etc are only applicable to classes - not primitives. They might not be available for all types - depending on the implementation level - I would have to investigate that.
All classes i use are exported by BOOST_CLASS_EXPORT_GUID. Is there a chance to fetch the classname for a given class_id, class_id_reference or class_id_optional on the level of archive derivation?
There is no function to do this. This would have to be added to basic_[i/o]archive.hpp. It's never been needed - but perhaps it's worthy of consideration.
Until now I've derived three classes from boost::archive:
class my_basic_xml_oarchive : public detail::common_oarchive<Archive>
class my_xml_oarchive_impl : public basic_text_oprimitive<std::ostream>, public my_basic_xml_oarchive<Archive>
class my_xml_oarchive : public my_xml_oarchive_impl
And more generally: how can I create objects from the registered class information using extended_type_info? If boost::archive turns out not to be practical for my XML format, I would still like to use as much of boost as possible when porting the solution I have already implemented without boost. For that I would need to replace my own class factory with boost's, if possible.
The class factory used by boost serialization is coded into iserializer.hpp. This is sort of "ad hoc". It relies upon facilities of extended_type_info. I would like to see it "factored out" and moved to extended_type_info, which would be a natural place for it. I don't know for a fact that this wouldn't create some other difficulties, but it would be a better conceptual fit. I would also like to see this used to create a "C++ / poor person's COM" which would have the facilities of Microsoft COM and CORBA but only address the needs of C++ programmers, which would make it very simple to use - unlike the language-independent methods. So we're on the same page here as far as what we would like to see done - of course - getting it done is another thing entirely.
I found the struct "save_array_type" in detail/oserializer.hpp. If i could replace the invoke method with my own (and so for all other savers) i could remove e.g. the "item" xml tags used with array elements. Any chance?
Boost serialization as implemented in the HEAD is being enhanced by Mathias Troyer to permit extension and customization of serialization of arrays/collections according to the type of archive. His motivation is to maximize performance of serialization of arrays for cluster computing. The method that has been chosen is to create an "array_wrapper" whose serialization can be varied according to the type of archive. This will be useful in your case. However, it won't be part of boost serialization until 1.35. It is now undergoing "finishing touches".

Note there are other improvements coming in the serialization library which are none of my doing. Aside from Mathias' work on making serialization of arrays more customizable, David Abrahams has improved the export code to eliminate header ordering requirements and refined tricky code in [i/o]serializer so that it's simpler, neater, and most importantly, allows more compilers to pass more tests.

It would be my hope to make a generalized GUI editor which can be used to edit any serialization files. For me - this is much easier and more useful than trying to make XML for this purpose.

Robert Ramey
Thank you very much for your answers.
All classes i use are exported by BOOST_CLASS_EXPORT_GUID. Is there a chance to fetch the classname for a given class_id, class_id_reference or class_id_optional on the level of archive derivation?
There is no function to do this. This would have to be added to basic_[i/o]archive.hpp. It's never been needed - but perhaps it's worthy of consideration.
Maybe...in my case, yes it is.
And more generally: how can I create objects from the registered class information using extended_type_info? If boost::archive turns out not to be practical for my XML format, I would still like to use as much of boost as possible when porting the solution I have already implemented without boost. For that I would need to replace my own class factory with boost's, if possible.
The class factory used by boost serialization is coded into iserializer.hpp. This is sort of "ad hoc". It relies upon facilities of extended_type_info. I would like to see it "factored out" and moved to extended_type_info, which would be a natural place for it. I don't know for a fact that this wouldn't create some other difficulties, but it would be a better conceptual fit. I would also like to see this used to create a "C++ / poor person's COM" which would have the facilities of Microsoft COM and CORBA but only address the needs of C++ programmers, which would make it very simple to use - unlike the language-independent methods.
So we're on the same page here as far as what we would like to see done - of course - getting it done is another thing entirely.
Yes we are. After some investigation into the source i was able to retrieve classnames from object pointers and to get an extended_type_info from a classname. But it only works when using archives. Without them, no chance in my opinion. I would really like to see the complete extended_type_info part factored out.
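What a factored-out class factory might look like can be sketched without any archive at all. The following is a stand-in built on assumptions, not Boost's extended_type_info: `ClassRegistry`, `register_class`, `create` and `name_of` are hypothetical names, and the string keys merely play the role of the names passed to BOOST_CLASS_EXPORT_GUID.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <typeindex>
#include <typeinfo>

// Hypothetical polymorphic hierarchy standing in for the exported classes.
struct Base { virtual ~Base() = default; };
struct Class2 : Base { int a = 0; };
struct Class3 : Base { int b = 0; };

// Hypothetical registry: maps an exported name to a factory and a
// runtime type back to its exported name.
class ClassRegistry {
public:
    template <class T>
    void register_class(const std::string& name) {
        factories_[name] = [] { return std::unique_ptr<Base>(new T()); };
        names_[std::type_index(typeid(T))] = name;
    }

    // Create a default-constructed object by name; null if the name is unknown.
    std::unique_ptr<Base> create(const std::string& name) const {
        auto it = factories_.find(name);
        if (it == factories_.end()) return nullptr;
        return it->second();
    }

    // Exported name of the dynamic type of obj; empty if it was never registered.
    std::string name_of(const Base& obj) const {
        auto it = names_.find(std::type_index(typeid(obj)));
        return it == names_.end() ? std::string() : it->second;
    }

private:
    std::map<std::string, std::function<std::unique_ptr<Base>()>> factories_;
    std::map<std::type_index, std::string> names_;
};
```

With `Class2` and `Class3` registered, `create("Class3")` returns a new `Class3` and `name_of` on that object gives back "Class3", which covers both lookups mentioned above (class name from an object, object from a class name) independently of any archive.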
I found the struct "save_array_type" in detail/oserializer.hpp. If i could replace the invoke method with my own (and so for all other savers) i could remove e.g. the "item" xml tags used with array elements. Any chance?
Boost serialization as implemented in the HEAD is being enhanced by Mathias Troyer to permit extension and customization of serialization of arrays/collections according to the type of archive. His motivation is to maximize performance of serialization of arrays for cluster computing. The method that has been chosen is to create an "array_wrapper" whose serialization can be varied according to the type of archive. This will be useful in your case. However, it won't be part of boost serialization until 1.35. It is now undergoing "finishing touches".
I got it working by providing an overloaded member to save named std::vector's.
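The poster's overload is not shown in the thread. As a rough sketch of the idea only, the following stand-alone writer (hypothetical `FlatWriter`, standard C++ rather than a Boost archive) gives std::vector its own `save` overload, so that elements are written directly under the parent with no wrapper element, count or item tags.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical writer, used only to show the overload mechanism.
class FlatWriter {
public:
    // Generic case: one element per named value.
    template <class T>
    void save(const std::string& name, const T& value) {
        out_ << "<" << name << ">" << value << "</" << name << ">";
    }

    // More specialized overload, chosen for std::vector arguments:
    // each element is saved under the member's own name, and no
    // enclosing container element or element count is written.
    template <class T>
    void save(const std::string& name, const std::vector<T>& values) {
        for (const T& v : values) {
            save(name, v);
        }
    }

    std::string str() const { return out_.str(); }

private:
    std::ostringstream out_;
};
```

Saving `std::vector<int>{123, 456}` under the name "a" produces two sibling `<a>` elements and nothing else. In a real archive the equivalent hook would sit on whatever member receives the name-value pair, which is an assumption about the poster's design rather than something the thread confirms.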
Note there are other improvements coming in the serialization library which are none of my doing. Aside from Mathias' work on making serialization of arrays more customizable, David Abrahams has improved the export code to eliminate header ordering requirements and refined tricky code in [i/o]serializer so that it's simpler, neater, and most importantly, allows more compilers to pass more tests.
It would be my hope to make a generalized GUI editor which can be used to edit any serialization files. For me - this is much easier and more useful than trying to make XML for this purpose.
I need to read and write an XML format that is defined by a standard. I have no chance to change it in any way. Furthermore I have to read XML files that were exported by other applications. Parsing the file should be done by an XML library that supports validation and so on. All I need from the archive is an interface providing all available information.

Before investing more work into this, I think it cannot be done in the next couple of hours. All the "tracking_id", "object_id" and similar attributes cannot be read from the XML file (they aren't there). So I really don't know if the XML input will ever work, even when the XML output is working.

Nevertheless, boost is a great library that has earned much of my respect. I think every time I read a new line of boost's source code I learn more about templates.

Regards,
Jabe
Jabe wrote:
Before investing more work into this, I think it cannot be done in the next couple of hours. All the "tracking_id", "object_id" and similar attributes cannot be read from the XML file (they aren't there). So I really don't know if the XML input will ever work, even when the XML output is working.
Note that there is a fundamental mismatch here - which I've discussed before. Boost serialization is generated from the class hierarchy. The XML which is generated reflects that. There is no way that an arbitrary XML schema can be loaded via boost serialization. Either the schema is provided by a C++ program (boost serialization) or by an XML schema. There exist other tools which do the latter. That is, given an XML schema - they generate a group of C++ classes that match (in some sense) that schema. It sounds like one of these tools might better address the needs of the application you're crafting.

Robert Ramey
Before investing more work into this, I think it cannot be done in the next couple of hours. All the "tracking_id", "object_id" and similar attributes cannot be read from the XML file (they aren't there). So I really don't know if the XML input will ever work, even when the XML output is working.
Note that there is a fundamental mismatch here - which I've discussed before. Boost serialization is generated from the class hierarchy. The XML which is generated reflects that. There is no way that an arbitrary XML schema can be loaded via boost serialization. Either the schema is provided by a C++ program (boost serialization) or by an XML schema.
Yes, I know. XML cannot really be called "serialization", because of its free ordering of elements and attributes. If you change the order of boost's serialized elements inside an XML file, deserialization will throw a stream exception. Because this is the first time I have touched the serialization part of boost, this was exactly what I wanted to know.

Nevertheless I wouldn't say it's impossible. Given access to the class names in boost whenever needed, serialization to an arbitrary XML schema (at least the one I need) can be done. Deserialization is another matter, which I haven't tried to touch yet. I guess that one additional method can make it possible :)

So for now I take these experiences with boost::serialization with me and go for another solution.

Thank you
Jabe
Getting ready to investigate this and wanted to check if anyone had already done it.
We are using boost serialization in our multithreaded application and have encountered some problems when multiple threads deserialize the same kind of object at the same time. The problems went away when we added a global lock prior to the deserialization statement. There might be other issues involved, so to clarify the situation: Is boost serialization supposed to be thread safe?

Sigurd
Sigurd Saue wrote:
We are using boost serialization in our multithreaded application and have encountered some problems when multiple threads deserialize the same kind of object at the same time.
The same object? - would this be different archives into different objects or the same archive into different objects from different threads. It's not clear what is being done here.
The problems went away when we added a global lock prior to the deserialization statement. There might be other issues involved, so to clarify the situation: Is boost serialization supposed to be thread safe?
Sigurd
The "serial" nature of the process excludes a number of multi-thread scenarios by definition. I believe that serialization is thread safe in all contexts where it makes sense to access serialization from different threads.

Robert Ramey
We use boost serialization for sending notifications from a server to multiple clients. The problematic scenario is when two client threads within the same process deserialize the same notification. In your words: the same archive into different objects from different threads.

Sigurd Saue

-----Original Message-----
From: boost-users-bounces@lists.boost.org [mailto:boost-users-bounces@lists.boost.org] On behalf of Robert Ramey
Sent: 29 June 2006 16:41
To: boost-users@lists.boost.org
Subject: Re: [Boost-users] Is serialization thread safe?

Sigurd Saue wrote:
We are using boost serialization in our multithreaded application and have encountered some problems when multiple threads deserialize the same kind of object at the same time.
The same object? - would this be different archives into different objects or the same archive into different objects from different threads. It's not clear what is being done here.
The problems went away when we added a global lock prior to the deserialization statement. There might be other issues involved, so to clarify the situation: Is boost serialization supposed to be thread safe?
Sigurd
The "serial" nature of the process excludes a number of multi-thread scenarios by definition. I believe that serialization is thread safe in all contexts where it makes sense to access serialization from different threads.

Robert Ramey

_______________________________________________
Boost-users mailing list
Boost-users@lists.boost.org
http://lists.boost.org/mailman/listinfo.cgi/boost-users
Sigurd Saue wrote:
We use boost serialization for sending notifications from a server to multiple clients. The problematic scenario is when two client threads within the same process deserialize the same notification. In your words: the same archive into different objects from different threads.
Hmm, is this the same archive or the same archive instance? In any case I believe the following should work, and I would be very interested in knowing if it doesn't. If it doesn't work, I would be very interested in knowing where the problem arises.

    std::ifstream is1("file");
    boost::archive::text_iarchive ia1(is1);
    std::ifstream is2("file");
    boost::archive::text_iarchive ia2(is2);

    // from thread one
    object o;
    ia1 >> o;

    // from thread two
    object o1;
    ia2 >> o1;

Of course if the archives or their input streams are shared among threads, this can't work, as the state of both the archives and the streams is changed by the act of deserialization.

Robert Ramey
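The rule stated above (one stream and one reader per thread, with nothing mutable shared between them) can be sketched with the standard library alone. This is not Boost code: `Notification`, `decode` and `decode_in_two_threads` are hypothetical, and a whitespace-separated text payload stands in for a real archive.

```cpp
#include <sstream>
#include <string>
#include <thread>

// Hypothetical payload type standing in for a deserialized notification.
struct Notification {
    int id = 0;
    std::string text;
};

// Each call builds its own input stream over the bytes, so the stream's
// read position and error state are never shared between callers.
Notification decode(const std::string& bytes) {
    std::istringstream in(bytes);
    Notification n;
    in >> n.id >> n.text;
    return n;
}

// Two threads decode the same immutable payload at the same time.
// Each thread writes only to its own result object, so no lock is needed.
bool decode_in_two_threads(const std::string& bytes) {
    Notification a;
    Notification b;
    std::thread t1([&] { a = decode(bytes); });
    std::thread t2([&] { b = decode(bytes); });
    t1.join();
    t2.join();
    return a.id == b.id && a.text == b.text;
}
```

The shared input is only ever read, while everything that deserialization mutates lives inside one thread. Sharing a single stream or reader between the two threads would need the kind of global lock described earlier in the thread.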
Thanks a lot for your input. We will investigate this further and test the different scenarios. I will return with some conclusions later (after the summer holidays).

Sigurd Saue

-----Original Message-----
From: boost-users-bounces@lists.boost.org [mailto:boost-users-bounces@lists.boost.org] On behalf of Robert Ramey
Sent: 29 June 2006 17:57
To: boost-users@lists.boost.org
Subject: Re: [Boost-users] Is serialization thread safe?

Sigurd Saue wrote:
We use boost serialization for sending notifications from a server to multiple clients. The problematic scenario is when two client threads within the same process deserialize the same notification. In your words: the same archive into different objects from different threads.
Hmm, is this the same archive or the same archive instance? In any case I believe the following should work, and I would be very interested in knowing if it doesn't. If it doesn't work, I would be very interested in knowing where the problem arises.

    std::ifstream is1("file");
    boost::archive::text_iarchive ia1(is1);
    std::ifstream is2("file");
    boost::archive::text_iarchive ia2(is2);

    // from thread one
    object o;
    ia1 >> o;

    // from thread two
    object o1;
    ia2 >> o1;

Of course if the archives or their input streams are shared among threads, this can't work, as the state of both the archives and the streams is changed by the act of deserialization.

Robert Ramey
participants (4): Ivan, Jabe, Robert Ramey, Sigurd Saue