
Hi all on the list,
I'm fairly new to Boost, so I hope to find support here for integrating my own serialization
library with the Boost archive logic. As a premise, I started by looking at all the Boost
archive implementations (binary, xml and text) to understand the logic, but that only added
to my confusion about how to approach my problem. Confusion that is compounded, of course,
by my lack of Boost knowledge and of its related template programming.
That said, I found the simple_log_archive example really instructive: it cleanly shows
how to "walk" the full tree of the data representation, i.e. the serialization chain.
Furthermore, my serialization framework already handles all the representation formats
(binary, xml, text, python and many others), so from the Boost archive point of view
I only need to know a few things:
- how to walk the tree representing the serialization chain
- how to detect that a new type is being serialized
- how to get the type name as string
- how to define primitive types
- how to manage vectors/arrays
- how to make sure that everything is passed as an nvp
The first two seem related, because the code below appears to be responsible for
walking the tree and calling all the serialize functions:
template< class Archive >
struct save_only
{
    template< class T >
    static void invoke( Archive & ar, const T & t )
    {
        // make sure call is routed through the highest interface that might
        // be specialized by the user.
        boost::serialization::serialize_adl(
            ar,
            const_cast< T & >( t ),
            boost::serialization::version< T >::value
        );
    }
};
So I've added my two functions that are used to define types:
myserialize_beginType( myserialize &s, char *fieldName, char *type )
myserialize_endType( myserialize &s )
so that the serialization walk looks like:
template< class Archive >
struct save_only
{
    template< class T >
    static void invoke( Archive & ar, const T & t )
    {
        // make sure call is routed through the highest interface that might
        // be specialized by the user.
-->>    myserialize_beginType( ar.serializer, ar.fieldName, ???type??? );
        boost::serialization::serialize_adl(
            ar,
            const_cast< T & >( t ),
            boost::serialization::version< T >::value
        );
-->>    myserialize_endType( ar.serializer );
    }
};
But I don't know how to obtain the type name as a string to pass to my beginType function.
About defining primitives, I'm not really sure this is the right place to put my serialization
primitives that map all my basic types. The prototype of such an overloaded function looks like:
myserialize_serialize( "basetype" &t, char * fieldName, myserialize &s )
Actually I've defined something like:
#define BOOSTSERIALIZEARCHIVE_PRIMITIVE( __T )                                     \
    simple_log_archive & operator<<( const boost::serialization::nvp< __T > & t )  \
    {                                                                              \
        myserialize_serialize( t.value(), (char*)t.name(), this->serializer );     \
        return *this;                                                              \
    }
BOOSTSERIALIZEARCHIVE_PRIMITIVE( char )
BOOSTSERIALIZEARCHIVE_PRIMITIVE( unsigned char )
which seems to work.
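Just to double-check the wiring, here is a small usage sketch; the archive name my_oarchive and the helper function are only illustrative, and it assumes the macro has also been instantiated for the type being written:
#include <boost/serialization/nvp.hpp>
// Usage sketch only: `my_oarchive` stands for the archive class holding the
// BOOSTSERIALIZEARCHIVE_PRIMITIVE overloads above, together with its
// `serializer` member.
void write_flags( my_oarchive & oa )
{
    unsigned char flags = 0x7f;
    // resolves to the nvp< unsigned char > overload generated by the macro,
    // which forwards the name and value to myserialize_serialize()
    oa << boost::serialization::make_nvp( "flags", flags );
}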
About managing vectors and arrays, I tried to define something like:
template< class Archive >
struct use_array_optimization
{
    template< class T >
    static void invoke( Archive & ar, const T & t )
    {
        myserialize_serialize( (T*)t.address(), ar.fieldName,
                               t.count() * sizeof( T ), ar.serializer );
    }
};
template< class T >
void save( const T &t )
{
typedef
BOOST_DEDUCED_TYPENAME boost::mpl::eval_if

Roberto Fichera wrote:
Hi all on the list,
I'm fairly new to Boost, so I hope to find support here for integrating my own serialization library with the Boost archive logic.
I looked at your post and had a lot of difficulty in figuring out how I could help you. I think the above statement is the source of my difficulty. If you have your own serialization library, what do you need the boost one for? If you want to use the boost one, what does your own library have to do with anything?
I should say that creating one's own archive implementation is harder than it should be. This is because the documentation and implementation of some details of this process are ambiguous and non-obvious. Nonetheless, a number of people have managed to create their own archive implementations without too much trouble, so it IS doable. It does require knowledge of template meta-programming and investigation of how the current libraries are implemented.
If you want to make an archive implementation with some format not supported by the current ones, that is a question we might be able to help with. If this is what you want to do, you might look at the simple log archive and the trivial archive in the documentation. There are also a few other archives around which illustrate how this has been done: there is one for yaml and one for json in the sandbox and/or vault, and a portable_binary_archive is included in the package. Also special archive implementations have been created to support MPI.
Looking at your specific question, I can only say I think you need to invest more time in studying the above mentioned subjects before I could be of help.
Robert Ramey
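P.S. For orientation, the trivial archive in the documentation boils down to roughly the shape below (a sketch from memory, so check the docs for the exact version): an output archive only has to model the Saving Archive concept, everything beyond that is refinement.
#include <cstddef>              // std::size_t
#include <boost/mpl/bool.hpp>
// Minimal sketch of a saving archive: it satisfies the Saving Archive
// concept but simply discards everything handed to it.
class trivial_oarchive {
public:
    typedef boost::mpl::bool_<false> is_loading;
    typedef boost::mpl::bool_<true>  is_saving;
    template<class T> void register_type() {}
    template<class T> trivial_oarchive & operator<<(const T & t) {
        return *this;            // a real archive would emit t here
    }
    template<class T> trivial_oarchive & operator&(const T & t) {
        return *this << t;
    }
    void save_binary(const void *address, std::size_t count) {}
};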

On 02/13/2012 05:13 PM, Robert Ramey wrote:
Roberto Fichera wrote:
Hi all on the list,
I'm fairly new to Boost, so I hope to find support here for integrating my own serialization library with the Boost archive logic.
I looked at your post and had a lot of difficulty in figuring out how I could help you. I think the above statement is the source of my difficulty. If you have your own serialization library, what do you need the boost one for? If you want to use the boost one, what does your own library have to do with anything?
Well, I'll try to explain my problem a little more. We are going to use some libraries that are based on Boost, including their serialization logic, and instead of rewriting their serialization methods I was thinking of integrating the two serialization systems. Since Boost serialization, like mine, is based on two main concepts, the archive and the data representation (via the serialize method), the idea is to keep using the Boost serialization methods as they are, but provide a new Boost archive that models the serialized data in terms of my serialization.
I should say that creating one's own archive implementation is harder than it should be. This is because the documentation and implementation of some details of this process are ambiguous and non-obvious. Nonetheless, a number of people have managed to create their own archive implementations without too much trouble, so it IS doable. It does require knowledge of template meta-programming and investigation of how the current libraries are implemented.
Yep! I know. But as you certainly know, a little help can really speed up the learning phase ;-) !
If you want to make an archive implementation with some format not supported by the current ones, that is a question we might be able to help with.
It's just an integration, or a transformation if you prefer.
If this is what you want to do, you might look at the simple log archive and the trivial archive in the documentation. There are also a few other archives around which illustrate how this has been done: there is one for yaml and one for json in the sandbox and/or vault, and a portable_binary_archive is included in the package.
As I already said, I started from the simple_log_archive, which is quite straightforward to read since it's a single file implementing the whole tree walking and representation. Starting from that file I wrote the posted email to ask only about the few details I need in order to progress with the implementation. I'll have a look at the json and yaml archive formats for sure, since they could certainly be helpful to me.
Also special archive implementations have been created to support MPI.
Yep! I also had a look at this, but I have never used the C++ MPI interface. I have only used the plain C interface to implement my serialization format over MPI.
Looking at your specific question, I can only say I think you need to invest more time in studying the above mentioned subjects before I could be of help.
Yep! This is what I'm currently doing, but I'm still missing something, which is why I posted my email ;-) !
Robert Ramey

On 02/13/2012 05:13 PM, Robert Ramey wrote:
Roberto Fichera wrote:
Hi all on the list,
I'm fairly new to Boost, so I hope to find support here for integrating my own serialization library with the Boost archive logic.
Just to say that I've solved all my problems. My own Boost archive is finally able to serialize everything that is defined in terms of Boost serialization into my serialization infrastructure, which in turn maps everything into our supported formats. I had to do this because I need to keep compatibility across the different supported languages, frameworks and architectures, using formats like the native matlab mxArray or the native python PyObject. Matlab was especially important, because now I can "materialize" Boost objects in the matlab workspace very easily.
I'll look into the possibility of freely releasing a Boost archive that maps objects defined in terms of Boost serialization (without using my serialization framework, of course) into the matlab workspace, using a set of Mex functions or a very small API to use inside a Mex.
So far so good! Since I need to get the type name as a string dynamically, I have implemented the "stringify" of the type name via the code below:
template< typename T >
inline const char* type_name()
{
    return (char*)
#if defined(__GNUC__) && defined(__cplusplus)
        abi::__cxa_demangle( typeid( T ).name(), 0, 0, NULL );
#else
        typeid( T ).name();
#endif
}
which of course uses the g++-specific demangler, which is non-standard (I'd have to follow a similar approach for MSVC). So, my question is: does the boost library provide a standard way to get the type name as a string?
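For reference, a more self-contained version of that helper could look like the sketch below; note that __cxa_demangle returns malloc'd storage, so returning a std::string lets it be freed instead of leaking on every call, and the non-GNU branch just falls back to the raw typeid name:
#include <string>
#include <typeinfo>
#if defined(__GNUC__)
#include <cstdlib>      // std::free
#include <cxxabi.h>     // abi::__cxa_demangle
#endif
template< typename T >
inline std::string type_name()
{
#if defined(__GNUC__)
    int status = 0;
    char *demangled = abi::__cxa_demangle( typeid( T ).name(), 0, 0, &status );
    if( demangled != NULL ) {
        std::string result( demangled );
        std::free( demangled );
        return result;
    }
#endif
    return typeid( T ).name();  // MSVC etc. already return a readable name
}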

Roberto Fichera wrote:
On 02/13/2012 05:13 PM, Robert Ramey wrote:
Roberto Fichera wrote:
Hi all on the list,
I'm fairly new to Boost, so I hope to find support here for integrating my own serialization library with the Boost archive logic.
Just to say that I've solved all my problems. My own Boost archive is finally able to serialize everything that is defined in terms of Boost serialization into my serialization infrastructure, which in turn maps everything into our supported formats.
Congratulations! I'm aware that doing a complete/correct job on such a task is trickier than it first appears. I would suggest that you run the test suite on your new archive. To see how to do this, look at the way the boost serialization archives (text, binary, xml, etc.) are tested. Plug in your own archive and you can run 50 different tests on your archive without writing even one line of code!
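The test programs pick up the archive through a small header selected by the BOOST_ARCHIVE_TEST macro (see libs/serialization/test). A header for your archive would look roughly like the existing text_archive.hpp; this is only a sketch and the exact mechanics may differ between boost versions:
// my_archive.hpp - hypothetical test header, modeled on text_archive.hpp
#include <fstream>
#include "my_oarchive.hpp"              // hypothetical custom saving archive
typedef my_oarchive   test_oarchive;
typedef std::ofstream test_ostream;
#include "my_iarchive.hpp"              // hypothetical custom loading archive
typedef my_iarchive   test_iarchive;
typedef std::ifstream test_istream;
// some versions of the tests also expect stream flags to be defined:
#define TEST_STREAM_FLAGS (std::ios_base::openmode)0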
I had to do this because I need to keep compatibility across the different supported languages, frameworks and architectures, using formats like the native matlab mxArray or the native python PyObject. Matlab was especially important, because now I can "materialize" Boost objects in the matlab workspace very easily.
hmmm - an archive implementation which serializes to/from matlab? That might be interesting for someone.
So far so good! Since I need to get the type name as a string dynamically, I have implemented the "stringify" of the type name via the code below:
template< typename T >
inline const char* type_name()
{
    return (char*)
#if defined(__GNUC__) && defined(__cplusplus)
        abi::__cxa_demangle( typeid( T ).name(), 0, 0, NULL );
#else
        typeid( T ).name();
#endif
}
which of course uses the g++-specific demangler, which is non-standard (I'd have to follow a similar approach for MSVC). So, my question is: does the boost library provide a standard way to get the type name as a string?
Doing something like this was considered from the very beginning but was explicitly rejected for a number of reasons. Here are the main points.
a) We need to relate a type to an external string in order to implement the "export" functionality, whereby a polymorphic pointer is saved/loaded as the correct true type.
b) I considered an approach like the one above, but it would result in non-portable code and give me the task of trying to re-implement it for each different compiler/version. That is, it would create a secure job for me for the rest of my natural life. This would have been OK if the job were paid; since it isn't, this was unattractive.
c) Even worse, the approach above would make even text archives non-portable between platforms. Of course this is a non-starter.
So you might want to re-think your approach above.
Robert Ramey
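P.S. To make point (a) concrete, here is a small sketch of the "export" approach (the shape/circle classes are only illustrative); the external string is chosen by the user, so it stays the same across compilers and platforms:
#include <boost/serialization/access.hpp>
#include <boost/serialization/base_object.hpp>
// in a real translation unit the export header should be seen after the
// archive headers so the serializers get instantiated
#include <boost/serialization/export.hpp>   // BOOST_CLASS_EXPORT_GUID
class shape {                                // polymorphic base
    friend class boost::serialization::access;
    template<class Archive>
    void serialize(Archive & /*ar*/, const unsigned int /*version*/) {}
public:
    virtual ~shape() {}
};
class circle : public shape {
    friend class boost::serialization::access;
    template<class Archive>
    void serialize(Archive & ar, const unsigned int /*version*/) {
        ar & boost::serialization::base_object<shape>(*this);
    }
};
// associates circle with the user-chosen, portable string "circle" so a
// shape* that actually points at a circle is saved and loaded correctly
BOOST_CLASS_EXPORT_GUID(circle, "circle")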

On 02/15/2012 06:01 PM, Robert Ramey wrote:
Roberto Fichera wrote:
On 02/13/2012 05:13 PM, Robert Ramey wrote:
Roberto Fichera wrote:
Hi all on the list,
I'm fairly new to Boost, so I hope to find support here for integrating my own serialization library with the Boost archive logic.
Just to say that I've solved all my problems. My own Boost archive is finally able to serialize everything that is defined in terms of Boost serialization into my serialization infrastructure, which in turn maps everything into our supported formats.
Congratulations! I'm aware that doing a complete/correct job on such a task is trickier than it first appears.
Yes! Indeed!
I would suggest that you run the test suite on your new archive. To see how to do this, look at the way the boost serialization archives (text, binary, xml, etc.) are tested. Plug in your own archive and you can run 50 different tests on your archive without writing even one line of code!
Thanks for the suggestion, I'll have a look.
I had to do this because I need to keep compatibility across the different supported languages, frameworks and architectures, using formats like the native matlab mxArray or the native python PyObject. Matlab was especially important, because now I can "materialize" Boost objects in the matlab workspace very easily.
hmmm - an archive implementation which serializes to/from matlab?
Yes! Actually the whole process maps the C++ object hierarchy (in terms of serialization) into the corresponding low-level matlab types, which are represented by an opaque mxArray. The whole process is driven by the low-level matlab SDK primitives. Finally, the process is even further optimized since the matlab object is created once and then only updated.
That might be interesting for someone.
I can imagine ;-) .
So far so good! Since I need to get the type name as a string dynamically, I have implemented the "stringify" of the type name via the code below:
template< typename T >
inline const char* type_name()
{
    return (char*)
#if defined(__GNUC__) && defined(__cplusplus)
        abi::__cxa_demangle( typeid( T ).name(), 0, 0, NULL );
#else
        typeid( T ).name();
#endif
}
which of course uses the g++-specific demangler, which is non-standard (I'd have to follow a similar approach for MSVC). So, my question is: does the boost library provide a standard way to get the type name as a string?
Doing something like this was considered from the very beginning but was explicitly rejected for a number of reasons. Here are the main points.
a) We need to relate a type to an external string in order to implement the "export" functionality, whereby a polymorphic pointer is saved/loaded as the correct true type.
b) I considered an approach like the one above, but it would result in non-portable code and give me the task of trying to re-implement it for each different compiler/version. That is, it would create a secure job for me for the rest of my natural life. This would have been OK if the job were paid; since it isn't, this was unattractive.
c) Even worse, the approach above would make even text archives non-portable between platforms. Of course this is a non-starter. So you might want to re-think your approach above.
I agree totally with you! Actually my constraint is to use the gnu toolchain, MSVC and the pathscale compilers, and eventually LLVM, but that one is not a priority at the moment. But either way, the problem of demangling portability remains. By the way, how did you solve the problem ... if you solved it, indeed ;-) ?
Robert Ramey

Roberto Fichera wrote:
On 02/15/2012 06:01 PM, Robert Ramey wrote:
c) Even worse, the approach above would make even text archives non-portable between platforms. Of course this is a non-starter. So you might want to re-think your approach above.
I agree totally with you! Actually my constraint is to use the gnu toolchain, MSVC and the pathscale compilers, and eventually LLVM, but that one is not a priority at the moment. But either way, the problem of demangling portability remains.
By the way, how did you solve the problem ... if you solved it, indeed ;-) ?
It is already solved portably via the "export" functionality. Look it up in the documentation. Robert Ramey

On 02/15/2012 07:38 PM, Robert Ramey wrote:
Roberto Fichera wrote:
On 02/15/2012 06:01 PM, Robert Ramey wrote:
c) Even worse, the approach above would make even text archives non-portable between platforms. Of course this is a non-starter. So you might want to re-think your approach above.
I agree totally with you! Actually my constraint is to use the gnu toolchain, MSVC and the pathscale compilers, and eventually LLVM, but that one is not a priority at the moment. But either way, the problem of demangling portability remains.
By the way, how did you solve the problem ... if you solved it, indeed ;-) ?
It is already solved portably via the "export" functionality. Look it up in the documentation.
Robert Ramey
Ok! Thanks again! I'll have a look tomorrow!

On 02/15/12 12:38, Robert Ramey wrote:
Roberto Fichera wrote:
On 02/15/2012 06:01 PM, Robert Ramey wrote:
c) Even worse, the approach above would make even text archives non-portable between platforms. Of course this is a non-starter. So you might want to re-think your approach above.
I agree totally with you! Actually my constraint is to use the gnu toolchain, MSVC and the pathscale compilers, and eventually LLVM, but that one is not a priority at the moment. But either way, the problem of demangling portability remains.
By the way, how did you solve the problem ... if you solved it, indeed ;-) ?
It is already solved portably via the "export" functionality. Look it up in the documentation.
Robert Ramey
Hi Robert,
Looking at:
http://www.boost.org/doc/libs/1_48_0/libs/serialization/doc/serialization.ht...
and reading:
The system "registers" each class in an archive the first time an object of that class is serialized and assigns a sequential number to it. Next time an object of that class is serialized in that same archive, this number is written in the archive. So every class is identified uniquely within the archive. When the archive is read back in, each new sequence number is re-associated with the class being read. Note that this implies that "registration" has to occur during both save and load so that the class-integer table built on load is identical to the class-integer table built on save. In fact, the key to the whole serialization system is that things are always saved and loaded in the same sequence. This includes "registration".
And paraphrasing part of it as:
So every class is identified by this uniquely assigned sequential number within the archive.
Then I'd infer that this "uniquely assigned sequential number" plays a role similar to the role of the result of:
abi::__cxa_demangle( typeid( T ).name(), 0, 0, NULL );
in Roberto's post:
http://article.gmane.org/gmane.comp.lib.boost.user/72805
As you mention, Roberto's method is not portable between platforms (where, of course, a different compiler or even a different version of the same compiler would be a different platform). OTOH, as you also mention in the above quote, boost serialization requires that things are always saved and loaded in the same sequence. Roberto, would that be a problem for you?
-regards,
Larry

Larry Evans wrote:
Looking at:
http://www.boost.org/doc/libs/1_48_0/libs/serialization/doc/serialization.ht...
and reading:
The system "registers" each class in an archive the first time an object of that class it is serialized and assigns a sequential number to it. Next time an object of that class is serialized in that same archive, this number is written in the archive. So every class is identified uniquely within the archive. When the archive is read back in, each new sequence number is re-associated with the class being read. Note that this implies that "registration" has to occur during both save and load so that the class-integer table built on load is identical to the class-integer table built on save. In fact, the key to whole serialization system is that things are always saved and loaded in the same sequence. This includes "registration".
And paraphrasing part of it as:
So every class is identified by this uniquely assigned sequential number within the archive.
Then I'd infer that this "uniquely assigned sequential number" plays a role similar to the role of the result of:
abi::__cxa_demangle( typeid( T ).name(), 0, 0, NULL );
in Roberto's post:
http://article.gmane.org/gmane.comp.lib.boost.user/72805
As you mention, Roberto's method is not portable between platforms (where, of course, a different compiler or even a different version of the same compiler would be a different platform).
Just a slight clarification. Both the "registration" method and the "export" method address the exact same problem: how does one know what kind of object to create when it is serialized through a base class pointer? The library implements two different methods. Each method has its own advantages and disadvantages, and I believe both are commonly used.
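For comparison, here is a sketch of the "registration" method, assuming a polymorphic shape/circle pair like the one sketched earlier in the thread; the key point is that the saving and loading code must perform the same registrations in the same order:
// Sketch only: no export macro is used; instead both sides register the
// derived types, always in a fixed order.
template<class Archive>
void register_derived_types(Archive & ar)
{
    ar.template register_type<circle>();    // assigns circle its class id
    // ... register any further derived types here, in the same order ...
}
// saving side:
//     register_derived_types(oa);
//     oa << base_ptr;                      // shape* pointing at a circle
// loading side:
//     register_derived_types(ia);
//     ia >> base_ptr;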
OTOH, as you also mention in the above quote, boost serialization requires:
things are always saved and loaded in the same sequence.
Roberto, would that be a problem for you?
Note that this is a fundamental feature/restriction of this serialization library. The archive format is driven by the C++ class data structure. This is what gives it its simplicity of use and high performance.
A different kind of library (the closest boost library is spirit) would take an externally defined syntax and map any file in that syntax to some C++ data structure. That is an entirely different problem, and it's easy to fall into the trap of confusing the two jobs. Of course, if one had nothing else to do, one could make an archive class which would generate an archive (data file) along with spirit karma/qi grammars which would support editing of that archive. But that would be - as they say - beyond the scope of this course.
Robert Ramey
participants (3)
- Larry Evans
- Robert Ramey
- Roberto Fichera