
"Russell Hind" <rh_gmane@mac.com> wrote in message news:cl7qf0$b1a$1@sea.gmane.org...
Robert Ramey wrote:
It would seem that you're using the xml archive for purposes other than serialization. Of course I don't see any problem with this (until one decides to edit it and change its schema). But I am curious what use you've found for it. I originally did it only to satisfy boost nit-pickers as I felt it was an inefficient way to implement serialization. I've since found it useful for debugging archives. It seems to be compatible with xml viewers so it's useful for rendering archives in a visible way. So, after all, I have to concede that the nit-pickers do have a point. I have a sneaking suspicion that it will turn up in all kinds of unexpected places and I wonder what those might be.
I've been using our in-house serialization code for a few years, which offers similar functionality to yours. Unfortunately ours was very geared towards quickly dealing with large (>1GB) files that have hundreds of thousands of pointer-based objects stored in them, so it was tied to a specific app.
The systems we are dealing with at the moment only generate smaller files (20MB or so), so boost::serialization will hopefully support them nicely. It also gives the advantage of XML/text archives as well as binary.
We have an R&D group who only use python for testing purposes and want to read in our data files for extra processing and trying out new ideas. Binary files are by far the most efficient, but describing the structure of a binary archive to someone who only uses python isn't easy at all. So XML seems like the way to go, as they can visually look at it and easily pick out the information they want.
Can't you use boost python to call boost serialization from within python via some wrapper function? Wouldn't this make the whole process totally painless? I believe someone else (I forget whom) was doing this with good success.
Our data consists of many settings, 3D model information, user comments etc., all of which are textual so XML/text supports them well, but three quarters of the data is vectors of floating-point scan data. Writing these textually would lead to an over-the-top archive. Complete binary would mean passing it to python users would be a pain, so XML with encoding seems like a good solution.
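To make the "XML with encoding" idea concrete, here is a minimal Python sketch of what the reading side might look like. It assumes the scan data is stored as base64 text of little-endian 32-bit floats; the element name `scan`, the `count` attribute and the float layout are hypothetical and are not claimed to match what boost::serialization actually writes.

```python
import base64
import struct
import xml.etree.ElementTree as ET


def encode_scan(values):
    """Pack floats as little-endian 32-bit values and base64-encode them."""
    raw = struct.pack("<%df" % len(values), *values)
    return base64.b64encode(raw).decode("ascii")


def decode_scan(xml_text):
    """Return the floats stored in the hypothetical <scan> element."""
    root = ET.fromstring(xml_text)
    node = root.find("scan")
    count = int(node.get("count"))
    raw = base64.b64decode(node.text.strip())
    return list(struct.unpack("<%df" % count, raw))


# Round trip with values that are exactly representable as 32-bit floats.
values = [0.5, 1.25, -3.0]
doc = '<archive><scan count="%d">%s</scan></archive>' % (
    len(values), encode_scan(values))
decoded = decode_scan(doc)
```

The textual settings stay readable in any XML viewer, while the bulk data costs roughly four characters per three bytes instead of a full decimal rendering of every float.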
When the files get bigger, we can put them through a zip because the python lot could still handle un-zipping and then reading xml so that isn't an issue.
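The unzip-then-parse step on the Python side could be as small as the sketch below. It uses only the standard library; the member name `data.xml` and the `setting` element are made up for illustration, and the zip is built in memory just to keep the example self-contained.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def read_zipped_xml(zip_bytes, member):
    """Unzip one member from an in-memory zip file and parse it as XML."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return ET.fromstring(zf.read(member))


# Build a small zip in memory (hypothetical content, not a real archive).
xml_text = "<archive><setting name='gain'>2.5</setting></archive>"
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("data.xml", xml_text)

root = read_zipped_xml(buf.getvalue(), "data.xml")
gain = float(root.find("setting").text)
```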
If it wasn't for the need to let our R&D group have access to data in this way, then I would go for a binary format but I'm hoping that ultimately zipped XML won't be a lot larger for our files (hoping to test in the next few days).
The urgency of getting serialization up and running is that I've shied away from introducing our serialization code into the project and generating files in its format, because I was hoping that boost serialization would be out in time (we ship in December) and that we could move to that, as it is a much more flexible system than our in-house one.
You have a couple of options:
a) Make your own derivation of xml_(i/o)archive which uses your own version of write/read_binary. Advantage - wouldn't touch the current archive classes. The manual describes how to do this.
b) Just fix the current code that does the read/write_binary text data. You could roll this into your own version of 1.32 and be on your way. This is implemented as part of the dataflow iterators and I don't think this is very difficult, except that understanding my dataflow iterator idea would take some investment of effort that might not be worthwhile. There is already a test for serialization of binary data so even that is done. The reason I don't do it now is that it starts a whole chain reaction regarding testing on all the platforms that boost supports and it is a very inconvenient time to do this. Also no one raised the issue until now.
Fixing the current code would be my ideal solution; I'll just have to see how much time I get to look into this. If not, for now, I'm sure the python lot can handle adding the necessary padding characters.
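As a rough illustration of the Python-side workaround, the sketch below adds padding before decoding. It assumes the binary data is written as base64 text with the trailing '=' characters missing; that reading of the problem is an assumption, and `pad_base64` is a hypothetical helper, not part of any library.

```python
import base64


def pad_base64(text):
    """Append '=' until the length is a multiple of 4, as decoders expect."""
    stripped = "".join(text.split())  # drop whitespace and line breaks
    missing = -len(stripped) % 4      # 0, 1 or 2 for valid base64 input
    return stripped + "=" * missing


unpadded = "aGVsbG8"  # base64 of b"hello" with the trailing '=' removed
data = base64.b64decode(pad_base64(unpadded))
```

Text that is already correctly padded passes through unchanged, so the same helper should keep working if the archive code is fixed later.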
I take it the archive version will be increased for the next release if something like this changes, so that current files will still be compatible?
Thanks
Russell