Boost Serialization make_binary & XML/ASCII

I need to parse an XML file generated by Boost Serialization; part of the file is a section generated by make_binary. Which scheme is used to convert the binary to ASCII: uuencode, Base64? I need to know so that I can convert it back.

TIA
Dr. Terence Wilson
Principal, Bayside Technical
email: tez@latte.com
web: http://latte.com

Base64.
Code for translating back and forth between binary and Base64 can be found in boost/archive/iterators.
This code does not depend on the serialization library or the archive classes, so it can be used on its own.
Robert Ramey
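
For readers who would rather see the conversion spelled out, here is a minimal sketch of Base64 decoding in plain standard C++. It is not the code from boost/archive/iterators; the names base64_value and decode_base64 are made up for this illustration. It assumes the standard Base64 alphabet, skips whitespace in case the encoded block is wrapped across lines, and stops at the first '=' padding character.

```cpp
#include <cctype>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Map one Base64 character to its 6-bit value, or -1 if it is not part of
// the standard alphabet (A-Z, a-z, 0-9, '+', '/').
// Hypothetical helper, not part of Boost.
inline int base64_value(unsigned char c) {
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

// Decode Base64 text into raw bytes.
// Whitespace is skipped; decoding stops at the first '=' (padding).
// Hypothetical helper, not part of Boost.
inline std::vector<std::uint8_t> decode_base64(const std::string& text) {
    std::vector<std::uint8_t> out;
    std::uint32_t buffer = 0;  // low bits hold input not yet emitted
    int bits = 0;              // number of pending bits in buffer
    for (unsigned char c : text) {
        if (std::isspace(c)) continue;
        if (c == '=') break;
        const int v = base64_value(c);
        if (v < 0) throw std::invalid_argument("not a Base64 character");
        buffer = (buffer << 6) | static_cast<std::uint32_t>(v);
        bits += 6;
        if (bits >= 8) {
            // A full byte is available: emit the top 8 pending bits.
            bits -= 8;
            out.push_back(static_cast<std::uint8_t>((buffer >> bits) & 0xFF));
        }
    }
    return out;
}
```

Calling decode_base64 on the character data of the binary element should give back the original bytes, provided the archive really uses the standard alphabet and padding as assumed here.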

And another point.
If you want to parse the XML generated by Boost Serialization, the easy way
would be to use the Spirit parser that was written just for that
purpose and is included in the library. It is also independent of the
serialization library itself. The serialization library is really a "use
case" of the Spirit library for XML parsing.
And it already includes the Base64-to-binary stuff.
So, using the stuff that's already in there, the whole job is probably under
100 new lines of code.
Good Luck
Robert Ramey

Robert,

The utility I am writing needs to be able to extract a small portion from a large XML file generated by your library. Since it is performance sensitive, I chose to use a SAX parser in order to avoid reading the whole file. Would it be much work to do this with the Spirit parser?

As always, thanks for the super-fast response.

Best regards,
Terence

Well, now I'm out of my depth. Some have commented that the Spirit parser is slower than other XML parsers. I don't know. I would have hoped that, since Spirit does a lot of the heavy lifting at compile time, it would be pretty fast, but I haven't seen much data on this. Any parser has to scan every character in the file, so it's not clear to me that a SAX parser or any other can be known a priori to be faster than another.

My reasons for using Spirit were:
a) It was already part of Boost.
b) It was, after some learning curve, a good fit with what I wanted to do.
c) It is well documented.
d) It is customizable: serialization only uses a portion of full XML, so it seemed the most efficient.
e) It is all done at compile time, so it wouldn't include dead code.
f) It is portable to all compilers Boost supports.
g) By exercising a little care in code organization, I was able to arrange things so that the module containing the parsing didn't depend on the rest of the program. So the long compile time is not an issue: it is in the library and is only recompiled when the grammar changes.

It is this last feature that suggests you can easily use it to attach your own actions when parsing the output of the serialization library.

After some initial pain figuring out how to use it, I have to say I have been extremely pleased with this application of Spirit. I never wanted to do XML serialization, as I felt it was a pain in the neck and of relatively little utility. I had anticipated a maintenance nightmare as more and more obscure corners of XML syntax were touched. I'm pleased to say this thing has been fantastic as far as I'm concerned. After the initial one-time pain, I haven't had to touch it since 2002, and this (through Spirit 1.6.x, still available) is still compatible with Borland 5.51. And all the hacks required to make this so portable are only compiled into the platforms that need them.

This has been one of the most significant implementations in making the serialization library possible (the other one would probably be MPL).

So if this were my problem I would:
a) Include the XML grammar and parser from the serialization library and add my own actions.
b) Finish my code. Really, I would expect this to be about 100 lines.
c) If it's too slow, and if profiling suggests that the Spirit parser is the bottleneck, then look at tweaking the grammar to speed up parsing, or replace the Spirit parser with a faster one.

This is my rule: "First make it work ASAP, then make it faster if necessary."

But I am already somewhat familiar with Spirit, so it might not be as interesting an option for you. Then again, you might be able to use the current parser unchanged. That would bring the huge benefit that if the xml_archive parser is tweaked for some reason (there are a couple of issues with special characters), you would automatically inherit these changes and still be in sync.

I made the choice to invest the effort to figure out Spirit rather than write my 10,000th file parser. Of course that was my decision and may not be everyone's preference.

Good Luck

Robert,

XML is normally parsed using a DOM or SAX parser. DOM reads the whole file into memory; SAX behaves like a recursive descent parser with callbacks to the client application. By placing the data block at the start of the file I should be able to get good performance from SAX or Spirit. Both would be good choices; however, I want to write some 'reference' code using standard tools, since my work will be part of an SDK.

Regards,
Terence
Participants (2): Robert Ramey, Terence Wilson