Re: [Boost-users] Boost Serialization make_binary & XML/ASCII

19 Dec 2006

      Terence,

We use Spirit in a similar way to SAX: we read binary data from a serial
port, assemble it into discrete units (messages) and push them through
a parser generated by Spirit, which in turn calls a function upon finding
a match.  Some of our messages are multi-part, so we need to keep some
form of state that allows us to assemble the final message.  I would
imagine that there would be a way to break from the parsing.
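
The pattern described above (callbacks on a match, state kept across
multi-part messages, and a way to break out of parsing) can be sketched
without Spirit. This is only an illustration, not the code from the
post: the part format "<index>/<total>:<payload>" and the names
MessageAssembler and collect_messages are hypothetical.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical part format (an assumption, not from the original post):
// "<index>/<total>:<payload>", e.g. "1/2:HEL" followed by "2/2:LO".
class MessageAssembler {
public:
    // Called with each completed message; return false to stop parsing.
    using Callback = std::function<bool(const std::string&)>;

    explicit MessageAssembler(Callback on_complete)
        : on_complete_(std::move(on_complete)) {}

    // Feed one part. Returns false when the caller should stop: either
    // the part was malformed/out of order or the callback asked to stop.
    bool feed(const std::string& part) {
        unsigned index = 0, total = 0;
        std::size_t pos = 0;
        if (!read_number(part, pos, index) || pos >= part.size() || part[pos++] != '/')
            return fail();
        if (!read_number(part, pos, total) || pos >= part.size() || part[pos++] != ':')
            return fail();
        if (index != next_index_ || index > total)
            return fail();

        buffer_ += part.substr(pos);            // state kept between parts
        if (index < total) {
            ++next_index_;
            return true;                        // message not complete yet
        }
        const std::string message = buffer_;
        reset();
        return on_complete_(message);           // the "semantic action"
    }

private:
    static bool read_number(const std::string& s, std::size_t& pos, unsigned& out) {
        const std::size_t start = pos;
        out = 0;
        while (pos < s.size() && s[pos] >= '0' && s[pos] <= '9') {
            out = out * 10 + static_cast<unsigned>(s[pos] - '0');
            ++pos;
        }
        return pos > start;                     // at least one digit read
    }

    void reset() { buffer_.clear(); next_index_ = 1; }
    bool fail() { reset(); return false; }

    Callback on_complete_;
    std::string buffer_;
    unsigned next_index_ = 1;
};

// Push parts through the assembler and stop once `limit` messages
// have been completed.
std::vector<std::string> collect_messages(const std::vector<std::string>& parts,
                                          std::size_t limit) {
    std::vector<std::string> messages;
    MessageAssembler assembler([&](const std::string& m) {
        messages.push_back(m);
        return messages.size() < limit;         // false => break from parsing
    });
    for (const std::string& part : parts) {
        if (!assembler.feed(part)) break;
    }
    return messages;
}
```

With Spirit, the role of feed() would be played by the generated parser
and the callback by a semantic action; the boolean return value is just
one possible way to signal an early exit.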

-----Original Message-----
From: boost-users-bounces@lists.boost.org
[mailto:boost-users-bounces@lists.boost.org] On Behalf Of Terence Wilson
Sent: Saturday, December 09, 2006 2:55 PM
To: boost-users@lists.boost.org
Subject: Re: [Boost-users] Boost Serialization make_binary & XML/ASCII

Robert,

XML is normally parsed using a DOM or SAX parser. DOM reads the whole
file into memory, SAX behaves like a recursive descent parser with
callbacks to the client application. By placing the data block at the
start of the file I should be able to get good performance from SAX or
Spirit. Both would be good choices, however, I want to write some
'reference' code using standard tools since my work will be part of an
SDK.

Regards,

Terence
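
To illustrate why a callback-driven (SAX-style) reader helps when the
data block sits near the start of the file, here is a rough sketch of a
streaming scan that stops as soon as the wanted element has been read.
It is not a real XML parser and not the serialization library's
grammar: comments, CDATA, entities and nested elements of the same name
are not handled, and the names Handlers, scan and extract_first, as
well as the element names in the usage note, are hypothetical.

```cpp
#include <cstddef>
#include <functional>
#include <istream>
#include <sstream>
#include <string>

// Each handler returns false to stop the scan early.
struct Handlers {
    std::function<bool(const std::string& name)> start_element;
    std::function<bool(const std::string& text)> characters;
    std::function<bool(const std::string& name)> end_element;
};

// Simplified streaming scanner: recognizes <name ...>, </name> and text.
// Returns the number of characters read from the stream.
std::size_t scan(std::istream& in, const Handlers& h) {
    std::size_t consumed = 0;
    std::string text;
    char c;
    while (in.get(c)) {
        ++consumed;
        if (c != '<') { text += c; continue; }
        if (!text.empty()) {
            if (!h.characters(text)) return consumed;
            text.clear();
        }
        std::string tag;                        // everything up to '>'
        while (in.get(c)) {
            ++consumed;
            if (c == '>') break;
            tag += c;
        }
        if (tag.empty() || tag[0] == '?' || tag[0] == '!') continue;  // skip declarations
        if (tag[0] == '/') {
            if (!h.end_element(tag.substr(1))) return consumed;
        } else {
            const std::string name = tag.substr(0, tag.find_first_of(" \t\r\n/"));
            if (!h.start_element(name)) return consumed;
        }
    }
    return consumed;
}

// Returns the text of the first <target> element and stops reading there.
// If `consumed` is non-null it receives the number of characters read.
std::string extract_first(std::istream& in, const std::string& target,
                          std::size_t* consumed = nullptr) {
    bool inside = false;
    std::string result;
    Handlers h;
    h.start_element = [&](const std::string& n) {
        if (n == target) inside = true;
        return true;
    };
    h.characters = [&](const std::string& t) {
        if (inside) result += t;
        return true;
    };
    h.end_element = [&](const std::string& n) {
        return !(inside && n == target);        // false => stop the scan
    };
    const std::size_t n = scan(in, h);
    if (consumed) *consumed = n;
    return result;
}
```

For example, given a stream containing
<archive><data>abc</data><big>...</big></archive>, calling
extract_first(stream, "data") reads only up to </data> and never
touches the rest of the file.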
...
-----Original Message-----
From: boost-users-bounces@lists.boost.org
[mailto:boost-users-bounces@lists.boost.org] On Behalf Of Robert Ramey
Sent: Saturday, December 09, 2006 2:10 PM
To: boost-users@lists.boost.org
Subject: Re: [Boost-users] Boost Serialization make_binary & XML/ASCII

Well, now I'm out of my depth.  Some have commented that the Spirit
parser is slower than other XML parsers.  I don't know.  I would have
hoped that since Spirit does a lot of the heavy lifting at compile
time, it would be pretty fast.  I haven't seen too much data on this
so I really don't know.

Any parser has to scan every character in the file, so it's not clear
to me that a SAX parser or any other can be known a priori to be faster
than any other one.

My reasons for using Spirit were:
a) it was already part of Boost.
b) it was - after some learning curve - a good fit with what I wanted
to do.
c) well documented.
d) customizable - serialization only uses a portion of the full XML so
it seemed the most efficient.
e) all done at compile time so it wouldn't include dead code.
f) portability to all compilers Boost supports.
g) By exercising a little care in code organization I was able to
arrange things so that the module containing the parsing didn't depend
on the rest of the program.  So the long compile time is not an issue.
It is in the library and is only recompiled when the grammar changes.

It is the last feature that suggests that you can easily use this to
do your own actions upon parsing the serialization library's XML
output.

After some initial pain figuring out how to use it, I have to say I
have been extremely pleased with this application of Spirit.  I never
wanted to do XML serialization as I felt it was a pain in the neck and
of relatively little utility in my view.  I had anticipated a
maintenance nightmare as more and more obscure corners of XML syntax
were touched.  I'm pleased to say this thing has been fantastic as far
as I'm concerned.  After the initial one-time pain, I haven't had to
touch it since 2002 - and this (through Spirit 1.6.x - still available)
is still compatible with Borland 5.51.  And all the hacks required to
make this so portable are only compiled into the platforms that need
them.

This has been one of the most significant implementations in making
the serialization library possible.  (The other one would probably be
MPL.)

So if this were my problem I would:
a) Include the XML grammar and parser from the serialization library -
add my own actions.
b) Finish my code.  Really, I would expect this to be about 100 lines.
c) If it's too slow - and if profiling suggests that the Spirit parser
is the bottleneck - then I would look at tweaking the grammar to speed
up parsing or replacing the Spirit parser with a faster one.  This is
my rule:  "First make it work ASAP - then make it faster if necessary"

But I already am somewhat familiar with Spirit so it might not be an
interesting option for you.  But then you might be able to use the
current parser unchanged.  Of course this would bring the huge benefit
that if the xml_archive parser is tweaked for some reason (there are a
couple of issues with special characters), you would automatically
inherit these changes and still be in sync.

I made the choice to invest the effort to figure out Spirit rather
than write my 10,000th file parser.  Of course that was my decision and
may not be everyone's preference.

Good Luck

Terence Wilson wrote:
Robert,
The utility I am writing needs to be able to extract a small portion
from a large XML file generated by your library. Since it is
performance sensitive I chose to use a SAX parser in order to avoid
reading the whole file. Would it be much work to do this with the
Spirit parser?
As always, thanks for the super-fast response.
Best regards,
_______________________________________________
Boost-users mailing list
Boost-users@lists.boost.org
http://lists.boost.org/mailman/listinfo.cgi/boost-users

Re: [Boost-users] Boost Serialization make_binary & XML/ASCII

Javier Estrada