Re: [boost] [xml] XML Reader Interface

30 Oct 2006


loufoque wrote:
...
Sebastian Redl wrote:
...
Once again I'm turning to the list for discussion about a design issue
in the XML library. This time I hope to avoid any discussion about the
implementation of the library and focus on the interface only.
Have you thought about asynchronous parsing?
How could that be available?
I have thought about it. A pull interface is not very well suited to
asynchronous parsing, but it will provide non-blocking parsing
(returning a "would-block" event if not enough input is available).
The push interface will provide asynchronous parsing by somehow
registering with an ASIO io_service. I'll have to take a closer look at
ASIO to find out how exactly to realize this, though.
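
To make the non-blocking pull idea concrete, here is a minimal sketch of a reader that returns a "would-block" event when the buffered input ends in the middle of a token. Everything in it is an assumption made for illustration: the names event_kind, event, toy_pull_reader, feed, finish and next are hypothetical and are not the proposed library's interface, and the "parser" only recognises <name>, </name> and plain text.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical event kinds, for illustration only.
enum class event_kind { start_tag, end_tag, text, would_block, end_of_input };

struct event {
    event_kind kind;
    std::string data;  // tag name or character data; empty otherwise
};

// Toy pull reader: recognises only <name>, </name> and plain text.
class toy_pull_reader {
public:
    // Append newly arrived input; may be called between next() calls.
    void feed(const std::string& chunk) { buffer_ += chunk; }

    // Tell the reader that no more input will arrive.
    void finish() { finished_ = true; }

    // Return the next event, or would_block if the buffered input
    // stops before the current token is complete.
    event next() {
        if (pos_ == buffer_.size())
            return stalled();

        if (buffer_[pos_] == '<') {
            const std::size_t gt = buffer_.find('>', pos_);
            if (gt == std::string::npos)
                return stalled();  // tag not complete yet
            const bool is_end = buffer_[pos_ + 1] == '/';
            const std::size_t name_begin = pos_ + (is_end ? 2 : 1);
            std::string name = buffer_.substr(name_begin, gt - name_begin);
            pos_ = gt + 1;
            return {is_end ? event_kind::end_tag : event_kind::start_tag, name};
        }

        // Text runs up to the next '<'. While input is still open, more
        // text may follow, so the reader waits instead of guessing.
        std::size_t lt = buffer_.find('<', pos_);
        if (lt == std::string::npos) {
            if (!finished_)
                return {event_kind::would_block, ""};
            lt = buffer_.size();
        }
        std::string text = buffer_.substr(pos_, lt - pos_);
        pos_ = lt;
        return {event_kind::text, text};
    }

private:
    // Simplification: a real parser would report a well-formedness
    // error if the input ends inside a tag.
    event stalled() const {
        return {finished_ ? event_kind::end_of_input : event_kind::would_block, ""};
    }

    std::string buffer_;
    std::size_t pos_ = 0;
    bool finished_ = false;
};

// Usage: the caller keeps pulling and feeds more input on would_block.
inline void usage_example() {
    toy_pull_reader reader;
    reader.feed("<a>he");
    assert(reader.next().kind == event_kind::start_tag);
    assert(reader.next().kind == event_kind::would_block);
    reader.feed("llo</a>");
    reader.finish();
    assert(reader.next().data == "hello");
    assert(reader.next().kind == event_kind::end_tag);
    assert(reader.next().kind == event_kind::end_of_input);
}
```

The caller never blocks inside next(); it decides for itself when to wait for more input, which is what separates this from true asynchronous (push) parsing.
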
...
There are of course variations, like the one Matt Gruenke revealed.
You could provide the inheritance interface but with the objects 
actually owned by the parser (making it kind of like the monolithic 
interface), and use variant to store those objects on the stack.
This idea doesn't look so bad actually, since you get the second
solution without its drawbacks while keeping the advantages of the
first (provided you supply the appropriate tools to allow copy
construction of the referenced objects, that is).
Yes, that sounds like a good solution indeed.
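
A rough sketch of that variation, as I read it: the event classes form an inheritance hierarchy, but the reader owns the current event inside a variant, so nothing is allocated on the heap and the user can still copy an event out. All names here (xml_event, start_element, characters, owning_reader) are hypothetical, and std::variant (C++17) stands in for Boost.Variant to keep the example self-contained.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <variant>

// Polymorphic event interface (hypothetical).
struct xml_event {
    virtual ~xml_event() = default;
    virtual std::string describe() const = 0;
};

struct start_element : xml_event {
    std::string name;
    explicit start_element(std::string n) : name(std::move(n)) {}
    std::string describe() const override { return "start:" + name; }
};

struct characters : xml_event {
    std::string text;
    explicit characters(std::string t) : text(std::move(t)) {}
    std::string describe() const override { return "text:" + text; }
};

// In-place storage for whichever event is current.
using event_storage = std::variant<start_element, characters>;

// Polymorphic view of the alternative currently held.
inline const xml_event& as_event(const event_storage& storage) {
    return std::visit(
        [](const auto& e) -> const xml_event& { return e; }, storage);
}

// The reader owns the event object; users see it through the base class.
class owning_reader {
public:
    void emit_start(std::string name) {
        current_.emplace<start_element>(std::move(name));
    }
    void emit_text(std::string text) {
        current_.emplace<characters>(std::move(text));
    }

    // Valid only until the next emit_* call overwrites the storage.
    const xml_event& current() const { return as_event(current_); }

    // Copying the variant is the "tool" that lets users keep an event.
    event_storage snapshot() const { return current_; }

private:
    event_storage current_{start_element{""}};  // placeholder initial event
};

inline void usage_example() {
    owning_reader reader;
    reader.emit_start("doc");
    event_storage saved = reader.snapshot();
    reader.emit_text("hi");
    assert(reader.current().describe() == "text:hi");
    assert(as_event(saved).describe() == "start:doc");
}
```

The emit_* functions are stand-ins for whatever the real parser would do internally when it advances. The point is only that the reference returned by current() dies on the next advance, while a snapshot does not.
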
...
I don't understand, though, whether you mean that the parser containing
its state is a good thing or not.
Neither. Both modes have advantages and disadvantages.
...
Examples of how some basic operations could be done with those
interfaces would come in handy to compare them for those who, like me,
don't have much experience with parsing XML.
Yes, good idea. I'll work something up.
...
Validation is quite costly: a way to disable it would be nice. And it's
not just DTDs; there are other means of validation.
Like RELAX NG and XML Schema. I know. But as Stefan Seefeld correctly
posted, this is not about validation. This is about what to do with
errors that come up during validation and/or well-formedness checking.
...
However, without validation you don't know what the `id' attribute is, 
which is quite annoying. It seems that's why they introduced xml:id.
Browser engines like Gecko don't validate but they know what the id 
attributes are for each namespace that they handle. Maybe something 
similar could be done, be it with static data or user input.
I plan to support the xml:id specification, but not store any knowledge
of specific namespaces. Of course, there will be a way to feed the
validator programmatically, so this could be implemented easily on top
of that.
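
As a sketch of what feeding ID knowledge programmatically might look like: a small registry that always treats xml:id as an ID and lets the user declare further ID attributes. The class and member names are hypothetical, and keying on (attribute namespace URI, local name) is a simplification, since real ID-ness can also depend on the element that carries the attribute.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>

// Hypothetical registry of attributes that should be treated as IDs.
class id_attribute_registry {
public:
    id_attribute_registry() {
        // xml:id lives in the XML namespace and is always an ID.
        ids_.emplace("http://www.w3.org/XML/1998/namespace", "id");
    }

    // User-supplied knowledge, e.g. for a vocabulary the application handles.
    void declare_id(std::string ns_uri, std::string local_name) {
        ids_.emplace(std::move(ns_uri), std::move(local_name));
    }

    bool is_id(const std::string& ns_uri, const std::string& local_name) const {
        return ids_.count(std::make_pair(ns_uri, local_name)) != 0;
    }

private:
    std::set<std::pair<std::string, std::string>> ids_;
};

inline void usage_example() {
    id_attribute_registry registry;
    // "urn:example:app" and "key" are made-up names for illustration.
    registry.declare_id("urn:example:app", "key");
    assert(registry.is_id("http://www.w3.org/XML/1998/namespace", "id"));
    assert(registry.is_id("urn:example:app", "key"));
    assert(!registry.is_id("", "id"));  // a plain `id` is not an ID by default
}
```
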
...
Should errors be reported as error events, or as
exceptions?
We expect errors to happen, so we shouldn't use exceptions.
Do we? A SOAP server typically expects to receive programmatically
generated XML, so it ought to be error-free.
On the other hand, an XML-aware editor fully expects errors, because
they're guaranteed to be there in incomplete documents.
...
We could allow them to be toggled on though, for users that don't want 
to check for such things and are not looking for super efficiency.
That's what I think, too.
Maybe they should be using a higher-level API then, though.
Perhaps, but some people might have memory as their main constraint, not
speed. They would still want to use a low-level interface, yet not
expect errors.
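
One way the toggle could look, as a sketch under my own assumptions rather than a settled design: the same low-level check either returns an error result or throws, depending on a mode chosen when the object is constructed. The names error_mode, parse_error, check_result and tag_checker are hypothetical, and the "check" is a toy that only compares a start tag name with an end tag name.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// How the caller wants errors delivered (hypothetical).
enum class error_mode { as_event, as_exception };

struct parse_error : std::runtime_error {
    using std::runtime_error::runtime_error;
};

enum class result_kind { ok, error };

struct check_result {
    result_kind kind;
    std::string message;  // empty when kind == ok
};

// Toy well-formedness check: the end tag must match the start tag.
class tag_checker {
public:
    explicit tag_checker(error_mode mode) : mode_(mode) {}

    check_result check_pair(const std::string& open_name,
                            const std::string& close_name) const {
        if (open_name == close_name)
            return {result_kind::ok, ""};

        const std::string message = "mismatched end tag: expected </" +
                                    open_name + ">, got </" + close_name + ">";
        if (mode_ == error_mode::as_exception)
            throw parse_error(message);       // caller does not expect errors
        return {result_kind::error, message}; // caller inspects every result
    }

private:
    error_mode mode_;
};

inline void usage_example() {
    // An editor expects errors and wants them as ordinary results.
    tag_checker editor(error_mode::as_event);
    assert(editor.check_pair("a", "b").kind == result_kind::error);

    // A SOAP server expects clean input and lets errors propagate.
    tag_checker server(error_mode::as_exception);
    bool thrown = false;
    try {
        server.check_pair("a", "b");
    } catch (const parse_error&) {
        thrown = true;
    }
    assert(thrown);
}
```
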

Sebastian Redl