Re: [boost] Proposal: XML APIs in boost

11 Nov 2005

      On Wed, Nov 09, 2005 at 09:21:45PM -0500, Stefan Seefeld wrote:
...
Graham Bennett wrote:
...
Hi Doug,
On Tue, Nov 08, 2005 at 09:28:09PM -0500, Douglas Gregor wrote:
...
Readers are important for some things, DOM is important for other
things, but there's no reason to tie the two together in one library
or predicate one on the other.
Well, there is at least one reason - if the DOM is built on top of a
reader interface then the DOM library doesn't have to know how to
parse XML, and is not tied to any particular parser.  Even if you
don't agree with using a reader interface for this separation layer,
I'd hope you would agree that some separation is at least necessary.
I wish people would stop being so parser-focussed. I reiterate: the
API I suggest is about manipulating a DOM tree. The fact that you
*might* want to construct it from an XML file by means of a parser is
almost coincidental.
I agree that the way the DOM is created doesn't really have anything to
do with a parser or anything else.  It's perfectly possible to put the
DOM together any way you want.  I think people have expressed concern
that the intention might be to ship the library with a libxml2 (or any
specific parser) implementation for building the DOM from text, which I
don't think would be a good idea.  I was suggesting that having a way to
build the DOM from a standardised interface, like a reader, would be a
way to separate these concerns.
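
To make the separation concrete, here is a minimal sketch of a DOM builder that only ever sees a reader interface. Every name in it (XmlReader, NodeKind, Node, build_dom, ReplayReader) is hypothetical and invented for illustration; none of it comes from the submitted library, libxml2, or any existing Boost API. The ReplayReader simply replays a fixed list of events and stands in for a real parser-backed reader.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical event kinds a pull-style reader might report.
enum class NodeKind { StartElement, EndElement, Text };

// Hypothetical streaming reader interface. This is the only thing the
// DOM builder below depends on; it knows nothing about parsing.
class XmlReader {
public:
    virtual ~XmlReader() = default;
    virtual bool read() = 0;                        // advance; false at end of input
    virtual NodeKind kind() const = 0;              // kind of the current event
    virtual const std::string& value() const = 0;   // element name or text content
};

// Deliberately tiny DOM node: an element (name set) or a text node (text set).
struct Node {
    std::string name;
    std::string text;
    std::vector<std::unique_ptr<Node>> children;
};

// Builds a tree from any XmlReader, whatever parser (if any) sits behind it.
std::unique_ptr<Node> build_dom(XmlReader& reader) {
    std::unique_ptr<Node> root;
    std::vector<Node*> open;  // stack of currently open elements
    while (reader.read()) {
        switch (reader.kind()) {
        case NodeKind::StartElement: {
            auto node = std::make_unique<Node>();
            node->name = reader.value();
            Node* raw = node.get();
            if (open.empty()) root = std::move(node);
            else open.back()->children.push_back(std::move(node));
            open.push_back(raw);
            break;
        }
        case NodeKind::Text: {
            assert(!open.empty());  // sketch: text must appear inside an element
            auto node = std::make_unique<Node>();
            node->text = reader.value();
            open.back()->children.push_back(std::move(node));
            break;
        }
        case NodeKind::EndElement:
            assert(!open.empty());  // sketch: assumes well-formed input
            open.pop_back();
            break;
        }
    }
    return root;
}

// Stand-in reader that replays a fixed event list instead of parsing text.
class ReplayReader : public XmlReader {
public:
    using Event = std::pair<NodeKind, std::string>;
    explicit ReplayReader(std::vector<Event> events) : events_(std::move(events)) {}
    bool read() override {
        if (next_ >= events_.size()) return false;
        current_ = next_++;
        return true;
    }
    NodeKind kind() const override { return events_[current_].first; }
    const std::string& value() const override { return events_[current_].second; }
private:
    std::vector<Event> events_;
    std::size_t next_ = 0;
    std::size_t current_ = 0;
};
```

A reader backed by libxml2, expat, or anything else could be handed to build_dom unchanged, which is the point: the DOM code would not need to know which parser produced the events.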
...
Yes indeed, an implementation of such an XML parser will most likely
use either a SAX or an XmlReader layer beneath, and in fact, libxml2
does exactly that and it would be quite natural to expose those APIs
...
to C++ in much the same way as the DOM wrapper I propose.
Ok, I agree.
...
We can have an XML DOM library that allows reading, traversing,
modifying, and writing XML documents, then later turn the reading
part into a full-fledged streaming interface for those
applications.
Can you elaborate on how you would enable a DOM structure to present
a streaming interface?
Not the DOM structure, but the parser! It's exactly what you are
saying above: Each sensible XML parser will use an API underneath that
can be used to build a public SAX or XmlReader (or both) on top of.
But instead of requiring the parser to be built on such a C++ API, I
use a C implementation that already contains multiple APIs, and I wrap
them *separately* into C++ APIs. For a user of the C++ DOM API it is
totally irrelevant whether the implementation is based on the C++ SAX
API or an internal C SAX API, as long as it adheres to the
specification.
...
Are you talking about lazy tree building or something else?  In any
case, I would think it's inherently difficult to retrofit a
streaming interface.  Much better to build the streaming interface
from the start, and build the DOM on top of it.  This can only be
good for both sides - the reader gets to just be a reader, and the
DOM gets to just be a DOM.
You haven't talked about the DOM yet, only about a parser.
I think I wasn't clear in my previous mail.  I'm not at all concerned
with parsers; there are plenty of them and they do a good job.  I'm not
suggesting a parser should be implemented.  The only thing I am
concerned about is that Boost define a standard streaming XML reader
API.  That is where I think there is a distinct need in C++ at the
moment.
...
You still need to provide all the other missing bits, such as an XPath
lookup mechanism, XInclude processing, http support for URI lookup,
etc., etc.  I can't stress it enough: the parser is really just a tiny
bit of it all.
Agreed that the parser is a small part, but so is the DOM.  All of the
things you mention above can and should be implemented independently of
a DOM model, IMO.

Please don't think that I'm against a Boost DOM implementation; I think
it's a worthy effort and what you have submitted is a good start.  I
just think that a standardised reader interface is a much more important
integration point than DOM, and I'm suggesting that it would be
worthwhile putting effort into that area sooner rather than later.

cheers,

Graham

-- 
Graham Bennett