Re: [boost] Proposal: XML APIs in boost

9 Nov 2005

On Fri, Nov 04, 2005 at 08:47:53AM -0500, Stefan Seefeld wrote:
> Jose wrote:
>> On 11/3/05, Stefan Seefeld <seefeld@sympatico.ca> wrote:
>>> If there is enough interest I could add a boost::xml::reader API,
>>> though dom and reader are completely independent, as far as the API
>>> itself is concerned.
>>
>> I am interested in the reader API !!
>
> Fine. I eventually will add that API, though I prefer to restrict the
> current discussion (and submission, if we ever get this far) to be
> about boost::xml::dom, to avoid unrelated issues getting into the way.
> Both APIs are orthogonal, so there is no reason to bundle them
> together.
> Now I hope that someone will actually start to send comments about the
> actual API / code I propose. ;-)

Hi,

Here are my thoughts on the subject, somewhat disorganised:

I would also be very interested in seeing a boost XML reader API.
Having worked with such APIs, I think a 'standard' pull parser model for
C++ would be really beneficial - you only have to look at .NET to see
how much having a streaming interface to XML has influenced the design
of code that builds on top of it, usually in a good way. I think the
interface of a C++ reader should draw heavily on the basic iterator
ideas employed in the STL.
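
To make the iterator idea a bit more concrete, here is a rough sketch of
what I have in mind. All of the names (event, vector_reader,
event_iterator, count_elements) are made up for illustration and are not
part of the proposed boost::xml code, and the reader just replays
pre-tokenised events rather than parsing real XML:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

// Hypothetical event model, assumed for this sketch only.
enum class event_kind { start_element, end_element, text };

struct event {
    event_kind kind;
    std::string value;  // element name, or character data for text events
};

// Pull reader over events held in memory. A real reader would tokenise
// an input stream lazily; the vector keeps the sketch self-contained.
class vector_reader {
public:
    explicit vector_reader(std::vector<event> events)
        : events_(std::move(events)), pos_(0), started_(false) {}

    // Move to the next event; returns false once the input is exhausted.
    bool next() {
        if (!started_) started_ = true;
        else if (pos_ < events_.size()) ++pos_;
        return pos_ < events_.size();
    }

    // Only valid after next() has returned true.
    const event& current() const {
        assert(started_ && pos_ < events_.size());
        return events_[pos_];
    }

private:
    std::vector<event> events_;
    std::size_t pos_;
    bool started_;
};

// Input iterator over any type offering next() and current(), in the
// spirit of std::istream_iterator. A default-constructed iterator marks
// the end of the stream.
template <typename Reader>
class event_iterator {
public:
    using iterator_category = std::input_iterator_tag;
    using value_type = event;
    using difference_type = std::ptrdiff_t;
    using pointer = const event*;
    using reference = const event&;

    event_iterator() : reader_(nullptr), value_{event_kind::text, std::string()} {}
    explicit event_iterator(Reader& r)
        : reader_(&r), value_{event_kind::text, std::string()} { advance(); }

    reference operator*() const { return value_; }
    pointer operator->() const { return &value_; }
    event_iterator& operator++() { advance(); return *this; }
    event_iterator operator++(int) { event_iterator old(*this); advance(); return old; }

    friend bool operator==(const event_iterator& a, const event_iterator& b) {
        return a.reader_ == b.reader_;
    }
    friend bool operator!=(const event_iterator& a, const event_iterator& b) {
        return !(a == b);
    }

private:
    // Cache the current event so that *it++ yields the old value.
    void advance() {
        if (reader_->next()) value_ = reader_->current();
        else reader_ = nullptr;
    }
    Reader* reader_;
    event value_;
};

// Standard algorithms then work directly on the event stream.
template <typename Reader>
std::size_t count_elements(Reader& reader, const std::string& name) {
    return static_cast<std::size_t>(std::count_if(
        event_iterator<Reader>(reader), event_iterator<Reader>(),
        [&name](const event& e) {
            return e.kind == event_kind::start_element && e.value == name;
        }));
}
```

The point is only that once a reader exposes an input-iterator view,
things like std::count_if and std::find_if come for free.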

IMO a streaming interface is much more important than DOM as a starting
point - one can easily and efficiently build a DOM from a stream, but
starting with an in-memory representation of a document usually
precludes streaming.  There are a number of XML applications where it is
not desirable or possible to hold the entire document in memory at once.
A reader interface also has an advantage over SAX in that it is much
easier to program with: the consumer pulls events when it wants them
instead of having them pushed at it through callbacks.  It's very easy
to do things like implement decorators around readers, and to write
generic code that just understands how to use a reader and doesn't care
how the XML is actually stored.
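
As a minimal sketch of the decorator point (again with invented names,
and an in-memory reader standing in for a real parser), a filter that
hides whitespace-only text can wrap any reader, and generic code such as
concatenate_text below neither knows nor cares which one it is given:

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical event model, assumed for this sketch only.
enum class event_kind { start_element, end_element, text };

struct event {
    event_kind kind;
    std::string value;  // element name, or character data for text events
};

// Simplest possible reader: replays events already held in memory.
class vector_reader {
public:
    explicit vector_reader(std::vector<event> events)
        : events_(std::move(events)), pos_(0), started_(false) {}

    bool next() {
        if (!started_) started_ = true;
        else if (pos_ < events_.size()) ++pos_;
        return pos_ < events_.size();
    }

    // Only valid after next() has returned true.
    const event& current() const { return events_[pos_]; }

private:
    std::vector<event> events_;
    std::size_t pos_;
    bool started_;
};

// Decorator: wraps any reader and hides whitespace-only text events.
template <typename Reader>
class skip_whitespace_reader {
public:
    explicit skip_whitespace_reader(Reader& inner) : inner_(inner) {}

    bool next() {
        while (inner_.next()) {
            const event& e = inner_.current();
            const bool blank_text =
                e.kind == event_kind::text &&
                e.value.find_first_not_of(" \t\r\n") == std::string::npos;
            if (!blank_text) return true;
        }
        return false;
    }

    const event& current() const { return inner_.current(); }

private:
    Reader& inner_;
};

// Generic consumer: works with any type offering next() and current().
template <typename Reader>
std::string concatenate_text(Reader& reader) {
    std::string out;
    while (reader.next()) {
        if (reader.current().kind == event_kind::text)
            out += reader.current().value;
    }
    return out;
}
```

Decorators stack in the obvious way, since a skip_whitespace_reader is
itself just another reader.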

That's not to say I don't think a Boost DOM implementation is a good
idea.  One thing I would like to see from such an implementation is for
it to be policy based, since there are many different use cases for a
DOM library.  For example some scenarios might only need a read-only
tree, which means optimisations can be made in how the nodes are stored.
Others might call for efficient access to child elements of a node (e.g.
by index) for query, such as when XPath is used.  If this kind of thing
could be extracted into policies I think it would differentiate such a
library from the others that exist already.
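
To illustrate the sort of thing I mean, here is a small sketch in which
a storage policy picks the container used for child nodes. The names
(vector_storage, list_storage, basic_element) are hypothetical and have
nothing to do with the code Stefan has proposed; a read-only policy
would follow the same pattern but is not shown:

```cpp
#include <cstddef>
#include <iterator>
#include <list>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Policy: contiguous storage, constant-time access to children by index.
struct vector_storage {
    template <typename T> using container = std::vector<T>;
};

// Policy: linked storage, cheap insertion but linear-time indexing.
struct list_storage {
    template <typename T> using container = std::list<T>;
};

// Element type parameterised on how its children are stored.
template <typename StoragePolicy>
class basic_element {
public:
    using child_ptr = std::unique_ptr<basic_element>;
    using children_type =
        typename StoragePolicy::template container<child_ptr>;

    explicit basic_element(std::string name) : name_(std::move(name)) {}

    const std::string& name() const { return name_; }
    const children_type& children() const { return children_; }

    // Appends a new child element and returns a reference to it.
    basic_element& append_child(std::string name) {
        children_.push_back(child_ptr(new basic_element(std::move(name))));
        return *children_.back();
    }

    // Child by position; the cost depends on the chosen policy.
    // The index must be less than children().size().
    const basic_element& child_at(std::size_t index) const {
        auto it = children_.begin();
        std::advance(it, static_cast<std::ptrdiff_t>(index));
        return **it;
    }

private:
    std::string name_;
    children_type children_;
};
```

Code that queries by index would pick basic_element<vector_storage>,
while code that mostly edits the tree might prefer
basic_element<list_storage>, without either paying for the other.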

An XPath implementation should be completely separated from the XML
representation, since it's effectively just an algorithm that can be
applied to anything that has the correct data model and iterator
interface.
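
A tiny sketch of that separation, under the assumption that a node only
needs to offer name() and children(): the function below does roughly
what the XPath expression .//name does relative to a context node, and
is written purely against that interface. simple_node is a made-up
stand-in for whatever tree type is actually used:

```cpp
#include <string>
#include <utility>
#include <vector>

// Collects, in document order, every descendant of `node` whose name
// matches. Works with any Node type providing name() and children().
template <typename Node>
void select_descendants(const Node& node, const std::string& name,
                        std::vector<const Node*>& out) {
    for (const Node& child : node.children()) {
        if (child.name() == name) out.push_back(&child);
        select_descendants(child, name, out);
    }
}

// Hypothetical minimal tree type that happens to satisfy the interface.
class simple_node {
public:
    explicit simple_node(std::string name) : name_(std::move(name)) {}

    const std::string& name() const { return name_; }
    const std::vector<simple_node>& children() const { return children_; }

    // Appends a child and returns a reference to it. The reference is
    // invalidated by later calls to add() on the same parent.
    simple_node& add(std::string name) {
        children_.emplace_back(std::move(name));
        return children_.back();
    }

private:
    std::string name_;
    std::vector<simple_node> children_;
};
```

Nothing in select_descendants mentions simple_node, so the same code
could run over a DOM tree or any other structure with the same shape.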

Thanks,

Graham

-- 
Graham Bennett