Re: [boost] [GSOC] XML library of Boost

9 May 2013

This is a fun topic.  How should C++ play 'catch-up' with other
languages on XML handling?

What applications will develop from such an XML API?  XML editors and
XML creators/modifiers? Data flow and communication between apps, web
services?
What can be leveraged in C++ to do something new/faster with XML? If
there were a way to compile a shared library at run time and then load
it dynamically, some pretty nifty things could be achieved
with metaprogramming and expression templates.

I'm not sure there are any backend candidates strong enough to
satisfy C++ developers and users at this time, but there have to
be needs besides mine.

Xerces handles large XML documents poorly. As far as DOM goes, is
rearranging XML elements/attributes being pursued?
http://xalan.apache.org/ is XSLT 1.0, and after 2.0 no one wants to go
back to 1.0.

Binding is an important area for me. xmlbeanscxx, which is based on
Xerces, couldn't satisfy my needs for binding data into my applications
(the underlying DOM wasn't helpful for that task). XML Schema
constraints are a must for binding.
http://sourceforge.net/projects/pion/ could really use a binder inside
its RESTful web service. In other languages, the compact syntax of
http://relaxng.org/ is getting addressed too.

I just saw http://code.google.com/p/xplus-xsd2cpp/ recently and have yet 
to test it. (If you do try it, do so outside of any of your own code and 
in its own folder)

To give examples, I use CML, MathML, GraphML, SVG, BibTeXML and a number
of custom XML formats.  Each of these has its quirks and is
difficult to bind.

I haven't tried http://vtd-xml.sourceforge.net/ for a while because its
license doesn't work for my company. With custom code I've been doing
something similar for simply reading data from XML documents.

On 05/09/2013 10:26 AM, Stefan Seefeld wrote:
...
Bjorn,
we are going in circles, which is in part because we still are talking
past each other.
In particular, it seems you aren't distinguishing between users and
developers.
On 05/09/2013 06:00 AM, Bjorn Reese wrote:
...
On 05/08/2013 02:08 PM, Stefan Seefeld wrote:
...
You are evading the question. A user may not even care how boost.xml is
implemented, as long as the functionality is there. If I'm such a user,
I don't want to be confronted with the question of what backend to pick.
Then create a 'boost-xml-standalone' package without dependencies, and
let the 'boost-xml' package depend on the 'boost-xml-standalone' and
'libxml2' packages. Problem solved.
Sorry, what problem is solved ?
...
...
Right. But again, I think you are making life much harder than it needs
to be for users. As a user I want to use the boost.xml library in my own
project. Do you really anticipate there to be a bunch of different
backends being offered to end-users to pick from, depending on what
functionality he requires ? What a drag ! Just give him a single
I thought that this was part of the GSoC proposal, which states:
[...]
You are citing out of context. Implementing multiple backends has many
benefits for *developers*, for example as it helps to guarantee that the
API isn't tied to a particular backend. It should not affect in any way
*users*, who will only use the boost.xml API (and library), without any
concern for any particular implementation choice.
...
Having said that, with the proper defaults, the user does not have to do
anything. Only if he wants to do something different does he need to
include another header, pass an extra argument, or whatever. This is
how the rest of Boost handles variation. Why has this suddenly become
much harder?
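One way to read the "proper defaults" point above is a backend chosen
through a defaulted template parameter. The following is only a sketch
under that assumption; native_backend, libxml2_backend, basic_document
and document are hypothetical names invented for illustration, not part
of any existing or proposed Boost.XML API.

```cpp
#include <string>

// Hypothetical backends for illustration only; neither is a real Boost type.
struct native_backend {
    static std::string name() { return "native"; }
};

struct libxml2_backend {
    static std::string name() { return "libxml2"; }
};

// The backend is a template parameter with a default, so a user who does
// not care about the implementation never has to name one.
template <typename Backend = native_backend>
class basic_document {
public:
    // Reports which backend this instantiation was built on.
    std::string backend_name() const { return Backend::name(); }
};

// What a typical user would write; the default backend is picked silently.
using document = basic_document<>;
```

With this shape, a user who writes document gets the default, and only a
user who wants something different spells out
basic_document<libxml2_backend>.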
It hasn't, and when expressed that way, I actually agree. What I don't
agree with is this:
...
Start with an XML lexer. This simply returns the next token (start tag,
attribute, data, etc.) when called.
Put the XML lexer in a loop, and you get a SAX parser.
Pair the XML lexer with a parent stack, and you get an XmlReader.
Base the DOM parser on the SAX parser to create its tree. This is how
libxml2 does it, and how it reuses the tree generator for parsing other
formats such as HTML and DocBook.
By default, I would provide our own tree, although this is not terribly
important.
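The first three layers described above can be sketched roughly as
follows. This is a toy under loud assumptions: it recognises only start
tags, end tags and character data (no attributes, comments, entities or
error handling), and every name in it (toy_lexer, sax_parse, toy_reader,
and so on) is made up for illustration rather than taken from libxml2 or
any proposal.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy token kinds; a real lexer would also report attributes, comments, etc.
enum class token_kind { start_tag, end_tag, text, end_of_input };

struct token {
    token_kind kind;
    std::string value;  // tag name or character data
};

// Layer 1: the lexer simply returns the next token each time it is called.
class toy_lexer {
public:
    explicit toy_lexer(std::string input) : input_(std::move(input)) {}

    token next() {
        if (pos_ >= input_.size()) return {token_kind::end_of_input, ""};
        if (input_[pos_] == '<') {
            std::size_t close = input_.find('>', pos_);
            if (close == std::string::npos) close = input_.size();
            bool is_end = pos_ + 1 < input_.size() && input_[pos_ + 1] == '/';
            std::size_t begin = pos_ + (is_end ? 2 : 1);
            std::string name = input_.substr(begin, close - begin);
            pos_ = close < input_.size() ? close + 1 : close;
            return {is_end ? token_kind::end_tag : token_kind::start_tag, name};
        }
        // Everything up to the next '<' is character data.
        std::size_t lt = input_.find('<', pos_);
        if (lt == std::string::npos) lt = input_.size();
        std::string data = input_.substr(pos_, lt - pos_);
        pos_ = lt;
        return {token_kind::text, data};
    }

private:
    std::string input_;
    std::size_t pos_ = 0;
};

// Layer 2: put the lexer in a loop and dispatch callbacks (push, SAX style).
struct sax_handlers {
    std::function<void(const std::string&)> on_start;
    std::function<void(const std::string&)> on_end;
    std::function<void(const std::string&)> on_text;
};

void sax_parse(const std::string& input, const sax_handlers& h) {
    toy_lexer lexer(input);
    for (token t = lexer.next(); t.kind != token_kind::end_of_input;
         t = lexer.next()) {
        switch (t.kind) {
            case token_kind::start_tag: if (h.on_start) h.on_start(t.value); break;
            case token_kind::end_tag:   if (h.on_end) h.on_end(t.value); break;
            case token_kind::text:      if (h.on_text) h.on_text(t.value); break;
            case token_kind::end_of_input: break;
        }
    }
}

// Layer 3: pair the lexer with a parent stack (pull, XmlReader style).
class toy_reader {
public:
    explicit toy_reader(std::string input) : lexer_(std::move(input)) {}

    // Advances to the next token; returns false at end of input.
    bool read() {
        current_ = lexer_.next();
        if (current_.kind == token_kind::start_tag) {
            parents_.push_back(current_.value);
        } else if (current_.kind == token_kind::end_tag && !parents_.empty()) {
            parents_.pop_back();
        }
        return current_.kind != token_kind::end_of_input;
    }

    const token& current() const { return current_; }

    // Number of elements still open after the current token was applied.
    std::size_t depth() const { return parents_.size(); }

private:
    toy_lexer lexer_;
    std::vector<std::string> parents_;
    token current_{token_kind::end_of_input, ""};
};
```

Both sax_parse and toy_reader sit directly on toy_lexer; a tree builder
would then be one more client of the SAX callbacks.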
While the layering you describe pretty much matches a typical
implementation, this doesn't have any consequences for users, as these
layers can't be exchanged. You can't mix a layer from one backend and
combine it with another layer from a different backend. So why care, on
an API level ?
I believe your point was that you want to be able to implement only the
"XML lexer", but neither the SAX nor DOM APIs, and still be able to call
the result "boost.xml", yes ? I still think this is a bad idea.
Otherwise, as long as the full functionality is provided, I don't care
about the implementation, and in particular, whether someone will fancy
to rewrite it "natively" instead of building on top of existing
third-party libs.
Stefan

Roger Martin