Re: [boost] Proposal: XML APIs in boost

6 Nov 2005


      * Stefan Seefeld <seefeld@sympatico.ca> [2005-11-04 11:39]:
...
Anthony Williams wrote:
...
It is far easier to write a parser that calls user code (push
model) than write a parser that can be continued (pull model),
since in the pull model you have to save all the internal state
in order to return to the user with each token; you basically
have to write a "continuations" mechanism.
Fair enough. But here we are (or should be) focussed on the API,
i.e. the user. The question is whether to put the parser in
control of the data flow or the application. While the latter is
harder to implement it is also far more convenient for users.
Harder to implement could also imply a complexity that affects
    performance. If the user is consuming a document object model,
    whether that document is built via a push parser or a pull
    parser is moot, and the overhead of maintaining pull parser
    state is nothing but a penalty.
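To make the push/pull distinction above concrete, here is a minimal sketch. It assumes a toy tokenizer that splits on spaces in place of a real XML parser; the names push_parse and pull_parser are hypothetical and belong to no proposed API. The only point is where the loop lives and what state has to survive between calls.

```cpp
#include <cstddef>
#include <functional>
#include <optional>
#include <string>
#include <utility>

// Push model: the parser owns the loop and calls user code per token.
// All parsing state lives in local variables of this one function.
void push_parse(const std::string& input,
                const std::function<void(const std::string&)>& on_token)
{
    std::size_t pos = 0;
    while (pos < input.size()) {
        std::size_t end = input.find(' ', pos);
        if (end == std::string::npos) end = input.size();
        if (end > pos) on_token(input.substr(pos, end - pos));
        pos = end + 1;
    }
}

// Pull model: the application owns the loop, so the parser has to
// remember where it stopped (pos_) every time it returns a token.
class pull_parser {
public:
    explicit pull_parser(std::string input) : input_(std::move(input)) {}

    // Returns the next token, or std::nullopt at end of input.
    std::optional<std::string> next()
    {
        while (pos_ < input_.size()) {
            std::size_t end = input_.find(' ', pos_);
            if (end == std::string::npos) end = input_.size();
            std::size_t start = pos_;
            pos_ = end + 1;  // saved state for the next call
            if (end > start) return input_.substr(start, end - start);
        }
        return std::nullopt;
    }

private:
    std::string input_;
    std::size_t pos_ = 0;
};
```

In this toy the saved state is a single index. A real XML pull parser would also have to carry the open-element stack, namespace scopes and buffer position across calls, which is the cost being argued about here.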
...
...
...
As it happens, the implementation I have in mind uses libxml2, a C
library. As such between the application calling 'parse()' and the
callbacks are two language boundaries (C++ -> C and C -> C++), so
you couldn't even throw exceptions from inside the callbacks and
catch them in the main application.
...
...
That's one of my main criticisms of your suggested API --- it's
too tightly bound to libxml, and doesn't really allow for
substitution of another parser.
...
Could you substantiate your claim ?
Sorting out exception handling through an event framework like
    a push parser is no small challenge.
    
    I've always been critical of the Java SAXException: it is
    checked, and it cannot wrap a runtime exception, two choices
    that maximize the challenges of tunneling exceptions.
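One way to tunnel an exception across the C++ -> C -> C++ boundary described above is to catch it in the callback, park it, and rethrow once the C call has returned. The sketch below assumes a hypothetical C entry point c_parse that stops when the callback returns nonzero; it is a stand-in and not libxml2's actual API. It also relies on std::exception_ptr, which arrived with C++11 and so postdates this thread.

```cpp
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>

extern "C" {
// Hypothetical C parser interface (an assumption for illustration).
typedef int (*c_callback)(const char* token, void* user_data);

// Calls cb for each token; stops early if cb returns nonzero.
int c_parse(const char* const* tokens, int count,
            c_callback cb, void* user_data)
{
    for (int i = 0; i < count; ++i)
        if (cb(tokens[i], user_data) != 0) return -1;
    return 0;
}
}

// State shared between the C++ wrapper and the C-callable trampoline.
struct bridge {
    std::function<void(const std::string&)> handler;
    std::exception_ptr error;
};

// Nothing may propagate out of this function into C code, so every
// exception is captured and reported as a nonzero return value.
extern "C" int trampoline(const char* token, void* user_data)
{
    bridge* b = static_cast<bridge*>(user_data);
    try {
        b->handler(token);
        return 0;
    } catch (...) {
        b->error = std::current_exception();
        return 1;
    }
}

// C++ facade: runs the C parser, then rethrows whatever the handler threw.
void parse(const char* const* tokens, int count,
           std::function<void(const std::string&)> handler)
{
    bridge b{std::move(handler), nullptr};
    c_parse(tokens, count, &trampoline, &b);
    if (b.error) std::rethrow_exception(b.error);
}
```

The caller of parse() sees the original exception with its type intact, but only if the C library offers some way to abort from inside a callback. That requirement is an assumption of this sketch.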
...
...
My other criticism so far is the node::type() function. I really
don't believe in such type tags; we should be using virtual
function dispatch instead, using the Visitor pattern. Your
traversal example could then ditch the traverse(node_ptr)
overload, and instead be called with
document->root.visit(traversal)
Node types aren't (runtime-) polymorphic right now, but is that
really a big deal ?
...
Polymorphism is important for extensibility. However here the set
of node types is well known (and rather limited).
What about a Post-Schema-Validation Infoset (PSVI)? With XML Schema
    the types of nodes are unlimited.
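For reference, the virtual-dispatch alternative to node::type() discussed above might look roughly like this. It is a sketch with only two node kinds; the names visitor, element, text and traversal are hypothetical and do not come from the proposed API, apart from the visit() call that mirrors document->root.visit(traversal).

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct element;
struct text;

// One overload per node kind; this list is the "well known" closed set.
struct visitor {
    virtual ~visitor() = default;
    virtual void visit(element& e) = 0;
    virtual void visit(text& t) = 0;
};

struct node {
    virtual ~node() = default;
    // Double dispatch: each node hands itself to the matching overload.
    virtual void visit(visitor& v) = 0;
};

struct text : node {
    std::string value;
    explicit text(std::string v) : value(std::move(v)) {}
    void visit(visitor& v) override { v.visit(*this); }
};

struct element : node {
    std::string name;
    std::vector<std::unique_ptr<node>> children;
    explicit element(std::string n) : name(std::move(n)) {}
    void visit(visitor& v) override { v.visit(*this); }
};

// A traversal that serializes the tree without consulting any type tag.
struct traversal : visitor {
    std::string out;
    void visit(element& e) override
    {
        out += "<" + e.name + ">";
        for (auto& child : e.children) child->visit(*this);
        out += "</" + e.name + ">";
    }
    void visit(text& t) override { out += t.value; }
};
```

The visitor interface has to name every node kind up front, so an open-ended set of schema-derived types would mean either changing that interface or falling back to a generic overload.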

--
Alan Gutierrez - alan@engrm.com - http://engrm.com/blogometer/