On Wed, May 21, 2014 at 7:57 PM, Boris Kolpackov wrote:
In fact, I believe such an API should be robust enough to be able to wrap different backends, rather than depending on a particular implementation choice.
I don't think it will be robust. I think it will be awful and inconvenient. Try to adapt straight SAX API to anything other than callback-based with inversion of control (i.e., SAX again).
SAX is not that bad once you have a layer on top to push/pop handlers for the various XML elements. That's the technique the Java Ant build tool used (on top of the Java SAX APIs), and I adapted the same technique on top of Qt's SAX API before Qt's pull-parser came along. Basically, each function of a recursive descent parser is replaced by a handler instance, and the C/C++ function stack is replaced by an explicit stack.

But that's beside the point. I also agree with you that a pull-parser is much nicer to program against, and that DOM-like APIs can easily be layered on top of one. But it's actually harder than it looks to properly implement a standards-compliant XML parser that correctly deals with DTDs, character and system entities, encodings, namespaces, space normalization, default attributes from inline or out-of-line DTDs, etc.

That you base your library on the long-established Expat parser from James Clark, one of the world's XML experts, is probably a good thing, although the fact that it hasn't seen any release since 2007 is a bit worrying (and the license might indeed be an issue).

Many people don't care about these XML "details", but any library worthy of Boost that wants to be a foundational building block (in Niall's terms) at the bottom of a Boost/C++ XML ecosystem should strive for full conformance IMHO, or at least provide all the low-level tools that allow another library on top to be conformant.

Some apps want very low-level knowledge of the structure of an XML document, including all the low-level, otherwise irrelevant whitespace, processing instructions, character entities, etc. (something even the XML standards don't necessarily allow), while others don't care and want the XML *InfoSet* as specified in the XPath/XSL standards. Then there's also schema-aware processing, which associates XSD types with elements, validating parsers (DTDs or XSDs), etc.
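To make the push/pop handler technique mentioned above a bit more concrete, here is a minimal sketch. It is not tied to Expat, Qt, or the Java SAX API: `Handler`, `Dispatcher`, the concrete handlers, and `collect_targets` are all hypothetical names made up for illustration, and the SAX events are fed by hand rather than by a real parser. Each handler plays the role of one function of a recursive descent parser, and the explicit stack replaces the call stack.

```cpp
#include <cassert>
#include <memory>
#include <stack>
#include <string>
#include <utility>
#include <vector>

// Hypothetical per-element handler; one instance is live per open element.
struct Handler {
    virtual ~Handler() = default;
    // Called on a nested start tag; returns the handler for that child.
    // The default returns a handler that ignores the whole subtree.
    virtual std::unique_ptr<Handler> child(const std::string& /*name*/) {
        return std::make_unique<Handler>();
    }
    virtual void characters(const std::string& /*text*/) {}
    virtual void end() {}  // called on the matching end tag
};

// Sits between the (assumed) SAX callbacks and the handlers.
class Dispatcher {
public:
    explicit Dispatcher(std::unique_ptr<Handler> root) {
        stack_.push(std::move(root));
    }
    void start_element(const std::string& name) {
        assert(!stack_.empty());
        stack_.push(stack_.top()->child(name));  // "call" the child rule
    }
    void characters(const std::string& text) {
        stack_.top()->characters(text);
    }
    void end_element() {
        assert(stack_.size() > 1);  // never pop the root handler
        stack_.top()->end();
        stack_.pop();               // "return" to the parent rule
    }
private:
    std::stack<std::unique_ptr<Handler>> stack_;
};

// Collects the text content of one <target> element.
struct TargetHandler : Handler {
    explicit TargetHandler(std::vector<std::string>& out) : out_(out) {}
    void characters(const std::string& text) override { text_ += text; }
    void end() override { out_.push_back(text_); }
    std::vector<std::string>& out_;
    std::string text_;
};

// Handles <project>; only <target> children are of interest.
struct ProjectHandler : Handler {
    explicit ProjectHandler(std::vector<std::string>& out) : out_(out) {}
    std::unique_ptr<Handler> child(const std::string& name) override {
        if (name == "target") return std::make_unique<TargetHandler>(out_);
        return Handler::child(name);
    }
    std::vector<std::string>& out_;
};

// Root handler; expects <project> as the document element.
struct DocumentHandler : Handler {
    explicit DocumentHandler(std::vector<std::string>& out) : out_(out) {}
    std::unique_ptr<Handler> child(const std::string& name) override {
        if (name == "project") return std::make_unique<ProjectHandler>(out_);
        return Handler::child(name);
    }
    std::vector<std::string>& out_;
};

// Feeds hand-written events standing in for a real SAX parser.
std::vector<std::string> collect_targets() {
    std::vector<std::string> targets;
    Dispatcher d(std::make_unique<DocumentHandler>(targets));
    d.start_element("project");
    d.start_element("target");      d.characters("compile"); d.end_element();
    d.start_element("description"); d.characters("ignored"); d.end_element();
    d.start_element("target");
    d.characters("te"); d.characters("st");  // text may arrive in chunks
    d.end_element();
    d.end_element();
    return targets;
}
```

In a real setup the three `Dispatcher` methods would be called from the parser's start-element, character-data, and end-element callbacks. Attributes, namespaces, and error reporting are left out here to keep the sketch short.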
It sounds like your library targets the lower-level parsing part, but even that is non-trivial, and few of the many XML libraries out there are truly conformant, so hopefully you're aware of all this and will explicitly document your conformance level, or lack thereof.

My $0.02. --DD