Hi Stefan,

In gmane.comp.lib.boost.devel you write:
Does it support a DOM-like API, i.e. an in-memory representation of the document?
No, it does not. I spent quite a bit of time on the in-memory vs streaming debate in my talk. How I wish the video were already available... Until then, to summarize the key points:

* Most people think they need DOM. I believe this is not because in-memory is conceptually better but because of the really awful and inconvenient streaming APIs (like SAX). So I tried to convince the audience that a well-designed streaming pull API is actually sufficient for the majority of cases. I didn't hear many objections. Take a look at the API Introduction[1]; it shows how to handle everything from converters/filters that don't care about the data, to applications that process the data without creating any kind of in-memory object model, to C++ classes that know how to persist themselves in XML.

* On that last point (C++ class persistence): a lot of applications extract XML data into some kind of object model (C++ classes that correspond to the XML vocabulary). Creating an intermediate representation of XML (DOM) just to throw it away moments later seems kind of pointless.

* Of course, there will always be applications that need to revisit the bulk of raw XML data, and for them in-memory would probably always be a better choice.

* Which brings us to this point: it is easy to go from streaming to in-memory but not the other way around.

* In fact, an even better approach would be to support hybrid, partially streaming/partially in-memory parsing and serialization (also discussed in the talk). Then fully in-memory would simply be a special case.

* libstudxml has the ‘hybrid’ example, which shows how to implement this hybrid approach. You would be shocked how short and simple the code is (I know I was once I wrote it ;-)).

[1] http://www.codesynthesis.com/projects/libstudxml/doc/intro.xhtml#2
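To make the streaming/in-memory/hybrid points a bit more concrete, here is a minimal sketch. Everything in it is an assumption made for illustration: pull_source, event, person, node, build_tree, and read_people are hypothetical names, the events are pre-tokenized instead of parsed from real XML, and none of this is the libstudxml API (see [1] for that). It only shows the general shape: a class pulling its own data straight from the event stream, a small tree built from the same stream, and a single pass that picks one or the other per element.

```cpp
// Toy pull API for illustration only; this is NOT the libstudxml interface.
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

enum class event_type { start_element, end_element, characters, eof };

struct event { event_type type; std::string value; }; // value: name or text

// Hypothetical pull source: the caller asks for events one at a time.
// Events are pre-tokenized so the sketch needs no real XML parser.
class pull_source {
public:
  explicit pull_source(std::vector<event> events)
      : events_(std::move(events)) {}

  // Look at the next event without consuming it (eof when exhausted).
  event peek() const {
    assert(pos_ <= events_.size());
    return pos_ < events_.size() ? events_[pos_] : event{event_type::eof, ""};
  }

  // Consume the next event, checking its type (and name, if given).
  event expect(event_type t, const std::string& name = "") {
    event e = peek();
    if (e.type != t || (!name.empty() && e.value != name))
      throw std::runtime_error("unexpected XML event");
    if (pos_ < events_.size()) ++pos_;
    return e;
  }

private:
  std::vector<event> events_;
  std::size_t pos_ = 0;
};

// Read <tag>text</tag> and return the text.
std::string read_text(pull_source& p, const std::string& tag) {
  p.expect(event_type::start_element, tag);
  std::string text = p.expect(event_type::characters).value;
  p.expect(event_type::end_element, tag);
  return text;
}

// Streaming part: a class that extracts itself directly from the events,
// with no intermediate tree.
struct person {
  std::string name;
  int age = 0;

  explicit person(pull_source& p) {
    p.expect(event_type::start_element, "person");
    name = read_text(p, "name");
    age = std::stoi(read_text(p, "age"));
    p.expect(event_type::end_element, "person");
  }
};

// In-memory part: a generic tree for fragments we want to revisit.
struct node {
  std::string name, text;
  std::vector<std::unique_ptr<node>> children;
};

// Build a tree for the element that starts at the current position.
node build_tree(pull_source& p) {
  node n;
  n.name = p.expect(event_type::start_element).value;
  while (p.peek().type != event_type::end_element) {
    if (p.peek().type == event_type::start_element)
      n.children.push_back(std::make_unique<node>(build_tree(p)));
    else
      n.text += p.expect(event_type::characters).value; // throws on eof
  }
  p.expect(event_type::end_element, n.name);
  return n;
}

struct people_doc {
  std::vector<person> people; // streamed straight into objects
  std::vector<node> extras;   // everything else is kept as a tree
};

// Hybrid: one pass over <people>, choosing per child element.
people_doc read_people(pull_source& p) {
  people_doc d;
  p.expect(event_type::start_element, "people");
  while (p.peek().type == event_type::start_element) {
    if (p.peek().value == "person") d.people.emplace_back(p);
    else d.extras.push_back(build_tree(p));
  }
  p.expect(event_type::end_element, "people");
  return d;
}
```

Given the events for a <people> element that holds a few <person> children plus, say, a <notes> subtree, read_people() turns each <person> directly into an object and keeps only <notes> as a tree. Calling build_tree() on the root element instead gives the fully in-memory special case.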
I have always strongly argued against the idea that an "XML API" is only about parsing XML data, as there are many useful features that involve manipulation of XML data (including transformations between documents, XPath-based search, etc.).
You need to start somewhere. And support for (relatively) low-level XML parsing and serialization seems like a good place.
In fact, I believe such an API should be robust enough to be able to wrap different backends, rather than depending on a particular implementation choice.
I don't think it will be robust. I think it will be awful and inconvenient. Try to adapt a straight SAX API to anything other than a callback-based interface with inversion of control (i.e., SAX again).

Boris
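P.S. A rough sketch of why that adaptation is lopsided. The interfaces here are made up for illustration (push_parse, pull_parser, pull_to_push, and push_to_pull are hypothetical names; this is neither SAX nor libstudxml). Going from pull to push is a plain loop. Going from push to pull, the callbacks cannot be suspended mid-parse, so this single-threaded adapter has to run the whole push parse up front and queue every event, which gives up streaming. The usual alternatives are a second thread or coroutines.

```cpp
// Toy interfaces for illustration only; not SAX and not libstudxml.
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

struct event { std::string kind, value; };

// Push (SAX-style): the parser owns the loop and calls the handler.
using handler = std::function<void(const event&)>;

void push_parse(const std::vector<event>& doc, const handler& h) {
  for (const event& e : doc) h(e);
}

// Pull: the application owns the loop and asks for the next event.
class pull_parser {
public:
  explicit pull_parser(std::vector<event> doc) : doc_(std::move(doc)) {}

  // Store the next event in 'out'; return false when the input is exhausted.
  bool next(event& out) {
    if (pos_ == doc_.size()) return false;
    out = doc_[pos_++];
    return true;
  }

private:
  std::vector<event> doc_;
  std::size_t pos_ = 0;
};

// Pull -> push: a trivial loop with no extra state.
void pull_to_push(pull_parser& p, const handler& h) {
  event e;
  while (p.next(e)) h(e);
}

// Push -> pull: without threads or coroutines, the only option is to let
// the push parser run to completion and buffer everything it reports.
class push_to_pull {
public:
  explicit push_to_pull(const std::vector<event>& doc) {
    push_parse(doc, [this](const event& e) { queue_.push_back(e); });
  }

  // Hand out buffered events one at a time.
  bool next(event& out) {
    if (queue_.empty()) return false;
    out = queue_.front();
    queue_.pop_front();
    return true;
  }

  // Number of events still held in memory.
  std::size_t buffered() const { return queue_.size(); }

private:
  std::deque<event> queue_;
};
```

Note that push_to_pull already holds the whole document's events before the first call to next(), while pull_to_push never holds more than one.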