
It seems to me that Boost lacks a "typical XML parser" (by that I mean something offering DOM and SAX parsing as well as validation, and maybe some other features). I have not looked at the serialization library much, but I suspect it does not offer these features. Why is that? Are there technical reasons? Or maybe "political" reasons? Or perhaps simply no one has done it? Adam Badura

Adam Badura wrote:
There is extensive discussion of this issue in the archives. Try searching for "Boost.Xml"
-- Jonathan Turkanis CodeRage http://www.coderage.com

Jonathan Turkanis wrote:
...and then check out the boost.xml sandbox project: http://svn.boost.org/trac/boost/browser/sandbox/xml I'd be glad to get some motivation to work on it some more. :-) Regards, Stefan -- ...ich hab' noch einen Koffer in Berlin...

Adam Badura wrote:
You need a subversion client, and then check out the code using this URL: http://svn.boost.org/svn/boost/sandbox/xml
for appropriate link (this seems most logical) however did not find any. How many people work on this project?
I have written it alone, but I'm happy to collaborate. It's a relatively thin layer on top of libxml2 (http://xmlsoft.org/) that offers a DOM-like and an XMLReader-like interface. Look at the examples (http://svn.boost.org/trac/boost/browser/sandbox/xml/libs/xml/example) to see the functionality that is already implemented. Thanks, Stefan -- ...ich hab' noch einen Koffer in Berlin...

There is extensive discussion of this issue in the archives. Try searching for "Boost.Xml"
I did some searching in the archive but did not find much. Yes, sure, there were a few discussions, but they were mainly wish lists and arguments about which technology and methodology would be best. I did not find any actual reason why the library is still not in Boost. It seems, however (after Stefan's post), that some work is being done on this subject right now. Good to hear (read) that. Adam Badura

Barco You wrote:
There are xerces and miniXML ... I think the real reason is to avoid so much redundancy. :)
There's also RapidXML by Marcin Kalicinski (Boost license), which I wasn't aware of when Stefan presented his libxml2-based library: http://rapidxml.sourceforge.net/ http://rapidxml.sourceforge.net/manual.html Quote: "RapidXml is an attempt to create the fastest XML DOM parser possible, while retaining useability, portability and reasonable W3C compatibility. It is an in-situ parser written in C++, with parsing speed approaching that of strlen() function executed on the same data." It achieves its high performance, IIUC, by not copying the XML as it parses; instead it records pointers into the source text. This is an approach that I have used with other data formats - I recently mentioned a const_string_facade class that I have written for this - and it works well for me. It would be great to see some real-life feature-set, performance and usability comparisons of this approach and a more traditional parser. (Actually there are some numbers in the rapidxml manual linked above, but they don't include libxml2). Regards, Phil.
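To illustrate the "record pointers into the source text instead of copying" idea described above, here is a minimal sketch. It is not RapidXML's implementation and not the const_string_facade class mentioned in the message; the function name start_tag_names is hypothetical, and std::string_view (C++17) is used as a stand-in for a non-owning string reference. It only collects start-tag names and ignores most of XML.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical helper: collects the names of start tags in `src`.
// Each returned std::string_view is a pointer plus a length into `src`
// itself, so no characters are copied. The views are valid only while
// the source buffer stays alive and unmodified.
// This is NOT a conforming XML parser: comments, CDATA sections,
// processing instructions and entities are not handled properly.
std::vector<std::string_view> start_tag_names(std::string_view src) {
    std::vector<std::string_view> names;
    std::size_t pos = 0;
    while ((pos = src.find('<', pos)) != std::string_view::npos) {
        const std::size_t begin = pos + 1;
        // Skip end tags, declarations/comments and processing instructions.
        if (begin >= src.size() || src[begin] == '/' ||
            src[begin] == '!' || src[begin] == '?') {
            pos = begin;
            continue;
        }
        const std::size_t end = src.find_first_of(" \t\r\n/>", begin);
        if (end == std::string_view::npos) break;  // malformed tail: stop
        if (end > begin) {
            names.push_back(src.substr(begin, end - begin));  // a view, not a copy
        }
        pos = end;
    }
    return names;
}
```

The trade-off of this approach is lifetime management: the caller has to keep the source buffer around for as long as any of the views are in use.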

Phil Endecott <spam_from_boost_dev <at> chezphil.org> writes:
I've been doing a bit of testing recently with Arabica (http://www.jezuk.co.uk/cgi-bin/view/arabica). It's a bit more 'heavy duty' than some of the previously mentioned libs, but it can be configured to use a number of different XML parsers including libxml2, xerces and MSXML. It also uses Boost internally and uses a BSD type license.

Hi, I have used both rapidxml and pugixml (which inspired rapidxml) http://code.google.com/p/pugixml/ and they are the best C++ XML libs I have found (if you don't need a validating XML parser). RapidXML will be supported, or already is supported, by the Boost property-tree library, but that does not mean it can be included without a review. It would be great if the authors could put forward one of the libraries (or a combined one) for a Boost review. It would be a great addition! regards jose On Jan 9, 2008 2:28 PM, Stefano Delli Ponti <stefano.delliponti@gmail.com> wrote:

Phil Endecott wrote:
Yes, being able to compare side-by-side would certainly help. Please note that my goal in writing the boost.xml API was not to endorse one particular backend API or another, but rather to use an existing library (since, as we discussed numerous times, reinventing the wheel would be rather naive) and hook it up to a *backend-independent* API. The API itself must not rely on any backend-specific details ! Thanks, Stefan -- ...ich hab' noch einen Koffer in Berlin...
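A rough sketch of what "backend-independent" can mean in code follows. This is not the boost.xml sandbox API; every name here (backend, toy_backend, document, root_name) is hypothetical, and the toy backend merely stands in for a wrapper around a real library such as libxml2.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical interface that each backend wrapper would implement.
struct backend {
    virtual ~backend() = default;
    virtual std::string root_element_name(const std::string& xml) const = 0;
};

// Toy backend used only for illustration; a real one would delegate to
// an existing XML library. It does not handle comments containing '<'.
struct toy_backend : backend {
    std::string root_element_name(const std::string& xml) const override {
        std::size_t open = xml.find('<');
        // Skip the XML declaration, processing instructions and DOCTYPE.
        while (open != std::string::npos && open + 1 < xml.size() &&
               (xml[open + 1] == '?' || xml[open + 1] == '!')) {
            open = xml.find('<', open + 1);
        }
        if (open == std::string::npos) return {};
        const std::size_t end = xml.find_first_of(" \t\r\n/>", open + 1);
        if (end == std::string::npos) return {};
        return xml.substr(open + 1, end - open - 1);
    }
};

// User-facing class: its public interface mentions no backend types
// other than the abstract interface passed in at construction.
class document {
public:
    document(std::string xml, std::unique_ptr<backend> impl)
        : xml_(std::move(xml)), impl_(std::move(impl)) {}
    std::string root_name() const { return impl_->root_element_name(xml_); }

private:
    std::string xml_;
    std::unique_ptr<backend> impl_;
};
```

Code written against document would not need to change if toy_backend were swapped for a different implementation of the same interface.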

On 09/01/2008, Phil Endecott <spam_from_boost_dev@chezphil.org> wrote:
Phil - I did a quick perf test of libxml2 vs rapidxml 1.1 today. I used a 12MB XML file, which I pre-loaded before doing an in-memory parse with both libraries. rapidxml was repeatably 20x faster than libxml2. Scarily quick, in fact - it parsed my 12MB file in about 100ms... I do need to verify that they both present the same set of nodes, attributes etc, but it's a promising showing by rapidxml... Stuart Dootson
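A measurement of the kind described above (pre-load the file, then time only the in-memory parse) could be structured roughly as follows. This is a sketch, not the harness that was actually used: time_parse_ms is a hypothetical helper, and the parser is passed in as an arbitrary callable so no particular library API is assumed. Each run gets a fresh copy of the buffer outside the timed region, on the assumption that an in-situ parser may modify its input.

```cpp
#include <chrono>
#include <string>

// Hypothetical helper: runs `parse` on a fresh copy of `buffer` `runs`
// times and returns the fastest wall-clock time in milliseconds.
// `parse` is any callable taking std::string& (the parser under test).
template <typename Parse>
double time_parse_ms(const std::string& buffer, Parse parse, int runs = 5) {
    using clock = std::chrono::steady_clock;
    double best = -1.0;
    for (int i = 0; i < runs; ++i) {
        std::string copy = buffer;  // copied outside the timed region
        const auto start = clock::now();
        parse(copy);
        const std::chrono::duration<double, std::milli> elapsed =
            clock::now() - start;
        if (best < 0.0 || elapsed.count() < best) best = elapsed.count();
    }
    return best;
}
```

Taking the best of several runs reduces noise from caching and scheduling; comparing the node and attribute sets produced by each parser, as noted above, would still be a separate check.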
participants (9)
- Adam Badura
- Barco You
- Jonathan Turkanis
- Jose
- Phil Endecott
- Richard Webb
- Stefan Seefeld
- Stefano Delli Ponti
- Stuart Dootson