Re: [boost] [xml] Brainstorming / Request for Comments, Suggestions, Opinions ...

7 Sep 2006


Sebastian Redl wrote:
...
1) API TYPE
Pull-API (StAX), Push-API (SAX), Object-Model-API (DOM)?
- All of them, of course! The main question is, which one is the base API?
- DOM is out of the question (performance/memory overhead).
- Implementing a push parser on top of a pull parser is trivial:
  while(fetchEvent()) pushEvent()
- Implementing a pull parser on top of a push parser requires at least
  generator-style coroutines. This incurs a performance overhead at best,
  unusability at worst (in limited environments).
- It is therefore best to use a pull model at the lowest level, although this
  makes the parser implementation more complex.
It might also be possible to make the push and pull parsers more or less
independent, so that each one can be as efficient as possible.
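
To make the push-on-top-of-pull point concrete, here is a rough C++ sketch
of the quoted loop. Only fetchEvent comes from the post; Event, PullParser
and pushAll are names I made up for illustration, and the "parser" just
replays pre-recorded events instead of reading real XML.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical event type; a real XML event would carry far more detail.
struct Event {
    enum Kind { StartElement, EndElement, Text } kind;
    std::string value;
};

// Toy pull source: hands out pre-recorded events one at a time.
class PullParser {
public:
    explicit PullParser(std::vector<Event> events) : events_(std::move(events)) {}

    // Fills `out` and returns true, or returns false once input is exhausted.
    bool fetchEvent(Event& out) {
        if (pos_ == events_.size()) return false;
        out = events_[pos_++];
        return true;
    }

private:
    std::vector<Event> events_;
    std::size_t pos_ = 0;
};

// Push adapter: drains the pull parser and forwards each event to a handler.
void pushAll(PullParser& parser, const std::function<void(const Event&)>& handler) {
    assert(static_cast<bool>(handler) && "handler must be callable");
    Event e{};
    while (parser.fetchEvent(e)) handler(e);
}
```

The adapter is a single loop and needs no extra state, which is why the
pull model looks like the cheaper base. Going the other way would mean
suspending the push parser in the middle of a callback.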
...
3) Input/Output System
How does the library access underlying storage?
...
- Since it needs to access resources from various sources, typically specified
  as URLs, it needs a flexible and runtime-switchable input system.
- In particular, it should be possible to plug scheme resolvers in at runtime,
  so that program extensions can provide support for, say, the ftp: scheme.
That would be the work of another library, one that provides a way to
read any kind of resource from a URL, a bit like what PHP has.
That kind of library would also be very useful outside the XML library.
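
As a rough idea of what runtime-pluggable resolvers could look like, here
is a small C++ sketch. Everything in it (Opener, ResolverRegistry,
registerScheme, openMemory, the "mem:" prefix) is hypothetical and not
taken from any existing library; it only dispatches on the text before
the first ':' in the URL.

```cpp
#include <cassert>
#include <functional>
#include <istream>
#include <map>
#include <memory>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical: an opener turns a full URL into a readable stream.
using Opener = std::function<std::unique_ptr<std::istream>(const std::string&)>;

class ResolverRegistry {
public:
    // Register (or replace) the opener for a prefix such as "ftp".
    void registerScheme(const std::string& scheme, Opener opener) {
        assert(static_cast<bool>(opener) && "opener must be callable");
        openers_[scheme] = std::move(opener);
    }

    // Look up the opener by the text before the first ':' and call it.
    std::unique_ptr<std::istream> open(const std::string& url) const {
        const std::string::size_type colon = url.find(':');
        if (colon == std::string::npos)
            throw std::invalid_argument("URL has no prefix: " + url);
        const auto it = openers_.find(url.substr(0, colon));
        if (it == openers_.end())
            throw std::runtime_error("no resolver for URL: " + url);
        return it->second(url);
    }

private:
    std::map<std::string, Opener> openers_;
};

// Stand-in opener for testing: everything after "mem:" is the content.
std::unique_ptr<std::istream> openMemory(const std::string& url) {
    return std::unique_ptr<std::istream>(new std::istringstream(url.substr(4)));
}
```

An extension would call registerScheme("ftp", ...) when it is loaded, and
the XML library would only ever see a std::istream. A real design would
also have to deal with relative URLs, caching and errors.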
...
- Two basic options:
  - Iterator-based approach.
  - Stream-based approach.
  - Other?
Maybe a lower-level approach like the one Boost.Asio provides could be
interesting, especially since that model also provides asynchronous I/O.
...
4) Integration With Other Boost Libraries
What other Boost libraries should Xml work/integrate with?
Since XML needs good Unicode support and the like, maybe there is work
to be done in that area in Boost first.
...
- For example, does it make sense to provide an interface to the parser that can
  be used for parsing streaming content? Either non-blocking, with the option
  to parse partial data and hop back on missing content, or a completely
  asynchronous implementation that dispatches SAX events through e.g. ASIO?
The ability to parse partial content would be a great plus.
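
Here is a toy C++ sketch of what I mean by parsing partial content: the
caller feeds whatever bytes have arrived, gets back the tokens that are
complete so far, and the rest stays buffered until more data comes in.
ChunkScanner, feed and pending are invented names, and the scanner only
splits on '<' and '>' (no attributes containing '>', comments or CDATA),
so it is nowhere near a real XML tokenizer.

```cpp
#include <cassert>
#include <string>
#include <vector>

class ChunkScanner {
public:
    // Append a chunk and return every token that is now complete.
    std::vector<std::string> feed(const std::string& chunk) {
        buffer_ += chunk;
        std::vector<std::string> tokens;
        std::string::size_type pos = 0;
        while (pos < buffer_.size()) {
            if (buffer_[pos] == '<') {
                const std::string::size_type close = buffer_.find('>', pos);
                if (close == std::string::npos) break;  // tag not finished yet
                tokens.push_back(buffer_.substr(pos, close - pos + 1));
                pos = close + 1;
            } else {
                const std::string::size_type open = buffer_.find('<', pos);
                if (open == std::string::npos) break;   // text may continue
                tokens.push_back(buffer_.substr(pos, open - pos));
                pos = open;
            }
        }
        buffer_.erase(0, pos);  // keep only the unfinished tail
        return tokens;
    }

    // True while some input is buffered but not yet emitted.
    bool pending() const { return !buffer_.empty(); }

private:
    std::string buffer_;
};
```

Feeding "<a>he", then "llo</", then "a>" yields "<a>", then "hello", then
"</a>". A non-blocking caller can simply return to its event loop whenever
feed gives back nothing.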
...
5) Parser Back-End / Library Organization
- Should Boost.Xml be a complete XML solution, with a parser, DOM implementation
  and everything?
Writing a complete XML solution is a lot of work, especially if you want
to support all XML technologies (XML Schema, RELAX NG, XPath, XLink,
XInclude, XPointer...).
Maybe it could be interesting to reuse libxml2, which is under the MIT
license, and build something on top of it. Of course, we first need to
weigh the gains of a new C++ implementation.
...
- Or should it be split into two parts, one being a parser, the other a DOM
  implementation with various construction modes?
- Or should even the core parser be split into the actual text parser and the
  event/pull/whatever interface, so that an HTML or YAML or PYX parser or even
  an algorithmic content generator can be placed behind?
- What, then, is the interface between that parser and the user interface?
6) Other Issues ???

loufoque