
Sebastian Redl wrote:
1) API Type
Pull-API (StAX), Push-API (SAX), Object-Model-API (DOM)?
- All of them, of course! The main question is, which one is the base API?
- DOM is out of the question (performance/memory overhead).
- Implementing a push parser on top of a pull parser is trivial: while(fetchEvent()) pushEvent()
- Implementing a pull parser on top of a push parser requires at least generator-style coroutines. This incurs a performance overhead at best, unusability at worst (in limited environments).
- It is therefore best to use a pull model at the lowest level, although this makes the parser implementation more complex.
It could also be possible to make the push and pull parsers more or less independent, so that each one can be as efficient as possible.
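To make the "push on top of pull" point concrete, here is a minimal sketch of that loop. All the names (Event, PullParser, fetchEvent, pushAll) are made up for illustration and are not a proposed interface; the pull parser is faked with a pre-built list of events, where a real one would tokenize its input lazily.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical event type; a real parser would also carry attributes,
// namespaces, source positions and so on.
struct Event {
    enum class Kind { StartElement, EndElement, Text };
    Kind kind;
    std::string data;
};

// Toy pull source (assumption): hands out pre-built events one at a time.
class PullParser {
public:
    explicit PullParser(std::vector<Event> events) : events_(std::move(events)) {}

    // Copies the next event into `out`; returns false once input is exhausted.
    bool fetchEvent(Event& out) {
        if (pos_ == events_.size()) return false;
        out = events_[pos_++];
        return true;
    }

private:
    std::vector<Event> events_;
    std::size_t pos_ = 0;
};

// Push layer: drains the pull parser and forwards each event to a handler.
// Returns the number of events delivered.
std::size_t pushAll(PullParser& parser,
                    const std::function<void(const Event&)>& handler) {
    std::size_t count = 0;
    Event ev{};
    while (parser.fetchEvent(ev)) {
        handler(ev);
        ++count;
    }
    return count;
}
```

The push side is only the few lines of pushAll, whereas the opposite direction would need the parser to suspend in the middle of a callback, which is where coroutines come in.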
3) Input/Output System
How does the library access underlying storage?
- Since it needs to access resources from various sources, typically specified as URLs, it needs a flexible and runtime-switchable input system.
- In particular, it should be possible to plug scheme resolvers in at runtime, so that program extensions can provide support for, say, the ftp: scheme.
That would be the job of another library, one that provides a way to read any kind of resource from a URL, a bit like what PHP has. Such a library would also be very useful outside of the XML library.
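As a rough idea of what runtime-pluggable resolvers could look like in such a library, here is a small sketch. ResolverRegistry, registerScheme, open and the "mem:" scheme are all invented for this example; it only assumes that a resolver can be modelled as a function from a URL to a std::istream.

```cpp
#include <functional>
#include <istream>
#include <map>
#include <memory>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical registry mapping URL schemes ("file", "ftp", ...) to openers.
class ResolverRegistry {
public:
    using Opener =
        std::function<std::unique_ptr<std::istream>(const std::string& url)>;

    // Installs (or replaces) the opener for a scheme at runtime.
    void registerScheme(const std::string& scheme, Opener opener) {
        openers_[scheme] = std::move(opener);
    }

    // Dispatches on the text before the first ':' in the URL.
    std::unique_ptr<std::istream> open(const std::string& url) const {
        const std::string::size_type colon = url.find(':');
        if (colon == std::string::npos)
            throw std::invalid_argument("URL has no scheme: " + url);
        const auto it = openers_.find(url.substr(0, colon));
        if (it == openers_.end())
            throw std::runtime_error("no resolver for URL: " + url);
        return it->second(url);
    }

private:
    std::map<std::string, Opener> openers_;
};

// Example opener for a made-up "mem:" scheme: everything after the colon
// is treated as the resource content.
std::unique_ptr<std::istream> openMem(const std::string& url) {
    return std::unique_ptr<std::istream>(
        new std::istringstream(url.substr(url.find(':') + 1)));
}
```

A program extension would call registerScheme("ftp", ...) with its own opener, and the XML library would only ever see the resulting stream.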
- Two basic options:
  - Iterator-based approach.
  - Stream-based approach.
  - Other?
Maybe a lower-level approach like the one Boost.Asio provides could be interesting, especially since that model also provides asynchronous I/O.
4) Integration With Other Boost Libraries
What other Boost libraries should Xml work/integrate with?
Since XML needs good Unicode support and the like, maybe there is work to be done in that area in Boost first.
- For example, does it make sense to provide an interface to the parser that can be used for parsing streaming content? Either non-blocking, with the option to parse partial data and hop back on missing content, or a completely asynchronous implementation that dispatches SAX events through e.g. ASIO?
The ability to parse partial content would be a great plus.
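To illustrate what I mean by parsing partial content, here is a toy sketch of a tokenizer that is fed data chunk by chunk and simply waits when a token is incomplete. ChunkTokenizer, feed and pending are invented names, and this is in no way a real XML parser (no comments, CDATA, entities or quoted '>' characters); it only shows the buffering idea.

```cpp
#include <string>
#include <vector>

// Toy incremental tokenizer (assumption: tags are "<...>" and everything
// else is text). Incomplete input is kept until the next chunk arrives.
class ChunkTokenizer {
public:
    // Appends a chunk and returns every token that is now complete.
    std::vector<std::string> feed(const std::string& chunk) {
        buffer_ += chunk;
        std::vector<std::string> tokens;
        std::string::size_type pos = 0;
        while (pos < buffer_.size()) {
            if (buffer_[pos] == '<') {
                const std::string::size_type close = buffer_.find('>', pos);
                if (close == std::string::npos) break;  // tag not finished yet
                tokens.push_back(buffer_.substr(pos, close - pos + 1));
                pos = close + 1;
            } else {
                const std::string::size_type open = buffer_.find('<', pos);
                if (open == std::string::npos) break;   // text may continue
                tokens.push_back(buffer_.substr(pos, open - pos));
                pos = open;
            }
        }
        buffer_.erase(0, pos);  // keep only the unfinished tail
        return tokens;
    }

    // True while some input is buffered but not yet emitted as a token.
    bool pending() const { return !buffer_.empty(); }

private:
    std::string buffer_;
};
```

The caller never blocks: it feeds whatever bytes have arrived, handles the tokens that came out, and comes back later with more data.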
5) Parser Back-End / Library Organization
- Should Boost.Xml be a complete XML solution, with a parser, DOM implementation and everything?
Writing a complete XML solution is a lot of work, especially if you want to support all XML technologies (XMLSchema, RelaxNG, XPath, XLink, XInclude, XPointer...). Maybe it could be interesting to reuse libxml2, which is under the MIT license, and build something on top of it. Of course, first we need to weigh the gains of a new C++ implementation.
- Or should it be split into two parts, one being a parser, the other a DOM implementation with various construction modes?
- Or should even the core parser be split into the actual text parser and the event/pull/whatever interface, so that an HTML or YAML or PYX parser or even an algorithmic content generator can be placed behind?
- What, then, is the interface between that parser and the user interface?
6) Other Issues ???