
The approach I have taken is to provide a thin wrapper around an existing library (libxml2) that is well tested, well supported, and very fast, so I don't have to reinvent those wheels. Please note that none of this should leak through the API, i.e. the API can be re-implemented on a different back end without users noticing.
This is similar to the direction of the BigInt proposals, which also suffer from bikeshed discussions :) I think that this is the right direction. Abstracting the interface leaves room for lots of other (more user-specific) parsers or frameworks. If you would like to get involved, I would be happy. But I shall retract it from the list of GSoC ideas, for the aforementioned reasons.
I hope you'll reconsider. I think having a student build another back end or two and work on polishing the interface would make a pretty good summer project.

Andrew Sutton
andrew.n.sutton@gmail.com