
Anthony Williams wrote:
That's one of my main criticisms of your suggested API --- it's too tightly bound to libxml, and doesn't really allow for substitution of another parser.
Could you substantiate your claim ?
Really, it was just a feeling from looking at the API docs. However:
In order to use a particular external type, such as std::string, the user has to supply a specialization of converter<> for their type, which converts to and from the libxml xmlChar type.
Correct. That's the price to pay for not forcing any particular unicode library on users who want to use the XML API.
Also, there are lots of constructors around that take xmlNode pointers, or xmlDoc pointers, or similar. It may be that these are intended as private implementation details, but they show up in the documentation.
Ok, let's fix the documentation. :-) (You are correct, these are all private constructors. As I already mentioned, users never instantiate nodes explicitely, but use various factory methods for that purpose.)
My other criticism so far is the node::type() function. I really don't believe in such type tags; we should be using virtual function dispatch instead, using the Visitor pattern. Your traversal example could then ditch the traverse(node_ptr) overload, and instead be called with document->root.visit(traversal)
Node types aren't (runtime-) polymorphic right now, but is that really a big deal ? Polymorphism is important for extensibility. However here the set of node types is well known (and rather limited).
Polymorphism is not just important for extensibility of the polymorphic set, but also for type-safe, convenient handling of objects whose exact type is only known at runtime. Check-type-flag-and-cast is a nasty smell that I would rather avoid where possible.
I understand and agree with you...in principle. There are a number of situations where a method call will return a node_ptr, and the user typically wants to cast to the exact type. Examples include element child iteration, node_sets (resulting from xpath lookup, etc.). However, doing this with a visitor-like visit/accept pair of methods incures two virtual method calls, just to get hold of the real type. That's a lot ! As my node implementations already know their type (in terms of an enum tag), casting is a simple matter of rewrapping the implementation by a new proxy. Using RTTI to represent the node's type is definitely possible. I'm just not convinced of its advantages.
One additional comment on re-reading the samples --- having to instantiate every template for the external string type seems rather awkward.
One alternative is to accept and return an internal string type, and provide conversion functions to/from the user's external string type. This way, the library is not dependent on the string type, but it does add complexity to the interface.
Right, I considered that. One has to be careful with those string conversions, though, to avoid unnecessary copies.
Another alternative is to make the functions that accept or return the user's string type into templates, whilst leaving the enclosing class as a non-template, since there are no data members of the user's string type. Template parameter type deduction can be used to determine the type when it is given as a parameter, and explicit specification can be used when it is needed for a return type.
Actually the use I have in mind is really what I demonstrate in the examples: Developers who decide to use boost::xml::dom fix the unicode binding by specifying the string type, conversion, etc. in a single place (such as my 'string.hpp'), such then all the rest of the code doesn't need to be aware of this mechanism. In your case I couldn't encapsulate the binding in a single place, as you mention yourself. What would be possible, though, is to put all types into a single parametrized struct: template <typename S> struct types { typedef typename document<S> document_type; typedef typename node_ptr<element<S> > element_ptr; ... }; and then let the user simply write: typedef dom::types<my_unicode_string> types; ... std::auto_ptr<types::document_type> doc = dom::parse_file("input.xml"); types::element_ptr root = doc->root(); ... Regards, Stefan