
Stefan Seefeld wrote:
The reason to delegate to a factory is to let it do a lot of resource management that thus can be hidden from the user. There is a lot to be considered, as each node lives in a particular context given by the document as well as its position in it (think of namespaces, for example).
OK, makes sense.
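The factory argument above can be illustrated with a minimal sketch. Every name in it (`element`, `document`, `append_element`, `declare_namespace`, `lookup_namespace`, `factory_demo`) is hypothetical and not taken from the library under review. It only shows why creation through the parent helps: the node always knows its position, so context such as namespace bindings can be resolved without the user managing it.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch; none of these names come from the reviewed library.
class element {
public:
    // Factory: the only way to create a child, so the parent link and the
    // ownership of the new node are always set up consistently.
    element& append_element(std::string name) {
        children_.push_back(
            std::unique_ptr<element>(new element(std::move(name), this)));
        return *children_.back();
    }

    void declare_namespace(const std::string& prefix, const std::string& uri) {
        namespaces_[prefix] = uri;
    }

    // Resolve a prefix by walking up towards the root.
    // Returns an empty string if the prefix is not bound anywhere.
    std::string lookup_namespace(const std::string& prefix) const {
        for (const element* e = this; e != nullptr; e = e->parent_) {
            auto it = e->namespaces_.find(prefix);
            if (it != e->namespaces_.end()) return it->second;
        }
        return std::string();
    }

    const std::string& name() const { return name_; }

private:
    friend class document;

    // Private constructor: users cannot create free-floating elements.
    element(std::string name, const element* parent)
        : name_(std::move(name)), parent_(parent) {}
    element(const element&) = delete;
    element& operator=(const element&) = delete;

    std::string name_;
    const element* parent_;
    std::map<std::string, std::string> namespaces_;
    std::vector<std::unique_ptr<element>> children_;
};

class document {
public:
    explicit document(std::string root_name)
        : root_(new element(std::move(root_name), nullptr)) {}
    element& root() { return *root_; }

private:
    std::unique_ptr<element> root_;
};

// Usage: the child inherits the namespace binding declared on the root.
inline std::string factory_demo() {
    document doc("book");
    doc.root().declare_namespace("m", "urn:example:meta");
    element& title = doc.root().append_element("m:title");
    return title.lookup_namespace("m");
}
```

Because the constructor is private, an element can never exist outside a document, which is the kind of bookkeeping a factory can hide from the user.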
Sorry, that argument I don't accept. Yes, I deliberately chose not to use the API as obtained from using the CORBA C++ bindings of the OMG IDL DOM. The hope is to get something better, much more naturally tied to modern C++ idioms. Whether or not I achieve that is to be discussed, and can be criticized, but the lack of conformance to existing DOM APIs in itself is hardly an argument worth debating.
But I don't mean the argument to stand by itself. As I said, you diverged slightly from the DOM API. The problem that I see is that it neither matches existing knowledge, nor brings sufficient advantage to justify this loss. In other words, if you leave the beaten track, you shouldn't walk right alongside it, among stones but without any advantage - you should at least take a shortcut through the woods.
OK, the API represents the Infoset, and thus has no idea of what an entity is. I'm not sure whether that would be worth adding. And if so, it may be some hook into the XML writer (the XML parser already has it).
Possibly, yes. Such low-level functionality might indeed better be limited to the low-level stream representation. On the other hand, rare as their use is, there are also unparsed entities. They very much exist at the high-level representation. And not all parsed entities can be expanded, especially by non-validating parsers.
I don't understand what you are aiming at in your comment about the 'document schema'.
I mean that there is no way in the current API to obtain information about the document type, beyond its name and location.
That's an implementation detail (IMO).
No, it's not. It's a matter of public derivation of classes and thus very much an interface issue.
Semantically, a text node and a cdata node are distinct,
Also not true, at least as far as Infoset is concerned. See also Appendix D of the Infoset spec, item 19.
and so visitors shouldn't give users access to a cdata node as a text node. (And what else would the ISA relationship be good for?)
But that's exactly what they should do (if the user wants to ignore the difference). CDATA, as I said before, is a serialization issue, and completely irrelevant to a user who just wants to know what text the document contains.
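A minimal sketch of that position, assuming a hypothetical hierarchy in which `cdata` publicly derives from `text` (the names `node`, `visitor`, `text_collector` and `collect_text_demo` are illustrative only, not the reviewed API). The visitor's `cdata` overload forwards to the `text` overload by default, so a user who does not care about the difference overrides a single function, while a serializer could still override both.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node hierarchy; not the API of the library under review.
class text;
class cdata;

class visitor {
public:
    virtual ~visitor() = default;
    virtual void visit(const text& t) = 0;
    // Default: a CDATA section is handled exactly like ordinary text.
    virtual void visit(const cdata& c);
};

class node {
public:
    virtual ~node() = default;
    virtual void accept(visitor& v) const = 0;
};

class text : public node {
public:
    explicit text(std::string content) : content_(std::move(content)) {}
    const std::string& content() const { return content_; }
    void accept(visitor& v) const override { v.visit(*this); }

private:
    std::string content_;
};

// Public derivation: every cdata node IS-A text node.
class cdata : public text {
public:
    using text::text;
    void accept(visitor& v) const override { v.visit(*this); }
};

inline void visitor::visit(const cdata& c) {
    visit(static_cast<const text&>(c));
}

// A user who only wants the document's text overrides one overload.
class text_collector : public visitor {
public:
    using visitor::visit;  // keep the default cdata overload visible
    void visit(const text& t) override { result += t.content(); }
    std::string result;
};

inline std::string collect_text_demo() {
    std::vector<std::unique_ptr<node>> children;
    children.push_back(std::make_unique<text>("a < b, "));
    children.push_back(std::make_unique<cdata>("c < d"));

    text_collector collector;
    for (const auto& child : children) child->accept(collector);
    return collector.result;
}
```

In this sketch the collector never mentions `cdata`, yet the content of both nodes ends up in `result`.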
I'm sure this can be refined. (In fact, I don't think DTDs will play any significant role in the future, as other document type definitions become more popular, such as RELAX NG).
True. Perhaps an API for generalized schema access can be devised.

Sebastian Redl