Re: [boost] [Boost Review] Property Tree Library

- was the library's design flexible enough? If not, how would you suggest it should be redesigned to broaden its scope of use?
Very promising looking library. I think we've all had to roll some sort of config parser in the past, so there is a lot of potential for this to gain widespread adoption if its functionality exceeds home-grown versions. Two things that I had to roll into my config system:

a) the ability to include other files

b) inline variable expansion, e.g.

    site.root = c:\dev
    site.home = ${site.root}\home

The former is important for me as I'll often have a lot of common configuration files shared between products, but I don't want client code having to care about opening half a dozen conf files to find a particular setting. Instead, client code opens '<productname>.conf', and that file happens to 'include(...)' all the other config files necessary.

The latter is very Java-.properties-esque, but I don't see any reason why it couldn't be used here. It's extremely flexible, especially if you leave evaluation of inline properties to the very last minute, so that you can consult the entire property tree for the corresponding value (rather than expanding the value as soon as you come into contact with it, which restricts you to referencing values that have been defined before your reference occurs).

What are your thoughts on this being included in your library? Anyone else need to do this type of stuff, or would find this useful?

Trent.

--
http://www.onresolve.com

Two things that I had to roll into my config system: a) ability to include other files
This is a parser job to handle includes. Of the parsers included with the library only INFO supports them, but I believe other formats (XML, JSON, INI) do not have a notion of include at all.
b) inline variable expansion, e.g. site.root = c:\dev site.home = ${site.root}\home
Yeah, a long time ago the library supported that. There are actually two types of "dynamic values", or inline-expanded values as you call them:

1. Values which are expanded statically during parsing.
2. Values which are fully dynamic - i.e. the lookup is performed every time you access the value (so if you change the original, the dependent value changes as well).

#1 is just a parser issue - something along the lines of statically expanded entity references in XML. #2 requires support infrastructure from the basic_ptree class - namely a pointer to the parent - and a way to describe relative paths, so more extra characters would be needed than just the separator.
What are your thoughts on this being included in your library? Anyone else need to do this type of stuff, or would find this useful?
If there is enough interest I may try to revive it. On the other hand, a better solution might be to build another class on top of ptree that supports dynamic values. Marcin
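A minimal sketch of what such a layer on top of ptree might look like, resolving ${path} references at access time rather than at parse time (the "fully dynamic" variant above). It assumes the present-day Boost.PropertyTree interface (ptree, get<T>, dotted paths); the class name and the ${...} syntax are taken from Trent's example and are not part of the reviewed library.

#include <boost/property_tree/ptree.hpp>
#include <string>

namespace pt = boost::property_tree;

// Illustrative only; not something the reviewed library provides.
class dynamic_view
{
    const pt::ptree &root_;
public:
    explicit dynamic_view(const pt::ptree &root) : root_(root) {}

    // Fetch a value by path; if the whole value has the form ${other.path},
    // follow the reference (repeatedly, so chains of references resolve too).
    // Only whole-value references are handled; embedded ones are left alone.
    std::string get(const std::string &path) const
    {
        std::string value = root_.get<std::string>(path);
        while (value.size() > 3 && value.compare(0, 2, "${") == 0
               && value[value.size() - 1] == '}')
            value = root_.get<std::string>(value.substr(2, value.size() - 3));
        return value;
    }
};

Because the lookup happens on every call, changing "site.root" in the tree changes what a later get("site.home") returns, which is the behaviour described for type #2.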

Marcin Kalicinski wrote:
Two things that I had to roll into my config system: a) ability to include other files
This is a parser job to handle includes. Of the parsers included with the library only INFO supports them, but I believe other formats (XML, JSON, INI) do not have a notion of include at all.
XML itself may not have a notion of inclusions, but you could still emulate them. To use the five minute tutorial's example:

<debug>
    <filename>debug.log</filename>
    <pt:include src="debug-modules.conf" />
    <level>2</level>
</debug>

and in debug-modules.conf, you'd have:

<modules>
    <module>Finance</module>
    <module>Admin</module>
    <module>HR</module>
</modules>

This could pose problems with interoperability, depending on what other programs and libraries you use the XML file with. Another problem is that while this could be done in the parser, it may be more suited elsewhere.

-Lee
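A rough sketch of how such a post-load expansion pass could look, assuming the present-day Boost.PropertyTree API (read_xml, get<T>, insert/erase); the "pt:include" element name is the one invented in the example above, and the library itself provides nothing like this for XML.

#include <boost/property_tree/ptree.hpp>
#include <boost/property_tree/xml_parser.hpp>
#include <string>

namespace pt = boost::property_tree;

// Recursively replace every <pt:include src="..."/> element (a made-up
// element name, see the example above) with the top-level children of the
// referenced file, preserving document order.
void expand_includes(pt::ptree &tree)
{
    for (pt::ptree::iterator it = tree.begin(); it != tree.end(); )
    {
        if (it->first == "pt:include")
        {
            const std::string src = it->second.get<std::string>("<xmlattr>.src");
            pt::ptree included;
            pt::read_xml(src, included);
            it = tree.erase(it);                  // drop the include node
            for (pt::ptree::value_type &child : included)
                it = ++tree.insert(it, child);    // splice children in place
        }
        else
        {
            expand_includes(it->second);          // recurse into subtrees
            ++it;
        }
    }
}

No attempt is made here to resolve the src path relative to the including file's directory, to expand includes nested inside included files, or to detect include cycles; a real implementation would want all three.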

On Wed, 19 Apr 2006 23:22:42 +0100, Lee Houghton wrote
Marcin Kalicinski wrote:
Two things that I had to roll into my config system: a) ability to include other files
This is a parser job to handle includes. Of the parsers included with the library only INFO supports them, but I believe other formats (XML, JSON, INI) do not have a notion of include at all.
XML itself may not have a notion of inclusions,
It has more than a notion of includes, there's a whole spec:

http://www.w3.org/TR/xinclude/

We use this all the time in BoostBook xml docs -- example from the date_time docs...

<xi:include href="conceptual.xml"/>
<xi:include href="usage_examples.xml"/>
<xi:include href="gregorian.xml"/>
...

Jeff

XML itself may not have a notion of inclusions,
It has more than a notion of includes, there's a whole spec:
http://www.w3.org/TR/xinclude/
We use this all the time in BoostBook xml docs -- example from the date_time docs...
<xi:include href="conceptual.xml"/>
<xi:include href="usage_examples.xml"/>
<xi:include href="gregorian.xml"/>
It seems XML is even more complicated than I thought, then ;-) If this is standard, it may be quite easy to implement includes in the XML parser.

Best regards,
Marcin

Marcin Kalicinski wrote:
XML itself may not have a notion of inclusions,
It has more than a notion of includes, there's a whole spec:
http://www.w3.org/TR/xinclude/
We use this all the time in BoostBook xml docs -- example from the date_time docs...
<xi:include href="conceptual.xml"/>
<xi:include href="usage_examples.xml"/>
<xi:include href="gregorian.xml"/>
It seems XML is even more complicated than I thought, then ;-) If this is standard, it may be quite easy to implement includes in the XML parser.
So, are you gonna roll a namespace-aware XML parser? This is not trivial either, since there are things like default namespaces and so on.

- Volodya

It seems XML is even more complicated than I thought, then ;-) If this is standard, it may be quite easy to implement includes in the XML parser.
So, are you gonna roll a namespace-aware XML parser? This is not trivial either, since there are things like default namespaces and so on.
I was not aware that includes are connected with namespaces. I think my XML knowledge does not stand up to the need. I based my parser on Dan Nuffer's sources, and hope he got the corner cases right :-) On the other hand, if Boost gets its own XML library, my XML parser will be nothing but obsolete, so it may not be worthwhile to spend too much time on it at the moment.

Kind regards,
Marcin

Marcin Kalicinski wrote:
It seems XML is even more complicated than I thought, then ;-) If this is standard, it may be quite easy to implement includes in the XML parser.
So, are you gonna roll a namespace-aware XML parser? This is not trivial either, since there are things like default namespaces and so on.
I was not aware that includes are connected with namespaces. I think my XML knowledge does not stand up to the need. I based my parser on Dan Nuffer's sources, and hope he got the corner cases right :-) On the other hand, if Boost gets its own XML library, my XML parser will be nothing but obsolete, so it may not be worthwhile to spend too much time on it at the moment.
There are a number of applications that just need a simple XML parser. For example, most of the XML files in my company are simple recursive structures with a single id="42" attribute here and there. But that id could easily be put into a tag instead. I have not yet understood why XML needs to be so sophisticated, and will probably continue to ignore all those weird and advanced XML features.

Anyway, Marcin's parser also builds a ptree, which is a great benefit compared to the DOM parsers I have seen. Just make sure it is well-defined what subset of XML can be parsed and what cannot. In the spirit of the library, anything that is not really easy and simple to support should be rejected.

-Thorsten

There are a number of applications that just need a simple XML parser. For example, most of the XML files in my company are simple recursive structures with a single id="42" attribute here and there. But that id could easily be put into a tag instead.
I must say this is also true in my case. I've never needed to use the dark side of XML, and I hope I will not need to in the future. But I recognize there are some people who have different needs, and for them ptree will be too simplistic. For all the others, it should be fine.

Best regards,
Marcin

Thorsten Ottosen wrote:
Marcin Kalicinski wrote:
I was not aware that includes are connected with namespaces. I think my XML knowledge does not stand up to the need. I based my parser on Dan Nuffer's sources, and hope he got the corner cases right :-) On the other hand, if Boost gets its own XML library, my XML parser will be nothing but obsolete, so it may not be worthwhile to spend too much time on it at the moment.
There are a number of applications that just need a simple XML parser. For example, most of the XML files in my company are simple recursive structures with a single id="42" attribute here and there. But that id could easily be put into a tag instead.
I have not yet understood why XML needs to be so sophisticated, and will probably continue to ignore all those weird and advanced XML features.
Anyway, Marcin's parser also builds a ptree, which is a great benefit compared to the DOM parsers I have seen.
Just make sure it is well-defined what subset of XML can be parsed and what cannot. In the spirit of the library, anything that is not really easy and simple to support should be rejected.
I agree. I have the same observation. Most practical uses of XML are actually very simple. I too do not understand why XML needs to be so sophisticated. In light of this discussion and with the development of Spirit-2, I'm interested in writing such a "minimal" XML parser. What I would like to know is what that well-defined subset of XML is (as simple as it can get but still practically useful). I bet the result would be a lean, mean machine. (I'm cross-posting to the spirit list)

Regards,
--
Joel de Guzman
http://www.boost-consulting.com
http://spirit.sf.net

On 4/22/06, Joel de Guzman <joel@boost-consulting.com> wrote:
Thorsten Ottosen wrote:
Marcin Kalicinski wrote:
I was not aware that includes are connected with namespaces. I think my XML knowledge does not stand up to the need. I based my parser on Dan Nuffer's sources, and hope he got the corner cases right :-) On the other hand, if Boost gets its own XML library, my XML parser will be nothing but obsolete, so it may not be worthwhile to spend too much time on it at the moment.
There are a number of applications that just need a simple XML parser. For example, most of the XML files in my company are simple recursive structures with a single id="42" attribute here and there. But that id could easily be put into a tag instead.
I have not yet understood why XML needs to be so sophisticated, and will probably continue to ignore all those weird and advanced XML features.
Anyway, Marcin's parser also builds a ptree, which is a great benefit compared to the DOM parsers I have seen.
Just make sure it is well-defined what subset of XML can be parsed and what cannot. In the spirit of the library, anything that is not really easy and simple to support should be rejected.
I agree. I have the same observation. Most practical uses of XML are actually very simple. I too do not understand why XML needs to be so sophisticated.
All the significant projects I know of that use XML tend to use namespaces and schemas, for example Mozilla, Gnome, OpenOffice. Namespaces are useful in XML for the same reason they're useful in C++: modularity, which is good if you're dealing with a project maintained by more than one author with shared components that encode data as XML. Schemas give you data types and type checking, which obviously is nice to have when you're dealing with data. I think XML schema validation is one of the most important features of XML for the same reason that I like C++ templates and type-safe compile-time polymorphism: making sure your data types are correct beforehand gives you one less thing to worry about.

For anyone interested in becoming convinced of the usefulness of these XML features I would suggest the tutorials at http://www.w3schools.com. For a real-world example, check out Mozilla's XML User Interface Language (XUL) at http://developer.mozilla.org/en/docs/XUL which uses XML to configure the appearance of application GUIs. Of course there are also tons of books about this stuff as well.

I don't think this has any repercussions for property_tree other than to recognize that for the initial release it won't scale beyond trivial application configurations. That may be fine to begin with, but at some point Boost users may have higher expectations. We're spoiled rotten by Boost.Regex and others.

Daniel Walker

Daniel Walker wrote:
On 4/22/06, Joel de Guzman <joel@boost-consulting.com> wrote:
Thorsten Ottosen wrote:
I have not yet understood why XML needs to be so sophisticated, and will probably continue to ignore all those weird and advanced XML features.
I agree. I have the same observation. Most practical uses of XML are actually very simple. I too do not understand why XML needs to be so sophisticated.
All the significant projects I know of that use XML tend to use namespaces and schemas, for example Mozilla, Gnome, OpenOffice. Namespaces are useful in XML for the same reason they're useful in C++: modularity, which is good if you're dealing with a project maintained by more than one author with shared components that encode data as XML.
Schemas give you data types and type checking, which obviously is nice to have when you're dealing with data. I think XML schema validation is one of the most important features of XML for the same reason that I like C++ templates and type-safe compile-time polymorphism: making sure your data types are correct beforehand gives you one less thing to worry about.
Why is that better than a run-time exception when loading the file?
For anyone interested in becoming convinced of the usefulness of these XML features I would suggest the tutorials at http://www.w3schools.com.
There's like 17-18 tutorials on XML. I rest my case.
I don't think this has any repercussions for property_tree other than to recognize that for the initial release it won't scale beyond trivial application configurations.
I think it is important that it never scales beyond simple things.
That may be fine to begin with, but at some point Boost users may have higher expectations. We're spoiled rotten by Boost.Regex and others.
That's where a full xml-library comes in handy. -Thorsten

On 4/23/06, Thorsten Ottosen <thorsten.ottosen@dezide.com> wrote:
Daniel Walker wrote:
On 4/22/06, Joel de Guzman <joel@boost-consulting.com> wrote:
Thorsten Ottosen wrote:
I have not yet understood why XML needs to be so sophisticated, and will probably continue to ignore all those weird and advanced XML features.
I agree. I have the same observation. Most practical uses of XML are actually very simple. I too do not understand why XML needs to be so sophisticated.
All the significant projects I know of that use XML tend to use namespaces and schemas, for example Mozilla, Gnome, OpenOffice. Namespaces are useful in XML for the same reason they're useful in C++: modularity, which is good if you're dealing with a project maintained by more than one author with shared components that encode data as XML.
Schemas give you data types and type checking, which obviously is nice to have when you're dealing with data. I think XML schema validation is one of the most important features of XML for the same reason that I like C++ templates and type-safe compile-time polymorphism: making sure your data types are correct beforehand gives you one less thing to worry about.
Why is that better than a run-time exception when loading the file?
Why is what better? Maybe I wasn't clear. When an XML file includes a schema and fails validation when loaded, you do get a run-time exception. I was trying to say that's a good thing. An XML validating parser is similar to a compiler for a strongly typed language: it catches type errors (in addition to syntax errors) immediately before you actually try to use the file.
For anyone interested in becoming convinced of the usefulness of these XML features I would suggest the tutorials at http://www.w3schools.com.
There's like 17-18 tutorials on XML. I rest my case.
I don't think this has any repercussions for property_tree other than to recognize that for the initial release it won't scale beyond trivial application configurations.
I think it is important that it never scales beyond simple things.
That may be fine to begin with, but at some point Boost users may have higher expectations. We're spoiled rotten by Boost.Regex and others.
That's where a full xml-library comes in handy.
I agree. I'm just saying that when people looking for XML tools see the words "XML," "parser" and "tree" in a description of a Boost library, it may give a certain initial impression or raise expectations. "Boost" has become synonymous with "high quality" for a lot of C++ programmers. So, some of them may think this library includes a "high quality XML parser" that generates "high quality trees." Upon closer examination, potential users will find that property_tree does nothing of the sort, but it does something else well.

However, property_tree's users would certainly benefit from a real W3C standard XML parser. Perhaps their simple projects will grow into more complicated projects over the years, and additional standard XML features would help them manage the complexity. Of course, additional features mean an additional learning curve, which means additional tutorials may help them along. Even if the current XML (or is it really SGML?) parser included with property_tree never matures into a W3C standard XML parser, I wouldn't rule out the possibility that one day Boost users may want property_tree or program_options or some other Boost program configuration system to let them validate their XML config files, which are after all human editable, susceptible to human errors, and in need of validation. Pardon the rant.

Actually, is the parser just an SGML parser? Having just argued for more features, maybe it's actually better to have less. If really all you ever want for property_tree is a simple mark-up format for config files, maybe all this confusion could be avoided by renaming XML to SGML (without DTD support). XML is a subset of SGML in the sense that SGML is less restrictive, so an SGML parser can accept XML files. SGML (ISO 8879) may be a better mark-up standard for property_tree, or at least more suited to property_tree's current intended use-cases and feature set.

Daniel Walker

On 4/23/06, Sebastian Redl <sebastian.redl@getdesigned.at> wrote:
Daniel Walker wrote:
Actually, is the parser just an SGML parser?
XML is a subset of SGML. SGML is faaaar more complex.
That's what I said: "XML is a subset of SGML in the sense that SGML is less restrictive, so an SGML parser can accept XML files." My point is that we know property_tree doesn't have a full XML parser. Is it accepting files it should reject, or rejecting files it should accept? Is it closer to permissive SGML or restrictive XML?

Daniel Walker

On 4/23/06, Daniel Walker <daniel.j.walker@gmail.com> wrote:
On 4/23/06, Sebastian Redl <sebastian.redl@getdesigned.at> wrote:
Daniel Walker wrote:
Actually, is the parser just an SGML parser?
XML is a subset of SGML. SGML is faaaar more complex.
That's what I said: "XML is a subset of SGML in the sense that SGML is less restrictive, so an SGML parser can accept XML files." My point is that we know property_tree doesn't have a full XML parser. Is it accepting files it should reject, or rejecting files it should accept? Is it closer to permissive SGML or restrictive XML?
Never mind. I knew there was a standard for SGML, but I hadn't really looked at it. It's completely impractical. I always thought SGML was basically just XML without balanced tags, namespaces and all the other XML features; i.e. a combination of HTML and XML without any keywords/semantics. I know lots of people who mark up data that way and call it SGML. That's similar to what property_tree expects: the only overt XML requirement is the <?xml ?> declaration. So, to me it was sounding more like SGML. But yeah, you're right, writing a parser that supports ISO SGML would be a pain.

Daniel Walker

Daniel Walker wrote:
On 4/23/06, Thorsten Ottosen <thorsten.ottosen@dezide.com> wrote:
Schemas give you data types and type checking, which obviously is nice to have when you're dealing with data. I think XML schema validation is one of the most important features of XML for the same reason that I like C++ templates and type-safe compile-time polymorphism: making sure your data types are correct beforehand gives you one less thing to worry about.
Why is that better than a run-time exception when loading the file?
Why is what better? Maybe I wasn't clear. When an XML file includes a schema and fails validation when loaded, you do get a run-time exception. I was trying to say that's a good thing. An XML validating parser is similar to a compiler for a strongly typed language: it catches type errors (in addition to syntax errors) immediately before you actually try to use the file.
Ok, why is that exception better than the one I generate if I don't meet the tag I expect, or if the contents of a tag is not of the type I expect?

-Thorsten
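For concreteness, here is a minimal sketch of the kind of hand-written check being described, using the tutorial's debug config and assuming the present-day Boost.PropertyTree interface; the paths and the range check are only an example, not anything from the thread.

#include <boost/property_tree/ptree.hpp>
#include <boost/property_tree/xml_parser.hpp>
#include <stdexcept>
#include <string>

struct debug_settings
{
    std::string filename;
    int level;
};

// "debug.filename" and "debug.level" follow the five minute tutorial's
// example; the 0..9 constraint is invented for illustration.
debug_settings load(const std::string &path)
{
    namespace pt = boost::property_tree;
    pt::ptree tree;
    pt::read_xml(path, tree);   // throws xml_parser_error on malformed XML

    debug_settings s;
    // get<T> throws ptree_bad_path if the expected tag is missing and
    // ptree_bad_data if its contents cannot be converted to T -- the
    // "tag I expect" / "type I expect" exceptions mentioned above.
    s.filename = tree.get<std::string>("debug.filename");
    s.level    = tree.get<int>("debug.level");

    // Constraints a schema would express declaratively have to be coded:
    if (s.level < 0 || s.level > 9)
        throw std::runtime_error("Error in xml-file: debug.level out of range");
    return s;
}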

On 4/23/06, Thorsten Ottosen <thorsten.ottosen@dezide.com> wrote:
Daniel Walker wrote:
On 4/23/06, Thorsten Ottosen <thorsten.ottosen@dezide.com> wrote:
Schemas give you data types and type checking, which obviously is nice to have when you're dealing with data. I think XML schema validation is one of the most important features of XML for the same reason that I like C++ templates and type-safe compile-time polymorphism: making sure your data types are correct beforehand gives you one less thing to worry about.
Why is that better than a run-time exception when loading the file?
Why is what better? Maybe I wasn't clear. When an XML file includes a schema and fails validation when loaded, you do get a run-time exception. I was trying to say that's a good thing. An XML validating parser is similar to a compiler for a strongly typed language: it catches type errors (in addition to syntax errors) immediately before you actually try to use the file.
Ok, why is that exception better than the one I generate if I don't meet the tag I expect, or if the contents of a tag is not of the type I expect?
Well, you don't have to write C++ code to check if the tag/content is what you expect. You declare the acceptable content (tags, attribute/values, branch structures/sub-trees, etc.) for your config file in a schema, and the parser determines whether or not the XML file conforms to the format declared in the schema. It has time-saving, organization, and code-reuse advantages, among others. Most importantly, it has correctness advantages, because one bug you don't have to worry about when you're traversing the tree is running into a tag you don't expect. By the time the file is loaded, all the tags, tree structure, etc. have been validated by the parser and are guaranteed to be in the format you expect.
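For comparison, loading with validation typically looks like the following sketch using Xerces-C (which comes up later in the thread). The exact calls vary a little between Xerces-C versions, the file name is a placeholder, and the schema itself is assumed to be referenced from the document (e.g. via xsi:noNamespaceSchemaLocation), so the C++ side only has to switch validation on and check for errors.

#include <xercesc/parsers/XercesDOMParser.hpp>
#include <xercesc/sax/HandlerBase.hpp>
#include <xercesc/util/PlatformUtils.hpp>
#include <stdexcept>

// Placeholder file name; the document is expected to reference its schema.
void load_validated(const char *file)
{
    using namespace xercesc;
    XMLPlatformUtils::Initialize();

    XercesDOMParser parser;
    parser.setValidationScheme(XercesDOMParser::Val_Always);
    parser.setDoNamespaces(true);
    parser.setDoSchema(true);      // validate against the referenced schema

    HandlerBase handler;           // error handler hook (no-op here)
    parser.setErrorHandler(&handler);

    parser.parse(file);
    if (parser.getErrorCount() != 0)   // the parser tracks parse/validation errors
        throw std::runtime_error("Error in xml-file: schema validation failed");
}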

Daniel Walker wrote:
On 4/23/06, Thorsten Ottosen <thorsten.ottosen@dezide.com> wrote:
Daniel Walker wrote:
Why is what better? Maybe I wasn't clear. When an XML file includes a schema and fails validation when loaded, you do get a run-time exception. I was trying to say that's a good thing. An XML validating parser is similar to a compiler for a strongly typed language: it catches type errors (in addition to syntax errors) immediately before you actually try to use the file.
Ok, why is that exception better than the one I generate if I don't meet the tag I expect, or if the contents of a tag is not of the type I expect?
Well, you don't have to write C++ code to check if the tag/content is what you expect. You declare the acceptable content (tags, attribute/values, branch structures/sub-trees, etc.) for your config file in a schema, and the parser determines whether or not the XML file conforms to the format declared in the schema. It has time-saving, organization, and code-reuse advantages, among others. Most importantly, it has correctness advantages, because one bug you don't have to worry about when you're traversing the tree is running into a tag you don't expect. By the time the file is loaded, all the tags, tree structure, etc. have been validated by the parser and are guaranteed to be in the format you expect.
I wouldn't omit the checks in the code, even when using a validating parser. When the schema and the program logic disagree, the program logic "wins". Or crashes. Either way, the schema loses. :-)

On 4/23/06, Peter Dimov <pdimov@mmltd.net> wrote:
Daniel Walker wrote:
On 4/23/06, Thorsten Ottosen <thorsten.ottosen@dezide.com> wrote:
Daniel Walker wrote:
Why is what better? Maybe I wasn't clear. When an XML file includes a schema and fails validation when loaded, you do get a run-time exception. I was trying to say that's a good thing. An XML validating parser is similar to a compiler for a strongly typed language: it catches type errors (in addition to syntax errors) immediately before you actually try to use the file.
Ok, why is that exception better than the one I generate if I don't meet the tag I expect, or if the contents of a tag is not of the type I expect?
Well, you don't have to write C++ code to check if the tag/content is what you expect. You declare the acceptable content (tags, attribute/values, branch structures/sub-trees, etc.) for your config file in a schema, and the parser determines whether or not the XML file conforms to the format declared in the schema. It has time-saving, organization, and code-reuse advantages, among others. Most importantly, it has correctness advantages, because one bug you don't have to worry about when you're traversing the tree is running into a tag you don't expect. By the time the file is loaded, all the tags, tree structure, etc. have been validated by the parser and are guaranteed to be in the format you expect.
I wouldn't omit the checks in the code, even when using a validating parser. When the schema and the program logic disagree, the program logic "wins". Or crashes. Either way, the schema loses. :-)
Good point about crashes. It is still possible to write programs that crash even when using a schema. There is no silver bullet. Whether or not you want to double-check the parser/schema is the same sort of decision as whether or not you define NDEBUG in a released/deployed system. Sure, the asserts are still useful, but do you really need the extra check? The answer probably depends on the specific circumstances and is somewhat a matter of taste. I think the more common case is that you forget to manually check a constraint on the XML in your code, in which case you would be glad if you had a validating parser.

Daniel Walker

Daniel Walker wrote:
On 4/23/06, Peter Dimov <pdimov@mmltd.net> wrote:
Daniel Walker wrote:
On 4/23/06, Thorsten Ottosen <thorsten.ottosen@dezide.com> wrote:
Ok, why is that exception better than the one I generate if I don't meet the tag I expect, or if the contents of a tag is not of the type I expect?
Well, you don't have to write C++ code to check if the tag/content is what you expect. You declare the acceptable content (tags, attribute/values, branch structures/sub-trees, etc.) for your config file in a schema, and the parser determines whether or not the XML file conforms to the format declared in the schema.
I wouldn't omit the checks in the code, even when using a validating parser. When the schema and the program logic disagree, the program logic "wins". Or crashes. Either way, the schema loses. :-)
Good point about crashes. It is still possible to write programs that crash even when using a schema. There is no silver bullet. Whether or not you want to double-check the parser/schema is the same sort of decision as whether or not you define NDEBUG in a released/deployed system. Sure, the asserts are still useful, but do you really need the extra check? The answer probably depends on the specific circumstances and is somewhat a matter of taste. I think the more common case is that you forget to manually check a constraint on the XML in your code, in which case you would be glad if you had a validating parser.
At least for my code, a crash is not acceptable compared to a nice popup message: "Error in xml-file: foo"; -Thorsten

On 4/24/06, Thorsten Ottosen <thorsten.ottosen@dezide.com> wrote:
Daniel Walker wrote:
On 4/23/06, Peter Dimov <pdimov@mmltd.net> wrote:
Daniel Walker wrote:
On 4/23/06, Thorsten Ottosen <thorsten.ottosen@dezide.com> wrote:
Ok, why is that exception better than the one I generate if I don't meet the tag I expect, or if the contents of a tag is not of the type I expect?
Well, you don't have to write C++ code to check if the tag/content is what you expect. You declare the acceptable content (tags, attribute/values, branch structures/sub-trees, etc.) for your config file in a schema, and the parser determines whether or not the XML file conforms to the format declared in the schema.
I wouldn't omit the checks in the code, even when using a validating parser. When the schema and the program logic disagree, the program logic "wins". Or crashes. Either way, the schema loses. :-)
Good point about crashes. It is still possible to write programs that crash even when using a schema. There is no silver bullet. Whether or not you want to double-check the parser/schema is the same sort of decision as whether or not you define NDEBUG in a released/deployed system. Sure, the asserts are still useful, but do you really need the extra check? The answer probably depends on the specific circumstances and is somewhat a matter of taste. I think the more common case is that you forget to manually check a constraint on the XML in your code, in which case you would be glad if you had a validating parser.
At least for my code, a crash is not acceptable compared to a nice popup message: "Error in xml-file: foo";
Me too. I've just been trying to say that your code is more likely to crash without validation than with, because there's more opportunity for human error if you're solely relying on hand-written, ad hoc error checking of the XML. So, if crashes are unacceptable, I'd highly recommend a validating parser... though apparently it's not currently necessary for property_tree or Boost.Serialization.

Daniel Walker

Daniel Walker wrote:
On 4/24/06, Thorsten Ottosen <thorsten.ottosen@dezide.com> wrote:
At least for my code, a crash is not acceptable compared to a nice popup message: "Error in xml-file: foo";
Me too. I've just been trying to say that your code is more likely to crash without validation than with, because there's more opportunity for human error if you're solely relying on hand-written, ad hoc error checking of the XML.
It's hard to make that assessment.
So, if crashes are unacceptable, I'd highly recommend a validating parser... though apparently it's not currently necessary for property_tree or Boost.Serialization.
If the parser itself doesn't crash, then it's fairly easy to avoid human errors. If the parser can crash, then we have to assume the validation can too, and so validation is another risk. I use a fairly old version of Xerces in my code; I can't change it easily, nor upgrade easily. I doubt that validation will help anything. Instead I wrap it in a layer that throws an exception instead of crashing when the parser is misused.

-Thorsten

Thorsten Ottosen wrote:
Daniel Walker wrote:
On 4/23/06, Thorsten Ottosen <thorsten.ottosen@dezide.com> wrote:
Schemas give you data types and type checking, which obviously is nice to have when you're dealing with data. I think XML schema validation is one of the most important features of XML for the same reason that I like C++ templates and type-safe compile-time polymorphism: making sure your data types are correct beforehand gives you one less thing to worry about.
Why is that better than a run-time exception when loading the file?
Why is what better? Maybe I wasn't clear. When an XML file includes a schema and fails validation when loaded, you do get a run-time exception. I was trying to say that's a good thing. An XML validating parser is similar to a compiler for a strongly typed language: it catches type errors (in addition to syntax errors) immediately before you actually try to use the file.
Ok, why is that exception better than the one I generate if I don't meet the tag I expect, or if the contents of a tag is not of the type I expect?
-Thorsten
It's not that the exception is automatically better, but the question seems similar to an old-time two-phase-construction programmer asking why it's better to avoid two-phase construction. The parsing/validation of XML allows me to trust that my data structure has been constructed fully and as a valid foo - or that an exception was thrown for some appropriate reason. For XML, malformed XML itself is akin to a constructor call whose failure can only be determined at runtime. In fact, in each of the places where I've used XML to transmit data, it has been a representation of an invariant. This is no different from a class which properly represents an invariant, and I prefer not to be able to receive broken objects from factories. Pardon me if I'm missing the point of your question...

Brian Allison

Trent Nelson wrote:
What are your thoughts on this being included in your library?
Number one (including) is already present, at least in the INFO format. With an XInclude-capable or external-parsed-entity-capable XML processor, it would also be present in the XML format. Neither JSON nor INI has include capabilities, and neither do any of the special sources, of course. However, it would be possible to write yet another parser for a format that has fancy include capabilities. Bottom line: including is a feature of the storage format and its parser, not the property tree itself.

As for the second, value substitution, I think that should be an algorithm that you can apply to any tree. It will look up values of a special form and substitute them somehow (by looking them up in the same tree, or perhaps by looking them up in another tree).
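A sketch of what such a substitution algorithm might look like, again assuming the present-day Boost.PropertyTree interface; the ${path} syntax is taken from Trent's example and the function is purely illustrative.

#include <boost/property_tree/ptree.hpp>
#include <string>

namespace pt = boost::property_tree;

// Walk a tree and expand every ${some.path} occurring in a node's value by
// looking the path up in a reference tree (pass the same tree to look values
// up "in the same tree", or a different one to use another tree).
void substitute(pt::ptree &node, const pt::ptree &reference)
{
    std::string &value = node.data();
    std::string::size_type begin;
    while ((begin = value.find("${")) != std::string::npos)
    {
        std::string::size_type end = value.find('}', begin);
        if (end == std::string::npos)
            break;                                        // unterminated reference
        const std::string path = value.substr(begin + 2, end - begin - 2);
        value.replace(begin, end - begin + 1,
                      reference.get<std::string>(path));  // throws if path missing
    }
    for (pt::ptree::value_type &child : node)
        substitute(child.second, reference);              // recurse into children
}

With this, substitute(tree, tree) turns site.home = ${site.root}\home into c:\dev\home when site.root = c:\dev appears elsewhere in the same tree. There is no cycle detection, so mutually referring values would loop forever; a real implementation would need to guard against that.
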
participants (11)

- Brian Allison
- Daniel Walker
- Jeff Garland
- Joel de Guzman
- Lee Houghton
- Marcin Kalicinski
- Peter Dimov
- Sebastian Redl
- Thorsten Ottosen
- Trent Nelson
- Vladimir Prus