[Boost Review] Property Tree Library

Dear All,

I have the great pleasure to announce that the formal Boost review of Marcin Kalicinski's Property Tree Library begins now (the 18th of April) and runs through the 27th of April.

Introduction
------------

This fairly high-level library consists of the following facilities:

1. a generic recursive property-tree data structure
2. generic conversions to/from this data structure from/to
   a. xml-files
   b. ini-files
   c. json-files
   d. windows registry files

As an example of how powerful the library is, consider this tutorial:

- http://kaalus.atspace.com/ptree/doc/index.html#five_minute_tutorial

The library may be downloaded from

- the boost file vault: http://boost-consulting.com/vault/ (property_tree_rev5.zip)
- tinyurl: http://tinyurl.com/fkt7r
- boost sandbox: http://www.boost.org/more/mailing_lists.htm#sandbox (look in boost/property_tree and libs/property_tree)

The documentation may be viewed online at

- http://kaalus.atspace.com/ptree/doc/index.html

If you find the above interesting, please consider submitting a review to the boost developer mailing list:

- http://www.boost.org/more/mailing_lists.htm#main

You might end up spending several hours on a review, but the end result could be a superb library that will save you weeks of work.

Notes for reviewers
-------------------

When writing your review, you may wish to consider the following questions:

* What is your evaluation of the design?
* What is your evaluation of the implementation?
* What is your evaluation of the documentation?
* What is your evaluation of the potential usefulness of the library?
* Did you try to use the library? With what compiler? Did you have any problems?
* How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
* Are you knowledgeable about the problem domain?

And finally, every review should answer this question:

* Do you think the library should be accepted as a Boost library? Be sure to say this explicitly so that your other comments don't obscure your overall opinion.

In particular, consider if you can answer the following questions:

- was the library suitable for your daily xml-tasks? If not, what was missing?
- was the library's performance good enough? If not, can you suggest improvements?
- was the library's design flexible enough? If not, how would you suggest it should be redesigned to broaden its scope of use?

best regards

Thorsten Ottosen, Review Manager

Here's my preliminary review of the property tree library.
* Are you knowledgeable about the problem domain?
I'm a student of software engineering and have repeatedly written programs that require external configuration, mostly in C++, Java and PHP.
* How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
I've begun using the library as the main means of configuration in a project. I have not yet, however, compiled any code or looked at the library source code.
* What is your evaluation of the design?
The design is straightforward and sound. The compatibility with standard containers makes the learning curve shallow, and the specialized functions offer an easy interface.
* What is your evaluation of the implementation?
I did not look at the implementation.
* What is your evaluation of the documentation?
The documentation is generally good. However, it could be better structured - in particular, the documentation index page is too long and needs to be split into parts. The documentation also requires attention from a native English speaker to correct the various grammar errors.
* What is your evaluation of the potential usefulness of the library?
High. It offers a flexible and universal interface to many different forms of configuration sources in a way that I have yet to see in any other language/library.
* Did you try to use the library? With what compiler? Did you have any problems?
I've so far written a bit of source code using the library, but have not yet compiled the code. It will eventually be compiled using GCC 3.4.x and 4.x.
And finally, every review should answer this question:
* Do you think the library should be accepted as a Boost library? Be sure to say this explicitly so that your other comments don't obscure your overall opinion.
Yes. Already it transforms a job I've been dreading to do (and so far have reinvented every time, because I was never satisfied with my approach) into a matter of 5 lines. (Literally!) Enhancements from here on are incremental and their lack does not matter right now.
In particular, consider if you can answer the following questions:
- was the library suitable for your daily xml-tasks? If not, what was missing?
The library seems very well suited to reading and saving XML configuration files. I would suggest, however, introducing versions of read/write_xml that allow the caller to specify the special keys for attributes, comments and text. Also, a flag should be added that specifies whitespace handling (ignore, preserve, collapse).
- was the library's performance good enough? If not, can you suggest improvements?
No performance testing done.
- was the library's design flexible enough? If not, how would you suggest it should be redesigned to broaden its scope of use?
Seems flexible enough for me, thanks to the key and value type traits.

Additional suggestions:

1) Having a separate namespace for the interface of each parser seems to be overly verbose. The property_tree namespace is sparsely populated (just ptree, a few predefined traits classes, and the exceptions if I'm not mistaken), so the read_* and write_* functions could reside there. Alternatively, a single namespace parsers could be created to hold all interface functions. Assuming pt is an alias for the property tree namespace, pt::xml_parser::read_xml contains redundant information about the type being read.

2) Add a parser for Java-like .properties files. The hierarchy could be built using common prefixes, i.e.

    build.dirs.libraries = /usr/lib
    build.dirs.headers = /usr/include

could be transformed to

    build {
        dirs {
            libraries = /usr/lib
            headers = /usr/include
        }
    }

3) The default separator should be settable. E.g. if I want to always separate my paths with '/' (I could be porting from a different library), I want to be able to write:

    pt.set_default_separator('/');
    string libs = pt.get("build/dirs/libraries");

Perhaps I'll have additional comments at a later point.

Sebastian Redl
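The prefix-based hierarchy from suggestion 2 and the caller-chosen separator from suggestion 3 can be sketched with the standard library alone. Everything below (Node, put, lookup, parse_properties) is a hypothetical illustration, not the reviewed library's interface; unlike ptree, this sketch stores children in a map with unique keys and does no comment or escape handling.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <sstream>
#include <string>

// Hypothetical stand-in for a property tree node (not the library's ptree):
// one data string plus uniquely keyed children.
struct Node {
    std::string data;
    std::map<std::string, std::unique_ptr<Node>> children;
};

// Strip leading and trailing whitespace.
inline std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const std::string::size_type b = s.find_first_not_of(ws);
    if (b == std::string::npos) return std::string();
    const std::string::size_type e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Store `value` at `path`, creating intermediate nodes; `sep` splits the path.
inline void put(Node& root, const std::string& path,
                const std::string& value, char sep = '.') {
    Node* cur = &root;
    std::string::size_type start = 0;
    for (;;) {
        const std::string::size_type end = path.find(sep, start);
        std::unique_ptr<Node>& child = cur->children[path.substr(start, end - start)];
        if (!child) child.reset(new Node());
        cur = child.get();
        if (end == std::string::npos) break;
        start = end + 1;
    }
    cur->data = value;
}

// Return the data stored at `path`, or an empty string if no such node exists.
inline std::string lookup(const Node& root, const std::string& path, char sep = '.') {
    const Node* cur = &root;
    std::string::size_type start = 0;
    for (;;) {
        const std::string::size_type end = path.find(sep, start);
        const auto it = cur->children.find(path.substr(start, end - start));
        if (it == cur->children.end()) return std::string();
        cur = it->second.get();
        if (end == std::string::npos) return cur->data;
        start = end + 1;
    }
}

// Read "key = value" lines; common key prefixes end up as shared parent nodes.
inline Node parse_properties(const std::string& text, char sep = '.') {
    Node root;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        const std::string::size_type eq = line.find('=');
        if (eq == std::string::npos) continue;  // skip blank or malformed lines
        put(root, trim(line.substr(0, eq)), trim(line.substr(eq + 1)), sep);
    }
    return root;
}
```

Passing the separator per call keeps the tree itself free of any separator state, which is the trade-off discussed in the replies that follow.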

[...] The documentation also requires attention from a native English speaker to correct the various grammar errors.
My English is far from perfect. I would be very grateful if a native speaker could skim the docs and point me towards the most glaring errors.
The library seems very well suited to reading and saving XML configuration files. I would suggest, however, introducing versions of read/write_xml that allow the caller to specify the special keys for attributes, comments and text. Also, a flag should be added that specifies whitespace handling (ignore, preserve, collapse).
Agreed. XML parser definitely needs more flexibility. All the above improvements are easy to implement.
Additional suggestions: 1) Having a separate namespace for the interface of each parser seems to be overly verbose. The property_tree namespace is sparsely populated [...]
The namespace issue has also been mentioned by other people before the review. One suggestion was to rename the ptree class to property_tree and place it directly in the boost namespace. The current implementation has the virtue of being extremely conservative, trying to minimize the number of names introduced into existing namespaces. Because fiddling with namespaces is quite a fundamental change to the library interface, it would be nice to gather more suggestions on this matter before making any decisions. On the other hand, this is probably something that should be addressed before the library becomes a part of boost (if ever), because maintaining backwards compatibility with the current solution would be a madman's nightmare.
2) Add a parser for Java-like .properties files. The hierarchy could be built using common prefixes, i.e. [...]
I'm not familiar with the .properties file format, but it looks easy to implement a parser for it with Spirit. If so, it could be a matter of a day or two to have it working.
3) The default separator should be settable. E.g. if I want to always separate my paths with '/' (I could be porting from a different library), I want to be able to write: pt.set_default_separator('/'); string libs = pt.get("build/dirs/libraries");
Agreed. I thought about it before, and came to the conclusion that the best way to implement it is to use a preprocessor constant that lets the user specify the default separator, e.g. BOOST_PROPERTY_TREE_SEPARATOR.

Thank you,
Marcin
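A compile-time default of this kind might look like the sketch below. The macro name comes from the message above, but it was only a proposal at this point in the review, and first_key is a hypothetical helper introduced purely for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical: users would define this before including the library headers
// to change the default; '.' is used when they do not.
#ifndef BOOST_PROPERTY_TREE_SEPARATOR
#define BOOST_PROPERTY_TREE_SEPARATOR '.'
#endif

// Return the first component of `path`. The separator defaults to the
// compile-time constant but can still be overridden for a single call.
inline std::string first_key(const std::string& path,
                             char sep = BOOST_PROPERTY_TREE_SEPARATOR) {
    return path.substr(0, path.find(sep));
}
```

One consequence of this design is that every translation unit must agree on the macro's value, otherwise code in different files would dissect the same path string differently.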

Marcin Kalicinski wrote:
Agreed. I thought about it before, and came to a conclusion that the best way to implement it is to use preprocessor constant to allow user specify default separator, e.g. BOOST_PROPERTY_TREE_SEPARATOR.
I disagree. I think this should be a runtime per-tree setting, not something set at compile time. I also think it would be quite trivial to do. Sebastian Redl

Agreed. I thought about it before, and came to a conclusion that the best way to implement it is to use preprocessor constant to allow user specify default separator, e.g. BOOST_PROPERTY_TREE_SEPARATOR.
I disagree. I think this should be a runtime per-tree setting, not something set at compile time. I also think it would be quite trivial to do.
You mean ptree should have an additional data member containing the default separator? How do we make sure that all nodes in the tree have the same separator? Also, this would add some runtime overhead to the library, storing so many copies of the same character. As an alternative to a macro, it could be a part of the traits. But I think this solution is too cumbersome to be practical, because the user would need to roll out his own ptree type just to use a different default separator. It would also make trees with different separators incompatible.

Marcin

Marcin Kalicinski wrote:
You mean ptree should have additional data member containing default separator? How do we make sure that all nodes in the tree have the same separators? Also, this would add some runtime overhead to the library, storing so many copies of the same character.
That's a very valid problem. I'll think about it, but I would really like it if a feasible solution could be found. Sebastian Redl

"Sebastian Redl" <sebastian.redl@getdesigned.at> wrote in message news:44463E63.6030703@getdesigned.at... : Marcin Kalicinski wrote: : : >Agreed. I thought about it before, and came to a conclusion that the best : >way to implement it is to use preprocessor constant to allow user specify : >default separator, e.g. BOOST_PROPERTY_TREE_SEPARATOR. : > : > : I disagree. I think this should be a runtime per-tree setting, not : something set at compile time. I also think it would be quite trivial to do. Why do you care to have this as a per-tree setting ??? So now, say that someone writes a library function to read a ptree into a struct EmployeeInfo. Unless this function explicitly specifies the separator at every call, it will fail if you pass it a tree that is configured to use a different separator ? I'd much rather have a universal default ( . or / ), guaranteed, that I can use in all my "item paths". And maybe, if there is a reason to think that it will be useful, the ability to specify a custom separator for a call. But what is the point? Ivan -- http://ivan.vecerina.com/contact/?subject=NG_POST <- email contact form

Ivan Vecerina wrote:
"Sebastian Redl" <sebastian.redl@getdesigned.at> wrote in message news:44463E63.6030703@getdesigned.at...
Marcin Kalicinski wrote:
Agreed. I thought about it before, and came to a conclusion that the best way to implement it is to use preprocessor constant to allow user specify default separator, e.g. BOOST_PROPERTY_TREE_SEPARATOR.
I disagree. I think this should be a runtime per-tree setting, not something set at compile time. I also think it would be quite trivial to do.
Why do you care to have this as a per-tree setting ???
So now, say that someone writes a library function to read a ptree into a struct EmployeeInfo. Unless this function explicitly specifies the separator at every call, it will fail if you pass it a tree that is configured to use a different separator ?
I'd much rather have a universal default ( . or / ), guaranteed, that I can use in all my "item paths". And maybe, if there is a reason to think that it will be useful, the ability to specify a custom separator for a call. But what is the point?
What if the keys contain '.' or '/'? Jeff Flinn

"Jeff Flinn" <TriumphSprint2000@hotmail.com> wrote in message news:e27qih$65r$1@sea.gmane.org... : Ivan Vecerina wrote: : > "Sebastian Redl" <sebastian.redl@getdesigned.at> wrote in message : > news:44463E63.6030703@getdesigned.at... : >> Marcin Kalicinski wrote: : >> : >>> Agreed. I thought about it before, and came to a conclusion that : >>> the best way to implement it is to use preprocessor constant to : >>> allow user specify default separator, e.g. : >>> BOOST_PROPERTY_TREE_SEPARATOR. : >>> : >>> : >> I disagree. I think this should be a runtime per-tree setting, not : >> something set at compile time. I also think it would be quite : >> trivial to do. : > : > Why do you care to have this as a per-tree setting ??? : > : > So now, say that someone writes a library function to read a ptree : > into a struct EmployeeInfo. Unless this function explicitly : > specifies the separator at every call, it will fail if you pass : > it a tree that is configured to use a different separator ? : > : > I'd much rather have a universal default ( . or / ), guaranteed, : > that I can use in all my "item paths". : > And maybe, if there is a reason to think that it will be useful, : > the ability to specify a custom separator for a call. But : > what is the point? : : What if the keys contain '.' or '/'? Then the function that uses such a key can explicitly specify a different separator, or use a different accessor function. One could also choose to provide an escaping mechanism within the path. But in the first place, ptree will never be able to open just any XML/whatever file - users of it can also adapt. Do you need to have built-in support for every corner case ? But the key point is: the function that uses a path string also knows which separator it intended to use in its notation. Storing the separator as a property of the tree goes against encapsulation. I would just pick a default separator. I don't care which choice is made, but '/' makes sense. 
Also implement path-access as non-member functions, so users are free to choose an path-specification approach if desired. And how about supporting array indexing in the path syntax ? Ivan -- http://ivan.vecerina.com/contact/?subject=NG_POST <- email contact form
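The escaping mechanism mentioned above, combined with a separator chosen per call, could be sketched as a non-member function. The name split_escaped and the choice of backslash as the escape character are assumptions made for this illustration; nothing like it is claimed to exist in the reviewed library.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Split `path` into keys on `sep`. A backslash makes the following character
// literal, so a key may itself contain the separator.
inline std::vector<std::string> split_escaped(const std::string& path, char sep = '/') {
    std::vector<std::string> keys(1);
    for (std::string::size_type i = 0; i < path.size(); ++i) {
        const char c = path[i];
        if (c == '\\' && i + 1 < path.size()) {
            keys.back() += path[++i];       // escaped character, taken literally
        } else if (c == sep) {
            keys.push_back(std::string());  // separator: start the next key
        } else {
            keys.back() += c;
        }
    }
    return keys;
}
```

Because the function is free-standing, the code that writes the path string also decides how it is dissected, which is the encapsulation argument made in the message.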

But the key point is: the function that uses a path string also knows which separator it intended to use in its notation. Storing the separator as a property of the tree goes against encapsulation.
Additionally, it is not very feasible. You have to remember that tree is a recursive structure. How do we make sure that all nodes contain the same default separator?
I would just pick a default separator. I don't care which choice is made, but '/' makes sense.
I think any choice of default separator has drawbacks. For example, '/' is commonly found in filesystem paths or web addresses, so it would complicate life for quite a few users. I could pick some really obscure character, like a back-apostrophe or ^, but that would render paths quite unreadable. So I decided to stick to a dot.

Marcin

Marcin Kalicinski wrote:
Additionally, it is not very feasible. You have to remember that tree is a recursive structure. How do we make sure that all nodes contain the same default separator?
You are right. In my initial enthusiasm, I overvalued per-tree default characters, and the problem you mention is indeed grave. I withdraw my request. Perhaps as a constant in the traits, then? I don't think the same problems apply. Sebastian Redl

You are right. In my initial enthusiasm, I overvalued per-tree default characters, and the problem you mention is indeed grave. I withdraw my request.
Perhaps as a constant in the traits, then? I don't think the same problems apply.
But then trees with different constants would have different types and be incompatible. That is good in the sense that you would be unable to insert one into another, but it would seem weird not to be able to compare them, for example. I think the separator is a property of the path dissection algorithm, not of the tree (not even of the path), so any attempt to artificially affix it to the tree will lead to trouble, sooner or later.

Marcin

Marcin Kalicinski wrote:
But the key point is: the function that uses a path string also knows which separator it intended to use in its notation. Storing the separator as a property of the tree goes against encapsulation.
Additionally, it is not very feasible. You have to remember that tree is a recursive structure. How do we make sure that all nodes contain the same default separator?
Why does the tree even care what the separators are? Do you store full paths to each node? IMO, as I've stated elsewhere in this thread for other reasons, there is a need for a separate path class/concept. A path could have constructor taking a string and an optional separator. The path would expose iterators ala boost::filesystem. Jeff Flinn
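A path type along these lines, with a constructor taking a string and an optional separator and with iterators over the individual keys, might be sketched as follows. The class name and interface are hypothetical, loosely modelled on the boost::filesystem idea mentioned above, and are not part of the reviewed library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical path type: the string is dissected once at construction,
// so the tree never needs to know which separator was used.
class path {
public:
    typedef std::vector<std::string>::const_iterator const_iterator;

    // An empty string yields an empty path; otherwise split on `sep`.
    path(const std::string& text, char sep = '.') {
        if (text.empty()) return;
        std::string::size_type start = 0;
        for (;;) {
            const std::string::size_type end = text.find(sep, start);
            keys_.push_back(text.substr(start, end - start));
            if (end == std::string::npos) break;
            start = end + 1;
        }
    }

    const_iterator begin() const { return keys_.begin(); }
    const_iterator end() const { return keys_.end(); }
    std::size_t size() const { return keys_.size(); }

private:
    std::vector<std::string> keys_;
};
```

A lookup function taking such a path would simply walk from begin() to end(), descending one child per key.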

as I've stated elsewhere in this thread for other reasons, there is a need for a separate path class/concept. A path could have constructor taking a string and an optional separator. The path would expose iterators ala boost::filesystem.
I toyed with that idea (path objects) when I was thinking about configurable separators. The reason why I did not use it was that I thought it might complicate the interface, at least conceptually, if not syntactically. Simplicity of use was generally the #1 goal; otherwise people would just prefer to use MSXML, Expat, TinyXML etc. I might need to reconsider it if I add key policies to traits.

Best regards,
Marcin

I don't understand the char_type member typedef of the traits class. The functions that use it all seem to be free functions, which could change from e.g.

    template<class Ptree>
    void read_xml(std::basic_istream<typename Ptree::char_type> &stream, Ptree &pt, int flags = 0);

to

    template<class Ptree, class Char>
    void read_xml(std::basic_istream<Char> &stream, Ptree &pt, int flags = 0);

without problems. That would remove one level of coupling, wouldn't it?

...........

Why are all ptree nodes the same type? Commonly a tree will have branch and leaf nodes. In the debug example there are two branch nodes and 4 leaf nodes, which means two empty lists. That is wasteful, isn't it?

................

Is it necessary to make the key a string? Could it not also be (say) an integer id?

regards
Andy Little

Andy Little wrote:
I don't understand the char_type member typedef of the traits class. The functions that use it all seem to be free functions, which could change from e.g.
template<class Ptree> void read_xml(std::basic_istream<typename Ptree::char_type> &stream, Ptree &pt, int flags = 0);
to
template<class Ptree, class Char> void read_xml(std::basic_istream<Char> &stream, Ptree &pt,int flags = 0);
without problems. That would remove one level of coupling, wouldn't it?
I'm not sure it is worth speculating too much about coupling. The function will implicitly depend on the interface of basic_ptree<> anyway. So there is no reason to hide that fact. I consider this interface most elegant (because ADL also works):

    namespace boost {
        template< class Traits >
        void read_xml( std::basic_istream<typename Traits::char_type>& str,
                       basic_ptree<Traits>& tree, int flags = 0 );
    }

-Thorsten
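The ADL point can be demonstrated with a self-contained sketch. The demo namespace, narrow_traits, basic_tree and read_text are hypothetical stand-ins (read_text only stores the first line of the stream and is no parser); the shape of the signature follows the one quoted above.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

namespace demo {

// Hypothetical traits class carrying only the character type.
struct narrow_traits {
    typedef char char_type;
};

// Hypothetical stand-in for basic_ptree<Traits>.
template <class Traits>
struct basic_tree {
    std::basic_string<typename Traits::char_type> data;
};

// Placeholder "reader": stores the first line of the stream in the tree.
// Traits is deduced from `tree` alone; the stream parameter is a non-deduced
// context, so its character type is fixed by the traits.
template <class Traits>
void read_text(std::basic_istream<typename Traits::char_type>& in,
               basic_tree<Traits>& tree) {
    std::getline(in, tree.data);
}

}  // namespace demo
```

Called from outside the namespace as `read_text(in, tree)`, with no qualification, the function is still found because one of its arguments has a type from namespace demo.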

I consider this interface most elegant (because ADL also works):
namespace boost { template< class Traits > void read_xml( std::basic_istream<typename Traits::char_type>& str, basic_ptree<Traits>& tree, int flags = 0 ); }
The reason why parsers are templated on tree type is that basic_ptree originally had more template parameters than now (just the traits), so replicating them everywhere looked like a potential maintenance problem. Marcin

"Thorsten Ottosen" wrote
Andy Little wrote:
I don't understand the char_type member typedef of the traits class. The functions that use it all seem to be free functions, which could change from e.g.
template<class Ptree> void read_xml(std::basic_istream<typename Ptree::char_type> &stream, Ptree &pt, int flags = 0);
to
template<class Ptree, class Char> void read_xml(std::basic_istream<Char> &stream, Ptree &pt,int flags = 0);
without problems. That would remove one level of coupling, wouldn't it?
I'm not sure it is worth speculating too much about coupling. The function will implicitly depend on the interface of basic_ptree<> anyway. So there is no reason to hide that fact.
The only reason for the char_type traits parameter is for stream I/O though, isn't it? IOW, because of this coupling each ptree type can only be serialised to a stream parameterised on one char type. You cannot therefore have a wide string ptree loaded from a char file, for example.
I consider this interface most elegant (because ADL also works):
namespace boost { template< class Traits > void read_xml( std::basic_istream<typename Traits::char_type>& str, basic_ptree<Traits>& tree, int flags = 0 ); }
ADL also works ...

    namespace boost {
        template< class Traits, typename Char >
        void read_xml( std::basic_istream<Char>& str,
                       basic_ptree<Traits>& tree, int flags = 0 );
    }

regards
Andy Little

Andy Little wrote:
"Thorsten Ottosen" wrote
I'm not sure it is worth speculating too much about coupling. The function will implicitly depend on the interface of basic_ptree<> anyway. So there is no reason to hide that fact.
The only reason for the char_type traits parameter is for stream I/O though, isn't it? IOW, because of this coupling each ptree type can only be serialised to a stream parameterised on one char type. You cannot therefore have a wide string ptree loaded from a char file, for example.
I consider this interface most elegant (because ADL also works):
namespace boost { template< class Traits > void read_xml( std::basic_istream<typename Traits::char_type>& str, basic_ptree<Traits>& tree, int flags = 0 ); }
ADL also works ...
namespace boost { template< class Traits , typename Char> void read_xml( std::basic_istream<Char>& str, basic_ptree<Traits>& tree, int flags = 0 ); }
I see your point, but I'm not qualified to tell you if it makes sense for Char and Traits::char_type to be different. Marcin?

-Thorsten

ADL also works ...
namespace boost { template< class Traits , typename Char> void read_xml( std::basic_istream<Char>& str, basic_ptree<Traits>& tree, int flags = 0 ); }
I see your point, but I'm not qualified to tell you if it makes sense for Char and Traits::char_type to be different.
It does not make much sense. It would mean that we have to widen or narrow every string read from the stream before putting it in the tree. If really needed, this can be done later by constructing another tree and converting strings as we copy them from the original. Anyway, this is rather a job for locale codecvt facets, not for the property_tree library.

Marcin
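The "construct another tree and convert strings as we copy" approach could be sketched like this. basic_node, widen and widen_tree are hypothetical names; the conversion uses the locale's ctype facet to widen one character at a time, which is adequate for plain ASCII content but is not a substitute for a real codecvt-based decoder.

```cpp
#include <cassert>
#include <locale>
#include <map>
#include <memory>
#include <string>

// Hypothetical minimal tree, parameterised on the character type only.
template <class Ch>
struct basic_node {
    std::basic_string<Ch> data;
    std::map<std::basic_string<Ch>, std::unique_ptr<basic_node>> children;
};

// Widen a narrow string character by character with the locale's ctype facet.
inline std::wstring widen(const std::string& s,
                          const std::locale& loc = std::locale::classic()) {
    std::wstring out(s.size(), L'\0');
    if (!s.empty()) {
        std::use_facet<std::ctype<wchar_t> >(loc).widen(
            s.data(), s.data() + s.size(), &out[0]);
    }
    return out;
}

// Fill `dst` with a widened copy of `src`, recursing into every child.
inline void widen_tree(const basic_node<char>& src, basic_node<wchar_t>& dst,
                       const std::locale& loc = std::locale::classic()) {
    dst.data = widen(src.data, loc);
    for (const auto& kv : src.children) {
        std::unique_ptr<basic_node<wchar_t>>& child = dst.children[widen(kv.first, loc)];
        child.reset(new basic_node<wchar_t>());
        widen_tree(*kv.second, *child, loc);
    }
}
```

The parser itself stays tied to a single character type; the conversion is a separate pass over an already loaded tree.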

Andy Little wrote:
Why are all ptree nodes the same type? Commonly a tree will have branch and leaf nodes. In the debug example there are two branch nodes and 4 leaf nodes, which means two empty lists. That is wasteful, isn't it?
How do you distinguish, though? What happens when you add a node as a child of what formerly was a leaf node? How can you change the type of the node when some places (including, most likely, the caller) are holding a reference? How would you define value_type? Would you make the tree node polymorphic? Isn't that even more wasteful? (Not to mention cumbersome.) An empty list is just two pointers, and perhaps a count.
Is it necessary to make key a string. Could it not also be (say) an integer id?
It's certainly possible in theory, and the key_type typedef in the traits should make it possible in practice too. From looking at the implementation, it appears, however, that the key type must provide the std::string interface. This makes sense, in a way, as the keys can be concatenated to directly access deep properties. The question is whether making this more flexible, either by having the traits supply a conversion from string to key and do the lookup by tokenizing and converting each token, or by having the traits supply a type-specific concatenation operation, is worth the trouble. Using any key type but string makes most of the readers and writers useless: the existence of any XML where all element and attribute names are numbers is doubtful (and perhaps even forbidden, I'd have to check the specs). In addition, such an interface makes the implementation and the traits that much more complicated. Sebastian Redl

Sebastian Redl wrote:
Andy Little wrote:
...
Is it necessary to make key a string. Could it not also be (say) an integer id?
It's certainly possible in theory, and the key_type typedef in the traits should make it possible in practice too. From looking at the implementation, it appears, however, that the key type must provide the std::string interface. This makes sense, in a way, as the keys can be concatenated to directly access deep properties.
I haven't had a chance to look at the library yet, but I'd have thought there would be a corresponding path concept/class. The std::string in the interface sounds overly restrictive. Just as the string algorithm is not restricted to std::(w)string.
The question is whether making this more flexible, either by having the traits supply a conversion from string to key and do the lookup by tokenizing and converting each token, or by having the traits supply a type-specific concatenation operation, is worth the trouble. Using any key type but string makes most of the readers and writers useless: the existence of any XML where all element and attribute names are numbers is doubtful (and perhaps even forbidden, I'd have to check the specs).
Is property_tree then actually an XML library?
In addition, such an interface makes the implementation and the traits that much more complicated.
I'll try to take a more in depth look at the lib soon. Jeff Flinn

Jeff Flinn wrote:
Is property_tree then actually an XML library?
No, but I would imagine that read/write_xml will be among the most common uses of the library. Besides, having only numeric keys is unusual for the other formats too: JSON, INI, etc. (Not forbidden, just unusual.) BTW, I've looked it up, and an XML element or attribute name may indeed only start with an underscore, a colon, or a "Letter", which is defined as a wide range of Unicode characters, but not including digits. So for XML, it is forbidden.

Sebastian Redl

Jeff Flinn wrote:
I haven't had a chance to look at the library yet, but I'd have thought there would be a corresponding path concept/class. The std::string in the interface sounds overly restrictive. Just as the string algorithm is not restricted to std::(w)string.
Hmm ... something along the lines of this?

    struct ptree_traits {
        typedef std::string key_type;
        typedef std::string path_type;
        typedef char separator_type;
        static const separator_type default_separator = '.';

        // static, so the traits class never has to be instantiated
        static std::pair<key_type, path_type> split_path(const path_type &p, separator_type s) {
            path_type::size_type offset = p.find(s);
            if(offset == path_type::npos) {
                // Do whatever is to do in this situation. One possibility
                // (an assumption): treat p as the last key, with nothing left over.
                return std::make_pair(p, path_type());
            } else {
                return std::make_pair(p.substr(0, offset), p.substr(offset+1));
            }
        }
    };

    struct radixtree_traits {
        typedef int key_type;
        typedef int path_type;
        typedef int separator_type;
        static const separator_type default_separator = 10;

        static std::pair<key_type, path_type> split_path(const path_type &p, separator_type s) {
            return std::make_pair(p % s, p / s);
        }
    };

Other path types (e.g. a vector of ints) would just ignore the separator. Sounds good to me. Not as complicated as I expected either.

Sebastian Redl
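To see how a lookup could be driven entirely by such traits, here is a self-contained sketch of a generic dissection loop. The empty() member, the default_separator() function form and the dissect template are assumptions added for this illustration; only split_path follows the traits outlined in the message above.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// String paths: keys are separated by a character.
struct string_path_traits {
    typedef std::string key_type;
    typedef std::string path_type;
    typedef char separator_type;
    static separator_type default_separator() { return '.'; }
    static bool empty(const path_type& p) { return p.empty(); }
    static std::pair<key_type, path_type> split_path(const path_type& p, separator_type s) {
        const path_type::size_type offset = p.find(s);
        if (offset == path_type::npos) return std::make_pair(p, path_type());
        return std::make_pair(p.substr(0, offset), p.substr(offset + 1));
    }
};

// Integer paths: each digit in base `s` is one key, least significant first.
struct radix_path_traits {
    typedef int key_type;
    typedef int path_type;
    typedef int separator_type;
    static separator_type default_separator() { return 10; }
    static bool empty(path_type p) { return p == 0; }
    static std::pair<key_type, path_type> split_path(path_type p, separator_type s) {
        return std::make_pair(p % s, p / s);
    }
};

// Repeatedly split head from tail until nothing is left of the path.
template <class Traits>
std::vector<typename Traits::key_type> dissect(
        typename Traits::path_type p,
        typename Traits::separator_type s = Traits::default_separator()) {
    std::vector<typename Traits::key_type> keys;
    while (!Traits::empty(p)) {
        const std::pair<typename Traits::key_type, typename Traits::path_type> parts =
            Traits::split_path(p, s);
        keys.push_back(parts.first);
        p = parts.second;
    }
    return keys;
}
```

With these definitions, `dissect<string_path_traits>("a.b.c")` yields the keys a, b, c, while `dissect<radix_path_traits>(123)` yields 3, 2, 1. A tree lookup would descend one child per key instead of collecting them in a vector.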

"Sebastian Redl" <sebastian.redl@getdesigned.at> wrote in message news:444656AB.2010207@getdesigned.at...
Andy Little wrote:
Why are all ptree nodes the same type? Commonly a tree will have branch and leaf nodes. In the debug example there are two branch nodes and 4 leaf nodes, which means two empty lists. That is wasteful, isn't it?
How do you distinguish, though?
The obvious way is to make a node an abstract base class with interface functions such as (say) is_leaf(), is_branch(). Alternatively, provide the abstract base class as a private member of node and manage it privately, changing it from a branch to a leaf as required. User-defined leaves hold numeric, coordinate, string, arrays ... whatever, efficiently. The interface would probably need an extensible mechanism so that a leaf could be interrogated to find out what type of data it holds. The obvious one is an integer type id per type.
What happens when you add a node as a child of what formerly was a leaf node?
You would have to change the leaf to a branch. This could be automatic if the actual type of the node was encapsulated as described above. OTOH if you tried to add a child to a leaf it would clearly be an error.
How can you change the type of the node when some places (including, most likely, the caller) are holding a reference?
In the tree I describe you would copy data or interrogate nodes rather than hold references to nodes. In that case don't hold references. (If a reference is essential you may need to lock or refcount/release the branch to prevent modification.) In general, use the keys to access mutable data. That's what they are for. If a caller provides a node reference in a function call expecting the node to be filled, then one would change the type of the argument from a node to a branch, as only branches can have nodes added, so inputting a node argument makes no sense anyway.
How would you define value_type?
This would be the exclusive concern of leaf nodes.
Would you make the tree node polymorphic?
Yes.
Isn't that even more wasteful? (Not to mention cumbersome.) An empty list is just two pointers, and perhaps a count.
Without trying out an alternative design I don't know if it would be more wasteful. If the hierarchy is flat then polymorphism can be cheap AFAIK. The major use I thought of is a scene graph. This typically consists of a large number of nodes (can be a many-MB file), many containing structures of points and transforms. I suspect it's cheaper to keep these in memory in their binary format rather than as strings. Whatever... it would be interesting to see the rationale behind the design decisions made within the documentation. I *think* that a trade-off has been made in favour of convenience and 'light weight' (I think it would not perform well on large files, for example) against compactness and performance. That is fine if it's explicitly stated as the aim.
Is it necessary to make key a string. Could it not also be (say) an integer id?
It's certainly possible in theory, and the key_type typedef in the traits should make it possible in practice too. From looking at the implementation, it appears, however, that the key type must provide the std::string interface. This makes sense, in a way, as the keys can be concatenated to directly access deep properties. The question is whether making this more flexible, either by having the traits supply a conversion from string to key and do the lookup by tokenizing and converting each token, or by having the traits supply a type-specific concatenation operation, is worth the trouble. Using any key type but string makes most of the readers and writers useless: the existence of any XML where all element and attribute names are numbers is doubtful (and perhaps even forbidden, I'd have to check the specs). In addition, such an interface makes the implementation and the traits that much more complicated.
Maybe the tree with the string key type is a refinement of a more generic property tree? regards Andy Little

The major use I thought of is a scene graph. This typically consists of a large number of nodes (the file can be many MB), many containing structures of points and transforms. I suspect it's cheaper to keep these in memory in their binary format rather than as strings.
What you need to store a scene graph is a generic tree container, not ptree. Property tree is for storing properties. Anyway, if you want you can consider customizing data type to be some sort of Object *, but I don't know if this is going to take you very far.
Whatever... it would be interesting to see the rationale behind the design decisions made within the documentation. I *think* that a trade off has been made in favour of convenience and 'light weight' (I think it would not perform well on large files for example)
Yes, being light-weight and easy to use is the main goal. I think this is said in the introduction part of the docs.
Is it necessary to make key a string. Could it not also be (say) an integer id?
Library would need some sort of path parsing policy. I think that the only required function would be to separate head of the path from the tail (sort of Lisp-like behaviour). This is quite an interesting proposition, I consider it the most valuable addition to the library at the moment. Thank you, Marcin
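The head/tail separation mentioned above could be sketched roughly as follows. This is only an illustration: the name string_path_policy, its members, and the '.' separator are assumptions made for the example, not part of the library under review.

```cpp
#include <string>
#include <utility>

// Hypothetical path policy for string paths such as "a.b.c".
// The struct name, its members and the '.' separator are assumptions.
struct string_path_policy {
    using path_type = std::string;
    using key_type = std::string;

    static bool empty(const path_type& p) { return p.empty(); }

    // Returns {head, tail}. The tail is empty when the path has one component.
    static std::pair<key_type, path_type> split(const path_type& p, char sep = '.') {
        const std::string::size_type pos = p.find(sep);
        if (pos == std::string::npos) return {p, path_type()};
        return {p.substr(0, pos), p.substr(pos + 1)};
    }
};
```

With this sketch, splitting "a.b.c" yields the head "a" and the tail "b.c"; a lookup would descend into the child named by the head and repeat on the tail until the tail is empty.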

"Marcin Kalicinski" wrote
The major use I thought of is a scene graph. This typically consists of a large number of nodes (the file can be many MB), many containing structures of points and transforms. I suspect it's cheaper to keep these in memory in their binary format rather than as strings.
What you need to store a scene graph is a generic tree container, not ptree. Property tree is for storing properties. Anyway, if you want you can consider customizing data type to be some sort of Object *, but I don't know if this is going to take you very far.
AFAICS the current ptree could be a refinement of a generic tree container, however the current design of ptree is not so defined and it's probably not going to be possible to 'reverse engineer' genericity from the current design. One example: ptree doesn't distinguish the concepts of a branch and a leaf. It is possible to imagine an alternative design in which a branch can be iterated over whereas a leaf cannot, and the 'data' in a branch is a container of nodes whereas the data in a leaf is not. Did you consider a generic tree design? If so why did you reject it in favour of this one?
Whatever... it would be interesting to see the rationale behind the design decisions made within the documentation. I *think* that a trade off has been made in favour of convenience and 'light weight' (I think it would not perform well on large files for example)
Yes, being light-weight and easy to use is the main goal. I think this is said in the introduction part of the docs.
But it would be interesting to know why you have chosen this design over others. That is not stated in the docs AFAICS. It would help when I wanted to use such a library because it could help me to make a fast decision as to whether it was suitable. BTW the current 'Rationale' section looks more like a 'FAQ' section FWIW.
Is it necessary to make key a string. Could it not also be (say) an integer id?
Library would need some sort of path parsing policy. I think that the only required function would be to separate head of the path from the tail (sort of Lisp-like behaviour). This is quite an interesting proposition, I consider it the most valuable addition to the library at the moment.
One option I envisage is to make a symbol table, where each new string token encountered is given an integer id, and only the id's are stored in the tree. The intent being to speed up key lookups. In this case a path to a node consists of a linear container (e.g a vector) of integer ids. That might go against the 'light-weight' nature of the design though. In such a design a 'path' would be a concept, and a vector of integer ids might be a model of path as might a string with separators. That would be a move towards a 'generic tree container' approach. I guess though that whether that has potential, depends on your answers to my questions above. regards Andy Little
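A minimal sketch of the symbol-table idea, assuming keys are interned on first sight and a path is a vector of the resulting integer ids. The class symbol_table and its member names are hypothetical and only illustrate the proposal.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical symbol table: each distinct key string gets a small integer id.
class symbol_table {
public:
    // Returns the id for token, assigning the next free id on first sight.
    int intern(const std::string& token) {
        auto it = ids_.find(token);
        if (it != ids_.end()) return it->second;
        const int id = static_cast<int>(names_.size());
        names_.push_back(token);
        ids_.emplace(token, id);
        return id;
    }

    // Maps an id back to its string; throws std::out_of_range for unknown ids.
    const std::string& name(int id) const {
        return names_.at(static_cast<std::size_t>(id));
    }

    // Converts a separator-delimited string path into a vector of ids.
    std::vector<int> to_path(const std::string& text, char sep = '.') {
        std::vector<int> path;
        std::string::size_type begin = 0;
        while (true) {
            const std::string::size_type end = text.find(sep, begin);
            path.push_back(intern(text.substr(begin, end - begin)));
            if (end == std::string::npos) break;
            begin = end + 1;
        }
        return path;
    }

private:
    std::unordered_map<std::string, int> ids_;
    std::vector<std::string> names_;
};
```

In this sketch "debug.level" and "debug.filename" share the id of "debug", so comparing keys during lookup becomes an integer comparison. The cost is the extra table, which is the tension with the 'light-weight' goal noted above.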

"Andy Little" wrote
Did you consider a generic tree design? If so why did you reject it in favour of this one?
Just to refresh... The above is the most interesting and yet unanswered question about property tree for me. Am I missing something? Is this a silly question? Is it too trivial to answer? regards Andy Little

"Andy Little" <andy@servocomm.freeserve.co.uk> wrote in message news:e2a1p5$5he$1@sea.gmane.org...
"Andy Little" wrote
Did you consider a generic tree design? If so why did you reject it in favour of this one?
Just to refresh... The above is the most interesting and yet unanswered question about property tree for me. Am I missing something? Is this a silly question? Is it too trivial to answer?
I would agree that this is an interesting point.
- ptree integrates too many member functions that really should be non-member utilities (e.g. path solving, value<->ptree conversions). Fixing this is a must IMHO, I wouldn't want to accept another std::string like beast (too much built-in, yet never enough, so you end up with an inconsistent mix of member & non-member interfaces)
- in some ways, since it is already all-templated, maybe it could be made even more general (e.g. boost::any based values or nodes?)
On the other hand, loading a DOM-like tree structure into memory, and being able to manipulate it, is I think quite a common need. By design, it also supports I/O to a variety of formats, which is also very nice.
boost::serialize is nice, but it goes straight from binary object to external stream (or inversely), with no intermediate representation. And maybe there lies an interesting interface point: Should ptree be a potential target / archive type for boost::serialize?
From this perspective, what feature set would we want to see in the ptree structure ?
-- http://ivan.vecerina.com/contact/?subject=NG_POST <- email contact form

On the other hand, loading a DOM-like tree structure into memory, and being able to manipulate it, is I think quite a common need. By design, it also supports I/O to a variety of formats, which is also very nice.
boost::serialize is nice, but it goes straight from binary object to external stream (or inversely), with no intermediate representation. And maybe there lies an interesting interface point:
I am missing your point. What intermediate representation do you need? IOW What is wrong with multi_index + serialize library for what you need? Gennadiy

Gennadiy Rozental <gennadiy.rozental <at> thomson.com> writes:
I am missing your point. What intermediate representation do you need? IOW What is wrong with multi_index + serialize library for what you need?
One thing that property_tree (PT) seems to do is to, from an end-user's viewpoint, handle simple and quite complex data and configuration files in textual format in a straightforward manner. This is very nice and goes hand in hand with the old unix tradition. To me it is overkill to bring in two complex libraries like multi_index and serialize to handle a "simple" config file. By the way, is it even possible to have serialize work with textual files or does it always use a binary format? If it is always binary, the combination of multi_index and serialize is disqualified for a simple config file. /Rune

"Rune" <rune.sune@yahoo.com> wrote in message news:loom.20060421T181602-979@post.gmane.org...
Gennadiy Rozental <gennadiy.rozental <at> thomson.com> writes:
I am missing your point. What intermediate representation do you need? IOW What is wrong with multi_index + serialize library for what you need?
One thing that property_tree (PT) seems to do is to, from an end-user's viewpoint, handle simple and quite complex data and configuration files in textual format in a straight-forward manner.
The Serialization library would make this even more straightforward: no need for all these string <-> data type conversions all the time.
This is very nice and goes hand in hand with the old unix tradition.
To me it is overkill to bring in two complex libraries like multi_index and serialize to handle a "simple" config file.
For a simple program you don't need to save config parameters programmatically anyway; you could edit the file directly. But if you already have some kind of GUI to update the program configuration at runtime, you could afford to include the multi_index and serialize libraries as well.
By the way, is it even possible to have serialize work with textual files or does it always use a binary format?
Quite possible. I know for a fact it has XML support. Gennadiy

Rune wrote:
One thing that property_tree (PT) seems to do is to, from an end-user's viewpoint, handle simple and quite complex data and configuration files in textual format in a straight-forward manner.
My thoughts exactly. Property Tree is understandable within minutes, although the interface could certainly be improved. (I find myself sharing the voiced concerns about too much of the interface being inside the class.)
To me it is overkill to bring in two complex libraries like multi_index and serialize to handle a "simple" config file. By the way, is it even possible to have serialize work with textual files or does it always use a binary format? If it is always binary, the combination of multi_index and serialize is disqualified for a simple config file.
Serialization can handle various formats, and is explicitly designed to be extensible. One of the formats is XML. However, Serialization does not handle arbitrary (simple) XML, AFAIK. Sebastian Redl

To me it is overkill to bring in two complex libraries like multi_index and serialize to handle a "simple" config file. By the way, is it even possible to have serialize work with textual files or does it always use a binary format? If it is always binary, the combination of multi_index and serialize is disqualified for a simple config file.
Serialization can handle various formats, and is explicitly designed to be extensible. One of the formats is XML. However, Serialization does not handle arbitrary (simple) XML, AFAIK.
1. Why do you care? What difference does it make what the format of the permanent storage you are using for your configuration is? You are not going to edit it manually anyway. 2. If you insist, you could always implement your own format SimpleXML instead of reinventing the wheel. Gennadiy

However, Serialization does not handle arbitrary (simple) XML, AFAIK.
1. Why do you care? What difference does it make what the format of the permanent storage you are using for your configuration is? You are not going to edit it manually anyway.
Why not? Hand-editing of configuration files is a common Unix tradition, to paraphrase one of the previous posters. Plus XML, I believe, is meant to be very much hand-editable. Otherwise what would be its advantage over any binary format?
2. If you insist you could always implement your own format SimpleXML instead of reinventing the wheel.
I think implementing your own format looks very much like reinventing the wheel. You could also rewrite all the world's software in asm, or implement your own version of C++, but why would you do that if property_tree can do what you need in several lines of code? Best regards, Marcin

"Marcin Kalicinski" <kalita@poczta.onet.pl> wrote in message news:e2bk84$k1b$1@sea.gmane.org...
However, Serialization does not handle arbitrary (simple) XML, AFAIK.
1. Why do you care? What difference does it make what the format of the permanent storage you are using for your configuration is? You are not going to edit it manually anyway.
Why not? Hand-editing of configuration files is a common Unix tradition to paraphrase one of the previous posters.
There should be a compelling reason for you to delve into your automatically generated XML file. Not a common problem, I am sure.
Plus XML, I believe, is meant to be very much hand-editable.
It's a matter of opinion how XML is meant to be edited. Why do you think there are so many XML editors?
Otherwise what would be its advantage over any binary format?
Almost none. I would use binary format. But XML has nice advantage that I could use any number of existing tools to display/process it.
2. If you insist you could always implement your own format SimpleXML instead of reinventing the wheel.
I think implementing your own format looks very much like reinventing the wheel.
I don't plan to introduce any new formats. I just propose not to reinvent infrastructure for permanent storage support.
You could also rewrite all the world's software in asm, or implement your own version of C++, but why would you do that if property_tree can do what you need in several lines of code?
property_tree could "rewrite all the world's software in asm, or implement your own version of C++" in several lines of code ;))? Sorry, I am missing your point here. Gennadiy

Plus XML, I believe, is meant to be very much hand-editable.
It's a matter of opinion how XML is meant to be edited. Why do you think there are so many XML editors?
Otherwise what would be its advantage over any binary format?
Almost none. I would use binary format. But XML has nice advantage that I could use any number of existing tools to display/process it.
But you just said you do not need to edit it. Why do these tools exist then? I'm quite sure (and I see some other posters are as well) that editing of XML is very important. This is basically its most important "selling point".
You could also rewrite all the world's software in asm, or implement your own version of C++, but why would you do that if property_tree can do what you need in several lines of code?
property_tree could "rewrite all the world's software in asm, or implement your own version of C++" in several lines of code ;))? Sorry, I am missing your point here.
Sorry, I was involuntarily straying off the topic with my comments. What I meant is: you keep saying that this library could rather be implemented using serialization / PO / multi_index / etc. I agree, but it was implemented differently. I think it is the interface that matters most, not the implementation. Why care if it uses other libs or not? It's even better if it does not, think about all these dependencies and compile times. Even boost guidelines are pretty clear on that, I should not use other libs unless they provide critical functionality. The current implementation is very well covered by tests (more than 1 line of tests for 1 line of code), and none of the posters so far had any problems compiling it. I used it with quite large data files, and I know there are no showstopping performance problems. Performance will definitely be improved once I implement some of the suggestions that were already posted (namely path objects). So, I do not see a reason why I should rewrite it as a layer on top of the above-mentioned libraries, especially since that would detract from some of its legitimate uses. Best regards, Marcin

"Marcin Kalicinski" <kalita@poczta.onet.pl> writes:
What I meant is: you keep saying that this library could be rather implemented using serialization / PO / multi_index / etc. I agree, but it was implemented differently. I think it is the interface that matters most, not the implementation. Why care if it uses other libs or not? It's even better if it does not, think about all these dependencies and compile times.
That doesn't necessarily follow. If my project needs the other libraries and your library could have used them, too, you're increasing compile times in the worst case.
Even boost guidelines are pretty clear on that, I should not use other libs unless they provide critical functionality.
Where did you find that guideline? -- Dave Abrahams Boost Consulting www.boost-consulting.com

Even boost guidelines are pretty clear on that, I should not use other libs unless they provide critical functionality.
Where did you find that guideline?
http://boost.org/more/library_reuse.htm "A Boost library should use other Boost Libraries or the C++ Standard Library, but only when the benefits outweigh the costs." Perhaps "critical functionality" was an overstatement, but anyway my point remains correct. Best regards, Marcin

Marcin Kalicinski wrote:
Even boost guidelines are pretty clear on that, I should not use other libs unless they provide critical functionality.
Where did you find that guideline?
http://boost.org/more/library_reuse.htm
"A Boost library should use other Boost Libraries or the C++ Standard Library, but only when the benefits outweigh the costs."
Perhaps "critical functionality" was an overstatement, but anyway my point remains correct.
OTOH, that quote can be interpreted as just about anything :-) I think it is important to focus on the interface in the review, but I also see several benefits of an implementation that builds on Boost.MultiIndex:
- fewer bugs like the one Joaquín found
- better space efficiency
- exception-safety guarantees are immediately fulfilled (I haven't looked, but I suspect that there are several bugs in this area)
Also, Joaquín had a *very* interesting idea about flattening queries like "foo.*" -Thorsten

"Marcin Kalicinski" <kalita@poczta.onet.pl> writes:
Even boost guidelines are pretty clear on that, I should not use other libs unless they provide critical functionality.
Where did you find that guideline?
http://boost.org/more/library_reuse.htm
"A Boost library should use other Boost Libraries or the C++ Standard Library, but only when the benefits outweigh the costs."
Which leaves a lot open to interpretation.
Perhaps "critical functionality" was an overstatement, but anyway my point remains correct.
That's not clear to me at all. There's no inherent good in avoiding dependencies among Boost libraries. It's a judgement call, based on many factors, including the author's personal preference. You have every right to make that call yourself, but if you're trying to somehow *justify* your decision, IMO so far you haven't said anything that demonstrates it to be the best one. -- Dave Abrahams Boost Consulting www.boost-consulting.com

That's not clear to me at all. There's no inherent good in avoiding dependencies among Boost libraries. It's a judgement call, based on many factors, including the author's personal preference. You have every right to make that call yourself, but if you're trying to somehow *justify* your decision, IMO so far you haven't said anything that demonstrates it to be the best one.
The reason why ptree does not use multi index is because implementation existed long before I considered submitting to boost, probably before even I knew of multi index existence. It was working well. Later, when I was improving it during pre-review process, I seriously considered using multi-index. But I decided it is not worth throwing everything out. Although ptree has large interface with many functions modifying state of the tree, it uses "single point of change" approach. Every insert eventually goes through one function, which takes care of exception safety and keeping index in sync with data. The same applies to erase. This function has 9 lines of code in case of insert, and (by coincidence) also 9 in case of erase. By using multi index these functions would obviously be simplified, maybe to 4 lines each. Net gain: 10 lines of code (out of several hundred in ptree_implementation.hpp). I'm aware that there are performance gains to be reaped as well, but at that time I was rather focusing on getting the interface right. Best regards, Marcin
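The "single point of change" approach described above might look roughly like this. It is a sketch under assumptions, not the code in ptree_implementation.hpp: the class indexed_children, the list-plus-multimap layout, and the member names are all invented for illustration.

```cpp
#include <cstddef>
#include <iterator>
#include <list>
#include <map>
#include <string>
#include <utility>

// Hypothetical container: children kept in insertion order, plus a key index.
class indexed_children {
public:
    using item = std::pair<std::string, std::string>;   // key, data
    using iterator = std::list<item>::iterator;

    // The only function that adds elements; other inserting members would call it.
    iterator insert(const std::string& key, const std::string& data) {
        items_.emplace_back(key, data);        // step 1: store the data
        iterator it = std::prev(items_.end());
        try {
            index_.emplace(key, it);           // step 2: update the index
        } catch (...) {
            items_.pop_back();                 // roll back step 1, then rethrow
            throw;
        }
        return it;
    }

    // The only function that removes elements; the index entry goes first.
    void erase(iterator it) {
        auto range = index_.equal_range(it->first);
        for (auto i = range.first; i != range.second; ++i) {
            if (i->second == it) { index_.erase(i); break; }
        }
        items_.erase(it);
    }

    std::size_t size() const { return items_.size(); }
    std::size_t count(const std::string& key) const { return index_.count(key); }

private:
    std::list<item> items_;
    std::multimap<std::string, iterator> index_;
};
```

Because the list and the index are only ever modified together inside these two functions, a failed index update leaves the container as it was before the call. A container such as Boost.MultiIndex provides this kind of synchronisation internally, which is the simplification being weighed in this exchange.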

"Marcin Kalicinski" <kalita@poczta.onet.pl> writes:
That's not clear to me at all. There's no inherent good in avoiding dependencies among Boost libraries. It's a judgement call, based on many factors, including the author's personal preference. You have every right to make that call yourself, but if you're trying to somehow *justify* your decision, IMO so far you haven't said anything that demonstrates it to be the best one.
The reason why ptree does not use multi index is because implementation existed long before I considered submitting to boost, probably before even I knew of multi index existence. It was working well. Later, when I was improving it during pre-review process, I seriously considered using multi-index. But I decided it is not worth throwing everything out.
That's perfectly reasonable, but (through no fault of yours) it misses the point I was trying to make. I guess I should have said, "...that demonstrates it to be the best implementation." All I'm saying is that the extent to which a Boost library implementation should leverage other Boost libraries is not a question that can always be decided based on following simple guidelines, and that if this library is accepted, it's worth revisiting your decision. -- Dave Abrahams Boost Consulting www.boost-consulting.com

"Ivan Vecerina" wrote
"Andy Little" wrote news:e2a1p5$5he$1@sea.gmane.org...
"Andy Little" wrote
Did you consider a generic tree design? If so why did you reject it in favour of this one?
Just to refresh... The above is the most interesting and yet unanswered question about property tree for me. Am I missing something? Is this a silly question? Is it too trivial to answer?
I would agree that this is an interesting point. - ptree integrates too many member functions that really should be non-member utilities (e.g. path solving, value<->ptree conversions) Fixing this is a must IMHO, I wouldn't want to accept another std::string like beast (too much built-in, yet never enough, so you end up with an inconsistent mix of member & non-member interfaces)
- in some ways, since it is already all-templated, maybe it could be made even more general (e.g. boost::any based values or nodes?)
On the other hand, loading a DOM-like tree structure into memory, and being able to manipulate it, is I think quite a common need.
Another common concept for Property Tree is the Path functionality of Boost.Filesystem. A Filesystem is AFAICS directly representable in a Property Tree. regards Andy Little

"Andy Little" wrote
Did you consider a generic tree design? If so why did you reject it in favour of this one?
Just to refresh... The above is the most interesting and yet unanswered question about property tree for me. Am I missing something? Is this a silly question? Is it too trivial to answer?
This has actually been brought up a couple of times before if you search in the archives. A generic tree structure has been asked for numerous times. I realize boost::graph covers this, but it's very cumbersome when you just want a simple hierarchical tree structure. I think BGL doesn't allow easy (just rearrange the pointers) splices of branches either. You have to copy the branch out of the graph, then delete it from the old graph. FWIW, the implementation by Adobe at http://opensource.adobe.com/classadobe_1_1forest.html seems to be a pretty good one. -Michael Fawcett

"Andy Little" <andy@servocomm.freeserve.co.uk> wrote in message news:e2a1p5$5he$1@sea.gmane.org...
"Andy Little" wrote
Did you consider a generic tree design? If so why did you reject it in favour of this one?
Just to refresh... The above is the most interesting and yet unanswered question about property tree for me. Am I missing something? Is this a silly question? Is it too trivial to answer?
IMO the same result (as library presents) could be achieved just by using multi_index. Gennadiy

"Gennadiy Rozental" wrote
"Andy Little" wrote
"Andy Little" wrote
Did you consider a generic tree design? If so why did you reject it in favour of this one?
Just to refresh... The above is the most interesting and yet unanswered question about property tree for me. Am I missing something? Is this a silly question? Is it too trivial to answer?
IMO the same result (as library presents) could be achieved just by using multi_index.
There is a concept of a Path in Property Tree whereas multi_index has no path concept but rather a unique key per element and concerns itself with returning a particular view on a flat set of data. OTOH the tree is a fixed hierarchical structure where the position of an element in the structure is relevant (two elements can have the same name but can be distinguished by their positions), whereas multi_index presents a subset of a flat (non-hierarchical) collection of elements. In the tree the particular properties of the elements play no part in their ordering, whereas in multi_index their ordering in a particular view is a direct function of some particular properties of the elements. IOW there seems to me to be a great deal of difference between Tree and Multi-index. regards Andy Little

"Andy Little" <andy@servocomm.freeserve.co.uk> wrote in message news:e2c9jq$c8g$1@sea.gmane.org...
"Gennadiy Rozental" wrote
"Andy Little" wrote
"Andy Little" wrote
Did you consider a generic tree design? If so why did you reject it in favour of this one?
Just to refresh... The above is the most interesting and yet unanswered question about property tree for me. Am I missing something? Is this a silly question? Is it too trivial to answer?
IMO the same result (as library presents) could be achieved just by using multi_index.
There is a concept of a Path in Property Tree whereas multi_index has no path concept but rather a unique key per element and concerns itself with returning a particular view on a flat set of data. OTOH the tree is a fixed hierarchical structure where the position of an element in the structure is relevant (two elements can have the same name but can be distinguished by their positions), whereas multi_index presents a subset of a flat (non-hierarchical) collection of elements. In the tree the particular properties of the elements play no part in their ordering, whereas in multi_index their ordering in a particular view is a direct function of some particular properties of the elements.
IOW there seems to me to be a great deal of difference between Tree and Multi-index.
I never said they are the same. I said "could be". It also "could be" achieved with use of some tree data structure. And in some cases just a plain std::map will suffice as well. Gennadiy
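One way to read the "plain std::map" remark is a flat map keyed by the full path, where a "foo.*" style query becomes a range scan over sorted keys. The sketch below is an illustration only; the alias flat_properties and the function children_of are hypothetical names, and the '.' separator is an assumption.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical flat representation: the full path is the key.
using flat_properties = std::map<std::string, std::string>;

// Returns every entry whose key starts with prefix + '.', i.e. a "foo.*" query.
// Keys sharing a prefix are contiguous in a sorted map, so one scan is enough.
std::vector<std::pair<std::string, std::string>>
children_of(const flat_properties& props, const std::string& prefix) {
    const std::string start = prefix + '.';
    std::vector<std::pair<std::string, std::string>> out;
    for (auto it = props.lower_bound(start);
         it != props.end() && it->first.compare(0, start.size(), start) == 0;
         ++it) {
        out.emplace_back(it->first, it->second);
    }
    return out;
}
```

The sketch also shows the limits raised earlier in the thread: keys in a std::map are unique and sorted, so two siblings with the same name, or the original order of siblings, cannot be represented without extra work.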

On Apr 21, 2006, at 12:32 AM, Andy Little wrote:
"Andy Little" wrote
Did you consider a generic tree design? If so why did you reject it in favour of this one?
Just to refresh... The above is the most interesting and yet unanswered question about property tree for me.
As I've lurked in this discussion, this is the question I've had in the back of my mind, too. Some of the most useful (imho) elements of the standard c++ library are collections. We don't spend much time anymore writing associative arrays, variable length arrays, or linked lists. Instead, we have a zoo of containers provided in the library and just populate them with data. I have long wondered why there aren't more attempts to do the same for a tree, something that separates the "tree-ness" of the data structure from the "data-ness" of the data structure similar to the way that std::list and std::map isolate data from structure.
Along those lines, I have regularly used an open source implementation of a generic tree structure that I have come to dearly love: the forest class from the Adobe Source Library <http://opensource.adobe.com/group__forest__related.html>. If it's not terribly overreaching, I humbly suggest that adobe::forest would be a good model for boost to adopt as a generic tree structure.
Disclaimer: I am an Adobe employee and a sometimes contributor to the Adobe Source Library.
Regards, Eric
-----------------------------------------------------------------------
Eric Berdahl, Senior Computer Scientist, Adobe Systems Incorporated, berdahl@serendipity.org
"No job is too big. No fee is too big." - Dr. Peter Venkman, "Ghostbusters"
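To make the "tree-ness" versus "data-ness" separation concrete, here is a minimal sketch of a tree whose structure is independent of its payload type. It is not the adobe::forest interface; tree_node, add_child and preorder are hypothetical names chosen for the example.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical generic tree node: the structure knows nothing about T.
template <class T>
struct tree_node {
    T value;
    std::vector<tree_node> children;   // vector of an incomplete type: OK since C++17

    explicit tree_node(T v) : value(std::move(v)) {}

    // Appends a child and returns a reference to it. The reference is
    // invalidated by later add_child calls on the same parent.
    tree_node& add_child(T v) {
        children.emplace_back(std::move(v));
        return children.back();
    }
};

// Visits every node depth-first, parents before children.
template <class T, class Visitor>
void preorder(const tree_node<T>& node, Visitor visit, std::size_t depth = 0) {
    visit(node.value, depth);
    for (const tree_node<T>& child : node.children) {
        preorder(child, visit, depth + 1);
    }
}
```

The same template works for tree_node<std::string>, tree_node<int>, or a user-defined payload, in the way std::list<T> does for sequences. A production design would also need iterators and cheap splicing, which this sketch leaves out.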

Did you consider a generic tree design? If so why did you reject it in favour of this one?
Along those lines, I have regularly used an open source implementation of a generic tree structure that I have come to dearly love: the forest class from the Adobe Source Library <http://opensource.adobe.com/group__forest__related.html>. If it's not terribly overreaching, I humbly suggest that adobe::forest would be a good model for boost to adopt as a generic tree structure.
Someone else mentioned this and I just skimmed the tutorial, and it looks good. And my smile gets even bigger when I see the license (MIT) is boost-compatible. Instead of muttering about lack of boost::tree I'll give this a try next time I need a tree structure. Darren

Darren Cook wrote:
Did you consider a generic tree design? If so why did you reject it in favour of this one?
Along those lines, I have regularly used an open source implementation of a generic tree structure that I have come to dearly love: the forest class from the Adobe Source Library <http://opensource.adobe.com/group__forest__related.html>. If it's not terribly overreaching, I humbly suggest that adobe::forest would be a good model for boost to adopt as a generic tree structure.
Someone else mentioned this and I just skimmed the tutorial, and it looks good. And my smile gets even bigger when I see the license (MIT) is boost-compatible.
Instead of muttering about lack of boost::tree I'll give this a try next time I need a tree structure.
It looks nice, so now we just need somebody to make it into boost standards. The docs certainly need some work. -Thorsten

Hi Andy, First of all, sorry I did not reply sooner. The interest in the library is quite large and there are only so many posts I can answer per second ;-) Additionally, I spent most of today at the ACCU conference, and so have developed a huge backlog.
AFAICS the current ptree could be a refinement of a generic tree container, however the current design of ptree is not so defined and it's probably not going to be possible to 'reverse engineer' genericity from the current design.
I think generic tree is a material for another library. The question if property_tree should be based on one is just an implementation issue, not of much interest to users. The biggest virtue of property_tree is easy to use interface. If we try to make generic tree of it, it will be compromised. This should not happen, because people will then prefer to use Expat or MSXML instead.
[...] ptree doesn't distinguish the concepts of a branch and a leaf. It is possible to imagine an alternative design in which a branch can be iterated over whereas a leaf cannot, and the 'data' in a branch is a container of nodes whereas the data in a leaf is not.
I never before thought of using polymorphism to distinguish between branch and leaf nodes. The reason is I never saw need to. For example this would imply that as soon as you add children to leaf it changes type (i.e. must be destroyed and reconstructed - what happens to pointers/references somebody may be holding?). My feeling is that interface would get quite muddled if I adopted that sort of approach.
But it would be interesting to know why you have chosen this design over others. That is not stated in the docs AFAICS. It would help when I wanted to use such a library because it could help me to make a fast decision as to whether it was suitable.
Main reason was existing practice. I used this type of container for many years before I even considered writing any docs, let alone submitting it to boost. I had quite a lot of experience where and how it could be used. Plus the implementation I had - refined over several years - was starting to look mature. I knew it had general value, because it helped me on many projects. And the short answer is I never considered completely different implementation because the one I had worked, and I had other things to worry about :-)
BTW the current 'Rationale' section looks more like a 'FAQ' section FWIW.
That's right. I think it was bigger in the past, but many explanations proved to be ill thought out, and I had to remove them ;-) If this continues I might be forced to remove this section completely.
Library would need some sort of path parsing policy [...]
One option I envisage is to make a symbol table, where each new string token encountered is given an integer id, and only the ids are stored in the tree. The intent is to speed up key lookups. In this case a path to a node consists of a linear container (e.g. a vector) of integer ids.
I think this could be done if I implement key policies. You just define key type to be a vector and supply appropriate head/tail functions (which would be trivial in this case). Thank you, Marcin
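The symbol-table idea discussed above can be sketched roughly as follows. This is only an illustration, not part of the reviewed library: `SymbolTable`, `make_path`, `head` and `tail` are hypothetical names, and the dotted-path syntax is assumed to match the one used elsewhere in the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical symbol table: maps each distinct key token to a small integer id.
class SymbolTable {
public:
    // Return the id for `token`, assigning the next free id on first sight.
    int intern(const std::string& token) {
        auto it = ids_.find(token);
        if (it != ids_.end()) return it->second;
        const int id = static_cast<int>(names_.size());
        ids_.emplace(token, id);
        names_.push_back(token);
        return id;
    }

    // Reverse lookup, e.g. for writing the tree back to a file.
    const std::string& name(int id) const {
        return names_.at(static_cast<std::size_t>(id));
    }

private:
    std::unordered_map<std::string, int> ids_;
    std::vector<std::string> names_;
};

// Split a dotted path such as "debug.level" and intern each component.
std::vector<int> make_path(SymbolTable& table, const std::string& dotted) {
    std::vector<int> path;
    std::istringstream in(dotted);
    std::string token;
    while (std::getline(in, token, '.')) path.push_back(table.intern(token));
    return path;
}

// The head/tail functions a key policy would need are trivial for a vector.
int head(const std::vector<int>& path) {
    assert(!path.empty());
    return path.front();
}

std::vector<int> tail(const std::vector<int>& path) {
    assert(!path.empty());
    return std::vector<int>(path.begin() + 1, path.end());
}
```

With such a table, comparing two path components becomes an integer comparison, at the cost of one hash lookup per token when the path is built.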

The biggest virtue of property_tree is its easy-to-use interface. If we try to make a generic tree of it, that interface will be compromised. This should not happen, because people will then prefer to use Expat or MSXML instead.
Could you clarify (with details) how it is easier than the alternatives? Gennadiy

The biggest virtue of property_tree is its easy-to-use interface. If we try to make a generic tree of it, that interface will be compromised. This should not happen, because people will then prefer to use Expat or MSXML instead.
Could you clarify (with details) how it is easier than the alternatives?
I assume that by alternatives you mean the above-mentioned Expat and MSXML? First, property_tree supports more formats, not just XML. Second, it presents a unified interface to access data regardless of format. Third, to get a value from an XML file (or any other supported format) you literally need _three_ lines of code (not counting one #include). That would be much more in the case of any alternative I know of. Fourth, you do not have to link to an over 1 MB DLL (in the case of MSXML) just to read from your XML config file that the startup GUI window position is (200, 200). I believe this is enough to consider the library as having some viable uses? Best regards, Marcin

"Marcin Kalicinski" <kalita@poczta.onet.pl> wrote in message news:e2bjmg$i92$1@sea.gmane.org...
The biggest virtue of property_tree is its easy-to-use interface. If we try to make a generic tree of it, that interface will be compromised. This should not happen, because people will then prefer to use Expat or MSXML instead.
Could you clarify (with details) how it is easier than the alternatives?
I assume that by alternatives you mean above mentioned Expat and MSXML?
No, of course. I mean why do I need this half baked property_tree as another data structure?
First, property_tree supports more formats, not just XML.
Property tree supports nothing in itself. It's just a data structure. You have parsers that produce a property tree out of different sources. But you may as well produce maps or something else. Here, for example, is all that I need to do to "implement" similar functionality to your property tree:

    // Data structure itself
    template<typename ValueType,typename KeyType>
    struct Node;

    template<typename ValueType,typename KeyType>
    struct ptree_gen {
        typedef std::pair<KeyType,Node<ValueType,KeyType> > mi_value;
        typedef multi_index<mi_value, indexed_by<...> > type;
    };

    template<typename ValueType,typename KeyType>
    struct Node {
        ValueType v;
        ptree_gen<ValueType,KeyType>::type children;
    };

    // serialization support
    template<class Archive,typename ValueType,typename KeyType>
    void serialize(Archive & ar, Node<ValueType,KeyType>& n, const unsigned int version)
    {
        ar & n.v;
        ar & n.children;
    }

    // some access methods
    template<typename ValueType,typename KeyType>
    ValueType const&
    get( string const& keys, ptree_gen<ValueType,KeyType>::type const& src )
    {
        std::pair<string,string> sk = split( keys, "." );
        Node const& N = src.find( sk.first );
        return sk.second.empty() ? N.v : get( sk.second, N.children );
    }

Use it like this:

    ptree_gen<string,string>::type PT;
    boost::archive::text_iarchive ia( std::ifstream ifs("filename") );
    ia >> PT;
    string value = get( "a.b.c.d", PT );

Now tell me how the property_tree interface is easier? And what is the value in the 50k of code you need to implement this data structure? Gennadiy
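For readers who want to try the recursive dotted-path lookup from the sketch above, here is a minimal self-contained variant. It is an assumption-laden simplification, not Gennadiy's code: a `std::vector` of children stands in for the multi_index container, values are plain strings, serialization is omitted, and `Node` and `lookup` are illustrative names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Minimal node: a key, a value, and an ordered list of children.
// A std::vector of an incomplete type as a member is permitted since C++17.
struct Node {
    std::string key;
    std::string value;
    std::vector<Node> children;
};

// Return a pointer to the value at a dotted path such as "a.b.c",
// or nullptr if any component of the path is missing.
const std::string* lookup(const Node& node, const std::string& path) {
    if (path.empty()) return &node.value;
    const std::size_t dot = path.find('.');
    const std::string first = path.substr(0, dot);
    const std::string rest =
        (dot == std::string::npos) ? std::string() : path.substr(dot + 1);
    // Linear scan over the children; the first matching key wins.
    for (const Node& child : node.children) {
        if (child.key == first) return lookup(child, rest);
    }
    return nullptr;
}
```

Returning a pointer rather than a reference lets the caller distinguish a missing path from an empty value; the sketch in the message leaves that case open.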

    // Data structure itself
    template<typename ValueType,typename KeyType>
    struct Node;

    template<typename ValueType,typename KeyType>
    struct ptree_gen {
        typedef std::pair<KeyType,Node<ValueType,KeyType> > mi_value;
        typedef multi_index<mi_value, indexed_by<...> > type;
    };

    template<typename ValueType,typename KeyType>
    struct Node {
        ValueType v;
        ptree_gen<ValueType,KeyType>::type children;
    };

    // serialization support
    template<class Archive,typename ValueType,typename KeyType>
    void serialize(Archive & ar, Node<ValueType,KeyType>& n, const unsigned int version)
    {
        ar & n.v;
        ar & n.children;
    }

    // some access methods
    template<typename ValueType,typename KeyType>
    ValueType const&
    get( string const& keys, ptree_gen<ValueType,KeyType>::type const& src )
    {
        std::pair<string,string> sk = split( keys, "." );
        Node const& N = src.find( sk.first );
        return sk.second.empty() ? N.v : get( sk.second, N.children );
    }

What you just implemented is stripped down, bare bones version of property_tree that, among other things, does not allow you to produce human editable XML files. Now add more interface (aka get functions), add more archives to serialization lib, add customization, add transparent translation from strings to arbitrary types and vice versa. Spend some weeks trying to get all the corner cases right, and then some more weeks trying to smooth rough edges in the interface. Then write tests. Write docs. At the end, I believe you will not get much less code than there is in the library already. Maybe you get some savings by using multi_index instead of manual indexing. Best regards, Marcin

"Thorsten Ottosen" wrote:
I have the great pleasure to announce that the formal boost-review of Marcin Kalicinski Property Tree Library
I vote a weak yes to accept the library. Personally, I would prefer to wait for rev6 until some issues get resolved. Given that the review has started, I think it is better to accept the library, as it is very likely to get maintained and improved. Not accepting it would mean a 6-12 month delay and possibly abandonment.

I did a review of rev5 a few days ago; the notes are pasted below if anyone is interested.

/Pavel

-------------------------------------------------------------------

Hello Marcin,

I took a look at V5 and collected a few notes.

Most important, the documentation still needs quite a lot of work.

I read some noises about expanding the library to be an arbitrary tree and so on. IME this is very hard to develop and of rather low value for users. Ptree's main value is its simplicity for its intended task and this should not be compromised.

/Pavel

_____________________________________________________
1. docs: it is rather unusual for me to read code without syntax highlighting.

_____________________________________________________
2. The first mention of the use of exceptions in the docs (in debug_settings::load) should be a hyperlink to details. I would welcome every code snippet having such a link, very visible to people who copy + paste.

_____________________________________________________
3. The first mention of get_d() should explicitly say that "d" means default. If possible this word should be bolded.

_____________________________________________________
4. A curiosity: in the docs you write:

    Type of the value extracted is determined by type of second parameter,
    so we can simply write get_d(...) instead of get_d<int>(...). */
    m_level = pt.get_d("debug.level", 0);

If I do not specify the return type explicitly there may be a conversion made after the function returns.

If I specify the type explicitly there may be a conversion of the default parameter when the thread of execution enters the function.

There may be some interesting side effects with this. At the moment I do not see much use for the distinction or some idiom based on it, but perhaps it may be worth some attention.

_____________________________________________________
5. docs, "Property tree as a container": it may be explicitly stated here (or linked from here) how validity of iterators changes after delete/update/etc.

_____________________________________________________
6. Possible feature:

"...there is an additional indexing data structure inside property tree that allows fast searches..."

There may be a type trait that disables creating such a structure, where the search complexity would degrade gracefully to O(N). For example, a huge structure that is processed sequentially may not need it (and would be better off with more memory).

_____________________________________________________
7. The temptation to use std::lower_bound and similarly dangerous std algorithms may be eliminated by defining unimplemented specialisations for ptree (in the std:: namespace). This is allowed by the standard.

_____________________________________________________
8. docs, "Synopsis":

"Instances of this class are property trees." - the sentence may be better reworded, it sounds somehow redundant to me.

------------
The mention of ptree, wptree, iptree and wiptree should be hyperlinked - it is absolutely unclear what they are about.

------------
Perhaps tables could be used as a tool to combat the cryptic abbreviations. These tables would list these abbreviations and the rationale for each name. The docs hyperlinks may point to such tables for a quick overview. From the table one would be able to go deeper into the references.

---------------
The lines

    class ptree_error;
    class ptree_bad_path;
    class ptree_bad_data;

should have a comment that these classes are actually exceptions. It is not that clear.

------------
The name "ptree_error" would be better as "ptree_error_base" to give an immediate clue.

-----------------
empty_ptree() should be create_empty_tree() or make_empty_tree() (I prefer the first, the second is somehow semistandard).

--------------
In

    template<class Ch, ....

the "Ch" should be spelled fully. Ch gives no clue. Ditto the "Tr".

---------------
The parametrisation of ptree should go further. The basic_string<Ch> should be a template parameter. Some people may like (or be forced) to use flex_string/AnsiString/CString/QString/wxString/boost::fixed_string/etc. and the library should not lay obstacles in their way.

Another possible template parameter is the allocator.

---------------
A stable_unique() operation may be added. This would eliminate all duplicates while keeping the order of what is kept.

A generic implementation of stable_unique can be seen on http://uk.builder.com/programming/c/0,39029981,20271583,00.htm

--------------------
The part of the docs named

    // Ptree-specific operations

should have more comments. It is an almost unreadable blob as it is now (no syntax highlighting, no hyperlinks to actual code).

-------------------
get_own() etc.: I suggest adding a table of abbreviations at the top of the documentation, before anything else. I feel rather stupid looking at "own" and guessing what it could be (I try to feel like a first-time reader).

----------------
OTOH the locale does not need to be described in such detail in the documentation. Just something such as --locale-- should be enough to get a clue.

---------------
Instead of "class CharType" a "typename CharType" feels better.

It is also somehow confusing that a new character type suddenly appears here and what it could mean in the design. There should be a comment, possibly a link to an example.

_____________________________________________________
9. docs, "How to populate property tree"

The word "parsers" should not be used. It brings a feeling of Spirit or yacc. A standard and not overloaded word is "reader" or "reader/writer".

--------------
"It has just one data string associated..."
==>>
"It has just one data (typically string) associated..."

-----------
The sentence about which parts of XML (such as multiple props) are not supported should make it into a section of its own, and this section should appear in the top table of contents.

-------------
The existence of file_parser_error is not shown in the synopsis. There should be a picture, a class diagram of all existing exceptions, possibly as a clickable map.

-------------
The <xmlattr> discussion is now strangely split between two sections (I am not able to tell whether or not they are at the same level).

-----------
The sections may be numbered in x.y style.

_____________________________________________________
10. docs, "INI parser":

Fillers such as "the reason", "probably", "contrary", "actually" should be omitted. The text will be read by developers under time pressure and they won't appreciate it.

_____________________________________________________
11. docs, "JSON parser": JSON could be hyperlinked to an external website.

_____________________________________________________
12. docs, "Command line parser" - the existence of the Boost.Program_options library may be mentioned at the beginning of this section.

_____________________________________________________
13. docs, "How to access data in property tree":

"Property tree resembles (almost is) a standard container..." is very ambiguous as there are many standard containers.

_____________________________________________________
14. docs: a few examples may use wide strings as L"...".

_____________________________________________________
15. The headers may have

    #if (defined _MSC_VER) && (_MSC_VER >= 1200)
    # pragma once
    #endif

at the top to reduce compilation time a little for VC and Intel.

_____________________________________________________
16. The file property_tree/detail/ptree_interface.hpp should be moved a folder down as it is the most important header, not a "detail".

_____________________________________________________
17. Idea for a feature: right now the ptree is intended as a temporary structure - read from a file and then transformed into user data.

People may like to use ptree as the primary structure, without needing to define their own helper classes. For this it would be useful to be able to attach some data to nodes.

I suggest adding yet another template parameter or type trait: an associated datatype, defaulting to void. If it is not void then this datatype will exist next to each string. (Boost.Any may be a useful example.)

_____________________________________________________
18. ptree_utils: the functions widen/narrow are horribly inefficient. The "result" string should be reserve()d before characters are added.

Since the only char types known to man are signed char/unsigned char/char and wchar_t, it would be better to provide specialisations for these. If someone wants something strange (a string of doubles) he would need to write a meaningful specialisation for it.

The narrow<wchar_t> is flawed.

The function trim() is already available in Boost.String Algorithms.

_____________________________________________________
19. A nit: source files should end with a newline (the Standard says so). Some compilers may complain if they do not, e.g. ptree_interface.hpp.

_____________________________________________________
20. Namespaces: I would prefer not to have yet another sub-namespace in boost.

The ptree should be moved down (via "using") so boost::ptree<...> would be valid.

_____________________________________________________
21. json_parser.hpp, create_escapes(): a switch should be used instead of the if-else chain.

-----------
The header is missing #include <cctype>

_____________________________________________________
22. Boost.Serialization should be supported. It is not as absurd as it may look - the tree may be part of application state that is saved/sent over a network.

_____________________________________________________
23. cmdline_parser.hpp: there may be support for parameters passed via WinMain() - i.e. as a single string.

The code that splits the string into argv tokens already exists in the program_options library.

--------
I suspect the code:

    Ptree *child = local.put(text, Str());
    child->push_back(std::make_pair(Str(), Ptree(child->data())));

is not exception safe (if the Ptree constructor throws, who will clear the child?). I have a feeling an auto_ptr may be useful here.

_____________________________________________________
24. The docs should show the exception safety level for every member function (e.g. as the smart_ptrs docs do).

_____________________________________________________
25. I tried to compile the test with BCB (version 5.8), from BDS 2006.

The problem is that Borland doesn't like the out-of-class member bodies that return any kind of iterator (begin/end/rbegin/rend/front/back/find/erase/push_front/push_back).

I found a workaround: it is necessary to fully specify the returned type, not just "iterator" but the expanded definition. So for example

    template<class Ch, class Tr>
    typename basic_ptree<Ch, Tr>::iterator
    basic_ptree<Ch, Tr>::begin()
    {
        return m_impl->m_container.begin();
    }

would need to be changed to the ugly:

    template<class Ch, class Tr>
    #if BOOST_WORKAROUND(__BORLANDC__, BOOST_TESTED_AT(0x564))
    std::list<std::pair<std::basic_string<Ch>, basic_ptree<Ch, Tr> > >::iterator
    #else
    typename basic_ptree<Ch, Tr>::iterator
    #endif
    basic_ptree<Ch, Tr>::begin()
    {
        return m_impl->m_container.begin();
    }

Perhaps it is worth having BCB support. The compiler is quite shitty but their IDE is so far the best C++ RAD environment on the planet. Some people use it for this reason.

I can help with BCB porting. It is also possible to download a free BCB compiler (version 5.5, more or less the same as the 5.8 compiler).

_____________________________________________________
26. For better visual appearance you may insert a few <hr> into the documentation.

_____________________________________________________
EOF

Hi Pavel, thanks for a review.
I took look on the V5 and collected few notes.
First of all, from some of your comments I see you must have looked at the older revision. Rev.5 does not have any pointers in the interface, and no longer has get_d, get_b variants.
Most important, the documentation still needs quite a lot of work.
You can always say that about any documentation. Although it should be polished and possibly expanded in places (like traits customization), I think it is already quite large (over 150kb of text). It also contains a reference which covers 100% of interface functions in quite a detail. This is something that not every boost library has.
I read some noises about expanding the library to be arbitrary tree and so on. IME this is very hard to develop and of rather low value for users.
Agreed. This is not a generic tree container. This library is about idiomatic access to configuration data, easy reading/writing of many file formats.
_____________________________________________________ 1. docs: it is rather unusual for me read code without syntax highlighting.
The problem is, it is written and maintained in HTML. I could add code coloring, but that would render it close to unmaintainable. Any change would be a nightmare. I should have used QuickBook, but unfortunately I had problems setting up the toolchain. So I stuck to HTML. Btw. some other well-established libraries in boost also do not have code coloring, and I didn't notice anybody complaining.
_____________________________________________________ 2. The first mention of the use of exceptions in the docs (in debug_settings::load) should be a hyperlink to details.
I agree. I could add some more hyperlinks all over the docs. But whole synopsis section, and most importantly the reference, are thoroughly cross-linked.
3. The first mention of get_d() should explicitly say that "d" means default. If possible this word should be bolded.
get_d is no longer part of the library; it was removed in rev5. You must have looked at the older version.
_____________________________________________________ 4. A curiosity: in the docs you write:
Type of the value extracted is determined by type of second parameter, so we can simply write get_d(...) instead of get_d<int>(...). */
m_level = pt.get_d("debug.level", 0);
If I do not specify return type explicitly there may be conversion made after function returns.
If I specify the type explicitly there may be conversion of the default parameters when the thread of execution enters the function.
I agree, but is it a problem? I don't think it will be an issue if you read long instead of int and have it implicitly converted. If you want int, you can always get it by using get<int>(...)
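The two conversion points under discussion can be made concrete with a small stand-in. This is a sketch under stated assumptions: `get_or` is a hypothetical function modelled loosely on the removed get_d, operating on a flat `std::map` instead of a property tree, and it is not the library's implementation.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical stand-in for get_d: look up `key` and parse its text as T,
// returning `fallback` if the key is missing or the text does not parse.
template <typename T>
T get_or(const std::map<std::string, std::string>& data,
         const std::string& key, T fallback) {
    const auto it = data.find(key);
    if (it == data.end()) return fallback;
    std::istringstream in(it->second);
    T value;
    return (in >> value) ? value : fallback;
}
```

Calling `get_or(cfg, "debug.level", 0)` deduces T as int from the default, so the stored text is parsed as an int and any conversion to the destination type happens after the call returns. Calling `get_or<double>(cfg, "debug.level", 0)` instead converts the default 0 to double on entry and parses the text as a double, which can give a different result for the same stored string.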
_____________________________________________________ 5. docs, "Property tree as a container": it may be explicitly stated here (or linked from here) how validity of iterators changes after delete/update/etc.
Yes, that's right. This is not explicitly said anywhere, but an iterator only gets invalidated when the element it points to is removed (like std::list). Insertions and erasures of other elements do not affect it.
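Since the reply compares the guarantee to std::list, the rule can be illustrated with std::list itself. The sketch below only demonstrates the standard container's behaviour; `Children` and `survives_edits` are illustrative names, and nothing here exercises property_tree.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <string>
#include <utility>

using Children = std::list<std::pair<std::string, std::string>>;

// Insert and erase around `keep`, then check that it still refers to the
// same element. Dereferencing `keep` afterwards is well-defined because
// std::list only invalidates iterators to the elements that were erased.
bool survives_edits(Children& c, Children::iterator keep) {
    const std::string key = keep->first;
    c.push_front({"new_front", ""});
    c.push_back({"new_back", ""});
    c.erase(c.begin());  // erases "new_front", not the kept element
    return keep->first == key;
}
```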
_____________________________________________________ 6. Possible feature:
"...there is an additional indexing data structure inside property tree that allows fast searches..."
There may be a type trait that disables creating such a structure, where the search complexity would degrade gracefully to O(N).
I know, you said that before, I have it hovering on my list of things to be done. It has never got big enough priority though.
_____________________________________________________ 7. The temptation to use std::lower_bound and similarly dangerous std algorithms may be eliminated by defining unimplemented specialisations for ptree
You can always sort the tree and then use the algorithm safely, so banning it would not be wise.
-------------- In template<class Ch, ....
the "Ch" should be spelled fully. Ch gives no clue. Dtto the "Tr".
No more templates on <Ch> in the library. You must have looked at the old version.
--------------- The parametrisation of ptree should go further. The basic_string<Ch> should be a template parameter. Some people may like (or be forced) to use flex_string/AnsiString/CString/QString/
Exactly that was done in revision 5. basic_ptree is now parametrized on key type (besides other things).
stable_unique() operation may be added. This would eliminate all duplicates while keeping order of what is kept.
This can be provided as a generic external algorithm, no need to clutter the class interface. I don't think it belongs in the ptree library; it's rather an extension to the <algorithm> header.
A generic implementation of stable_unique http://uk.builder.com/programming/c/0,39029981,20271583,00.htm
From what I have seen, the implementation requires random-access iterators, so it would not work with ptree.
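A stable_unique that does not need random-access iterators is possible as an external algorithm. The following is one hedged sketch, not the linked implementation: it assumes a list-like container of key/value pairs with duplicates judged by key, and `Children` and `stable_unique_by_key` are names invented for the example.

```cpp
#include <cassert>
#include <list>
#include <set>
#include <string>
#include <utility>

using Children = std::list<std::pair<std::string, std::string>>;

// Remove every element whose key has already been seen, keeping the first
// occurrence of each key and preserving the relative order of what remains.
// Needs only forward traversal plus erase(), at the cost of a std::set.
void stable_unique_by_key(Children& c) {
    std::set<std::string> seen;
    for (auto it = c.begin(); it != c.end(); ) {
        if (seen.insert(it->first).second) {
            ++it;               // first time this key appears: keep it
        } else {
            it = c.erase(it);   // duplicate: erase and continue from the next
        }
    }
}
```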
"It has just one data (typically string) associated..."
The docs are littered with hardcoded references to string as a key/data type, because customization facilities were added late. Anyway, I think that using std::strings will be the most common case, and adding parentheses everywhere I talk about data/key types might do more harm than good.
_____________________________________________________ 10. docs, "INI parser":
Fillers such as "the reason", "probably", "contrary", "actually" should be omitted. The text will be read by developers under time pressure and they won't appreciate it.
Yeah, my English could use some polishing. The text is probably overly complicated and littered with meaningless words in places. Again, I would appreciate it if a native speaker skimmed over it and pointed me towards the most glaring mistakes (grammar/style etc.).
_____________________________________________________ 15. The headers may have
#if (defined _MSC_VER) && (_MSC_VER >= 1200) # pragma once #endif
I know, you suggested that before :-) Again, this is somewhere on my list of things to be done, but never got high enough priority. Btw. is compilation time really shortened that much? I think that it only removes the preprocessing step, which is done in a snap anyway. I haven't done any measurements so I might be wrong.
_____________________________________________________ 16. the file property_tree/detail/ptree_interface.hpp should be moved a folder down as it is the most important header, not an "detail".
The top-level folder _only_ contains files which are includable by the user; all the rest is in detail. I want to stick to that.
_____________________________________________________ 18. ptree_utils: the function widen/narrow are horribly inefficient. The "result" string should be reserve()d before characters are added.
I know there are performance issues with some of the parsers. I think these can be resolved safely later, because they do not have any impact on the interface.
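The reserve() suggestion amounts to a one-line change. The sketch below is an assumption about what a simple widen looks like, not the library's actual code, and the byte-by-byte conversion is only meaningful for ASCII or Latin-1 input.

```cpp
#include <cassert>
#include <string>

// Widen a narrow string one byte at a time.
// Assumption: input is ASCII / Latin-1, so each byte maps to one wchar_t.
std::wstring widen(const std::string& s) {
    std::wstring result;
    result.reserve(s.size());  // one allocation instead of repeated growth
    for (char ch : s) {
        // Go through unsigned char so bytes >= 0x80 do not sign-extend.
        result += static_cast<wchar_t>(static_cast<unsigned char>(ch));
    }
    return result;
}
```

Because the change is internal to the function, it supports the point made in the reply that such fixes do not affect the interface.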
The narrow<wchar_t> is flawed.
Why do you think it is flawed?
The function trim() is already available in Boost.String Algorithms.
I thought I'd rather avoid dependencies if they do not provide critical functionality, like e.g. the use of Spirit to parse XML.
_____________________________________________________ 19. A nit: source files should end with newline (Standard says so). Some compilers may complain if they do not, e.g. ptree_interface.hpp.
As far as I know all source files end in a newline. I double-checked ptree_interface.hpp, and it does end in a newline. I think gcc (with -Wall -pedantic) issues a warning if a file does not end in a newline, and the library compiles without any warnings on gcc.
_____________________________________________________ 20. Namespaces: I would prefer not to have yet another sub-namespace in boost.
The ptree should be moved down (via "using") so boost::ptree<...> would be valid.
The header is missing #include <cctype>
From which file it is missing?
_____________________________________________________ 22. Boost.Serialization should be supported. It is not that absurd as it may look - the tree may be part of application state that is saved/sent over network.
I was actually trying to get a step further: use property tree as a target (not source) for serialization. I.e. creating Archive which writes to a ptree. This opens door to many interesting possibilities, for example you can save your archives as JSON out of the box. I may post some source code in the future when I get it working. The only drawback to using ptree as a generic archive is performance, it is going to be much slower than a regular archive.
-------- I suspect the code:
Ptree *child = local.put(text, Str()); child->push_back(std::make_pair(Str(), Ptree(child->data())));
is not exception safe (if the Ptree constructor throws, who will clear the child?).
It is safe. The pointer returned points to child which is owned by local, which is stack based and will be cleaned up with all the children in case of exception. Btw. this is code from old version of the library.
_____________________________________________________ 25. I tried to compile the test with BCB (version 5.8), from BDS 2006.
I know the library will have problems compiling on non-compliant tools. It would be nice to have it working on old stuff, but it probably requires so much work that it is rather low on my priority list.
I can help with BCB porting. It is also possible to download free BCB compiler (version 5.5, more-less the same as the 5.8 compiler).
That's great. If somebody else with better knowledge of BCB intricacies could do it, it would be nice. On the other hand I'm afraid of introducing too many maintenance problems (aka ifdefs), or compromising the design/performance of the library to make old stuff happy. Thank you for another great review. Marcin

Hello Marcin,
First of all, from some of your comments I see you must have looked at the older revision. Rev.5 does not have any pointers in the interface, and no longer has get_d, get_b variants.
Gosh, I found I have two source code trees here and I picked up the old one. Sorry for the mess. I retract everything I said and will review the latest version again. _____________________________________________________
1. docs: it is rather unusual for me read code without syntax highlighting.
The problem is, it is written and maintained in HTML. I could add code coloring, but that would render it close to unmaintainable. Any change would be a nightmare. I should have used QuickBook, but unfortunately I had problems setting up the toolchain. So I stuck to HTML. Btw. some other well-established libraries in boost also do not have code coloring, and I didn't notice anybody complaining.
Newer libraries generally have syntax highlighting. I complain because it has been years and years since I wrote code without a colorizing editor. _____________________________________________________
6. Possible feature:
"...there is an additional indexing data structure inside property tree that allows fast searches..."
There may be a type trait that disables creating such a structure, where the search complexity would degrade gracefully to O(N).
I know, you said that before, I have it hovering on my list of things to be done. It has never got big enough priority though.
It seems I rerun my previous review once again. I do not keep the previous texts so it may happen more than once I "discover" something repeatedly. _____________________________________________________
7. The temptation to use std::lower_bound and similarly dangerous std algorithms may be eliminated by defining unimplemented specialisations for ptree
You can always sort the tree and then use the algorithm safely, so banning it would not be wise.
Hmm. Perhaps specialisations could be written that in debug mode check the tree is actually sorted. _____________________________________________________
15. The headers may have
#if (defined _MSC_VER) && (_MSC_VER >= 1200) # pragma once #endif
Btw. is compilation time really shortened that much? I think that it only removes the preprocessing step, which is done in a snap anyway. I haven't done any measurements so I might be wrong.
Small but noticeable if applied rigorously: http://lists.boost.org/Archives/boost/2003/09/52773.php _____________________________________________________
25. I tried to compile the test with BCB
That's great. If somebody else with better knowledge of BCB intricacies could do it, it would be nice. On the other hand I'm afraid of introducing too many maintenance problems (aka ifdefs), or compromising the design/performance of the library to make old stuff happy.
#ifdefs will be necessary for BCB. Generally, most of Boost libraries are littered with #ifdefs to be almost unreadable. A "clean" version of Boost has been suggested but it is not likely to happen in next few years. /Pavel

Pavel Vozenilek wrote:
7. The temptation to use std::lower_bound and similarly dangerous std algorithms may be eliminated by defining unimplemented specialisations for ptree
You can always sort the tree and then use the algorithm safely, so banning it would not be wise.
Hmm. Perhaps specialisations could be written that in debug mode check the tree is actually sorted.
What kind of dangerousness are we talking about here: a crash or false results? -Thorsten

Hmm. Perhaps specialisations could be written that in debug mode check the tree is actually sorted.
What kind of dangerousness are we talking about here: a crash or false results?
False results. upper_bound / lower_bound / binary_search only work on sorted ranges. Property_tree might look like a sorted range on first sight, but in reality it is not (unless explicitly sorted). Marcin
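The "check that it is sorted first" idea from this exchange can be sketched on a plain std::list of key/value pairs. This is an illustration under assumptions, not a proposal for the library interface: `Entry`, `Children`, `key_less` and `find_sorted` are hypothetical names, and the check runs unconditionally here, whereas the suggestion in the thread was to do it only in debug mode.

```cpp
#include <algorithm>
#include <cassert>
#include <list>
#include <string>
#include <utility>

using Entry = std::pair<std::string, std::string>;
using Children = std::list<Entry>;

// Order entries by key only.
inline bool key_less(const Entry& a, const Entry& b) {
    return a.first < b.first;
}

// Binary search by key that refuses to run on an unsorted range.
// Returns c.end() if the range is not sorted by key or the key is absent.
Children::const_iterator find_sorted(const Children& c, const std::string& key) {
    if (!std::is_sorted(c.begin(), c.end(), key_less)) return c.end();
    const auto it = std::lower_bound(
        c.begin(), c.end(), key,
        [](const Entry& e, const std::string& k) { return e.first < k; });
    return (it != c.end() && it->first == key) ? it : c.end();
}
```

The std::is_sorted pass is linear, so in practice it would sit behind an assert or a debug-only macro; after an explicit `c.sort(key_less)` the lookup is safe.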

This is my second review of the library (I mistakenly reviewed rev4 before). Since Marcin plans to change the get/set interface of the library, it would be better to arrange a review of rev6 within a reasonable time interval (a few months, not a year). I would be happy with acceptance of the current version as well, as further improvements are very likely. The library has the potential to be very useful for real-world applications (in its current form, and even more so in rev6). Clearly, the domain of the library is reading and writing configuration data, not serving as a generic tree or a complete XML parser. The temptation to extend the now-simple API into a giant does-it-all tool may result in something practically no one would dare to use. I do not see the overlap with PO as an obstacle. A merge between PT and PO is unlikely - I have never seen anything like this in Boost so far.
Things that need improvement:
------------------------------------------
1. Examples, examples, examples. Adding many small but complete examples to the documentation is the most effective way to increase acceptance of the library among users. No one is interested in reading a lengthy manual just to get a value from an INI file. If the documentation is already too big, drop the reference, heck, drop anything, but add examples. As others have pointed out, there is more to improve in the documentation.
2. Limitations of the library should be clearly spelled out, especially for XML. I am in favour of limiting XML features as much as possible - it is not the task of a simple configuration tool to provide a validating parser and whatnot. Rename the library to "simple property tree" if people cannot stop requesting more. The existence of ptree is no barrier for anyone to create a "powerful property tree".
3. Others:
* basic_string should not be hardcoded. There are too many string implementations floating around.
* Optional MT safety may be added.
* Named template parameters may be used to deal with the large number of traits.
* Preserving comments and whitespace in INI/JSON/INFO could be the /most/ important reason someone decides to use the library.
* The library should be able to read multiline data that does not require a backslash at the end of each line. It would be useful e.g. for storing scripts inside the configuration.
* The command line parser should have a reference to PO and should clearly state its limits.
* It is possible and useful to extend the registry parser (registry_ptree?). One existing registry library can be found at: http://synesis.com.au/software/stlsoft/help/group__winstl__reg__library.html
* The property tree itself may be serializable, as it could be part of a larger application state (this is different from using Serialization as the engine for PT).
* Support for various Unicode encodings (e.g. reading UTF16 BE and converting the strings to ANSI, all transparently).
* Overloads for parsers to parse a given memory block (e.g. mapped file, shared memory).
* A user-defined datatype (e.g. boost::any) attached to each node may be useful when one is content with the ptree data structure and doesn't want to move it from/to C++ structures.
Features that I do not consider as needed:
---------------------------------------------------------
* Transformation from one format to another. Unless this capability is naturally present in the library, the tool should not get more complicated because of it.
* Validators, type safety checkers, notifiers etc. Usefulness for 80% of real-world applications is IMHO the main selling point, and I am far from sure whether these features could be implemented without sacrificing the current simplicity.
* Genericity for the sake of genericity. The purpose of the lib is to read configuration data with the least hassle and then get out of the way. It is not Boost.Tree (and should not be documented as if it were), it is not Visitor, it is not an in-memory database.
* I would be very cautious about replacing parts of the library with Serialization / Multi-index. There are many practical considerations - compilation time, supported compilers, projects with multiple executables, versioning. People using the library just want it to work and would not be willing to solve possible problems related to other libraries, or even to debug through third-party libraries.
* Breaking the library into parts. This is just a way to tell intended users: "you'd better quickly hack up something by hand, don't bother with PT".
/Pavel

Hi Pavel, Thank you for another review, it's probably 4th or even 5th, counting ones you did before the review started.
Temptation to extend the now simple API into giant does-it-all tool may result in something practically no one would dare to use.
Right. A lot of the people posting to this thread proposed at least one new major feature. If I added all of them (if I ever managed to), the library would sink under its own weight.
1. Examples, examples, examples.
Probably a very good idea. But I need to port docs to QuickBook first. Otherwise maintaining it is going to kill me.
Rename the library to "simple property tree" if people cannot stop requesting more.
He he he. I don't think it is going to stop them.
* basic_string should not be hardcoded. There are too many string implementations floating around.
It is now only "soft-coded" for the key type. By "soft-coded" I mean that some std::string interface is required (substr etc.), but key_type itself can be changed. Of course, this sort of partial solution is not sufficient. Adding more generic path/key support is now the most important change waiting in the queue.
* Optional MT safety may be added.
I would rather leave it at std::container style - no built-in safety.
* Preserving comments and whitespace in INI/JSON/INFO could be the /most/ important reason someone decides to use the library.
The more I think about it, the simpler it seems, and originally I thought it is a definite no-no, just look at my replies to Jeff Garland. Some suggested adding extra fields to the node, but this is not acceptable for me; if comments cannot be handled by the tree alone it's better if they are not handled at all. The way to do it is to add metadata (like <xmlattr> or "\values"), which describes comments and optionally location. Of course this will slow some of the parsers very significantly, so it should be optional. Metadata will be fully editable, but will not pollute normal data. I hope this is doable.
* The property tree itself may be serializable as it could be part of larger application state
Yes. This is probably very easy to do.
* Support for various Unicode encodings (frex reading UTF16 BE and converting the strings to ANSI, all transparently).
If I understand the problem correctly, this is a job for codecvt facet, it was designed to do it. Library only allows setting locale for all relevant operations.
* Overloads for parsers to parse given memory block (e.g. mapped file, shared memory)
Maybe. But how about doing another library that presents arbitrary memory block in form of C++ stream? I think it would be usable in many other contexts as well, and in one of my projects definitely. Stringstream is flawed because it must copy the data first. Copying whole memory-mapped file is really a rubbish idea :-)
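One common way to approximate the idea above with the standard library alone is a std::streambuf whose get area points straight at the existing block, so nothing is copied. This is only a sketch: the names memory_streambuf and first_token are hypothetical, and the buffer is read-only (seeking and writing are not supported).

```cpp
#include <cstddef>
#include <istream>
#include <streambuf>
#include <string>

// Hypothetical helper: a read-only stream buffer over an existing memory block.
class memory_streambuf : public std::streambuf {
public:
    memory_streambuf(const char* data, std::size_t size) {
        // setg() takes non-const pointers, but the get area is only ever read
        // because no put area is set and overflow() is not overridden.
        char* begin = const_cast<char*>(data);
        setg(begin, begin, begin + size);
    }
};

// Read the first whitespace-delimited token directly from the block.
std::string first_token(const char* data, std::size_t size) {
    memory_streambuf buf(data, size);
    std::istream in(&buf);  // any parser that accepts std::istream can use this
    std::string token;
    in >> token;
    return token;
}
```

The block must outlive the stream; with a memory-mapped file that means keeping the mapping open while parsing.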
* An user defined datatype (e.g. boost::any) attached to each node may be useful [...]
No problem now. Customize data_type. Thank you again, Marcin

"Marcin Kalicinski" wrote:
* Optional MT safety may be added.
I would rather leave it at std::container style - no built-in safety.
Having get/set protected by a mutex doesn't guarantee complete MT safety, but it should be enough for applications where the configuration is loaded, read (from one or more threads) and at the end saved. More complex scenarios would need a user-written wrapper.
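A minimal sketch of such a user-written wrapper, assuming a flat std::map as the underlying storage (a stand-in, not the ptree class). The class name locked_settings is hypothetical. Each get/set call is atomic on its own; a sequence of calls is not, which is the limitation mentioned above.

```cpp
#include <map>
#include <mutex>
#include <optional>
#include <string>
#include <utility>

// Hypothetical wrapper: every get/set takes the same mutex.
class locked_settings {
public:
    void set(const std::string& key, std::string value) {
        std::lock_guard<std::mutex> lock(mutex_);
        values_[key] = std::move(value);
    }

    // Returns a copy, so the caller never holds a reference into the map
    // after the lock has been released.
    std::optional<std::string> get(const std::string& key) const {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = values_.find(key);
        if (it == values_.end()) return std::nullopt;
        return it->second;
    }

private:
    mutable std::mutex mutex_;
    std::map<std::string, std::string> values_;
};
```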
* Preserving comments and whitespace in INI/JSON/INFO could be the /most/ important reason someone decides to use the library.
The more I think about it, the simpler it seems, and originally I thought it is a definite no-no, just look at my replies to Jeff Garland. Some suggested adding extra fields to the node, but this is not acceptable for me; if comments cannot be handled by the tree alone it's better if they are not handled at all. The way to do it is to add metadata (like <xmlattr> or "\values"), which describes comments and optionally location. Of course this will slow some of the parsers very significantly, so it should be optional. Metadata will be fully editable, but will not pollute normal data. I hope this is doable.
I won't talk about XML (do not use, not that much knowledge, doubts about its future). It is implementable for INIs (I did it) - these metadata would be loaded, hidden from user but saved. For JSON and INFO this likely applies as well. As comments and whitespace are metadata the application should not have any access to them. They should NOT be editable (by the application).
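A rough sketch of the "loaded, hidden from user but saved" approach for an INI-like format. It assumes ';' starts a comment and that entries have the form key=value with no sections or surrounding whitespace; the class ini_document is hypothetical and is not how any existing parser is implemented. Comment and blank lines are kept verbatim and re-emitted on save, while only the edited entry changes.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// One line of the file. `key` is empty for comments and blank lines.
struct ini_line {
    std::string raw;
    std::string key;
    std::string value;
};

class ini_document {
public:
    void load(const std::string& text) {
        std::istringstream in(text);
        std::string raw;
        while (std::getline(in, raw)) {
            ini_line line{raw, "", ""};
            const std::size_t eq = raw.find('=');
            if (!raw.empty() && raw[0] != ';' && eq != std::string::npos) {
                line.key = raw.substr(0, eq);
                line.value = raw.substr(eq + 1);
            }
            lines_.push_back(line);
        }
    }

    // Change an existing entry; comments and other lines are left untouched.
    bool set(const std::string& key, const std::string& value) {
        for (ini_line& line : lines_) {
            if (!line.key.empty() && line.key == key) {
                line.value = value;
                line.raw = key + "=" + value;
                return true;
            }
        }
        return false;
    }

    // Write every line back, including the ones the application never saw.
    std::string save() const {
        std::string out;
        for (const ini_line& line : lines_) {
            out += line.raw;
            out += '\n';
        }
        return out;
    }

private:
    std::vector<ini_line> lines_;
};
```

Because untouched lines are written back byte for byte, a diff of the saved file shows only the edited entry.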
* Overloads for parsers to parse given memory block (e.g. mapped file, shared memory)
Maybe. But how about doing another library that presents arbitrary memory block in form of C++ stream? I think it would be usable in many other contexts as well, and in one of my projects definitely. Stringstream is flawed because it must copy the data first. Copying whole memory-mapped file is really a rubbish idea :-)
Hmm, possibly. Having parse(const void* data, unsigned size, ptree&); is however the simplest interface possible. The string type may be a custom one, pointing into the provided memory instead of using allocations of its own. The result would be something able to parse really large amounts of data. /Pavel
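A sketch of what such an interface could look like, assuming C++17 std::string_view as the "string type pointing into provided memory" and a flat list of key=value lines instead of a ptree. The function name parse_block is hypothetical. The returned views do not own anything; they are valid only while the block is.

```cpp
#include <cstddef>
#include <string_view>
#include <utility>
#include <vector>

// Keys and values are views into the caller's block; no characters are copied.
using entry = std::pair<std::string_view, std::string_view>;

// Hypothetical parser for newline-separated "key=value" lines.
std::vector<entry> parse_block(const void* data, std::size_t size) {
    std::string_view text(static_cast<const char*>(data), size);
    std::vector<entry> out;
    while (!text.empty()) {
        const std::size_t eol = text.find('\n');
        const std::string_view line = text.substr(0, eol);
        // Advance past the line; stop when there is no further newline.
        text = (eol == std::string_view::npos) ? std::string_view()
                                               : text.substr(eol + 1);
        const std::size_t eq = line.find('=');
        if (eq == std::string_view::npos) continue;  // skip lines without '='
        out.emplace_back(line.substr(0, eq), line.substr(eq + 1));
    }
    return out;
}
```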

On Wed, 26 Apr 2006 00:31:29 +0200 "Pavel Vozenilek" <pavel_vozenilek@hotmail.com> wrote:
As comments and whitespace are metadata the application should not have any access to them. They should NOT be editable (by the application).
I think comments *should* be available to an application, especially an application that will change or add fields to the configuration file... it would be nice to add some comments to describe the field and have them look like native comments for whichever file format is being used...

"Jody Hagins" wrote:
As comments and whitespace are metadata the application should not have any access to them. They should NOT be editable (by the application).
I think comments *should* be available to an application, especially an application that will change or add fields to the configuration file... it would be nice to add some comments to describe the field and have them look like native comments for whichever file format is being used...
It /may/ be useful to add a comment to a newly created item. Creating a change-log, hmm, this can generate a lot of useless fluff unless a lot of care is taken. Interpreting the metadata by the app could result in making sense out of those funny commented-out values or of empty space used to prettify the config file. IMO most would prefer specialized formatting for such comments, rather than trying to process free-style metadata. One of the aims should be to minimize the number of changes in the config file as they are seen by CVS or another tool. I am not aware of any open source library having such a feature. /Pavel

Hi, I am really busy lately and won't have time for a complete review. But since I am intimately familiar with this domain, I've decided to take a peek on my way home. First and foremost, I would like to remind everybody that we already have one library intended to cover this problem domain (completely unacceptable IMO - but that's another story). This library not only does not address the issues of the existing solution, it's not even comparatively close feature-wise. It does present some additional media formats, but this could as well be done as an add-on to the existing solution. So my vote is NO, thank you very much.
Here are some notes, in no particular order, from my skimming through the docs and implementation:
1. The name is really bad. Originally I thought this submission had something to do with property_map and was surprised to see what I saw.
2. Why would you need fast lookup? In 99.9% of cases each variable is accessed a single time in a program's lifetime. There is no need to optimize so that your program startup time goes from 5 milliseconds to 5 microseconds. You won't notice this anyway. (And in reality it's not even that big a difference. Most of your startup time will be spent initiating either a network connection or some GUI component or similar.)
3. Even if you insist on alternative lookup, why not just use a single typedef with multi_index instead of all the implementation?
4. Extra functionality (beyond that in multi_index) should go in free functions, not in member functions - it would enhance encapsulation and reusability.
5. The CLA parser is a joke. It's unacceptable any way you present it.
6. I personally believe that the inline implementation is in this case rather a disadvantage. I would be perfectly happy to stick to ASCII for configuration purposes. Even if you insist on wide char support, I would implement it as a thin template wrapper around an offline implementation that operates with void*. Plus some runtime polymorphism could be used.
7. ptree_utils reinvents the wheel. Why not use Boost string algorithms?
8. I keep repeating: Traits should not be template parameters. You should have had three template parameters: KeyType, ValueType and ComparePolicy, the same way std::map does.
9. The access interface is lacking. If I have a string identifying the subsystem and a string identifying the parameter separately, why do I need to concatenate them to get to the value?
10. General note: the whole design/implementation is unnecessarily complicated. Your general part is about 50k. In the Boost.Test utils section I have a component with a very similar rationale but a better design (IMO, obviously) implemented in 16k.
11. The generating part of the design/implementation is completely unnecessary. It may be useful in 1% of the usage cases, and in those cases I would stick to some alternatives.
IMO the library presenters should at least have done some comparative analysis with the existing solution and presented it as a part of the review. Regards, Gennadiy
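To make points 8 and 9 of the review above concrete, here is one possible shape for such an interface. It is purely illustrative: basic_tree, child and find are hypothetical names, and this is neither the reviewed library's API nor the Boost.Test component mentioned in point 10. The template takes a key type, a data type and a comparison policy in the style of std::map, and lookup takes the path components separately instead of a concatenated string.

```cpp
#include <functional>
#include <initializer_list>
#include <map>
#include <memory>
#include <string>

// Hypothetical tree with std::map-style template parameters.
template <class Key, class Data, class Compare = std::less<Key>>
class basic_tree {
public:
    Data data{};

    // Descend one level, creating the child if it does not exist yet.
    basic_tree& child(const Key& key) {
        std::unique_ptr<basic_tree>& slot = children_[key];
        if (!slot) slot = std::make_unique<basic_tree>();
        return *slot;
    }

    // Look up a value from separate path components; nullptr if absent.
    const Data* find(std::initializer_list<Key> path) const {
        const basic_tree* node = this;
        for (const Key& key : path) {
            auto it = node->children_.find(key);
            if (it == node->children_.end()) return nullptr;
            node = it->second.get();
        }
        return &node->data;
    }

private:
    // unique_ptr keeps the recursive member valid while basic_tree is incomplete.
    std::map<Key, std::unique_ptr<basic_tree>, Compare> children_;
};
```

With this shape, a caller holding the subsystem name and the parameter name in two variables can pass them as find({subsystem, parameter}) without building a dotted path first.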

First and foremost I would like to remind everybody that we already have one library intended to cover this problem domain (completely unacceptable IMO - but that's another story). This library not only do not address issues of existent solution it's not even comparatively close feature wise. ...
Which existing library? Darren

"Darren Cook" <darren@dcook.org> wrote in message news:44470078.9000503@dcook.org...
First and foremost I would like to remind everybody that we already have one library intended to cover this problem domain (completely unacceptable IMO - but that's another story). This library not only do not address issues of existent solution it's not even comparatively close feature wise. ...
Which existing library?
? program options Gennadiy

Gennadiy Rozental wrote:
"Darren Cook" <darren@dcook.org> wrote in message news:44470078.9000503@dcook.org...
First and foremost I would like to remind everybody that we already have one library intended to cover this problem domain (completely unacceptable IMO - but that's another story). This library not only do not address issues of existent solution it's not even comparatively close feature wise. ...
Which existing library?
? program options
It is true that there is a small overlap with program-options, but the rest of the library (the majority of the library) doesn't have anything to do with Program Options AFAICT. Also, but that needs to be confirmed, my impression was that for small jobs, Marcin's approach would be easier to apply, whereas Program Options is for when you really need a thorough solution. -Thorsten

First and foremost I would like to remind everybody that we already have one library intended to cover this problem domain (completely unacceptable IMO - but that's another story). This library not only do not address issues of existent solution it's not even comparatively close feature wise. ...
Which existing library?
? program options
It is true that there is a small overlap with program-options, but the rest of the library (the majority of the library) doesn't have anything to do with Program Options AFAICT.
What is the basis for your opinion? IMO it's 1:1 correspondence. Could you do feature by feature comparison?
Also, but that needs to be confirmed, my impression was that for small jobs, Marcin's approach would be easier to apply, whereas Program Options is for when you really need a thorough solution.
It's matter of opinion. I agree PO could've been done much better. From usability standpoint also. Gennadiy

Gennadiy Rozental wrote:
First and foremost I would like to remind everybody that we already have one library intended to cover this problem domain (completely unacceptable IMO - but that's another story). This library not only do not address issues of existent solution it's not even comparatively close feature wise. ...
Which existing library?
? program options
It is true that there is a small overlap with program-options, but the rest of the library (the majority of the library) doesn't have anything to do with Program Options AFAICT.
What is the basis for your opinion? IMO it's 1:1 correspondence. Could you do feature by feature comparison?
Could you? AFAICT, just reading the tutorial for Program Options vs. reading the docs for command-line parsing with the property tree doesn't show much overlap. Program Options is much more advanced and formats messages for you and all. The property tree gives you something like a map from arguments to argument values.
Also, but that needs to be confirmed, my impression was that for small jobs, Marcin's approach would be easier to apply, whereas Program Options is for when you really need a thorough solution.
It's matter of opinion. I agree PO could've been done much better. From usability standpoint also.
Right. -Thorsten
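To illustrate the "map from arguments to argument values" description above, here is a deliberately small sketch using only the standard library. It is not the command-line parser of either library under discussion; the function args_to_map and the "--name=value" convention are assumptions made for the example. It does no validation, type conversion or help-message formatting, which is where a fuller solution differs.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: collect "--name=value" and "--flag" arguments into a map.
// Arguments that do not start with "--" are ignored.
std::map<std::string, std::string> args_to_map(const std::vector<std::string>& args) {
    std::map<std::string, std::string> out;
    for (const std::string& arg : args) {
        if (arg.size() < 3 || arg.compare(0, 2, "--") != 0) continue;
        const std::size_t eq = arg.find('=');
        if (eq == std::string::npos) {
            out[arg.substr(2)] = "";  // a bare flag maps to an empty value
        } else {
            out[arg.substr(2, eq - 2)] = arg.substr(eq + 1);
        }
    }
    return out;
}
```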

It is true that there is a small overlap with program-options, but the rest of the library (the majority of the library) doesn't have anything to do with Program Options AFAICT.
What is the basis for your opinion? IMO it's 1:1 correspondence. Could you do feature by feature comparison?
Could you?
AFAICT, just reading the tutorial for Program Options vs. reading the docs for command-line parsing with the property tree doesn't show much overlap. Program Options is much more advanced and formats messages for you and all. The property tree gives you something like a map from arguments to argument values.
I believe the Storage component of PO and property_tree in this submission are the ones that overlap. Then each library presents different parsers for different formats to populate this storage. But most importantly: the problem domains are what overlap. Unless you want to amend the problem domain for this submission. Gennadiy

"Gennadiy Rozental" <gennadiy.rozental@thomson.com> wrote in message news:e272ol$3tv$1@sea.gmane.org... : : "Darren Cook" <darren@dcook.org> wrote in message : news:44470078.9000503@dcook.org... : >> First and foremost I would like to remind everybody that we already : >> have : >> one library intended to cover this problem domain (completely : >> unacceptable : >> IMO - but that's another story). This library not only do not address : >> issues : >> of existent solution it's not even comparatively close feature wise. ... : > : > Which existing library? : : ? program options I thought that you would have mentioned boost::serialize as an overlapping library, but program_options ?? I think that a clear difference is that Property Tree is intended to support I/O of configuration settings, and of other kinds of human-readable data files. I could well envision an application that uses both libs: program_options to handle command-line parameters, and ptree as a storage format for its data files. I agree with many of your other points. In particular ptree could be made leaner, and the double-indexing may well be overkill (I haven't looked at the implementation itself). But the needs the library seeks to address are very real. Ivan -- http://ivan.vecerina.com/contact/?subject=NG_POST <- email contact form

I think that a clear difference is that Property Tree is intended to support I/O of configuration settings, and of other kinds of human-readable data files. I could well envision an application
The same with PO. PO doesn't deal with config file generation but that (as I mention in my review) is completely unnecessary anyway. Why and how frequently would I want to generate my config files programmatically?
that uses both libs: program_options to handle command-line parameters, and ptree as a storage format for its data files.
program_options has its own facility for that.
I agree with many of your other points. In particular ptree could be made leaner, and the double-indexing may well be overkill (I haven't looked at the implementation itself). But the needs the library seeks to address are very real.
Could be. But PO already does everything this library does. Any specific examples of what is missing (other than some extra parsers for xml and registry)? Gennadiy

Gennadiy Rozental wrote:
The same with PO. PO doesn't deal with config file generation but that (as I mention in my review) is completely unnecessary anyway. Why and how frequently would I want to generate my config files programmatically?
Every program that allows configuration via a GUI or even an in-program CLI will want to persist the configuration changes. Examples? Take any program at all that is not meant to run once, do a job, and exit (as compilers do), but instead runs for some time. Any web browser, email client, word processor, whatever. They all allow configuration at runtime and thus need to persist the configuration into permanent storage. Rare exceptions here are servers: Apache, Xorg Server (though that's not a virtue for the XServer). And even the X server has many tools that generate configuration. Even the Linux kernel has programs that generate config files! Sebastian Redl

Gennadiy Rozental wrote:
I think that a clear difference is that Property Tree is intended to support I/O of configuration settings, and of other kinds of human-readable data files. I could well envision an application
The same with PO. PO doesn't deal with config file generation but that (as I mention in my review) is completely unnecessary anyway. Why and how frequently would I want to generate my config files programmatically?
We read and write our config files a lot. Our systems perform their own optimisations and the latest settings are stored in a config file. The software reads in the current config file, modifies some values and writes out the entire config file again (currently they are XML based). But some values are also only human editable (not on the system setup display etc) so simple XML based files work well because they can easily be read/written by a person and the software itself. Cheers Russell

We read and write our config files a lot. Our systems perform their own optimisations and the latest settings are stored in a config file. The software reads in the current config file, modifies some values and writes out the entire config file again (currently they are XML based).
Use Serialization lib for this.
But some values are also only human editable (not on the system setup display etc) so simple XML based files work well because they can easily be read/written by a person and the software itself.
There are many ways to deal with it. For example you could keep these fields as strings in memory or use separate storage for them or teach boost::serialization to save/load them. Gennadiy.

Use Serialization lib for this.
But some values are also only human editable (not on the system setup display etc) so simple XML based files work well because they can easily be read/written by a person and the software itself.
There are many ways to deal with it. For example you could keep these fields as strings in memory or use separate storage for them or teach boost::serialization to save/load them.
I have recently tried to do just this. Not subtracting from the great value of the serialization library, it has a nasty habit of polluting the XML files it generates with all sorts of magic id numbers that definitely are not human editable or creatable. By creating my own archive class I was able to get rid of most of them, but not all. Some (class-id integers and tracking-id integers) are so inherently embedded in the library that it would require quite an effort to hack them out.
To be able to hand-edit XML serialization files, I needed to replace class-id integers with class names (the ones you specify to BOOST_SERIALIZATION_EXPORT macros). Initially I thought it would be easy, but I was wrong. The entire dynamic instantiation mechanism in the serialization library seems to depend heavily on these integer ids. It only uses names when a class is first seen and there is no id for it yet. I then tried to use a sure-fire method ;-) , which failed as well. Briefly, I made my archive class generate fake class-ids from class names (in an effort to predict which id would be expected now by serialization). It failed for reasons I cannot recall at the moment - they had something to do with inheritance hierarchies.
In the case of tracking-ids I just wanted them out, because I do not have duplicate objects in human-created files. On the other hand, I couldn't disable tracking per-class, because I still wanted other archives to do tracking on these objects. A couple of days ago I even posted a question for Robert, and he says that I would need to create my own version of tracking.hpp.
So as you can see it is not all roses with the serialization library and human readable files. Best regards, Marcin

"Marcin Kalicinski" <kalita@poczta.onet.pl> wrote in message news:e2bho8$c7e$1@sea.gmane.org...
Use Serialization lib for this.
But some values are also only human editable (not on the system setup display etc) so simple XML based files work well because they can easily be read/written by a person and the software itself.
There are many ways to deal with it. For example you could keep these fields as strings in memory or use separate storage for them or teach boost::serialization to save/load them.
I have recently tried to do just this. Not subtracting from the great value of the serialization library, it has a nasty habit of polluting the XML files it generates with all sorts of magic id numbers that definitely are not human editable or creatable. By creating my own archive class I was able to get rid of most of them, but not all. Some (class-id integers and tracking-id integers) are so inherently embedded in the library that it would require quite an effort to hack them out.
To be able to hand-edit XML serialization files, I needed to replace class-id integers with class names (the ones you specify to BOOST_SERIALIZATION_EXPORT macros). Initially I thought it would be easy, but I was wrong. The entire dynamic instantiation mechanism in the serialization library seems to depend heavily on these integer ids. It only uses names when a class is first seen and there is no id for it yet. I then tried to use a sure-fire method ;-) , which failed as well. Briefly, I made my archive class generate fake class-ids from class names (in an effort to predict which id would be expected now by serialization). It failed for reasons I cannot recall at the moment - they had something to do with inheritance hierarchies.
In case of tracking-ids I just wanted them out, because I do not have duplicate objects in human-created files. On the other hand, I couldn't disable tracking per-class, because I still wanted other archives to do tracking on these objects. A couple of days ago I even posted a question for Robert, and he says that I would need to create my own version of tracking.hpp.
So as you can see it is not all roses with serialization library and human readable files.
Well, you need to bring it all to Robert's attention. Boost serialization should support a wide variety of archives, including simple xml. Gennadiy

So as you can see it is not all roses with serialization library and human readable files.
Well, you need to bring it all to Robert's attention. Boost serialization should support a wide variety of archives, including simple xml.
Of course I did. You can see his answer in a separate thread, posted today I think. Best regards, Marcin

"Gennadiy Rozental" <gennadiy.rozental@thomson.com> wrote in message news:e2c2s5$l3n$1@sea.gmane.org... : > In case of tracking-ids I just wanted them out, because I do not have : > duplicate objects in human-created files. On the other hand, I couldn't : > disable tracking per-class, because I still wanted other archives to do : > tracking on these objects. A couple of days ago I even posted a question : > for Robert, and he says that I would need to create my own version of : > tracking.hpp. : > : > So as you can see it is not all roses with serialization library and human : > readable files. : : Well you need to bring it all to the Robert attention. Boost serialization : should support wide variety of archives, including simple xml. But can it? Can boost::serialization support an archive format that: - is easily human-readable, human-editable, and human-*writable* (which IMO excludes having "class ID" fields and such) - provides straightforward support for default values and optional fields, which is important for backwards compatiblity with older archives - will not bark at additional/not-understood fields that are present in the file (important for forward-compatibility) - provides support for archives that are read/written by other programming languages (as JSON does). By the current support for XML provided by boost::s11n, I doubt that these features can be easily provided. But I'd love to be proven wrong. Can you help us see the light ? I find you are being quite vocal against a library which obviously does not address a need that you have. But as Bjarne himself likes to admit, no one knows what most C++ programmers do. When I define a configuration or data file, I like to design it around what end-users will think is logical, not around the (current and possibly changing) representation of the data within my C++ program. Boost::s11n as it is today does not seem to support this approach. ptree does much better. And others have other reasons to like ptree. 
The fact that, for your needs, ptree doesn't allow anything that s11n doesn't already do, should NOT be a reason to reject the library. As I wrote elsewhere, regex and spirit are just as overlapping as ptree and s11n. Yet many needs are only served well by one of the two twin libraries. Regards, Ivan -- http://ivan.vecerina.com/contact/?subject=NG_POST <- email contact form
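Two of the requirements listed above (default values for optional fields, and tolerance of fields the program does not understand) can be sketched in a few lines. The example assumes C++17 and a flat std::map of strings as the loaded file; get_optional and get_or here are hypothetical free functions written for this illustration, not the interface of ptree or of Boost.Serialization.

```cpp
#include <map>
#include <optional>
#include <sstream>
#include <string>
#include <utility>

// Stand-in for a loaded configuration file: every value is stored as text.
using settings = std::map<std::string, std::string>;

// Returns the converted value, or nullopt if the key is absent or malformed.
template <class T>
std::optional<T> get_optional(const settings& s, const std::string& key) {
    auto it = s.find(key);
    if (it == s.end()) return std::nullopt;
    std::istringstream in(it->second);
    T value{};
    if (!(in >> value)) return std::nullopt;
    return value;
}

// Returns the converted value, or `fallback` for files that lack the field.
template <class T>
T get_or(const settings& s, const std::string& key, T fallback) {
    return get_optional<T>(s, key).value_or(std::move(fallback));
}
```

Keys the program never asks for are simply left alone, so a file written by a newer version of the program still loads, and a file written by an older version falls back to the defaults.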

Ivan Vecerina wrote:
The fact that, for your needs, ptree doesn't allow anything that s11n doesn't already do, should NOT be a reason to reject the library.
As I wrote elsewhere, regex and spirit are just as overlapping as ptree and s11n. Yet many needs are only served well by one of the two twin libraries.
Regex and Xpressive cover almost exactly the same domain space, and both are Boost libraries (xpressive will be in 1.34). Overlap with existing libraries by itself is NOT enough to reject a library. I've used serialization, program options, multi-index, and property tree. To my eye property_tree is complementary to the others and shouldn't necessarily be implemented in terms of the others. Serialization, for example, solves a much more general persistence problem than property tree, but requires a compiled library. property_tree's persistence element is simpler and more focused and nicely header only. Note that I believe that serialization can be adapted to read/write JSON, for example, but to date no-one has done this. Anyway, it's my view that we should turn our focus to the merits of the library interface and implementation. Jeff

"Gennadiy Rozental" <gennadiy.rozental@thomson.com> writes:
We read and write our config files a lot. Our systems perform their own optimisations and the latest settings are stored in a config file. The ...
When quoting other posters, please leave an attribution at the top so we know who you're quoting. Thanks, -- Dave Abrahams Boost Consulting www.boost-consulting.com

"David Abrahams" <dave@boost-consulting.com> wrote in message news:uodyu38jj.fsf_-_@boost-consulting.com...
"Gennadiy Rozental" <gennadiy.rozental@thomson.com> writes:
We read and write our config files a lot. Our systems perform their own optimisations and the latest settings are stored in a config file. The ...
When quoting other posters, please leave an attribution at the top so we know who you're quoting.
Thanks,
Don't you see it in thread view? But Ok. Gennadiy

Gennadiy Rozental wrote:
"David Abrahams" <dave@boost-consulting.com> wrote in message news:uodyu38jj.fsf_-_@boost-consulting.com...
"Gennadiy Rozental" <gennadiy.rozental@thomson.com> writes:
We read and write our config files a lot. Our systems perform their own optimisations and the latest settings are stored in a config file. The ...
When quoting other posters, please leave an attribution at the top so we know who you're quoting.
Thanks,
Don't you see it in thread view? But Ok.
I certainly do, but some people may get daily digests and such -Thorsten

"Gennadiy Rozental" <gennadiy.rozental@thomson.com> writes:
"David Abrahams" <dave@boost-consulting.com> wrote in message news:uodyu38jj.fsf_-_@boost-consulting.com...
"Gennadiy Rozental" <gennadiy.rozental@thomson.com> writes:
We read and write our config files a lot. Our systems perform their own optimisations and the latest settings are stored in a config file. The ...
When quoting other posters, please leave an attribution at the top so we know who you're quoting.
Thanks,
Don't you see it in thread view?
Yes, but my newsreader also hides already-read messages by default, so the message you're replying to is usually not visible. -- Dave Abrahams Boost Consulting www.boost-consulting.com

Gennadiy Rozental wrote:
"David Abrahams" <dave@boost-consulting.com> wrote in message news:uodyu38jj.fsf_-_@boost-consulting.com...
"Gennadiy Rozental" <gennadiy.rozental@thomson.com> writes:
We read and write our config files a lot. Our systems perform their own optimisations and the latest settings are stored in a config file. The ...
When quoting other posters, please leave an attribution at the top so we know who you're quoting.
Thanks,
Don't you see it in thread view?
Well, given that several posters reply in a way that breaks threading, at least I have to turn threading off in order to make sense of the conversation. - Volodya

The same with PO. PO doesn't deal with config file generation but that (as I mention in my review) is completely unnecessary anyway. Why and how frequently would I want to generate my config files programmatically?
Administrative tools editing configurations for servers. The lack of saving capabilities prevented us from using PO with such a project. Best regards Jorge

"Jorge Lodos" <lodos@segurmatica.cu> wrote in message news:001c01c6649f$4f494f60$f9010a0a@segurmatica.cu...
The same with PO. PO doesn't deal with config file generation but that (as I mention in my review) is completely unnecessary anyway. Why and how frequently would I want to generate my config files programmatically?
Administrative tools editing configurations for servers. The lack of saving capabilities prevented us from using PO with such a project.
Try Serialization Library. Gennadiy

"Gennadiy Rozental" <gennadiy.rozental@thomson.com> wrote in message news:e287in$lsr$1@sea.gmane.org...
: > I think that a clear difference is that Property Tree is intended
: > to support I/O of configuration settings, and of other kinds of
: > human-readable data files. I could well envision an application
:
: The same with PO. PO doesn't deal with config file generation but that (as I
: mention in my review) is completely unnecessary anyway. Why and how
: frequently would I want to generate my config files programmatically?
Whenever your users prefer a GUI to a text editor? I currently have an application for a configurable manufacturing process. Various parameters can be tuned interactively, and when good results are obtained, the current parameters can be saved into a configuration file. Another file stores a series of coordinates defining some reference positions used by my robotic system. There is no way that I would want to type in those coordinates by hand!
: > that uses both libs: program_options to handle command-line
: > parameters, and ptree as a storage format for its data files.
:
: program_options has its own facility for that.
But it can only read it, right?
: > I agree with many of your other points. In particular ptree
: > could be made leaner, and the double-indexing may well be
: > overkill (I haven't looked at the implementation itself).
: > But the needs the library seeks to address are very real.
:
: Could be. But PO is already doing everything this library does. Any specific
: examples of what is missing (other than some extra parsers for xml and
: registry)?
The data I need to store often includes arrays, or even tree-like structures. How would I handle this with program_options?
Using a library similar to ptree, I have also been storing complete "scene graphs" as text files (i.e. a hierarchical collection of objects, each stored with a transformation matrix, paths to mesh and texture files, and more...). The scene was edited graphically at most times, but having the textual representation was convenient for some manual touch-ups, and of course for tracking changes in a revision control system.
I haven't written a single large application where I haven't felt the need to use a library similar to ptree -- and done it with very obvious benefits.
I might not switch over to ptree, because I am not fond of some of its design aspects. But it definitely has uses that go beyond the scope of the provided "tutorial" example. Uses that program_options does not support...
Ivan
-- http://ivan.vecerina.com/contact/?subject=NG_POST <- email contact form

Whenever your users prefer a GUI to a text editor? I currently have an application for a configurable manufacturing process. Various parameters can be tuned interactively, and when good results are obtained, the current parameters can be saved into a configuration file. Another file stores a series of coordinates defining some reference positions used by my robotic system. There is no way that I would want to type in those coordinates by hand!
: > that uses both libs: program_options to handle command-line
: > parameters, and ptree as a storage format for its data files.
:
: program_options has its own facility for that.
But it can only read it, right?
: > I agree with many of your other points. In particular ptree
: > could be made leaner, and the double-indexing may well be
: > overkill (I haven't looked at the implementation itself).
: > But the needs the library seeks to address are very real.
:
: Could be. But PO is already doing everything this library does. Any specific
: examples of what is missing (other than some extra parsers for xml and
: registry)?
The data I need to store often includes arrays, or even tree-like structures. How would I handle this with program_options ??
Using a library similar to ptree, I have also been storing complete "scene graphs" as text files (i.e. a hierarchical collection of objects, each stored with a transformation matrix, paths to mesh and texture files, and more...). The scene was edited graphically at most times, but having the textual representation was convenient for some manual touch-ups, and of course for tracking changes in a revision control system.
I haven't written a single large application where I haven't felt the need to use a library similar to ptree -- and done it with very obvious benefits.
I might not switch over to ptree, because I am not fond of some of its design aspects. But it definitely has uses that go beyond the scope of the provided "tutorial" example. Uses that program_options does not support...
All of the above could (should?) be implemented more conveniently using the Serialization Library. I believe program runtime parameters support has different goals and rationale. If you need a bidirectional permanent store for some application state data, use the serialization lib. If you need to read configuration from different sources, and potentially only part of it, you need a different solution. The difference is that in the first case there is a direct correspondence between data in permanent storage and its presentation at runtime within the application (hence bidirectional). In the latter case I may have parameters coming from different sources that conflict with each other, and I may want to use only 1 in 1000 of them. And this is the runtime parameters domain. Intended to be covered by PO library (failed IMO, but again that's a different story) Gennadiy

Gennadiy Rozental wrote:
All of the above could (should?) be implemented more conveniently using the Serialization Library. I believe program runtime parameters support has different goals and rationale. If you need a bidirectional permanent store for some application state data, use the serialization lib. If you need to read configuration from different sources, and potentially only part of it, you need a different solution. The difference is that in the first case there is a direct correspondence between data in permanent storage and its presentation at runtime within the application (hence bidirectional). In the latter case I may have parameters coming from different sources that conflict with each other, and I may want to use only 1 in 1000 of them. And this is the runtime parameters domain. Intended to be covered by PO library (failed IMO, but again that's a different story)
We do use serialization for this currently to XML values, but not all the values in our config files are available via the UI because under normal circumstances, they don't need to be tweaked, but we still need access to them at some point. Therefore if the config file is human-readable, these can be tweaked without need for extra UI. We currently use XML archives with serialization for this, but because of class/type ids etc and other info serialization squirts out, these files can easily be screwed up. This solution seems much neater to me. Cheers Russell

We do use serialization for this currently to XML values, but not all the values in our config files are available via the UI because under normal circumstances, they don't need to be tweaked, but we still need access to them at some point.
Therefore if the config file is human-readable, these can be tweaked without need for extra UI.
We currently use XML archives with serialization for this, but because of class/type ids etc and other info serialization squirts out, these files can easily be screwed up.
Write SimpleXML Archive and submit it to include in serialization lib. It looks like there is a market for something like that. Gennadiy

We currently use XML archives with serialization for this, but because of class/type ids etc and other info serialization squirts out, these files can easily be screwed up.
Write SimpleXML Archive and submit it to include in serialization lib. It looks like there is a market for something like that.
I have tried to do just that, see my post copy-pasted from another part of this thread:
<snip>
I have recently tried to do just this. Not subtracting from the great value of the serialization library, it has a nasty habit of polluting the XML files it generates with all sorts of magic id numbers that definitely are not human editable or creatable. By creating my own archive class I was able to get rid of most of them, but not all. Some (class-id integers and tracking-id integers) are so inherently embedded in the library that it would require quite an effort to hack them out.
To be able to hand-edit XML serialization files, I needed to replace class-id integers with class names (the ones you specify to BOOST_SERIALIZATION_EXPORT macros). Initially I thought it would be easy, but I was wrong. The entire dynamic instantiation mechanism in the serialization library seems to depend heavily on these integer ids. It only uses names when a class is first seen and there is no id for it yet. I then tried to use a sure-fire method ;-) , which failed as well. Briefly, I made my archive class generate fake class-ids from class names (in an effort to predict which id would be expected now by serialization). It failed for reasons I cannot recall at the moment - they had something to do with inheritance hierarchies.
In the case of tracking-ids I just wanted them out, because I do not have duplicate objects in human-created files. On the other hand, I couldn't disable tracking per-class, because I still wanted other archives to do tracking on these objects. A couple of days ago I even posted a question for Robert, and he says that I would need to create my own version of tracking.hpp.
So as you can see it is not all roses with the serialization library and human readable files.
<snip>
Best regards, Marcin

Hi Everybody,
I see there is some confusion about possible overlap of program_options library and property_tree libraries. I will try to present my view on the problem here.
First of all, my opinion is there is very little overlap. The biggest difference between the libraries is that property_tree is hierarchical and program_options is linear - i.e. options are accessible through a map-like interface. Structure of file formats, like XML, JSON etc. is inherently hierarchical. The presence of hierarchy is their greatest strength, and most important feature. Large amount of information is conveyed by structure of the tree alone, not only by values of options it contains. One option in two different branches of the tree can have completely different meaning. Contrary to that, in program_options library there is only at best a notion of "positional options" which is completely different from hierarchy.
Briefly, PO library operates on a flat string of options, property_tree is a DOM.
Think of property_tree like a replacement for Microsoft XML parser, or JSON API, or Windows INI API, or registry API. Can you imagine using program_options to handle XML trees, like you get from an XML parser? It is not suited for that, its problem domain lies in completely different place, namely parsing a linear string of (command-line) options, possibly supplied as a config file or files.
In addition to that, PO library does not support writing the structures back to config files. It is not a flaw with the library, it just does not need to do it, because options for programs are meant to be "one shot", or "read and forget", like command line arguments. There's no need to store them back where they came from.
On the other hand, PO library supports many things that do not belong to property_tree. These include, for example, options descriptions and notify() mechanisms.
Taking all that into account I do not see why anybody insists there might exist 1:1 correspondence between the libraries. I agree both can parse command lines, but that's about it. Other functionality is very different.
Best regards, Marcin

Hi Everybody,
I see there is some confusion about possible overlap of program_options library and property_tree libraries. I will try to present my view on the problem here.
First of all, my opinion is there is very little overlap. The biggest difference between the libraries is that property_tree is hierarchical and program_options is linear - i.e. options are accessible through a map-like interface. Structure of file formats, like XML, JSON etc. is inherently hierarchical. The presence of hierarchy is their greatest strength, and most important feature. Large amount of information is conveyed by structure of the tree alone, not only by values of options it contains. One option in two different branches of the tree can have completely different meaning. Contrary to that, in program_options library there is only at best a notion of "positional options" which is completely different from hierarchy.
Briefly, PO library operates on a flat string of options, property_tree is a DOM.
IMO it's just one of the many design flaws of PO library.
Think of property_tree like a replacement for Microsoft XML parser, or JSON API, or Windows INI API, or registry API. Can you imagine using program_options to handle XML trees, like you get from an XML parser? It is not suited for that, its problem domain lies in completely different place, namely parsing a linear string of (command-line) options, possibly supplied as a config file or files.
In addition to that, PO library does not support writing the structures back to config files. It is not a flaw with the library, it just does not need to do it, because options for programs are meant to be "one shot", or "read and forget", like command line arguments. There's no need to store them back where they came from.
On the other hand, PO library supports many things that do not belong to property_tree. These include, for example, options descriptions and notify() mechanisms.
Taking all that into account I do not see why anybody insists there might exist 1:1 correspondence between the libraries. I agree both can parse command lines, but that's about it. Other functionality is very different.
I admit I must've misinterpreted the problem domain you are trying to cover. So now you are saying that your library should be used to implement permanent data storage? But we already have a solution for that as well. Much better IMO in any sense (safety, convenience, automation etc). All in all, in between PO and the Serialization library I do not see any place for this submission. It doesn't stand a comparison as a runtime parameters support facility even with PO library (no conflict resolution, no formats specification, no automatic/custom validation, no async action assignment etc). And it doesn't stand a comparison as a permanent storage facility with Serialization library (in most senses). Make no mistake, these are two different domains. And you could not sit on two chairs. Especially if any one of them is too big for you. Gennadiy.

"Gennadiy Rozental" <gennadiy.rozental@thomson.com> wrote in message news:e28s8g$29i$1@sea.gmane.org...
: I admit I must've misinterpreted the problem domain you are trying to cover.
: So now you are saying that your library should be used to implement
: permanent data storage? But we already have a solution for that as well.
: Much better IMO in any sense (safety, convenience, automation etc). All in
: all in between PO and Serialization library I do not see any place for this
: submission.
Hi Gennadiy,
There is a profusion of applications that use xml, JSON, or a similar format for data storage. In all these applications, being able to dynamically manipulate an in-memory representation of the data structure is a very common need. boost::serialize goes straight from C++ object to stream, and there is no opportunity to manipulate the stored data in-memory (let me be corrected if I am wrong).
: It doesn't stand a comparison as runtime parameters support
: facility even with PO library ( no conflict resolution, no formats
: specification, no automatic/custom validation no async action assignment
: etc). And it doesn't stand a comparison as a permanent storage facility
: with Serialization library (in most senses). Make no mistake these are two
: different domains.
xml, JSON, the Windows registry, some uses of command-line parameters, etc have a lot in common. And ptree seeks to provide a common in-memory representation for all these formats. I see a real value in it.
But I have a question for you, or for any advanced user/developer of boost serialize:
How would you look at ptree being a possible target format (Archive) for boost::serialize?
-- http://ivan.vecerina.com/contact/?subject=NG_POST <- email contact form

: I admit I must've misinterpreted the problem domain you are trying to cover.
: So now you are saying that your library should be used to implement
: permanent data storage? But we already have a solution for that as well.
: Much better IMO in any sense (safety, convenience, automation etc). All in
: all in between PO and Serialization library I do not see any place for this
: submission.
Hi Gennadiy,
There is a profusion of applications that use xml, JSON, or a similar format for data storage. In all these applications, being able to dynamically manipulate an in-memory representation of the data structure is a very common need. boost::serialize goes straight from C++ object to stream, and there is no opportunity to manipulate the stored data in-memory (let me be corrected if I am wrong).
See my other post. What do you mean by manipulate the data? Care to give an example?
: It doesn't stand a comparison as runtime parameters support
: facility even with PO library ( no conflict resolution, no formats
: specification, no automatic/custom validation no async action assignment
: etc). And it doesn't stand a comparison as a permanent storage facility
: with Serialization library (in most senses). Make no mistake these are two
: different domains.
xml, JSON, the Windows registry, some uses of command-line parameters, etc have a lot in common. And ptree seeks to provide a common in-memory representation for all these formats. I see a real value in it.
A runtime parameters support facility requires a lot more than just a common representation. It needs to support at least the following features:
* Conflict resolution. What if the same parameter is in multiple sources?
* Parsers should be flexible enough to support a variety of formats, the CLA parser in particular.
* There should be an ability for validation/notification automation. I want to register callbacks for particular parameters and validation rules.
There are more, but my point is that this library doesn't address any of it. So it couldn't be considered a viable candidate for runtime parameters support.
Also such a facility wouldn't require any kind of fast search - no need to complicate the design.
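To make the first and third requirements in that list concrete, here is a minimal sketch in plain standard C++. It is not taken from program_options or from the reviewed library; the names `Source`, `Validator` and `resolve`, and the policy that a higher `priority` wins while a rejected value falls through to the next source, are all assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical: one parameter source (command line, config file, ...),
// already parsed into key/value pairs. Higher priority wins on conflict.
struct Source {
    std::string name;
    int priority;
    std::map<std::string, std::string> values;
};

// Hypothetical: a validation rule registered for one parameter name.
using Validator = std::function<bool(const std::string&)>;

// Merge all sources. For each key, the first value that passes validation,
// visiting sources from highest to lowest priority, is kept.
std::map<std::string, std::string> resolve(
    std::vector<Source> sources,
    const std::map<std::string, Validator>& validators)
{
    std::stable_sort(sources.begin(), sources.end(),
        [](const Source& a, const Source& b) { return a.priority > b.priority; });

    std::map<std::string, std::string> result;
    for (const Source& source : sources) {
        for (const auto& entry : source.values) {
            if (result.count(entry.first)) continue;  // set by a higher-priority source
            const auto rule = validators.find(entry.first);
            if (rule != validators.end()) {
                assert(rule->second && "registered validator must be callable");
                if (!rule->second(entry.second)) continue;  // rejected value
            }
            result.insert(entry);
        }
    }
    return result;
}
```

Whether a rejected value should fall through to a lower-priority source or abort with an error is a policy decision; the sketch picks the former only to keep the example short.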
But I have a question for you, or for any advanced user/developer of boost serialize:
How would you look at ptree being a possible target format (Archive) for boost::serialize ?
I am not sure I understand: how PT could be an Archive? Archive if I am not mistaken is the model of permanent storage. PT is the model of in-memory storage. Gennadiy

"Gennadiy Rozental" <gennadiy.rozental@thomson.com> wrote in message news:e2asoc$2mn$1@sea.gmane.org...
: > There is a profusion of applications that use xml, JSON, or a similar
: > format for data storage. In all these applications, being able to
: > dynamically manipulate an in-memory representation of the data
: > structure is a very common need.
: > boost::serialize goes straight from C++ object to stream, and there
: > is no opportunity to manipulate the stored data in-memory (let me
: > be corrected if I am wrong).
:
: See my other post. What do you mean by manipulate the data?
: Care to give an example?
- extract some field values and inject them into an HTML document template (e.g. for a report-generator)
- manipulate the contents of an XML/configuration file that contains classes for which I have no C++ class defined.
But also:
- maybe I want to handle the same kind of dynamic data structures that all scripting languages have available (e.g. represent 'instances' of a dynamic class in a portable way).
- maybe I want better forward & backward portability of my storage file format than boost::serialize allows (manually checking for the existence of certain fields is more flexible).
- maybe boost::serialize is too rigid when it comes to reading a file that has been edited by users
- maybe boost::serialize is too complex and intimidating
It's not like boost::serialize currently dominates the world of persistent storage. Obviously many developers are not happy with it and use other kinds of solutions (JSON, XML, etc). That is a fact.
: > xml, JSON, the Windows registry, some uses of command-line parameters, etc
: > have a lot in common. And ptree seeks to provide a common in-memory
: > representation for all these formats. I see a real value in it.
:
: Runtime parameters support facility require a lot more than just common
: representation. It needs to support at least following features:
I said *some* uses of command-line parameters. Your demands are higher, without doubt. Many standard libraries have limitations that make them unacceptable to some users. A library exists for parsing CLA, but its scope is extremely narrow and specific.
: > How would you look at ptree being a possible target format (Archive)
: > for boost::serialize ?
:
: I am not sure I understand: how PT could be an Archive? Archive if I am not
: mistaken is the model of permanent storage. PT is the model of in-memory
: storage.
... which is why PT is very complementary.
My point is: if there is such a strong overlap between the data structure of boost::serialize and PT, then probably PT could use the (XML) serialization archive protocol to read/write XML files. A common back-end protocol would allow every new format support (JSON, INFO, etc) to be leveraged by both libraries.
At the same time, by providing a serialization archive that constructs an in-memory PT, the PT library would leverage the assets of boost::serialize, for example to facilitate the conversion of a struct to a ptree.
Where you see overlap and redundancy, I see complementarity and opportunity for synergy.
Ivan
-- http://ivan.vecerina.com/contact/?subject=NG_POST <- email contact form

"Ivan Vecerina" <ivec@yahoo.com> wrote in message news:e2b3d4$rgt$1@sea.gmane.org...
"Gennadiy Rozental" <gennadiy.rozental@thomson.com> wrote in message news:e2asoc$2mn$1@sea.gmane.org...
: > There is a profusion of applications that use xml, JSON, or a similar
: > format for data storage. In all these applications, being able to
: > dynamically manipulate an in-memory representation of the data
: > structure is a very common need.
: > boost::serialize goes straight from C++ object to stream, and there
: > is no opportunity to manipulate the stored data in-memory (let me
: > be corrected if I am wrong).
:
: See my other post. What do you mean by manipulate the data?
: Care to give an example?
- extract some field values and inject them into an HTML document template (e.g. for a report-generator) - manipulate the contents of an XML/configuration file that contains classes for which I have no C++ class defined.
There is quite a variety of in-memory data structures that already exist and would allow one to manipulate data in a more convenient/efficient and reliable way.
But also: - maybe I want to handle the same kind of dynamic data structures that all scripting languages have available (e.g. represent 'instances' of a dynamic class in a portable way).
So? What's the problem? Use tree with strings in nodes.
- maybe I want better forward & backward portability of my storage file format than boost::serialize allows (manually checking for the existence of certain fields is more flexible). - maybe boost::serialize is too rigid when it comes to reading a file that has been edited by users - maybe boost::serialize is too complex and intimidating
Maybe you need to make up your mind on what you need? If it's permanent storage for configuration edited at runtime, why do you need to edit it manually? In any case all of the problems above could be solved with PT.
It's not like boost::serialize currently dominates the world of persistent storage.
So? I see a potential for it to become a standard, like iostreams now.
Obviously many developers are not happy with it and use other kinds of solutions (JSON, XML, etc). That is a fact.
Is it?
: > How would you look at ptree being a possible target format (Archive)
: > for boost::serialize ?
:
: I am not sure I understand: how PT could be an Archive? Archive if I am not
: mistaken is the model of permanent storage. PT is the model of in-memory
: storage.
... which is why PT is very complementary.
But Why would I need another data structure? How is it better than tree or multiindex?
My point is: if there is such a strong overlap between the data structure of boost::serialize and PT,
There is none. boost::serialize doesn't deal in data structure domain.
then probably PT could use the (XML) serialization archive protocol to read/write XML files. A common back-end protocol would allow every new format support (JSON, INFO, etc) to be leveraged by both libraries.
At the same time, by providing a serialization archive that constructs an in-memory PT, the PT library would leverage the assets of boost::serialize, for example to facilitate the conversion of a struct to a ptree.
Where you see overlap and redundancy, I see complementarity and opportunity for synergy.
All of the above *could* be done. I still do not see why it *should*. What are the advantages of PT over other data structures? Gennadiy

: > How would you look at ptree being a possible target format (Archive)
: > for boost::serialize ?
:
: I am not sure I understand: how PT could be an Archive? Archive if I am not
: mistaken is the model of permanent storage. PT is the model of in-memory
: storage.
... which is why PT is very complementary.
There are reasons that you might want to have it as an archive target. You get archives in JSON, INI, Registry formats for free, and even an alternative format for XML. I have actually implemented an Archive class that uses ptree, so it definitely is possible and quite easy also. I might post it in the future when I make it somewhat more general (it's just a quick hack now). Truth is, it is not very fast, but if you want a sample of archive in JSON, here you go:
    {
        "d": {
            "class_id_type": "1",
            "class_name_type": "class Derived",
            "tracking_type": "0",
            "version_type": "0",
            "Base": {
                "class_id_optional_type": "0",
                "tracking_type": "0",
                "version_type": "0",
                "b": "78"
            },
            "d": "111"
        },
        "d2": {
            "class_id_type": "1",
            "Base": { "b": "11" },
            "d": "146"
        },
        "b": {
            "class_id_type": "0",
            "b": "52"
        },
        "i": "3",
        "v": {
            "class_id_optional_type": "2",
            "tracking_type": "0",
            "version_type": "0",
            "count": "5",
            "item": {
                "count": "3",
                "item": "1",
                "item": "2",
                "item": "3"
            },
            "item": { "count": "0" },
            "item": { "count": "0" },
            "item": { "count": "0" },
            "item": { "count": "0" }
        }
    }
Kind regards, Marcin

"Gennadiy Rozental" <gennadiy.rozental@thomson.com> writes:
Briefly, PO library operates on a flat string of options, property_tree is a DOM.
IMO it's just one of the many design flaws of PO library.
Is Vladimir aware of the other things you consider to be design flaws? -- Dave Abrahams Boost Consulting www.boost-consulting.com

"David Abrahams" <dave@boost-consulting.com> wrote in message news:u8xpzhr4r.fsf@boost-consulting.com...
"Gennadiy Rozental" <gennadiy.rozental@thomson.com> writes:
Briefly, PO library operates on a flat string of options, property_tree is a DOM.
IMO it's just one of the many design flaws of PO library.
Is Vladimir aware of the other things you consider to be design flaws?
During review I was opposing essentially every single design decision in this library. My wows were quite loud, so I think Vladimir is aware of my position. I doubt much changed since. By now I just don't care. I am never gonna be using it. That's for sure. Gennadiy

Gennadiy Rozental wrote:
Hi,
1. Name is really bad. Originally I thought this submission has something to do with property_map and was surprised to see what I see.
Maybe the name of property_map is very bad. :-)
5. CLA parser is a joke. It's unacceptable any way you present it.
Please be more detailed. -Thorsten

"Thorsten Ottosen" <thorsten.ottosen@dezide.com> wrote in message news:44475F31.6020709@dezide.com...
Gennadiy Rozental wrote:
Hi,
1. Name is really bad. Originally I thought this submission has something to do with property_map and was surprised to see what I see.
Maybe the name of property_map is very bad. :-)
5. CLA parser is a joke. It's unacceptable any way you present it.
Please be more detailed.
Compare it even with the one in PO (which I also consider not flexible enough). Gennadiy.

Gennadiy Rozental wrote:
5. CLA parser is a joke. It's unacceptable any way you present it.
Please be more detailed.
Compare it even with the one in PO (which I also consider not flexible enough).
I was trying to get you to put something concrete on the table, so I as a review manager (and everybody else) has a chance of understanding you. -Thorsten

"Thorsten Ottosen" <thorsten.ottosen@dezide.com> wrote in message news:4447D452.9010606@dezide.com...
Gennadiy Rozental wrote:
5. CLA parser is a joke. It's unacceptable any way you present it.
Please be more detailed.
Compare it even with the one in PO (which I also consider not flexible enough).
I was trying to get you to put something concrete on the table, so I as a review manager (and everybody else) has a chance of understanding you.
What do you want me to present? The set of formats supported by the PO CLA parser vs. the set of formats supported by this submission? Believe me, the ratio would be at least 50:1. Just look at both libraries. Gennadiy

Hi Gennadiy, Thank you for the review.
First and foremost I would like to remind everybody that we already have one library intended to cover this problem domain (completely unacceptable IMO - but that's another story).
You are talking about Program Options library. I don't quite agree it covers the same problem domain. I feel it is more focused on command-line than on reading configuration in general. The biggest difference is that property_tree is a DOM, not a linear structure. In program options you cannot easily take only a piece of configuration and pass it to another component, like in the example below:
    <configuration1>
        <option1>value1</option1>
        <option2>value2</option2>
    </configuration1>
    <configuration2>
        <option1>value1</option1>
        <option2>value2</option2>
    </configuration2>
Now suppose you have a component which is only interested in option1 and option2:
    ptree &config1 = get_child("configuration1");
    ptree &config2 = get_child("configuration2");
    use_my_component(config1); // Use component with config #1
    use_my_component(config2); // Use component with config #2
With program options you would have to teach my_component to extract values from whatever place they are in config file(s).
Other things that are not supported by program_options:
- writing configuration
- XML, JSON, INI, Windows registry parsing out of the box
Also, I think that in simple cases, syntax offered by property_tree is simpler and has less steep learning curve.
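The subtree-passing idea can be tried out without the library at all. The sketch below uses a toy `Node` type in standard C++ as a stand-in for ptree; `Node`, `add`, `find`, `get` and `make_document` are hypothetical names invented for this illustration and are not the reviewed library's interface. The component only ever sees its own subtree, and a missing option falls back to a default.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a property tree (NOT the reviewed ptree):
// each node has a key, a value, and an ordered list of children.
struct Node {
    std::string key;
    std::string value;
    std::vector<Node> children;  // ordered; duplicate keys are allowed

    // Append a child and return a reference to it. The reference is only
    // valid until the next add() on the same parent.
    Node& add(const std::string& k, const std::string& v = "") {
        assert(!k.empty() && "keys must be non-empty");
        children.push_back(Node{k, v, {}});
        return children.back();
    }

    // First child with the given key, or nullptr if there is none.
    const Node* find(const std::string& k) const {
        for (const Node& c : children)
            if (c.key == k) return &c;
        return nullptr;
    }

    // Value of the first child with the given key, or a fallback.
    std::string get(const std::string& k, const std::string& fallback) const {
        const Node* c = find(k);
        return c ? c->value : fallback;
    }
};

// Hypothetical component: it knows about option1 and option2 only,
// not about where its subtree sits in the whole document.
std::string use_my_component(const Node& config) {
    return config.get("option1", "") + "," + config.get("option2", "default2");
}

// Two configuration sections; the second omits option2 on purpose
// so that the fallback path is exercised.
Node make_document() {
    Node root;
    Node& c1 = root.add("configuration1");
    c1.add("option1", "value1");
    c1.add("option2", "value2");
    Node& c2 = root.add("configuration2");  // c1 must not be used after this
    c2.add("option1", "other1");
    return root;
}
```

Calling `use_my_component(*doc.find("configuration1"))` and then the same with `"configuration2"` runs one component against two sections of the same document, which is the point being made about a DOM versus a flat option list.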
1. Name is really bad. Originally I thought this submission has something to do with property_map and was surprised to see what I see.
Name comes from Java Properties class, I think problem domains of these are quite close, although property_tree is somewhat more sophisticated.
2. Why would you need fast lookup? In 99.9% of cases each variable is accessed single time in a program lifetime. No need to optimize so that you
I think you make the mistake of assuming that this library is supposed to be used only as a reader for command-line or other startup options (truth is that the tutorial might suggest it). Think of it as a lightweight replacement for the Microsoft XML parser, or a JSON API, or Registry functions, or the Windows INI file API. All of these have fast lookups, and there are good reasons for it. Command line parsing is there only for the sake of completeness; it covers maybe less than 5% of the library's problem domain. I actually added it because I thought it could make parsing _simple_ command-line option schemes extremely easy.
program startup time go from 5 mls to 5 mks. You won't notice this anyway (And in reality it's not even that big difference. Most of your startup time you will spend initiating either network connection or some GUI component or similar)
Your GUI component might be initialized by reading and looking up data in property tree. In one of my previous projects GUI layout was stored in property trees.
3. Even if you insist on alternative lookup. Why not just use single typedef with multi_index instead of all the implementation.
Could you elaborate more on that? I considered use of multi_index to implement indexing for properties, but it only affected the implementation part of library, not interface, and because I already had a working, exception safe solution, I didn't see the reason to dump it and add another dependency on another library.
5. CLA parser is a joke. It's unacceptable any way you present it.
Could you please be more specific. Anyway, bear in mind that cmdline parser is not intended to be a generic parser for all command-line parsing tasks, a replacement for program options. This is clearly stated in the docs. Its first and foremost advantage is that it is very simple to use, easier than program options in simple cases. Can you beat that simplicity (4 lines of code)?
    boost::property_tree::ptree pt;
    read_cmdline(argc - 1, argv + 1, "-", pt);
    // Gets -Nxx where xx is integer
    int n = pt.get<int>("N", 0);
    // Tests if -fno-frills is present
    bool no_frills = pt.get_optional<std::string>("f.no-frills");
6. I personally believe that inline implementation in this case is actually is rather a disadvantage. I would be perfectly happy to stick to ASCII for configuration purposes. Even if you insist on wide char support, I would implement it as a thin template wrapper around offline implementation that operates with void*. Plus some runtime polymorphism could be used.
I don't agree. Templating on character type is very easy and dropping that does not buy us anything. To be flexible the library needs to support policy customization anyway, so a cpp implementation is not an option. Also, a large part of the member functions are templated on the extracted type, so they couldn't be moved to a cpp file.
7. ptree_utils reinvent the wheel. Why not use Boost string algorithms?
Agreed. That is implementation detail, could be changed later.
8. I keep repeating: Traits could not be template parameters. You should have had three template parameters: KeyType, ValueType and ComparePolicy, the same way std::map does.
What about Extractor and Inserter? How about a proposed path splitting policy to allow non-string keys? I think it is better if all that is condensed into one class, because changing one element leads in most cases to changing the rest as well (if you change data type you have to provide different extractor and inserter, if you change key type you have to provide different comparison policy). std::map is a different case because there template parameters are much more independent.
9. Access interface are lacking. If I got string identify subsystem and string that identify parameter separately why do I need to concatenate them to get to the value
You don't have to concatenate them:
    int subsystem_parameter = pt.get_child("subsystem").get<int>("parameter");
10. General note: the whole design/implementation is unnecessary complicated. You general part is about 50k. In Boost.Test utils section I have a component with very similar rationale but better design (IMO obviously) implemented in 16k.
I don't see a way I could simplify it considerably. If you see, please let me know how.
11. Generating part if the design/implementation is completely unnecessary. It may? be useful in 1% of the usage cases and in those cases I would stick to some alternatives.
I don't understand what you mean. Can you please rephrase? Thank you, Marcin

"Marcin Kalicinski" <kalita@poczta.onet.pl> wrote in message news:e27vp4$o1p$1@sea.gmane.org...
Hi Gennadiy,
Thank you for the review.
First and foremost I would like to remind everybody that we already have one library intended to cover this problem domain (completely unacceptable IMO - but that's another story).
You are talking about Program Options library. I don't quite agree it covers the same problem domain. I feel it is more focused on command-line than on reading configuration in general.
Are there any facts supporting this feeling?
The biggest difference is that property_tree is a DOM, not a linear structure.
In large part it's an implementation detail. I might as well keep all the parameters in a single list with names like "a.b.c.d". Though I agree that variable_map could be done better.
In program options you cannot easily take only a piece of configuration and pass it to another component, like in the example below:
    <configuration1>
        <option1>value1</option1>
        <option2>value2</option2>
    </configuration1>
    <configuration2>
        <option1>value1</option1>
        <option2>value2</option2>
    </configuration2>
Now suppose you have a component which is only interested in option1 and option2:
    ptree &config1 = get_child("configuration1");
    ptree &config2 = get_child("configuration2");
    use_my_component(config1); // Use component with config #1
    use_my_component(config2); // Use component with config #2
I am not saying the PO library is designed well. It should have supported a parameter hierarchy in its variable_map.
With program options you would have to teach my_component to extract values from whatever place they are in the config file(s).
Other things that are not supported by program_options:
- writing configuration
Why would I need that? Or rather in how many instances I would need that?
- XML, JSON, INI, Windows registry parsing out of the box
You could implement them as add-ons.
Also, I think that in simple cases, the syntax offered by property_tree is simpler and has a less steep learning curve.
It's a matter of opinion, and again it's just a PO issue.
1. Name is really bad. Originally I thought this submission has something to do with property_map and was surprised to see what I see.
Name comes from Java Properties class, I think problem domains of these are quite close, although property_tree is somewhat more sophisticated.
The word "property" seems misplaced here. Why do you name these entities properties? It's way too generic a name. The word "tree" is misleading here. A tree is just an implementation detail. You may as well implement this using a different data structure.
2. Why would you need fast lookup? In 99.9% of cases each variable is accessed single time in a program lifetime. No need to optimize so that you
I think you make the mistake of assuming that this library is supposed to be used only as a reader for command-line or other startup options (truth is that the tutorial might suggest it).
Ok. So state your problem domain.
Think of it as a lightweight replacement for the Microsoft XML parser, or a JSON API, or Registry functions, or the Windows INI file API. All of these have fast lookups, and there are good reasons for it. Command line parsing is there only for the sake of completeness; it covers maybe less than 5% of the library's problem domain. I actually added it because I thought it could make parsing _simple_ command-line option schemes extremely easy.
My understanding of the problem domain is: program runtime parameters support. Whatever medium is used doesn't matter. And both your library and PO intend to cover it.
program startup time go from 5 mls to 5 mks. You won't notice this anyway (And in reality it's not even that big difference. Most of your startup time you will spend initiating either network connection or some GUI component or similar)
Your GUI component might be initialized by reading and looking up data in property tree. In one of my previous projects GUI layout was stored in property trees.
You will spend the majority of your startup time in system calls anyway.
3. Even if you insist on alternative lookup. Why not just use single typedef with multi_index instead of all the implementation.
Could you elaborate more on that? I considered use of multi_index to implement indexing for properties, but it only affected the implementation part of library, not interface, and because I already had a working, exception safe solution, I didn't see the reason to dump it and add another dependency on another library.
I believe typedef multiindex< ... > property_tree; and several free functions like get( .... ) should constitute your whole interface and implementation.
5. CLA parser is a joke. It's unacceptable any way you present it.
Could you please be more specific.
Essentially it's unusable for anything different from the one rigid format you chose.
Anyway, bear in mind that the cmdline parser is not intended to be a generic parser for all command-line parsing tasks, or a replacement for program options. This is clearly stated in the docs.
Why do you include it then? What is it intended to be?
Its first and foremost advantage is that it is very simple to use, easier than program options in simple cases:
Can you beat that simplicity (4 lines of code)?
    boost::property_tree::ptree pt;
    read_cmdline(argc - 1, argv + 1, "-", pt);
    // Gets -Nxx where xx is integer
    int n = pt.get<int>("N", 0);
    // Tests if -fno-frills is present
    bool no_frills = pt.get_optional<std::string>("f.no-frills");
Given that you do not have any way to specify CLA options, I don't see it as that much simpler than the PO interface.
6. I personally believe that inline implementation in this case is actually is rather a disadvantage. I would be perfectly happy to stick to ASCII for configuration purposes. Even if you insist on wide char support, I would implement it as a thin template wrapper around offline implementation that operates with void*. Plus some runtime polymorphism could be used.
I don't agree. Templating on character type is very easy and dropping that does not buy us anything. To be flexible the library needs to support policy customization anyway, so a cpp implementation is not an option. Also, a large part of the member functions are templated on the extracted type, so they couldn't be moved to a cpp file.
Most of the things that are implemented using compile-time polymorphism could be done using runtime polymorphism. You just need to tweak your implementation a bit.
8. I keep repeating: Traits could not be template parameters. You should have had three template parameters: KeyType, ValueType and ComparePolicy, the same way std::map does.
What about Extractor and Inserter?
Personally I don't see why you need that at all. I would stick with lexical_cast. Simplicity is a virtue. But! If you do want to support the ability to extract/save values differently for different types, you need to implement this as a trait, not as a policy.
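One way to read the "trait, not policy" suggestion is a per-type conversion template that falls back to stream extraction and is specialized only for types needing different treatment. The sketch below assumes that reading; `value_traits` and `get_as` are hypothetical names, and plain stream extraction stands in for lexical_cast so that the example needs only the standard library.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical conversion trait. The primary template uses stream
// extraction, roughly what a lexical_cast-style conversion would do.
template <typename T>
struct value_traits {
    static T from_string(const std::string& text) {
        std::istringstream in(text);
        T value;
        in >> value;
        // Reject failed conversions and trailing non-whitespace characters.
        if (in.fail() || !(in >> std::ws).eof())
            throw std::invalid_argument("cannot convert: " + text);
        return value;
    }
};

// Strings are returned whole; stream extraction would stop at a space.
template <>
struct value_traits<std::string> {
    static std::string from_string(const std::string& text) { return text; }
};

// Booleans accept words as well as digits.
template <>
struct value_traits<bool> {
    static bool from_string(const std::string& text) {
        if (text == "true" || text == "1") return true;
        if (text == "false" || text == "0") return false;
        throw std::invalid_argument("cannot convert: " + text);
    }
};

// Single entry point: the conversion is chosen by the requested type alone,
// so the container itself needs no extractor template parameter.
template <typename T>
T get_as(const std::string& text) {
    return value_traits<T>::from_string(text);
}

void example() {
    assert(get_as<int>("42") == 42);
    assert(get_as<std::string>("two words") == "two words");
    assert(get_as<bool>("true"));
}
```

With this arrangement a new value type is supported by adding a specialization, without touching the type of the container that stores the raw strings.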
How about a proposed path splitting policy to allow non-string keys? I think it is better if all that is condensed into one class, because changing one element leads in most cases to changing the rest as well (if you change data type you have to provide different extractor and inserter, if you change key type you have to provide different comparison policy). std::map is a different case because there template parameters are much more independent.
How are they more independent? Are you saying that your compare function is somehow different from the std::map one?
9. Access interface are lacking. If I got string identify subsystem and string that identify parameter separately why do I need to concatenate them to get to the value
You don't have to concatenate them:
int subsystem_parameter = pt.get_child("subsystem").get<int>("parameter");
And if I have more levels?
    string trace_config = pt.get_child("MyLib").get("debug").get("trace").get("config");
I would rather prefer
    string trace_config = pt.get("MyLib", "debug", "trace", "config" );
10. General note: the whole design/implementation is unnecessary complicated. You general part is about 50k. In Boost.Test utils section I have a component with very similar rationale but better design (IMO obviously) implemented in 16k.
I don't see a way I could simplify it considerably. If you see, please let me know how.
I did. See config_file.hpp in the Boost.Test code base.
11. Generating part if the design/implementation is completely unnecessary. It may? be useful in 1% of the usage cases and in those cases I would stick to some alternatives.
I don't understand what you mean. Can you please rephrase?
I mean that writing config files programmatically is not necessary enough and should not be part of this library. Gennadiy

I mean that writing config files programmatically is not necessary enough and should not be part of this library.
I don't understand that point of view: any application (as opposed to a simple command line utility) you write will almost certainly want to both read and write its configuration: take a look at your favorite word processor/spreadsheet/photo editor and I bet you it does exactly that. John.

I mean that writing config files programmatically is not necessary enough and should not be part of this library.
I don't understand that point of view: any application (as opposed to a simple command line utility) you write will almost certainly want to both read and write its configuration: take a look at your favorite word processor/spreadsheet/photo editor and I bet you it does exactly that.
"Any application" is an overstatement IMO. For an application to need to save some parameters, there should be a way to change them during application runtime. This fact limits our scope to something with a GUI (with rare exceptions). These applications indeed may need bidirectional access to permanent storage for some data. But I would use the Serialization library for this purpose. Why would I need to go through extra hoops with a parameter tree? Gennadiy

The biggest difference is that property_tree is a DOM, not a linear structure.
In large part it's an implementation detail. I might as well keep all the parameters in a single list with names like "a.b.c.d". Though I agree that variable_map could be done better.
No, this is not implementation detail. This is a fundamental difference. Of course, you can have PO store options in form of a.b.c.d, but that does not make it hierarchical. How do you make a copy of a branch and attach it somewhere else? How do you remove a branch? How do you take only one branch and make another options structure of it? All these operations would be extremely cumbersome, not to say that also possibly inefficient, using linear structure.
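To make the "cumbersome" claim concrete, here is a sketch of what copying and removing a branch looks like when options live in a flat map keyed by dotted names. `FlatOptions`, `copy_branch` and `erase_branch` are hypothetical helpers written for this illustration; they are not part of program_options or of the submission.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

typedef std::map<std::string, std::string> FlatOptions;

// Copy the option named `from` and everything under "from." so that it also
// appears under `to`. Every entry has to be located by prefix and re-keyed.
// Keys that already exist under `to` are left unchanged.
void copy_branch(FlatOptions& opts, const std::string& from, const std::string& to) {
    assert(!from.empty() && !to.empty());
    const std::string prefix = from + ".";
    FlatOptions copied;
    for (FlatOptions::const_iterator it = opts.lower_bound(from); it != opts.end(); ++it) {
        const std::string& key = it->first;
        if (key == from)
            copied[to] = it->second;
        else if (key.compare(0, prefix.size(), prefix) == 0)
            copied[to + "." + key.substr(prefix.size())] = it->second;
        else if (key.compare(0, from.size(), from) != 0)
            break;  // past every key that starts with `from`
        // Otherwise the key merely shares the text (e.g. "gui.windowing"); skip it.
    }
    opts.insert(copied.begin(), copied.end());
}

// Remove the option named `branch` and everything under "branch.".
// Returns the number of entries erased.
std::size_t erase_branch(FlatOptions& opts, const std::string& branch) {
    assert(!branch.empty());
    const std::string prefix = branch + ".";
    std::size_t erased = 0;
    FlatOptions::iterator it = opts.lower_bound(branch);
    while (it != opts.end() && it->first.compare(0, branch.size(), branch) == 0) {
        if (it->first == branch || it->first.compare(0, prefix.size(), prefix) == 0) {
            opts.erase(it++);
            ++erased;
        } else {
            ++it;
        }
    }
    return erased;
}

void example() {
    FlatOptions opts;
    opts["gui.window.width"] = "800";
    opts["gui.window.height"] = "600";
    opts["gui.windowing"] = "x11";
    copy_branch(opts, "gui.window", "backup.window");
    assert(opts.size() == 5);
    assert(opts["backup.window.width"] == "800");
    std::size_t erased = erase_branch(opts, "gui.window");
    assert(erased == 2);
    assert(opts.count("gui.windowing") == 1);
}
```

Both helpers have to guard against keys that merely share leading text with the branch name, which is the kind of bookkeeping a node-based structure avoids by copying or erasing one child.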
[...] With program options you would have to teach my_component extract values from whatever place they are in config file(s).
Other things that are not supported by program_options:
- writing configuration
Why would I need that? Or rather in how many instances I would need that?
I think this type of usage is rather common, at least the way I used property tree relied heavily on these features. For example I used it as a primitive serialization library where files containing objects were human creatable. I had an application which had GUI to manipulate these objects, and allowed to save them back again to files. All I/O was done using property tree.
- XML, JSON, INI, Windows registry parsing out of the box
You could implement them as add-ons.
How can you present hierarchical structure like XML or JSON as a linear string of options? A string of a.b.c.d -form keys is not extremely useful.
Also, I think that in simple cases, syntax offered by property_tree is simpler and has less steep learning curve. It's matter of opinion and again it's just PO issue.
Possibly, but it also supports options descriptions and notifications. These do not belong to property_tree. For example descriptions would make little sense when parsing XML file.
Essentially it's unusable for anything any different for one rigid format you chose.
That format covers most simple command-lines I have seen. It is not enough to implement a command line for a tool like gcc, but then it is not intended to. If you are implementing a copy utility that takes two filenames and several flags, (like -r, -m etc.), it works.
9. Access interface are lacking. If I got string identify subsystem and string that identify parameter separately why do I need to concatenate them to get to the value
You don't have to concatenate them:
int subsystem_parameter = pt.get_child("subsystem").get<int>("parameter");
And if I have more levels?
string trace_config = pt.get_child("MyLib").get("debug").get("trace").get("config");
I would rather prefer
string trace_config = pt.get("MyLib", "debug", "trace", "config" );
This is a matter of taste, in my opinion both are similar. But would you implement your version using variadic arguments or by supplying implementations up to N parameters?
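For comparison, the multi-argument form can be written today with C++11 variadic templates, a language feature that postdates this discussion; at the time the practical choice would have been overloads up to N parameters. The sketch below assumes a hypothetical `Node` type and a free function `get_at`; neither is part of the submission.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a property tree node.
struct Node {
    std::string data;
    std::vector<std::pair<std::string, Node>> children;

    // Return the child named `key`, creating it if it does not exist yet.
    Node& child_or_create(const std::string& key) {
        assert(!key.empty());
        for (auto& kv : children)
            if (kv.first == key) return kv.second;
        children.emplace_back(key, Node());
        return children.back().second;
    }

    // Return the child named `key`, or throw if absent.
    const Node& child(const std::string& key) const {
        for (const auto& kv : children)
            if (kv.first == key) return kv.second;
        throw std::out_of_range("no such child: " + key);
    }
};

// Base case: the path is exhausted, so return this node's own data.
inline const std::string& get_at(const Node& node) {
    return node.data;
}

// Recursive case: descend one level for each key argument.
template <typename... Rest>
const std::string& get_at(const Node& node, const std::string& key, const Rest&... rest) {
    return get_at(node.child(key), rest...);
}

void example() {
    Node root;
    root.child_or_create("MyLib")
        .child_or_create("debug")
        .child_or_create("trace")
        .child_or_create("config")
        .data = "verbose";
    assert(get_at(root, "MyLib", "debug", "trace", "config") == "verbose");
    // Starting from a subtree works the same way.
    assert(get_at(root.child("MyLib"), "debug", "trace", "config") == "verbose");
}
```

The call site reads like the preferred form in the message above, while each level is still resolved by an ordinary single-key child lookup.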
I don't see a way I could simplify it considerably. If you see, please let me know how. I did. See config_file.hpp in the Boost.Test code base.
I'll definitely have a look. Best regards, Marcin

"Marcin Kalicinski" <kalita@poczta.onet.pl> wrote in message news:e28p14$mdt$1@sea.gmane.org...
The biggest difference is that property_tree is a DOM, not a linear structure.
In a big part it's implementation detail. I might as well keep all the parameter in single list with names like "a.b.c.d". Though I agree that variable_map could be done better.
No, this is not implementation detail. This is a fundamental difference. Of course, you can have PO store options in form of a.b.c.d, but that does not make it hierarchical. How do you make a copy of a branch and attach it somewhere else? How do you remove a branch? How do you take only one branch and make another options structure of it? All these operations would be extremely cumbersome,
You got me convinced. PO indeed should be implemented differently ;)
not to say that also possibly inefficient, using linear structure.
Who cares?
[...] With program options you would have to teach my_component extract values from whatever place they are in config file(s).
Other things that are not supported by program_options:
- writing configuration
Why would I need that? Or rather in how many instances I would need that?
I think this type of usage is rather common, at least the way I used property tree relied heavily on these features. For example I used it as a primitive serialization library where files containing objects were human creatable. I had an application which had GUI to manipulate these objects, and allowed to save them back again to files. All I/O was done using property tree.
This type of usage requires different tools. I would use the Serialization lib for that.
- XML, JSON, INI, Windows registry parsing out of the box
You could implement them as add-ons.
How can you present hierarchical structure like XML or JSON as a linear string of options? A string of a.b.c.d -form keys is not extremely useful.
Also, I think that in simple cases, syntax offered by property_tree is simpler and has less steep learning curve. It's matter of opinion and again it's just PO issue.
Possibly, but it also supports options descriptions and notifications. These do not belong to property_tree. For example descriptions would make little sense when parsing XML file.
They would make a lot of sense, especially when you start producing an error message if some required property is missing.
Essentially it's unusable for anything any different for one rigid format you chose.
That format covers most simple command-lines I have seen. It is not enough to implement a command line for a tool like gcc, but then it is not intended to. If you are implementing a copy utility that takes two filenames and several flags, (like -r, -m etc.), it works.
Exactly. It does not stand comparison with the other existing solution for CLA parsing. Gennadiy

I think this type of usage is rather common, at least the way I used property tree relied heavily on these features. For example I used it as a primitive serialization library where files containing objects were human creatable. I had an application which had GUI to manipulate these objects, and allowed to save them back again to files. All I/O was done using property tree.
This type of usage requires different tools. I would use the Serialization lib for that.
What I think PT must have that serialization library is not meant to is:
1. The ability to load/save properties independently, not as a whole.
2. A documented (for library extensibility) parser interface allowing parser developers to accomplish (1).
At least 3 storages requiring (1) come to mind: windows registry, ISA Server storage and IIS metabase. I would put these requisites as conditions for acceptance. Best regards Jorge

Jorge Lodos wrote:
What I think PT must have that serialization library is not meant to is:
1. The ability to load/save properties independently, not as a whole. 2. A documented (for library extensibility) parser interface allowing parser developers to accomplish (1).
At least 3 storages requiring (1) come to mind: windows registry, ISA Server storage and IIS metabase. I would put these requisites as conditions for acceptance.
I think it is quite hard to require that a parser for config file X must exist for us to accept the library. It puts a great deal of burden on the library author. Our focus should be on the general core interface of the library s.t. we get a flexible solution that can be useful in many areas. Then if the author agrees to it, we can look at new parsers. -Thorsten

Thorsten Ottosen wrote:
Jorge Lodos wrote:
What I think PT must have that serialization library is not meant to is:
1. The ability to load/save properties independently, not as a whole. 2. A documented (for library extensibility) parser interface allowing parser developers to accomplish (1).
At least 3 storages requiring (1) come to mind: windows registry, ISA Server storage and IIS metabase. I would put these requisites as conditions for acceptance.
I think it is quite hard to require that a parser for config file X must exists for us to accept the library. It puts a great deal of burden on the library author. Our focus should be on the general core interface of the library s.t. we get a flexible solution that can be useful in many areas.
Where in the documentation would one find that interface documented in order to review that aspect? I think this omission is what's leading to the perception that the library is an xml library, or a command line library, or a... That along with the fact that nearly 1/2 of the documentation deals with describing these specific parsers.
My vote on the current property_tree submission is NO. I think it could be re-submitted in the future given a clear rationale accurately stating:
- its intended use
- a comparison with similar libraries (have you looked at McObject's ExtremeDb as one example)
- why/why not it's not based on the boost.graph library as a Directed-Acyclic-Graph
- why/why not it interacts with trees in spirit [Joel recently mentioned a fusion tree on the spirit mailing list]
- why would I not just use map< vector<string>, data > along with upper/lower bound [or a TST based map]
Without these the criteria required for review are a floating target. Other specific issues needing to be addressed include:
- clear separation of the library from specific instances of its use (xml/json/...) [ala boost.iostream filters for zlib...]
- string/separator/paths (see other postings in this thread)
- possible merging/interaction with the program_options library
Jeff Flinn

My vote on the current property_tree submission is NO. I think it could be re-submitted in the future given a clear rationale accurately stating:
- its intended use
- a comparison with similar libraries (have you looked at McObject's ExtremeDb as one example)
- why/why not it's not based on the boost.graph library as a Directed-Acyclic-Graph
I believe this rather belongs to a domain of implementation details. What difference would there be for the users? I have a clear vision of ptree, and this is std container + extra member functions to make access a snap. I would not like to expose any boost.graph interface anyway.
- why would I not just use map< vector<string>, data > along with upper/lower bound
Hmm. I think I have an answer below:
    #include <map>
    #include <string>
    #include <vector>
    #include <sstream>
    // I want to extract an int from some config file
    int i;
    // That's what I've got to do:
    typedef std::map< std::vector<std::string>, std::string > Container;
    Container container;
    read_xml(..., container); // Note this must be defined somewhere anyway
    std::vector<std::string> path;
    path.push_back("some");
    path.push_back("arbitrarily");
    path.push_back("long");
    path.push_back("path");
    Container::const_iterator it = container.find(path); // A very inefficient lookup
    if (it != container.end())
    {
        std::ostringstream out_stream;
        out_stream << it->second;
        std::istringstream in_stream(out_stream.str());
        in_stream >> i; // Finally got it!
    }
    else
        throw ...
Not a pretty sight, is it? I'd rather download TinyXML. Note that it already assumes there are parsers implemented somewhere else, which create that monstrous map. The map solution also does not let you manipulate the hierarchy in memory - map is a linear structure even though it is internally implemented as a tree. I believe any attempt to make the above shorter and more manageable leads just towards implementing another version of property_tree. Best regards, Marcin

Marcin Kalicinski wrote:
My vote on the current property_tree submission is NO. I think it could be re-submitted in the future given a clear rationale accurately stating:
- its intended use
- a comparison with similar libraries (have you looked at McObject's ExtremeDb as one example)
- why/why not it's not based on the boost.graph library as a Directed-Acyclic-Graph
I believe this rather belongs to a domain of implementation details. What difference would there be for the users? I have a clear vision of ptree, and this is std container + extra member functions to make access a snap. I would not like to expose any boost.graph interface anyway.
My point is that these reasons should have been stated in the rationale section of the documentation. What if I'd like to use BGL's bfs or dfs on the tree? Sounds like useful functionality to me, without reinventing the wheel.
- why would I not just use map< vector<string>, data > along with upper/lower bound
Hmm. I think I have an answer below:
    #include <map>
    #include <string>
    #include <vector>
    #include <sstream>
    // I want to extract an int from some config file
    int i;
    // That's what I've got to do:
    typedef std::map< std::vector<std::string>, std::string > Container;
    Container container;
    read_xml(..., container); // Note this must be defined somewhere anyway
    std::vector<std::string> path;
    path.push_back("some");
    path.push_back("arbitrarily");
    path.push_back("long");
    path.push_back("path");
    Container::const_iterator it = container.find(path); // A very inefficient lookup
    if (it != container.end())
    {
        std::ostringstream out_stream;
        out_stream << it->second;
        std::istringstream in_stream(out_stream.str());
        in_stream >> i; // Finally got it!
    }
    else
        throw ...
Not a pretty sight, is it? I'd rather download TinyXML. Note that it already assumes there are parsers implemented somewhere else, which create that monstrous map. The map solution also does not let you manipulate the hierarchy in memory - map is a linear structure even though it is internally implemented as a tree.
Again, these issues should have been stated in the rationale section. And if the above used (the current CVS version of) boost::filesystem::path and/or boost::assign, many of your objections could be alleviated. This also brings up a possible solution using boost::spirit::symbol, which would have the benefits of a ternary search tree.
I believe any attempt to make the above shorter and more manageable leads just towards implementing another version of property_tree.
With a different set of abilities/limitations that the reviewers could compare. Jeff Flinn
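For reviewers who want to weigh the map< vector<string>, data > alternative, the "upper/lower bound" part can be sketched as a range query: because vectors compare lexicographically, every path that starts with a given prefix is stored contiguously, beginning at lower_bound(prefix). `Path`, `PathMap` and `values_under` are hypothetical names used only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

typedef std::vector<std::string> Path;
typedef std::map<Path, std::string> PathMap;

// Collect the values of all entries whose path starts with `prefix`,
// in map order. Entries sharing a prefix are contiguous in the map, so the
// scan starts at lower_bound(prefix) and stops at the first mismatch.
std::vector<std::string> values_under(const PathMap& m, const Path& prefix) {
    std::vector<std::string> out;
    for (PathMap::const_iterator it = m.lower_bound(prefix); it != m.end(); ++it) {
        const Path& p = it->first;
        if (p.size() < prefix.size() ||
            !std::equal(prefix.begin(), prefix.end(), p.begin()))
            break;
        out.push_back(it->second);
    }
    return out;
}

void example() {
    PathMap m;
    m[Path{"gui", "window", "width"}] = "800";
    m[Path{"gui", "window", "height"}] = "600";
    m[Path{"gui", "theme"}] = "dark";
    m[Path{"net", "port"}] = "80";

    std::vector<std::string> window = values_under(m, Path{"gui", "window"});
    assert(window.size() == 2);
    assert(window[0] == "600");  // "height" sorts before "width"
    assert(window[1] == "800");
    assert(values_under(m, Path{"gui"}).size() == 3);
}
```

This covers reading a branch; attaching, copying or renaming a branch would still require rewriting every affected key, which is the limitation raised earlier in the thread.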

I think it is quite hard to require that a parser for config file X must exists for us to accept the library. It puts a great deal of burden on the library author. Our focus should be on the general core interface of the library s.t. we get a flexible solution that can be useful in many areas.
The requisites I meant were the ones enumerated, namely 1. The ability to load/save properties independently, not as a whole. 2. A documented (for library extensibility) parser interface allowing parser developers to accomplish (1). Sorry if this was not clear enough. Cheers Jorge

What I think PT must have that serialization library is not meant to is:
Sorry. I am not sure what you mean: does the submission have these requisites already or not?
1. The ability to load/save properties independently, not as a whole.
What do you mean by that? Load should be separate from save?
2. A documented (for library extensibility) parser interface allowing parser developers to accomplish (1).
How is this different from Serialization library? Gennadiy

What I think PT must have that serialization library is not meant to is:
Sorry. I am not sure what you mean: does the submission has these requisite already or not?
IMHO not.
1. The ability to load/save properties independently, not as a whole.
What do you mean by that? Load should be separate from save?
I mean that only one property (which of course could contain others) could be loaded (or saved) at a time. Not the whole tree. For instance, if my tree corresponds to a windows registry key, it is desirable to load or save a single value without loading or saving all the values in the key.
2. A documented (for library extensibility) parser interface allowing parser developers to accomplish (1).
How is this different from Serialization library?
The difference is support for (1). Cheers Jorge
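Requirement (1) can be pictured as a storage interface that is called once per property rather than once per tree. The sketch below is only an assumption about what such an interface might look like; `property_backend`, `memory_backend` and `copy_property` are hypothetical names, and the in-memory class merely stands in for a store such as a registry key.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical backend interface: each call touches exactly one property.
class property_backend {
public:
    virtual ~property_backend() {}
    // Read one property; returns false if it is absent.
    virtual bool load(const std::string& path, std::string& value) const = 0;
    // Write one property without touching any other.
    virtual void save(const std::string& path, const std::string& value) = 0;
};

// In-memory stand-in for a real store; it counts writes so that the
// per-property behaviour is observable.
class memory_backend : public property_backend {
public:
    memory_backend() : writes_(0) {}

    bool load(const std::string& path, std::string& value) const override {
        std::map<std::string, std::string>::const_iterator it = values_.find(path);
        if (it == values_.end()) return false;
        value = it->second;
        return true;
    }

    void save(const std::string& path, const std::string& value) override {
        assert(!path.empty());
        values_[path] = value;
        ++writes_;
    }

    int writes() const { return writes_; }

private:
    std::map<std::string, std::string> values_;
    int writes_;
};

// Transfer a single property between two stores; nothing else is read
// or written. Returns false if the source does not have the property.
bool copy_property(const property_backend& from, property_backend& to,
                   const std::string& path) {
    std::string value;
    if (!from.load(path, value)) return false;
    to.save(path, value);
    return true;
}

void example() {
    memory_backend registry, cache;
    registry.save("Software/App/Width", "800");
    registry.save("Software/App/Height", "600");

    bool copied = copy_property(registry, cache, "Software/App/Width");
    assert(copied);
    assert(cache.writes() == 1);  // only the requested property was written

    std::string value;
    bool has_height = cache.load("Software/App/Height", value);
    assert(!has_height);
}
```

A parser written against an interface of this shape could serve stores where loading or saving the whole tree is undesirable, which is the case described for the registry.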

"Jorge Lodos" <lodos@segurmatica.cu> wrote in message news:005701c66565$b7451ce0$f9010a0a@segurmatica.cu...
What I think PT must have that serialization library is not meant to is:
Sorry. I am not sure what you mean: does the submission has these requisite already or not?
IMHO not.
1. The ability to load/save properties independently, not as a whole.
What do you mean by that? Load should be separate from save?
I mean that only one property (which of course could contain others) could be loaded (or saved) at a time. Not the whole tree. For instance, if my tree corresponds to a windows registry key, it is desirable to load or save a single value without loading or saving all the values in the key.
I am quite sure Serialization lib will allow you to do this. Just apply it to appropriate field. Gennadiy

I mean that only one property (which of course could contain others) could be loaded (or saved) at a time. Not the whole tree. For instance, if my tree corresponds to a windows registry key, it is desirable to load or save a single value without loading or saving all the values in the key.
I am quite sure Serialization lib will allow you to do this. Just apply it to appropriate field.
It requires a lot of work. The serialization library does not provide a model for in-memory property storage, nor does it provide a windows registry archive (the last time I looked). Applying serialization to appropriate fields implies I already have those fields somehow organized. The way I see it, PT is the serialization library plus in-memory property storage. Perhaps it could have used serialization internally and serialization archives for loading and saving. As I said before, the real difference could be the loading and saving of some of the properties in the memory storage, without having to load/save them all. Cheers Jorge

On 4/21/06, Jorge Lodos <lodos@segurmatica.cu> wrote:
The way I see it, PT is the serialization library plus in-memory property storage. Perhaps it could have used serialization internally and serialization archives for loading and saving. As I said before, the real difference could be the loading and saving of some of the properties in the memory storage, without having to load/save them all.
I think you are correct here. This really sounds like it could've (should've?) been implemented using the serialization library and maybe BGL or a generic tree structure (which I would love to be a Boost library). That way you get all of the traversal mechanisms available to the tree (or graph) structure, and all of the archiving techniques available to the serialization lib. -Michael Fawcett

"Michael Fawcett" <michael.fawcett@gmail.com> wrote in message news:bc5bffe80604211135l41c2ecacx18105b56aa0689e8@mail.gmail.com...
On 4/21/06, Jorge Lodos <lodos@segurmatica.cu> wrote:
The way I see it, PT is the serialization library plus in-memory property storage. Perhaps it could have used serialization internally and serialization archives for loading and saving. As I said before, the real difference could be the loading and saving of some of the properties in the memory storage, without having to load/save them all.
I think you are correct here. This really sounds like it could've (should've?) been implemented using the serialization library and maybe BGL or a generic tree structure (which I would love to be a Boost library).
That way you get all of the traversal mechanisms available to the tree (or graph) structure, and all of the archiving techniques available to the serialization lib.
Exactly my point. Gennadiy

I think you are correct here. This really sounds like it could've (should've?) been implemented using the serialization library and maybe BGL or a generic tree structure (which I would love to be a Boost library).
That way you get all of the traversal mechanisms available to the tree (or graph) structure, and all of the archiving techniques available to the serialization lib.
Exactly my point.
Gennadiy
You are correct it might (possibly) have been implemented this way. However, I'm quite sure you would need 30 lines of code, not 3 to get a value from a simple config file. And people would prefer to use MSXML or Expat instead anyway. Marcin

"Marcin Kalicinski" <kalita@poczta.onet.pl> wrote in message news:e2bkmh$ldu$1@sea.gmane.org...
I think you are correct here. This really sounds like it could've (should've?) been implemented using the serialization library and maybe BGL or a generic tree structure (which I would love to be a Boost library).
That way you get all of the traversal mechanisms available to the tree (or graph) structure, and all of the archiving techniques available to the serialization lib.
Exactly my point.
Gennadiy
You are correct it might (possibly) have been implemented this way. However, I'm quite sure you would need 30 lines of code, not 3 to get a value from a simple config file. And people would prefer to use MSXML or Expat instead anyway.
Why? Here are my 3 lines:

    StringTree t;
    SimpleXMLArchive( "a.xml" ) >> t;
    get( "a.b.c.d", t );

Gennadiy

Marcin Kalicinski wrote:
I think you are correct here. This really sounds like it could've (should've?) been implemented using the serialization library and maybe BGL or a generic tree structure (which I would love to be a Boost library).
That way you get all of the traversal mechanisms available to the tree (or graph) structure, and all of the archiving techniques available to the serialization lib.
Exactly my point.
Gennadiy
You are correct it might (possibly) have been implemented this way. However, I'm quite sure you would need 30 lines of code, not 3 to get a value from a simple config file. And people would prefer to use MSXML or Expat instead anyway.
Counting lines is a risky business in any language that has functions:

    std::string get_value_from_xml_file(const std::string& file, const std::string& name)
    {
        // 10K lines of assembler
    }

makes accessing a value from an xml file a one-liner.

- Volodya

"Jorge Lodos" <lodos@segurmatica.cu> wrote in message news:007801c66577$3a129dd0$f9010a0a@segurmatica.cu...
I mean that only one property (which of course could contain others) could be loaded (or saved) at a time. Not the whole tree. For instance, if my tree corresponds to a windows registry key, it is desirable to load or save a single value without loading or saving all the values in the key.
I am quite sure Serialization lib will allow you to do this. Just apply it to appropriate field.
It requires a lot of work.
How come?
The serialization library does not provide a model for in-memory property storage,
It's not supposed to. std::vector, std::map, multi_index, tree - take your pick
nor does it provide a windows registry archive (the last time I saw).
Ok - Implement one. No need to reinvent the whole infrastructure.
Applying serialization to appropriate fields implies I already have those fields somehow organized.
Why? Or rather what do you mean by that.
The way I see it, PT is the serialization library plus in-memory property storage.
The way I see it, we already have a better solution for either task.
Perhaps it could have used serialization internally and serialization archives for loading and saving.
Better to stop reinventing the wheel and just write add-ons to the existing solution.
As I said before, the real difference could be the loading and saving of some of the properties in the memory storage, without having to load/save them all.
I still do not see any difficulties implementing this without PT. Gennadiy

It requires a lot of work.
How come?
The serialization library does not provide a model for in-memory property storage,
It's not supposed to.
std::vector, std::map, multi_index, tree - take your pick
nor does it provide a windows registry archive (the last time I saw).
Ok - Implement one. No need to reinvent the whole infrastructure.
Applying serialization to appropriate fields implies I already have those fields somehow organized.
Why? Or rather what do you mean by that.
The way I see it, PT is the serialization library plus in-memory property storage.
The way I see it, we already have a better solution for either task.
Perhaps it could have used serialization internally and serialization archives for loading and saving.
Better to stop reinventing the wheel and just write add-ons to the existing solution.
As I said before, the real difference could be the loading and saving of some of the properties in the memory storage, without having to load/save them all.
I still do not see any difficulties implementing this without PT.
If I understand correctly, your point is that implementing a library that:

1. Reads and saves hierarchical configurations from/to several different storages (including windows registry and XML)
2. Maintains an accessible memory storage for configuration data
3. Is able to load/save individual configuration parameters without having to load/save all of them

is not a lot of work, using serialization and existing containers. Since "lot" is a subjective term, my point is that using a separate library that puts it all together with an easy and extensible (for storages) interface is a lot easier than developing my own solution. The possible use of serialization or container classes in this library is an implementation detail to me. Moreover, since reading/writing configurations is a very common task, since for many people learning how to use serialization just to assess feasibility is not possible, and since there is no single library that does what I enumerated before, IMHO such a library has a place in boost. Whether that library is PT is another question :-) Cheers Jorge
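
To make requirement 3 concrete, here is a minimal sketch of an in-memory store that loads and saves one property at a time through a storage interface. Everything in it (`storage_backend`, `fake_registry`, `config_cache`) is a hypothetical name invented for illustration, not part of the reviewed library or of Boost.Serialization, and values are plain strings to keep it short.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical storage interface; nothing like it is claimed to exist in the
// reviewed library. One call moves exactly one property.
struct storage_backend {
    virtual ~storage_backend() {}
    virtual bool load(const std::string& path, std::string& out) = 0;
    virtual void save(const std::string& path, const std::string& value) = 0;
};

// In-memory stand-in for a registry-like store that counts its accesses.
struct fake_registry : storage_backend {
    std::map<std::string, std::string> values;
    int reads;
    int writes;

    fake_registry() : reads(0), writes(0) {}

    bool load(const std::string& path, std::string& out) override {
        ++reads;
        std::map<std::string, std::string>::const_iterator it = values.find(path);
        if (it == values.end()) return false;
        out = it->second;
        return true;
    }

    void save(const std::string& path, const std::string& value) override {
        ++writes;
        values[path] = value;
    }
};

// In-memory configuration that touches the backend one property at a time.
class config_cache {
public:
    explicit config_cache(storage_backend& backend) : backend_(backend) {}

    // Load a single property on first use; later reads are served from memory.
    bool get(const std::string& path, std::string& out) {
        std::map<std::string, std::string>::const_iterator it = cache_.find(path);
        if (it == cache_.end()) {
            std::string loaded;
            if (!backend_.load(path, loaded)) return false;
            it = cache_.insert(std::make_pair(path, loaded)).first;
        }
        out = it->second;
        return true;
    }

    // Update memory and write back only this property.
    void put(const std::string& path, const std::string& value) {
        cache_[path] = value;
        backend_.save(path, value);
    }

private:
    storage_backend& backend_;
    std::map<std::string, std::string> cache_;
};

// Usage: two reads of one value cost a single backend access.
inline void example_single_property() {
    fake_registry registry;
    registry.values["app.window.width"] = "800";
    config_cache config(registry);
    std::string width;
    assert(config.get("app.window.width", width) && width == "800");
    assert(config.get("app.window.width", width) && registry.reads == 1);
}
```

A new storage (registry, IIS metabase, ...) would only need to implement the two virtual functions; whether that is better built on serialization archives is exactly the open question in this thread.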

"Jorge Lodos" <lodos@segurmatica.cu> wrote in message news:009401c66592$025dcc50$f9010a0a@segurmatica.cu...
If I understand correctly, your point is that implementing a library that: 1. Reads and saves hierarchical configurations from/to several different storages (including windows registry and XML) 2. Maintains an accessible memory storage for configuration data 3. Is able to load/save individual configuration parameters without having to load/save all of them is not a lot of work, using serialization and existing containers.
No. This is not my point. I am not talking about serialization at all. You may have any number of archive types, each requiring some work to implement. They may become a worthy addition to the Serialization library functionality. As for the data structure that is used to represent this data in memory, I don't see much value in it. You need a simple and convenient solution for permanent storage of your application data in some particular formats (config is just another type of application data). Write an archive for that. I am quite sure usage will be much simpler and more convenient than the one presented by this solution. Gennadiy

I mean that only one property (which of course could contain others) could be loaded (or saved) at a time. Not the whole tree. For instance, if my tree corresponds to a windows registry key, it is desirable to load or save a single value without loading or saving all the values in the key.
I think this is rather simple, there are no special docs needed to cover that:

    ptree pt;
    write_registry(..., pt.get_child("whatever.child.you.want"));

Marcin

Marcin Kalicinski wrote:
I mean that only one property (which of course could contain others) could be loaded (or saved) at a time. Not the whole tree. For instance, if my tree corresponds to a windows registry key, it is desirable to load or save a single value without loading or saving all the values in the key.
I think this is rather simple, there are no special docs needed to cover that:
ptree pt; write_registry(..., pt.get_child("whatever.child.you.want"));
That's not an interface *I* want to use for any configuration storage. I want this:

    whatever_class settings;
    settings["whatever.you.name.it"] = QPoint(10, 10);

and then I don't want to invoke 'write_registry' or 'write_ldap' or 'write_ini' depending on the configuration method that some other kind of program has chosen. I want settings to be saved behind my back. Speaking again about program_options and property_tree interaction: if this is ever to work, I want property_tree to expose some "changed" events, like W3C DOM has, that I can catch to sync data back to storage. Without that, smooth configuration saving will be impossible.

- Volodya

ptree pt; write_registry(..., pt.get_child("whatever.child.you.want"));
That's not an interface *I* want to use for any configuration storage. I want this:
whatever_class settings; settings["whatever.you.name.it"] = QPoint(10, 10);
and then I don't want to invoke 'write_registry' or 'write_ldap' or 'write_ini' depending on the configuration method that some other kind of program has chosen. I want settings to be saved behind my back.
In ptree, instead of

    settings["whatever.you.name.it"] = QPoint(10, 10);

you do:

    settings.put("whatever.you.name.it", QPoint(10, 10));

I believe the only difference is in syntax. The first one is marginally simpler, but it does not allow you to distinguish between get and put operations, and also does not scale well towards default value and boost::optional overloads. To automatically save settings behind your back, you need a wrapper over ptree that will, among other things, decide when to save.
Speaking again about program_options and property_tree interaction: if this is ever to work, I want property_tree to expose some "changed" events, like W3C DOM has, that I can catch to sync data back to storage. Without that, smooth configuration saving will be impossible.
I used ptree for saving configuration numerous times. I don't quite understand what you mean by "smooth", but for me it was adequate without notifications. In my case, saving was usually done in a destructor of some high-level class, something along these lines:

    ~Application() { write_xml(m_filename, m_settings); }

If you need notifications - and I'm sure there are cases where you must have them - this is the wrong library. Use program_options instead (when it adds support for saving).

Best regards, Marcin

"Marcin Kalicinski" <kalita@poczta.onet.pl> wrote in message news:e2iggr$9lk$1@sea.gmane.org...
: In ptree, instead of
:
:     settings["whatever.you.name.it"] = QPoint(10, 10);
:
: you do:
:
:     settings.put("whatever.you.name.it", QPoint(10, 10));

Now: as any JSON/etc user would expect, the QPoint structure should be stored as a 2-field node:

    [ 10, 10 ]   or   { "x": 10 , "y": 10 }

I asked this before as it is not documented, but I still do not have an answer: what exactly does it take to achieve this?

: > Speaking again about program_options and property_tree interaction: if
: > this is ever to work, I want property_tree to expose some "changed" events,
: > like W3C DOM has, that I can catch to sync data back to storage. Without that,
: > smooth configuration saving will be impossible.
...
: If you need notifications - and I'm sure there are cases where you must have
: them - this is the wrong library. Use program_options instead (when it adds
: support for saving).

What is really being asked for here is (yet another) different interface for editing a ptree container in memory....

Ivan
--
http://ivan.vecerina.com/contact/?subject=NG_POST <- email contact form

Marcin Kalicinski wrote:
I used ptree for saving configuration numerous times. I don't quite understand what you mean by "smooth", but for me it was adequate without notifications. In my case, saving was usually done in a destructor of some high-level class, something along these lines:
~Application() { write_xml(m_filename, m_settings); }
This is not acceptable for me. Settings should be stored as soon as they are changed, just in case your application crashes or is killed.
If you need notifications - and I'm sure there are cases where you must have them - this is the wrong library.
I don't think this is the wrong library; I would argue this is a problem with the property_tree library. Wanting to be notified when a tree changes is quite a reasonable thing. - Volodya
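
For reference, the kind of wrapper both sides are describing might look roughly like the sketch below: a settings store that calls registered listeners whenever a value really changes, so a listener can write that one value back to storage immediately. `observed_settings` and its members are hypothetical names made up for this illustration; the reviewed ptree offers no such callback, and a flat string map stands in for the tree.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical wrapper that reports changes; not part of the reviewed library.
class observed_settings {
public:
    typedef std::function<void(const std::string& path, const std::string& value)> listener;

    // Register a callback that runs after every effective change.
    void on_change(const listener& callback) { listeners_.push_back(callback); }

    // Store the value and notify listeners only when it actually changed.
    void put(const std::string& path, const std::string& value) {
        std::map<std::string, std::string>::iterator it = values_.find(path);
        if (it != values_.end() && it->second == value) return;
        values_[path] = value;
        for (const listener& notify : listeners_) notify(path, value);
    }

    // Read a value, falling back to a default when the path is unknown.
    std::string get(const std::string& path, const std::string& fallback) const {
        std::map<std::string, std::string>::const_iterator it = values_.find(path);
        return it == values_.end() ? fallback : it->second;
    }

private:
    std::map<std::string, std::string> values_;
    std::vector<listener> listeners_;
};

// Usage: the listener stands in for "write this value to the registry/file now".
inline void example_sync_on_change() {
    observed_settings settings;
    std::vector<std::string> written;
    settings.on_change([&written](const std::string& path, const std::string& value) {
        written.push_back(path + "=" + value);
    });
    settings.put("window.x", "10");
    settings.put("window.x", "10");  // unchanged, so nothing is written again
    assert(written.size() == 1 && written[0] == "window.x=10");
}
```

Whether this belongs inside the tree or in a layer on top of it is the point of disagreement above; the sketch only shows that the layer itself is small.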

~Application() { write_xml(m_filename, m_settings); }
This is not acceptable for me. Settings should be stored as soon as they are changed, just in case your application crashes or is killed.
I am using shared mem for this Gennadiy

What I think PT must have, and the serialization library is not meant to provide, is:
1. The ability to load/save properties independently, not as a whole. 2. A documented (for library extensibility) parser interface allowing parser developers to accomplish (1).
At least 3 storages requiring (1) come to mind: windows registry, ISA Server storage and IIS metabase. I would put these requisites as conditions for acceptance.
I believe you can do #1 quite easily:

    boost::property_tree::ptree pt;
    write_xml(filename, pt.get_child("whatever.child.you.want"));

#2 becomes a non-issue then. I appreciate your interest in the library, and the need for ISA Server storage and IIS metabase format support. However, I think there are at least 27 billion different file formats on the planet, and if everybody wanted to condition acceptance on their favorites, the heat death of the universe would eventually stop all work in progress in that area.

Best regards, Marcin

"Marcin Kalicinski" wrote:
3. Even if you insist on alternative lookup. Why not just use single typedef with multi_index instead of all the implementation.
Could you elaborate more on that? I considered use of multi_index to implement indexing for properties, but it only affected the implementation part of library, not interface, and because I already had a working, exception safe solution, I didn't see the reason to dump it and add another dependency on another library.
Multi-index has disadvantages:
* high compilation time (people do change configuration structures very often)
* doesn't compile on Borland
* if it is possible to provoke a compile-time error inside multi-index by wrong use of ptree, no one will understand the resulting message

IMO the fast lookup should be an optional feature. /Pavel

"Pavel Vozenilek" <pavel_vozenilek@hotmail.com> wrote in message news:e28grh$pfk$1@sea.gmane.org...
"Marcin Kalicinski" wrote:
3. Even if you insist on alternative lookup. Why not just use single typedef with multi_index instead of all the implementation.
Could you elaborate more on that? I considered use of multi_index to implement indexing for properties, but it only affected the implementation part of library, not interface, and because I already had a working, exception safe solution, I didn't see the reason to dump it and add another dependency on another library.
Multi-index has disadvantages: * high compilation time (people do change configuration structures very often)
This point shouldn't affect the design of the component so drastically. Otherwise, let's reimplement every single component in STL, boost etc. in a simpler but less feature-rich way.
* doesn't compile on Borland
Oh, well
* if it is possible to provoke a compile-time error inside multi-index by wrong use of ptree, no one will understand the resulting message
Isn't it an issue with any modern library?
IMO the fast lookup should be an optional feature.
IMO it does not belong there at all. Gennadiy

IMO the fast lookup should be an optional feature.
IMO it does not belong there at all.
I cannot agree. Can you imagine an XML or JSON library that would do O(n) lookup on keys? I think it would be quite a curiosity, and quite useless if any larger chunk of data was to be manipulated. In one of my previous projects I used ptree to manipulate data about 2 MB in size. Without lookup that would probably be an order of magnitude slower. On the other hand, having an option to disable lookup could save some memory/time (if the user knew they would only iterate over the structure). Kind regards, Marcin
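
The trade-off can be sketched as follows: children are kept in insertion order in a vector, and a key index is maintained only when the caller asks for it. This is a hypothetical `node` class written for this illustration and says nothing about how the reviewed ptree actually implements its lookup.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node, only to contrast linear search with an optional index.
class node {
public:
    explicit node(bool indexed) : indexed_(indexed) {}

    // Append a child; duplicate keys are allowed and order is preserved.
    void add(const std::string& key, const std::string& value) {
        // std::map::insert keeps the existing entry, so the index always
        // points at the first child with a given key.
        if (indexed_) index_.insert(std::make_pair(key, children_.size()));
        children_.push_back(std::make_pair(key, value));
    }

    // Return the first value stored under key, or null if there is none.
    // The pointer is invalidated by the next call to add().
    const std::string* find(const std::string& key) const {
        if (indexed_) {  // O(log n) through the index
            std::map<std::string, std::size_t>::const_iterator it = index_.find(key);
            return it == index_.end() ? nullptr : &children_[it->second].second;
        }
        for (const auto& child : children_)  // O(n) scan, no extra memory
            if (child.first == key) return &child.second;
        return nullptr;
    }

    std::size_t size() const { return children_.size(); }

private:
    bool indexed_;
    std::vector<std::pair<std::string, std::string> > children_;
    std::map<std::string, std::size_t> index_;
};

// Usage: both variants give the same answers; only cost and memory differ.
inline void example_lookup() {
    node fast(true), slow(false);
    fast.add("a", "1"); fast.add("a", "2");
    slow.add("a", "1"); slow.add("a", "2");
    assert(*fast.find("a") == "1" && *slow.find("a") == "1");
}
```

With the index switched off the node costs one vector and nothing more, which matches the "iterate only" case mentioned above.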

First of all I want to state that I really like the functionality and interface that ptree offers. As said before, the ease of use of this library and the clean code that it produces are impressive. If we have to make a change that compromises the interface I think we are going to kill it. However, I have a proposal that I think will:

a) preserve the current interface
b) add a lot of functionality with the same ease of use
c) give better performance

I think that we are missing the big picture when discussing . or / in the get/put functions. The fact is that the library implements the path search directly, and this is a very nice spot to insert one more level of abstraction. A path can be viewed as a singly linked list of keys; "xxx.yyy.zzz" is only one way of representing it (a very convenient way, I have to say). Think what we could gain if we define a path concept here (I have read all the code of the library and it is a straightforward change to it). The get/put functions will receive a path as a parameter, instead of a '.'-concatenated key. First we support conversion from '.'-concatenated keys to path... Now we can use the ptree in the same way as it is now. Next we support operator / for building the path. The thing is that with a path concept in the middle we can now use the following alternative idioms:

    // Standard conversion
    // Very clean, and the path can be formatted as a whole or can be read from a file.
    {
        ptree data;
        data.put( "debug.info.count", 3 );
    }

    // Other separator
    {
        ptree data;
        ptree::path p(' ');
        data.put( p / "debug info count", 3 );
        // space used as a separator; note that this is cleaner than
        // passing ' ' as a parameter, and half of the get/put functions in the
        // ptree implementation disappear
    }

    // Can have a path with '.', '/', ' ' or any other character in the keys
    {
        ptree data;
        ptree::path p;
        data.put( p / "debug" / "info" / "count", 3 );
        data.put( p / "d.e.b.u.g" / "i n f o" / "s.t.u.f.f", 10 );
    }

    // Can use variables without the need for special concatenation care
    // Observe we can mix concatenation with operator/
    {
        struct Human { string name; int age; };
        typedef vector<Human> Humans;
        Humans humans;
        ...
        ptree data;
        ptree::path p( '/' );
        for( Humans::iterator h = humans.begin(), end = humans.end(); h != end; h++ )
        {
            data.put( p / h->name / "info/age" , h->age );
        }
    }

    // And finally other things can easily be supported
    // As an example, suppose we need to group e-mails by providers
    {
        struct email_info { string direction; string owner; };
        typedef list<email_info> eList;
        eList email;
        ...
        ptree data;
        ptree::reverse_path p('@');
        for( eList::iterator e = email.begin(), end = email.end(); e != end; e++ )
        {
            data.put( p / e->direction , e->owner );
        }
        // And now we can ask the tree things like
        data.count( p / "gmail.com" );
        // Or save this info grouped in an xml!
    }

    // Now we can consistently support other key types
    {
        intptree ipData;
        intptree::path p;
        ipData.put( p / 140 / 123 / 25 / 10 , "home" );
    }

I think that this flexibility is great and I can't see any major drawback of adding it to ptree. Best of all, programs that used the old version will compile without any change. The lookup performance is better because you don't have to make many copies of the '.'-concatenated key tails. And if the operator/ syntax is used the performance is even better.
Some implementation details could change. In my head the put/get functions pop the head of the path list while they are using it, and as a postcondition always return an empty path. I know it is not very usual, and another more standard way of doing this could be worked out. OK, that is it... I hope others like this flexibility too.

* What is your evaluation of the design?
I really like it, especially the ease of use. I would add the path abstraction to boost ptree flexibility :) Only one thing: I don't like to pass a bool as a function parameter because I always forget what it was intended for. You use this approach to specify whether the path must be overwritten. I thought about it and did not arrive at any other good syntax, but it is there.

* What is your evaluation of the potential usefulness of the library?
Very, very high. I think this library will be widely used.

* Did you try to use the library? With what compiler? Did you have any problems?
VS2005, works very nicely...

* How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
Read the documentation, studied all the .hpp files and thought about it quite a lot.

* Do you think the library should be accepted as a Boost library?
Yes, and I will encourage others to use it too.

- was the library's performance good enough? If not, can you suggest improvements?
incorporate the path concept

- was the library's design flexible enough? If not, how would you suggest it should be redesigned to broaden its scope of use?
again, the path concept

regards
Matias Capeletto
Argentina

PS: Congratulations to the library creator! It is very light-weight and powerful...
PS: This is my first library review... sorry if there is another approach to presenting it.
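
A minimal sketch of the path concept proposed above, under loud assumptions: `path` and `tree` are hypothetical classes written for this illustration, keys are unique per level, values are strings, and the separator is passed together with the text (`path("a b c", ' ')`) rather than as `ptree::path p(' ')`. It shows the two ingredients of the proposal: implicit conversion from a '.'-separated string, and `operator/` that appends a key verbatim.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical path type: a sequence of keys, independent of any separator.
class path {
public:
    path() {}

    // Split text on a separator; the default '.' reproduces the current syntax.
    path(const std::string& text, char separator = '.') {
        std::size_t start = 0;
        while (true) {
            std::size_t end = text.find(separator, start);
            keys_.push_back(text.substr(start, end - start));
            if (end == std::string::npos) break;
            start = end + 1;
        }
    }

    // Lets string literals convert implicitly, so old call sites keep working.
    path(const char* text) : path(std::string(text)) {}

    // Append one key verbatim, so keys may contain '.', '/' or spaces.
    path operator/(const std::string& key) const {
        path result(*this);
        result.keys_.push_back(key);
        return result;
    }

    const std::vector<std::string>& keys() const { return keys_; }

private:
    std::vector<std::string> keys_;
};

// Hypothetical tree with unique keys per level and string values.
class tree {
public:
    // Create intermediate nodes as needed and store the value at the end.
    void put(const path& where, const std::string& value) {
        tree* node = this;
        for (const std::string& key : where.keys()) {
            std::unique_ptr<tree>& child = node->children_[key];
            if (!child) child.reset(new tree);
            node = child.get();
        }
        node->value_ = value;
    }

    // Return the value at the path, or null when the path does not exist.
    const std::string* get(const path& where) const {
        const tree* node = this;
        for (const std::string& key : where.keys()) {
            auto it = node->children_.find(key);
            if (it == node->children_.end()) return nullptr;
            node = it->second.get();
        }
        return &node->value_;
    }

private:
    std::string value_;
    std::map<std::string, std::unique_ptr<tree> > children_;
};

// Usage: the dotted string and the operator/ spelling name the same node.
inline void example_path() {
    tree data;
    path p;
    data.put("debug.info.count", "3");
    assert(*data.get(p / "debug" / "info" / "count") == "3");
}
```

Because the tree only ever sees a list of keys, the string parsing lives in one place, which is where the claimed reduction in overloads would come from.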

Hi Matias, Thank you for a great review. You may have noticed I have already replied to a similar proposal from Jeff Flinn, somewhere in the review thread. Now, as you presented the implementation and interface in so much detail, I must definitely reconsider. Previously I only thought of a path class as a means of encapsulating the default separator. Your proposition goes much further. The idea of overloading operator / looks very nice to me. It will remove the need for a default separator altogether (read: halve the number of overloads). It will also make the lookup faster. I'm thinking how to merge it with another important suggestion, which is to allow paths of arbitrary type, not only strings (for example arrays of ints, where each int is an index on its level; in some cases this might be useful). Having the path class, all path parsing should be done by path objects. Probably the best approach would be to templatize basic_ptree on path_type and require certain functionality from it. This is already done, except that instead of a path it is string functionality that is required.
If we have to make a change that compromises the interface I think we are going to kill it.
I think you are right.
Only one thing: I don't like to pass a bool as a function parameter because I always forget what it was intended for. You use this approach to specify whether the path must be overwritten. I thought about it and did not arrive at any other good syntax, but it is there.
There is a different approach, vastly superior to the extra bool parameter. Just add "set" functions in addition to "get" and "put". "set" will always replace an existing property if it is there, while "put" will always add a new one. The only reason why I didn't implement it is that the get/put interface was already bloated, and adding "set" looked like a step in the wrong direction. Now that I think we can get rid of those ugly separator overloads (sigh), this may be an option again. Additionally the library should have a get/put-style member function to delete an existing property. It obviously has erase, but it does not work with paths. We would end up with get/set/put/del(?), and the picture would be, hopefully, complete. Again thank you for a great review and best regards, Marcin
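
The semantics described here can be sketched on a single tree level that allows duplicate keys. The class `properties` and the exact behaviour of `del` (remove every match, return the count) are assumptions made for this illustration; only the put/set distinction comes from the text above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical single level of a tree; duplicate keys are permitted.
class properties {
public:
    typedef std::pair<std::string, std::string> entry;

    // put: always append a new entry, even if the key already exists.
    void put(const std::string& key, const std::string& value) {
        entries_.push_back(entry(key, value));
    }

    // set: replace the first entry with this key, or append if there is none.
    void set(const std::string& key, const std::string& value) {
        for (entry& e : entries_) {
            if (e.first == key) { e.second = value; return; }
        }
        put(key, value);
    }

    // del: remove every entry with this key and report how many were removed.
    std::size_t del(const std::string& key) {
        const std::size_t before = entries_.size();
        entries_.erase(std::remove_if(entries_.begin(), entries_.end(),
                                      [&key](const entry& e) { return e.first == key; }),
                       entries_.end());
        return before - entries_.size();
    }

    // get: first value for the key, or null when the key is absent.
    const std::string* get(const std::string& key) const {
        for (const entry& e : entries_)
            if (e.first == key) return &e.second;
        return nullptr;
    }

    // Number of entries stored under the key.
    std::size_t count(const std::string& key) const {
        std::size_t n = 0;
        for (const entry& e : entries_)
            if (e.first == key) ++n;
        return n;
    }

private:
    std::vector<entry> entries_;
};

// Usage: put grows the list, set keeps its size.
inline void example_put_set_del() {
    properties p;
    p.put("testfail", "somefile.hpp");
    p.put("testfail", "otherfile.hpp");
    p.set("testfail", "file.hpp");
    assert(p.count("testfail") == 2 && *p.get("testfail") == "file.hpp");
    assert(p.del("testfail") == 2);
}
```

With the intent in the function name, the bool "overwrite" parameter disappears from every call site.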

Additionally the library should have a get/put-style member function to delete an existing property. It obviously has erase, but it does not work with paths. We would end up with get/set/put/del(?), and the picture would be, hopefully, complete.
That is it! I really like the new set/del; it seems fairly easy to learn, solves the bool argument problem, and adds a lot of functionality too. I have only one more feature to propose before the picture is complete. If you look in this thread, someone asked for something like "debug.info.testfail[2]" to access repeated tags. If we get the path into ptree we can overload another operator to support this type of path (we cannot overload operator[]... because of precedence issues :( does someone know a way we can still use it? The syntax would be very nice. As I see it, the chosen operator must have the same precedence as operator/, see the examples, so we only have * and % ). Something like this may work...

    ptree data;
    ptree::path p;
    data.put( p / "debug" / "info" / "testfail" , "somefile.hpp" );
    data.put( p / "debug" / "info" / "testfail" , "otherfile.hpp" );
    data.put( p / "debug" / "info" / "testfail" , "wer.hpp" );
    data.set( p / "debug" / "info" / "testfail" %2 , "file.hpp" ); // we use set!

And we can use it in the middle too, if we have...

-------------------------------------
<logFile>
  <log> <time>10:13</time> <what>cant find path</what> </log>
  <log> <time>10:16</time> <what>other error</what> </log>
  <log> <time>10:29</time> <what>segmentation fault</what> </log>
</logFile>
-------------------------------------
    ptree data;
    ptree::path p;
    read_xml("logFile.txt",data);
    string swhat = data.get( p / "log" %1 / "what" );
-------------------------------------

What do you think?
I'm thinking how to merge it with another important suggestion, which is to allow paths of arbitrary type, not only strings (for example arrays of ints, where each int is an index on its level, in some cases this might be useful). Having the path class, whole path parsing should be done by path objects. Probably the best approach would be to templatize basic_ptree on path_type and require certain functionality from it.
I would prefer the path to be orthogonal to the tree. I think we may need to use different paths in the same ptree, like a simple path at some point and a reverse_path in another place, or maybe even a relative_path. Something like:

    ptree data;
    for( Humans::iterator h = humans.begin(), end = humans.end(); h != end; h++ )
    {
        ptree::relative_path p(data, h->name + ".info" );
        data.put( p / "age" , h->age );
        data.put( p / "nick" , h->nick );
        data.put( p / "dir" , h->dir );
    }

But maybe that is too flexible... and you can require that only one kind of path is used.

Matias Capeletto wrote:
If we get the path into ptree we can overload another operator to support this type of path (we cannot overload operator[]... because of precedence issues :( does someone know a way we can still use it? The syntax would be very nice. As I see it, the chosen operator must have the same precedence as operator/, see the examples, so we only have * and % ).
Something like this may work...
ptree data; ptree::path p;
data.put( p / "debug" / "info" / "testfail" , "somefile.hpp" ); data.put( p / "debug" / "info" / "testfail" , "otherfile.hpp" ); data.put( p / "debug" / "info" / "testfail" , "wer.hpp" );
data.set( p / "debug" / "info" / "testfail" %2 , "file.hpp" ); // we use set!
some simple function would allow you to write

    data.set( p / "debug" / "info" / "testfail" / at(2) , "file.hpp" );

or we might say

    data.set( (p / "debug" / "info" / "testfail")[2] , "file.hpp" );

or perhaps

    data.set( p / "debug" / "info" / "testfail[2]" , "file.hpp" );

-Thorsten

"Thorsten Ottosen" <thorsten.ottosen@dezide.com> wrote in message news:4449FFDF.4000107@dezide.com...
: some simple function would allow you to write
:
: data.set( p / "debug" / "info" / "testfail" / at(2) , "file.hpp" );

Somehow I find it inelegant to be composing a path object (vector of strings?) instead of directly indexing the structure. It might be "cool" if we could write:

    pt / "debug" / "info" / "testfail" % 2 = "file.hpp" ;

But there is a problem: the desired behavior of "path access" is not the same whether one is reading or setting a value -- as demonstrated by the std::map::operator[] snafu: if the path does not currently exist, we will want a 'put' operation to succeed, while a 'get' operation should either fail (throw an exception) or return a dummy "null" tree. I don't have time to seriously look into this, but I'm sure that a solution can be worked out. In my experience, however, I have rarely (ever?) needed to directly access a 'deep' field of a tree with a path. IMO it would be very reasonable to initially accept ptree without a path-access mechanism -- which can easily be added later -- if we find that things are already complicated enough this way.

Just for laughs:

    put( pt ) / "debug" / "info" / "testfail" % 2 << "file.hpp" ;
    get( pt ) / "debug" / "info" / "testfail" % 2 >> var;
    tryget( pt ) / "debug" / "info" / "testfail" % 2 >> var || (var=5);

- no explicit template params are ever specified
- default value is optionally specified after a ||, but the default expression is not evaluated if not used.

But don't you ever quote me on this ! ;-)

Ivan
--
http://ivan.vecerina.com/contact/?subject=NG_POST <- email contact form
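
The std::map::operator[] behaviour referred to above is easy to demonstrate with the standard library alone. The helper names below (`read_with_subscript`, `read_or_throw`, `read_or`) are made up for this illustration; the point is only that a read through operator[] silently inserts, while separate get-style functions can fail or fall back without touching the container.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

typedef std::map<std::string, int> settings_map;

// Reading through operator[] value-initializes and inserts a missing key.
inline int read_with_subscript(settings_map& m, const std::string& key) {
    return m[key];
}

// A dedicated getter leaves the map untouched and reports the miss instead.
inline int read_or_throw(const settings_map& m, const std::string& key) {
    settings_map::const_iterator it = m.find(key);
    if (it == m.end()) throw std::out_of_range("no such key: " + key);
    return it->second;
}

// A getter with a default value, the other common way to handle a miss.
inline int read_or(const settings_map& m, const std::string& key, int fallback) {
    settings_map::const_iterator it = m.find(key);
    return it == m.end() ? fallback : it->second;
}

// Usage: the subscript read grows the map, the getters do not.
inline void example_map_snafu() {
    settings_map m;
    assert(read_with_subscript(m, "count") == 0);
    assert(m.size() == 1);  // "count" now exists although nobody stored it

    settings_map n;
    assert(read_or(n, "count", 5) == 5);
    assert(n.empty());
}
```

This is the reason a single assignable path expression is awkward: the expression alone cannot tell whether a missing node should be created or reported.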

Thorsten Ottosen wrote:
Matias Capeletto wrote:
If we get the path into ptree we can overload another operator to support this type of path (we cannot overload operator[]... because of precedence issues :( does someone know a way we can still use it? The syntax would be very nice. As I see it, the chosen operator must have the same precedence as operator/, see the examples, so we only have * and % ).
Something like this may work...
ptree data; ptree::path p;
data.put( p / "debug" / "info" / "testfail" , "somefile.hpp" ); data.put( p / "debug" / "info" / "testfail" , "otherfile.hpp" ); data.put( p / "debug" / "info" / "testfail" , "wer.hpp" );
data.set( p / "debug" / "info" / "testfail" %2 , "file.hpp" ); // we use set!
some simple function would allow you to write
data.set( p / "debug" / "info" / "testfail" / at(2) , "file.hpp" );
or we might say
data.set( (p / "debug" / "info" / "testfail")[2] , "file.hpp" );
or perhaps
data.set( p / "debug" / "info" / "testfail[2]" , "file.hpp" );
Exactly! There are many solutions that provide the generality/flexibility while providing ease of use. The library could also provide a function(s) that take a delimited string and return a valid path to encompass the original usage presented by Marcin. Jeff Flinn

On 4/24/06, Jeff Flinn <TriumphSprint2000@hotmail.com> wrote:
Thorsten Ottosen wrote:
Matias Capeletto wrote:
If we get the path into ptree we can overload another operator to support this type of path (we cannot overload operator[] because of precedence issues :( does someone know a way we can still use it? The syntax would be very nice. As I see it, the chosen operator must have the same precedence as operator/, see the examples, so we only have * and % ).
Something like this may work...
ptree data; ptree::path p;
data.put( p / "debug" / "info" / "testfail" , "somefile.hpp" ); data.put( p / "debug" / "info" / "testfail" , "otherfile.hpp" ); data.put( p / "debug" / "info" / "testfail" , "wer.hpp" );
data.set( p / "debug" / "info" / "testfail" %2 , "file.hpp" ); // we use set!
some simple function would allow you to write
data.set( p / "debug" / "info" / "testfail" / at(2) , "file.hpp" );
Yes, it is possible. I still like the % more; I want the difference between indexing (%) and 'getNextChild' (/) to be readable directly from the code.
or we might say
data.set( (p / "debug" / "info" / "testfail")[2] , "file.hpp" );
This has the drawback that we cannot index internal nodes and continue with a path... That is the reason, IMO, the two operators must have the same precedence level. And it is just beautiful to see that there are two operators (% and *) that fulfill this requirement for operator/. I think that something like data.get( p / "log"%4 / "info" / "time" ) is very self-descriptive. If there are fewer than 5 children with the tag "log", this will generate the result you chose when picking the desired get function (throw, optional, default).
or perhaps
data.set( p / "debug" / "info" / "testfail[2]" , "file.hpp" );
This is ok... but if we give the option of using "log.info.time" or p / "log" / "info" / "time" then we must consistently give an option for indexing too, and in case we choose operator% or * I think it will be clearer if we use the following notation: "log%3.info.time" (now that I see it on the screen, "log[3].info.time" is somewhat nicer to look at, what do you think?)
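To make the precedence argument concrete, here is a minimal sketch of a path type where '/' appends a key and '%' indexes the most recent key. Because '/' and '%' share one precedence level and associate left to right, an index can sit in the middle of a path. `IndexedPath` and `Step` are hypothetical names used only for this illustration; the reviewed library does not contain such a class.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One step of a path: a key plus which same-named child to pick.
struct Step {
    std::string key;
    std::size_t index = 0;  // 0 selects the first child with this key
};

// Hypothetical path type, for illustration only.
class IndexedPath {
public:
    // '/' appends a new key step.
    IndexedPath operator/(const std::string& key) const {
        IndexedPath result(*this);
        result.steps_.push_back(Step{key, 0});
        return result;
    }

    // '%' sets the index on the most recently appended step.
    // Applied to an empty path it does nothing.
    IndexedPath operator%(std::size_t index) const {
        IndexedPath result(*this);
        if (!result.steps_.empty()) result.steps_.back().index = index;
        return result;
    }

    const std::vector<Step>& steps() const { return steps_; }

private:
    std::vector<Step> steps_;
};
```

With this, `p / "log" % 4 / "info" / "time"` parses as `(((p / "log") % 4) / "info") / "time"`, so the index attaches to "log" and the path continues afterwards.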

Thorsten Ottosen <thorsten.ottosen@dezide.com> writes:
Matias Capeletto wrote:
If we get the path into ptree we can overload another operator to support this type of path (we cannot overload operator[] because of precedence issues :( does someone know a way we can still use it? The syntax would be very nice. As I see it, the chosen operator must have the same precedence as operator/, see the examples, so we only have * and % ).
Something like this may work...
ptree data; ptree::path p;
data.put( p / "debug" / "info" / "testfail" , "somefile.hpp" );
If we're going down that road, Shouldn't data.put( p/"debug/info/testfail/somefile.hpp" ); be made to work, too? -- Dave Abrahams Boost Consulting www.boost-consulting.com

David Abrahams wrote:
Thorsten Ottosen <thorsten.ottosen@dezide.com> writes:
Matias Capeletto wrote:
Something like this may work...
ptree data; ptree::path p;
data.put( p / "debug" / "info" / "testfail" , "somefile.hpp" );
If we're going down that road, Shouldn't
data.put( p/"debug/info/testfail/somefile.hpp" );
be made to work, too?
I'm really not seeing the introduction of a path class as an improvement. What was wrong with the two traversal mechanisms already in the library? Combine it with a more intelligent path grammar that autodetects alternative separators as in
"debug.info.testfile" // using .
"/debug/info/test.file" // uses /
and all is well.
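One possible reading of "autodetects alternative separators" is sketched below: a path that begins with '/' is split on '/', and anything else is split on '.'. That detection rule is an assumption made for this example (the post does not spell one out), and `split_auto` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Split a path string into keys. Assumed rule: a leading '/' selects
// '/' as the separator; otherwise '.' is used.
std::vector<std::string> split_auto(const std::string& path) {
    std::vector<std::string> keys;
    if (path.empty()) return keys;

    char sep = '.';
    std::size_t start = 0;
    if (path[0] == '/') {
        sep = '/';
        start = 1;  // skip the leading separator
    }

    while (start <= path.size()) {
        std::size_t pos = path.find(sep, start);
        if (pos == std::string::npos) pos = path.size();
        keys.push_back(path.substr(start, pos - start));
        start = pos + 1;
    }
    return keys;
}
```

Under this rule the key "test.file" survives intact in "/debug/info/test.file", which is the case the second example in the post relies on.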

Matias Capeletto wrote:
Something like this may work...
ptree data; ptree::path p;
data.put( p / "debug" / "info" / "testfail" , "somefile.hpp" );
If we're going down that road, Shouldn't
data.put( p/"debug/info/testfail/somefile.hpp" );
be made to work, too?
I'm really not seeing the introduction of a path class as an improvement. What was wrong with the two traversal mechanisms already in the library? Combine it with a more intelligent path grammar that autodetects alternative separators as in
"debug.info.testfile" // using . "/debug/info/test.file" // uses /
and all is well.
Nothing is wrong with the library now, and adding the path class will still allow you to write the code you posted, with better performance (you may want to see how it is implemented now in ptree_implementation.hpp). This is one of the reasons Marcin said that he is going to consider it. This abstraction opens the door to a more general approach to specifying paths. We already discussed that we can use it to give indexing support and to permit embedding variables directly in the path without the need for something as ugly as: "debug." + infoName + ".testfile" IMO this code is more error prone and inefficient than p / "debug" / infoName / "testfile"

Something like this may work...
ptree data; ptree::path p;
data.put( p / "debug" / "info" / "testfail" , "somefile.hpp" );
If we're going down that road, Shouldn't
data.put( p/"debug/info/testfail/somefile.hpp" );
be made to work, too?
The instruction data.put( p/"debug/info/testfail/somefile.hpp" ); has no data; it is only a path, and you are using put. I think you meant:
ptree::path p('/'); // because by default Marcin will continue to use the '.', I think
data.put( p / "debug/info/testfail" , "somefile.hpp" );
And yes, you are right, this has to work, and this one too:
string testName = "testfail";
data.put( p / "debug/info" / testName / "time/hour" , 23 );
And if you do not want the mini parser inside path to look for the '/' because you are a performance freak, you may write:
ptree::ns_path p; // ns: no separator
data.put( p / "debug" / "info" / testName / "time" / "hour" , 23 );
and you force yourself to write this way. This has the advantage that you can stop worrying about a spurious separator in your keys breaking them without you noticing. Thinking about it, it seems that it may be a little error prone to give '.' as a default, and instead it could be:
ptree::path p; // by default, the keys do not get parsed; it behaves as the ns_path
ptree::path p('.'); // now the strings passed to the / operator are parsed looking for a '.'
ptree::path p('@'); // to parse emails :)
I feel it will save us from some surprises...
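The proposal above (no parsing by default, parsing only when a separator is given explicitly) could look roughly like this. `SplitPath` is a hypothetical stand-in for the `ptree::path` being discussed; nothing here is taken from the library's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical path type: keys are taken verbatim unless a separator
// character was supplied at construction.
class SplitPath {
public:
    SplitPath() = default;  // no separator: strings are never parsed
    explicit SplitPath(char sep) : sep_(sep), parse_(true) {}

    SplitPath operator/(const std::string& piece) const {
        SplitPath result(*this);
        if (!parse_) {
            result.keys_.push_back(piece);  // whole string is one key
            return result;
        }
        std::size_t start = 0;
        while (true) {
            std::size_t pos = piece.find(sep_, start);
            if (pos == std::string::npos) {
                result.keys_.push_back(piece.substr(start));
                break;
            }
            result.keys_.push_back(piece.substr(start, pos - start));
            start = pos + 1;
        }
        return result;
    }

    const std::vector<std::string>& keys() const { return keys_; }

private:
    char sep_ = '\0';
    bool parse_ = false;
    std::vector<std::string> keys_;
};
```

With `SplitPath p('/')`, the expression `p / "debug/info" / testName / "time/hour"` yields five keys, while a default-constructed `SplitPath` keeps "debug/info" as a single key, so a stray separator inside a key cannot silently split it.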

I cannot agree. Can you imagine an XML or JSON library that would do O(n) lookup on keys?
Shouldn't a tree already be approx log N in most cases.
This is the case if your config is well balanced among nodes. If you have a long linear sequence of subkeys that becomes O(N).
Optimizing it into log log N by adding a multimap to each node seems like an overkill to me. (at least as default)
I don't think it costs that much. It's just an implementation detail hidden deep inside, several more bytes per node, that's all to it. It is not exposed in the interface. In return we get nice time constraints, which, I believe, adhere to the principle of least surprise. Usually if you have a lookup by name in some library you don't expect it does linear search through the whole lot. Especially if that lot can be arbitrarily large. Best regards, Marcin
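As a rough picture of the trade-off being argued here, the sketch below keeps children in insertion order and adds a std::multimap from key to position as a hidden index, so lookup by name is logarithmic in the number of children instead of linear. `Children` and its members are hypothetical names; how the reviewed library actually stores its index is not shown in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical child container: ordered storage plus a key index.
class Children {
public:
    void push_back(const std::string& key, const std::string& value) {
        index_.emplace(key, items_.size());  // remember position by key
        items_.emplace_back(key, value);     // keep document order
    }

    // O(log n) lookup. Since C++11 a multimap keeps equivalent keys in
    // insertion order, so this returns the first child added under key.
    const std::string* find(const std::string& key) const {
        auto it = index_.lower_bound(key);
        if (it == index_.end() || it->first != key) return nullptr;
        return &items_[it->second].second;
    }

    std::size_t count(const std::string& key) const {
        return index_.count(key);
    }

    const std::vector<std::pair<std::string, std::string>>& in_order() const {
        return items_;
    }

private:
    std::vector<std::pair<std::string, std::string>> items_;
    std::multimap<std::string, std::size_t> index_;
};
```

The cost is one extra map entry per child, which is the "several more bytes per node" side of the argument; erasing children would also need index maintenance that this sketch omits.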

"Marcin Kalicinski" <kalita@poczta.onet.pl> wrote in message news:e290f2$fqe$1@sea.gmane.org...
IMO the fast lookup should be optional feature.
IMO it does not belong there at all.
I cannot agree. Can you imagine an XML or JSON library that would do O(n) lookup on keys? I think it would be quite a curiosity. And be quite useless if any larger data chunk was to be manipulated.
Make up your mind finally. What is your library? 1. Runtime parameter facility 2. Permanent storage facility 3. XML parser Each task has different requirements and priorities. If it's (1) and (2), it doesn't need any fast lookup. If it's (3), I wouldn't recommend DOM-based parsers in this case anyway; your log N lookup may not be fast enough anyway. What if I want some hash instead? I could achieve the same with multi_index serialized/deserialized from different archives. And I do not see any advantages in your solution. Gennadiy

Make up your mind finally. What is your library?
1. Runtime parameter facility 2. Permanent storage facility 3. XML parser
I'm afraid it is all 3, plus some more. By "some more" I mean it has more parsers than just XML, and it can also be used to manipulate hierarchical, human readable data structures at runtime.
Each task has different requirements and priorities. If it's (1) and (2), it doesn't need any fast lookup. If it's (3), I wouldn't recommend DOM-based parsers in this case anyway; your log N lookup may not be fast enough anyway. What if I want some hash instead?
If log(n) is not enough you can use SQL Server. But I doubt you will want to, if only thing you need to know is what is the startup position of main application window. If fast lookup is a problem for many people, it can be made optional, or customizable. But (I have already presented my view on that matter somewhere in that thread but cannot find it now), in my opinion having log(n) lookup is a reasonable default, which adheres to the least surprise principle. Best regards, Marcin

"Marcin Kalicinski" <kalita@poczta.onet.pl> wrote in message news:e2bn7h$s8v$1@sea.gmane.org...
Make up your mind finally. What is your library?
1. Runtime parameter facility 2. Permanent storage facility 3. XML parser
I'm afraid it is all 3, plus some more. By "some more" I mean it has more parsers than just XML, and it can also be used to manipulate hierarchical, human readable data structures at runtime.
And that's the main problem with this submission. Instead of a clearly specified problem domain and a design that addresses the issues in this domain, you present some mixture of half-good components, each with unclear advantages over the existing dedicated solution in each respective area. I want faster search or no search at all - no can do. I want some different versioning support for permanent storage - no can do. I want some automatic validation and conflict resolution - no can do. I understand it's good enough for you. But these are just the choices you made. Coupling solutions for independent problems under the hood of one library is a source of inflexibility and unacceptable for boost IMO. We do need good tree storage. We do need a better runtime parameters support library (IMO). We already have a quite powerful solution for permanent storage support, but we could use more archive formats for it. What you propose is none of these. Gennadiy

[...] that's the main problem with this submission. Instead of clearly specified problem domain and design that address issue in this domain, you present some mixture of half-good components each with unclear advantages over existing dedicated solution in each respective area. [...] We do need good tree storage. We do need better runtime parameters support library (IMO). We already have quite powerful solution for permanent storage support, but we could use more archive formats for it. What you propose is none of it.
I'm sure you will agree that ptree is different from serialization. If you don't agree, read some of the quite interesting posts by Ivan Vecerina. The main difference is that the serialization lib is _only_ about translation of some C++ structures to/from a string of bytes. This is a huge and complicated task that it does very well. But you must have noticed that some people refrained from using it, because the files it produces are not "config files" in the wide meaning of this word. The main issue with them is that they are not hand editable. This cannot and should not be remedied by supplying another parser; just read Robert's post outside this thread. On the other hand, program options cannot handle hierarchical structures very well unless it is improved, and even then it will probably present a more complicated interface than ptree. Best regards, Marcin

Marcin Kalicinski wrote:
I'm sure you will agree that ptree is different from serialization. If you don't agree, read some of quite interesting posts by Ivan Vecerina. The main difference is that serialization lib is _only_ about translation of some C++ structures to/from a string of bytes. This is a huge and complicated task that it does very well. But you must have noticed that some people refrained
I agree. We do use serialization for our data files (via binary archive and export via xml archives so some people here who use python can browse through them and load up the interesting bits that they want, without us having to describe the entire binary format to them). But for config files, serialization is overkill. We don't want object tracking and reference counting in config files. We just want name/value pairs written out, and the extra bonus of a tree structure here is very useful. Another difference that we use when writing config files is that if properties have 'default' values then we don't write those to the file. This makes the file very small in most cases, as only values that really do differ between systems are written, and it makes it very easy to see if other values have been specifically overridden for some systems. This is different from object serialization, which reads/writes every item every time. That isn't to say that an archive couldn't be created that knew about default values and didn't write them out, but I believe that is not what serialization is intended for and I would not use it for that. Cheers Russell
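The "only write what differs from the defaults" habit described above can be illustrated with flat name/value pairs. This is a simplified sketch under assumptions: `Settings`, `diff_from_defaults` and `apply_overrides` are hypothetical names, and a real config would be a tree rather than a flat map.

```cpp
#include <cassert>
#include <map>
#include <string>

using Settings = std::map<std::string, std::string>;

// Keep only the entries that are absent from the defaults or differ
// from them; this is what would be written to the config file.
Settings diff_from_defaults(const Settings& current, const Settings& defaults) {
    Settings out;
    for (const auto& kv : current) {
        auto it = defaults.find(kv.first);
        if (it == defaults.end() || it->second != kv.second) out.insert(kv);
    }
    return out;
}

// Loading is the reverse: start from the defaults, then apply the
// (usually few) overrides read from the file.
Settings apply_overrides(const Settings& defaults, const Settings& overrides) {
    Settings out = defaults;
    for (const auto& kv : overrides) out[kv.first] = kv.second;
    return out;
}
```

A file produced this way lists exactly the values someone changed on that system, which is the readability benefit the post points out.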

Marcin Kalicinski wrote: Marcin, consider leaving the attribution intact, as I believe Dave has already requested.
[...] that's the main problem with this submission. Instead of clearly specified problem domain and design that address issue in this domain, you present some mixture of half-good components each with unclear advantages over existing dedicated solution in each respective area. [...] We do need good tree storage. We do need better runtime parameters support library (IMO). We already have quite powerful solution for permanent storage support, but we could use more archive formats for it. What you propose is none of it.
I'm sure you will agree that ptree is different from serialization. If you don't agree, read some of quite interesting posts by Ivan Vecerina. The main difference is that serialization lib is _only_ about translation of some C++ structures to/from a string of bytes.
This is not entirely correct, even though Robert himself states so. The serialization library has two separate parts: the first part translates a C++ structure into a sequence of archive calls, the second part (the archive) translates this sequence to either a string of bytes or something else (a property tree, for example.) The problem (in this context) with the serialization library is that its first part inserts semi-opaque metadata into the stream which leads to the output being not particularly well suited for human-readable and editable configuration files. Anyway, as I'm already in the thread... In my opinion, the property tree library is at or above the average Boost quality and should be accepted. It is obvious (at least from a quick reading of the documentation) that it has a reasonably clear purpose and fulfills this purpose pretty well. There is a significant overlap with other existing and future (string_to) Boost components, but the overlap is inherent in the design and cannot be eliminated without making the library something else entirely. It is evident that the library is string-based at its heart and the parameterization on arbitrary key or value types is not yet a first-class citizen. It is likely that the interface needs to evolve a bit in this area. As an example, tree structures that hold intrusive_ptr<element> and can be indexed by path are very useful. On the other hand, we should evaluate whether a non-string key type works at all, since most interesting library features don't seem to be available, it degenerates into the simple struct shown at the start of the tutorial. I very much do not like the 'traits' policy parameter (which is not a traits class at all, even though basic_string says so.) Do not deceive the users that you have a single axis of parameterization when in fact you have six, some of them redundant, some of them more useful than others. 
basic_ptree should be templatized on key, data, ordering, inserter and extractor, not necessarily in that order. The default arguments should be clearly documented and useful outside of the library context. It should be possible to eliminate char_type, but I'm not 100% positive about it. One easy way to eliminate the distinct separator methods would be to make the path syntax always start with a separator. If we really want a non-string key type, one option I see is to use vector<key_type> as a path. It might be useful to make the library take filesystem paths, getting the operator/ for free. The INI parser could probably support more than one level by using l1.l2.l3.l4 as key (with an appropriate escaping scheme for the real dot.)
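The last suggestion, multi-level INI keys such as l1.l2.l3.l4 with an escape for a literal dot, might be sketched as follows. The escaping scheme (a backslash makes the next character literal) is an assumption chosen for this example, since the post leaves the scheme open, and `split_ini_key` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Split an INI key into tree levels on '.', treating "\." as a
// literal dot (assumed escaping scheme, not taken from the library).
std::vector<std::string> split_ini_key(const std::string& key) {
    std::vector<std::string> levels(1);  // start with one empty level
    for (std::size_t i = 0; i < key.size(); ++i) {
        char c = key[i];
        if (c == '\\' && i + 1 < key.size()) {
            levels.back() += key[++i];   // escaped character, taken literally
        } else if (c == '.') {
            levels.emplace_back();       // unescaped dot starts a new level
        } else {
            levels.back() += c;
        }
    }
    return levels;
}
```

The resulting vector of keys is also one form of the vector<key_type> path mentioned in the same post.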

"Peter Dimov" <pdimov@mmltd.net> writes:
I very much do not like the 'traits' policy parameter (which is not a traits class at all, even though basic_string says so.) Do not deceive the users that you have a single axis of parameterization when in fact you have six, some of them redundant, some of them more useful than others. basic_ptree should be templatized on key, data, ordering, inserter and extractor, not necessarily in that order. The default arguments should be clearly documented and useful outside of the library context. It should be possible to eliminate char_type, but I'm not 100% positive about it.
It may be possible to use the Boost.Parameter library to help keep this interface manageable. It now supports named *and* deduced (unnamed) template parameters. A deduced template parameter is one whose argument can be provided in any order because its type uniquely determines which parameter it maps to. -- Dave Abrahams Boost Consulting www.boost-consulting.com

Marcin Kalicinski wrote:
[...] that's the main problem with this submission. Instead of clearly specified problem domain and design that address issue in this domain, you present some mixture of half-good components each with unclear advantages over existing dedicated solution in each respective area. [...] We do need good tree storage. We do need better runtime parameters support library (IMO). We already have quite powerful solution for permanent storage support, but we could use more archive formats for it. What you propose is none of it.
I'm sure you will agree that ptree is different from serialization. If you don't agree, read some of quite interesting posts by Ivan Vecerina. The main difference is that serialization lib is _only_ about translation of some C++ structures to/from a string of bytes. This is a huge and complicated task that it does very well. But you must have noticed that some people refrained from using it, because the files it produces are not "config files" in wide meaning of this word. The main issue with them is they are not hand editable. This cannot and should not be remedied by supplying another parser, just read Robert post outside this thread.
I doubt this statement. If you agree to serialize only "good" structures, that require no object tracking, it will be possible to implement archive class that will project no tracking information that upsets you.
On the other hand program options cannot handle hierarchical structures very well unless it is improved, and even then it will probably present more complicated interface than ptree.
This sounds like FUD. How can you make any statements about an as-yet-nonexistent interface? - Volodya

Gennadiy Rozental wrote:
"Marcin Kalicinski" <kalita@poczta.onet.pl> wrote in message news:e2bn7h$s8v$1@sea.gmane.org...
Make up your mind finally. What is your library?
1. Runtime parameter facility 2. Permanent storage facility 3. XML parser
I'm afraid it is all 3, plus some more. By "some more" I mean it has more parsers than just XML, and it can also be used to manipulate hierarchical, human readable data structures at runtime.
And that's the main problem with this submission. Instead of clearly specified problem domain and design that address issue in this domain, you present some mixture of half-good components each with unclear advantages over existing dedicated solution in each respective area. I want faster search or no search at all - no can do.
It would be possible to rewrite the internals of property_tree with Boost.MultiIndex, and then it could use a hash index internally instead. Why is having no search so important? How does it hurt you that it is there (if the small overhead is not important anyway)?
I want some different versioning support for permanent storage - no can do.
What does this mean? versioning ala boost.serialization? (aside: I think it would be reasonable for ptree to be serializable)
I want some automatic validation and conflict resolution - no can do.
What does this mean?
I understand it's good enough for you. But this is just the choices you made. Coupling solution for independent problems under the hood on one library is the source of inflexibility and unacceptable for boost IMO.
We can also keep adding customization to a component until it suddenly collapses under its own weight. Gennadiy, I hope we can avoid too many flame wars. And I hope we can get to the bottom of the problems you mention so that you would also want to use this library (unlike PO). -Thorsten

Marcin Kalicinski wrote:
Make up your mind finally. What is your library?
1. Runtime parameter facility 2. Permanent storage facility 3. XML parser
I'm afraid it is all 3, plus some more. By "some more" I mean it has more parsers than just XML, and it can also be used to manipulate hierarchical, human readable data structures at runtime.
It seems like you are very confused yourself about the positioning of your library. Here, you say "it's all 3" where 3 is "XML parser", and in another thread you say:
On the other hand, if boost gets its XML library,
It would be pretty bad if Boost has property_tree that reads "simple" XML *and* another XML library, with a different interface, that has full support for XML. How am I to choose one or the other? I guess you either need to position your library as a generic tree with some extra indexing, or as a full-blown XML library. - Volodya

Vladimir Prus wrote:
Marcin Kalicinski wrote:
Make up your mind finally. What is your library?
1. Runtime parameter facility 2. Permanent storage facility 3. XML parser
I'm afraid it is all 3, plus some more. By "some more" I mean it has more parsers than just XML, and it can also be used to manipulate hierarchical, human readable data structures at runtime.
It seems like you are very confused yourself about the positioning of your library. Here, you say "it's all 3" where 3 is "XML parser", and in another thread you say:
On the other hand, if boost gets its XML library,
It would be pretty bad if Boost has property_tree that reads "simple" XML *and* another XML library, with different interface that has full support for XML. How am I to choose one or another library?
Why? -Thorsten

"Thorsten Ottosen" wrote
Vladimir Prus wrote:
On the other hand, if boost gets its XML library,
It would be pretty bad if Boost has property_tree that reads "simple" XML *and* another XML library, with different interface that has full support for XML. How am I to choose one or another library?
Why?
Property_tree must at least document what subset of XML its XML parser supports though surely? regards Andy Little

Andy Little wrote:
"Thorsten Ottosen" wrote
Vladimir Prus wrote:
On the other hand, if boost gets its XML library,
It would be pretty bad if Boost has property_tree that reads "simple" XML *and* another XML library, with different interface that has full support for XML. How am I to choose one or another library?
Why?
Property_tree must at least document what subset of XML its XML parser supports though surely?
Of course. This has been discussed elsewhere in the thread. -Thorsten

Thorsten Ottosen wrote:
It seems like you are very confused yourself about the positioning of your library. Here, you say "it's all 3" where 3 is "XML parser", and in another thread you say:
On the other hand, if boost gets its XML library,
It would be pretty bad if Boost has property_tree that reads "simple" XML *and* another XML library, with different interface that has full support for XML. How am I to choose one or another library?
Why?
Why choosing? Well, say I have the task of reading XML config files. I can use either property_tree or some future separate XML library. I can't use both, so I need to choose one. If I choose property_tree, then it's possible that several years later those specific config files will start using XML features not supported by property_tree and I'll need to rewrite my code. - Volodya

Vladimir Prus wrote:
Thorsten Ottosen wrote:
It seems like you are very confused yourself about the positioning of your library. Here, you say "it's all 3" where 3 is "XML parser", and in another thread you say:
On the other hand, if boost gets its XML library,
It would be pretty bad if Boost has property_tree that reads "simple" XML *and* another XML library, with different interface that has full support for XML. How am I to choose one or another library?
Why?
Why choosing?
No, why is it bad?
Well, say I have the task of reading XML config files. I can use either property_tree or some future separate XML library. I can't use both, so I need to choose one.
Right.
If I choose property_tree, then it's possible that several years later those specific config files will start using XML features not supported by property_tree and I'll need to rewrite my code.
Oh, well, that's everyday life for programmers. It's good to "program in the future tense" as Meyers says, but it is also a recipe for disaster to take too much into consideration. -Thorsten

Vladimir Prus wrote:
It would be pretty bad if Boost has property_tree that reads "simple" XML *and* another XML library, with different interface that has full support for XML. How am I to choose one or another library?
When boost gets a full XML library then the property_tree XML parser could (should, IMO) be replaced with a wrapper over that one. What's the problem? -- Daniel Wesslén

Daniel Wesslén wrote:
Vladimir Prus wrote:
It would be pretty bad if Boost has property_tree that reads "simple" XML *and* another XML library, with different interface that has full support for XML. How am I to choose one or another library?
When boost gets a full XML library then the property_tree XML parser could (should, IMO) be replaced with a wrapper over that one. What's the problem?
There is no need for that. As long as the scope of the xml-parser for a property tree is well-defined wrt. what subset of xml it can handle, the user is free to choose. It is not obvious how more advanced xml-features should map to a property-tree, and it is quite thinkable that it does not make much sense to use those together. -Thorsten

Thorsten Ottosen wrote:
Daniel Wesslén wrote:
Vladimir Prus wrote:
It would be pretty bad if Boost has property_tree that reads "simple" XML *and* another XML library, with different interface that has full support for XML. How am I to choose one or another library?
When boost gets a full XML library then the property_tree XML parser could (should, IMO) be replaced with a wrapper over that one. What's the problem?
There is no need for that. As long as the scope of the xml-parser for a property tree is well-defined wrt. what subset of xml it can handle, the user is free to choose.
It is not obvious how more advanced xml-features should map to a property-tree, and it is quite thinkable that it does not make much sense to use those together.
Schema validation, XInclude and encodings could be transparently supported. Sure, some advanced features wouldn't map neatly to a property_tree, but that is no reason to rule out support for those that do. -- Daniel Wesslén

Daniel Wesslén wrote:
Thorsten Ottosen wrote:
When boost gets a full XML library then the property_tree XML parser could (should, IMO) be replaced with a wrapper over that one. What's the problem?
There is no need for that. As long as the scope of the xml-parser for a property tree is well-defined wrt. what subset of xml it can handle, the user is free to choose.
It is not obvious how more advanced xml-features should map to a property-tree, and it is quite thinkable that it does not make much sense to use those together.
Schema validation, XInclude and encodings could be transparently supported. Sure, some advanced features wouldn't map neatly to a property_tree, but that is no reason to rule out support for those that do.
I'm not qualified enough to reply to those claims. But let's use this sub-thread to discuss the scope of the xml-parser. -Thorsten

Thorsten Ottosen wrote:
Daniel Wesslén wrote:
Thorsten Ottosen wrote:
When boost gets a full XML library then the property_tree XML parser could (should, IMO) be replaced with a wrapper over that one. What's the problem?
There is no need for that. As long as the scope of the xml-parser for a property tree is well-defined wrt. what subset of xml it can handle, the user is free to choose.
It is not obvious how more advanced xml-features should map to a property-tree, and it is quite thinkable that it does not make much sense to use those together.
Schema validation, XInclude and encodings could be transparently supported. Sure, some advanced features wouldn't map neatly to a property_tree, but that is no reason to rule out support for those that do.
I'm not qualified enough to reply to those claims.
But let's use this sub-thread to discuss the scope of the xml-parser.
Given that property_tree isn't meant to be an XML library as such, I don't think there is much need. I agree that it should be clearly documented which features of XML the parser can handle. After that is done, there is no reason not to add features that can fit in the tree. I was simply suggesting that when we get a full-featured parser, then that one should be used for property_tree as well, to support as much of XML as possible within the constraints of what's feasible to store in the ptree. -- Daniel Wesslén

Daniel Wesslén wrote:
Thorsten Ottosen wrote:
But let's use this sub-thread to discuss the scope of the xml-parser.
Given that property_tree isn't meant to be an XML library as such, I don't think there is much need. I agree that it should be clearly documented which features of XML the parser can handle. After that is done, there is no reason not to add features that can fit in the tree.
I was simply suggesting that when we get a full-featured parser, then that one should be used for property_tree as well, to support as much of XML as possible within the constraints of what's feasible to store in the ptree.
Ok, I see. But some users might prefer the current parser because it is more efficient when it doesn't need to worry about advanced features. Those users would then pay when the parser is replaced. -Thorsten

Thorsten Ottosen wrote:
Daniel Wesslén wrote:
Given that property_tree isn't meant to be an XML library as such, I don't think there is much need. I agree that it should be clearly documented which features of XML the parser can handle. After that is done, there is no reason not to add features that can fit in the tree.
I was simply suggesting that when we get a full-featured parser, then that one should be used for property_tree as well, to support as much of XML as possible within the constraints of what's feasible to store in the ptree.
Ok, I see. But some users might prefer the current parser because it is more efficient when it doesn't need to worry about advanced features. Those users would then pay when the parser is replaced.
Ah. In that case I agree with you. I'd like to have both available, but at some point a line has to be drawn, and that place may well be before adding two XML parsers to ptree. Writing a translator from a W3DOM or other XML representation to a ptree should be trivial in any case, and could be provided as an example if nothing else. -- Daniel Wesslén
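As a rough illustration of how small such a translator could be, here is a minimal sketch. Everything in it is an assumption made for the example: `DomElement` is a hypothetical, simplified stand-in for a DOM node (not the W3C DOM interface), `Node` is a stand-in for a property tree node (not the library's class), and placing attributes under a `<xmlattr>` child is only one possible convention.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified DOM element; not the W3C DOM interface.
struct DomElement {
    std::string name;
    std::map<std::string, std::string> attributes;
    std::string text;
    std::vector<DomElement> children;
};

// Stand-in for a property tree node: a key, one string of data, children.
struct Node {
    std::string key;
    std::string data;
    std::vector<Node> children;
};

// Recursively copy an element into the tree. Attributes are grouped under
// a "<xmlattr>" child (an assumed convention), child elements follow.
Node to_tree(const DomElement& e) {
    Node n{e.name, e.text, {}};
    if (!e.attributes.empty()) {
        Node attrs{"<xmlattr>", "", {}};
        for (const auto& kv : e.attributes)
            attrs.children.push_back(Node{kv.first, kv.second, {}});
        n.children.push_back(attrs);
    }
    for (const DomElement& c : e.children)
        n.children.push_back(to_tree(c));
    return n;
}

// Small sample: <debug enabled="true"><level>2</level></debug>
DomElement sample_dom() {
    DomElement level{"level", {}, "2", {}};
    return DomElement{"debug", {{"enabled", "true"}}, "", {level}};
}
```

Calling `to_tree(sample_dom())` yields a `debug` node with two children: the attribute group and the `level` element. Anything the tree cannot represent (processing instructions, entity declarations and so on) would simply be dropped by a translator of this kind.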

Daniel Wesslén wrote:
Thorsten Ottosen wrote:
Ok, I see. But some users might prefer the current parser because it is more efficient when it doesn't need to worry about advanced features. Those users would then pay when the parser is replaced.
Ah. In that case I agree with you.
I'd like to have both available, but at some point a line has to be drawn, and that place may well be before adding two XML parsers to ptree. Writing a translator from a W3DOM or other XML representation to a ptree should be trivial in any case, and could be provided as an example if nothing else.
I agree. In fact, I think writing an 'xml parser' that doesn't provide full XML support is asking for trouble. Before long people will want to use XML features not supported by such a stripped-down parser, and will get confused if things break. It is much cleaner to draw the line such that the property tree library provides some builtin storage formats (with associated readers and writers), and an easy way to generate / translate new trees from other formats such as XML. I still need to get back to my XML DOM library proposal... :-( Regards, Stefan

Stefan Seefeld wrote:
Daniel Wesslén wrote:
Thorsten Ottosen wrote:
Ok, I see. But some users might prefer the current parser because it is more efficient when it doesn't need to worry about advanced features. Those users would then pay when the parser is replaced. Ah. In that case I agree with you.
I'd like to have both available, but at some point a line has to be drawn, and that place may well be before adding two XML parsers to ptree. Writing a translator from a W3DOM or other XML representation to a ptree should be trivial in any case, and could be provided as an example if nothing else.
I agree. In fact, I think writing an 'xml parser' that doesn't provide full XML support is asking for trouble. Before long people will want to use XML features not supported by such a stripped-down parser, and will get confused if things break.
When will that "before long" point be? The argument is that in most(?) cases, it will be never. My problem with a full blown XML parser is that we pay for a lot of the features that we do not need. Cheers, -- Joel de Guzman http://www.boost-consulting.com http://spirit.sf.net

Joel de Guzman wrote:
Stefan Seefeld wrote:
I agree. In fact, I think writing an 'xml parser' that doesn't provide full XML support is asking for trouble. Before long people will want to use XML features not supported by such a stripped-down parser, and will get confused if things break.
When will that "before long" point be? The argument is that in most(?) cases, it will be never. My problem with a full blown XML parser is that we pay for a lot of the features that we do not need.
Sorry for not being clear about what I mean with 'full XML support'. The XML specs are very modular. I'm definitely not talking about things such as validation. However, some aspects are already part of XML parsing (entity lookup, for example), or are quite handy to have (xinclude processing, say). But even if we are only talking about basic XML parsing the parser has to be aware of quite a lot of aspects to be considered standard-conformant. I don't think that can be hacked together quickly. Regards, Stefan

Stefan Seefeld wrote:
Joel de Guzman wrote:
Stefan Seefeld wrote:
I agree. In fact, I think writing an 'xml parser' that doesn't provide full XML support is asking for trouble. Before long people will want to use XML features not supported by such a stripped-down parser, and will get confused if things break. When will that "before long" point be? The argument is that in most(?) cases, it will be never. My problem with a full blown XML parser is that we pay for a lot of the features that we do not need.
Sorry for not being clear about what I mean with 'full XML support'. The XML specs are very modular. I'm definitely not talking about things such as validation. However, some aspects are already part of XML parsing (entity lookup, for example), or are quite handy to have (xinclude processing, say). But even if we are only talking about basic XML parsing the parser has to be aware of quite a lot of aspects to be considered standard-conformant. I don't think that can be hacked together quickly.
Right. FWIW, I think Dan Nuffer's XML parser is not a hack. The spirit XML parsers implement the full XML grammar. http://spirit.sourceforge.net/repository/applications/show_contents.php Regards, -- Joel de Guzman http://www.boost-consulting.com http://spirit.sf.net

Right. FWIW, I think Dan Nuffer's XML parser is not a hack. The spirit XML parsers implement the full XML grammar.
http://spirit.sourceforge.net/repository/applications/show_contents.php
My knowledge of XML is limited, but I think Dan Nuffer's parser will parse any valid XML. read_xml however discards all that goes beyond nodes, attributes, data and comments. Best regards, Marcin

On 4/24/06, Marcin Kalicinski <kalita@poczta.onet.pl> wrote:
Right. FWIW, I think Dan Nuffer's XML parser is not a hack. The spirit XML parsers implement the full XML grammar.
http://spirit.sourceforge.net/repository/applications/show_contents.php
My knowledge of XML is limited, but I think Dan Nuffer's parser will parse any valid XML. read_xml however discards all that goes beyond nodes, attributes, data and comments.
Isn't the property_tree XML parser originally based on Dan Nuffer's? Couldn't the productions/tokens from the Nuffer parser be added back to read_xml() so that it could at least accept the syntax for all XML files even if it doesn't implement the semantics? I think the runtime overhead of the additional productions in the grammar would be negligible for simple XML files that don't use the features and necessary for XML files that do. It seems to me this could clarify the scope of the parser. The documentation could read something like:
"read_xml() performs non-validated parsing of the W3C recommendation XML 1.1. In addition, as of version 1.3x, read_xml() parses but ignores the following W3C specifications: XML Names, XInclude, XLink/XPointer, XML Schema, XSLT, ..."
... changing version numbers as appropriate. Also, it may simplify maintenance as far as pulling bug-fixes/enhancements from the Nuffer parser code-base to property_tree. Daniel Walker

Daniel Walker wrote:
On 4/24/06, Marcin Kalicinski <kalita@poczta.onet.pl> wrote:
Right. FWIW, I think Dan Nuffer's XML parser is not a hack. The spirit XML parsers implement the full XML grammar.
http://spirit.sourceforge.net/repository/applications/show_contents.php
My knowledge of XML is limited, but I think Dan Nuffer's parser will parse any valid XML. read_xml however discards all that goes beyond nodes, attributes, data and comments.
Isn't the property_tree XML parser originally based on Dan Nuffer's? Couldn't the productions/tokens from the Nuffer parser be added back to read_xml() so that it could at least accept the syntax for all XML files even if it doesn't implement the semantics? I think the runtime overhead of the additional productions in the grammar would be negligible for simple XML files that don't use the features and necessary for XML files that do. It seems to me this could clarify the scope of the parser. The documentation could read something like:
"read_xml() performs non-validated parsing of the W3C recommendation XML 1.1. In addition, as of version 1.3x, read_xml() parses but ignores the following W3C specifications: XML Names, XInclude, XLink/XPointer, XML Schema, XSLT, ..."
... changing version numbers as appropriate. Also, it may simplify maintenance as far as pulling bug-fixes/enhancements from the Nuffer parser code-base to property_tree.
Maybe what is needed here is two functions: read_simple_xml(...); read_complex_xml(...); -Thorsten

Daniel Walker wrote:
On 4/24/06, Marcin Kalicinski <kalita@poczta.onet.pl> wrote:
My knowledge of XML is limited, but I think Dan Nuffer's parser will parse any valid XML. read_xml however discards all that goes beyond nodes, attributes, data and comments.
Isn't the property_tree XML parser originally based on Dan Nuffer's? Couldn't the productions/tokens from the Nuffer parser be added back to read_xml() so that it could at least accept the syntax for all XML files even if it doesn't implement the semantics? I think the runtime overhead of the additional productions in the grammar would be negligible for simple XML files that don't use the features and necessary for XML files that do. It seems to me this could clarify the scope of the parser. The documentation could read something like:
"read_xml() performs non-validated parsing of the W3C recommendation XML 1.1. In addition, as of version 1.3x, read_xml() parses but ignores the following W3C specifications: XML Names, XInclude, XLink/XPointer, XML Schema, XSLT, ..."
... changing version numbers as appropriate. Also, it may simplify maintenance as far as pulling bug-fixes/enhancements from the Nuffer parser code-base to property_tree.
The property tree's parser is, I believe, either a very slightly modified Dan Nuffer parser (just semantic actions were added, compared to the file I've seen), or built on the same principle: direct translation of the grammar spec in the XML specification. It is, with the exception of missing entities, a complete non-validating parser of the XML spec, as far as I can see, with the important exception of character set compatibility: the parser parses only files in the character set specified by the current global locale, and will completely ignore the character set specification of the header.
Another missing part may be the parsing of the internal DTD subset, which might be (not sure yet) a required thing for non-validating parsers.
In addition, it is an XML 1.0 parser.
The Namespaces in XML, XInclude, XLink, XPointer, ... specifications are all built on top of XML; they are all well-formed XML. "Parsing but ignoring" them means nothing and can only lead to misunderstandings. Sebastian Redl

On 4/29/06, Sebastian Redl <sebastian.redl@getdesigned.at> wrote:
Daniel Walker wrote:
On 4/24/06, Marcin Kalicinski <kalita@poczta.onet.pl> wrote:
My knowledge of XML is limited, but I think Dan Nuffer's parser will parse any valid XML. read_xml however discards all that goes beyond nodes, attributes, data and comments.
Isn't the property_tree XML parser originally based on Dan Nuffer's? Couldn't the productions/tokens from the Nuffer parser be added back to read_xml() so that it could at least accept the syntax for all XML files even if it doesn't implement the semantics? I think the runtime overhead of the additional productions in the grammar would be negligible for simple XML files that don't use the features and necessary for XML files that do. It seems to me this could clarify the scope of the parser. The documentation could read something like:
"read_xml() performs non-validated parsing of the W3C recommendation XML 1.1. In addition, as of version 1.3x, read_xml() parses but ignores the following W3C specifications: XML Names, XInclude, XLink/XPointer, XML Schema, XSLT, ..."
... changing version numbers as appropriate. Also, it may simplify maintenance as far as pulling bug-fixes/enhancements from the Nuffer parser code-base to property_tree.
The property tree's parser is, I believe, either a very slightly modified Dan Nuffer parser (just semantic actions were added, compared to the file I've seen), or built on the same principle: direct translation of the grammar spec in the XML specification. It is, with the exception of missing entities, a complete non-validating parser of the XML spec, as far as I can see, with the important exception of character set compatibility: the parser parses only files in the character set specified by the current global locale, and will completely ignore the character set specification of the header.
If the missing entities were added, then couldn't we just call it a non-validating XML parser? The character set issues could be mentioned as a caveat in the documentation.
Another missing part may be the parsing of the internal DTD subset, which might be (not sure yet) a required thing for non-validating parsers.
I've tested this a little and read_xml() does accept some DTDs. DTDs are part of the XML specification, though validation is not required.
In addition, it is an XML 1.0 parser.
The Nuffer parser is XML 1.0, right? If that's the case, why not just re-incorporate the missing features from the Nuffer parser, and then the documentation could say read_xml() is an XML 1.0 parser. That seems less confusing to me.
The Namespaces in XML, XInclude, XLink, XPointer, ... specifications are all built on top of XML; they are all well-formed XML. "Parsing but ignoring" them means nothing and can only lead to misunderstandings.
"parsing but ignoring" may not be the best phrase. I wanted to say something to indicate that though the parser recognizes constructs used in namespaces, includes, etc. (because, yes, they are valid XML), the parser doesn't actually do anything (it just ignores them). For example, it parses the xmlns attribute in an entity but doesn't generate a unique qualified name that children of the entity inherit. Daniel Walker

Daniel Wesslén wrote:
Vladimir Prus wrote:
It would be pretty bad if Boost has property_tree that reads "simple" XML *and* another XML library, with different interface that has full support for XML. How am I to choose one or another library?
When boost gets a full XML library then the property_tree XML parser could (should, IMO) be replaced with a wrapper over that one. What's the problem?
The problem is that property_tree interface is very different from W3DOM, so we'll have two interfaces for XML which are completely different. - Volodya

Vladimir Prus wrote:
Daniel Wesslén wrote:
Vladimir Prus wrote:
It would be pretty bad if Boost has property_tree that reads "simple" XML *and* another XML library, with different interface that has full support for XML. How am I to choose one or another library?
When boost gets a full XML library then the property_tree XML parser could (should, IMO) be replaced with a wrapper over that one. What's the problem?
The problem is that property_tree interface is very different from W3DOM, so we'll have two interfaces for XML which are completely different.
It's somewhat like the situation with multi_index / bimap being discussed at the moment. Multi_index supports everything a bimap would, but its interface is more cumbersome. I for one won't use a W3DOM-like library if we get one, but I would happily use property_tree. I've also only used multi_index once, and that was to use it as a bidirectional map. Property_tree covers other areas as well as being a potential subset of an XML library, but I still hold there is value in such a subset. -- Daniel Wesslén

It seems like you are very confused yourself about the positioning of your library.
You are right. I made an effort to change this state of things. Please read it in another thread "[Property Tree] Problem domain". Here's the shortcut http://tinyurl.com/fjmdd Best regards, Marcin

Marcin Kalicinski wrote:
It seems like you are very confused yourself about the positioning of your library.
You are right. I made an effort to change this state of things. Please read it in another thread "[Property Tree] Problem domain". Here's the shortcut http://tinyurl.com/fjmdd
"Get it working for 80% of cases now, rather than for 100% never."
This is a very good summary.
I can add to
5. target for boost::serialization (ptree_archive).
that once you have an in-memory archive, you can now edit the in-memory representation with a graphical editor, and deserialize the result back to C++. This allows you to edit your C++ data structures in a live program; you have no idea how convenient this is if you haven't tried it!
Whether ptree supports this adequately remains to be seen.
There's also
7. If you keep your program objects in a ptree, it is relatively easy to add a quick scripting interface. Given
namespace.(namespace...).object.verb
get an object pointer from the ptree by using "namespace.(namespace...).object" as a path, then invoke verb on that object. For all its simplicity, this is amazingly useful, too. You can build a key binding module on top of that:
ctrl+o factory.document.open
ctrl+n factory.document.new
and so on. You can also add events to your buttons or menus in this way. It's not Python, but it may be enough.

get an object pointer from the ptree by using "namespace.(namespace...).object" as a path, then invoke verb on that object. For all its simplicity, this is amazingly useful, too. You can build a key binding module on top of that:
ctrl+o factory.document.open ctrl+n factory.document.new
and so on. You can also add events to your buttons or menus in this way. It's not Python, but it may be enough.
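The path-plus-verb dispatch described in the quoted passage can be sketched roughly as follows. This is only an illustration under stated assumptions: `Scriptable`, `Node`, `find_node`, `run_command`, `press` and `Document` are hypothetical names invented for the example, and `Node` is a stand-in tree that stores object pointers, not the property tree class under review.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical interface for anything that can receive a verb.
struct Scriptable {
    virtual ~Scriptable() = default;
    virtual bool invoke(const std::string& verb) = 0;
};

// Stand-in tree node that may hold a program object.
struct Node {
    std::string key;
    std::shared_ptr<Scriptable> object;
    std::vector<Node> children;
};

// Walk a dot-separated path; nullptr if any component is missing.
const Node* find_node(const Node& root, const std::string& path) {
    const Node* cur = &root;
    std::size_t start = 0;
    while (cur && start <= path.size()) {
        std::size_t dot = path.find('.', start);
        if (dot == std::string::npos) dot = path.size();
        const std::string key = path.substr(start, dot - start);
        const Node* next = nullptr;
        for (const Node& c : cur->children)
            if (c.key == key) { next = &c; break; }
        cur = next;
        start = dot + 1;
    }
    return cur;
}

// Split "a.b.object.verb" at the last dot, look up the object, invoke the verb.
bool run_command(const Node& root, const std::string& command) {
    const std::size_t dot = command.rfind('.');
    if (dot == std::string::npos) return false;
    const Node* n = find_node(root, command.substr(0, dot));
    if (!n || !n->object) return false;
    return n->object->invoke(command.substr(dot + 1));
}

// Key bindings are just a map from key name to command string.
bool press(const Node& root,
           const std::map<std::string, std::string>& bindings,
           const std::string& key) {
    const auto it = bindings.find(key);
    return it != bindings.end() && run_command(root, it->second);
}

// Example object that records the verbs it accepted.
struct Document : Scriptable {
    std::vector<std::string> log;
    bool invoke(const std::string& verb) override {
        if (verb != "open" && verb != "new") return false;
        log.push_back(verb);
        return true;
    }
};

Node sample_tree(std::shared_ptr<Document> doc) {
    return Node{"", nullptr,
                {Node{"factory", nullptr, {Node{"document", doc, {}}}}}};
}

// Press two bound keys and one unbound key; return what the document saw.
std::vector<std::string> demo() {
    auto doc = std::make_shared<Document>();
    const Node root = sample_tree(doc);
    const std::map<std::string, std::string> bindings = {
        {"ctrl+o", "factory.document.open"},
        {"ctrl+n", "factory.document.new"}};
    press(root, bindings, "ctrl+o");
    press(root, bindings, "ctrl+n");
    press(root, bindings, "ctrl+x");  // unbound key, ignored
    return doc->log;
}
```

Menu items or buttons could be wired up the same way by storing a command string per widget and passing it to `run_command`.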
This reminds me of another way I used ptree, but because that was several years ago, the details escaped my memory. Basically, it was used as a simple "scripting language". The language was mostly used for evaluating conditions specified in text files. They were read into a property tree, and then a very simple function evaluated them. Hierarchical structure of the data turned out to be very well suited for specifying arbitrarily complex conditions. For example this condition:
    if (v1 == v2 && (v3 < v4 || v5 > v6))
was encoded like that in the file:
    Condition And
    {
        Equals { Value1 v1 Value2 v2 }
        Condition Or
        {
            Less { Value1 v3 Value2 v4 }
            Greater { Value1 v5 Value2 v6 }
        }
    }
This is quite an obscure example, but hopefully shows that ptree has potential uses that go beyond "configuration reading/writing". This of course could be done using a full blown scripting language like Python. But why summon the monster snake when ptree + a function that was slightly more than 1 screen were enough? Best regards, Marcin
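A minimal sketch of what such an evaluation function might look like, assuming the file has already been read into a tree. The `Node` struct is a stand-in for a property tree node, and `Variables`, `operand`, `evaluate`, `compare` and `sample_condition` are hypothetical names; the original function is not shown in the thread, so this is a reconstruction of the idea only.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Stand-in for a property tree node: a key, one string of data, children.
struct Node {
    std::string key;
    std::string data;
    std::vector<Node> children;
};

using Variables = std::map<std::string, int>;

// Resolve the variable named by child `key` (Value1 or Value2) of a test node.
int operand(const Node& test, const std::string& key, const Variables& vars) {
    for (const Node& c : test.children)
        if (c.key == key) return vars.at(c.data);
    throw std::runtime_error("missing operand " + key);
}

// "Condition And/Or" nodes combine their children; other nodes are tests.
bool evaluate(const Node& n, const Variables& vars) {
    if (n.key == "Condition") {
        const bool is_and = (n.data == "And");
        if (!is_and && n.data != "Or")
            throw std::runtime_error("unknown combinator " + n.data);
        // And stops at the first false child, Or at the first true child.
        for (const Node& c : n.children)
            if (evaluate(c, vars) != is_and) return !is_and;
        return is_and;
    }
    const int a = operand(n, "Value1", vars);
    const int b = operand(n, "Value2", vars);
    if (n.key == "Equals") return a == b;
    if (n.key == "Less") return a < b;
    if (n.key == "Greater") return a > b;
    throw std::runtime_error("unknown test " + n.key);
}

// Helpers that build the tree from the example in the message above.
Node compare(const std::string& op, const std::string& lhs,
             const std::string& rhs) {
    return Node{op, "", {Node{"Value1", lhs, {}}, Node{"Value2", rhs, {}}}};
}

Node sample_condition() {
    Node either{"Condition", "Or",
                {compare("Less", "v3", "v4"), compare("Greater", "v5", "v6")}};
    return Node{"Condition", "And", {compare("Equals", "v1", "v2"), either}};
}
```

With v1 = v2 and v5 > v6 the sample condition holds; changing v1 so that it differs from v2 makes it fail regardless of the other values.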

"Peter Dimov" <pdimov@mmltd.net> wrote in message news:017f01c6679f$7e86eaf0$6507a8c0@pdimov2...
Marcin Kalicinski wrote:
It seems like you are very confused yourself about the positioning of your library.
You are right. I made an effort to change this state of things. Please read it in another thread "[Property Tree] Problem domain". Here's the shortcut http://tinyurl.com/fjmdd
"Get it working for 80% of cases now, rather than for 100% never."
This is a very good summary.
I can add to
5. target for boost::serialization (ptree_archive).
that once you have an in-memory archive, you can now edit the in-memory representation with a graphical editor, and deserialize the result back to C++. This allows you to edit your C++ data structures in a live program; you have no idea how convenient this is if you haven't tried it!
Whether ptree supports this adequately remains to be seen.
I don't think this is a task that should be placed on ptree's shoulders. If we had a generic tree, a generic tree_archive could indeed be used for this purpose. There still remains the question of how convenient it would be, given the serialization library's requirements on what an archive should look like (IOW, what you would really be able to change other than field values, which could obviously be done directly). Gennadiy

On Fri, 21 Apr 2006 10:54:57 -0400 "Gennadiy Rozental" <gennadiy.rozental@thomson.com> wrote:
I could achieve the same with multi_index serialized/deserialized from
different archives. And I do not see any advantages in your solution.
Gennadiy, You have stated several times that this library is not needed because it is easy to replace with some combination of multi_index, serialization, and program_options. Could you please provide us with an example? I think this is especially important because both MI and serialization have large learning curves, and a number of us have never really used either in real work. For myself, I've played with serialization only enough to realize that it is way too slow for any real work we are doing. This type of stuff is probably not performance prohibitive from a serialization POV. I have also tried program_options, but it is just too awkward to use for anything but trivial applications. I like what property_tree has to offer, but if it is trivially accomplished via existing means, I'd surely like to see an example. Thanks!

"Jody Hagins" <jody-boost-011304@atdesk.com> wrote in message news:20060424102955.2c83d6b2.jody-boost-011304@atdesk.com...
On Fri, 21 Apr 2006 10:54:57 -0400 "Gennadiy Rozental" <gennadiy.rozental@thomson.com> wrote:
I could achieve the same with multi_index serialized/deserialized from
different archives. And I do not see any advantages in your solution.
Gennadiy,
You have stated several times that this library is not needed because it is easy to replace with some combination of multi_index, serialization, and program_options. Could you please provide us with an example?
You'd better pick up the pace. We are way beyond this already ;)) Try the Problem Domain thread for more on my views.
I think this is especially important because both MI and serialization have large learning curves, and a number of us have never really used either in real work.
I would assume respective library authors may disagree.
For myself, I've played with serialization only enough to realize that it is way too slow for any real work we are doing. This type of stuff is probably not performance prohibitive from a serialization POV.
I don't see performance playing any role in this discussion. Once you choose a DOM model, you have already thrown performance out of the window.
I have also tried program_options, but it is just too awkward to use for anything but trivial applications.
Oh. Here we are in complete consensus. Gennadiy

Marcin Kalicinski wrote:
First and foremost I would like to remind everybody that we already have one library intended to cover this problem domain (completely unacceptable IMO - but that's another story).
You are talking about Program Options library. I don't quite agree it covers the same problem domain. I feel it is more focused on command-line than on reading configuration in general. The biggest difference is that property_tree is a DOM, not a linear structure.
If that's DOM, then why is the interface so radically different from the W3C DOM spec? I understand that you mean "DOM" in the broad sense, but anyway, all libraries that claim to work with XML use a specific DOM interface. It might not be excellent, but it allows one to write essentially the same code using all existing libs. - Volodya

You are talking about Program Options library. I don't quite agree it covers the same problem domain. I feel it is more focused on command-line than on reading configuration in general. The biggest difference is that property_tree is a DOM, not a linear structure.
If that's DOM, then why is the interface so radically different from the W3C DOM spec? I understand that you mean "DOM" in the broad sense, but anyway, all libraries that claim to work with XML use a specific DOM interface. It might not be excellent, but it allows one to write essentially the same code using all existing libs.
The reason it does not use the W3C DOM standard is that this is not an XML library. It is meant to support as many hierarchical file formats as possible. I believe this can be achieved only through making it as general as possible - thus only one string per data and one string per key (apart from traits customization, which is another story). No "special" nodes, like comments, leaves/branches, attributes etc. The nice side effect of that simplicity is that the library is extremely easy to use in many cases. Think about _three_ lines of code and one include to read an arbitrarily-typed value from a config file - in any supported format. I don't think this can be matched by W3C DOM. Best regards, Marcin
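To make the "one string per data, one string per key" design concrete, here is a small sketch of how a typed read over a dotted path can sit on top of such a tree. It deliberately does not use the library itself: `Node`, `child`, `find`, `get` and `sample_config` are hypothetical names for a stand-in structure, and converting through a stream is an assumption about one reasonable way to do it.

```cpp
#include <cassert>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Stand-in for a property tree node: every node is just a key, one string
// of data and an ordered list of children. No special node kinds.
struct Node {
    std::string key;
    std::string data;
    std::vector<Node> children;

    // First child with the given key, or nullptr.
    const Node* child(const std::string& k) const {
        for (const Node& c : children)
            if (c.key == k) return &c;
        return nullptr;
    }

    // Walk a dot-separated path such as "debug.level".
    const Node* find(const std::string& path) const {
        const Node* cur = this;
        std::size_t start = 0;
        while (cur && start <= path.size()) {
            std::size_t dot = path.find('.', start);
            if (dot == std::string::npos) dot = path.size();
            cur = cur->child(path.substr(start, dot - start));
            start = dot + 1;
        }
        return cur;
    }

    // Convert the data at `path` to T through a stream. Empty result if the
    // path is missing or the text does not parse as T.
    template <typename T>
    std::optional<T> get(const std::string& path) const {
        const Node* n = find(path);
        if (!n) return std::nullopt;
        std::istringstream in(n->data);
        T value;
        if (!(in >> value)) return std::nullopt;
        return value;
    }
};

// What a parser for any of the supported formats might produce for a
// config with debug.level = 2 and debug.filename = debug.log.
Node sample_config() {
    Node level{"level", "2", {}};
    Node file{"filename", "debug.log", {}};
    Node debug{"debug", "", {level, file}};
    return Node{"", "", {debug}};
}
```

Because the tree only stores strings, the same `get<int>("debug.level")` call works no matter which file format the tree was read from; the format-specific work is confined to the reader that fills the tree.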

"Thorsten Ottosen" <thorsten.ottosen@dezide.com> wrote in message news:44453A4C.2040302@dezide.com...
* What is your evaluation of the design?
I wanted to step back from the nit-picky talks about how ptree is not as good as this other boost library for this specific use... However, after thinking of how ptree compares to boost::serialization, I can't take away from my mind the idea that someone should assess the possibility to create a boost::serialization archive that reads/writes objects to/from an in-memory ptree. It seems feasible since boost::serialization can i/o to XML. Similarly, we should then be able to use boost::serialization to i/o ptree to an xml file - preferably resulting in the exact same data as if objects had been serialized directly to XML. Could we get an expert/author/maintainer of boost::serialization to assess this possibility? (I admit I barely tried boost::s11n) This might seem cumbersome to do at this point, but there is potential for synergy (sharing a common file format back-end, and leveraging the current s11n infrastructure for converting objects to ptree textual data). Am I just crazy to believe that this could be an option? [naysayers please abstain if uninformed] Ivan -- http://ivan.vecerina.com/contact/?subject=NG_POST <- email contact form

However, after thinking of how ptree compares to boost::serialization, I can't take away from my mind the idea that someone should assess the possibility to create a boost::serialization archive that reads/writes objects to/from an in-memory ptree. It seems feasible since boost::serialization can i/o to XML.
Similarly, we should then be able to use boost::serialization to i/o ptree to an xml file - preferably resulting in the exact same data as if objects had been serialized directly to XML.
I'm really pleased to say I have already done that. It is indeed simple, about 100 lines of code for input and output archives. The implementation of ptree_archive class is not first class at the moment (it's rather a quick hack), so I refrain from posting it at the moment. But please have a look at this sample of how serialization data looks in JSON format. To create that I just used write_json() after writing some simple C++ data structures to ptree by means of ptree_archive:
    {
        "d": {
            "class_id_type": "1",
            "class_name_type": "class Derived",
            "tracking_type": "0",
            "version_type": "0",
            "Base": {
                "class_id_optional_type": "0",
                "tracking_type": "0",
                "version_type": "0",
                "b": "78"
            },
            "d": "111"
        },
        "d2": {
            "class_id_type": "1",
            "Base": { "b": "11" },
            "d": "146"
        },
        "b": {
            "class_id_type": "0",
            "b": "52"
        },
        "i": "3",
        "v": {
            "class_id_optional_type": "2",
            "tracking_type": "0",
            "version_type": "0",
            "count": "5",
            "item": { "count": "3", "item": "1", "item": "2", "item": "3" },
            "item": { "count": "0" },
            "item": { "count": "0" },
            "item": { "count": "0" },
            "item": { "count": "0" }
        }
    }
Note how all the housekeeping stuff (class_ids, tracking_ids etc.) makes it unsuitable for hand-editing. Best regards, Marcin

"Marcin Kalicinski" <kalita@poczta.onet.pl> wrote in message news:e2bm6r$pnb$1@sea.gmane.org...
However, after thinking of how ptree compares to boost::serialization, I can't take away from my mind the idea that someone should assess the possibility to create a boost::serialization archive that reads/writes objects to/from an in-memory ptree. It seems feasible since boost::serialization can i/o to XML.
Similarly, we should then be able to use boost::serialization to i/o ptree to an xml file - preferably resulting in the exact same data as if objects had been serialized directly to XML.
I'm really pleased to say I have already done that. It is indeed simple, about 100 lines of code for input and output archives. The implementation of ptree_archive class is not first class at the moment (it's rather a quick hack), so I refrain from posting it at the moment.
But please have a look at this sample of how serialization data looks in JSON format. To create that I just used write_json() after writing some simple C++ data structures to ptree by means of ptree_archive:
Good. As a start this highlights some of the differences between ptree and s11n - which I didn't think of because I don't use s11n:
Even though it supports XML, serialization is designed to function without field names. Adding XML tags is an afterthought, a limping add-on.
The output is less suitable for human editing. At least when no dynamic typing is involved (e.g. an in-line field of an enclosing class), I would have hoped that no additional type and version tags would be needed. There sure are good reasons for that in s11n, but it does make the mechanism less suitable for human-editable data files. And probably also for XML support in general.
I suspect also that the file-read system in s11n is sensitive to the order in which the fields are created.
It would be great to find a common path between ptree and s11n. Maybe we will some day.
But it is not reasonable to say that ptree is fully redundant with existing boost libraries. Not any more than regex is redundant with boost::spirit. Ivan -- http://ivan.vecerina.com/contact/?subject=NG_POST <- email contact form

On 4/22/2006 3:55, Ivan Vecerina wrote:
At least when no dynamic typing is involved (e.g. an in-line field of an enclosing class), I would have hoped that no additional type and version tags would be needed. There sure are good reasons for that in s11n, but it does make the mechanism less suitable for human-editable data files. And probably also for XML support in general. I suspect also that the file-read system in s11n is sensitive to the order in which the fields are created.
Just a note, type tracking can be disabled in the case you described and version tags can be disabled in general. In this case, hand-editing becomes feasible: boost::serialization XML archives are what I currently use for config files, though my needs may be more limited than most. You're right that the order of fields matters in boost::serialization, but the order of tags does matter in XML in general.

On Sun, 23 Apr 2006 11:01:10 -0400, Eugene Talagrand wrote
Just a note, type tracking can be disabled in the case you described and version tags can be disabled in general. In this case, hand-editing becomes feasible: boost::serialization XML archives are what I currently use for config files, though my needs may be more limited than most. You're right that the order of fields matters in boost::serialization, but the order of tags does matter in XML in general.
Well, the order of fields only matters in boost::serialization because all the current archive classes read in order from the serialized data. I assure you that a serialization archive can be built that doesn't depend directly on order. Essentially the archive would read and parse the data into an intermediate structure (property tree might be good :-) and then respond to the deserialization requests using a map-based lookup instead of getting the next data from the file. The same exact implementation can be used to make a binding to a database buffer derived from an sql command -- this also might not have the fields in the same order as the object being serialized. Not providing support for this case was one of the reasons that boost::serialization was originally rejected. Jeff
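The idea of answering load requests by name from an intermediate structure can be sketched as below. This is a toy under stated assumptions, not the Boost.Serialization archive interface: `KeyedInput`, `Point` and `load_point` are hypothetical names, the input is a simple "name=value" text chosen only for illustration, and a flat `std::map` stands in for the intermediate tree.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Toy input "archive": parses everything up front, then serves loads by name.
class KeyedInput {
public:
    // Read "name=value" lines into the intermediate map.
    explicit KeyedInput(const std::string& text) {
        std::istringstream in(text);
        std::string line;
        while (std::getline(in, line)) {
            const std::size_t eq = line.find('=');
            if (eq != std::string::npos)
                fields_[line.substr(0, eq)] = line.substr(eq + 1);
        }
    }

    // Look the field up by name instead of reading "the next item".
    template <typename T>
    void load(const std::string& name, T& out) const {
        const auto it = fields_.find(name);
        if (it == fields_.end())
            throw std::runtime_error("missing field " + name);
        std::istringstream in(it->second);
        if (!(in >> out))
            throw std::runtime_error("bad value for " + name);
    }

private:
    std::map<std::string, std::string> fields_;
};

struct Point {
    int x = 0;
    int y = 0;
};

// The object always asks for x, then y, whatever order the data came in.
Point load_point(const std::string& text) {
    const KeyedInput ar(text);
    Point p;
    ar.load("x", p.x);
    ar.load("y", p.y);
    return p;
}
```

The trade-off is the one raised in the replies: the whole input has to be held in memory before the first field can be served, whereas a sequential archive can stream data of arbitrary size.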

Jeff Garland wrote:
On Sun, 23 Apr 2006 11:01:10 -0400, Eugene Talagrand wrote
Just a note, type tracking can be disabled in the case you described and version tags can be disabled in general. In this case, hand-editing becomes feasible: boost::serialization XML archives are what I currently use for config files, though my needs may be more limited than most. You're right that the order of fields matters in boost::serialization, but the order of tags does matter in XML in general.
Well, the order of fields only matters in boost::serialization because all the current archive classes read in order from the serialized data.
I did it this way for two reasons: a) it was easier; b) it supports arbitrarily sized archives, which don't have to be loaded entirely into memory.
I assure you that a serialization archive can be built that doesn't depend directly on order. Essentially the archive would read and parse the data into an intermediate structure (property tree might be good :-) and then respond to the deserialization requests using a map-based lookup instead of getting the next data from the file.
I'm surprised that no one (that I know of) has done this.
The same exact implementation can be used to make a binding to a database buffer derived from an sql command -- this also might not have the fields in the same order as the object being serialized.
Not providing support for this case was one of the reasons that boost::serialization was originally rejected.
Hmm, it still doesn't permit re-ordering of the data items at any level, yet it got accepted. Maybe I just outlasted my critics. Robert Ramey

On Sun, 23 Apr 2006 09:19:54 -0700, Robert Ramey wrote
Jeff Garland wrote:
Well, the order of fields only matters in boost::serialization because all the current archive classes read in order from the serialized data.
I did it this way for two reasons: a) it was easier; b) it supports arbitrarily sized archives, which don't have to be loaded entirely into memory.
Yep, it's the way most serialization libs work -- nothing wrong with it at all.
I assure you that a serialization archive can be built that doesn't depend directly on order. Essentially the archive would read and parse the data into an intermediate structure (property tree might be good :-) and then respond to the deserialization requests using a map-based lookup instead of getting the next data from the file.
I'm surprised that no one (that I know of) has done this.
Me too...
The same exact implementation can be used to make a binding to a database buffer derived from an sql command -- this also might not have the fields in the same order as the object being serialized.
Not providing support for this case was one of the reasons that boost::serialization was originally rejected.
Hmm, it still doesn't permit re-ordering of the data items at any level, yet it got accepted. Maybe I just outlasted my critics.
Not sure what you mean. Surely if I write a MyFancyArchive class I can simply load the data and look up each value by field name during the load step, right? I agree this isn't how the current archives work, but I don't believe this design is precluded. Of course it requires nvp style in the object, e.g.: ar & make_nvp("field1", obj.field1); Jeff
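A minimal sketch of the order-independent archive idea, written against the standard library only. The names `keyed_iarchive`, `named_value`, and `make_named` are hypothetical stand-ins (they are not part of boost::serialization), and the `key=value` input format is assumed purely for illustration. The archive parses everything into a map first, then answers each load request by field name instead of by position:

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a name-value pair wrapper.
template <class T>
struct named_value {
    const char* name;
    T& value;
};

template <class T>
named_value<T> make_named(const char* name, T& value) {
    return named_value<T>{name, value};
}

// Hypothetical input archive: reads every "key=value" line up front,
// then serves each load request through a lookup by name.
class keyed_iarchive {
public:
    explicit keyed_iarchive(std::istream& in) {
        std::string line;
        while (std::getline(in, line)) {
            std::string::size_type eq = line.find('=');
            if (eq != std::string::npos)
                fields_[line.substr(0, eq)] = line.substr(eq + 1);
        }
    }

    template <class T>
    keyed_iarchive& operator&(named_value<T> nv) {
        std::map<std::string, std::string>::const_iterator it =
            fields_.find(nv.name);
        if (it == fields_.end())
            throw std::runtime_error(std::string("missing field: ") + nv.name);
        std::istringstream conv(it->second);
        if (!(conv >> nv.value))
            throw std::runtime_error(std::string("bad value for: ") + nv.name);
        return *this;
    }

private:
    std::map<std::string, std::string> fields_;
};

struct settings {
    std::string host;
    int port;

    template <class Archive>
    void serialize(Archive& ar) {
        ar & make_named("host", host);  // requested first
        ar & make_named("port", port);  // requested second
    }
};

// The input lists "port" before "host", the reverse of serialize().
inline settings load_reordered() {
    std::istringstream file("port=8080\nhost=example.org\n");
    keyed_iarchive ar(file);
    settings s;
    s.serialize(ar);
    return s;
}
```

The cost is the one Robert mentions: the whole input is held in memory before any field is loaded, which the in-order archives avoid.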

Jeff Garland wrote:
On Sun, 23 Apr 2006 09:19:54 -0700, Robert Ramey wrote
Jeff Garland wrote:
I assure you that a serialization archive can be built that doesn't depend directly on order. Essentially the archive would read and parse the data into an intermediate structure (property tree might be good :-) and then respond to the deserialization requests using a map-based lookup instead of getting the next data from the file. I'm surprised that no one (that I know of) has done this.
Me too...
Been planning to for the last 6 months for some of our config files, but other priorities at work that might make us some money keep getting in the way :) Cheers Russell

In article <20060423164438.M9591@crystalclearsoftware.com>, "Jeff Garland" <jeff@crystalclearsoftware.com> wrote:
On Sun, 23 Apr 2006 09:19:54 -0700, Robert Ramey wrote
Jeff Garland wrote:
I assure you that a serialization archive can be built that doesn't depend directly on order. Essentially the archive would read and parse the data into an intermediate structure (property tree might be good :-) and then respond to the deserialization requests using a map-based lookup instead of getting the next data from the file.
I'm surprised that no one (that I know of) has done this.
Me too...
I have done it for serialization to Mac OS X property lists (which are basically map<string, variant<vector, map, string, integer>>).
Not sure what you mean. Surely if I write a MyFancyArchive class I can simply load the data and look up each value by field name during the load step, right? I agree this isn't how the current archives work, but I don't believe this design is precluded. Of course it requires nvp style in the object, e.g.:
ar & make_nvp("field1", obj.field1);
This design is not precluded; that's exactly what I did in my Mac OS X property list implementation. Ben -- I changed my name: <http://periodic-kingdom.org/People/NameChange.php>

The documentation may be viewed online at
BTW, this URL has timed out every time I've tried it over the past week. But the docs were in the zip file in the boost vault.
* What is your evaluation of the design?
Seems good. I like that I can write get("myvalue",default), which will always return, or get("myvalue"), which will throw if not found. The use of <xmlattr> to access attributes is ugly, but I don't have a better suggestion (except shortening it slightly to "<attr>"). One alternative would be a get_attr() family of functions, but they would be XML-specific, which is uglier.
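To illustrate the two lookup styles mentioned above, here is a minimal sketch using only the standard library. `toy_tree` is a hypothetical flat key/value store, not the Property Tree implementation; it only mirrors the shape of a throwing `get` and a defaulting `get`:

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical flat store; NOT the reviewed library's data structure.
class toy_tree {
public:
    void put(const std::string& key, const std::string& value) {
        data_[key] = value;
    }

    // Throwing form: a missing or unconvertible key is an error.
    template <class T>
    T get(const std::string& key) const {
        T out;
        if (!try_get(key, out))
            throw std::runtime_error("no such key: " + key);
        return out;
    }

    // Defaulting form: always returns, falling back to the caller's value.
    template <class T>
    T get(const std::string& key, const T& fallback) const {
        T out;
        return try_get(key, out) ? out : fallback;
    }

private:
    // Looks the key up and converts its text to T; false on any failure.
    template <class T>
    bool try_get(const std::string& key, T& out) const {
        std::map<std::string, std::string>::const_iterator it = data_.find(key);
        if (it == data_.end())
            return false;
        std::istringstream conv(it->second);
        return static_cast<bool>(conv >> out);
    }

    std::map<std::string, std::string> data_;
};
```

With this shape, the caller picks the failure policy at each call site: `t.get<int>("port")` for settings that must exist, `t.get("timeout", 30)` for optional ones.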
* What is your evaluation of the implementation?
Didn't look.
* What is your evaluation of the documentation?
Good enough. It needs a few minor fixes by a native English speaker. I didn't see the need for get_own(), so that could do with a motivating example.
* What is your evaluation of the potential usefulness of the library?
I think it fits the desired goal of reading in a configuration file.
Versus Program Options: I've not used Program Options, but when I skimmed its docs yesterday I didn't see examples of reading an XML configuration file. Or JSON.
Versus Serialization: Hhhmmm... Firstly, I've seen people saying the serialization library is a heavyweight solution. In terms of the number of lines you need to write, I've not found this. It is as simple to use as Property Tree. If I were reading/writing a config file and the editing of settings was only done through my program, then I would definitely use the serialization library. The problem with the serialization library is that its XML format is not easy to edit. I doubt it can be friendlier and still work, or Robert Ramey would have done it that way? There is a need for a program to be able to write to its settings file while that file stays easy to view and edit. The Property Tree library fills this need.
* Did you try to use the library? With what compiler?
No.
* How much effort did you put into your evaluation?
Just reading the docs once, and following the discussion here.
* Are you knowledgeable about the problem domain?
Not particularly.
* Do you think the library should be accepted as a Boost library?
Ideally I think it should be merged into the Boost Serialization library as additional format options (i.e. in addition to the current binary, text and xml formats, add easy-to-read-xml, json and ini files). If, as I suspect, that is technically not feasible, then I vote Yes.
- was the library suitable for your daily xml-tasks? If not, what was missing?
Probably not. XML files containing lots of data (as opposed to a relatively short, human-editable configuration file) usually use namespaces, entities, CDATA, etc. I usually use PHP for dealing with such files, and I'm satisfied with that. Darren

Darren Cook wrote:
* Do you think the library should be accepted as a Boost library?
Ideally I think it should be merged into the Boost Serialization library as additional format options (i.e. in addition to the current binary, text and xml formats, add easy-to-read-xml, json and ini files).
If, as I suspect, that is technically not feasible, then I vote Yes.
I have some problems parsing that statement. Do you only vote yes *iff* this is bundled with serialization? -Thorsten
participants (29)
- Andy Little
- Ben Artin
- Daniel Walker
- Daniel Wesslén
- Darren Cook
- David Abrahams
- Eric Berdahl
- Eugene Talagrand
- Gennadiy Rozental
- Ivan Vecerina
- Jeff Flinn
- Jeff Garland
- Jody Hagins
- Joel de Guzman
- John Maddock
- Jorge Lodos
- Marcin Kalicinski
- Martin Adrian
- Matias Capeletto
- Michael Fawcett
- Pavel Vozenilek
- Peter Dimov
- Robert Ramey
- Rune
- Russell Hind
- Sebastian Redl
- Stefan Seefeld
- Thorsten Ottosen
- Vladimir Prus