[Boost Review] Property Tree Library accepted

Dear All, I have the great pleasure to announce that Marcin Kalicinski's Property Tree Library have been accepted into Boost (www.boost.org). The votes were as follows: - yes: 11 - no: 3 Congratulations to Marcin! I would like to thank the following people for doing a review: Sebastian Redl Matt Doyle Pavel Vozenilek Ivan Vecerina Gennadie Rozenal Tom Brinkman Darren Cook Alexander Belyakov John Maddock Daniel Wesslén Jose Paul A Bristow Jeff Hardy. Matias Capeletto I apologize if I have overlooked anybody. Also thanks to the *countless* people who participated in the dicsussion during the review: none mentioned, none forgotten (as we say in Denmark). Below follows more comments on the issues raised in the review. best regards Thorsten Ottosen, review manager Detailed review comments ------------------------ The general feedback was positive, with one reviewer even said he would use the library even if it was not accepted into Boost. Experienced boosters saw the general quality of the submission as higher than average. There was also some strong concerns by a few long-time boost members, most notably because of a potential overlap with Boost.Program Options. I think in the end it was fairly clear that the two libraries take different approaches to the problem. Even so, the property-tree library should clearly state the limitations of its program option parser component and via its documentation refer to Boost.Program Options for a more scallable solution. There were quite a big number of people that thought the library could and should be further polished / enhanced. There is nothing new to this, and I hope Marcin will work hard to incorporate/try out as many of the ideas from the review. (There have been quite a few libraries which went through substantial changes after the review and before the post-review.) When Marcin feels his design/functionality is rock-solid, he should do a post-review on the developer list to clear up remaining matters. Here's a list of ideas that should be considered for the post-review: 1. documentation needs smaller topics, more examples and syntax highlighting 2. implementation: it could be a possibility to use Boost.Multi Index to give a space-optimal implementation. We should not underestime the tasks the library vould be used for, so space/performce does matter. The idea of flattening queries with wildcard syntax "prop.foo.*" might be something to look into 3. support for non-standard strings was seen as a must 4. the issue of preserving white-spaces and comments was important to some people, which is understandable if the library is used to read and write config files 5. serialization support would be nice 6. there should be clear statements on the limitations of the supplied parsers 7. many would like to see the incorporation of a path concept 8. many were intersted in even more syntactic sugar to make the library even easier to use, eg. by using (),[] etc cleverly. Thanks again to Marcin for his submission. -Thorsten

Dear Everybody, It was a real pleasure to read this post. I am sure there are many people who feel the library needs polishing and rethinking in some areas. A lot of interesting ideas were proposed during the review, and I already started work on the next revision. I hope the changes I intend to do will make these people less concerned about the quality and usefulness of the library, while they will not detract from what other people found valuable. Below is a (possibly incomplete) list of improvements I intend to add in future revisions: - path class, orthogonal to the tree - configurable location of metadata generated by the parsers (e.g. XML attrributes information). It may be possible to achieve that for free with paths - generation of comments / layout metadata by parsers - splitting traits - support for using custom types for keys, again path class looks like a solution - documentation rewrite to QuickBook, and enhancements (thanks to Charles Brockman who provided tons of English-related corrections) I hope I can get most pressing items done quickly. Finally, I would like to thank all the people who contributed a review or comments, and really big thanks to Thorsten who managed the review. Thank you, Marcin

Marcin, comments below. Marcin Kalicinski wrote:
Dear Everybody,
It was a real pleasure to read this post. I am sure there are many people who feel the library needs polishing and rethinking in some areas. A lot of interesting ideas were proposed during the review, and I already started work on the next revision. I hope the changes I intend to do will make these people less concerned about the quality and usefulness of the library, while they will not detract from what other people found valuable. Below is a (possibly incomplete) list of improvements I intend to add in future revisions:
- path class, orthogonal to the tree - configurable location of metadata generated by the parsers (e.g. XML attrributes information). It may be possible to achieve that for free with paths - generation of comments / layout metadata by parsers - splitting traits - support for using custom types for keys, again path class looks like a solution - documentation rewrite to QuickBook, and enhancements (thanks to Charles Brockman who provided tons of English-related corrections)
I unfortunately did not have the time to review your library, even though I knew I would try to use it in my configuration file handling stuff. Hope you don't mind if I add a couple of wishes/suggestions based on my recent usage: - It should be possible pass the path separator char to the ctor, which then should be used as the default for all operations. This wish might be partly invalidated if/when you implement a path class. I still haven't stumbled upon a use case where different path separators need to be used within the same property tree. - Preserve the possibility (in path class or wherever) to use a plain text string to specifiy a path. Please. - Add a 'name()/full_name()' method to the interface; the name() method would return the current tree 'leaf' name and the full_name() returns the complete path from the root. - Add 'parent()' method. - Add support for relative paths; e.g. ptree::get("../sibling/arg") - I'm personally not found of e.g. operator[] acting the way as the std::map subscription operator. If you implement operator overloading for the path handling, please support failures being reported if the path elements does not exist (via supplied error handling policy or similar). Thanks for an already very usable library. Regards / Johan

- Add a 'name()/full_name()' method to the interface; the name() method would return the current tree 'leaf' name and the full_name() returns the complete path from the root. - Add 'parent()' method. - Add support for relative paths; e.g. ptree::get("../sibling/arg")
I think you have misunderstood the implementatation of the ptree. There are no "upward" links in the structure so there is no way for a node to know its parent or root. Personally I think it would be a good idea to let each node carry parent and/or root information. That way you could store tree-wide information (e.g. separator, locale, source) in the root. Only downside I can think of is O(N) swap instead of O(1).

Martin Adrian wrote:
- Add a 'name()/full_name()' method to the interface; the name() method would return the current tree 'leaf' name and the full_name() returns the complete path from the root. - Add 'parent()' method. - Add support for relative paths; e.g. ptree::get("../sibling/arg")
I think you have misunderstood the implementatation of the ptree. There are no "upward" links in the structure so there is no way for a node to know its parent or root.
Personally I think it would be a good idea to let each node carry parent and/or root information. That way you could store tree-wide information (e.g. separator, locale, source) in the root.
Only downside I can think of is O(N) swap instead of O(1).
That would not be a real swap() and should be called something else. -Thorsten

Personally I think it would be a good idea to let each node carry parent and/or root information. That way you could store tree-wide information (e.g. separator, locale, source) in the root.
Only downside I can think of is O(N) swap instead of O(1).
You mean O(N) if node carried root+parent, and O(1) if parent only, or am I missing something? I think it should only carry parent. Otherwise almost all mutating operations would invoke additional hierarchy walking, and caused updating of the root pointer everywhere. Root could be obtained from parents in O(log n), I think this is cheap enough. Anyway, I plan to get rid of separator overloads in favor of paths. I.e. if one wants a different separator, please use a path object. Thank you, Marcin

"Martin Adrian" <adrianm@touchdown.se> skrev i meddelandet news:loom.20060522T160059-968@post.gmane.org...
- Add a 'name()/full_name()' method to the interface; the name() method would return the current tree 'leaf' name and the full_name() returns the complete path from the root. - Add 'parent()' method. - Add support for relative paths; e.g. ptree::get("../sibling/arg")
I think you have misunderstood the implementatation of the ptree. There are no "upward" links in the structure so there is no way for a node to know its parent or root.
I didn't misunderstand it - I was just missing the functionality from a users' viewpoint (without caring about the implementation details). // Johan

[...] Hope you don't mind if I add a couple of wishes/suggestions based on my recent usage:
You're very welcome! Suggestions based on real-life experience are the most valuable. As you can see below I agree with most of your comments.
- It should be possible pass the path separator char to the ctor, which then should be used as the default for all operations. This wish might be partly invalidated if/when you implement a path class.
One problem with it is that different nodes in the same hierarchy may end up with different separators. Another problem is that every node will store a copy of the separator, a waste of memory.
- Preserve the possibility (in path class or wherever) to use a plain text string to specifiy a path. Please.
Yes, definitely. No worries :-) Judging by people's comments this has been one of the biggest selling points of the library. Judging by my experience with ptree, string paths are very useful. While path classes may provide additional flexibility and performance where it is needed, string paths will continue to provide simplicity and ease of use in a majority of real life cases. How exactly this will be implemented is another issue.
- Add a 'name()/full_name()' method to the interface; the name() method would return the current tree 'leaf' name and the full_name() returns the complete path from the root.
If done, this should return a path object which could optionally be turned into a string if needed. But it is dependent on having parent data member.
- Add 'parent()' method.
The reason why there is no parent() is that originally ptree held children by pointers, not by values. This allowed, among other things, storing a single child in many parents. Obviously, there was no good way to implement parent() then, so it wasn't there. I dropped the pointers long time ago and replaced them with values, because they were causing too much trouble and some of the advantages they provided could be simulated another way. Now that the tree only has one parent, parent() can be done easily. At the moment I don't see any reason it should not be done.
- Add support for relative paths; e.g. ptree::get("../sibling/arg")
Yes, if parent() is implemented, paths must support some sort of tree-walking.
- I'm personally not found of e.g. operator[] acting the way as the std::map subscription operator.
I've done an experiment with operator [] recently. John Maddock presented it quite nicely, proxy objects for nonexisting nodes, templated conversion operators, etc., so I didn't have to do much thinking. It is very clever and works, at least as far as I implemented it. But after the experiment, I think it is actually _too_ clever. In case of ptree, I believe operator [] based interface is inferior to named functions. The reason is readability. There are too many operations one can do on a node to cram them all into a single-parameter operator. From my experiment, it looked more like an exercise in C++ aerobatics, than a robust solution. Concluding, it has its virtues, and I don't shelve it yet, but I must try something else first. Regards, Marcin
participants (4)
-
Johan Nilsson
-
Marcin Kalicinski
-
Martin Adrian
-
Thorsten Ottosen