[Boost Review] Property tree review

newer
Re: [boost] [Boost-users] portable...

Jeff Hardy

29 Apr 2006 29 Apr '06

6:17 p.m.

I'm going to vote YES on inclusion of the Property Tree library. It has its faults, but I think it's useful enough right now to be part of Boost. * What is your evaluation of the design? The most valuable parts of the library are the parsers and the path functions. I agree with what others have mentioned: the paths should be seperate from the tree. I would be using it to parse various file formats into a tree, and then use my own path language on top of that. For me, then, the parsers are the most important part of this library. Having a (mostly) common structure is very nice. The reading and writing interface is as simple as possible; navigating the tree is also resonably simple. * What is your evaluation of the implementation? I didn't inspect the implementation closely, but I did notice some things during use. I don't like how XML attributes are stored. It should at least be an option to treat <x y="1" z="2" /> and <x> <y>1</y> <z>2</z> </x> as the same structure, without the "<xmlattrs>" node. If there's a confilict between an attribute name and a child element name, then reading/writing should fail. The ability to preserve comments in a file would be very nice. * What is your evaluation of the documentation? A bit sparse. The reference was mostly what I used, but more examples in the docs is always a good idea. A better "mission statement" about what the library is for would help as well; right now, it's not very clear what the author thinks the library is trying to do. The tree descriptions used in the descriptions of the parsers are not very clear; specifying what is in child.second.data() and what is in child.first should be explicit. An example of directly manipulating the tree (without using get()/put()) would be useful as well. * What is your evaluation of the potential usefulness of the library? As a general XML parser, it's not very useful (no namespaces, etc.); as a generic tree, it's not very useful (not general enough). As a utility for parsing various file formats into similar data structures (and writing them back), it's useful. Loading program configuration seems to be the primary goal, and it does that well enough. Trying to get the library to do other things might be stretching it too far. * Did you try to use the library? With what compiler? Did you have any problems? Yes, with MSVC 8. I had no problems beyond my own reading incomprehension. * How much effort did you put into your evaluation? A glance? A quick reading? In-depth study? I adapted the backend of my own configuration system (which previously used TinyXML) to use property trees. However, I didn't use the get*() functions, since I have my own path system already built. * Are you knowledgeable about the problem domain? Resonably. I plan on using the library whether it is accepted into Boost or not, and I'd like to thank Marcin for his work. -Jeff Hardy

Show replies by date

Stephen Dolan

29 Apr 29 Apr

8:45 p.m.

Not a full review, just a comment: The proposed property tree library uses a small internal language which is passed to the various functions, for instance: file = ptree.get<std::string>("config.filename"); Personally, I don't think its a good idea to define new syntaxes and pass them around as strings. It would be a bit more effort to implement, but I think (something like) the following would be preferable: file = ptree.get<std::string>["config"]["filename"] I'm thinking largely of the dangers of passing a user-supplied string as part of the path, as in get<std::string>("config."+prop_name); since this can lead to insecurities as the query language gets more powerful. (e.g. sql injection in php and other languages, people exploiting perl scripts by passing carefully crafted strings which work their way into an eval, etc). Also, there is an additional runtime overhead in parsing the string, which can be avoided by doing it at compile-time using C++'s native syntax. Apart from that, it seems useful and worthy of inclusion in Boost.

Bronek Kozicki

9:01 p.m.

Stephen Dolan wrote:

...

The proposed property tree library uses a small internal language which is passed to the various functions, for instance:

file = ptree.get<std::string>("config.filename");

Personally, I don't think its a good idea to define new syntaxes and pass them around as strings. It would be a bit more effort to implement, but I think (something like) the following would be preferable:

file = ptree.get<std::string>["config"]["filename"]

I strongly agree with this point. In case I do not find time for a full review, this is the most important thing that I'd point out. I'd also like to add that there should be clear distinction between nodes and leafes, eg. operator() (T t) - returns leaf (I do not insist on syntax) operator() [T t] - returns node thus: file = ptree.get<std::string>["config"]["filename"]; would mean 'take whatever default value there is for node "filemane" which belongs to node "config"'; while: file = ptree.get<std::string>["config"]("filename"); would mean 'take value "filemane" which belongs to node "config"'. I'm fairly familiar with problem domain, as I implemented similar structure with rich interface for commercial product (the structure served as kind of DOM storage for EDIFACT documents). B.

Marcin Kalicinski

30 Apr 30 Apr

11:40 a.m.

...

[...] I'd also like to add that there should be clear distinction between nodes and leafes, eg. operator() (T t) - returns leaf (I do not insist on syntax) operator() [T t] - returns node

There are more operations one can perform on a node than just getting. So I would rather replace operator () overload with explicitly named functions: get, put, set, del. The use will look like that: int n = pt["a.very"]["long"]["path"].get<int>(); pt["a.very"]["long"]["path"].del(); pt["a.very"]["long"]["path"].put(17); Best regards, Marcin

Andy Little

3:19 p.m.

"Marcin Kalicinski" <kalita@poczta.onet.pl> wrote in message news:e327rh$9tu$1@sea.gmane.org...

...

There are more operations one can perform on a node than just getting. So I would rather replace operator () overload with explicitly named functions: get, put, set, del.

The use will look like that:

int n = pt["a.very"]["long"]["path"].get<int>(); pt["a.very"]["long"]["path"].del(); pt["a.very"]["long"]["path"].put(17);

I would rather expect something like the following. This makes a path a simple object. Other required path functionality might be serialising to/from a file as has been wished for, but that is trivial and output specific IMO using namespace boost::assign; typedef std::string path_element1; std::vector<path_element1> path1; path1 += "a","very","long","path"; boost::property_tree<path_element1> pt1; int n1 =pt1.get(path1,n1); pt1[path1].del(); pt1[path1].put(17); typedef int path_element2; std::vector<path_element2> path2; path2 += 1,2,3,4; boost::property_tree<path_element2> pt2; int n2 = pt2.get(pt2,n2); pt2[path2].del(); pt2[path2].put(17); } regards Andy Little (FWIW a 5 minute impl of such tree with root and branch functionality follows ---> ----------------------------------- #include <cassert> namespace boost{ template <typename Key> struct node; /* ABC class for elements of member nodes of a property_tree */ template <typename Key> struct abc_node_element{ private: friend struct node<Key>; protected: const Key m_key; abc_node_element(Key const & key) : m_key(key){} virtual ~abc_node_element(){}; virtual abc_node_element* clone() const =0; abc_node_element():m_key(Key()){}; }; /* node_element implementation for member nodes of property_tree root or branch depends on Data parameter Data is a container (list/vector/property_tree etc) for branch or just data for data */ template <typename Key, typename Data> struct node_element : abc_node_element <Key>{ node_element(Key const & k, Data const & d) : abc_node_element<Key>(k), m_data (d){} Data m_data; void put(Data const & d){m_data = d;} abc_node_element<Key>* clone() const { return new node_element(this->m_key,this->m_data); } }; /* a member property_tree node holds a private polymorphic node_element */ template <typename Key> struct node{ node():m_element(0){} node& operator = (node const & in); node(node const &); template <typename Data> node( Key const & k, Data const & d) { m_element = new node_element<Key,Data>(k,d); } ~node(){ delete m_element;} template <typename AltData> void put(AltData const & d) { assert(this->m_element); Key k = this->m_element->m_key; delete m_element ; m_element = 0; m_element = new node_element<Key,AltData>(k,d); } void del(){delete m_element; m_element = 0;} template <typename T> T & get() { node_element<T>* pe = dynamic_cast<node_element<T>*> m_element; assert (pe); return pe->get(); } template <typename T> T const & get() const { const node_element<T>* pe = dynamic_cast<const node_element<T>*> m_element; assert (pe); return pe->get(); } private: abc_node_element<Key>* m_element; }; template <typename KeyType> struct property_tree{ node<KeyType> m_node; template <typename Path> node<KeyType> & operator[]( Path const & p); template <typename Path> node<KeyType> const & operator[]( Path const & p)const; template <typename Path, typename Data> Data& get(Path const & p, Data & d); }; }//boost //---------------------------- #include <boost/assign.hpp> #include <vector> #include <string> int main() { using namespace boost::assign; typedef std::string path_element1; std::vector<path_element1> path1; path1 += "a","very","long","path"; boost::property_tree<path_element1> pt1; int n1 =pt1.get(path1,n1); pt1[path1].del(); pt1[path1].put(17); typedef int path_element2; std::vector<path_element2> path2; path2 += 1,2,3,4; boost::property_tree<path_element2> pt2; int n2 = pt2.get(pt2,n2); pt2[path2].del(); pt2[path2].put(17); }

Matias Capeletto

29 Apr 29 Apr

9:07 p.m.

On 4/29/06, Stephen Dolan <stedolan2@gmail.com> wrote:

...

Not a full review, just a comment:

The proposed property tree library uses a small internal language which is passed to the various functions, for instance:

file = ptree.get<std::string>("config.filename");

Personally, I don't think its a good idea to define new syntaxes and pass them around as strings. It would be a bit more effort to implement, but I think (something like) the following would be preferable:

file = ptree.get<std::string>["config"]["filename"]

I'm thinking largely of the dangers of passing a user-supplied string as part of the path, as in get<std::string>("config."+prop_name); since this can lead to insecurities as the query language gets more powerful. (e.g. sql injection in php and other languages, people exploiting perl scripts by passing carefully crafted strings which work their way into an eval, etc). Also, there is an additional runtime overhead in parsing the string, which can be avoided by doing it at compile-time using C++'s native syntax.

If you are interested in variants of this aproach please search the list (path concept, operator/, operator[], will lead you to the posts). We have had quite active discussion in this topic. Many proposals from many people were develop and throw to the list. Marcin have said that he will indeed change the interface (i think in a way were the actual sintaxis is seamisly supportted). Now is up to him, to choose and merge the ideas in a coherent framework. Regards, Matias Capeletto PD: I don't know if i can, but i rise both hands to vote yes to ptree.

Thorsten Ottosen

10 p.m.

Matias Capeletto wrote:

...

On 4/29/06, Stephen Dolan <stedolan2@gmail.com> wrote:

...
Not a full review, just a comment:

...

PD: I don't know if i can, but i rise both hands to vote yes to ptree.

Sure you can :-) And I won't forget your oppinion, whether its for or against the library. Just note that review managers usually pay more attention to full reviews than comments. That's partly why one of the review questions is "how much time did you put into the review?". Of course, many other factors apply: knowledge of domain, how well respected you are to the review manager, how balanced and fair your review was etc. -Thorsten

Matias Capeletto

10:47 p.m.

On 4/29/06, Thorsten Ottosen <thorsten.ottosen@dezide.com> wrote:

...

Matias Capeletto wrote:

...
On 4/29/06, Stephen Dolan <stedolan2@gmail.com> wrote:

...
Not a full review, just a comment:

I did not write this line... is from Stephen

...

...
PD: I don't know if i can, but i rise both hands to vote yes to ptree.

Sure you can :-) And I won't forget your oppinion, whether its for or against the library.

Just note that review managers usually pay more attention to full reviews than comments. That's partly why one of the review questions is "how much time did you put into the review?". Of course, many other factors apply: knowledge of domain, how well respected you are to the review manager, how balanced and fair your review was etc.

I give my full view in very long an boring post the second day of this formal review. I spend several hours in it... struggle to understand old the .hpp... i dont know if it was good, but surely was not just a comment :) I was only trying to guide Stephen to the long thread of separate path concept that we have a week ago... Regards Matias Capeletto

Thorsten Ottosen

30 Apr 30 Apr

5:44 a.m.

Matias Capeletto wrote:

...

On 4/29/06, Thorsten Ottosen <thorsten.ottosen@dezide.com> wrote:

...
Matias Capeletto wrote:

...
On 4/29/06, Stephen Dolan <stedolan2@gmail.com> wrote:

...
Not a full review, just a comment:

I did not write this line... is from Stephen

Sorry about the misunderstanding. -Thorsten

Pavel Vozenilek

29 Apr 29 Apr

9:50 p.m.

"Stephen Dolan" wrote:

...

file = ptree.get<std::string>("config.filename");

Personally, I don't think its a good idea to define new syntaxes and pass them around as strings. It would be a bit more effort to implement, but I think (something like) the following would be preferable:

file = ptree.get<std::string>["config"]["filename"]

If you store the keys themselves in configuration or if you have to implement interface to a scripting engine the strings are very, very handy. /Pavel

Marcin Kalicinski

30 Apr 30 Apr

11:42 a.m.

Hi Stephen, Thank you for your comments.

...

The proposed property tree library uses a small internal language which is passed to the various functions, for instance:

file = ptree.get<std::string>("config.filename");

Personally, I don't think its a good idea to define new syntaxes and pass them around as strings. It would be a bit more effort to implement, but I think (something like) the following would be preferable:

file = ptree.get<std::string>["config"]["filename"]

I think both approaches have their merits. For example, the above would not permit storing paths in strings or files. I intend to allow both: // These will be equivalent file = pt.["config"]["filename"].get<std::string>(); // #1 file = pt.["config.filename"].get<std::string>(); // #2

...

Also, there is an additional runtime overhead in parsing the string, which can be avoided by doing it at compile-time using C++'s native syntax.

If performance is important, user will just use #1 syntax. Best regards, Marcin

Matias Capeletto

3:41 p.m.

On 4/30/06, Marcin Kalicinski <kalita@poczta.onet.pl> wrote:

...

I think both approaches have their merits. For example, the above would not permit storing paths in strings or files. I intend to allow both:

// These will be equivalent file = pt.["config"]["filename"].get<std::string>(); // #1 file = pt.["config.filename"].get<std::string>(); // #2

Marcin, correct me if i get you wrong... Your brand new cleaner interface for ptree will be like the one above? You will merge the path concept approach with the operator[] approach in this fashion?? PTREE IEEE (Easy to use, Efficient and Extensible Interface ) (*) Only two function for path handling... ptree & operator[](path p); const ptree & operator[](cpath p) const; (*) Four way of interact with data... (actions, see next point for get) get put set del (*) And two ways of using the get function... get<type>() --> throws get<type>(default) --> gives a default value and one specialized template for managing optionals... get< optional<type> >() If this is what you intend to do, i just have to congratulate you... i like very much the way the framework looks... (and you managed to cut functions and eliminate redundancy big time). Now it is very extensible, through creation of new paths, and the learning curve is almost zero. Give someone the the IEEE ten lines description that i just wrote to a casual user and it has to be a three year boy to misunderstand it. And give it to a experimented one and he really will appreciate the efficiency of operator[] and the expanding capacity of the path concept. IMO is a win win situation. I can wait for Thorsten call tonight... It is the first library review in with i participate, and i enjoyed it big time. Thanks a lot to everyone for give me the chance to be in it. I really, really learn a lot... :) Marcin, whatever the decision is... congratulations for a great work! Best regards Matias Capeletto

Thorsten Ottosen

3:47 p.m.

Matias Capeletto wrote:

...

On 4/30/06, Marcin Kalicinski <kalita@poczta.onet.pl> wrote:

...

I can wait for Thorsten call tonight...

It might take week or two to get the results. There's a lot of messages to read :-) -Thorsten

Matias Capeletto

3:51 p.m.

On 4/30/06, Thorsten Ottosen <thorsten.ottosen@dezide.com> wrote:

...

Matias Capeletto wrote:

...
On 4/30/06, Marcin Kalicinski <kalita@poczta.onet.pl> wrote:

...
I can wait for Thorsten call tonight...

It might take week or two to get the results. There's a lot of messages to read :-)

ohh... :( (I think i have a lot to learn of how things work here) Good luck in your post thread swim! Regards Matias

Marcin Kalicinski

11:41 a.m.

Hi Jeff, Thank you for the review.

...

* What is your evaluation of the design? [...] I agree with what others have mentioned: the paths should be seperate from the tree.

Yes, this is probably the biggest generalization that can be made without compromising simplicity of use. It promises quite a lot, for example paths might be used to hide where XML attributes (and other metadata) are stored in the tree, so that they can always be accessed trasparently.

...

I don't like how XML attributes are stored. It should at least be an option to treat <x y="1" z="2" /> and <x> <y>1</y> <z>2</z> </x> as the same structure, without the "<xmlattrs>" node [...]

True. I have already seen many people request that, but so far everyone wanted the tree structure to be different ;-) I think there should be a generic way of telling the parser where it should put the metadata. I wonder if paths can be used to do it. For example: xml_path path("node1.node2.@attr"); int attr = pt[path].get<int>(); xml_path will handle finding attribute "attr". It will do it by inspecting metadata stored inside the tree. The metadata will determine how the XML is stored in ptree. If XML metadata is not present (for example the tree was created by reading JSON), it will fail. Note that metadata will be a part of the tree created by parser, it will not require complicating the node structure in any way.

...

The ability to preserve comments in a file would be very nice.

I have seen many people request it, so it definitely is. I don't yet know how hard it will be before I try to implement it.

...

* What is your evaluation of the documentation? [..] A better "mission statement" about what the library is for would help as well; right now, it's not very clear what the author thinks the library is trying to do.

I know. The docs are quite misleading at the moment. Some people thought the library is a replacement for program options, some thought it is a generic tree container. The only constant was that everybody was equally confused. I'm going to move docs to quickbook (when I manage to get the boost docs toolchain to work), and then reorganize it quite a bit. Best Regards, Marcin

7002

Age (days ago)

7003

Last active (days ago)

List overview

Download

14 comments

8 participants

participants (8)

Andy Little
Bronek Kozicki
Jeff Hardy
Marcin Kalicinski
Matias Capeletto
Pavel Vozenilek
Stephen Dolan
Thorsten Ottosen