Re: [boost] [Boost Review] Property Tree Library

20 Apr 2006

      Hi Pavel,  thanks for a review.
...
I took look on the V5 and collected few notes.
First of all, from some of your comments I see you must have looked at the 
older revision. Rev.5 does not have any pointers in the interface, and no 
longer has get_d, get_b variants.
...
Most important, the documentation still needs
quite a lot of work.
You can always say that about any documentation. Although it should be 
polished and possibly expanded in places (like traits customization), I 
think it is already quite large (over 150kb of text). It also contains a 
reference which covers 100% of interface functions in quite a detail. This 
is something that not every boost library has.
...
I read some noises about expanding the library to
be arbitrary tree and so on.
IME this is very hard to develop and of rather
low value for users.
Agreed. This is not a generic tree container. This library is about 
idiomatic access to configuration data, easy reading/writing of many file 
formats.
...
_____________________________________________________
1. docs: it is rather unusual for me read code
   without syntax highlighting.
The problem is, it is written and maintained in HTML. I could add code 
coloring, but that would render it close to unmaintainable. Any change would 
be a nightmare. I should have used QuickBook, but unfortunately I had 
problems setting up the toolchain. So I stuck to HTML. Btw. some other well 
estabilished libraries in boost also do not have code coloring, and I didn't 
notice anybody complaining.
...
_____________________________________________________
2. The first mention about use of exceptions in docs
   (in debug_settings::load) should by hyperlink to details.
I agree. I could add some more hyperlinks all over the docs. But whole 
synopsis section, and most importantly the reference, are thoroughly 
cross-linked.
...
3. The first mention of get_d() should explicitly say
   that "d" means default. If possible this word should
   be bolded.
get_d is not longer part of the library, it was removed in rev5. You must 
have looked at the older version.
...
_____________________________________________________
4. A curiousity: in docs you write:
Type of the value extracted
 is determined by type of second parameter, so we can simply
 write get_d(...) instead of get_d<int>(...). */
m_level = pt.get_d("debug.level", 0);
If I do not specify return type explicitly there
may be conversion made after function returns.
If I specify the type explicitly there may be
conversion of the default parameters when the
thread of execution enters the function.
I agree, but is it a problem? I don't think it will be an issue if you read 
long instead of int and have it implicitly converted. If you want int, you 
can always get it by using get<int>(...)
...
_____________________________________________________
5. docs, "Property tree as a container": it may
   be explicitly stated here (or linked from here)
   how validity of iterators changes after delete/update/etc.
Yes, that's right. This is not explicitly said anywhere, but iterators only 
get invalidated when element they point to is removed (like std::list). 
Insertions/erases of elements do not affect other iterators.
...
_____________________________________________________
6. Possible feature:
"...there is an additional indexing data structure
inside property tree that allows fast searches..."
There may be a type trait that disables creating
of such structure, where the search complexity
would degrade gracefuly to O(N).
I know, you said that before, I have it hovering on my list of things to be 
done. It has never got big enough priority though.
...
_____________________________________________________
7. The temptation to use std::lower_bound and similarly
   dangeroud std algorithm may be eliminated by defining
   unimplemented specialisations for ptree
You can always sort the tree and then use the algorithm safely, so banning 
it would not be wise.
...
--------------
In template<class Ch, ....
the "Ch" should be spelled fully. Ch gives no clue.
Dtto the "Tr".
No more templates on <Ch> in the library. You must have looked on the old 
version.
...
---------------
The parametrisation of ptree should go futher.
The basic_string<Ch> should be template
parameter. Some people may like (or be forced)
to use flex_string/AnsiString/CString/QString/
Exactly that was done in revision 5. basic_ptree is now parametrized on key 
type (besides other things).
...
stable_unique() operation may be added.
This would eliminate all duplicates while
keeping order of what is kept.
This can be provided as an generic external algorithm, no need to clutter 
the class interface. I don't think it belongs to ptree library, it's rather 
an extension to <algorithm> header.
...
A generic implementation of stable_unique
http://uk.builder.com/programming/c/0,39029981,20271583,00.htm
...
From what I seen the implementation requires random-access iterators, so it 
would not work with ptree.
...
"It has just one data (typically string) associated..."
The docs are littered with hardcoded references to string as a key/data 
type, because customization facilities were added late. Anyway, I think that 
using std::strings will be the most common case, and adding parentheses 
everywhere I talk about data/key types might do more harm than good.
...
_____________________________________________________
10. docs, "INI parser":
The fillings as "the reason", "probably", "contrary",
"actually" should be omitted. The text will be read
by developer under time pressure and they won't appreaciate it.
Yeah, my english could use some polishing. The text is probably overly 
complicated and littered with meaningless words in places. Again, I would 
appreciate if a native speaker skimmed over it and pointed me towards 
several most glaring mistakes (grammar/style etc.).
...
_____________________________________________________
15. The headers may have
#if (defined _MSC_VER) && (_MSC_VER >= 1200)
#  pragma once
#endif
I know, you suggested that before :-) Again, this is somewhere on my list of 
things to be done, but never got high enough priority. Btw. is compilation 
time really shortened that much? I think that it only removes the 
preprocessing step, which is done in a snap anyway. I haven't done any 
measurements so I might be wrong.
...
_____________________________________________________
16. the file property_tree/detail/ptree_interface.hpp
    should be moved a folder down as it is the most
    important header, not an "detail".
The top level folder _only_ contains files which are includable be the user, 
all the rest in is details. I want to stick to that.
...
_____________________________________________________
18. ptree_utils: the function widen/narrow are
    horribly inefficient. The "result" string
    should be reserve()d before characters are added.
I know there are performance issues with some of the parsers. I think these 
can be resolved safely later, because they do not have any impact on the 
interface.
...
The narrow<wchar_t> is flawed.
Why do you think it is flawed?
...
The function trim() is already available
    in Boost.String Algorithms.
I though I'd rather avoid dependencies if they do not provide critical 
functionality, like e.g. use of Spirit to parse XML.
...
_____________________________________________________
19. A nit: source files should end with newline
    (Standard says so). Some compilers may complain
    if they do not, e.g. ptree_interface.hpp.
As far as I know all source files end in newline. I doublechecked 
ptree_interface.hpp, and it does end in newline. I think gcc 
(with -Wall -pedantic) issues warining if a file does not end in newline, 
and library compiles without any warnings on gcc.
...
_____________________________________________________
20. Namespaces: I would prefere not to have yet
    another sub-namespace in boost.
The ptree should be moved down (via "using")
    so boost::ptree<...> would be valid.

...
The header is missing
#include <cctype>
...
From which file it is missing?
...
_____________________________________________________
22. Boost.Serialization should be supported.
    It is not that absurd as it may look - the tree
    may be part of application state that is saved/sent
    over network.
I was actually trying to get a step further: use property tree as a target 
(not source) for serialization. I.e. creating Archive which writes to a 
ptree. This opens door to many interesting possibilities, for example you 
can save your archives as JSON out of the box. I may post some source code 
in the future when I get it working. The only drawback to using ptree as a 
generic archive is performance, it is going to be much slower than a regular 
archive.
...
--------
I suspect the code:
Ptree *child = local.put(text, Str());
 child->push_back(std::make_pair(Str(), Ptree(child->data())));
is not exception safe (when if the Ptree contructor throws,
who will clear the child).
It is safe. The pointer returned points to child which is owned by local, 
which is stack based and will be cleaned up with all the children in case of 
exception. Btw. this is code from old version of the library.
...
_____________________________________________________
25. I tried to compile the test with BCB (version 5.8),
    from BDS 2006.
I know the library will have problems compiling on non-compliant tools. It 
woukd be nice to have it working on old stuff, but it probably requires so 
much work that it is rather low on my priority list.
...
I can help with BCB porting. It is also
possible to download free BCB compiler
(version 5.5, more-less the same as the
5.8 compiler).
That's great. If somebody else with better knowledge of BCB intricacies 
could do it, it would be nice. On the other hand I'm afraid of introducing 
too many maintenance problems (aka ifdefs), or compromising the 
design/performance of the library to make old stuff happy.

Thank you for another great review.
Marcin

Re: [boost] [Boost Review] Property Tree Library

Marcin Kalicinski