[Property Tree] Problem domain

Hi,

When I wrote my review I never thought it would cause such a long discussion. It's all very clear in my little corner. I must admit that my understanding has changed significantly since. But my main point remains the same. I do not have time to reply to all the related messages, so let me give it another try to convey my point clearly:

I don't see any reason to discuss library interfaces, and even more so implementation, without a clearly stated problem domain. How could you tell whether one interface/implementation is better than another without understanding how it's planned to be used?

Now, based on the library docs, we could try to make an educated guess. Unfortunately this led me to decide that this library is not acceptable in any of the potential problem domains. Don't get me wrong. The library could be used by some users. But the same is true for almost any library. My criterion for library acceptance is really simple: a library should be usable for most users in its problem domain. "Some" users doesn't cut it. So let's see.

Option 1: Library to support program runtime parameters originating from different sources
=================

From my standpoint the library is missing several crucial features to qualify in this domain.

a) No decent CLA (command-line argument) parser.

b) No conflict resolution. If your parameters come from different sources, it's obvious that at some point you need to decide which parameter prevails: the one from the config file or the one from the CLA.

c) No automated validation. Configuration, even more than any other human input, requires strict validation (more so because it's frequently produced by third parties, not library designers; that includes end users, operations departments, remote library users, etc.). We need some notion of "schema" support.

d) No automated processing/notification. Many applications prefer to build their runtime parameter support as an event-driven system. No support for this is present.

And these are just the major deficiencies.

Option 2: Library for permanent storage of application configuration that does not have to be human editable
==================

My primary objection in this domain is that a better alternative solution exists. I would use the Serialization library and save/load C++ objects directly. As for the "non human editable" clause: IMO, if you decide to provide a way for the user to edit configuration through the application, there is no need to also support editing by some external means. If you need some of the parameters to be externally editable (but not from within the application), use separate storage for them.

Option 3: Generic facility to manipulate hierarchical data with permanent storage support
==========================

My main objection in this domain is that the author decided that his particular data structure would be good enough for all usages. The word "particular" is the main offender here. I don't believe any "particular" data structure would be good enough. Some novice users may use a simple structure with class Leaf and class Branch. Some prefer a generic tree. Some need compile-time polymorphism only. Some could use runtime polymorphism. Some would use boost::variant as the value. IMO the save/load side of a library like this would need to be made to work with any data structure satisfying some concept. The same applies to access methods (this is the reason why you need to switch to free-function-based interfaces).

Option 4: Facility to manipulate hierarchical application configuration that has to be human editable and application writable at the same time
=======================

First of all I need to remark that this domain is kind of limited (and internally conflicting IMO - I don't see why you need configuration editable both ways). This option suffers from the same problem as option 3: a hardcoded in-memory representation. I also believe that this domain requires loaders/savers to keep original formatting and/or comments. On the other hand, the library doesn't need CLA/registry support in this case, while its config file support should be more flexible.

The author needs to decide which option matches his understanding. But IMO the library in its current form isn't acceptable in any of them.

Regards,
Gennadiy
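Point (b) under Option 1 can be made concrete with a minimal sketch. It is not taken from the reviewed library: `ParamSource` and `resolve` are hypothetical names, and "a later source overrides an earlier one" is only one assumed policy among several possible ones.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical illustration only: none of these names come from the
// Property Tree library under review.
using ParamSource = std::map<std::string, std::string>;

// Merge parameter sources listed from lowest to highest priority.
// On a key conflict, the value from the later (higher-priority) source wins.
ParamSource resolve(const std::vector<ParamSource>& sources_low_to_high) {
    ParamSource merged;
    for (const ParamSource& source : sources_low_to_high) {
        for (const auto& entry : source) {
            merged[entry.first] = entry.second;  // overwrite any earlier value
        }
    }
    return merged;
}
```

Passing the sources in the order defaults, config file, command line makes the command-line value prevail. A real resolver would probably also need per-parameter policies, which this sketch leaves out.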

Even though I don't have the time to review this library, I have one comment about this...

Gennadiy Rozental wrote:
> Option 4: Facility to manipulate hierarchical application configuration that has to be human editable and application writable at the same time.
> =======================

My understanding from previous discussions is that this is the problem domain for this library.

> First of all I need to remark that this domain is kind of limited (and internally conflicting IMO - I don't see why you need configuration editable both ways).

You are kidding, right? How many times have you opened up a Windows INI file in a text editor? How many times have you opened up an Apache-like *.conf file in a text editor? How many times have you opened up a file inside "/etc" and changed it? For all those same cases there's usually also the option of a GUI program to do the editing for you.

So I must ask: how can you think this domain is limited?

--
-- Grafik - Don't Assume Anything
-- Redshift Software, Inc. - http://redshift-software.com
-- rrivera/acm.org - grafik/redshift-software.com
-- 102708583/icq - grafikrobot/aim - grafikrobot/yahoo

"Rene Rivera" <grafik.list@redshift-software.com> wrote in message news:444BE637.6080409@redshift-software.com...
> Even though I don't have the time to review this library, I have one comment about this...
>
> Gennadiy Rozental wrote:
>> Option 4: Facility to manipulate hierarchical application configuration that has to be human editable and application writable at the same time.
>> =======================
>
> My understanding from previous discussions is that this is the problem domain for this library.
>
>> First of all I need to remark that this domain is kind of limited (and internally conflicting IMO - I don't see why you need configuration editable both ways).
>
> You are kidding, right? How many times have you opened up a Windows INI file in a text editor? How many times have you opened up an Apache-like *.conf file in a text editor? How many times have you opened up a file inside "/etc" and changed it? For all those same cases there's usually also the option of a GUI program to do the editing for you.

Why would I want to do manual editing if there is a program that does that? Why would I choose to subject myself to the possibility of making errors, and to the routine work of satisfying a particular file format? Not to mention that it complicates my application development, since the application now needs to handle both the proper format and anything I manage to type into the file. If I edit a config file, that means I don't have a tool to handle this for me. I am not saying this is completely impossible. But it shouldn't be that widespread a need.

> So I must ask: how can you think this domain is limited?

In the big picture of program runtime parameter handling, it's just a small fraction IMO. Note though that even in this domain the library is not acceptable.

Gennadiy

Gennadiy Rozental wrote:
> "Rene Rivera" wrote in message
>> Even though I don't have the time to review this library, I have one comment about this...
>>
>> Gennadiy Rozental wrote:
>>> Option 4: Facility to manipulate hierarchical application configuration that has to be human editable and application writable at the same time.
>>> =======================
>>
>> My understanding from previous discussions is that this is the problem domain for this library.
>>
>>> First of all I need to remark that this domain is kind of limited (and internally conflicting IMO - I don't see why you need configuration editable both ways).
>>
>> You are kidding, right? How many times have you opened up a Windows INI file in a text editor? How many times have you opened up an Apache-like *.conf file in a text editor? How many times have you opened up a file inside "/etc" and changed it? For all those same cases there's usually also the option of a GUI program to do the editing for you.
>
> Why would I want to do manual editing if there is a program that does that?

Because sometimes there aren't such programs (I said "usually" above). And more often you can't use those programs, for many reasons, like not having access to the UI system they run on.

> Why would I choose to subject myself to the possibility of making errors, and to the routine work of satisfying a particular file format?

It has nothing to do with choice; it's the current situation in many cases. And if you are just starting on a *new* program, would you write the configuration GUI before you write your program?

> Not to mention that it complicates my application development, since the application now needs to handle both the proper format and anything I manage to type into the file.

But not if you have a library such as the Property Tree library, which would do that for you.

> If I edit a config file, that means I don't have a tool to handle this for me. I am not saying this is completely impossible. But it shouldn't be that widespread a need.

OK, I can only surmise that you don't visit the Unix world often.

>> So I must ask: how can you think this domain is limited?
>
> In the big picture of program runtime parameter handling, it's just a small fraction IMO.

Perhaps, but vector is a *very* small fraction of the data structure domain. Yet we still have such a utility.

> Note though that even in this domain the library is not acceptable.

I wasn't arguing that. I was just curious about the dismissal of the importance of the domain. And after all, your subject on this thread is about domains ;-)

"Rene Rivera" <grafik.list@redshift-software.com> wrote in message news:444BEFD4.6030300@redshift-software.com...
>> Why would I want to do manual editing if there is a program that does that?
>
> Because sometimes there aren't such programs (I said "usually" above).

This scenario does not apply. We are talking about files that are both human and application editable.

> And more often you can't use those programs, for many reasons, like not having access to the UI system they run on.

So are you going to change application options and not even run the application to test that it works?

>> Why would I choose to subject myself to the possibility of making errors, and to the routine work of satisfying a particular file format?
>
> It has nothing to do with choice; it's the current situation in many cases. And if you are just starting on a *new* program, would you write the configuration GUI before you write your program?

I would use a human-editable-only config at first, and switch to an application-editable-only one once it's ready.

>> Not to mention that it complicates my application development, since the application now needs to handle both the proper format and anything I manage to type into the file.
>
> But not if you have a library such as the Property Tree library, which would do that for you.

I doubt it will be flexible enough. Also, any such solution would be more complicated (and eventually more error prone) than one that doesn't need to handle human input.

>> If I edit a config file, that means I don't have a tool to handle this for me. I am not saying this is completely impossible. But it shouldn't be that widespread a need.
>
> OK, I can only surmise that you don't visit the Unix world often.

These days I develop exclusively on *nix systems. I don't have any GUI though. So all my config files are human editable.

>>> So I must ask: how can you think this domain is limited?
>>
>> In the big picture of program runtime parameter handling, it's just a small fraction IMO.
>
> Perhaps, but vector is a *very* small fraction of the data structure domain. Yet we still have such a utility.

I think vector should be insulted by your assessment ;) I may disagree also.

Gennadiy

> [...] So let me give it another try to convey my point clearly: I don't see any reason to discuss library interfaces, and even more so implementation, without a clearly stated problem domain.

I admit that the documentation fails to specify the problem domain. The tutorial presents the library as a config-file reader and stops there. Hence the original discussion about whether it overlaps 1:1 with program options or serialization, etc. It's my fault; I will try to correct this unpleasant state of matters.

> How could you tell whether one interface/implementation is better than another without understanding how it's planned to be used?

*** The goals ***
===========================

1. Allow efficient and intuitive manipulation of hierarchical, human-readable data structures in memory. Like a DOM, but with as simple to use an interface as possible! The interface must leverage C++ strengths (type safety, templates to automatically convert types), and maybe more importantly C++ idioms (ptree = std container, iterators, algorithms). Tradeoffs: no stress on performance, no following of the W3C DOM standard.

2. A facility to load and save the most popular human-readable data file formats (and present them in the form of the above structure). Tradeoffs: no stress on supporting each format in its entirety. Omit less used and more complicated parts of the file formats. Get it working for 80% of cases now, rather than for 100% never.

I believe the library meets the above goals reasonably well at the moment. With all the proposals I got, I think I can make it do so even better.

*** Problem domain ***
==========================

First of all, I don't think the author of almost any library can definitely say: this library will be used to do A and B, but not C. Loading human-readable data, allowing its manipulation in memory, and saving it back again: this can be used for a plethora of things, and nobody can possibly be expected to enumerate all of them. I will try to present a sample:

1. Loading and saving program data. For example in a GUI editor for something that is persisted in an XML format with an externally defined layout. Cannot use boost::serialization, because it isn't format-flexible. You get your XML out of it if you really want, but only in vanilla flavour. No saying what should go where.

2. Serialization (in primitive form, but supports human-editable formats). I say "primitive", because I'm now an initiate - I used boost::serialization in some of my projects. Before I did that, I would say ptree offered "adequate" serialization facilities.

3. Loading of program options, aka program_options overlap.

4. Passing of data between parts of the same program. This is a little more tricky than the others. For example, consider the parts to be (1) a fancy menu system for a video game, and (2) the game itself. Menus build a ptree and pass it to the game. Why? You can save the generated ptree and reuse it later without starting the menus. You can hand-edit it and configure the game without having the menus at all. This is extremely handy if menus and game are developed separately. You may find yourself in a situation where you are working on the game, but menus do not exist yet in a usable state. You do not need to "fake" a menu system to test your game. You just handcraft the ptrees (i.e. write some text files in your favourite format).

5. A use that surfaced lately: target for boost::serialization (ptree_archive). Why? Can save archives to all formats supported by ptree, effectively extending boost::serialization for free. And (speculating), maybe it can be used to automatically reorganize XML layout, so that the comments on boost::serialization in #1 no longer apply.

6. Translation from one format to another.

I'm sure others could extend that list.

Best regards,
Marcin
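The design direction in goal 1 (a tree that behaves like a std container, stores text, and converts to typed values through templates) can be sketched in a few lines. This is a hypothetical miniature, not the property_tree interface: `Node`, `find`, and `get` are names assumed for illustration, and conversion through stream extraction is an assumed choice.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical miniature of the idea in goal 1; this is NOT the
// property_tree interface, only a sketch of the design direction.
struct Node {
    std::string key;             // name of this node within its parent
    std::string data;            // every value is stored as text
    std::vector<Node> children;  // ordered children, iterable like any std container

    // Return the first direct child with the given key, or nullptr if none.
    const Node* find(const std::string& child_key) const {
        for (const Node& child : children) {
            if (child.key == child_key) return &child;
        }
        return nullptr;
    }

    // Convert the stored text to T via stream extraction; throws on failure.
    template <typename T>
    T get() const {
        std::istringstream in(data);
        T value;
        if (!(in >> value)) throw std::runtime_error("conversion failed: " + key);
        return value;
    }
};
```

With a child whose key is "width" and whose data is "640", `root.find("width")->get<int>()` yields the integer 640, while asking for an `int` from non-numeric text throws. Storing everything as text is what keeps the structure format-neutral, at the cost of converting on every access, which matches the stated "no stress on performance" tradeoff.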

"Marcin Kalicinski" <kalita@poczta.onet.pl> wrote in message news:e2h11e$659$1@sea.gmane.org... [...]
> *** The goals ***
> ===========================
>
> 1. Allow efficient and intuitive manipulation of hierarchical, human-readable data structures in memory.

I am not sure what you mean by human-readable (do you plan to read bits in memory like a book? ;o), but this looks like quite a generic goal. Usually such a goal requires a generic component to match. The one you present doesn't match this task.

> Like a DOM, but with as simple to use an interface as possible!

I believe the interfaces could be made simpler if you limited the scope of the tree usage.

> The interface must leverage C++ strengths (type safety, templates to automatically convert types), and maybe more importantly C++ idioms (ptree = std container, iterators, algorithms). Tradeoffs: no stress on performance,

Why do you think it's an acceptable tradeoff for a generic hierarchical data structure? Also, as one of the goals you stated "efficient". Doesn't this look like an internal contradiction? std containers don't exhibit this tradeoff.

> no following of the W3C DOM standard.

There are reasons people write standards. Among them are portability (any application that expects a DOM interface wouldn't be able to use your tree) and recorded experience (people have thought about proper interfaces, so why not follow them?). If you target a "like DOM" data structure, why not follow the standard?

> 2. A facility to load and save the most popular human-readable data file formats (and present them in the form of the above structure). Tradeoffs: no stress on supporting each format in its entirety. Omit less used and more complicated parts of the file formats. Get it working for 80% of cases now, rather than for 100% never.
>
> I believe the library meets the above goals reasonably well at the moment. With all the proposals I got, I think I can make it do so even better.

As of now your library doesn't support format preservation and comments, does it?

> *** Problem domain ***
> ==========================
>
> First of all, I don't think the author of almost any library can definitely say: this library will be used to do A and B, but not C. Loading human-readable data, allowing its manipulation in memory, and saving it back again: this can be used for a plethora of things, and nobody can possibly be expected to enumerate all of them. I will try to present a sample:
>
> 1. Loading and saving program data. For example in a GUI editor for something that is persisted in an XML format with an externally defined layout. Cannot use boost::serialization, because it isn't format-flexible. You get your XML out of it if you really want, but only in vanilla flavor. No saying what should go where.

What is the registry parser/loader doing in your library then? Do you expect users to edit the registry manually? If not, the Serialization library should be able to deal with this task better.

> 2. Serialization (in primitive form, but supports human-editable formats). I say "primitive", because I'm now an initiate - I used boost::serialization in some of my projects. Before I did that, I would say ptree offered "adequate" serialization facilities.

By serialization we usually mean C++ objects <-> permanent storage conversion. Your library presents some kind of intermediate state. I am not sure it qualifies for the name. But that's a minor point.

> 3. Loading of program options, aka program_options overlap.

What is CLA parsing doing here? You couldn't save it back. It's rarely hierarchical. Your parser doesn't do anything one expects from a reasonable CLA parser. I don't see this as a problem domain for you.

> 4. Passing of data between parts of the same program. This is a little more tricky than the others. For example, consider the parts to be (1) a fancy menu system for a video game, and (2) the game itself. Menus build a ptree and pass it to the game. Why? You can save the generated ptree and reuse it later without starting the menus. You can hand-edit it and configure the game without having the menus at all. This is extremely handy if menus and game are developed separately. You may find yourself in a situation where you are working on the game, but menus do not exist yet in a usable state. You do not need to "fake" a menu system to test your game. You just handcraft the ptrees (i.e. write some text files in your favourite format).

At the beginning I only need the load part. Later I won't need human readability support (actually I believe it would even be dangerous). I don't see that as a good example. I personally don't find the ptree tradeoffs necessary. I would write the menu in C++ directly at the beginning and then switch to the serialization lib. In general I believe this domain is artificial.

> 5. A use that surfaced lately: target for boost::serialization (ptree_archive). Why? Can save archives to all formats supported by ptree, effectively extending boost::serialization for free. And (speculating), maybe it can be used to automatically reorganize XML layout, so that the comments on boost::serialization in #1 no longer apply.

This is not currently part of the library, so it may not be considered to be in its domain. In the future we may discuss it.

> 6. Translation from one format to another.

This is a natural part of 2.

Gennadiy

Gennadiy Rozental wrote:
> [...] Option 3: Generic facility to manipulate hierarchical data with permanent storage support.
> ==========================
>
> My main objection in this domain is that the author decided that his particular data structure would be good enough for all usages. The word "particular" is the main offender here. I don't believe any "particular" data structure would be good enough. Some novice users may use a simple structure with class Leaf and class Branch. Some prefer a generic tree. Some need compile-time polymorphism only. Some could use runtime polymorphism. Some would use boost::variant as the value. IMO the save/load side of a library like this would need to be made to work with any data structure satisfying some concept. The same applies to access methods (this is the reason why you need to switch to free-function-based interfaces).

I agree. That's why I asked in another thread whether the access methods are loosely or tightly coupled with property_tree. Marcin himself wrote in another message that using property_tree "is a matter of internal implementation". Adding some more flexibility here would be nice and seems to be possible.

Boris
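The suggestion that the save side should work with "any data structure satisfying some concept" can be sketched as follows. All names are hypothetical and nothing here is taken from the reviewed library: the assumed concept is simply that a tree type supplies two free functions, `node_label` and `node_children`, and the generic writer depends on nothing else.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// All names below are hypothetical; nothing is taken from the reviewed library.

// Two unrelated tree types a user might already have.
struct RecordNode {
    std::string name;
    std::string text;
    std::vector<RecordNode> kids;
};

struct IdNode {
    int id;
    std::vector<IdNode> subs;
};

// The assumed "concept": each tree type provides these two free functions.
inline std::string node_label(const RecordNode& n) { return n.name + "=" + n.text; }
inline const std::vector<RecordNode>& node_children(const RecordNode& n) { return n.kids; }

inline std::string node_label(const IdNode& n) { return std::to_string(n.id); }
inline const std::vector<IdNode>& node_children(const IdNode& n) { return n.subs; }

// Generic writer: uses only node_label/node_children, never a concrete tree class.
template <typename Tree>
void write_indented(std::ostream& out, const Tree& node, std::size_t depth = 0) {
    out << std::string(depth * 2, ' ') << node_label(node) << '\n';
    for (const auto& child : node_children(node)) {
        write_indented(out, child, depth + 1);
    }
}

// Convenience wrapper that returns the written text.
template <typename Tree>
std::string to_text(const Tree& root) {
    std::ostringstream out;
    write_indented(out, root);
    return out.str();
}
```

Both `RecordNode` and `IdNode` go through the same `write_indented` without sharing a base class. A loader would need a matching set of construction functions, which this sketch does not attempt.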
[...]

"Boris" <boriss@web.de> wrote in message news:e2i1qg$or0$1@sea.gmane.org...
: Gennadiy Rozental wrote:
: > [...] Option 3: Generic facility to manipulate hierarchical data with permanent storage support.
: > ==========================
: >
: > My main objection in this domain is that the author decided that his particular data structure would be good enough for all usages. The word "particular" is the main offender here. I don't believe any "particular" data structure would be good enough. Some novice users may use a simple structure with class Leaf and class Branch. Some prefer a generic tree. Some need compile-time polymorphism only. Some could use runtime polymorphism. Some would use boost::variant as the value.
: > IMO the save/load side of a library like this would need to be made to work with any data structure satisfying some concept. The same applies to access methods (this is the reason why you need to switch to free-function-based interfaces).
:
: I agree. That's why I asked in another thread whether the access methods are loosely or tightly coupled with property_tree. Marcin himself wrote in another message that using property_tree "is a matter of internal implementation". Adding some more flexibility here would be nice and seems to be possible.

I agree as well.

As much as I initially disliked the ptree data structure (I am used to a polymorphic record/array/leaf node), I think that it has the true merit of providing a reasonable common in-memory representation for a variety of back-end formats (XML, JSON, MSWin registry, INI, ...). I would have liked to see a class that enforces a longer list of invariants (e.g. enforcing only pure array/record/leaf nodes), but this would restrict the range of the supported formats.

I think that by introducing this library into boost, we provide a platform for further developments, with a good potential for synergy with po & s11n.

However, I cannot agree to see this common in-memory representation burdened and crippled by what is currently too narrow and too limited a value conversion interface. So my vote is to keep value conversion and path handling separate: either in a distinct wrapper class (path parsing & value conversion), or as free functions. If this requires delaying the adoption of ptree, or adopting it without a value conversion interface, then so be it. But as a minimal-change, least-effort option, moving the member functions out into a dedicated namespace would be acceptable to me.

Of course this is just my vote, with all due respect to Marcin's excellent work and valuable contribution.

Ivan
-- http://ivan.vecerina.com/contact/?subject=NG_POST <- email contact form
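The separation proposed here can be sketched under stated assumptions. The container below carries no path or conversion logic, while a separate namespace holds both as free functions. `Tree`, `tree_access`, `find_path`, and `get_as` are hypothetical names, and dot-separated paths with stream-based conversion are assumed choices rather than anything the library prescribes.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical names throughout; this is not the reviewed library's API.
// The container itself knows nothing about paths or type conversion.
struct Tree {
    std::string key;
    std::string data;
    std::vector<Tree> children;
};

namespace tree_access {

// Walk a dot-separated path such as "video.width".
// Returns the root for an empty path, nullptr if any step is missing.
inline const Tree* find_path(const Tree& root, const std::string& path) {
    if (path.empty()) return &root;
    const Tree* current = &root;
    std::string::size_type begin = 0;
    while (begin <= path.size()) {
        std::string::size_type end = path.find('.', begin);
        if (end == std::string::npos) end = path.size();
        const std::string step = path.substr(begin, end - begin);
        const Tree* next = nullptr;
        for (const Tree& child : current->children) {
            if (child.key == step) { next = &child; break; }
        }
        if (next == nullptr) return nullptr;
        current = next;
        begin = end + 1;
    }
    return current;
}

// Convert the text stored at `path` to T; throws if the path or the conversion fails.
template <typename T>
T get_as(const Tree& root, const std::string& path) {
    const Tree* node = find_path(root, path);
    if (node == nullptr) throw std::out_of_range("no such path: " + path);
    std::istringstream in(node->data);
    T value;
    if (!(in >> value)) throw std::runtime_error("bad value at: " + path);
    return value;
}

}  // namespace tree_access
```

Because the conversion lives outside `Tree`, a different conversion scheme (or a different path syntax) could be supplied in another namespace without touching the in-memory representation or the format back-ends.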

"Gennadiy Rozental" <gennadiy.rozental@thomson.com> writes:
> When I wrote my review I never thought it would cause such a long discussion. It's all very clear in my little corner. I must admit that my understanding has changed significantly since. But my main point remains the same. I do not have time to reply to all the related messages, so let me give it another try to convey my point clearly: I don't see any reason to discuss library interfaces, and even more so implementation, without a clearly stated problem domain. How could you tell whether one interface/implementation is better than another without understanding how it's planned to be used?

Yes, please! Gennadiy, you may not have noticed this, but I said the same yesterday. So you're not alone.

> Now, based on the library docs, we could try to make an educated guess. Unfortunately this led me to decide that this library is not acceptable in any of the potential problem domains.

For the record, I have not made any educated guesses, nor have I reached any conclusions. I just want to lend support to the idea that this library needs a very clear statement of domain, and it is needed immediately.

-- Dave

BTW, thank you very much for taking the time to explain your points clearly and patiently. It may take longer for you to write them, but it's much easier for someone like me to read.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com

>> domain. How could you tell whether one interface/implementation is better than another without understanding how it's planned to be used?
>
> Yes, please!
>
> Gennadiy, you may not have noticed this but I said the same yesterday. So you're not alone.

The library goals and problem domain are already in this thread (since yesterday evening). I hope people will find them informative, so that they can cast a more educated vote on the library.

Best regards,
Marcin
participants (6)
- Boris
- David Abrahams
- Gennadiy Rozental
- Ivan Vecerina
- Marcin Kalicinski
- Rene Rivera