Library for configuration file parsing

Denis Shevchenko

25 Nov 2010

4:59 p.m.

Hi all!

Is there any interest in a library for configuration file parsing?

I know there is a good library, Program_options, but it supports (as it is written in the documentation) "parsing of simple INI-like configuration files". If INI-like files are all you need to configure your programs, use Program_options and forget about this letter. But if you want to use more complex configuration files, you may be interested in my library, Configurator.

simplest.conf:

    Host = 127.0.0.1

Code for working with this config:

    cf::configurator configurator;
    configurator.add_option( "Host" );
    configurator.parse( "/some/path/to/simplest.conf" );
    std::cout << "Value of host: " << configurator.get_value( "Host" ) << std::endl;

Features:

- Header-only.
- Allows setting an option's default value and/or necessity.
- Allows arbitrary nesting of sections.
- Provides "standard" checks of a value's semantics, such as path correctness, IP validity, email correctness, etc.
- Provides "extended" checks of a value's semantics, such as time period and file size.
- Provides common checks of options and sections, such as duplication, incorrectness, etc.
- Supports single-line and multi-line comments (in C++ style).
- Allows registering options with multiple values.
- Allows setting another "name-value" separator (including the space symbol) instead of the default '='.
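To make the "name-value separator" and single-line comment features more concrete, here is a minimal sketch of how such a flat file could be parsed with the standard library only. It is not Configurator's implementation; `trim` and `parse_simple_config` are hypothetical names, and the error texts are invented for illustration.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of Configurator): strips spaces and tabs.
inline std::string trim(const std::string& s) {
    const std::string ws = " \t\r";
    const std::size_t first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const std::size_t last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Parses lines of the form "name <separator> value".
// Everything after "//" is treated as a comment, so values that
// contain "//" (URLs, for instance) are not handled by this sketch.
inline std::map<std::string, std::string>
parse_simple_config(const std::string& text, char separator = '=') {
    assert(separator != '\n');  // a line break cannot separate name and value
    std::map<std::string, std::string> options;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        const std::size_t comment = line.find("//");
        if (comment != std::string::npos) line.erase(comment);
        line = trim(line);
        if (line.empty()) continue;  // blank or comment-only line
        const std::size_t pos = line.find(separator);
        if (pos == std::string::npos)
            throw std::runtime_error("Meaningless string: '" + line + "'");
        const std::string name = trim(line.substr(0, pos));
        const std::string value = trim(line.substr(pos + 1));
        if (!options.emplace(name, value).second)
            throw std::runtime_error("Duplicate option: '" + name + "'");
    }
    return options;
}
```

Calling `parse_simple_config("Host = 127.0.0.1\n")` would yield a map with one entry, and passing `':'` as the second argument switches the separator.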


Vladimir Prus

26 Nov

10:04 a.m.

Denis Shevchenko wrote:

...

Hi all!

Is there any interest in a library for configuration file parsing?

I know there is a good library, Program_options, but it supports (as it is written in the documentation) "parsing of simple INI-like configuration files". If INI-like files are all you need to configure your programs, use Program_options and forget about this letter. But if you want to use more complex configuration files, you may be interested in my library, Configurator.

You might want to clarify what are "more complex configuration files". Also, note that the property_tree library has support for some flavour(s) of configuration files, so you might want to compare your solution with it.

- Volodya

Denis Shevchenko

11:11 a.m.

On 26.11.2010 13:04, Vladimir Prus wrote:

...

You might want to clarify what are "more complex configuration files". Also, note that the property_tree library has support for some flavour(s) of configuration files, so you might want to compare your solution with it.

- Volodya

_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

Hi, Volodya!

This is an example of a more complex configuration file:

--------------------------------------------------------------------------------------
/*
 * more_complex.conf
 */

DbName     : my_database
DbHost     : localhost
DbPort     : 100
DbUser     : some_user
DbPassword : some_password

<Server>
    Host : 12.45.46.15 // IP semantics may be checked.
    Port : 1080 // Value must be an unsigned integer number.
    Admin/*this is comment with unnecessary info asf*/istrator : admin@example.com // e-mail semantics may be checked.

    StorePath : /some/path // path semantics may be checked.

    <Internal>
        Logfile : /some/path/to/logfile // path semantics may be checked.
        MaxLogFileSize : 10 MB // size semantics may be checked, result in bytes.
    </Internal>

    ReconnectPeriod : 10 m // time-period semantics may be checked, result in seconds.
</Server>

<Plugins>
    plugins : a_plug b_plug
    plugins : c_plug
    plugins : d_plug
    plugins : e_plug
</Plugins>
----------------------------------------------------------------------------------------

I know boost::property_tree; it is a very good library, but (imho) it is designed specifically for advanced work with XML (though not only with XML). In other words, the primary area of this library is not work with configuration files (imho). I could not find in the documentation (or in the examples) of boost::property_tree features such as an option's default value or necessity.

My library is designed exactly for work with configuration files and nothing more, so its usage is very simple for THIS task.
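The "size" and "time-period" checks mentioned in the config comments (a result in bytes for `10 MB`, in seconds for `10 m`) could look roughly like the sketch below. The function names, the accepted unit spellings and the 1024-based multipliers are assumptions made for illustration; the thread does not say which units Configurator actually accepts.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <sstream>
#include <stdexcept>
#include <string>

// Splits a value such as "10 MB" into the number 10 and the unit "MB".
// The unit is left empty when nothing follows the number.
inline void split_number_and_unit(const std::string& value,
                                  std::uint64_t& number, std::string& unit) {
    std::istringstream in(value);
    in >> std::ws;
    // Require a leading digit: extraction into an unsigned type
    // would otherwise silently accept input like "-5".
    if (!std::isdigit(in.peek()) || !(in >> number))
        throw std::invalid_argument("Not an unsigned number: '" + value + "'");
    unit.clear();
    in >> unit;
}

// Assumed units: B, KB, MB, GB with 1024-based multipliers. No overflow check.
inline std::uint64_t size_in_bytes(const std::string& value) {
    std::uint64_t n = 0;
    std::string unit;
    split_number_and_unit(value, n, unit);
    if (unit.empty() || unit == "B") return n;
    if (unit == "KB") return n * 1024ULL;
    if (unit == "MB") return n * 1024ULL * 1024ULL;
    if (unit == "GB") return n * 1024ULL * 1024ULL * 1024ULL;
    throw std::invalid_argument("Unknown size unit: '" + unit + "'");
}

// Assumed units: s, m, h, d (seconds, minutes, hours, days).
inline std::uint64_t period_in_seconds(const std::string& value) {
    std::uint64_t n = 0;
    std::string unit;
    split_number_and_unit(value, n, unit);
    if (unit.empty() || unit == "s") return n;
    if (unit == "m") return n * 60ULL;
    if (unit == "h") return n * 60ULL * 60ULL;
    if (unit == "d") return n * 24ULL * 60ULL * 60ULL;
    throw std::invalid_argument("Unknown time unit: '" + unit + "'");
}
```

With these assumptions, `MaxLogFileSize : 10 MB` would map to `size_in_bytes("10 MB")` and `ReconnectPeriod : 10 m` to `period_in_seconds("10 m")`.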

Sebastian Redl

11:47 a.m.

On 26.11.2010 12:11, Denis Shevchenko wrote:

...

This is an example of a more complex configuration file:

--------------------------------------------------------------------------------------
/*
 * more_complex.conf
 */

DbName     : my_database
DbHost     : localhost
DbPort     : 100
DbUser     : some_user
DbPassword : some_password

<Server>
    Host : 12.45.46.15 // IP semantics may be checked.
    Port : 1080 // Value must be an unsigned integer number.
    Admin/*this is comment with unnecessary info asf*/istrator : admin@example.com // e-mail semantics may be checked.

    StorePath : /some/path // path semantics may be checked.

    <Internal>
        Logfile : /some/path/to/logfile // path semantics may be checked.
        MaxLogFileSize : 10 MB // size semantics may be checked, result in bytes.
    </Internal>

    ReconnectPeriod : 10 m // time-period semantics may be checked, result in seconds.
</Server>

<Plugins>
    plugins : a_plug b_plug
    plugins : c_plug
    plugins : d_plug
    plugins : e_plug
</Plugins>
----------------------------------------------------------------------------------------

PropertyTree could support such a format. However, it would not support schema support for configuration files, e.g. check semantics or automatically have default values available. It wouldn't be too hard to write a check for any given tree, though.

...

I know boost::property_tree; it is a very good library, but (imho) it is designed specifically for advanced work with XML (though not only with XML).

No. The PTree XML parser is too basic to claim that it is for advanced work with XML. There is no support for advanced XML features at all.

...

In other words, the primary area of this library is not work with configuration files (imho).

IMO it is, and since I'm the maintainer, my opinion is not humble. :-)

...

I could not find in the documentation (or in the examples) of boost::property_tree features such as an option's default value or necessity.

There is no schema definition or validation in PTree. However, there is the get_default function. Schemas for PTree could be designed as a layer around PTree though.

Sebastian
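The idea of a schema as a layer around an already parsed tree can be sketched as follows. To keep the example free of external dependencies, a flat `std::map` keyed by dotted paths stands in for a property tree; `flat_tree`, `option_rule` and `apply_schema` are hypothetical names and are not part of Boost.PropertyTree or Configurator.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Stand-in for a parsed tree: dotted path -> raw string value.
using flat_tree = std::map<std::string, std::string>;

// One schema entry: where the option lives, whether it must be
// present, and which value to use when it is optional and absent.
struct option_rule {
    std::string path;
    bool required;
    std::string default_value;
};

// Rejects unknown options, reports missing required ones and
// fills in defaults. Returns the completed tree.
inline flat_tree apply_schema(flat_tree tree,
                              const std::vector<option_rule>& schema) {
    for (const auto& entry : tree) {
        bool known = false;
        for (const auto& rule : schema)
            if (rule.path == entry.first) { known = true; break; }
        if (!known)
            throw std::runtime_error("Unknown option: '" + entry.first + "'");
    }
    for (const auto& rule : schema) {
        assert(!rule.path.empty());  // every rule must name an option
        if (tree.count(rule.path)) continue;
        if (rule.required)
            throw std::runtime_error("Missing required option: '" + rule.path + "'");
        tree[rule.path] = rule.default_value;
    }
    return tree;
}
```

The parser stays unaware of the schema in this arrangement; validation and defaults are applied in a separate pass over whatever the parser produced.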

Denis Shevchenko

12:15 p.m.

Imho, one of the strengths of my library is informative error messages. Almost any error in a configuration file will be clearly detected (with the place where it was detected specified). For example:

-------------------------------------------------
<Server>
    <Security>
        Usr = user // There is a mistake in the name of option "User"
    </Security>
</Server>
-------------------------------------------------

Error message:

[Configurator] Incorrect option detected in configuration file: 'Server > Security > Usr'

Or:

-------------------------------------------------
<Server>
    <Security>
        User = user
    <Security> // There is a mistake in the name of the section: the user forgot the '/' symbol
</Server>
-------------------------------------------------

Error message:

[Configurator] Duplication of open tag for section 'Server > Security' detected!

Problems such as unclosed (or unopened) multi-line comments, meaningless strings (if the user forgot to comment them out), asymmetry of sections, etc. will also be detected.

Imho, if the config file is large and there are a lot of sections, such messages will be useful.

Errors in user code are also detected, for example:

    configurator.in( "Server" ).in( "Security" ).add_option_here( "User" );
    // ...
    std::string user = configurator.from( "Server" ).from( "Security" ).get_value_from_here( "Usr" );

Error message:

[Configurator] You request a value of option 'Server > Security > Usr', but such option not registered!
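One way to produce the "Duplication of open tag" diagnostic is to keep a stack of currently open sections while scanning the file. The sketch below is an assumption about how such a check might work, not the library's code: `join_path` and `check_sections` are hypothetical names, only the duplication message is taken from the post, comments are not handled, and each tag is assumed to sit on its own line.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Joins the open section names as "Server > Security".
inline std::string join_path(const std::vector<std::string>& sections) {
    std::string path;
    for (const auto& s : sections) {
        if (!path.empty()) path += " > ";
        path += s;
    }
    return path;
}

// Verifies that <Name> ... </Name> tags are balanced.
// Heuristic (an assumption): reopening the innermost section is
// reported as a duplicated open tag, i.e. a probably forgotten '/'.
inline void check_sections(const std::string& text) {
    std::vector<std::string> open;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        const std::size_t lt = line.find('<');
        if (lt == std::string::npos) continue;  // not a tag line
        const std::size_t gt = line.find('>', lt);
        if (gt == std::string::npos) continue;  // no closing bracket
        assert(gt > lt);  // '>' was searched starting at '<'
        std::string name = line.substr(lt + 1, gt - lt - 1);
        if (!name.empty() && name[0] == '/') {
            name.erase(0, 1);
            if (open.empty() || open.back() != name)
                throw std::runtime_error(
                    "Unexpected close tag for section '" + name + "'");
            open.pop_back();
        } else {
            if (!open.empty() && open.back() == name)
                throw std::runtime_error(
                    "Duplication of open tag for section '" +
                    join_path(open) + "' detected!");
            open.push_back(name);
        }
    }
    if (!open.empty())
        throw std::runtime_error("Unclosed section '" + join_path(open) + "'");
}
```

Because the stack always holds the full chain of enclosing sections, every message can name the complete path to the problem, which is what makes such diagnostics useful in large files.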

Giorgio Zoppi

2 Dec

1:08 p.m.

2010/11/26 Denis Shevchenko <for.dshevchenko@gmail.com>:

...


My library is designed exactly for work with configuration files and nothing more, so its usage is very simple for THIS task.

Why don't use boost::spirit? Cheers, Giorgio.

Denis Shevchenko

2:02 p.m.

On 02.12.2010 16:08, Giorgio Zoppi wrote:

...

Why don't use boost::spirit?

Cheers, Giorgio.

Hello Giorgio!

As I said, what a library can do is one thing, and what it is intended for is another. The purpose of Boost.Spirit, as it is written in the documentation: "LL parser framework represents parsers directly as EBNF grammars in inlined C++". The purpose of my Configurator: "easy and flexible work with configuration files". My library is designed ONLY for this task and nothing more, because I adhere to the principle of "one task - one library". Even a C++ novice can use my library within 3 minutes after downloading, because it has a very simple interface, yet it lets you work with files at a level of complexity comparable to httpd.conf.

My target is useful features on the one hand and maximum simplicity of usage on the other.

- Denis

Hartmut Kaiser

2:58 p.m.

...

On 02.12.2010 16:08, Giorgio Zoppi wrote:

...
Why don't use boost::spirit?

Hello Giorgio!

As I said, what a library can do is one thing, and what it is intended for is another. The purpose of Boost.Spirit, as it is written in the documentation: "LL parser framework represents parsers directly as EBNF grammars in inlined C++". The purpose of my Configurator: "easy and flexible work with configuration files". My library is designed ONLY for this task and nothing more, because I adhere to the principle of "one task - one library". Even a C++ novice can use my library within 3 minutes after downloading, because it has a very simple interface, yet it lets you work with files at a level of complexity comparable to httpd.conf.

My target - useful features on the one hand and maximum simplicity of usage on the other.

Sorry, but this does not answer the OP's question. All of your stated goals could be achieved by using Spirit underneath as the parser engine for your config file format, no?

Regards Hartmut
---------------
http://boost-spirit.com

Denis Shevchenko

3:15 p.m.

On 02.12.2010 17:58, Hartmut Kaiser wrote:

...

All of your stated goals could be achieved by using Spirit underneath as the parser engine for your config file format, no?

Regards Hartmut

Strictly speaking, yes.

To be honest, I did not use Spirit because (imho) it is a difficult library to study. I may be wrong, but it was easier for me to write my own parser engine using Boost.String algo, Boost.Regex and std::algorithm (all the more so because the complexity of parsing there is not so high).

- Denis

Dean Michael Berris

3:27 p.m.

On Thu, Dec 2, 2010 at 11:15 PM, Denis Shevchenko <for.dshevchenko@gmail.com> wrote:

...

On 02.12.2010 17:58, Hartmut Kaiser wrote:

...
All of your stated goals could be achieved by using Spirit underneath as the parser engine for your config file format, no?

Regards Hartmut

Strictly speaking, yes.

To be honest, I did not use Spirit because (imho) it is a difficult library to study. I may be wrong, but it was easier for me to write my own parser engine using Boost.String algo, Boost.Regex and std::algorithm (all the more so because the complexity of parsing there is not so high).

Have you tried it yet?

Unfortunately I have the reverse of your experience. Parsing with a big switch and implementing my own DFA for incremental HTTP parsing is so ugly and unintuitive that I've had to rely too much on trial and error while doing it. I long for the day when I can just define a restartable Boost.Spirit based composed parser and not have to worry about the parsing details -- I really want to define my parsers now in a declarative way that Boost.Spirit across all the versions I've used (Classic and v2.x), and hopefully that won't be too far in the future.

Just my $0.02 worth.

--
Dean Michael Berris
deanberris.com

Denis Shevchenko

3:41 p.m.

On 02.12.2010 18:27, Dean Michael Berris wrote:

...

Have you tried it yet?

Yes, Dean, I tried it. Spirit is a very powerful library. But I created my Configurator without using Spirit (for better or worse)...

- Denis

Hartmut Kaiser

3:55 p.m.

...

Unfortunately I have the reverse of your experience. Parsing with a big switch and implementing my own DFA for incremental HTTP parsing is so ugly and unintuitive that I've had to rely too much on trial and error while doing it. I long for the day when I can just define a restartable Boost.Spirit based composed parser and not have to worry about the parsing details -- I really want to define my parsers now in a declarative way that Boost.Spirit across all the versions I've used (Classic and v2.x), and hopefully that won't be too far in the future.

Dean, have you seen this: http://article.gmane.org/gmane.comp.parsers.spirit.general/21109 ?

Regards Hartmut
---------------
http://boost-spirit.com

Dean Michael Berris

4:30 p.m.

On Thu, Dec 2, 2010 at 11:55 PM, Hartmut Kaiser <hartmut.kaiser@gmail.com> wrote:

...

...
Unfortunately I have the reverse of your experience. Parsing with a big switch and implementing my own DFA for incremental HTTP parsing is so ugly and unintuitive that I've had to rely too much on trial and error while doing it. I long for the day when I can just define a restartable Boost.Spirit based composed parser and not have to worry about the parsing details -- I really want to define my parsers now in a declarative way that Boost.Spirit across all the versions I've used (Classic and v2.x), and hopefully that won't be too far in the future.

Dean, have you seen this:

http://article.gmane.org/gmane.comp.parsers.spirit.general/21109

?

Yes Hartmut, I almost jumped for joy seeing that message. I think the universe was cooperating when my question was answered. ;)

However it still depends on Boost.Coroutine which isn't yet part of the official Boost distribution. The last thing I want is to host Boost.Coroutine in cpp-netlib and then make that a requirement for the review when I submit cpp-netlib for review and inclusion into Boost. ;)

That's not such a bad thing though, but really what I want to see is something more generic than this approach. It's very closely tied to Asio and abstracting the input stream through a synchronous interface -- or with continuations/co-routines -- albeit works is not really the way I want to go.

Something that allows me to create a function object out of Spirit Qi expressions, which in itself keeps state and can be passed around, serialized, and is iterator agnostic, would really be what I'm looking for. Using coroutines is cute and good for all intents and purposes, but a composed function object parser with internal state is much more generic and more usable in contexts other than through asynchronous IO.

Maybe if Boost.Accumulators and Boost.Spirit hooked up and had a baby, that'd be what would describe what I'm looking for.

I hope that made sense. :)

--
Dean Michael Berris
deanberris.com

Michael Caisse

4 Dec

2:34 a.m.

On 12/02/2010 07:15 AM, Denis Shevchenko wrote:

...

On 02.12.2010 17:58, Hartmut Kaiser wrote:

...
All of your stated goals could be achieved by using Spirit underneath as the parser engine for your config file format, no?

Regards Hartmut

Strictly speaking, yes.

To be honest, I did not use Spirit because (imho) it is a difficult library to study. I may be wrong, but it was easier for me to write my own parser engine using Boost.String algo, Boost.Regex and std::algorithm (all the more so because the complexity of parsing there is not so high).

- Denis

I'm surprised (shocked) at the number of people who initially respond with this point-of-view. I can only assume that it is the difficulty of "seeing" BNF looking statements in C++. It must be the fact that it is a DSEL and people just don't think that could be possible so it looks foreign and they stop at that. Surely it isn't reading BNF or some other well described grammar format that is causing problems. If that is it ... then there is a more serious problem that a library isn't going to fix.

Spirit itself (Qi and Karma) are quite simple and in my experience helping people on IRC, within a couple hours most are able to solve simple problems and by the end of a couple days they are proficient and prolific. The documentation for Spirit is excellent and in my opinion, some of the best in all of Boost. For additional docs, have a look at my boostcon slides and video.

slides: <http://www.objectmodelingdesigns.com/boostcon10/>
video: <http://blip.tv/file/4143337>

It would be helpful if you can be more specific about what causes difficulty for you. We are constantly trying to improve the documentation and tutorials to make them more accessible.

You will see within the first couple slides in my presentation that I condemn ad-hoc parsers thrown together using string, regex, algorithm and the like simply because the implementor thinks the parsing task is "simple". Spirit allows you to write simple parsers in-line with a syntax that is easy to understand and maintain. There seems to be an attitude within our profession in general that writing ugly code because the task at hand isn't too complex makes it ok. I have a tendency to disagree.

michael

--
Michael Caisse
Object Modeling Designs
www.objectmodelingdesigns.com

Denis Shevchenko

5:20 a.m.

On 04.12.2010 05:34, Michael Caisse wrote:

...

I'm surprised (shocked) at the number of people who initially respond with this point-of-view.

michael

Hello Michael!

This point of view is honest. Most probably I will get better acquainted with Spirit in the near future (and maybe I'll rewrite my code with Spirit), but NOW my library is written without Spirit...

And then... IMHO, every library (as a programmer's tool) is characterized by three "fundamental features":

1. Its capabilities (that is, what the library CAN do)
2. Its interface (that is, how EASY it is to use its capabilities in user's code)
3. Its realization (that is, HOW it works inside)

A programmer (as a user of the library) is PRIMARILY concerned about 1 and 2, but not about 3. This does not mean that 3 is not important, but a programmer CAN use a library without knowing anything about how it works inside. In other words, whether it is written with Spirit, or with Regex, or with std::algorithm, or in any other way is (IMHO) important from a technical point of view, but not from a practical one. For example, I constantly use Boost libraries in my work, but I don't know the realization details of many of them.

Do you agree with me, Michael?

- Denis

Denis Shevchenko

7:42 a.m.

On 04.12.2010 05:34, Michael Caisse wrote:

...

slides: <http://www.objectmodelingdesigns.com/boostcon10/>
video: <http://blip.tv/file/4143337>

Thank you, Michael, for the slides. I'll definitely study them.

- Denis

Denis Shevchenko

7:20 a.m.

Hi all!

For those who are interested in my Configurator, perhaps this will be interesting.

The string-based interface that I used was a bad idea, because errors will probably be detected only at runtime. Moreover, it was my mistake to use the location of an option in the configuration file as its identifier in user's code. It was a very stupid design mistake.

Templates and preprocessor - and here is what I got:

Example config:

-------------------------------------
<System>
    <UdpServer>
        Host = 127.0.0.1
        Port = 100
    </UdpServer>
</System>
-------------------------------------

    // in some .hpp file the user defines options
    // in terms of their conceptual purpose,
    // but not their location in the config:
    BOOST_CONFIGURATOR_OPTION( UdpHost )
    BOOST_CONFIGURATOR_OPTION( UdpPort )

    // Work with options:
    boost::cf::configurator conf;
    conf.add< UdpHost >().location( "System::UdpServer::Host" );
    conf.add< UdpPort >().location( "System::UdpServer::Port" );

    // in other place...
    std::string host = conf.get_value_of< UdpHost >();

    unsigned int port = 0;
    conf.get_value_of< UdpPort >( port );
    // or like this:
    unsigned int port = conf.get_value_of< UdpPort, unsigned int >();

So the user does not use any string identifier of an option, only its type.

Advantages of such a solution (imho):

- possible errors in an option name will be detected at compile time,
- if the location of an option in the config file has changed, the user must rewrite only one string (in the location() function),
- I cannot imagine a simpler interface... :-)

What do you think about it?

- Denis
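The type-as-identifier idea can be illustrated with a small self-contained sketch. It only imitates the interface shown in the post; `SKETCH_OPTION`, `sketch_configurator` and `set_raw` are hypothetical, and how the real `BOOST_CONFIGURATOR_OPTION` macro is implemented is not known from this thread. Values are fed in by hand instead of being parsed from a file.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <typeindex>
#include <typeinfo>

// Hypothetical stand-in for BOOST_CONFIGURATOR_OPTION: an empty tag type.
#define SKETCH_OPTION(name) struct name {};

SKETCH_OPTION(UdpHost)
SKETCH_OPTION(UdpPort)

class sketch_configurator {
public:
    // Binds an option type to its location in the configuration file.
    template <typename Option>
    void add(const std::string& location) {
        assert(!location.empty());
        locations_[std::type_index(typeid(Option))] = location;
    }

    // Stores a raw value, as a parser would after reading the file.
    void set_raw(const std::string& location, const std::string& value) {
        values_[location] = value;
    }

    // Looks the value up by option type and converts it with operator>>.
    // Note: for std::string this reads a single whitespace-free token.
    template <typename Option, typename Value = std::string>
    Value get_value_of() const {
        const auto loc = locations_.find(std::type_index(typeid(Option)));
        if (loc == locations_.end())
            throw std::logic_error("Option is not registered");
        const auto val = values_.find(loc->second);
        if (val == values_.end())
            throw std::runtime_error("No value at '" + loc->second + "'");
        std::istringstream in(val->second);
        Value result{};
        if (!(in >> result))
            throw std::runtime_error("Cannot convert '" + val->second + "'");
        return result;
    }

private:
    std::map<std::type_index, std::string> locations_;
    std::map<std::string, std::string> values_;
};
```

In this sketch a misspelled tag such as `get_value_of< UdpHots >()` fails to compile because no such type exists, while forgetting to register a correctly spelled option is still only caught at runtime.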

Stephen Nuchia

26 Nov

9:04 p.m.

...

...
From: Denis Shevchenko [mailto:for.dshevchenko@gmail.com] Is there any interest in a library for configuration file parsing?

Having done this once, just before TCL was announced, I won't ever do it again. Ousterhout's reasoning is, in my opinion, unassailable. Configuration files might as well be written in a full-featured, widely-understood embedded scripting language. http://www.stanford.edu/~ouster/cgi-bin/papers/tcl-usenix.pdf That would be Python now, right?

Denis Shevchenko

11:31 p.m.

On 27.11.2010 00:04, Stephen Nuchia wrote:

...

...
...
From: Denis Shevchenko [mailto:for.dshevchenko@gmail.com] Is there any interest in a library for configuration file parsing?

Having done this once, just before TCL was announced, I won't ever do it again. Ousterhout's reasoning is, in my opinion, unassailable. Configuration files might as well be written in a full-featured, widely-understood embedded scripting language.

http://www.stanford.edu/~ouster/cgi-bin/papers/tcl-usenix.pdf

That would be Python now, right?

Hmm... Sorry, but I do not quite understand your idea... TCL, Python... I'm not familiar with scripting languages.

I propose a C++ solution which seems to me easy to use and flexible, and which I use myself for all my Linux daemons. And I supposed that if this solution seems convenient to me, it may seem convenient to other developers. I'm just trying to determine the interest in it...

Jeff Benshetler

27 Nov

12:48 a.m.

I've switched to using Python for configuration files about 6 years ago, and stopped writing "little languages" of my own. There is a great deal of flexibility and extensibility from using a general purpose language, and it has a fully documented syntax with good error messages.

Denis Shevchenko

4:57 a.m.

On 27.11.2010 03:48, Jeff Benshetler wrote:

...

I've switched to using Python for configuration files about 6 years ago, and stopped writing "little languages" of my own. There is a great deal of flexibility and extensibility from using a general purpose language, and it has a fully documented syntax with good error messages.

It's a good idea, but you are talking about using ANOTHER programming language, Python. But what if I had never heard about Python (especially if I'm a newbie in C++)?

I'm talking about using pure C++, and my solution is so easy to use (imho) that even a newbie could use it.

Mika Heiskanen

5:11 a.m.

On Sat, 2010-11-27 at 06:57 +0200, Denis Shevchenko wrote:

...

On 27.11.2010 03:48, Jeff Benshetler wrote:

...
I've switched to using Python for configuration files about 6 years ago, and stopped writing "little languages" of my own. There is a great deal of flexibility and extensibility from using a general purpose language, and it has a fully documented syntax with good error messages.

It's a good idea, but you are talking about using ANOTHER programming language, Python. But what if I had never heard about Python (especially if I'm a newbie in C++)?

I'm talking about using pure C++, and my solution is so easy to use (imho) that even a newbie could use it.

I on the other hand was pretty much excited about the possibility of scripting my configuration, so I googled for more. I assumed you are using Boost.Python, but unfortunately I found the documentation lacking and did not see an easy way to read higher level structures from the configuration files, nor could I find any simple examples to do so. Are there any, or would you care to provide one, Jeff?

--> Mika Heiskanen

Hal Finkel

28 Nov

2:24 a.m.

On Fri, 2010-11-26 at 18:48 -0600, Jeff Benshetler wrote:

...

I've switched to using Python for configuration files about 6 years ago, and stopped writing "little languages" of my own. There is a great deal of flexibility and extensibility from using a general purpose language, and it has a fully documented syntax with good error messages.

I think some care is required here. Bringing in an embedded scripting language to a project is not necessarily straightforward, easy or desirable, even if using Boost.Python, luabind, etc. While I think the advantages of using an embedded scripting language are clear, there can also be disadvantages:

1. Scripting languages, even simple ones like Lua, are large, external codebases, generally written in C. These codes can have porting issues, packaging/distribution/installation issues, thread-safety issues, etc. In addition, non-trivial build-process modifications might be necessary.

2. Execution safety might be hard to guarantee when using an embedded scripting language, especially for those that allow the loading of external modules. It might be reasonable to read a configuration file from an untrusted source while not being reasonable to execute a script from an untrusted source. In addition, using the scripting language could be significantly slower than using a dedicated configuration-file parser. A script could enter an infinite loop, as an extreme example. In short, there could be security implications.

3. Many scripting languages have their own I/O facilities, which may or may not be customizable or able to interface with iostreams classes, and are often tied to low-level operating-system I/O primitives.

4. A scripting language might not match user expectations. Especially on POSIX-style systems, users (and even more so, administrators) have a general understanding of the "look and feel" of a configuration file. The syntax used by the Apache httpd web server, for example, is very popular, is easy to transform in an ad-hoc way using standard command-line tools and is straightforward to understand. Denis's Configurator library seems to use the same general style.

5. Error checking when using the scripting language may be harder, especially if the script configures the application by invoking application-side object creation. Since the scripting language can "do anything," there are lots of possible invalid uses to guard against. Also, since there are many different possible ways to use the scripting language for configuration, there will be no real uniformity across applications, even those which use the same embedded language.

Finally, scripting languages and configuration-file languages are designed with different priorities: scripting languages are often designed to make writing programs easy while configuration-file syntax is designed to be simple both to read and to write. There is, of course, a large spectrum on the scripting language side: one might reasonably argue that it is easier to make hard-to-read programs in Perl than in Lua or Python, but it is hard to make a hard-to-read Apache httpd configuration file; that can be important.

I think a library for configuration-file parsing would be quite useful. There are many use cases where an embedded scripting language is best, but I think there are also many for which a dedicated configuration-file-parsing library is superior.

-Hal

...

On Fri, Nov 26, 2010 at 5:31 PM, Denis Shevchenko <for.dshevchenko@gmail.com

...
wrote:

...
On 27.11.2010 00:04, Stephen Nuchia wrote:

...
From: Denis Shevchenko [mailto:for.dshevchenko@gmail.com]

...
...
Is there any interest in a library for configuration file parsing?

Having done this once, just before TCL was announced, I won't ever do it again. Ousterhout's reasoning is, in my opinion, unassailable. Configuration files might as well be written in a full-featured, widely-understood embedded scripting language.

http://www.stanford.edu/~ouster/cgi-bin/papers/tcl-usenix.pdf

That would be Python now, right?

_______________________________________________ Unsubscribe& other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

Hmm... Sorry, but I do not quite understand your idea... TCL, Python... I'm not familiar with scripting languages.

I propose a C++ solution which seems to me easy to use and flexible, and which I use myself for all my Linux daemons. And I suggest that if this solution seems convenient to me, it may seem convenient to other developers. I'm just trying to determine the interest in it...


Robert Ramey

6:38 a.m.

or you could save a lot of trouble and just use the serialization library. Robert Ramey

sguazt

7:40 a.m.

On Sun, Nov 28, 2010 at 7:38 AM, Robert Ramey <ramey@rrsd.com> wrote:

...

or you could save a lot of trouble and just use the serialization library.

Personally, I like to keep configuration files as user-friendly as possible. I don't know in detail the capabilities of the Boost serialization library but, as far as I can see from the tutorial page, its output is not so "friendly" (e.g., see http://www.boost.org/doc/libs/1_45_0/libs/serialization/example/demofile.txt and http://www.boost.org/doc/libs/1_45_0/libs/serialization/example/demo_save.xm...). That said, while the proposed library looks promising, I would avoid calling it "configurator" since it seems limited to parsing INI files. Configuration files can be written in very different formats. INI is just one choice, but not the only one. For instance, in the past I used to write conf files in XML. Now I'm quite happy with YAML (http://www.yaml.org), which IMHO offers a good tradeoff between simplicity, compactness and power. -- Marco

Denis Shevchenko

29 Nov 29 Nov

7:18 a.m.

On 28.11.2010 10:40, sguazt wrote:

...

Said that, while the proposed library looks promising, I would avoid to call it "configurator" since it seems limited to parse INI files. Configuration file can be done in very different format. INI is just a choice, but not the only one.

Hi, sguazt!

INI is the simplest format supported by my library (I understand that for many programmers INI files are quite enough). But Configurator also supports:
- an option's default value and necessity,
- arbitrary nesting of sections,
- checks of a value's semantics, both "classical" (IPv4 and IPv6, path, email) and extended (file 'size' and 'time period'),
- options with multiple values (something like in Apache httpd.conf),
- reparsing using the initial default values of options (will be added in the next version).

See an example of advanced usage of Configurator: http://opensource.dshevchenko.biz/configurator/examples/advanced.
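To make the idea of a "semantic check" concrete, here is a minimal sketch of an IPv4 validity test in standard C++. The function name is_valid_ipv4 is made up for this illustration and is not taken from Configurator, whose actual checks may well be stricter or different (this one accepts leading zeros, for instance).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns true if `text` is a dotted-quad IPv4 address: exactly four decimal
// fields, each with 1-3 digits and a value in 0..255.
// Illustrative only: this is not Configurator's implementation.
bool is_valid_ipv4(const std::string& text)
{
    int fields = 0;   // number of completed fields
    int value = 0;    // numeric value of the current field
    int digits = 0;   // digits seen in the current field

    for (std::size_t i = 0; i <= text.size(); ++i) {
        // Treat the end of the string as a final separator.
        const char c = (i < text.size()) ? text[i] : '.';
        if (c >= '0' && c <= '9') {
            value = value * 10 + (c - '0');
            if (++digits > 3 || value > 255)
                return false;
        } else if (c == '.') {
            if (digits == 0)
                return false;   // empty field, e.g. "1..2.3"
            ++fields;
            value = 0;
            digits = 0;
        } else {
            return false;       // any other character is invalid
        }
    }
    return fields == 4;
}

// Small self-check showing the intended behaviour.
void ipv4_self_check()
{
    assert(is_valid_ipv4("127.0.0.1"));
    assert(!is_valid_ipv4("256.0.0.1"));
    assert(!is_valid_ipv4("127.0.0"));
}
```

A configuration library could run a check of this kind right after parsing, so that a malformed value is reported with the option name instead of surfacing later as a connection failure.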

Paul A. Bristow

5:44 p.m.

...

-----Original Message----- From: boost-bounces@lists.boost.org [mailto:boost-bounces@lists.boost.org] On Behalf Of Denis Shevchenko Sent: Monday, November 29, 2010 7:18 AM To: boost@lists.boost.org Subject: Re: [boost] Library for configuration file parsing

On 28.11.2010 10:40, sguazt wrote:

...
Said that, while the proposed library looks promising, I would avoid to call it "configurator" since it seems limited to parse INI files. Configuration file can be done in very different format. INI is just a choice, but not the only one. Hi, sguazt!

INI-file is a simplest format supported by my library (I understand that many programmers is quite enough INI-files). But Configurator supports also: - option's default value and necessity, - arbitrary nesting of sections, - checks of value's semantic, "classical" (IPv4 and IPv6, path, email) and extended (file 'size' and 'time period'), - options with multi-values (something like in Apache httpd.conf), - reparsing with using initial default values of options (will be added in next version).

See example of advanced usage of Configurator: http://opensource.dshevchenko.biz/configurator/examples/advanced.

This looks rather useful for those who need to do these things *with checking*. (And who wants to leave the hapless user in a muddle because they didn't check that the IP address given is a valid IP address?) So I would encourage getting it in the sandbox (and in the 'standard' Boost folders structure). This will allow potential users to give it a proper test drive. BUT to be considered for a Boost Library, every file must contain the Boost licence conditions. (IMO you can have the MIT licence too - since they say the same, but to keep things simple, we have decided we need to avoid any lawyers' doubt by requiring the Boost license). Paul --- Paul A. Bristow, Prizet Farmhouse, Kendal LA8 8AB UK +44 1539 561830 07714330204 pbristow@hetp.u-net.com

Denis Shevchenko

7:39 p.m.

On 29.11.2010 20:44, Paul A. Bristow wrote:

...

So I would encourage getting it in the sandbox (and in the 'standard' Boost folders structure). This will allow potential users to give it a proper test drive.

Hello, Paul! As far as I could read in http://www.boost.org/development/submissions.html, a library first needs to be placed in the Boost Vault in .zip format. And before that the library should be brought into compliance with the "Boost Library Requirements and Guidelines". Or can I put the library in the Sandbox before that?

...

(IMO you can have the MIT licence too - since they say the same, but to keep thing simple, we have decided we need to avoid any lawyers' doubt by requiring the Boost license).

I'll change the license to the Boost license. :-) - Denis

Denis Shevchenko

2 Dec 2 Dec

2:33 p.m.

On 29.11.2010 20:44, Paul A. Bristow wrote:

...

So I would encourage getting it in the sandbox (and in the 'standard' Boost folders structure).

This will allow potential users to give it a proper test drive.

BUT to be considered for a Boost Library, every file must contain the Boost licence conditions.

Paul

Hello again, Paul!

As you recommended, I put my library in the sandbox. It is now here: https://svn.boost.org/svn/boost/sandbox/configurator/ Of course, it is under the Boost Software License now. :-) - Denis

Bjørn Roald

28 Nov 28 Nov

8:52 a.m.

On 11/28/2010 07:38 AM, Robert Ramey wrote:

...

or you could save a lot of trouble and just use the serialization library.

If you are just referring to getting an application configuration stored and loaded by the same software, you may be right. But given the OP's list of features, are you sure you would save a lot of trouble? I think there must be a number of those features that are not really supported well by serialization. I assume you are referring to using an XML archive, as it should be simple to edit. An XML schema could provide defaults, options, and validation rules, but the serialization library does not dive that deep into the XML world, does it? Sure, some of these things can probably be tweaked using custom validation code, and so forth, but what trouble is that saving?

Configuration files like those the OP addresses are most often hand-tweaked through simple tools like a text editor. So in a sense it is a Human Machine Interface (HMI). I am not opposed to using XML for program configurations, but I am concerned it is too big and complex a hammer in most cases. XML may make a lot of sense if the configurations require a lot of structure, support for extensibility, and you are addicted to the syntax. Personally I am somewhat allergic to XML as HMI. I think XML belongs to the domain of machines talking to machines. This is the domain I also think serialization is designed for and not what I think the OP is addressing.

Configuration is really an activity when a system is instantiated, like when you install software on a host. Configuration is used to override hard-coded source code values. Some of these values are impossible to assume useful defaults for in source or detect during installation. Such values typically become required configuration values. Serialization typically comes to this game a step later when the deployed system is running. Configured values can then again be overridden by system, user, or document preferences and made persistent through use of serialization.

If the same serialization code is used as part of an installer or configuration tool providing a good HMI, then serialization may be a good choice for storing the configuration. So this all flows somewhat together in some cases while they remain very separate activities in other cases. -- Bjørn

Rob Riggs

5:36 p.m.

...

On 11/27/2010 07:24 PM, Hal Finkel wrote:

4. A scripting language might not match user expectations. Especially on POSIX-style systems, users (and even more so, administrators) have a general understanding of the "look and feel" of a configuration file.

For that, Boost Program Options offers a reasonable option. I think that this proposal is for something with more features than one has with Program Options. Rob

Hal Finkel

8:21 p.m.

On Sun, 2010-11-28 at 10:36 -0700, Rob Riggs wrote:

...

...
On 11/27/2010 07:24 PM, Hal Finkel wrote:

4. A scripting language might not match user expectations. Especially on POSIX-style systems, users (and even more so, administrators) have a general understanding of the "look and feel" of a configuration file.

For that, Boost Program Options offers a reasonable option. I think that this proposal is for something with more features than one has with Program Options.

I think that is partially true. Boost.Program_options offers the parse_config_file function, but that only parses a simple INI-like format. For example, it is not natural to use named nested subsections in an INI file. Regardless, even parse_config_file will often be more appropriate than using an embedded scripting language. -Hal
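To illustrate why nesting is awkward in an INI-like format: in the Boost.Program_options configuration-file syntax, an option inside a [section] is simply reported under the dotted name "section.option", so nesting is a naming convention rather than real structure. The sketch below mimics that flattening in standard C++ only. It is not Boost code, and parse_flat_ini and trim are names invented for this example.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Strips leading and trailing spaces, tabs and carriage returns.
std::string trim(const std::string& s)
{
    const std::string::size_type first = s.find_first_not_of(" \t\r");
    if (first == std::string::npos)
        return std::string();
    const std::string::size_type last = s.find_last_not_of(" \t\r");
    return s.substr(first, last - first + 1);
}

// Reads "name = value" lines and "[section]" headers, and stores every
// option under the flat key "section.name". '#' starts a comment.
// Malformed lines are skipped silently to keep the sketch short.
std::map<std::string, std::string> parse_flat_ini(std::istream& in)
{
    std::map<std::string, std::string> options;
    std::string section;
    std::string line;
    while (std::getline(in, line)) {
        line = trim(line.substr(0, line.find('#')));
        if (line.empty())
            continue;
        if (line.front() == '[' && line.back() == ']') {
            section = trim(line.substr(1, line.size() - 2));
            continue;
        }
        const std::string::size_type eq = line.find('=');
        if (eq == std::string::npos)
            continue;
        const std::string name = trim(line.substr(0, eq));
        const std::string value = trim(line.substr(eq + 1));
        options[section.empty() ? name : section + "." + name] = value;
    }
    return options;
}

// Usage: a "nested" section is just a dotted prefix on the option name.
void demo_flat_ini()
{
    std::istringstream in(
        "Host = 127.0.0.1\n"
        "[Server.Logging]   # nesting is only a naming convention\n"
        "Level = debug\n");
    const std::map<std::string, std::string> options = parse_flat_ini(in);
    assert(options.at("Host") == "127.0.0.1");
    assert(options.at("Server.Logging.Level") == "debug");
}
```

Because every section header replaces the previous one, there is no way to "close" an inner section and return to its parent, which is one reason a format with explicit nested blocks reads more naturally for deeper hierarchies.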

...

Rob


Marsh Ray

10:25 p.m.

On 11/28/2010 02:21 PM, Hal Finkel wrote:

...

Regardless, even parse_config_file will often be more appropriate than using an embedded scripting language.

What I would really like is a clean and simple JSON library. Last time I looked around (a year or two ago) it seemed like there were a lot of 50-80% side projects, none of which gave me the warm fuzzies about being tested and maintained. Many would parse but not generate, or vice versa. The DOM v SAX architectural decisions seem relevant too. - Marsh

Dean Michael Berris

10:54 p.m.

On Mon, Nov 29, 2010 at 6:25 AM, Marsh Ray <marsh@extendedsubset.com> wrote:

...

On 11/28/2010 02:21 PM, Hal Finkel wrote:

...
Regardless, even parse_config_file will often be more appropriate than using an embedded scripting language.

What I would really like is a clean and simple JSON library.

At the risk of sounding PR'ish...

...

Last time I looked around (a year or two ago) it seemed like there were a lot of 50-80% side projects, none of which gave me the warm fuzzies about being tested and maintained. Many would parse but not generate, or vice versa. The DOM v SAX architectural decisions seem relevant too.

It's actually on the list of things for me to do on cpp-netlib for 0.9 -- I'm working on cleaning up the internals of the library, and then preparing to do higher level utilities that will make web application or web service (REST+JSON) development with C++ easier. One of the things that I will be working on is a simple, robust, and type-safe way for doing JSON parsing/generation using Boost.Spirit. I'm positive there's already an example of how to do it with Boost.Spirit's Qi/Karma and I'm almost sure that I'll start with those. The idea with the utility library is that it will be usable in many different contexts -- and I'm actually prioritizing the parsing of HTTP requests that have JSON payload in PUT/POST requests. Of course that's just work waiting to be done -- if you have specific use cases in mind aside from just (simple) configuration file parsing, I'd definitely appreciate guidance/thoughts on what you would look for in a JSON parsing/generation library. HTH -- Dean Michael Berris deanberris.com

Marsh Ray

29 Nov 29 Nov

12:54 a.m.

On 11/28/2010 04:54 PM, Dean Michael Berris wrote:

...

On Mon, Nov 29, 2010 at 6:25 AM, Marsh Ray<marsh@extendedsubset.com> wrote:

...
What I would really like is a clean and simple JSON library.

At the risk of sounding PR'ish...

...
Last time I looked around (a year or two ago) it seemed like there were a lot of 50-80% side projects, none of which gave me the warm fuzzies about being tested and maintained. Many would parse but not generate, or vice versa. The DOM v SAX architectural decisions seem relevant too.

It's actually on the list of things for me to do on cpp-netlib for 0.9

Cool!

...

-- I'm working on cleaning up the internals of the library, and then preparing to do higher level utilities that will make web application or web service (REST+JSON) development with C++ easier.

One of the things that I will be working on is a simple, robust, and type-safe way for doing JSON parsing/generation using Boost.Spirit.

Honestly, the couple of times I've tried to use Spirit I have not been successful. I've done a few templates in my time but that library blows my mind. The concept is brilliant - and its implementation, heroic. But trying to actually use it tends to just make me feel dumb. I look at the diagrams at http://www.json.org/ and I see a simple byte-by-byte (or character-by-character) state machine. The kind of thing that's been done since the early C compilers, only much simpler. Something I could understand in a debugger or, more importantly, review for security in a network-facing application.
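As a rough illustration of the character-by-character style described above, here is a small state machine in standard C++ that checks only the nesting structure of a JSON text and enforces a caller-supplied depth limit. The name json_nesting_ok is invented for this sketch. It is a structural pre-check under those assumptions, not a JSON parser, and it says nothing about how any existing library works.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Scans `text` one character at a time and checks that JSON brackets and
// braces are balanced, that string literals are terminated, and that the
// nesting depth never exceeds `max_depth`. It does NOT validate values,
// commas or colons.
bool json_nesting_ok(const std::string& text, std::size_t max_depth)
{
    enum State { Outside, InString, InEscape };
    State state = Outside;
    std::vector<char> open;   // stack of expected closing characters

    for (std::size_t i = 0; i < text.size(); ++i) {
        const char c = text[i];
        switch (state) {
        case Outside:
            if (c == '"') {
                state = InString;
            } else if (c == '{' || c == '[') {
                if (open.size() == max_depth)
                    return false;         // too deeply nested
                open.push_back(c == '{' ? '}' : ']');
            } else if (c == '}' || c == ']') {
                if (open.empty() || open.back() != c)
                    return false;         // mismatched close
                open.pop_back();
            }
            break;
        case InString:
            if (c == '\\')
                state = InEscape;
            else if (c == '"')
                state = Outside;
            break;
        case InEscape:
            state = InString;             // skip the escaped character
            break;
        }
    }
    return state == Outside && open.empty();
}

// Usage: brackets inside string literals are ignored, depth is bounded.
void demo_json_nesting()
{
    assert(json_nesting_ok("{\"a\": [1, 2, {\"b\": \"]\"}]}", 3));
    assert(!json_nesting_ok("[[[[1]]]]", 3));
    assert(!json_nesting_ok("{\"a\": [1, 2}", 8));
}
```

The only state that grows with the input is the stack of open brackets, and it is capped by max_depth, so the worst-case memory use can be read directly from the code.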

...

I'm positive there's already an example of how to do it with Boost.Spirit's Qi/Karma and I'm almost sure that I'll start with those.

I hate to say it, but what I want is not that. I can't put Spirit code out on a network-facing environment for the same reason that I can't put a Haskell program out in such an environment - I don't understand it under the hood well enough to reason about the upper limits on its runtime resource consumption. (Actually, in the Haskell case it's not clear that anyone does. :-)

...

The idea with the utility library is that it will be usable in many different contexts -- and I'm actually prioritizing the parsing of HTTP requests that have JSON payload in PUT/POST requests.

Of course that's just work waiting to be done -- if you have specific use cases in mind aside from just (simple) configuration file parsing, I'd definitely appreciate guidance/thoughts on what you would look for in a JSON parsing/generation library.

Haha, cool, I get to play the customer for once. My wishlist/thoughts:

* An interface based on UTF-8 encoded std::strings. Locales and other string encodings are not helpful to me.

* Require minimal header dependencies. For example, I take std::vector, map, string, shared_ptr, and BOOST_FOREACH as a given. But other big header trees should have a justification.

* You mentioned type-safe. But the documents are completely dynamic, there's no schema. I'd rather just have everything presented as strings, but maybe the library would do reasonable automatic conversions on output. I would not want incoming untrusted JSON to create objects of attacker chosen types unless the interface makes the code state its expectations and throws an exception. Like dynamic_cast to a reference type (not like to a pointer type which defaults to a null pointer crash).

* Some types I see as valuable to work with are "string of arbitrary text" (e.g. an unqualified std::string), "string claimed to be JSON" (we received it), and "string of known-valid JSON" (we generated or validated it). These are things that tend to get confused in applications, can result in security holes (double escaping bugs), and that stricter typing could help.

* What would make it really industrial-strength (i.e., good enough for web apps) is a first-class mechanism for declaring limits on total memory usage and object allocation count before beginning a parsing or generation operation.

* The DOM could have an interface sort of like:

    void f(shared_ptr<json::dom_node> jdn)
    {
        // as<>() throws if jdn is somehow not a json::object
        shared_ptr<json::object> jo = jdn->as<json::object>();

        std::string username;
        BOOST_FOREACH(json::object_pair & jopr, jo->pairs())
        {
            // Iteration actively randomizes the order.
            // It's not significant according to the spec, right? :-)
            if (jopr.name() == "username")
                // value_as<>() throws if not a json::string node
                username = jopr.value_as<std::string>();
        }
        ...
    }

* shared_ptr is great, but an intrusive_ptr could be good too. Hopefully cyclic references shouldn't be a problem, but a whole-document pool deallocator could be helpful. I like a convention where node types expose a typedef like 'sptr_type' with its preferred smart pointer type.

* It doesn't have to be a header-only library. It'd be better to have the interface small and simple.

* Interfacing to boost::serialization could be cool, but it's probably not the primary use case right now.

* I don't much care about what type of exception gets thrown. Anything under std::exception is fine. It would be good to have line and char position information for parsing errors.

* It would be cool if the parser could be incrementally spoon-fed input data and code could pull data out of the generator incrementally as well. This would facilitate usage with ASIO-like callbacks.

* A simple pair of functions for escaping and unescaping according to the actual JSON rules for the between-doublequotes context.

* And a pony.

Thanks, - Marsh
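For the escaping item on the wishlist, the escaping half could look roughly like the sketch below. It follows the JSON string grammar (quotation mark, backslash and control characters below 0x20 must be escaped) and assumes the input is already valid UTF-8, which it does not verify. The name json_escape is invented here, and the matching unescape function, which has to deal with \uXXXX sequences and surrogate pairs, is left out to keep the example short.

```cpp
#include <cassert>
#include <string>

// Escapes `text` (assumed to be UTF-8) for use between the double quotes of
// a JSON string. Quotation mark, backslash and control characters below
// 0x20 are escaped; all other bytes, including multi-byte UTF-8 sequences,
// are copied unchanged.
std::string json_escape(const std::string& text)
{
    static const char hex[] = "0123456789abcdef";
    std::string out;
    out.reserve(text.size());
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(text[i]);
        switch (c) {
        case '"':  out += "\\\""; break;
        case '\\': out += "\\\\"; break;
        case '\b': out += "\\b";  break;
        case '\f': out += "\\f";  break;
        case '\n': out += "\\n";  break;
        case '\r': out += "\\r";  break;
        case '\t': out += "\\t";  break;
        default:
            if (c < 0x20) {
                // Remaining control characters use the \u00XX form.
                out += "\\u00";
                out += hex[c >> 4];
                out += hex[c & 0x0F];
            } else {
                out += static_cast<char>(c);
            }
        }
    }
    return out;
}

// Usage: the result can be placed directly between two double quotes.
void demo_json_escape()
{
    assert(json_escape("say \"hi\"\n") == "say \\\"hi\\\"\\n");
    assert(json_escape("a\\b") == "a\\\\b");
}
```

Returning a distinct wrapper type instead of a plain std::string would be one way to approach the "string of known-valid JSON" idea from the same list, since it would make accidental double escaping a compile-time error.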

Rob Riggs

4:09 a.m.

On 11/28/2010 05:54 PM, Marsh Ray wrote:

...

Honestly, the couple of times I've tried to use Spirit I have not been successful. I've done a few templates in my time but that library blows my mind. The concept is brilliant - and its implementation, heroic. But trying to actually use it tends to just make me feel dumb.

You are not the only one.

Something I could understand in a debugger or, more importantly, review for security in a network-facing application.

+1

The use of JSON will go far beyond that of a config file parser. The number of "eyes" that can grok Spirit code and accurately review it are vanishingly small. Rob

Joel de Guzman

5:56 a.m.

On 11/29/2010 12:09 PM, Rob Riggs wrote:

...

On 11/28/2010 05:54 PM, Marsh Ray wrote:

...
Honestly, the couple of times I've tried to use Spirit I have not been successful. I've done a few templates in my time but that library blows my mind. The concept is brilliant - and its implementation, heroic. But trying to actually use it tends to just make me feel dumb. You are not the only one. Something I could understand in a debugger or, more importantly, review for security in a network-facing application. +1

The use of JSON will go far beyond that of a config file parser. The number of "eyes" that can grok Spirit code and accurately review it are vanishingly small.

Why do you have to grok the internals of a library in order to use it successfully? I bet these same people can't grok template heavy code like MPL internals too. Yet, MPL is core to many libraries in Boost. Are you saying that people shouldn't use any library that depends on MPL too? Or are you singling out Spirit? If so, why? I've almost had enough of this FUD mongering. Regards, -- Joel de Guzman http://www.boostpro.com http://spirit.sf.net

Denis Shevchenko

7:42 a.m.

Future plans include:

1. Support for multiple sections with the same name at one level of nesting (for some programs this is a useful feature).

2. Reparsing with the ability to use the initial default values of options. Example:

    configurator.in( "Server" ).add_here( "Host" ).default_value( "127.0.0.1" );

Config:

    Host 12.56.78.44

    std::string host = configurator.from( "Server" ).get_from_here( "Host" );

In this case host's value is 12.56.78.44. But what if we reparse the config (during program execution)? If before reparsing we change Host to "34.77.88.05", the value will change accordingly. But if we comment out Host:

    // Host 12.56.78.44

the value of host will again be "127.0.0.1", as was set at the beginning.

Adding support for Unicode is not planned yet, although it probably will be. - Denis
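The reparsing behaviour described above can be sketched in standard C++ as follows. The class option_store and its members are hypothetical stand-ins written for this illustration and are not Configurator's API. Sections are left out, and the separator is assumed to be whitespace, as in the "Host 12.56.78.44" example.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical stand-in for an option store; NOT Configurator's API.
class option_store
{
public:
    // Registers an option together with its initial default value.
    void add_option(const std::string& name, const std::string& default_value)
    {
        defaults_[name] = default_value;
        values_[name] = default_value;
    }

    // (Re)parses "Name Value" lines. Every option is first reset to its
    // initial default, so an option that was removed or commented out with
    // "//" falls back to the default instead of keeping its previous value.
    void parse(std::istream& in)
    {
        values_ = defaults_;
        std::string line;
        while (std::getline(in, line)) {
            std::istringstream fields(line);
            std::string name, value;
            if (!(fields >> name >> value))
                continue;                   // blank or incomplete line
            if (name.compare(0, 2, "//") == 0)
                continue;                   // single-line comment
            if (values_.count(name) != 0)
                values_[name] = value;      // ignore unregistered names
        }
    }

    // Returns the current value; throws std::out_of_range if the option
    // was never registered.
    const std::string& get_value(const std::string& name) const
    {
        return values_.at(name);
    }

private:
    std::map<std::string, std::string> defaults_;
    std::map<std::string, std::string> values_;
};

// Usage: the second parse sees Host commented out and restores the default.
void demo_reparse()
{
    option_store store;
    store.add_option("Host", "127.0.0.1");

    std::istringstream first("Host 12.56.78.44\n");
    store.parse(first);
    assert(store.get_value("Host") == "12.56.78.44");

    std::istringstream second("// Host 12.56.78.44\n");
    store.parse(second);
    assert(store.get_value("Host") == "127.0.0.1");
}
```

Keeping the defaults in a separate map is the design choice that matters here: resetting from that map at the start of every parse is what makes a commented-out option revert to its default.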

Marsh Ray

10:56 p.m.

On 11/28/2010 11:56 PM, Joel de Guzman wrote:

...

On 11/29/2010 12:09 PM, Rob Riggs wrote:

...
On 11/28/2010 05:54 PM, Marsh Ray wrote:

...
Honestly, the couple of times I've tried to use Spirit I have not been successful. I've done a few templates in my time but that library blows my mind. The concept is brilliant - and its implementation, heroic. But trying to actually use it tends to just make me feel dumb. You are not the only one. Something I could understand in a debugger or, more importantly, review for security in a network-facing application. +1

The use of JSON will go far beyond that of a config file parser. The number of "eyes" that can grok Spirit code and accurately review it are vanishingly small.

Why do you have to grok the internals of a library in order to use it successfully?

For most programs - you don't. Most code is written to the garbage-in-garbage-out standard. For valid inputs it is tested to produce valid outputs.

Well-written code goes the extra mile and tries to have well-defined handling even for invalid input.

But there's a third level for code in internet-facing apps that handles untrusted data. In this case it has to be resilient against even actively malicious inputs crafted by a skilled attacker. Proving the negative (there are no possible inputs which can produce undesired behavior) is usually harder than proving the positive (these valid inputs produce desired behavior). Experience shows that this kind of code doesn't happen by accident, even from the most solid traditional software process.

The state of secure coding art seems to be today much like where "software quality" processes were way back when we were just discovering things like unit testing. So we still fall back to subjective human judgement in most cases: has the code been reviewed for security? Did the reviewers find it comprehensible? In good taste? How many security holes did they find? Are the developers willing to stand behind it? Even beyond its original design requirements?

...

I bet these same people can't grok template heavy code like MPL internals too.

:-)

...

Yet, MPL is core to many libraries in Boost. Are you saying that people shouldn't use any library that depends on MPL too?

I'm not saying anyone "shouldn't use" anything.

I am saying that handling maliciously crafted input tends to be outside the initial design requirements of most code (in cases where it's not, the developers will likely mention it somewhere). So taking general-purpose code and using it in security-critical contexts does require care.

It's a bit like using commercial off-the-shelf parts in your space satellite. Probably that transistor will work just fine, but you have to do some extra work to requalify it for your application.

But with respect to something like MPL, yes, it's a consideration. Generative templates are an interesting case because they don't interpret data at runtime. But understanding exactly what code they do generate is still important. The attacker may be reading your code bottom-up with a disassembler, so the potential exists for him to understand the generated code better than you do.

...

Or are you singling out Spirit? If so, why?

Wasn't intending to, but maybe a little bit. In this case:

1. I tried and could not make a Spirit parser for my input format after some amount of time.

2. I'm sure I could have gotten something working eventually, but I would not have felt comfortable with my understanding of all its failure modes.

3. JSON is an intentionally simple thing to parse. A competent C programmer could write a parser by hand in a short time. This exact thing is probably available under a free license.

4. JSON is used specifically for internet-facing apps due to its integration with Javascript. I.e., it's right there on the front lines of the attack surface.

...

I've almost had enough of this FUD mongering.

I'm sorry that you feel that way. It was not my intention to cast FUD. C++ and Boost are well suited for secure coding because of the support for type safety, automatic cleanup, and overall deterministic execution. But probably nothing will replace the need to understand what's going on under the hood as well as the attacker. - Marsh

Joel de Guzman

30 Nov 30 Nov

12:10 a.m.

On 11/30/2010 6:56 AM, Marsh Ray wrote:

...

...
Why do you have to grok the internals of a library in order to use it successfully?

For most programs - you don't. Most code is written to the garbage-in-garbage-out standard. For valid inputs it is tested to produce valid outputs.

Well-written code goes the extra mile and tries to have well-defined handling even for invalid input.

But there's a third level for code in internet-facing apps that handles untrusted data. In this case it has to be resilient against even actively malicious inputs crafted by a skilled attacker. Proving the negative (there are no possible inputs which can produce undesired behavior) is usually harder than proving the positive (these valid inputs produce desired behavior). Experience shows that this kind of code doesn't happen by accident, even from the most solid traditional software process.

The state of secure coding art seems to be today much like where "software quality" processes were way back when we were just discovering things like unit testing. So we still fall back to subjective human judgement in most cases: has the code been reviewed for security? Did the reviewers find it comprehensible? In good taste? How many security holes did they find? Are the developers willing to stand behind? Even beyond its original design requirements?

[snip]

...

...
I've almost had enough of this FUD mongering.

I'm sorry that you feel that way. It was not my intention to cast FUD.

C++ and Boost are well suited for secure coding because of the support for type safety, automatic cleanup, and overall deterministic execution. But probably nothing will replace the need to understand what's going on under the hood as well as the attacker.

If that is your reasoning then you can't use Boost (nor cpp-netlib for that matter). "The number of "eyes" that can grok" Boost internals "and accurately review it are vanishingly small." Yet, anyone claiming to be a C++ security expert positioned to review code for "software quality" should be able to review any and all sorts of C++ code, even template-heavy code like MPL. Spirit is not special. It uses the same template metaprogramming techniques that are prevalent in most Boost libraries. There is no black magic there. [I'll try to reply to your Spirit-related comments in another post. As I see it, your comments there (relating to not being able to make your code work and that JSON is easy to code in C) are totally irrelevant.] Regards, -- Joel de Guzman http://www.boostpro.com http://spirit.sf.net

Marsh Ray

2:09 a.m.

On 11/29/2010 06:10 PM, Joel de Guzman wrote:

...

On 11/30/2010 6:56 AM, Marsh Ray wrote:

...
C++ and Boost are well suited for secure coding because of the support for type safety, automatic cleanup, and overall deterministic execution. But probably nothing will replace the need to understand what's going on under the hood as well as the attacker.

If that is your reasoning then you can't use Boost (nor cpp-netlib for that matter).

Yeah. I've met a few developers, hackers, etc. working in the data security space. I don't think any of them were particularly into template metaprogramming. I doubt any would consider using MPL, for instance.

...

"The number of "eyes" that can grok" Boost internals "and accurately review it are vanishingly small."

That's neither good nor bad, but it does put Boost in a particular place in the design space. Actually, I do think it's good - we need projects that push out the corners to expand that space, and that's what I love about Boost. But a more typical software project being built by mortal men on a schedule usually can't afford the conceptual costs of bringing in more than one or maybe two of those corners. As square as this sounds, you've got to keep a foothold in the mainstream.

...

Yet, anyone claiming to be a C++ security expert positioned to review code for "software quality" should be able to review any and all sort of C++ code, even the template heavy code like MPL.

That's a really good point, but I think on balance I disagree. Here's why: Someone might know everything there is to know about, say, programming in C, but not be qualified to review hard-real-time robotics systems written in C (though they would obviously have a lot to contribute). So there's a distinction between the language and the problem domain and you need to be able to understand both. Template instantiation is a Turing-complete language and MPL specifically seeks to enable programming in it. So I think it's reasonable that even an experienced team of C++ practitioners might decide that some particular big chunk of metaprogramming wasn't worth the conceptual overhead for a particular project. This stuff has a cost, abstraction isn't free. Witness the threads on template error messages. But when it works out, it can be dynamite: STL.

...

Spirit is not special. It uses the same template metaprogramming techniques that is prevalent in most of Boost libraries. There is no black magic there.

Haha, I don't think you'll find many people (outside of this list) who don't think it is exactly black magic. :-) Personally I think it's pretty awesome.

...

[I'll try to reply to your Spirit related comments in another post. As I see it, your comments there (relating to not being able to make it your code work and that JSON is easy to code in C) is totally irrelevant.]

All I was trying to say was that I could really use a no-fuss, bulletproof JSON facility and Spirit seemed like overkill for this simple format. - Marsh

Joel de Guzman

7:10 a.m.

On 11/30/2010 10:09 AM, Marsh Ray wrote:

...

This stuff has a cost, abstraction isn't free. Witness the threads on template error messages. But when it works out, it can be dynamite: STL.

...
Spirit is not special. It uses the same template metaprogramming techniques that is prevalent in most of Boost libraries. There is no black magic there.

Haha, I don't think you'll find many people (outside of this list) who don't think it is exactly black magic. :-)

Personally I think it's pretty awesome.

...
[I'll try to reply to your Spirit related comments in another post. As I see it, your comments there (relating to not being able to make it your code work and that JSON is easy to code in C) is totally irrelevant.]

All I was trying to say was that I could really use a no-fuss, bulletproof JSON facility and Spirit seemed like overkill for this simple format.

I'd like to convince you otherwise. Spirit is meant for simple to moderate parsing tasks such as parsing complex numbers, CSV, arithmetic expressions and definitely JSON. The stigma of Spirit (and any TMP-heavy libraries for that matter) stems from 1) outrageous error messages and 2) long compile times. Possibly a third would be the steep learning curve, but that may very well be connected to 1. It's a pain to get past the error messages, I know; one misstep and you'll get tons of undecipherable errors. That, and the perception of complexity (the black art, if you will), makes it hard for Spirit to be accepted into the mainstream. In reality, beneath the TMP (some say gimmickry), Spirit is actually quite simple. What I wish to do is to dispel that image of complexity and make it simple (again as it was in the beginning). That is our new mission. Regards, -- Joel de Guzman http://www.boostpro.com http://spirit.sf.net

Marsh Ray

4:54 p.m.

On 11/30/2010 01:10 AM, Joel de Guzman wrote:

...

On 11/30/2010 10:09 AM, Marsh Ray wrote:

...
All I was trying to say was that I could really use a no-fuss, bulletproof JSON facility and Spirit seemed like overkill for this simple format.

I'd like to convince you otherwise. Spirit is meant to be for such simple to moderate parsing tasks such as parsing complex numbers CSV, arithmetic expressions and definitely JSON.

OK, I'm up for that. I have a "hard real time deadline" today but am willing to give it another go afterwards. Perhaps I should email you off-list once I get the project open again in the next day or two? Or if others say they're interested we could take it to the -users list.

...

The stigma of Spirit (and any TMP heavy libraries for that matter) stems from 1) Outrageous error messages and 2) Long compile times. Possibly third would be the steep learning curve, but that may very well be connected to 1 -It's a pain to get past the error messages, I know; one misstep and you'll get tons of undecipherable errors.

I know. FWIW, I felt like I could get through the template errors reasonably well.

...

That, and the perception of complexity (the black art, if you will), makes it hard for Spirit to be accepted into the mainstream. In reality, beneath the TMP (some say gimmickry), Spirit is actually quite simple.

Hmmmm... I'm keeping an open mind. :-)

...

What I wish to do is to dispel that image of complexity and make it simple (again as it was in the beginning). That is our new mission.

I made a generative template system for a lightweight key-record data persistence system here at work. The record types are usually updated by someone else...who doesn't even know much C++! So I know it can be done. - Marsh

Jeff Flinn

6:31 p.m.

Marsh Ray wrote:

...

On 11/30/2010 01:10 AM, Joel de Guzman wrote:

...
On 11/30/2010 10:09 AM, Marsh Ray wrote: ... That, and the perception of complexity (the black art, if you will), makes it hard for Spirit to be accepted into the mainstream. In reality, beneath the TMP (some say gimmickry), Spirit is actually quite simple.

Hmmmm... I'm keeping an open mind. :-)

Especially letting go of a procedural mindset in favor of a declarative one will ease the transition. Jeff

Joel de Guzman

11:46 p.m.

On 12/1/2010 12:54 AM, Marsh Ray wrote:

...

On 11/30/2010 01:10 AM, Joel de Guzman wrote:

...
On 11/30/2010 10:09 AM, Marsh Ray wrote:

...
All I was trying to say was that I could really use a no-fuss, bulletproof JSON facility and Spirit seemed like overkill for this simple format.

I'd like to convince you otherwise. Spirit is meant to be for such simple to moderate parsing tasks such as parsing complex numbers CSV, arithmetic expressions and definitely JSON.

OK, I'm up for that. I have a "hard real time deadline" today but am willing to give it another go afterwards.

Perhaps I should email you off-list once I get the project open again in the next day or two? Or if others say they're interested we could take it to the -users list.

The Spirit general list is the best forum for this. There are lots of very helpful folks there. I'm but one of 'em. There's a good chance that some folks have had the same problems that you faced. See: http://boost-spirit.com/home/feedback-and-support/

...

...
The stigma of Spirit (and any TMP heavy libraries for that matter) stems from 1) Outrageous error messages and 2) Long compile times. Possibly third would be the steep learning curve, but that may very well be connected to 1 -It's a pain to get past the error messages, I know; one misstep and you'll get tons of undecipherable errors.

I know. FWIW, I felt like I could get through the template errors reasonably well.

...
That, and the perception of complexity (the black art, if you will), makes it hard for Spirit to be accepted into the mainstream. In reality, beneath the TMP (some say gimmickry), Spirit is actually quite simple.

Hmmmm... I'm keeping an open mind. :-)

...
What I wish to do is to dispel that image of complexity and make it simple (again as it was in the beginning). That is our new mission.

I made a generative template system for a lightweight key-record data persistence system here at work. The record types are usually updated by someone else...who doesn't even know much C++! So I know it can be done.

The thing is, with declarative systems, as soon as you get it working, (and get past the error messages) it just works! Regards, -- Joel de Guzman http://www.boostpro.com http://spirit.sf.net

Robert Ramey

1 Dec 1 Dec

5:08 p.m.

Marsh Ray wrote:

...

All I was trying to say was that I could really use a no-fuss, bulletproof JSON facility and Spirit seemed like overkill for this simple format.

The benefit of using Spirit to make a parser is that it reduces the task to rendering the grammar as BNF or PEG syntax. Subsequent maintenance is reduced to tweaking the grammar. Writing a "correct" and "robust" parser for some grammar x in C or C++ is much more work than it first appears. It is also extremely tedious and error-prone. The worst part is that it is very easy to make a "first cut" which works "pretty well", and that deceives one into thinking that he's about done when in fact there are a lot of tweaks in the pipeline: end point cases and lots of unanticipated errors. Using Spirit mostly eliminates these problems.

Spirit does take some time to learn. Using a declarative syntax takes getting used to. And there are hiccups with TMP. But all in all, it is a much better way to get the most out of one's brain. It results in a better defined, more complete, more robust, more correct and cheaper to maintain and upgrade final product. I personally would be suspicious of a library submission which could use it but declines to do so. Actually, parsing a JSON file would be best rendered as an example application for Spirit. Robert Ramey


Marsh Ray

8:59 p.m.

On 12/01/2010 11:08 AM, Robert Ramey wrote:

...

The benefit of using spirit to make a parser is that reduces the task to rendering the grammer as BNF or PEG syntax. Subsequent maintainence is reduced to tweaking the grammar.

Firstly, you are preaching to the choir. I am usually the one within the organization to be saying these things, so it feels weird to argue the other side the least little bit. Consider me to be telling you what my coworkers/boss would be expected to say, knowing me as they do.

...

Writing a "correct" and "robust" parser for some grammer x in C or C++ is much more work than it first appears. It is also extremely tedious and error prone. The worst is that it is very easy to make a "first cut" which works "pretty well" and that decieves one into thinking that he's about done when in fact there is a lot of tweaks in the pipeline end point cases, and lots of unanticipated errors.

Dude, it's something like 17 states: http://www.json.org/ Consider our "risk to schedule" added by a couple of hundred lines of non-abstract C code vs. mastering a new Boost metaprogramming DSEL. There are also several MIT-like licensed implementations available to choose from. And of course there's always the "everybody on the team needs to know it in case that one guy gets hit by a bus" (i.e. fired).

...

Using spirit mostly eliminates these problems. Spirit does take some time to learn. Using a declarative syntax takes getting used to. And there are hiccups with TMP. But all in all, it is a much better way to get the most of one's brain.

Well said.

...

It results in a better defined, more complete, more robust, more correct and cheaper to maintain and upgrade final product.

I would like to be able to say that from my own experience.

...

I personally would be suspicious of a library submission which could use it but declines to do so. Actually, parsing a JSON file would best rendered as an example application for spirit.

Yeah, I think it would make a great example. When I had tried it, I referred to the calculator examples a lot. But they calculate the result and return just that! I wanted to build something very close to the parse tree, but had significant trouble figuring out how to construct the node types and pass them upwards. The features for automatically constructing the return value were the most promoted by the documentation. It seemed to work great for things like ints and doubles, things that have an obvious primitive representation in C++. But the automatic facility seemed to just get in the way as soon as I wanted to return custom structs. There was the XML DOM example, but it was not simple enough for me. Thanks, I'm joining the Spirit list now. - Marsh
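For readers following the "build something close to the parse tree" problem, here is a minimal hand-written sketch of the non-Spirit alternative being debated. It is plain recursive descent, not Spirit, and every name in it (`node`, `parser`, `parse_json_subset`) is hypothetical. It deliberately handles only a subset of JSON (integers, strings without escape sequences, `true`/`false`/`null`, arrays, objects), which also illustrates the point made above about a "first cut" that looks nearly done while string escapes, fractions, exponents and Unicode are still missing.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical tree node. For objects, keys[i] names the value in items[i].
struct node {
    enum kind_t { null_k, bool_k, number_k, string_k, array_k, object_k } kind;
    std::string text;               // literal text for scalar values
    std::vector<std::string> keys;  // member names (objects only)
    std::vector<node> items;        // array elements or object values
};

class parser {
public:
    explicit parser(const std::string& s) : s_(s), pos_(0) {}

    node parse_document() {
        node n = parse_value();
        skip_ws();
        if (pos_ != s_.size()) fail("trailing characters");
        return n;
    }

private:
    const std::string& s_;
    std::size_t pos_;

    [[noreturn]] void fail(const std::string& what) const {
        throw std::runtime_error(what + " at offset " + std::to_string(pos_));
    }
    void skip_ws() {
        while (pos_ < s_.size() && std::isspace(static_cast<unsigned char>(s_[pos_]))) ++pos_;
    }
    bool eat(char c) {  // consume c (after optional whitespace) if present
        skip_ws();
        if (pos_ < s_.size() && s_[pos_] == c) { ++pos_; return true; }
        return false;
    }
    void expect(char c) { if (!eat(c)) fail(std::string("expected '") + c + "'"); }
    node make(node::kind_t k, const std::string& text = "") {
        node n; n.kind = k; n.text = text; return n;
    }

    node parse_value() {
        skip_ws();
        if (pos_ >= s_.size()) fail("unexpected end of input");
        char c = s_[pos_];
        if (c == '{') return parse_object();
        if (c == '[') return parse_array();
        if (c == '"') return make(node::string_k, parse_string());
        if (c == '-' || std::isdigit(static_cast<unsigned char>(c))) return parse_number();
        if (s_.compare(pos_, 4, "true") == 0) { pos_ += 4; return make(node::bool_k, "true"); }
        if (s_.compare(pos_, 5, "false") == 0) { pos_ += 5; return make(node::bool_k, "false"); }
        if (s_.compare(pos_, 4, "null") == 0) { pos_ += 4; return make(node::null_k); }
        fail("unexpected character");
    }
    std::string parse_string() {  // subset: escape sequences are rejected
        expect('"');
        std::string out;
        while (pos_ < s_.size() && s_[pos_] != '"') {
            if (s_[pos_] == '\\') fail("escape sequences are not handled in this sketch");
            out += s_[pos_++];
        }
        if (pos_ >= s_.size()) fail("unterminated string");
        ++pos_;  // closing quote
        return out;
    }
    node parse_number() {  // subset: optional '-' followed by digits
        std::size_t start = pos_;
        if (s_[pos_] == '-') ++pos_;
        std::size_t digits = pos_;
        while (pos_ < s_.size() && std::isdigit(static_cast<unsigned char>(s_[pos_]))) ++pos_;
        if (pos_ == digits) fail("expected digits");
        return make(node::number_k, s_.substr(start, pos_ - start));
    }
    node parse_array() {
        expect('[');
        node n = make(node::array_k);
        if (eat(']')) return n;
        do { n.items.push_back(parse_value()); } while (eat(','));
        expect(']');
        return n;
    }
    node parse_object() {
        expect('{');
        node n = make(node::object_k);
        if (eat('}')) return n;
        do {
            n.keys.push_back(parse_string());
            expect(':');
            n.items.push_back(parse_value());
        } while (eat(','));
        expect('}');
        return n;
    }
};

// Convenience wrapper; the input string must outlive the call.
node parse_json_subset(const std::string& text) { return parser(text).parse_document(); }
```

Each `parse_*` function returns the node it built, so "passing nodes upwards" is just the return value of the recursive call. That is simple here precisely because nothing is automatic; whether it stays simple once the full grammar is covered is the question the thread is arguing about.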

Mika Heiskanen

9:25 p.m.

On Wed, 2010-12-01 at 22:59 +0200, Marsh Ray wrote:

...

On 12/01/2010 11:08 AM, Robert Ramey wrote:

...
I personally would be suspicious of a library submission which could use it but declines to do so. Actually, parsing a JSON file would best rendered as an example application for spirit.

Yeah, I think it would make a great example.

Once again: http://www.codeproject.com/KB/recipes/JSON_Spirit.aspx Unicode support is lacking in features though. Regards, --> Mika Heiskanen

Joel de Guzman

11:36 p.m.

On 12/2/2010 5:25 AM, Mika Heiskanen wrote:

...

On Wed, 2010-12-01 at 22:59 +0200, Marsh Ray wrote:

...
On 12/01/2010 11:08 AM, Robert Ramey wrote:

...
I personally would be suspicious of a library submission which could use it but declines to do so. Actually, parsing a JSON file would best rendered as an example application for spirit.

Yeah, I think it would make a great example.

Once again:

http://www.codeproject.com/KB/recipes/JSON_Spirit.aspx

Unicode support is lacking in features though.

Spirit2 has unicode support. I'll most probably write a robust JSON parser very soon. It will play along perfectly with the S-expression example and the universal-tree AST (utree) Hartmut and I have been tinkering with for the past BoostCon. Our current goal is to 1) ease the learning curve 2) provide a clear and easy development cycle for incremental development. A JSON parser would be a good candidate for a proof of concept towards that goal. Regards, -- Joel de Guzman http://www.boostpro.com http://spirit.sf.net

Stewart, Robert

2 Dec 2 Dec

12:53 p.m.

Joel de Guzman wrote:

...

On 12/2/2010 5:25 AM, Mika Heiskanen wrote:

...
On Wed, 2010-12-01 at 22:59 +0200, Marsh Ray wrote:

...
On 12/01/2010 11:08 AM, Robert Ramey wrote:

...
I personally would be suspicious of a library submission which could use it but declines to do so. Actually, parsing a JSON file would best rendered as an example application for spirit.

Yeah, I think it would make a great example.

Once again:

http://www.codeproject.com/KB/recipes/JSON_Spirit.aspx

Unicode support is lacking in features though.

Spirit2 has unicode support. I'll most probably write a robust JSON parser very soon. It will play along perfectly with the S-expression example and the universal-tree AST (utree) Hartmut and I have been tinkering with for the past BoostCon.

Such non-trivial examples, developed as tutorials, would be tremendously useful to improve the experience of those trying to use Spirit for non-trivial problems.

...

Our current goal is to 1) ease the learning curve 2) provide a clear and easy development cycle for incremental development.

These are excellent goals and will go a very long way to increasing the use of Spirit if met. _____ Rob Stewart robert.stewart@sig.com Software Engineer, Core Software using std::disclaimer; Susquehanna International Group, LLP http://www.sig.com

Dean Michael Berris

3:23 p.m.

On Thu, Dec 2, 2010 at 8:53 PM, Stewart, Robert <Robert.Stewart@sig.com> wrote:

...

Joel de Guzman wrote:

...
Spirit2 has unicode support. I'll most probably write a robust JSON parser very soon. It will play along perfectly with the S-expression example and the universal-tree AST (utree) Hartmut and I have been tinkering with for the past BoostCon.

Such non-trivial examples, developed as tutorials, would be tremendously useful to improve the experience of those trying to use Spirit for non-trivial problems.

+1 Actually I can't wait to get my hands on a book dedicated to Boost.Spirit. :D

...

...
Our current goal is to 1) ease the learning curve 2) provide a clear and easy development cycle for incremental development.

These are excellent goals and will go a very long way to increasing the use of Spirit if met.

Actually I have higher hopes for this. Really, a REPL for C++ template metaprogramming is badly needed IMO -- this necessitates an isolation of the template system in C++ into its own programming language which would really benefit from a dedicated compiler/interpreter. -- Dean Michael Berris deanberris.com

Robert Ramey

1 Dec 1 Dec

11:32 p.m.

Marsh Ray wrote:

...

On 12/01/2010 11:08 AM, Robert Ramey wrote:

...
It results in a better defined, more complete, more robust, more correct and cheaper to maintain and upgrade final product.

I would like to be able to say that from my own experience.

I'm basing my comments on my experience using a Spirit XML parser for the serialization library input. I didn't (and still don't) think parsing XML should be that hard. When I was faced with it, my reaction was dread at having to do a small recursive descent or state machine parser for the umpteenth time.

...

When I had tried it, I referred to the calculator examples a lot. But they calculate the result and return just that! I wanted to build something very close to the parse tree, but had significant trouble figuring out how to construct the node types and pass them upwards.

I'll confess that using Spirit was a lot harder than I thought it was going to be. This was mainly due to the fact that the declarative approach was sort of new to me. Then there was the usage of things that I wasn't really familiar with at the time: functors, binders, etc. But I also wanted to become familiar with these things. I did, with a little help, manage to get it working and was extremely pleased with the result. There was the grammar, displayed in a formal, verifiable way, totally separate from the other code, of which there was very little. It was much easier to be confident that it was correct, assuming Spirit actually worked, which apparently it does.

...

The features for automatically constructing the return value were the most promoted by the documentation. It seemed to work great for things like ints and doubles, things that have an obvious primitive representation in C++. But the automatic facility seemed to just get in the way as soon as I wanted to return custom structs. There was the XML DOM example, but it was not simple enough for me.

Thanks, I'm joining the Spirit list now.

lol - I never hoped to be THAT persuasive. You might take a look at the xml_grammar code in the serialization library. I think it shows how valuable this tool can be. It turned a messy job into a work of art. Also note that this made it easy to support a wide-character version. Finally, note that I have had my complaints and criticisms of Spirit, so I'm not really considered a member of the Spirit "booster" team.

I just took a look at the Spirit parser cited in a previous email. This seems pretty good. It reminds me of my own experience: lots of cryptic functors and stuff and, at the end, a very simple to understand listing of the actual JSON grammar. So it's a lot of work to set up but easy to maintain. Actually, I see that he includes a JSON writer as well. If I had nothing else to do, I might be inclined to just make json_?archive classes out of these. Robert Ramey

Joel de Guzman

2 Dec 2 Dec

12:04 a.m.

On 12/2/2010 4:59 AM, Marsh Ray wrote:

...

On 12/01/2010 11:08 AM, Robert Ramey wrote:

...
The benefit of using spirit to make a parser is that reduces the task to rendering the grammer as BNF or PEG syntax. Subsequent maintainence is reduced to tweaking the grammar.

Firstly, you are preaching to the choir. I am usually the one within the organization to be saying these things, so it feels weird to argue the other side the least little bit. Consider me to be telling you what my coworkers/boss would be expected to say, knowing me as they do.

...
Writing a "correct" and "robust" parser for some grammer x in C or C++ is much more work than it first appears. It is also extremely tedious and error prone. The worst is that it is very easy to make a "first cut" which works "pretty well" and that decieves one into thinking that he's about done when in fact there is a lot of tweaks in the pipeline end point cases, and lots of unanticipated errors.

Dude, it's something like 17 states: http://www.json.org/

Consider our "risk to schedule" added by a couple of hundred lines of non-abstract C code vs. mastering a new Boost metaprogramming DSEL.

There are also several MIT-like licensed implementations available to choose from. And of course there's always the "everybody on the team needs to know it in case that one guy gets hit by a bus" (i.e. fired).

I'd say that anyone parsing text should at least have some knowledge of formal grammars and BNF. Reading a few lines of declarative EBNF/PEG vs. a hundred lines of non-abstract C code? You know which I'll prefer. This debate reminds me of the original Mozilla HTML parser, written wayyy back using a hand-crafted state machine and a humongous switch. Sure, it will work, but then you have to add another feature sometime soon and it gets pretty-exponentially nasty. In fact, I'd argue the other way: *THAT* should be your concern for potential security holes.

...

...
Using spirit mostly eliminates these problems. Spirit does take some time to learn. Using a declarative syntax takes getting used to. And there are hiccups with TMP. But all in all, it is a much better way to get the most of one's brain.

Well said.

...
It results in a better defined, more complete, more robust, more correct and cheaper to maintain and upgrade final product.

I would like to be able to say that from my own experience.

Robert has been using Spirit with Boost Serialization's XML parser. He wrote it once many years ago and had no problem with it at all (at least none that I am aware of). It's churning along just fine and he does not have to go back to it, let alone stepping through the code. It just works!

...

...
I personally would be suspicious of a library submission which could use it but declines to do so. Actually, parsing a JSON file would best rendered as an example application for spirit.

Yeah, I think it would make a great example.

When I had tried it, I referred to the calculator examples a lot. But they calculate the result and return just that! I wanted to build something very close to the parse tree, but had significant trouble figuring out how to construct the node types and pass them upwards.

This is the typical hurdle. We intend to seriously tackle that very problem. For now, additional information would help a lot. I'd recommend going through Michael Caisse's splendid BoostCon slides.

...

The features for automatically constructing the return value were the most promoted by the documentation. It seemed to work great for things like ints and doubles, things that have an obvious primitive representation in C++. But the automatic facility seemed to just get in the way as soon as I wanted to return custom structs. There was the XML DOM example, but it was not simple enough for me.

Thanks, I'm joining the Spirit list now.

See you there. Regards, -- Joel de Guzman http://www.boostpro.com http://spirit.sf.net

Sohail Somani

29 Nov 29 Nov

7:11 a.m.

On 10-11-28 11:09 PM, Rob Riggs wrote:

...

...
Honestly, the couple of times I've tried to use Spirit I have not been successful. I've done a few templates in my time but that library blows my mind. The concept is brilliant - and its implementation, heroic. But trying to actually use it tends to just make me feel dumb. You are not the only one. Something I could understand in a debugger or, more importantly, review for security in a network-facing application. +1

The Spirit concepts are straightforward if you read the documentation. Debugging is actually pretty easy if you use the built-in debug facilities. I have not yet had to use it under a debugger and I use various versions of Spirit all the time. I guess I'm lucky, awesome or the library provides enough tools for me. If you compare Spirit to ANTLR, sure, stepping through ANTLR is easier. But I try not to have to step through code. zzzzzzz -- Sohail Somani -- iBlog : http://uint32t.blogspot.com iTweet: http://twitter.com/somanisoftware iCode : http://bitbucket.org/cheez

Mika Heiskanen

7:49 a.m.

On Mon, 2010-11-29 at 09:11 +0200, Sohail Somani wrote:

...

On 10-11-28 11:09 PM, Rob Riggs wrote:

...
...
Honestly, the couple of times I've tried to use Spirit I have not been successful. I've done a few templates in my time but that library blows my mind. The concept is brilliant - and its implementation, heroic. But trying to actually use it tends to just make me feel dumb. You are not the only one. Something I could understand in a debugger or, more importantly, review for security in a network-facing application. +1

The Spirit concepts are straightforward if you read the documentation.

I've been using JSON Spirit, which is available from http://www.codeproject.com/KB/recipes/JSON_Spirit.aspx The code seems readable enough. --> Mika Heiskanen

Torri, Stephen CIV NSWCDD, W15

30 Nov 30 Nov

1:46 p.m.

...

From: boost-bounces@lists.boost.org on behalf of Rob Riggs Sent: Sun 11/28/2010 11:09 PM To: boost@lists.boost.org Subject: Re: [boost] Library for configuration file parsing

The use of JSON will go far beyond that of a config file parser. The number of "eyes" that can grok Spirit code and accurately review it are vanishingly small.

Please enlighten me as to how you came to the conclusion that the "number of 'eyes' that can grok Spirit code and accurately review it are vanishingly small"? Stephen

Denis Shevchenko

29 Nov 29 Nov

6:48 a.m.

On 28.11.2010 05:24, Hal Finkel wrote:

...

I think a library for configuration-file parsing would be quite useful. There are many use cases where an embedded scripting language is best, but I think there are also many for which a dedicated configuration-file-parsing library is superior. -Hal

Hi, Hal! If you want, see Configurator's page: http://opensource.dshevchenko.biz/configurator/. Unfortunately, the full user's guide is not ready yet, but the examples explain almost everything.

On 28.11.2010 12:07, Bjørn Roald wrote:

...

Both program options and serialization provide that to some degree.

What a library can do is one thing; what it is intended for is another. I adhere to the principle "One task - one library". That's why I like to use Boost libraries. As far as I can see from the documentation, Boost.Serialization is certainly not designed for working with configuration files.

...

I think program options lacks some of the desired flexibility for structure, necessity and validation, but that could possibly be added rather than making an all new library which will be confusing me and others even more.

Well, Bjørn, I think it's a question for Vladimir Prus, not for me. :-)

Some time ago I started to use Program_options, but later a simple INI-like file was no longer enough for my programs. I had no intention of creating my own solution, so I started looking for a ready-made free library for configuration file parsing (moreover, it could not be a GPL solution, because I need it for commercial use). Since I could not find a library that met the principles of simplicity (easy to learn and easy to use), I decided to create my own library under the MIT License. And when the library turned out to be successful (imho), I decided to gauge the interest in it here, in Boost. - Denis

Bjørn Roald

8:58 p.m.

Denis, please reply to my posting when you reply to me. Pasting part of my post into another thread and replying there is confusing. On 11/29/2010 07:48 AM, Denis Shevchenko wrote:

...

On 28.11.2010 12:07, Bjørn Roald wrote:

...
Both program options and serialization provide that to some degree. One thing - what library can do, and another thing - what it is intended. I adhere to the principle "One task - one library". That's why I like to use Boost-libraries. As far as I can see the documentation, Boost.Serialization is certainly not designed for work with a configuration file.

Sure, I agree. I was only referring to the fact that these libraries seem to support more than one external data representation of something that is used the same way internally by the library's client code.

...

...
I think program options lacks some of the desired flexibility for structure, necessity and validation, but that could possibly be added rather than making an all new library which will be confusing me and others even more. Well, Bjørn, I think it's a question for Vladimir Prus, not for me. :-)

I guess I can ask Vladimir to extend his library, but I did not and do not think I will. I did however ask you why you selected to write a new library rather than proposing an extension to program option. But, you did not paste that part of my post here, so it got lost ;-)

...

Some time ago I started to use Program_options, but later the possibility of a simple INI-like file was not enough for my programs. I had no purpose to create own solution, so I started looking for a ready free library for configuration file parsing (moreover, it should not be GPL-solution, because it needs me for commercial use). And since I could not find a library corresponding to the principles of simplicity (easy-to-learn and easy-to-use), I decided to create own library under MIT Licence.

Have you considered what it would involve to make program option suitable for your use-cases?

...

And when the library has been successful (imho), I decided to determine the interests in it there, in Boost.

Good, new proposals are always a good thing :-) It may be that your proposal is superior to existing libraries; I do not know. However, we do have existing libraries in this domain, even if the features vary. I am concerned if Boost keeps adding new libraries covering the same ground because someone felt like writing a new one from scratch. In your case it sounds like it was developed without Boost in mind, and is now proposed because it seemed to be useful in Boost. I would be less concerned if I saw good reasoning why your advanced features could not or should not be added to Program Options or possibly Property Tree. Failing to understand this, I will assume Boost would be better off with enhancements to existing libraries. In that case I will encourage you to contribute to that end. -- Bjørn

Denis Shevchenko

30 Nov 30 Nov

5:17 a.m.

...

On 29.11.2010 23:58, Bjørn Roald wrote:

I did however ask you why you selected to write a new library rather than proposing an extension to program option. Have you considered what it would involve to make program option suitable for your use-cases?

The fact is that I had to make the solution for commercial use, and do it quickly. I just did not have the time to start cooperating with the authors of existing Boost libraries on the subject. And honestly, at that moment I did not think about this scenario... And now the library is ready, and I have used it for all my commercial developments for several months. The website has been created and the documentation written. That is, it is already a fully prepared and independent solution...

...

In your case it sounds like it was developed without Boost in mind, and is now proposed because it seemed to be useful in Boost.

Yes, it was. This library was created using Boost, but when I created it I did not expect that it might become part of Boost.

I understand your concern, Bjørn, and strictly speaking you are right... - Denis

Denis Shevchenko

8:20 a.m.

On 29.11.2010 23:58, Bjørn Roald wrote:

...

However we do have existing libraries in this domain, even if the features vary.

Hello again, Bjørn! I venture to suggest considering my library for the Boost Vault. You are right, there are existing libraries in the same domain. But I see that Boost has, for example, TWO libraries for working with regular expressions, Regex and Xpressive. And this is good (imho), because the user can choose between libraries (for example, between header-only and linking variants). Program_options is auto-linking; my library is header-only. :-)

Bjørn Roald

1 Dec 1 Dec

4:27 a.m.

On 11/30/2010 09:20 AM, Denis Shevchenko wrote:

...

On 29.11.2010 23:58, Bjørn Roald wrote:

...
However we do have existing libraries in this domain, even if the features vary. Hello again, Bjørn!

I venture to suggest to consider my library in Boost Vault. You are right, there are exists libraries of the same domain. But I see that in Boost, for example, exists TWO libraries for work with regular expressions, Regex and Xpressive. And this is good (imho), because user can choose between libraries (for example, between header-only and linking variants). Program_options is auto linking, my library is header-only. :-)

There is nothing preventing acceptance other than passing a review. So I wish you good luck :-) All I am expressing is that I would like to see effort in the same domain consolidated, if that works out for the authors. I also think you would have a better chance of getting your features into Boost that way, but that is only my opinion.

As for Regex and Xpressive, you are right. This is one example of Boost libraries covering the same domain. However, AFAIK they have different implementation approaches. Their APIs also vary in non-trivial ways. Xpressive supports a completely different approach: instantiating regular expressions at compile time rather than run time. Regex is tied to a proposal for the C++ standard library. So I think there is a certain amount of rationale for making and keeping separate solutions. -- Bjørn

Denis Shevchenko

29 Nov 29 Nov

12:27 p.m.

On 28.11.2010 05:24, Hal Finkel wrote:

...

I think a library for configuration-file parsing would be quite useful. There are many use cases where an embedded scripting language is best, but I think there are also many for which a dedicated configuration-file-parsing library is superior.

Hello again, Hal! One of the features of Configurator is a semantics check. I understand that a semantics check is not a mandatory feature of such a library, but I check "standard values" only. IPv4, IPv6, email, path (in the current filesystem): these are "global and immutable concepts". Imho, checking these values at the time of parsing the configuration file is very useful. The extended semantics check ('time period' and 'size') was added because I use it myself ('size' for the max size of a log file, and 'time period' for different periodic thread tasks). Perhaps it will be useful not only for me... :-) What do you think about it?
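To make the 'size' check concrete, here is a minimal sketch of what such a parse-time semantic check could look like. This is not Configurator's code or syntax: the function name `parse_size` and the accepted format (digits with an optional `K`, `M` or `G` suffix, taken as powers of 1024) are assumptions made purely for illustration.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper, not part of Configurator.
// Accepts "<digits>" optionally followed by K, M or G (case-insensitive,
// powers of 1024) and returns the size in bytes.
// Throws std::invalid_argument for anything else.
// Note: overflow of very large values is not checked in this sketch.
std::uint64_t parse_size(const std::string& text) {
    std::size_t i = 0;
    std::uint64_t value = 0;
    while (i < text.size() && std::isdigit(static_cast<unsigned char>(text[i]))) {
        value = value * 10 + static_cast<std::uint64_t>(text[i] - '0');
        ++i;
    }
    if (i == 0)
        throw std::invalid_argument("size must start with a digit: '" + text + "'");

    std::uint64_t factor = 1;
    if (i < text.size()) {
        switch (std::toupper(static_cast<unsigned char>(text[i]))) {
            case 'K': factor = 1024ULL; break;
            case 'M': factor = 1024ULL * 1024; break;
            case 'G': factor = 1024ULL * 1024 * 1024; break;
            default:
                throw std::invalid_argument("unknown size unit in '" + text + "'");
        }
        ++i;
    }
    if (i != text.size())
        throw std::invalid_argument("trailing characters in '" + text + "'");
    return value * factor;
}
```

With these assumptions, an option such as `MaxLogSize = 10M` would be rejected or accepted while the file is being parsed, and the program would receive the value already converted to bytes.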

Hal Finkel

3:16 p.m.

On Mon, 2010-11-29 at 15:27 +0300, Denis Shevchenko wrote:

...

On 28.11.2010 05:24, Hal Finkel wrote:

...
I think a library for configuration-file parsing would be quite useful. There are many use cases where an embedded scripting language is best, but I think there are also many for which a dedicated configuration-file-parsing library is superior.

Hello again, Hal!

One of the features of Configurator is semantic checking. I understand that semantic checking is not a mandatory feature of such a library, but I check "standard values" only. IPv4, IPv6, email, path (in the current filesystem): these are "global and immutable concepts". IMHO, checking these values at the time of parsing the configuration file is very useful.

Checks of the extended semantics ('time period' and 'size') were added because I use them myself ('size' for the maximum size of a log file, and 'time period' for various periodic thread tasks). Perhaps they will be useful not only for me... :-)

What do you think about it?

Do you just mean a kind of parse-time validation framework (where the parser checks that an e-mail setting looks like a valid e-mail address, etc.)? That can be useful, but make sure that you can get warnings instead of errors (for e-mail this is important). Path checking is also useful, but failure to exist is not always an error; sometimes the application can create a missing directory, and complain only if it cannot do so. In short, the behavior must be customizable.

On a different subject, the functionality that I think is missing from configuration-file parsing libraries (generally, and I've looked at quite a few) is a change-event-based interface to support runtime reloading. It would be nice to be able to respond to "configuration events" instead of iterating through a large tree structure. Then when the user requests that the configuration be reloaded, the library only sends change events for those things which have actually changed (this requires matching named sections, since the order may have changed). I should have the option of reusing the same event loop for initial loading as well.

I coded this for an application once, and it was a pain, but it worked well since the application "did what the user expected", so I think it would be great to have this "configuration diff" pattern encapsulated in a publicly available library (it was a server app, and it was important that connections logically unaffected by a config change stay up, for example).

-Hal
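The "configuration diff" pattern described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the configuration is flattened into a map keyed by full option name (so section order in the file does not matter), and the names config_map, change_kind, change_event and diff_config are hypothetical, not part of Configurator or of any Boost library.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Flat view of a parsed configuration: "section.option" -> value (assumed layout).
// Keying by full name makes the comparison independent of file order.
using config_map = std::map<std::string, std::string>;

enum class change_kind { added, removed, modified };

struct change_event {
    change_kind kind;
    std::string name;
    std::string old_value;  // empty for added options
    std::string new_value;  // empty for removed options
};

// Walks both sorted maps in step and reports only the options that differ.
std::vector<change_event> diff_config(const config_map& before,
                                      const config_map& after) {
    std::vector<change_event> events;
    auto b = before.begin();
    auto a = after.begin();
    while (b != before.end() || a != after.end()) {
        if (a == after.end() || (b != before.end() && b->first < a->first)) {
            events.push_back({change_kind::removed, b->first, b->second, ""});
            ++b;
        } else if (b == before.end() || a->first < b->first) {
            events.push_back({change_kind::added, a->first, "", a->second});
            ++a;
        } else {
            if (b->second != a->second) {
                events.push_back({change_kind::modified, b->first, b->second, a->second});
            }
            ++b;
            ++a;
        }
    }
    return events;
}

// Usage: an initial load is a diff against an empty configuration, so the
// same event handling can serve both the first load and later reloads.
void example_usage() {
    const config_map empty{};
    const config_map first{{"net.host", "127.0.0.1"}, {"net.port", "80"}};
    const config_map second{{"net.port", "80"}, {"net.host", "10.0.0.1"}};

    assert(diff_config(empty, first).size() == 2);  // two 'added' events

    const std::vector<change_event> events = diff_config(first, second);
    assert(events.size() == 1);  // reordering alone is not a change
    assert(events[0].kind == change_kind::modified);
    assert(events[0].name == "net.host");
}
```

An application would dispatch each event to the component that owns the option, so a connection whose settings did not change is never touched.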

...


Denis Shevchenko

3:51 p.m.

On 29.11.2010 18:16, Hal Finkel wrote:

...

> Do you just mean a kind of parse-time validation framework (where the parser checks that an e-mail setting looks like a valid e-mail address, etc.)?

Yes.

> That can be useful, but make sure that you can get warnings instead of errors (for e-mail this is important).

Hm... But what if the user enters an invalid e-mail address by mistake? Why should this be a warning and not an error?

> path checking is also useful, but failure to exist is not always an error, sometimes the application can create a missing directory...

Oops, that is something I did not think about (I thought about creating a file, but not about creating a directory)... Thank you very much.

> In short, the behavior must be customizable.

Thanks, I'll remember this.

> ... the functionality that I think is missing from configuration-file parsing libraries is a change-event-based interface to support runtime reloading.

Do you mean an "automatic" reparsing in case the configuration file has been modified? Hm, it's interesting...

> Then when user requests the configuration be reloaded, the library only sends change events for those things which have actually changed (this requires matching named sections, since the order may have changed). I should have the option of reusing the same event loop for initial loading as well.

Yes, you are right, this is a good proposal.

Thank you again, Hal. Today my TODO file has clearly grown. :-) - Denis

Hal Finkel

5:49 p.m.

On Mon, 2010-11-29 at 18:51 +0300, Denis Shevchenko wrote:

...

On 29.11.2010 18:16, Hal Finkel wrote:

...
> > Do you just mean a kind of parse-time validation framework (where the parser checks that an e-mail setting looks like a valid e-mail address, etc.)?
>
> Yes.
>
> > That can be useful, but make sure that you can get warnings instead of errors (for e-mail this is important).
>
> Hm... But what if user inputs invalid e-mail address by mistake? Why should this be a warning and not an error?

Many times it should be an error, but sometimes it should be a warning. In some sense, it depends on how robust your validation routine is (does it accept '+' with a mailbox name, or multiple mailboxes, etc.), but sometimes people use a custom syntax for private (internal) messaging, and since, in the end, the e-mail address is just an uninterpreted string being passed to the SMTP server (or similar), the user sometimes needs the ability to use private extensions.
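One way to make this customizable is to let the application, not the library, decide what a failed check means. The sketch below assumes a flat name/value map and uses hypothetical names (severity, rule, diagnostic, validate, looks_like_email); the e-mail check is deliberately loose and is not a full address validator.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

enum class severity { warning, error };

struct diagnostic {
    severity level;
    std::string option;
    std::string message;
};

struct rule {
    std::string option;                             // option the rule applies to
    std::function<bool(const std::string&)> check;  // true if the value is acceptable
    severity on_failure;                            // chosen by the application
    std::string message;
};

// Deliberately loose: exactly one '@' with text on both sides. It accepts
// "hal+lists@example.org" and rejects a private alias such as "ops-pager".
bool looks_like_email(const std::string& value) {
    const std::size_t at = value.find('@');
    return at != std::string::npos && at > 0 && at + 1 < value.size() &&
           value.find('@', at + 1) == std::string::npos;
}

// Runs every rule whose option is present and collects the failures.
std::vector<diagnostic> validate(const std::vector<rule>& rules,
                                 const std::map<std::string, std::string>& values) {
    std::vector<diagnostic> found;
    for (const rule& r : rules) {
        assert(r.check && "every rule needs a check function");
        const auto it = values.find(r.option);
        if (it != values.end() && !r.check(it->second)) {
            found.push_back({r.on_failure, r.option, r.message});
        }
    }
    return found;
}

// The caller refuses the configuration only if some failure is an error.
bool has_errors(const std::vector<diagnostic>& found) {
    for (const diagnostic& d : found) {
        if (d.level == severity::error) return true;
    }
    return false;
}
```

With on_failure set to severity::warning, a site that uses private mailbox syntax still loads its configuration and only gets a message in the log; a stricter application registers the same check with severity::error.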

...

...
> > path checking is also useful, but failure to exist is not always an error, sometimes the application can create a missing directory...
>
> Ops, that's what I do not think (I thought about creating a file, but not about creating a directory)... Thank you very much.
>
> > In short, the behavior must be customizable.
>
> Thanks, I'll remember this.
>
> > ... the functionality that I think is missing from configuration-file parsing libraries is a change-event-based interface to support runtime reloading.
>
> Do you mean an "automatic" reparsing in case the configuration file has been modified? Hm, it's interesting...

Generally, a reload is externally triggered (via a signal or some administrative interface). Reloading any time the file changes is a bad idea, since it could be in an inconsistent state. In my opinion, just provide a reload function.

...

...
> > Then when user requests the configuration be reloaded, the library only sends change events for those things which have actually changed (this requires matching named sections, since the order may have changed). I should have the option of reusing the same event loop for initial loading as well.
>
> Yes, you are right, this is a good proposal.

I'm glad you like it ;) -Hal

...

Thank you again, Hal

...

Today my TODO file has clearly grown. :-)

- Denis

Denis Shevchenko

7:19 p.m.

On 29.11.2010 20:49, Hal Finkel wrote:
> Many times it should be an error, but sometimes it should be a warning.
> In some sense, it depends on how robust your validation routine is (does
> it accept '+' with a mailbox name, or multiple mailboxes, etc.), but
> sometimes people use a custom syntax for private (internal) messaging,
> and since, in the end, the e-mail address is just an uninterpreted
> string being passed to the SMTP server (or similar), the user sometimes
> needs the ability to use private extensions.
You are right, Hal, I'll remember this.
> In my opinion, just provide a reload function.
Already. There are two functions:
- configurator::reparse()
- configurator::reparse( const std::string& path_to_new_configuration_file )

The first variant reparses the "old" configuration file; the second
reparses a new configuration file.

Stewart, Robert

7:39 p.m.

Denis Shevchenko wrote:
> On 29.11.2010 20:49, Hal Finkel wrote:
>
> > In my opinion, just provide a reload function.
> Already. There are two functions:
> - configurator::reparse()
> - configurator::reparse( const std::string& path_to_new_configuration_file )
>
> The first variant reparses the "old" configuration file; the second
> reparses a new configuration file.

One cannot reparse what hasn't been parsed.  I suggest "reparse" and "parse" or "reload" and "load" for those two functions.  (The latter pair is slightly easier to say.)

Have you considered the ability to parse multiple files thereby augmenting the data with each new file's content rather than replacing what had been parsed already?  In that case, the parse()/load() function would need to distinguish between replacing and augmenting the existing content or you'd need to distinguish the function names further:

   // reload contents from files used to generate current contents
   reload();

   // replace current contents with that from a new file
   replace(std::string const &);

   // augment current contents with that from a new file
   append(std::string const &);

Do you support programmatic manipulations of the configuration values?  That implies returning references to the configuration values or providing a pub/sub mechanism for learning of changes.  That, in turn, means that reloading or replacing content must invalidate outstanding references/subscriptions (though subscriptions could survive if the value in question exists after the update and still refers to the same type, if there are types).

Do you support references among configuration values?  Doing so permits some helpful use cases.  (Just think about augmenting a shell's PATH versus replacing it outright.)
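As a rough illustration of such references, the sketch below expands "${Name}" against the values known so far, which is enough to augment a PATH-like option instead of replacing it. The "${...}" syntax and the functions expand and assign are hypothetical; they are not taken from Configurator.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

using config_map = std::map<std::string, std::string>;

// Replaces every "${Name}" in raw with the current value of option Name.
std::string expand(const std::string& raw, const config_map& known) {
    std::string out;
    std::size_t pos = 0;
    while (pos < raw.size()) {
        const std::size_t open = raw.find("${", pos);
        if (open == std::string::npos) {
            out.append(raw, pos, std::string::npos);  // no more references
            break;
        }
        out.append(raw, pos, open - pos);             // literal text before "${"
        const std::size_t close = raw.find('}', open + 2);
        if (close == std::string::npos) {
            throw std::runtime_error("unterminated reference in: " + raw);
        }
        const std::string name = raw.substr(open + 2, close - (open + 2));
        const auto it = known.find(name);
        if (it == known.end()) {
            throw std::runtime_error("unknown option referenced: " + name);
        }
        out += it->second;
        pos = close + 1;
    }
    return out;
}

// Expands first and stores afterwards, so an option may refer to its own
// previous value but never to a value that does not exist yet.
void assign(config_map& values, const std::string& name, const std::string& raw) {
    std::string expanded = expand(raw, values);
    values[name] = std::move(expanded);
}

void example_usage() {
    config_map values;
    assign(values, "Path", "/usr/bin");
    assign(values, "Path", "${Path}:/opt/tool/bin");  // augment, do not replace
    assert(values["Path"] == "/usr/bin:/opt/tool/bin");
}
```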

_____
Rob Stewart                           robert.stewart@sig.com
Software Engineer, Core Software      using std::disclaimer;
Susquehanna International Group, LLP  http://www.sig.com


Denis Shevchenko

8:43 p.m.

On 29.11.2010 22:39, Stewart, Robert wrote:

...

> One cannot reparse what hasn't been parsed. I suggest "reparse" and "parse" or "reload" and "load" for those two functions. (The latter pair is slightly easier to say.)

Hello Robert!

"Parse" and "load". Strictly speaking, you are right, because parsing is just one of the steps in working with the configuration file. So from a technical point of view, the name "load" is preferable to "parse". Thanks.

...

Have you considered the ability to parse multiple files thereby augmenting the data with each new file's content rather than replacing what had been parsed already?

I confess, I had not thought about it. Do you think this feature is really useful?

...

> That, in turn, means that reloading or replacing content must invalidate outstanding references/subscriptions

I do not use references. If the configuration file contains the option "Host":

   ------------------------
   Host 44.67.42.90
   ------------------------

and we register it (with configurator being a cf::configurator object):

   configurator.add_option( "Host" ).default_value( "127.0.0.1" );

then after parsing we have a cf::option object with the name "Host" and the value "44.67.42.90" (that is, in fact, two std::string objects). If the user obtains its value:

   std::string host = configurator.get_value( "Host" );

he just gets a copy of "44.67.42.90". If the configuration is then changed:

   ------------------------
   Host 56.89.45.44
   ------------------------

and the user calls the reparse() function, the value of the corresponding cf::option object (with the name "Host") is replaced by "56.89.45.44". The user can then obtain the (already new) value:

   std::string host = configurator.get_value( "Host" );

And if the user changes the configuration file again and comments out the "Host" option:

   ------------------------
   // Host 56.89.45.44
   ------------------------

then calls the reparse() function and obtains the value:

   std::string host = configurator.get_value( "Host" );

he gets a copy of the initial default value "127.0.0.1".

- Denis
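The behaviour described in this message (values handed out as copies, replaced on reparse, falling back to the default when the option is commented out) can be imitated with a few lines. This is a stand-in written for illustration only: tiny_config is a hypothetical class, it parses a string rather than a file, it assumes a space as the name-value separator, and it is not the Configurator implementation.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

class tiny_config {
public:
    void add_option(const std::string& name, const std::string& default_value) {
        defaults_[name] = default_value;
    }

    // Parses "Name Value" lines. Every call starts from scratch, so values
    // from an earlier parse never leak into the new state.
    void parse(const std::string& text) {
        std::map<std::string, std::string> fresh;
        std::istringstream lines(text);
        std::string line;
        while (std::getline(lines, line)) {
            std::istringstream fields(line);
            std::string name;
            std::string value;
            if (!(fields >> name) || name.compare(0, 2, "//") == 0) {
                continue;  // blank line or single-line comment
            }
            if (defaults_.count(name) == 0) {
                throw std::runtime_error("unknown option: " + name);
            }
            if (!(fields >> value)) {
                throw std::runtime_error("option without value: " + name);
            }
            fresh[name] = value;
        }
        parsed_.swap(fresh);
    }

    // Returns a copy, so callers never hold a reference that a later
    // parse could invalidate.
    std::string get_value(const std::string& name) const {
        const auto it = parsed_.find(name);
        return it != parsed_.end() ? it->second : defaults_.at(name);
    }

private:
    std::map<std::string, std::string> defaults_;
    std::map<std::string, std::string> parsed_;
};

void example_usage() {
    tiny_config config;
    config.add_option("Host", "127.0.0.1");

    config.parse("Host 44.67.42.90\n");
    assert(config.get_value("Host") == "44.67.42.90");

    config.parse("Host 56.89.45.44\n");     // file edited, then reparsed
    assert(config.get_value("Host") == "56.89.45.44");

    config.parse("// Host 56.89.45.44\n");  // option commented out
    assert(config.get_value("Host") == "127.0.0.1");
}
```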

Stewart, Robert

9:01 p.m.

Denis Shevchenko wrote:

...

On 29.11.2010 22:39, Stewart, Robert wrote:

...
Have you considered the ability to parse multiple files thereby augmenting the data with each new file's content rather than replacing what had been parsed already?

I confess, I had not thought about it. Do you think this feature is really useful?

Yes. We use it constantly. In one configuration system, files are segregated by user, machine, domain, etc., and appropriate files are selected at runtime, if found. In another, the files are grouped by library or external entity for which configuration is needed and a list of files is applied to a given executable. BTW, one must control overriding configuration data. By default, I recommend that redefining a configuration value be an error. Then, use some explicit syntax that indicates the desire to override/define a configuration value. Having that, it can also be useful to have syntax that overrides a configuration value if it exists or ignores the override otherwise. _____ Rob Stewart robert.stewart@sig.com Software Engineer, Core Software using std::disclaimer; Susquehanna International Group, LLP http://www.sig.com
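The override policy recommended above can be sketched as three assignment modes. How each mode would be spelled in a configuration file is left open here; the enum assign_mode and the function apply are hypothetical names chosen for this illustration.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

using config_map = std::map<std::string, std::string>;

enum class assign_mode {
    define,             // plain assignment: the option must not exist yet
    override_value,     // explicit override: replace the value, or define it
    override_if_exists  // replace only if already defined, otherwise ignore
};

// Applies one assignment read from a configuration file to the values
// accumulated from the files loaded before it.
void apply(config_map& values, const std::string& name,
           const std::string& value, assign_mode mode) {
    const bool exists = values.count(name) != 0;
    switch (mode) {
    case assign_mode::define:
        if (exists) {
            throw std::runtime_error("option redefined without override: " + name);
        }
        values[name] = value;
        break;
    case assign_mode::override_value:
        values[name] = value;
        break;
    case assign_mode::override_if_exists:
        if (exists) {
            values[name] = value;
        }
        break;
    }
}

void example_usage() {
    config_map values;
    apply(values, "Host", "127.0.0.1", assign_mode::define);              // base file
    apply(values, "Host", "10.0.0.1", assign_mode::override_value);       // user file
    apply(values, "Proxy", "10.0.0.9", assign_mode::override_if_exists);  // ignored
    assert(values.at("Host") == "10.0.0.1");
    assert(values.count("Proxy") == 0);
}
```

Making redefinition an error by default means that an accidental clash between two files is reported instead of silently changing behaviour.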

Denis Shevchenko

30 Nov

6:20 p.m.

Hello friends! If someone wants to try out my library at work, I have already uploaded it to the Boost Vault under the name 'configurator-0.9.2.zip'. I apologize for the miserable documentation; it will improve in the near future. I would be very grateful for any comments, critique and bug reports. - Denis

Sohail Somani

28 Nov

4:46 p.m.

On 10-11-26 4:04 PM, Stephen Nuchia wrote:

...

...
...
From: Denis Shevchenko [mailto:for.dshevchenko@gmail.com] Is there any interest in a library for configuration file parsing?

Having done this once, just before TCL was announced, I won't ever do it again. Ousterhout's reasoning is, in my opinion, unassailable. Configuration files might as well be written in a full-featured, widely-understood embedded scripting language.

http://www.stanford.edu/~ouster/cgi-bin/papers/tcl-usenix.pdf

That would be Python now, right?

I've switched to using Python for configuration files. Among other things, the main benefit is that it allows the user to do weird crap you never thought of and never need to think of. It's good being able to say yes to nearly every request in relation to configuration files without writing more code :-) But why stop at configuration files? Application extensions can also be written in Python. -- Sohail Somani -- iBlog : http://uint32t.blogspot.com iTweet: http://twitter.com/somanisoftware iCode : http://bitbucket.org/cheez

Pierre Morcello

5:30 p.m.

Concerning the OP, I think this looks like an opportunity to do an interesting boost.spirit example/sample. Best regards, Pierre Morcello

Bjørn Roald

9:07 a.m.

On 11/25/2010 05:59 PM, Denis Shevchenko wrote:

...

Hi all!

Is there any interest in a library for configuration file parsing?

Yes, but why do you propose a new library rather than proposing extensions to existing ones? I really would like a better separation of how this sort of data is stored externally and how it is accessed internally by application code through the library API. Both program options and serialization provide that to some degree. I am not sure how this is with PropertyTree, which is a third Boost alternative. I think program options lacks some of the desired flexibility for structure, necessity and validation, but that could possibly be added rather than making an all-new library, which will confuse me and others even more. In any case, thanks for working on this, -- Bjørn

participants (23)

Bjørn Roald
Dean Michael Berris
Denis Shevchenko
Giorgio Zoppi
Hal Finkel
Hartmut Kaiser
Jeff Benshetler
Jeff Flinn
Joel de Guzman
Marsh Ray
Michael Caisse
Mika Heiskanen
Paul A. Bristow
Pierre Morcello
Rob Riggs
Robert Ramey
Sebastian Redl
sguazt
Sohail Somani
Stephen Nuchia
Stewart, Robert
Torri, Stephen CIV NSWCDD, W15
Vladimir Prus