[metaparse] Review period starts May 25th and ends June 7th - ongoing
Dear all, The review of the metaparse library started last Monday. Please consider taking the time to review it and post comments or reviews on this list. Thanks, Christophe ------------------------------------------------------------------------------------------ Dear all, The review of the Metaparse library starts next Monday, May 25th and ends June 7th. Metaparse was written by Abel Sinkovics. Metaparse is a parser generator library for template metaprograms. The purpose of this library is to support the creation of parsers that parse at compile time. This library is intended to be used for embedded domain specific language creation for C++ and can help libraries provide a better interface. It is similar to Boost.Spirit, however while parsers built with Spirit parse at run-time, parsers built with Metaparse parse at compile-time. It makes it possible for library authors to provide an interface that follows the common notation of the problem domain of the library. (eg. SQL queries, grammars, etc written in their common notation instead of similar-looking C++ expressions). For example there is a (yet incomplete) interface for Boost.Spirit that makes it possible to write MPLLIBS_REGEX("^[abcd][3-8]\\.foo$") instead of bos >> set[as_xpr('a')|'b'|'c'|'d'] >> range('3','8') >> '.' >> 'f' >> 'o' >> 'o' >> eos and make the parser built with Metaparse generate the latter expression from the former one. (It can be found here: https://github.com/istvans/mpllibs/tree/master/libs/xlxpressive) Since the library is based on template metaprogramming, the DSLs can be used for type validation as well. This is demonstrated by the type-safe printf implementation ( http://abel.web.elte.hu/mpllibs/safe_printf/index.html). It type-checks the arguments of a printf call based on the format string and has zero runtime overhead compared to unchecked printf. Parsers can be constructed in a declarative manner and (even though this is template metaprogramming based) remain readable. The implementation reflects the grammar it parses (with additional elements for semantic actions). Even though the library requires the parser author to write template metaprograms, the development of parsers is well supported by tools like Metashell and MDB. An important aspect of parsers (built with this library or another one) is error reporting for invalid input. The library offers tools for the parser authors to be able to provide useful error messages in case of parsing errors. Here is an example error report by a parser built with Metaparse (the parser can be found in the Getting Started guide of the library): ..... x__________________PARSING_FAILED__________________x<1, 5, unpaired<1, 1, literal_expected<')'>>> .... It is an error report letting the developer know that a closing paren is missing at line 1, column 5 and the unclosed opening paren is in line 1, column 1. Metaparse can be downloaded from Github: https://github.com/sabel83/mpllibs/tree/master/mpllibs/metaparse And a very complete tutorial: http://abel.web.elte.hu/mpllibs/metaparse/getting_started.html The tutorial also offers the nice feature of a link to metashell for quick-starting trying out the library: http://abel.web.elte.hu/shell/metashell.html Boost.Msm v3 also implements a new compile-time strings based front-end called eUML2 to demonstrate the power of the library and the conciseness and expressiveness it allows. Please have a look at the documentation: https://htmlpreview.github.io/?https://raw.githubusercontent.com/boostorg/ms... Everybody on this list is invited to participate in this formal review. I hope to see your review of Metaparse, including your vote on acceptance, and I welcome your participation in the ensuing discussions on the Boost developers' mailing list. Please always include in your review a vote as to whether the library should be accepted into Boost. Additionally please consider giving feedback on the following general topics: - What is your evaluation of the design? - What is your evaluation of the implementation? - What is your evaluation of the documentation? - What is your evaluation of the potential usefulness of the library? - Did you try to use the library? - With what compiler? - Did you have any problems? - How much effort did you put into your evaluation? A glance? A quick reading? In-depth study? - Are you knowledgeable about the problem domain? Regards, Christophe Review Manager
Christophe Henry
Dear all,
The review of the metaparse library started last Monday. Please consider taking the time to review it and post comments or reviews on this list.
[...]
Abel, I am currently going through Metaparse's tutorial, and here are some questions and comments I have so far. I will be posting more questions by replying to this message if some come up as I progress through the tutorial. I will provide a proper review of the library when I'm done. 1. What is the `mpl_` namespace? Is it documented in the tutorial for Metaparse? If it is a shortcut for `boost::mpl`, why the trailing underscore? 2. I personally am not using Metashell for the tutorial because it does not properly support the '<' and '>' characters on OS X (I think that's a shellinabox bug). As a result, I find it slightly difficult to follow the code blocks in the tutorial because of the newline escapes (`\`) and numbered names (`exp_parser16`) required for the code to be pasteable in Metashell. I would personnally prefer a more classical approach with normal code blocks not expecting the user to follow along in Metashell, but I realize this is a matter of personal taste. Regards, Louis
Hi Louis, On 2015-05-29 19:00, Louis Dionne wrote:
I am currently going through Metaparse's tutorial, and here are some questions and comments I have so far. I will be posting more questions by replying to this message if some come up as I progress through the tutorial. I will provide a proper review of the library when I'm done. Thank you for checking it.
1. What is the `mpl_` namespace? Is it documented in the tutorial for Metaparse? If it is a shortcut for `boost::mpl`, why the trailing underscore? It comes from boost::mpl and is not related to Metaparse:
#include
boost::mpl::int_<13> mpl_::int_<13>
This could be mentioned the first time it pops up in the tutorial.
2. I personally am not using Metashell for the tutorial because it does not properly support the '<' and '>' characters on OS X (I think that's a shellinabox bug). As a result, I find it slightly difficult to follow the code blocks in the tutorial because of the newline escapes (`\`) and numbered names (`exp_parser16`) required for the code to be pasteable in Metashell. I would personnally prefer a more classical approach with normal code blocks not expecting the user to follow along in Metashell, but I realize this is a matter of personal taste. I'm aware of a shellinabox bug related to Firefox. Chromium seems to handle it properly on Linux & Windows. Another option is installing Metashell locally (all you need are the Boost and the Metaparse headers on your header path).
Since the tutorial is about getting the right types as the result of the right metafunction calls (in this case the parsers), it is mostly about checking if you got the types you expected. You can use other, more classical approaches (eg. displaying the types in error messages, asserting for type equality, pretty printing the types, etc), but that is more involved in each iteration. (They could be probably mentioned in the tutorial though). Multi-line code examples: one option is adding a copy-paste friendly version of each code snippet to the tutorial, so the text remains pure html (no scripting). Another option is to use some (probably javascript-based) library that displays the code readable but also makes it easy to copy it. (I tried to avoid this as to the best of my knowledge the documentation is expected to be pure html). Regards, Ábel
On May 30, 2015, at 1:43 AM, Abel Sinkovics
Multi-line code examples: one option is adding a copy-paste friendly version of each code snippet to the tutorial, so the text remains pure html (no scripting). Another option is to use some (probably javascript-based) library that displays the code readable but also makes it easy to copy it. (I tried to avoid this as to the best of my knowledge the documentation is expected to be pure html).
For those using metashell, using the getting started guide is also somewhat complicated by the `...>` shell continuation prompt in the code quotations. After a couple of iterations, I found a decent workaround, which is to open the markdown source libs/metaparse/doc/getting_started.md in an editor, and remove `...>` from the document with search and replace. That way I can follow along in a document which does have the backslashes, and copy and paste with ease. Cheers, Gordon
On May 30, 2015, at 1:43 AM, Abel Sinkovics
wrote: Multi-line code examples: one option is adding a copy-paste friendly version of each code snippet to the tutorial, so the text remains pure html (no scripting). Another option is to use some (probably javascript-based) library that displays the code readable but also makes it easy to copy it. (I tried to avoid this as to the best of my knowledge the documentation is expected to be pure html). For those using metashell, using the getting started guide is also somewhat complicated by the `...>` shell continuation prompt in the code quotations.
After a couple of iterations, I found a decent workaround, which is to open the markdown source libs/metaparse/doc/getting_started.md in an editor, and remove `...>` from the document with search and replace.
That way I can follow along in a document which does have the backslashes, and copy and paste with ease. I've added a "copy-paste friendly version" link to the code examples in
Hi, On 2015-05-30 18:13, Gordon Woodhull wrote: the tutorial to make it easy to copy multi-line examples to Metashell. The html build of the docs at http://abel.web.elte.hu/mpllibs/metaparse has also been updated. If you've been using that, you need to hit refresh in your browser to get the updated version. Regards, Ábel
Louis Dionne
[...]
I am currently going through Metaparse's tutorial, and here are some questions and comments I have so far. I will be posting more questions by replying to this message if some come up as I progress through the tutorial. I will provide a proper review of the library when I'm done.
Abel, I have a few more comments and questions. 3. First, a non-technical point about the tutorial. I will reiterate my position about the format of the tutorial being slightly inadequate. I was around the end of the tutorial and I wanted to copy/paste everything we had so far so I could play with it. To do this, I had to copy/paste bits of solutions going up to the beginning of the tutorial. All in all, it took me about 15 minutes to put the right bits together and to get the final example compiling in a single file on my machine. The format of the tutorial is 100% incremental, but my reality when reading a tutorial is that I go to it and then come back, and I expect the milestones to be larger-grained than what Metaparse provides. I think a solution that would address 90% of my complaint would be to provide "what we got so far" code blocks on important milestones. That way, one could simply copy/paste the whole code block and get started right away. Other than this, I like the way you introduce concepts and I never felt lost reading the tutorial. Good job on the redaction. 4. I would like to see metaparse provide a master header including everything else (like `boost/metaparse.hpp`). I don't mean that such a header is terribly useful when actually coding, because you'll never want to pull the whole thing in production code. However, it makes the process of learning and hacking stuff around so much easier, cause you just have to #include the whole thing and get started. Since being easily teachable and hackable is an important feature of a library, I think a master header would be valuable. Also, it is a trivial change. 5. I would like to see charts showing compilation times in relation to the complexity of the parser as well as the length of the input. For example, I would like to see how the simple calculator example behaves at compile-time for different input sizes, and how the one with error handling does, and so on. However, since I know it's a total pain in the ass to set this up oneself (I did it for MPL11 and Hana), I could set it up for you, show you how to write the benchmarks with my framework and then you would do the rest of the work yourself. To make this more concrete, here's what I have been able to generate in about 10 minutes of work. It is a benchmark of the final parser we end up with (with error handling and all) for different sizes of input: http://postimg.org/image/buuz4988b/ I must agree this is not conclusive at all, but like I said I hacked it out in 10 minutes and I don't know the internals of the library. 6. While doing the above benchmarks, Metaparse threw up with strings of about 37 characters long. I believe this limitation has something to do with how the MPLLIBS_STRING macro is implemented. However, there are other alternatives to this macro when C++11 is available. See for example how I do it in Hana here: https://goo.gl/9dxVtx There's also a GNU extension supported by both Clang and GCC which allows compile-time strings to be defined using a user-defined literal. You can also see how Hana uses it through the above link. I think it would be a very, very valuable addition to the library to support longer strings via those techniques when the compiler supports it. Regards, Louis
Hi Louis, On 2015-06-03 23:42, Louis Dionne wrote:
Louis Dionne
writes: 3. First, a non-technical point about the tutorial. I will reiterate my position about the format of the tutorial being slightly inadequate. I was around the end of the tutorial and I wanted to copy/paste everything we had so far so I could play with it. To do this, I had to copy/paste bits of solutions going up to the beginning of the tutorial. All in all, it took me about 15 minutes to put the right bits together and to get the final example compiling in a single file on my machine. The format of the tutorial is 100% incremental, but my reality when reading a tutorial is that I go to it and then come back, and I expect the milestones to be larger-grained than what Metaparse provides.
I think a solution that would address 90% of my complaint would be to provide "what we got so far" code blocks on important milestones. That way, one could simply copy/paste the whole code block and get started right away. Other than this, I like the way you introduce concepts and I never felt lost reading the tutorial. Good job on the redaction. The "getting_started" example consists of headers that are milestones for the tutorial. For example to get all the definitions before section 5.2.1, you should include "5_2_1.hpp". This is mentioned at the beginning ("1.1 Testing environment") of the tutorial. Is that what you meant?
4. I would like to see metaparse provide a master header including everything else (like `boost/metaparse.hpp`). I don't mean that such a header is terribly useful when actually coding, because you'll never want to pull the whole thing in production code. However, it makes the process of learning and hacking stuff around so much easier, cause you just have to #include the whole thing and get started. Since being easily teachable and hackable is an important feature of a library, I think a master header would be valuable. Also, it is a trivial change. It is doable (actually, such master header already exists, but not as part of the library, but as part of the tests - all_headers.hpp). I'm happy to add it, but probably in a way to generate a warning from it.
5. I would like to see charts showing compilation times in relation to the complexity of the parser as well as the length of the input. For example, I would like to see how the simple calculator example behaves at compile-time for different input sizes, and how the one with error handling does, and so on. However, since I know it's a total pain in the ass to set this up oneself (I did it for MPL11 and Hana), I could set it up for you, show you how to write the benchmarks with my framework and then you would do the rest of the work yourself. To make this more concrete, here's what I have been able to generate in about 10 minutes of work. It is a benchmark of the final parser we end up with (with error handling and all) for different sizes of input: http://postimg.org/image/buuz4988b/ I must agree this is not conclusive at all, but like I said I hacked it out in 10 minutes and I don't know the internals of the library. I did that for regular expressions and printf. See:
http://abel.sinkovics.hu/download.php?fn=dsltemp.pdf, section 5.3 (printf) http://abel.sinkovics.hu/download.php?fn=dsltemp.pdf, slide 162 (regex and the string macro) They can (and will) be added to the documentation as well. Error handling has not been checked. If you have some reusable environment, I'd be glad to use it (I've done the measurements with ad-hoc scripts).
6. While doing the above benchmarks, Metaparse threw up with strings of about 37 characters long. I believe this limitation has something to do with how the MPLLIBS_STRING macro is implemented. However, there are other alternatives to this macro when C++11 is available. See for example how I do it in Hana here: https://goo.gl/9dxVtx There's also a GNU extension supported by both Clang and GCC which allows compile-time strings to be defined using a user-defined literal. You can also see how Hana uses it through the above link. I think it would be a very, very valuable addition to the library to support longer strings via those techniques when the compiler supports it. The maximum length of the string can be controlled by the MPLLIBS_LIMIT_STRING_SIZE macro (see http://abel.web.elte.hu/mpllibs/metaparse/MPLLIBS_STRING.html).
I've tried embedding the string literal in a lambda, but I couldn't instantiate a template with the type of the lambda (lambda in an unevaluated context). I haven't checked how to take advantage of non-standard compiler extensions where available, but it is a great idea that definitely worth investigation. Regards, Ábel
Abel Sinkovics
Hi Louis,
On 2015-06-03 23:42, Louis Dionne wrote:
Louis Dionne
writes: [...] The "getting_started" example consists of headers that are milestones for the tutorial. For example to get all the definitions before section 5.2.1, you should include "5_2_1.hpp". This is mentioned at the beginning ("1.1 Testing environment") of the tutorial. Is that what you meant?
Yes, this is what I meant. My bad. However, I still think it is not a really user-friendly way of making that code available. I'd prefer a more "standard" style of tutorial, where you can just copy/paste code from the webpage as you go, and into a source file not Metashell. However, this is a non technical point and I'd rather not lose time over it since I'm satisfied with the overall tutorial.
[...]
If you have some reusable environment, I'd be glad to use it (I've done the measurements with ad-hoc scripts).
Well, I use ad-hoc scripts too, but they are pretty cool because everything can be controlled from CMake and they generate Javascripts charts that can be embedded in the HTML documentation :-). Also, my benchmarks are run on Travis and pushed to a special branch on GitHub whenever I push to master, which is handy for keeping them up to date. I'll try to send you a pull request setting up the basic system, but that might have to wait until the end of my own review. Also, what will happen with the CMake-based build system in the new Metaparse repository? Regards, Louis
Abel Sinkovics
writes: Yes, this is what I meant. My bad. However, I still think it is not a really user-friendly way of making that code available. I'd prefer a more "standard" style of tutorial, where you can just copy/paste code from the webpage as you go, and into a source file not Metashell. However, this is a non technical point and I'd rather not lose time over it since I'm satisfied with the overall tutorial. What I'm thinking of doing is adding "what we have so far" links to the beginning of the sections (similar to the "copy-paste friendly version"
Well, I use ad-hoc scripts too, but they are pretty cool because everything can be controlled from CMake and they generate Javascripts charts that can be embedded in the HTML documentation :-). Also, my benchmarks are run on Travis and pushed to a special branch on GitHub whenever I push to master, which is handy for keeping them up to date. I'll try to send you a pull request setting up the basic system, but that might have to wait until the end of my own review. Also, what will happen with the CMake-based build system in the new Metaparse repository? So if I'm getting it right, your benchmarks are re-generated after every
Hi Louis, On 2015-06-04 00:27, Louis Dionne wrote: links) that point to pages containing only the code (so it is easy to copy them). Note that everything you can copy to Metashell, you can copy to a "regular" source file as well. The only difference is that it is your job to display the type then. push to GitHub? Very nice :) CMake build system: the Metaparse repository uses Boost's Jamfiles (no CMake). Regards, Ábel
Abel Sinkovics
Hi Louis,
[...]
What I'm thinking of doing is adding "what we have so far" links to the beginning of the sections (similar to the "copy-paste friendly version" links) that point to pages containing only the code (so it is easy to copy them).
That would be fine by me.
[...]
So if I'm getting it right, your benchmarks are re-generated after every push to GitHub? Very nice :)
Yes, that's right. It's lovely but it was a huge pain to set up.
CMake build system: the Metaparse repository uses Boost's Jamfiles (no CMake).
Unfortunately, my benchmarking setup requires CMake. Something might still be doable. Louis
Hi Louis, On 2015-06-05 03:22, Louis Dionne wrote:
CMake build system: the Metaparse repository uses Boost's Jamfiles (no CMake). Unfortunately, my benchmarking setup requires CMake. Something might still be doable. What are you planning to do with your benchmarking setup if your library gets accepted into Boost? Will your CMake and Boost build systems coexist?
Regards, Ábel
Louis Dionne
[...]
I am currently going through Metaparse's tutorial, and here are some questions and comments I have so far. I will be posting more questions by replying to this message if some come up as I progress through the tutorial. I will provide a proper review of the library when I'm done.
What follows is my official review of the Metaparse library ----------------------------------------------------------- I vote YES for the acceptation of Metaparse into Boost.
- What is your evaluation of the design?
I think the design is fine. Not an expert, but it looks like a classic parser combinator library, except the author had to deal with the quirks of C++ template metaprogramming. It is built using the well known and proven concepts established by the MPL: metafunctions, metafunction classes and type-level sequences, which is a safe bet. That being said, I would be really really curious to try and implement something similar with value-level syntax, similarly to the way Hana does compile-time computations with a value-level syntax. I think we might be able to achieve something with the same syntax as Spirit but for building compile-time parsers too. However, this is just my own curiosity speaking and it does not take anything away from Metaparse.
- What is your evaluation of the implementation?
I have not looked at the implementation in depth, but I think the author pretty much pushed C++03 to its limits. The only comment I would have would be to provide a modernized version of the MPLLIBS_STRING macro when the compiler supports it, because that is a huge bottleneck for using the library.
- What is your evaluation of the documentation?
Despite the concerns I raised in my other messages, I think the tutorial is very well written and is very instructive. I think it covers the necessary material to get started and I think it manages to do so quickly enough. The reference is simple but well organized and easy to read. There are usage examples for every component I looked at, which is really useful.
- What is your evaluation of the potential usefulness of the library?
I think the library can be useful, but it is obviously targeted towards library writers rather than application developers. I think Metaparse plays in the same field as Proto; they are "meta" libraries used to build other libraries.
- Did you try to use the library? - With what compiler? - Did you have any problems?
Yes. I cloned the mpllibs/ repository and was able to build it using the CMake-based build system. I built/ran all the unit tests and the examples with the following compilers: Clang 3.4.2, 3.5.2, 3.6.1, trunk GCC 5.1, trunk Everything went fine, except for a link error due to Boost.Test with C++14 compilers, but I don't think that's Metaparse's fault. I was building against Boost 1.58. The whole process was very easy; it took me under 5 minutes to set it up. I did not try to use the Boost.Build system.
- How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
I read all the "Getting Started" part of the documentation, and I had a look at the reference too. I looked a bit at the implementation itself, but not for very long. I played with the final calculator example on my own machine for about an hour or so. All in all, I would say I have spent +6h on this review.
- Are you knowledgeable about the problem domain?
I have used Spirit and other "parser generators" in the past, so I am familiar to the domain of parsing. I have used Proto, MPL, Fusion and I have written the MPL11 and Hana libraries, so I have a good knowledge of the domain of metaprogramming. Regards, Louis
Hi Louis, Thank you for the review. On 2015-06-05 05:07, Louis Dionne wrote:
Louis Dionne
writes: - What is your evaluation of the design? That being said, I would be really really curious to try and implement something similar with value-level syntax, similarly to the way Hana does compile-time computations with a value-level syntax. I think we might be able to achieve something with the same syntax as Spirit but for building compile-time parsers too. However, this is just my own curiosity speaking and it does not take anything away from Metaparse. I agree that it would be an interesting experiment to see how far we can get with that approach (and/or if the current Metaparse can be extended with such techniques).
- What is your evaluation of the implementation? I have not looked at the implementation in depth, but I think the author pretty much pushed C++03 to its limits. The only comment I would have would be to provide a modernized version of the MPLLIBS_STRING macro when the compiler supports it, because that is a huge bottleneck for using the library.
That will be done (probably with relying on Boost.Config heavily to find the best suitable implementation).
- Did you try to use the library? - With what compiler? - Did you have any problems? Yes. I cloned the mpllibs/ repository and was able to build it using the CMake-based build system. I built/ran all the unit tests and the examples with the following compilers: Clang 3.4.2, 3.5.2, 3.6.1, trunk GCC 5.1, trunk Nice test environment :)
Regards, Ábel
As this is a formal review, there are some formalia that does not appear to be in place. I apologize in advance for the pedantry. The code base does not follow the Boost directory structure, nor does it appear to use Boost.Build. The code is not located within a boost namespace. The documentation does not appear to be in QuickBook. More importantly, it is unclear to me exactly how much is part of the review. For instance, are the examples [1] part of the submission? You only point to the "Getting Started" documentation, but what about the rest of the documentation? Are both v1 and v2 part of the submission (and if so, why does it contain v1?) [1] https://github.com/sabel83/mpllibs/tree/master/libs/metaparse/example
Hi Bjorn, On 2015-05-31 12:41, Bjorn Reese wrote:
The code base does not follow the Boost directory structure, nor does it appear to use Boost.Build. When the project was created, it did follow the directory structure Boost had at the time (but was using the mpllibs namespace instead of boost to make it clear that it is not part of Boost). It is part of a repository that follows the old directory structure (there are a few other libs in the same project as well). I did not make the same structural change in Mpllibs that Boost did, however, if Metaparse gets accepted I'll move it to its own repository and change the namespace, etc (apply other changes that are needed for the integration).
The code is not located within a boost namespace. It is located in mpllibs::metaparse. It was intentionally designed in a way that in case it gets accepted, it can be (easily for the developers and users of the library) moved to the boost::metaparse namespace.
The documentation does not appear to be in QuickBook. The documentation is in MarkDown format, from which static HTML files can be automatically generated, which is (to the best of my knowledge) the requirement for the format of the documentation.
More importantly, it is unclear to me exactly how much is part of the review. For instance, are the examples [1] part of the submission? You only point to the "Getting Started" documentation, but what about the rest of the documentation? Are both v1 and v2 part of the submission (and if so, why does it contain v1?) The structure of the mpllibs repository follows the old structure of the Boost repository and from that the metaparse library is part of the review (the content of the mpllibs/metaparse and libs/metaparse directories recursively). Yes, the examples and the entire Metaparse documentation (the markdown source of the documentation is in the libs/metaparse/doc directory, the generated HTML files can be viewed at http://abel.web.elte.hu/mpllibs/metaparse) are part of the review.
Good question about v1 and v2. The library is currently part of Mpllibs, where I have already released 1.0.0 and I've made breaking API changes to Metaparse since then. That is why I have the v2 namespace. If the library gets accepted to Boost, it is safe to start the versioning over (as there is a namespace change anyway) and drop the "old" v1, so in that sense only v2 is part of the review. Regards, Ábel
2015-05-28 15:12 GMT+02:00 Christophe Henry
Dear all,
The review of the metaparse library started last Monday. Please consider taking the time to review it and post comments or reviews on this list.
This is not a review. Just a remark. All reviews I attended so far had the following structure: The author presented the library in the shape that he considered ready to be pushed to Boost repository and shipped. The reviewers decided if they wanted the library in the given shape. This time, the situation looks different. It looks like the contract is, "if you the library is accepted, it will be changed to the following...", and to me it is not clear enough what I am reviewing: the library in the current shape, or the promise of something else. I believe that the library with such capabilities deserves its place in Boost. Even macro MPLLIBS_STRING() alone is a useful (and impressive) addition. But as it is, it does not even meet Boost naming conventions, (should probably be BOOST_MPL_STRING()). It took me a long while to figure out what the library does. I have seen examples of compile-time regexp and safe printf, but as it turns out, this is not what the proposed library is. I understand that the author is reluctant to invest time in Boostifying the library unless he knows that the library is accepted. For encouragement, I can say I would vote for inclusion, if what was proposed, were a Boostified library in the shape that is intended to go into Boost repository. Regards, &rzej
Hi Andrzej, On 2015-06-01 13:57, Andrzej Krzemienski wrote:
I believe that the library with such capabilities deserves its place in Boost. Even macro MPLLIBS_STRING() alone is a useful (and impressive) addition. But as it is, it does not even meet Boost naming conventions, (should probably be BOOST_MPL_STRING()). I'm calling it MPLLIBS_STRING (or BOOST_STRING after the renaming). BOOST_MPL_STRING would suggest that it is part of Boost.MPL (and it is not). It could be called BOOST_METAPARSE_STRING. I'd find it uncomfortably long, however if it is a concern, I'm happy to rename it.
Regards, Ábel
-----Original Message----- From: Boost [mailto:boost-bounces@lists.boost.org] On Behalf Of Abel Sinkovics Sent: 01 June 2015 22:32 To: boost@lists.boost.org Subject: Re: [boost] [metaparse] Review period starts May 25th and ends June 7th - ongoing
Hi Andrzej,
On 2015-06-01 13:57, Andrzej Krzemienski wrote:
I believe that the library with such capabilities deserves its place in Boost. Even macro MPLLIBS_STRING() alone is a useful (and impressive) addition. But as it is, it does not even meet Boost naming conventions, (should probably be BOOST_MPL_STRING()). I'm calling it MPLLIBS_STRING (or BOOST_STRING after the renaming). BOOST_MPL_STRING would suggest that it is part of Boost.MPL (and it is not). It could be called BOOST_METAPARSE_STRING. I'd find it uncomfortably long, however if it is a concern, I'm happy to rename it.
BOOST_METAPARSE_STRING is clear (and not much longer than MPLLIBS_STRING ;-) Paul --- Paul A. Bristow Prizet Farmhouse Kendal UK LA8 8AB +44 (0) 1539 561830
Hi Paul, On 2015-06-02 11:57, Paul A. Bristow wrote:
I'm calling it MPLLIBS_STRING (or BOOST_STRING after the renaming). BOOST_MPL_STRING would suggest that it is part of Boost.MPL (and it is not). It could be called BOOST_METAPARSE_STRING. I'd find it uncomfortably long, however if it is a concern, I'm happy to rename it. BOOST_METAPARSE_STRING is clear (and not much longer than MPLLIBS_STRING ;-) Yes, that is true :)
Maybe BOOST_METASTRING (as metastring is less general than string) Regards, Ábel
On 05/28/2015 03:12 PM, Christophe Henry wrote:
Please always include in your review a vote as to whether the library should be accepted into Boost.
I vote yes for acceptance into Boost.
- What is your evaluation of the design?
Clean parser combinator design.
- What is your evaluation of the implementation?
Well-structured and easy to read.
- What is your evaluation of the documentation?
Very thorough and of a high standard. What kind of grammar can be written with Metaparse? I fould a reference to EBNF; does this mean that I can write context-free grammars? If so, how is left-recursion [1] handled? I would be nice to have this kind of information in the documentation. The "Getting Started" chapter has much more detail about the internal machinery than I would expect from an introduction. I found the "User Manual" chapter to be better as an introduction. I still think all the content in "Getting Started" is useful, but it may be an idea to restructure these chapters. Furthermore, the tutorial links to an external document. [1] http://en.wikipedia.org/wiki/Left_recursion
- What is your evaluation of the potential usefulness of the library?
The library seems mostly to target embedded DSLs, but I would be more inclined to choose, say, sqlpp11 for embedded SQL than a text-based SQL approach. That said, others may find text-based expressions useful. I still see the ability to transform string literals into types and values as very useful, and something that could find surprising applications. For instance, I would love to see something that could generate a perfect hash function for a set of strings at compile-time. Even a hash parser to turn a string into a hash value could be useful.
- Did you try to use the library? - With what compiler? - Did you have any problems?
I ran the test suite and a couple of examples with gcc 4.8. No problems were encountered.
- How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
A couple of hours spent reading the documentation, and another couple of hours looking through the code. My code reading mainly focussed on the string type, the int_ parser, and the traversal functions.
- Are you knowledgeable about the problem domain?
I have been writing DSL parsers for a living, although they were all run-time parsers. The following are minor comments directed more towards the author than the review manager. Why are the errors in separate headers? Implementations are found in the v1 folder, but their include guards have V2 in them (probably an oversight from the recent move to a separate repository.) The tests are not re-compiled the second time you run them. As this is a compile-time library, I was wondering if a re-compilation should be forced (if that is doable with Boost.Build.)
Hi Bjorn, Thank you for the review. On 2015-06-06 14:55, Bjorn Reese wrote:
What kind of grammar can be written with Metaparse? I fould a reference to EBNF; does this mean that I can write context-free grammars? If so, how is left-recursion [1] handled? I would be nice to have this kind of information in the documentation. You can write context free grammars.
The "Getting Started" chapter has much more detail about the internal machinery than I would expect from an introduction. I found the "User Manual" chapter to be better as an introduction. I still think all the content in "Getting Started" is useful, but it may be an idea to restructure these chapters. The "User manual" used to be the first "tutorial". However, what I've realised was that something as deep as "Getting Started" is needed for
Furthermore, the tutorial links to an external document. Correct. I used that as the material on a Metaparse introduction
More general (eg. context sensitive, unrestricted) grammars: it is difficult to tell if Metaparse supports them. There is for example the accept_when combinator which you can use to provide arbitrary predicates for enabling/disabling a specific rule. You can go as far as providing the Turing machine (as a metafunction) of the entire grammar as a predicate, so you can build parsers for unrestricted grammars you have a Turing machine for. By doing that you'd be (in my opinion) building a new parser library, however, using accept_when with simpler predicates makes perfect sense. If you close accept_when completely out, you are not really talking about Metaparse. Where is the limit that still makes sense and what type of grammars you can implement with that? I have no clear answer. Metaparse assumes that the parsers are deterministic, as they have only "one" result. You could of course write parsers and combinators that return a set (or list or some other container) of results as that "one" result, but (again) you'd be building a new parser library. Left-recursion: you can build top-down parsers with Metaparse and can not do left-recursion. Note that I'd encourage users to use the iterative parser combinators - eg. the folds - instead of direct recursion. I'll add a description to the documentation about this. people to confidently start writing parsers, so they have a solid understanding of what they are doing, which is important when things go wrong and they need to figure out how to fix the parser. The "Getting started" guide does not go into full details of the internal machinery, only to the extent I find it helpful for users of the library. After adding the short initial example (eg. the rational parser) to the documentation people who start reading the tutorial will already know that they are interested in the library, so going a bit slower and building up a solid understanding of what is really going on should be fine. training (in a summer school). I thought that it could be useful in itself as well as a tutorial to Metaparse (there was no "Getting started" at that time). I can make it more explicit in the documentation, that this is a link to an external site.
The library seems mostly to target embedded DSLs, but I would be more inclined to choose, say, sqlpp11 for embedded SQL than a text-based SQL approach. That said, others may find text-based expressions useful. I guess it depends on the DSL and the "hosting" project, how a DSL should be embedded.
The following are minor comments directed more towards the author than the review manager.
Why are the errors in separate headers? I consider (most of) them reusable (eg. index_out_of_range).
Implementations are found in the v1 folder, but their include guards have V2 in them (probably an oversight from the recent move to a separate repository.) I'll fix that.
The tests are not re-compiled the second time you run them. As this is a compile-time library, I was wondering if a re-compilation should be forced (if that is doable with Boost.Build.) I understand that comparing this to a runtime test suite might look strange, however the tests are the evaluation of a set of pure functions (always return the same result given the same arguments), so not recompiling (rerunning) them seems to make sense when nothing changes.
Regards, Ábel
On Thu, May 28, 2015 at 9:12 AM, Christophe Henry < christophe.j.henry@gmail.com> wrote:
Dear all,
The review of the metaparse library started last Monday. Please consider taking the time to review it and post comments or reviews on this list.
Thanks,
Christophe
[snip]
Everybody on this list is invited to participate in this formal review. I hope to see your review of Metaparse, including your vote on acceptance, and I welcome your participation in the ensuing discussions on the Boost developers' mailing list.
Please always include in your review a vote as to whether the library should be accepted into Boost.
Yes.
Additionally please consider giving feedback on the following general topics:
- What is your evaluation of the design?
I don't have much negative to say about the design, because the functionality is similar to mpl, which I am also familar with. I like the consistency of the interface(s). As others have noted, it would be interesting to try the newer style of TMP (dependent typing), because syntax similar to Spirit might be possible. Requiring an internal `::type` for result types also surprised me (also previously noted). The only other point is a nitpick (again, previously mentioned) - the `any` parser is surprisingly repetitive and not an alternation. - What is your evaluation of the implementation?
I looked at some of the implementation (I was curious). The parts I saw seemed well-written. Some thoughts: * Adding support for the Gcc character-literal expansion extension would be nice, because the strings macro has some intense pre-processor work. * Compiling in C++11 mode is noticeably quicker for some of the examples I tried (in clang and gcc) - due to the true variadic string implementation (no pre-processor)? ** I noticed some of the other areas used pre-processor variadics (one_of, etc.), could C++11 variadics improve compilation time here? ** Using boost::mpl::vector for sequences/repeats feels ancient, could this be moved to a mpl conformant C++11 variadic implementation too? mpl has pre-processed headers, but still have lots of symbols/code for the compiler to parse and manage for the variations. Since the documentation states that the repetition/sequence based parsers return a "implementation-defined" mpl sequence, could this data structure be swapped in c++11 mode? Unfortunately I can't find a compile-time performance comparison of mpl::vector and a similar data structure using C++11 variadics. - What is your evaluation of the documentation?
I find the documentation suitable, but I also skimmed the tutorial and went to the libray reference instead. After looking at a few examples, I started "hacking" at it. The diagrams in the documentation were also nice. - What is your evaluation of the potential usefulness of the library?
- Did you try to use the library? - With what compiler?
Yes. With gcc 4.8 and clang 3.5. I have a partially implemented a printf format string to spirit::karma expression tree conversion utility that uses metaparse. The library worked exactly as expected everytime, the difficulties were mainly in the complexities of the printf specification. - Did you have any problems?
The compile-times I experienced were not great. A decent portion may have been due to the usage of spirit, another library known to have significant compile times. The (somewhat) simple calculator example still takes ~2.5s to compile in clang 3.5 C++11 mode. This was on an older laptop, maybe I should just upgrade. Although it would be nice to see if there are ways to improve on the compilation time - otherwise people might avoid this library. The framework that Louis mentioned previously seems worthwhile. - How much effort did you put into your evaluation? A glance? A quick
reading? In-depth study?
A decent amount of effort considering how much of the library I used for my printf->spirit utility. A little less on the tutorial and getting started part of the documentation. - Are you knowledgeable about the problem domain?
I'm fairly knowledgable of TMP techniques, but new to compile-time parsing. Its difficult for me to say whether a better solution exists for this specific problem. Lee
Hi Lee, Thank you for the review. On 2015-06-08 04:34, Lee Clagett wrote:
* Adding support for the Gcc character-literal expansion extension would be nice, because the strings macro has some intense pre-processor work. Do you mean multi-character literals? (eg. 'abcd'?)
* Compiling in C++11 mode is noticeably quicker for some of the examples I tried (in clang and gcc) - due to the true variadic string implementation (no pre-processor)? I'm not aware of a speed gain in switching to C++11 mode. Could you point me to the example and the compiler version/flags you were using? I'd like to learn more about where that comes from (it could be useful to reduce compilation times).
What I'm aware of is that some of the examples only print "Please use a compiler that support constexpr" when the string macro is not supported (eg. calculator_with_parens_and_unary_ops example). Because of this, in pre-C++11 mode, they compile quicker as they don't run the metaprograms.
** I noticed some of the other areas used pre-processor variadics (one_of, etc.), could C++11 variadics improve compilation time here? I'm planning to add C++11 support to those places. Mainly to remove the length limit (eg. the number of cases one can use in one_of), but there might be speed improvements as well (I'll need to check).
** Using boost::mpl::vector for sequences/repeats feels ancient, could this be moved to a mpl conformant C++11 variadic implementation too? mpl has pre-processed headers, but still have lots of symbols/code for the compiler to parse and manage for the variations. Since the documentation states that the repetition/sequence based parsers return a "implementation-defined" mpl sequence, could this data structure be swapped in c++11 mode? Unfortunately I can't find a compile-time performance comparison of mpl::vector and a similar data structure using C++11 variadics. Yes, it definitely could and should be replaced where possible. The question is "how?".
I've been using mpl::vector as it works on older platforms as well and I've been waiting for a variadic template-based (meta)container to be added to Boost (I didn't want other dependencies and have not investigated building it myself). The interface is based on MPL's sequence abstraction, so as long as the thing returned supports it (the docs could be more specific on the type of sequence), it should be swappable when further/better containers become available.
The compile-times I experienced were not great. A decent portion may have been due to the usage of spirit, another library known to have significant compile times. The (somewhat) simple calculator example still takes ~2.5s to compile in clang 3.5 C++11 mode. This was on an older laptop, maybe I should just upgrade. Although it would be nice to see if there are ways to improve on the compilation time - otherwise people might avoid this library. The framework that Louis mentioned previously seems worthwhile. I'll investigate how to make compilation faster.
Regards, Ábel
On Mon, Jun 8, 2015 at 1:03 AM, Abel Sinkovics
Hi Lee,
Thank you for the review.
On 2015-06-08 04:34, Lee Clagett wrote:
* Adding support for the Gcc character-literal expansion extension would be nice, because the strings macro has some intense pre-processor work.
Do you mean multi-character literals? (eg. 'abcd'?)
I thought Louis mentioned this [ https://github.com/ldionne/hana/blob/master/include/boost/hana/string.hpp#L8... ]. No need for the pre-processor with the extension.
* Compiling in C++11 mode is noticeably quicker for some of the examples I
tried (in clang and gcc) - due to the true variadic string implementation (no pre-processor)?
I'm not aware of a speed gain in switching to C++11 mode. Could you point me to the example and the compiler version/flags you were using? I'd like to learn more about where that comes from (it could be useful to reduce compilation times).
What I'm aware of is that some of the examples only print "Please use a compiler that support constexpr" when the string macro is not supported (eg. calculator_with_parens_and_unary_ops example). Because of this, in pre-C++11 mode, they compile quicker as they don't run the metaprograms.
The calculator example. My Linux box has an older core 2 (7+ years old), and the OSX box has a much newer chip. The compiler options were just `(clan)g++ -I ~/code/mpllibs main.cpp` with `-std=c+11` added for cpp11. Compiler c++03 c++11 ------------------------------------------------- g++ 4.8.4 Gentoo Linux ~10.5s ~3.5s clang++ 3.5 Gentoo Linux ~5.8s ~2.6s clang++ 3.5 OSX ~2.1s ~1.0s These are repeatable, so weird caching effects should be minimal. It also seems unlikely that the flag would introduce faster code paths internally in both compilers. The interesting part comes when adding -E to run the pre-processor only. I timed this while redirecting output to /dev/null. Not the greatest test, but still interesting: Compiler c++03 c++11 ------------------------------------------------- g++ 4.8.4 Gentoo Linux ~9.2s ~2.1s clang++ 3.5 Gentoo Linux ~4.7s ~1.3s clang++ 3.5 OSX ~1.6s ~0.9s The g++ output is 41kb smaller in cpp11 mode. I looked at the output to find the difference, but its not easy when both are 1.5MB of text. `diff` is not helpful. At first glance the mpl includes seem to be the same (I didn't spot any pre-processed includes cheating). Also FWIW, the Linux box is using a standard boost 1.55 release, while the OSX is using a master checkout of boost from several weeks ago. Lee
Hi Lee, On 2015-06-08 16:29, Lee Clagett wrote:
* Adding support for the Gcc character-literal expansion extension would be nice, because the strings macro has some intense pre-processor work.
Do you mean multi-character literals? (eg. 'abcd'?)
I thought Louis mentioned this [ https://github.com/ldionne/hana/blob/master/include/boost/hana/string.hpp#L8... ]. No need for the pre-processor with the extension. Now I see what you mean. I'll definitely check that.
I'm not aware of a speed gain in switching to C++11 mode. Could you point me to the example and the compiler version/flags you were using? I'd like to learn more about where that comes from (it could be useful to reduce compilation times).
The calculator example. Wow, I'll take a closer look. Thank you for pointing this out.
Regards, Ábel
On 2015-05-28 15:12, Christophe Henry wrote:
Dear all,
The review of the metaparse library started last Monday. Please consider taking the time to review it and post comments or reviews on this list.
Thanks,
Christophe
------------------------------------------------------------------------------------------
Please always include in your review a vote as to whether the library should be accepted into Boost. Yes
Additionally please consider giving feedback on the following general topics:
- What is your evaluation of the design? Clean. - What is your evaluation of the implementation? Nicely structured and readable, as far as I can tell.
I believe that there is still room for improvement regarding the error messages in case of a parse error. I described the idea of a portable static_assert in my talk at MeetingC++ in Berlin last year: For each potential problem (i.e. for each of the BOOST_STATIC_ASSERTs that you now have), define a struct with a static method that fires a static_assert when called. In case of an error, transport this class from the problem-location to the TMP-call-site and call that static method. This way you can produce a very targeted message with a very short TMP call stack. Haven't looked deep enough to say if this pattern is actually applicable.
- What is your evaluation of the documentation? Very readable, nicely organized. - What is your evaluation of the potential usefulness of the library? Now that is an interesting question. Given the current compile times, I have no idea how it will be used in practice, but I am sure it will.
More importantly though, it is an excellent show case of what is possible given enough perseverance. Also, it might spur development in the core language towards features that might make such a library easier to write (and faster to compile). Part of the role of boost is to push compilers to the limit (and sometimes beyond). This library certainly has the capabilities to do so.
- Did you try to use the library? - With what compiler? - Did you have any problems? I tried a few of the examples in https://github.com/sabel83/metaparse using clang-3.5.1 -std=c++11.
The only problem I encountered was in example/parsing_error. It did not produce an error at first. I had to remove a comment. It turned out there was a spelling error in the comment leading to a compile error that was not intended, I guess. After fixing that, I think I saw the intended error message which I found a bit hard to grok. I was a bit surprised to see BOOST_STATIC_ASSERT in a C++11 environment.
- How much effort did you put into your evaluation? A glance? A quick reading? In-depth study? A quick reading of the documentation and bits of the code, plus some experiments with the tests provided.
Tempus fugit...
- Are you knowledgeable about the problem domain? As author of sqlpp11 I believe to have some grasp on DSLs. Also I did quite a few experiments on turning strings into templates. But I am certainly not a parser specialist.
Best, Roland
I believe that there is still room for improvement regarding the error messages in case of a parse error. I described the idea of a portable static_assert in my talk at MeetingC++ in Berlin last year: For each potential problem (i.e. for each of the BOOST_STATIC_ASSERTs that you now have), define a struct with a static method that fires a static_assert when called. In case of an error, transport this class from the problem-location to the TMP-call-site and call that static method. This way you can produce a very targeted message with a very short TMP call stack.
Haven't looked deep enough to say if this pattern is actually applicable. Metaparse uses classes (used as values in the metaprograms) representing
Hi Roland, Thank you for the review. On 2015-06-08 08:02, Roland Bock wrote: parsing errors. The library provides a few (eg. index_out_of_range, unpaired, etc) and users can create more (a helper macro is provided). The built-in ones are in the metaparse::error namespace. Parsing errors are propagated out of the top level parser and the user of the library can decide if he wants to handle them himself or uses build_parser, which emits a compilation error using static assertion. When someone builds the interface of his library using Metaparse, there might be template layers above the parser, in which case the author of those has to propagate the parsing errors out of those layers and then emit the compilation error at the top level to follow your advice. (he can do that with Metaparse). The parsing error classes have a static get_value method giving a string explanation of the error. Unfortunately, this can only be displayed at runtime (static assert's explanation has to be a string literal). What might be done based on your portable static assertion approach is adding a static method to these error classes representing the compilation errors to "static assert themselves" with better error messages. My concern about this is loosing the details of the error (position in the parsed text, the range boundaries and the index in case of an out of range error, etc). I'll check if "static assert yourself" can be done without loosing that information.
I tried a few of the examples in https://github.com/sabel83/metaparse using clang-3.5.1 -std=c++11.
The only problem I encountered was in example/parsing_error. It did not produce an error at first. I had to remove a comment. It turned out there was a spelling error in the comment leading to a compile error that was not intended, I guess. After fixing that, I think I saw the intended error message which I found a bit hard to grok. I'll fix the typo. The idea there is that the commented code fails to compile, however commenting that out and using debug_parsing_error can display the above mentioned (very user-friendly) text explanation of the error. I'll add some further explanation on this to the README of the example.
I was a bit surprised to see BOOST_STATIC_ASSERT in a C++11 environment. To the best of my knowledge BOOST_STATIC_ASSERT uses static_assert, when it is available. So using it works (well) everywhere and Metaparse does not need to deal with that aspect of the platform.
Regards, Ábel
Hi Abel, On 2015-06-08 21:26, Abel Sinkovics wrote:
On 2015-06-08 08:02, Roland Bock wrote:
I believe that there is still room for improvement regarding the error messages in case of a parse error. I described the idea of a portable static_assert in my talk at MeetingC++ in Berlin last year: For each potential problem (i.e. for each of the BOOST_STATIC_ASSERTs that you now have), define a struct with a static method that fires a static_assert when called. In case of an error, transport this class from the problem-location to the TMP-call-site and call that static method. This way you can produce a very targeted message with a very short TMP call stack.
Haven't looked deep enough to say if this pattern is actually applicable. [...] What might be done based on your portable static assertion approach is adding a static method to these error classes representing the compilation errors to "static assert themselves" with better error messages. My concern about this is loosing the details of the error (position in the parsed text, the range boundaries and the index in case of an out of range error, etc). I'll check if "static assert yourself" can be done without loosing that information. I had similar concerns at the beginning. It turned out (at least in my case) that the "lost information"
* might be relevant for the developer of the library * was just hiding the relevant things from the user Whether the same is true for Metaparse, I don't know.
I tried a few of the examples in https://github.com/sabel83/metaparse using clang-3.5.1 -std=c++11.
The only problem I encountered was in example/parsing_error. It did not produce an error at first. I had to remove a comment. It turned out there was a spelling error in the comment leading to a compile error that was not intended, I guess. After fixing that, I think I saw the intended error message which I found a bit hard to grok. I'll fix the typo. The idea there is that the commented code fails to compile, however commenting that out and using debug_parsing_error can display the above mentioned (very user-friendly) text explanation of the error. I'll add some further explanation on this to the README of the example.
Thanks for the explanation. That is a neat feature :-) Best, Roland
Hi Roland, On 2015-06-09 07:03, Roland Bock wrote: >> What might be done based on your portable static assertion approach >> is adding a static method to these error classes representing the >> compilation errors to "static assert themselves" with better error >> messages. My concern about this is loosing the details of the error >> (position in the parsed text, the range boundaries and the index in >> case of an out of range error, etc). I'll check if "static assert >> yourself" can be done without loosing that information. > I had similar concerns at the beginning. It turned out (at least in my > case) that the "lost information" > > * might be relevant for the developer of the library > * was just hiding the relevant things from the user > > Whether the same is true for Metaparse, I don't know. In case of Metaparse the information lost there is important for the users of the parsers. It has the answers to questions like: - where (in the DSL script) is the error coming from? (column, maybe line number) - information that is a template argument of a parser combinator. For example: when the parser finds a different character than what was expected in the DSL script, what was the expected character? eg. when the parser is parsing "()" or "[]" pairs and is waiting for the closing element, etc. In those cases the expected characters are template arguments, not hard-coded values of the parsers (and parser combinators) the library offers. And they should be part of the error report. Regards, Ábel
participants (9)
-
Abel Sinkovics
-
Andrzej Krzemienski
-
Bjorn Reese
-
Christophe Henry
-
Gordon Woodhull
-
Lee Clagett
-
Louis Dionne
-
Paul A. Bristow
-
Roland Bock