
It's all in the attachment. I had several other topics I wanted to explore; however, this will have to do for now.

NOTE: The following is based on a download on 2007-02-17 of proto.zip from the boost vault.

* What is your evaluation of the design?

Fair. One problem is the "mixing" of grammars with transforms. As stated here:

  <boost-root>/libs/proto/doc/html/boost_proto/users_guide/expression_transformation/canned_transforms/pass_through.html

  all expression generator meta-functions (For example, posit<>,
  shift_right<>, function<>, nary_expr<>, etc.) have a pass-through
  transform by default.

which means that when, for example, matches<posit<_>,Expr> is compiled, the associated pass-through transform is not used. If it's not used in matches, then it should be eliminated. Of course it is used when the transform is applied; however, a transform template class taking a grammar as its argument (the proto::when template seems appropriate) should be used to do that. This would make the programmer's intention clearer. It would also mean another template instantiation, thus slowing compile times; however, I think the time saved in understanding the code would compensate for that. Of course that's a judgment call.

This "unused transform" feature is especially obvious with and_. Only the last transform (call it Xn) associated with the list of and_ args can ever be used. If transforms were eliminated from the and_ args and and_ were used only for matches, then:

    when< and_<G1,...,Gn>, Xn >

could be used with the same effect as and_ with transforms associated with the args. Here, when is more obviously a transform, Xn, enabled by the condition and_<G1,...,Gn>.
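For example (a minimal sketch; G1, G2, and X are hypothetical names for grammars and a transform, not anything in proto):

    // and_ used purely as a grammar; the single usable transform, X,
    // is attached with when instead of being hung on the and_ args:
    struct MyCase
      : proto::when<
            proto::and_< G1, G2 >  // pure grammar: matches iff both match
          , X                      // the transform applied on a match
        >
    {};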
Hm..., but that "looks" (based on what "if" means in C++) more like proto::if_; however, I've never used proto::if_ and haven't read the proto::if_ docs; hence, maybe that conclusion is all wrong. I have read somewhere that proto::if_'s 1st argument is a transform, which seems odd when the name if_ would suggest it's a grammar with a noop (unused) transform.

What about proto::or_<T0,T1,...,Tn>? If the T0,T1,...,Tn were changed to "pure" grammars, G0,G1,...,Gn, then how would the *very* useful examples of transforms such as CalcArity in:

  <boost-root>/libs/proto/doc/html/boost_proto/users_guide/expression_transformation/example__calculator_arity_transform.html

be done? Well, as explained in the in-source comments to proto::switch_ in matches.hpp, proto::switch_<Cases> could be used instead. OOPS, looking at switch_ shows it contains:

    template<typename This, typename Expr, typename State, typename Visitor>
    struct result<This(Expr, State, Visitor)>
    {
        typedef typename Cases::template case_<typename Expr::proto_tag> which;
        typedef typename which::template result<void(Expr, State, Visitor)>::type type;
    };

which, along with the example code in <boost-root>/libs/proto/test/matches.cpp, shows that the only test is on the tag of the expression and not on the children. So... maybe switch_ should be renamed to switch_tag, and a new template, switch_gram, introduced which does what the current or_ does, except that the template arguments are instances of a new template, proto::case_, where proto::case_ is like the current proto::when except:

1) The first arguments don't have to have a transform associated with them.

2) There's a default 2nd argument to proto::case_, which is identity_t (the identity transform, which is the same as the transform part of the current proto::terminal):

    struct identity_t
    {
        template<typename Sig> struct result;

        template<typename This, typename Expr, typename State, typename Visitor>
        struct result<This(Expr, State, Visitor)>
        {
            typedef Expr type;
        };

        template<typename Expr, typename State, typename Visitor>
        Expr const &operator ()(Expr const &expr, State const &, Visitor &) const
        {
            return expr;
        }
    };
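(To be concrete, the proposed case_ might be little more than when with a defaulted 2nd argument. This is a speculative sketch, not existing proto code:)

    // case_: like proto::when, but the transform argument is optional
    // and defaults to the identity transform above.
    template< typename Grammar, typename Transform = identity_t >
    struct case_
      : proto::when< Grammar, Transform >
    {};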
So, with these additional templates, the CalcArity might be:

    struct CalcArity
      : switch_gram<
            case_< terminal_g< placeholder1 >, mpl::int_<1>() >
          , case_< terminal_g< placeholder2 >, mpl::int_<2>() >
          , case_< terminal_g< _ >, mpl::int_<0>() >
          , case_< unary_expr_g< _, CalcArity >, CalcArity(_arg) >
          , case_< binary_expr_g< _, CalcArity, CalcArity >
                 , mpl::max< CalcArity(_left), CalcArity(_right) >() >
        >
    {};

where: XXX_g, for XXX={terminal,unary_expr,binary_expr}, is the "grammar part" of the current XXX template.

However, even here there's a sort of mixture of grammar and transform. The "source" grammar of the transform is the one that appeared earlier on the example__calculator_arity_transform.html page (here renamed CalcGram, and with XXX replaced by XXX_g [see the "where:" paragraph under CalcArity above]):

    struct CalcGram
      : or_<
            terminal_g< placeholder1 >
          , terminal_g< placeholder2 >
          , terminal_g< _ >
          , unary_expr_g< _, CalcGram >
          , binary_expr_g< _, CalcGram, CalcGram >
        >
    {};

In order to completely separate the grammar from the transforms, something like an algebraic homomorphism:

  http://planetmath.org/encyclopedia/HomomorphismBetweenAlgebraicSystems.html

is needed. Each function name, w_A, in the source algebra in that reference would correspond to some grammar, G_w, in the or_<G1,G2,...,Gn>. The bad thing about that is that the function names would be sort of a duplication of the grammars, I think (I haven't really thought it through). The only way I can be more certain of this is to try and implement it; however, I'm already way late in doing this review, so that suggestion about complete separation of grammar from transform is just *highly* speculative.

* What is your evaluation of the implementation?

Not done.

* What is your evaluation of the documentation?

Good. However:

1) Some transform templates are barely documented, or not documented at all except in-source (e.g. switch_). There really needs to be a full TOC like the MPL one here:

  http://www.boost.org/libs/mpl/doc/refmanual/refmanual_toc.html

I almost always go there when I want to figure out what MPL templates do.

2) Some transforms are very formally documented. IOW, they come with almost a set of rules guiding the reader, step by step, through how the transforms work. Each step in the process is described by a table in the documentation of the outermost template. These tables have the "before" expression in the 1st column and the step's resulting expression in the 2nd column. This is very good; I used it to help me understand parts of proto. However, manual execution of the rules on a given expression is hard because the user has to mentally track the intermediate steps, and that's a lot of mental bookkeeping.

It would be very helpful if there were a "proto-debugger" which, given a source expression, executed each of these steps (noting which table is being executed), showed the correspondence between names in the "rules tables" and names in the source or intermediate expressions, and finally stopped when the expression could not be reduced further. Maybe even proto could be used to do this. A sort of "self-aware" proto ;) This ability to take an expression and simplify it using a set of rules would have uses in other domains, as shown by Task 3 in Markus' post:

  http://lists.boost.org/Archives/boost/2008/03/134376.php

There's a prototype of something similar (although no association is shown between rules and tables) for language grammar analysis in the vault:

  http://www.boost-consulting.com/vault/index.php?&directory=Strings%20-%20Text%20Processing

It just uses proto::eval to go through the steps until no further change is possible.

3) There needs to be a more detailed description of how the 1st argument (the expression argument) of proto::matches "relates" to the 2nd argument (the grammar argument). The description here:

  <boost-root>/libs/proto/doc/html/boost/proto/result_of/matches.html

only uses "expression grammars" but makes no mention of the proto meta-functions for generating expressions:

  <boost-root>/libs/proto/doc/html/boost_proto/users_guide/expression_construction/tags_and_meta_functions.html

Judging from the example of using a terminal grammar, terminal<X>, to generate an expression by appending ::type:

    terminal<X>::type

I jumped to the rash conclusion that a similar technique would work for binary_expr<Tag,L,R>. IOW, that an expression could be generated by:

    binary_expr<Tag,L,R>::type

However, this is only true if L and R are already expressions. This rash conclusion led to a time-consuming "debugging of my thinking", as evidenced by the thread:

  http://lists.boost.org/Archives/boost/2008/03/133715.php

Instead, the ::type appending must be propagated to all sub-grammar expressions, as done by the morphism template attached to:

  http://lists.boost.org/Archives/boost/2008/03/134964.php
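(To illustrate the pitfall concretely — a minimal sketch; proto::tag::plus and the namespace usage are as I understand them from the vault snapshot and may differ:)

    namespace proto = boost::proto;

    // terminal<> is a meta-function: appending ::type yields an
    // expression type directly.
    typedef proto::terminal< int >::type IntTerm;

    // binary_expr<Tag,L,R>::type, by contrast, compiles only when L
    // and R are themselves already expression types...
    typedef proto::binary_expr<
        proto::tag::plus, IntTerm, IntTerm
    >::type Sum;  // OK: children are expression types

    // ...so this fails to compile: int is not an expression type.
    // typedef proto::binary_expr< proto::tag::plus, int, int >::type Bad;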
Avoiding the rash conclusion mentioned above would probably be easier if proto grammars were prominently classified on some page into the following grammar subtypes:

1) expression grammars: These are the grammars appearing in the matches.html page mentioned above. These are also expression types; hence, they can also be used as the first argument to matches. In contrast to the meta-function grammars [see 2) below], for any expression grammar, E, the following expression fails to compile:

    E::type

(Digression back to the design evaluation: This "schizophrenic" feature of expression grammars makes understanding matches harder, in a similar way that untyped variables in an interpreted language make understanding that code's functions harder. If there were an as_gram<ExprType> meta-function which returned the corresponding meta-function grammar generator, and only meta-function and predicate grammars [see below] were allowed as the 2nd argument to matches, then this would ease understanding and probably debugging [since, hopefully, deciphering compiler error messages would be easier]. Alternately, there could be an is_gram<ExprType> that, when used in some kind of debug mode, diagnosed, with clearer error messages than the compiler's, that ExprType was not a valid grammar. This is_gram could be implemented using the Boost concept checking library, AFAICT.)

2) meta-function grammars: These (except for the nullary ones) appear in the last column of the table on:

  <boost-root>/libs/proto/doc/html/boost_proto/users_guide/expression_construction/tags_and_meta_functions.html

As the name, meta-function, implies, they all have a nested ::type which returns the corresponding expression type [a.k.a. expression grammar, see 1) immediately above]. These are classified as nullary or non-nullary depending on the number of "sub-grammars":

nullary) These are the ones formed by the proto::op::terminal template. They have no sub-grammars.

non-nullary) These have an associated number (the arity) defining the number of sub-grammars. Each non-nullary template is also associated with its own C++ operator. The arity of the grammar is the same as the arity of the associated C++ operator.

definition: meta-function grammar instance:

basis step) A nullary meta-function grammar is a meta-function grammar instance.

induction step) If O is a non-nullary meta-function grammar template with arity N, and O1_inst, O2_inst, ..., ON_inst are all meta-function grammar instances, then O<O1_inst, O2_inst, ..., ON_inst> is a meta-function grammar instance.

3) predicate grammars: These are grammars with no corresponding C++ operators. In contrast to meta-function grammars, for any predicate grammar, B, the following expressions fail to compile:

    B::type
    tag_of<B>::type

One reason for this is that there may be more than one type of proto expression that matches a given predicate grammar. For example, either an expression matching T0 or one matching T1 matches or_<T0,T1>. There is one special case of a predicate grammar, the wild-card predicate:

    proto::wildcardns_::_

which matches any proto expression. This is analogous to a Boolean predicate that always returns true no matter what its argument. Other important predicate grammars (in the proto::control namespace) are not_, and_, and or_. AFAICT, just as any predicate logic expression can be constructed using only:

    true  not  and  or

the equivalent of any other predicate grammar, such as if_, can be constructed using just these four:

    _  not_  and_  or_
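For instance (a minimal sketch; the header path and exact namespaces are from my reading of the vault snapshot and may differ):

    #include <boost/xpressive/proto/proto.hpp>
    #include <boost/mpl/assert.hpp>

    namespace proto = boost::proto;
    using proto::_;

    // A predicate grammar built only from _, not_, and and_:
    // matches any terminal except a terminal<int>.
    struct NonIntTerm
      : proto::and_<
            proto::terminal< _ >
          , proto::not_< proto::terminal< int > >
        >
    {};

    typedef proto::terminal< char >::type CharTerm;
    typedef proto::terminal< int >::type  IntTerm;

    BOOST_MPL_ASSERT(( proto::matches< CharTerm, NonIntTerm > ));
    BOOST_MPL_ASSERT_NOT(( proto::matches< IntTerm, NonIntTerm > ));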
* What is your evaluation of the potential usefulness of the library?

Very.

* Did you try to use the library? With what compiler? Did you have any problems?

Yes:

  http://www.boost-consulting.com/vault/index.php?&directory=Strings%20-%20Text%20Processing
  &filename=cfg_lookahead_extends.zip
  &filename=toy_attract.pair.zip

* How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?

In-depth, although I still have a long way to go.

* Are you knowledgeable about the problem domain?

Somewhat; however, I'm obviously unfamiliar with the specific problems proto was designed to solve. For example, I was unaware of some of the proto viewpoints expressed by Eric in the last 2 paragraphs of this post:

  http://lists.boost.org/Archives/boost/2008/03/134959.php

* Do you think the library should be accepted as a Boost library? Be sure to say this explicitly so that your other comments don't obscure your overall opinion.

Eventually, but not yet. There needs to be more documentation debugging and also more exploration of alternative designs; in particular, designs guided by algebraic morphisms and category theory. One justification for this exploration is the morphism template in the attachment to:

  http://lists.boost.org/Archives/boost/2008/03/134964.php

Although that post observed that that morphism template:

  is correct in the limited case of those dual-purpose
  grammars-and-metafunctions.

that doesn't mean it couldn't be extended to handle more general cases. In particular, in the case of expression grammars, I'd think the morphism template in the attachment could just be specialized to be an identity morphism:

    template< class Tag, class Args, long Arity >
    struct morphism< expr< Tag, Args, Arity > >
    {
        typedef expr< Tag, Args, Arity > type;
    };

Further specializations, designed to error out when given a predicate grammar, would, AFAICT, correctly calculate, from either a meta-function grammar or an expression grammar, the matching expression type.

Category theory *might* be used to guide the predicate grammar design, as suggested (very indirectly) by parts of:

  http://www.amazon.com/Algebraic-Approaches-Semantics-Monographs-Computer/dp/...

Section 3.3, "The Boolean Algebra of Guards", contains:

  Pascal allows a statement of the form:

      if x>0 and not x=5 then S_1 else S_2

  If S_1,S_2 are to be semantically interpreted as morphisms X->Y in a
  category _C_, how can "P and not Q" be interpreted, where P is "x>0"
  and Q is "x=5"? ... Equivalently, they [i.e. P and Q] may be
  represented as guard functions as in 1.5.20. ... for an object X in a
  partially additive category _C_, there is a subset, Guard(X), of
  _C_(X,X) of "guard morphisms" which form a Boolean algebra.

On the same page where the 1.5.20 mentioned in the above quote appears, there's:

  22 Definition. Let (p1,...,pn) be an n-way test on X and let
  f1,...,fn be in SC(X,Y). Then a natural generalization of the case
  statement in Pascal is:

      case(p1,...,pn) of (f1,...,fn) = f1 p1 + ... + fn pn

where the p's correspond to the lhs of the when's in an or_ with elements decorated with transforms, and the f's correspond to the transforms on the rhs of the corresponding when's.

In conclusion, the somewhat vague similarities, noted above, between proto and algebraic morphisms and, more generally, category theory strongly suggest (to me at least) that there could be a large benefit in trying to make these similarities more concrete and using that knowledge to improve proto's design.
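To make that correspondence a bit more concrete, here is a parting, *very* speculative sketch (G1...Gn and T1...Tn are hypothetical grammars and transforms, standing in for the p's and f's):

    // The n-way case statement, case(p1,...,pn) of (f1,...,fn),
    // rendered in current proto: each Gi plays a guard pi, and each
    // Ti plays the morphism fi applied when that guard matches.
    struct CaseOf
      : proto::or_<
            proto::when< G1, T1 >  // f1 p1
          , proto::when< G2, T2 >  // f2 p2
          // ...
          , proto::when< Gn, Tn >  // fn pn
        >
    {};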