[review][proto]suggest doc debug & category guided design

It's all in the attachment. I had several other topics I wanted to explore; however, this will have to do for now.

NOTE: The following is based on a 2007-02-17 download of proto.zip from the Boost vault.

* What is your evaluation of the design?

Fair. One problem is the "mixing" of grammars with transforms. As stated here:

    <boost-root>/libs/proto/doc/html/boost_proto/users_guide/expression_transformation/canned_transforms/pass_through.html

  all expression generator meta-functions (for example, posit<>, shift_right<>, function<>, nary_expr<>, etc.) have a pass-through transform by default.

which means that when, for example, matches<posit<_>,Expr> is compiled, the associated pass-through transform is not used. If it's not used in matches, then it should be eliminated. Of course it is used when the transform is applied; however, a transform template class taking a grammar as an argument (the proto::when template seems appropriate) should be used to do that. This would make the programmer's intention clearer. That means another template instantiation, and thus slower compile times; however, I think the time saved in understanding the code would compensate for that. Of course that's a judgment call.

This "unused transform" feature is especially obvious with and_. Only the last transform (call it Xn) associated with the list of and_ args can ever be used. If transforms were eliminated from the and_ args and and_ were only used for matches, then:

    when<and_<G1,...,Gn>,Xn>

could be used with the same effect as and_ with transforms associated with the args. Here, when is more obviously a transform, Xn, enabled by the condition and_<G1,...,Gn>. Hmm, but that "looks" (based on what if means in C++) more like proto::if_; however, I've never used proto::if_ and haven't read the proto::if_ docs, so maybe that conclusion is all wrong. I have read somewhere that proto::if_'s first argument is a transform, which seems odd when the name if_ would suggest it's a grammar with a no-op (unused) transform.

What about proto::or_<T0,T1,...,Tn>? If the T0,T1,...,Tn were changed to "pure" grammars, G0,G1,...,Gn, then how would the *very* useful examples of transforms such as CalcArity in:

    <boost-root>/libs/proto/doc/html/boost_proto/users_guide/expression_transformation/example__calculator_arity_transform.html

be done? Well, as explained in the in-source comments to proto::switch_ in matches.hpp, proto::switch_<Cases> could be used instead. OOPS, looking at switch_ shows it contains:

    template<typename This, typename Expr, typename State, typename Visitor>
    struct result<This(Expr, State, Visitor)>
    {
        typedef typename Cases::template case_<typename Expr::proto_tag> which;
        typedef typename which::template result<void(Expr, State, Visitor)>::type type;
    };

which, along with the example code in /libs/proto/test/matches.cpp, shows that the only test is on the tag of the grammar and not on the children. So maybe switch_ should be renamed to switch_tag, and a new template, switch_gram, added which does what the current or_ does, except that the template arguments are instances of a new template, proto::case_, where proto::case_ is like the current proto::when except:

1) The first arguments don't have to have a transform associated with them.
2) There's a default second argument to proto::case_, identity_t (the identity transform, which is the same as the transform part of the current proto::terminal):

    struct identity_t
    {
        template<typename Sig> struct result;

        template<typename This, typename Expr, typename State, typename Visitor>
        struct result<This(Expr, State, Visitor)>
        {
            typedef Expr type;
        };

        template<typename Expr, typename State, typename Visitor>
        Expr const &operator ()(Expr const &expr, State const &, Visitor &) const
        {
            return expr;
        }
    };

So, with these additional templates, CalcArity might be:

    struct CalcArity
      : switch_gram<
            case_< terminal_g< placeholder1 >, mpl::int_<1>() >
          , case_< terminal_g< placeholder2 >, mpl::int_<2>() >
          , case_< terminal_g<_>, mpl::int_<0>() >
          , case_< unary_expr_g<_, CalcArity>, CalcArity(_arg) >
          , case_< binary_expr_g<_, CalcArity, CalcArity>
                 , mpl::max<CalcArity(_left), CalcArity(_right)>() >
        >
    {};

where XXX_g, for XXX={terminal,unary_expr,binary_expr}, is the "grammar part" of the current XXX template. However, even here there's a sort of mixture of grammar and transform. The "source" grammar of the transform is the one that appeared earlier on the example__calculator_arity_transform.html page (here renamed CalcGram and with XXX renamed to XXX_g, as in the "where" paragraph under CalcArity above):

    struct CalcGram
      : or_<
            terminal_g< placeholder1 >
          , terminal_g< placeholder2 >
          , terminal_g< _ >
          , unary_expr_g< _, CalcGram >
          , binary_expr_g< _, CalcGram, CalcGram >
        >
    {};

In order to completely separate the grammar from the transforms, something like an algebraic homomorphism:

    http://planetmath.org/encyclopedia/HomomorphismBetweenAlgebraicSystems.html

is needed. Each function name, w_A, in the source algebra in that reference would correspond to some grammar, G_w, in the or_:

    or_<G1,G2,...,Gn>

The bad thing about that is that the function names would be something of a duplication of the grammars, I think (I haven't thought it through really). The only way I can be more certain of this is to try to implement it; however, I'm already way late in doing this review, so that suggestion about complete separation of grammar from transform is just *highly* speculative.

* What is your evaluation of the implementation?

Not done.

* What is your evaluation of the documentation?

Good. However:

1) Some transform templates are barely documented or not documented at all (except in-source), e.g. switch_. There really needs to be a full TOC like the MPL one here:

    http://www.boost.org/libs/mpl/doc/refmanual/refmanual_toc.html

I almost always go there when I want to figure out what MPL templates do.

2) Some transforms are very formally documented. IOW, they have almost a set of rules guiding the reader, step by step, through how the transforms work. Each step in the process is described by a table in the documentation of the outermost template. These tables have the before expression in the first column and the step's resulting expression in the second column. This is very good; I used it to help me understand parts of proto. However, manual execution of the rules on a given expression is hard, because the user has to keep track of the intermediate steps in his head, and that's a lot of mental bookkeeping.
It would be very helpful if there were a "proto-debugger" which, given a source expression, executes each of these steps (noting which table is being executed), shows the correspondence between names in the "rules tables" and names in the source or intermediate expression, and finally stops when the expression cannot be reduced further. Maybe proto itself could even be used to do this. A sort of "self-aware" proto ;) This ability to take an expression and simplify it using a set of rules would have uses in other domains, as shown by Task 3 in Markus' post:

    http://lists.boost.org/Archives/boost/2008/03/134376.php

There's a prototype of doing something similar (although no association is shown between rules and tables) for language grammar analysis in the vault:

    http://www.boost-consulting.com/vault/index.php?&directory=Strings%20-%20Text%20Processing

It just uses proto::eval to go through the steps until no further change is possible.

3) There needs to be a more detailed description of how the first argument of proto::matches, the expression argument, "relates" to the second argument, the grammar argument. The description here:

    <boost-root>/libs/proto/doc/html/boost/proto/result_of/matches.html

only uses "expression grammars" but makes no mention of the proto meta-functions for generating expressions:

    <boost-root>/libs/proto/doc/html/boost_proto/users_guide/expression_construction/tags_and_meta_functions.html

Judging from the example of using a terminal grammar, terminal<X>, to generate an expression by appending ::type:

    terminal<X>::type

I jumped to the rash conclusion that a similar technique would work for binary_expr<Tag,L,R>. IOW, that an expression could be generated by:

    binary_expr<Tag,L,R>::type

However, this is only true if L and R are already expressions. This rash conclusion led to a time-consuming "debugging of my thinking", as evidenced by the thread:

    http://lists.boost.org/Archives/boost/2008/03/133715.php

Instead, the ::type appending must be propagated to all sub-grammar expressions, as done by the morphism template attached to:

    http://lists.boost.org/Archives/boost/2008/03/134964.php

Avoiding the rash conclusion mentioned above would probably be easier if proto grammars were prominently classified on some page into the following grammar subtypes:

1) expression grammars: These are the grammars appearing in the matches.html page mentioned above. These are also expression types; hence, they can also be used as the first argument to matches. In contrast to the meta-function grammars [2) below], for any expression grammar, E, the following expression fails to compile:

    E::type

(Digression to design review: This "schizophrenic" feature of expression grammars makes understanding matches harder, in a similar way that untyped variables in an interpreted language make that code's functions harder to understand. If there were an as_gram<ExprType> meta-function which returned the corresponding meta-function grammar generator, and only meta-function and predicate grammars [see below] were allowed as the second argument to matches, then this would ease understanding and probably debugging [since, hopefully, deciphering compiler error messages would be easier]. Alternatively, there could be an is_gram<ExprType> that, when used in some kind of debug mode, diagnosed, with clearer error messages than the compiler's, that ExprType was not a valid grammar. This is_gram could be implemented using the Boost Concept Check library, AFAICT.)
2) meta-function grammars: These (except for the nullary ones) appear in the last column of the table on:

    <boost-root>/libs/proto/doc/html/boost_proto/users_guide/expression_construction/tags_and_meta_functions.html

As the name, meta-function, implies, they all have a nested ::type which returns the corresponding expression type [a.k.a. expression grammar, see 1) immediately above]. These are classified as nullary and non-nullary depending on the number of "sub-grammars":

nullary) These are the ones formed by the proto::op::terminal template. They have no sub-grammars.

non-nullary) These have an associated number (the arity) defining the number of sub-grammars. Each non-nullary template is also associated with its own C++ operator. The arity of the grammar is the same as the arity of the associated C++ operator.

definition: meta-function grammar instance:

basis-step) A nullary meta-function grammar is a meta-function grammar instance.

induction-step) If O is a non-nullary meta-function grammar template with arity N, and O1_inst, O2_inst, ..., ON_inst are all meta-function grammar instances, then O<O1_inst, O2_inst, ..., ON_inst> is a meta-function grammar instance.

3) predicate grammars: These are grammars with no corresponding C++ operators. In contrast to meta-function grammars, for any predicate grammar, B, the following expressions fail to compile:

    B::type
    tag_of<B>::type

One reason for this is that there may be more than one type of proto expression that matches a given predicate grammar. For example, either an expression matching T0 or one matching T1 matches or_<T0,T1>. There is one special case of a predicate grammar, the wild-card predicate:

    proto::wildcardns_::_

which matches any proto expression. This is analogous to a Boolean predicate that always returns true no matter what the argument. Other important predicate grammars (in the proto::control namespace) are not_, and_ and or_. AFAICT, just as any predicate logic expression can be constructed using:

    true not and or

the equivalent of all other predicate grammars, such as if_, can be constructed using just these four:

    _ not_ and_ or_

* What is your evaluation of the potential usefulness of the library?

Very.

* Did you try to use the library? With what compiler? Did you have any problems?

Yes:

    http://www.boost-consulting.com/vault/index.php?&directory=Strings%20-%20Text%20Processing
      &filename=cfg_lookahead_extends.zip
      &filename=toy_attract.pair.zip

* How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?

In-depth, although I still have a long way to go.

* Are you knowledgeable about the problem domain?

Somewhat; however, I'm obviously unfamiliar with the specific problems proto was designed to solve. For example, I was unaware of some of the proto viewpoints expressed by Eric in the last two paragraphs of this post:

    http://lists.boost.org/Archives/boost/2008/03/134959.php

* Do you think the library should be accepted as a Boost library? Be sure to say this explicitly so that your other comments don't obscure your overall opinion.

Eventually, but not yet. There needs to be more documentation debugging and also more exploration of alternative designs, in particular, designs guided by algebraic morphisms and category theory. One justification for this exploration is the morphism template in the attachment to:

    http://lists.boost.org/Archives/boost/2008/03/134964.php

Although that post observed that that morphism template:

  is correct in the limited case of those dual-purpose grammars-and-metafunctions.
that doesn't mean it couldn't be extended to handle more general cases. In particular, in the case of expression grammars, I'd think the morphism template in the attachment could just be specialized to be an identity morphism:

    template< class Tag, class Args, int Arity >
    struct morphism< expr<Tag,Args,Arity> >
    {
        typedef expr<Tag,Args,Arity> type;
    };

Further specializations designed to error out when given a predicate grammar would, AFAICT, correctly calculate the matching expression type from either a meta-function grammar or an expression grammar.

Category theory *might* be used to guide the predicate grammar design, as suggested (very indirectly) by parts of:

    http://www.amazon.com/Algebraic-Approaches-Semantics-Monographs-Computer/dp/...

For Section 3.3, "The Boolean Algebra of Guards", contains:

  Pascal allows a statement of the form:

      if x>0 and not x=5 then S_1 else S_2

  If S_1, S_2 are to be semantically interpreted as morphisms X->Y in a category _C_, how can "P and not Q" be interpreted, where P is "x>0" and Q is "x=5"? ... Equivalently, they [i.e. P and Q] may be represented as guard functions as in 1.5.20. ... for an object X in a partially additive category _C_, there is a subset, Guard(X), of _C_(X,X) of "guard morphisms" which form a Boolean algebra.

On the same page where the 1.5.20 mentioned in the above quote appears, there's:

  22 Definition. Let (p1,...,pn) be an n-way test on X and let f1,...,fn be in SC(X,Y). Then a natural generalization of the case statement in Pascal is:

      case(p1,...,pn)of(f1,...,fn) = f1 p1 + ... + fn pn

where the p's correspond to the lhs of the when's in an or_ with elements decorated with transforms, and the f's correspond to the transforms on the rhs of the corresponding when's.

In conclusion, the somewhat vague similarities, noted above, of proto with algebraic morphisms and, more generally, with category theory strongly suggest (to me at least) that there could be a large benefit to trying to make these similarities more concrete and using that knowledge to improve proto's design.
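[Editorial note: to make the guarded-case correspondence above a bit more concrete, here is a rough sketch of an or_ of when's playing the role of case(p1,...,pn)of(f1,...,fn). The guards and transforms are placeholders chosen for illustration, not anything from the attachment, and the header path assumes the layout of later Boost releases rather than the reviewed vault snapshot.]

    #include <boost/proto/proto.hpp>
    #include <boost/mpl/int.hpp>
    namespace proto = boost::proto;
    namespace mpl = boost::mpl;

    // Each when<> pairs a guard p_i (a grammar) with the morphism f_i
    // (a transform) that is selected when that guard matches.
    struct GuardedCase
      : proto::or_<
            proto::when< proto::terminal<int>,      mpl::int_<1>() >  // f1 p1
          , proto::when< proto::terminal<proto::_>, mpl::int_<0>() >  // f2 p2
        >
    {};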

Larry Evans wrote:
that suggestion about complete separation of grammar from transform is just *highly* speculative.
... but one I'd like to hear someone say. This idea should be investigated.

Markus

On 03/28/08 17:35, Markus Werle wrote:
The bundling of grammar with transform appears to be an example of the "blob" anti-pattern. The attached code is a translation of the example on p. 16 of _C++ Template Metaprogramming_. The translation was needed to convert the templated function into a template class in order to more closely reflect the proto structure. The name, apply_fg, from the book example was retained but appended with _sep or _blob to avoid name conflicts. Hopefully someone will be able to explain why the attachment does or doesn't show that proto's mixture of grammar with transform is an example of the blob anti-pattern.

AFAICT, the reason proto mixes the two is to save typing. The reason not to mix the two, in addition to the anti-pattern reason (if that's the case), is that it made the code harder for me to understand. Although better documentation might make it easier for me to understand, that still leaves the question of whether it's an instance of the blob anti-pattern. To me, the mixing of grammar with transform is like mixing the test with the execute of a C++ if statement. IOW, instead of:

    if(test) { execute; }

there's just:

    test_execute;

Separating the test from the execute is easier to understand.
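[Editorial note: in Proto terms, a minimal sketch of that contrast might look as follows. This is an illustration, not the attached apply_fg code; it assumes the when<>/pass-through behaviour discussed later in this thread, the header layout of later Boost releases, and uses _child, the later name for what the reviewed snapshot called _arg.]

    #include <boost/proto/proto.hpp>
    namespace proto = boost::proto;

    // "Blob" style: the grammar negate< terminal<int> > is both the test
    // (it matches) and, through its default pass-through transform, the
    // execute (it transforms).
    struct Fused
      : proto::negate< proto::terminal<int> >
    {};

    // "Separated" style: when<> states the test (a grammar) and the execute
    // (here the _child transform) as two distinct pieces.
    struct Separated
      : proto::when<
            proto::negate< proto::terminal<int> >
          , proto::_child
        >
    {};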

Sorry for the delayed response, Larry.

Larry Evans wrote:
Just because it's not used in some usage scenarios isn't reason to eliminate it.
I've considered this. Had when<> been defined as:

    template<
        typename Grammar
      , typename Transform = pass_through<Grammar>
    >
    struct when;

then you could say:

    when< negate<_> >

and get the pass-through transform. Is that what you had in mind? I don't see this as an improvement over the simpler:

    negate<_>
You can do this today.
The presence of the default transform doesn't disallow the usage above, and can make some things simpler.
What about the name "if_" suggests its first argument is a grammar? Its primary use is to test for conditions that grammars cannot easily test for, like:

    if_< is_base_of<MyBase, _arg>(), X, Y >
switch_ and or_ serve exactly the same role; the only difference is convenience vs. compile-time efficiency.
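[Editorial note: for reference, a minimal sketch of switch_ usage consistent with the in-source excerpt quoted in the review. The Cases class and grammar are invented for illustration, and the header path assumes later Boost releases.]

    #include <boost/proto/proto.hpp>
    namespace proto = boost::proto;

    // Cases maps an expression's tag to the grammar its node must match.
    struct MyCases
    {
        // default case: match nothing
        template<typename Tag>
        struct case_ : proto::not_<proto::_> {};
    };

    // terminals are allowed...
    template<>
    struct MyCases::case_<proto::tag::terminal>
      : proto::terminal<proto::_>
    {};

    // ...as are plus nodes whose children again satisfy the grammar.
    template<>
    struct MyCases::case_<proto::tag::plus>
      : proto::plus< proto::switch_<MyCases>, proto::switch_<MyCases> >
    {};

    struct MyGrammar : proto::switch_<MyCases> {};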
This is needless complication, IMO. The default transform of proto::or_ is there for a reason. It does just what you need it to. Why do you want to remove it? <snip>
In order to completely separate the grammar from the transforms, something like an algebraic homomorphism:
Please tell me why you think separating grammars and transforms is desirable.
After all our exchanges, I still don't have a coherent picture of what you're after. Write the code, please, and then port some of Proto's examples to your system. Then we can talk about which approach is better. I just don't think there's any other way to proceed.
Yes, an oversight.
Agreed.
Others have suggested that these tables move into the reference section, and more approachable descriptions take their place in the users' guide. Do you have an opinion about that?
I'm having a hard time imagining what this would look like. A runtime debugger? That you can step through? Not getting it.
Which download?
It does: "matches<Expr,Grammar> inherits (indirectly) from mpl::true_ if Expr::proto_base_expr matches Grammar::proto_base_expr, and from mpl::false_ otherwise." The key here is the access of the nested ::proto_base_expr. For both expressions and grammar metafunctions, ::proto_base_expr is a typedef for an instantiation of expr<>. The rest of the description is specified in terms of expr<>. That's how it works.
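[Editorial note: a small sketch of what that specification means in practice, with assertions invented for illustration and the header path of later Boost releases. Both an expression type and a grammar metafunction expose ::proto_base_expr, so either reduces to an expr<> instantiation before the comparison is made.]

    #include <boost/proto/proto.hpp>
    #include <boost/mpl/assert.hpp>
    namespace proto = boost::proto;

    typedef proto::terminal<int>::type int_term;  // an expression type

    BOOST_MPL_ASSERT(( proto::matches< int_term, proto::terminal<proto::_> > ));
    BOOST_MPL_ASSERT(( proto::matches< int_term, proto::terminal<int> > ));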
"The grammars appearing in the matches.html page" include things like proto::or_ which are *not* expressions and cannot be used as the first argument to matches. Clearly I've misunderstood you.
Let the nested ::type thing go. It has nothing to do with the distinction between expressions and grammars. Read matches.html again, carefully. It's all there, I think.
(Digression to design review: This "schizophrenic" feature of expression grammars makes understanding match harder in a similar
There are expressions and there are grammars. There is no "expression grammar" -- I don't know what that is. The above description doesn't make sense to me.
As things stand, an expression type is a degenerate grammar type that matches itself and other expression types sufficiently like it (disregarding references and whatnot). You would disallow this usage. OK...
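[Editorial note: concretely, with the same illustrative assumptions as the sketch above, the degenerate-grammar usage looks like this.]

    #include <boost/proto/proto.hpp>
    #include <boost/mpl/assert.hpp>
    namespace proto = boost::proto;

    typedef proto::terminal<int>::type int_term;

    // An expression type used as a grammar matches itself (and expressions
    // differing only in reference/const qualification of the value).
    BOOST_MPL_ASSERT(( proto::matches< int_term, int_term > ));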
Rather than take away a feature because it is potentially confusing, I'd rather improve the documentation to eliminate the confusion. Unless there's a compelling reason, of course. I just haven't heard one yet.
OK, so you'd rather I specified the behavior of matches<> in terms of the metafunctions *and* expr<> instantiations? That would nearly double the size of the specification. What would be the benefit of doing it that way? Can I try to reformulate what you're saying, and you tell me whether I am on the right track? There should be Expression and Grammar concepts (in the C++0x sense), and the matches<> metafunction should be constrained to take an Expression and a Grammar. That would certainly be an interesting and illuminating exercise.
OK. The exceptions are unary_expr, binary_expr and nary_expr, which are not associated with C++ operators.
OK, this describes a subset of the allowable grammar types.
Right, but your formalism disallows grammars of the form:

    // match -X, where X is anything other than *Y
    negate< not_< dereference<_> > >

In your formalism, the only allowable subgrammars of the metafunction grammars are other metafunction grammars, not predicate grammars as in the above. Proto's grammars are more flexible than that.
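[Editorial note: a small usage sketch of that grammar, with typedefs and assertions invented for illustration and the header path of later Boost releases.]

    #include <boost/proto/proto.hpp>
    #include <boost/mpl/assert.hpp>
    namespace proto = boost::proto;

    typedef proto::terminal<int>::type int_term;
    typedef proto::negate<int_term>::type neg_term;         // -x
    typedef proto::dereference<int_term>::type deref_term;  // *x
    typedef proto::negate<deref_term>::type neg_deref;      // -*x

    // a negate node whose child is anything except a dereference
    struct NegButNotDeref
      : proto::negate< proto::not_< proto::dereference<proto::_> > >
    {};

    BOOST_MPL_ASSERT((     proto::matches< neg_term,  NegButNotDeref > ));
    BOOST_MPL_ASSERT_NOT(( proto::matches< neg_deref, NegButNotDeref > ));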
In the blurb at the bottom of: http://boost-sandbox.sourceforge.net/libs/proto/doc/html/boost_proto/users_g... I try to express that idea.
About the docs: agreed. About the design: show me the code. I haven't yet seen a better design or heard a description of one I can understand. I don't share your mathematical vocabulary, but if we both talk in C++ I'm sure we'll make ourselves understood.
????
No clue, sorry.
You may be right, but if you are I'll never know unless you try to express your mathematical ideas in C++ code so I can access them.

--
Eric Niebler
Boost Consulting
www.boost-consulting.com
participants (3)
- Eric Niebler
- Larry Evans
- Markus Werle