
David Abrahams wrote:
* What happens if your type has a generalized operator? ::
    namespace fu
    {
        struct zero {};

    #if 1
        template <class T>
        T operator+(T x, zero) { return x; }
    #else
        double operator+(double x, zero) { return x; }
    #endif
    }
    int main()
    {
        // Define a calculator context, where _1 is 45 and _2 is 50
        calculator_context ctx( 45, 50 );

        // Create an arithmetic expression and immediately evaluate it
        double d = proto::eval( (_2 - _1) / _2 * 100 + fu::zero(), ctx );

        // This prints "10"
        std::cout << d << std::endl;
    }
Answer: a nasty error message (at least on g++). Anything we can do to improve it (just wondering)?
It's similar to what happens in e.g. a linear algebra domain where vector terminals want to define += that actually does work as opposed to build expression trees. In that case, you'll need to disable proto's operator overloads with a grammar. Otherwise, the operators are ambiguous.
I can't think of anything better.
In this case what I really wanted was to disable the existing operator.
Proto's operator? The way to do that is with a grammar. 'Course I don't show that until the very end.
Think of zero as a non-lazy type like double, that you might want to use in a lazy context.
Anyway, I was only aiming at "can we improve the error message," rather than, "can we actually make this work," although I can imagine some approaches to the latter.
I'm curious what you have in mind here.
* is there a reason we need ``ref_`` as opposed to using true references? (just curious, but the docs don't answer this question).
It's not strictly necessary, and in branches/proto/v3 there's a version of proto that uses true references. I found it complicated the implementation and caused a bunch of unnecessary remove_reference<> instantiations.
If the existing implementation isn't too complicated, you could always add an internal "convert T& to ref_<T>" step, just to keep ref_ out of the users' face.
You mean, just to keep ref_ out of the type of an expression? It would add some code complexity in some places (e.g., given expr<Tag,Args,N>, I wouldn't be able to simply get the tag type of the 0th child as Args::arg0::proto_tag), but it might be worth it. ref_ really is a detail that shouldn't show up in error messages and in debuggers. It also contributes to debug symbol name length.
I need to document the naming idioms. proto::foo() is a function, proto::result_of::foo is a metafunction that calculates the return type, proto::functional::foo is the equivalent function object, and (where relevant) proto::transform::foo is the primitive transform, and proto::_foo is an alias for proto::transform::foo.
Now that you know, do you have any suggestions for improvements?
Hmm, that's hard to keep track of. If you dropped the alias it would be a little simpler. Maybe s/functional/function_obj/ or even s/functional/object/...
Which alias would you drop? If I had to drop one, I'd drop everything in the transform namespace, and just use _foo. If I had to type transform::childN everywhere, I think I'd shoot myself. I don't like "object" ... it carries no semantic weight. How about s/functional/functor/?
wait, isn't proto::foo in the above case actually an instance of proto::functional::foo?
It shouldn't be, no.
* http://boost-sandbox.sourceforge.net/libs/proto/doc/html/boost_proto/users_g... ...

- What purpose do the <>s serve in the table if you're not going to show the metafunction signature?

Elsewhere in the docs, I've used "foo()" to emphasize that foo is a function, and "foo<>" to emphasize that foo is a template. Not sure if it's a worthwhile convention.
I think as long as you're going to put the <>s in, it would help a lot to remind people how the thing is used, so I would add arguments.
Just in this table, or everywhere?
- Does it use the same tag and metafunction no matter the arity of a function call?

Not sure what you mean. What function call?
I'm talking about the tag::function row.
Oh, yes. The function<> grammar/metafunction is variadic. So is nary_expr<>. Those are the only two.
- What namespaces are the metafunctions in?

boost::proto
My point was that there's no guide at the beginning that tells me how to interpret such names. I would prefer in most cases to see everything written as though
namespace proto = boost::proto;
were in effect, so there would be a lot more qualification.
OK.
- The ``DomainOrArg`` argument to ``result_of::make_expr`` is confusing. I don't see a corresponding argument to the function object. I might not have been confused by this except that you seem to use that argument in the very next example.

The make_expr function object does have an optional Domain template parameter::

    template<typename Tag, typename Domain = default_domain>
    struct make_expr : callable
What I'm not showing is an overload of the proto::make_expr() free function that doesn't require you to specify the domain.
Can you do anything to clear up the confusion?
I think for the users' guide, I should describe the usages and leave the signatures for the reference. That should make this clearer.
- I don't know how well or badly it would work with the rest of the library, but I'm thinkin' in cases like this one it might be possible to save the user some redundant typing::
    // One terminal held by reference:
    int i = 0;

    typedef proto::result_of::make_expr<
        MyTag
      , int &   // <-- Note reference here
      , char
    >::type expr_type;

    expr_type expr = proto::make_expr<MyTag>(boost::ref(i), 'a');
I'm thinking the result of ``proto::make_expr<...>(...)`` could hold everything by reference, but the type of ``proto::result_of::make_expr< ... >`` would hold things by the specified type.

And rely on an implicit conversion between expression types?
Yes.
I've tried to avoid that. Figuring what is convertible to what can be expensive, and it's hard to know whether you are paying that cost in your code or not because the conversions are implicit.
I don't understand, sorry. This doesn't look like something that requires any type checking that wouldn't happen anyway.
If I understand what you're suggesting, each expr<T,Args,N> would have the following conversion operator::

    template<typename Args2>
    operator expr<T, Args2, N>() const
    {
        expr<T, Args2, N> that = { this->arg0, this->arg1, ... };
        return that;
    }

Is that right? This is recursive, because this->arg0 might be an expr<> that needs to be implicitly converted, etc. This makes me uncomfortable. Whenever I return an expr<>, I don't know whether an implicit conversion is happening, how expensive it might be at runtime, or how many unnecessary templates are instantiated at compile time. I can't search my code for these conversions, either -- they're invisible. My only recourse is to comment out the conversion operator and see what breaks. My gut tells me to leave it out.
- Is all this time spent on ``make_expr`` really appropriate at this early stage of the tutorial? Seems to me we *ought* to be able to do a lot of more sophisticated things with the library before we start into the nitty-gritty of building expressions explicitly (i.e. framework details). No?

You're probably right. Currently, the users' guide is neatly divided into 5 sections: construction, evaluation, introspection, transformation and extension.
That sounds like an excellent structure for a reference section :-)
Interesting suggestion.
That means I have to exhaustively cover construction before I get to anything else -- even make_expr() which is rather esoteric. I suppose I should rethink the overall structure. Suggestions welcome.
Walk us through practical examples in increasing order of sophistication, showing the most useful features first and the more esoteric ones later.
Sure.
- Is this the first time you're showing us how to build a simple lazy function? That should come *much, much* earlier.

Not everything can come first!
Does it sound like I'm saying that about everything?
Should have put a winky face there .. it was tongue-in-cheek.
* what tells the library to substitute the function's template parameter there?
Not sure I understand the question.
What is the underlying rule that says how this works? Does the library substitute the function's template parameter into all first arguments, or what?
I'll clarify.
* http://boost-sandbox.sourceforge.net/libs/proto/doc/html/boost_proto/users_g... ... You were misled by the fact that proto::eval is an instance of proto::functional::eval. It shouldn't be. Better to make it a free function like the others. No reason why it shouldn't be find-able with ADL.
OK. Well, the larger point is that the idioms need to be explained up front so people don't have to create a series of wrong hypotheses like I did before they understand the patterns you're using.
Right.
* http://boost-sandbox.sourceforge.net/libs/proto/doc/html/boost_proto/users_g... ...

- That said, a really good motivating case for using matches<> seems to be missing.

I should talk about how it can be used to improve error messages by validating expressions at API boundaries.
Yep. In general I've found enable_if's impact on error messages to be disappointing (a long list of the overloads that didn't match, with little explanation). It usually turns out to be better to match with an overload that generates a specific error message.
Agreed. This, and static function dispatching are the primary use cases for matches<>.
* http://boost-sandbox.sourceforge.net/libs/proto/doc/html/boost_proto/users_g... ...
To start with, there's one very basic thing we won't be able to do: name template specializations that cannot be instantiated, like ::

    vector<foo(bar)>

just for DSEL purposes.
Ugh! I suppose this would work around the issue: vector<call<foo(bar)>> as long as call<> happened to be Regular.
- ``when< grammar, transform >`` seems like it would be better named ``replace< grammar, transform >`` or something.

Why do you say that?
My understanding was that tree transformations might match parts of the tree and replace them with transformed versions, while leaving other subtrees untouched.
To make an analogy with the runtime world, I don't think of a function call as "replacing" its arguments with the function's return value.
I'm not sure I believe that there's a better name than "when," but this shows you how I was thinking about what you wrote, anyway.
I once called it "case_", but then there was an impedance mismatch with the switch_ grammar element. I'm still happiest with "when".
Visitor is just a blob of mutable data, whatever you want. The type of the visitor usually doesn't change during the transformation. None of proto's built in transforms touch it in any way --- it is passed through unchanged.
Oh, oh... please don't call that a visitor then! It's just auxiliary data; you might call it "data" or "aux". When I see "visitor" I automatically think of an object with associated operations that are applied at each traversed element of a structure.
Good point. (The name "visitor" in Proto is historical ... in xpressive, it really was a visitor.) The Haskell guys call this a "context", but for obvious reasons I don't want to call it that. "data" or "aux" would work.
- I think ::

    when< unary_expr< _, CalcArity >, CalcArity(_arg) >

should be spelled ::

    when< unary_expr< _, CalcArity >, CalcArity(_arg(_)) >

CalcArity(_arg) and CalcArity(_arg(_)) are synonyms today. Do you feel that the (_) should be required?

(_) is optional because _arg is a so-called primitive transform, for which expr, state, and visitor can be implicit. It's not just syntactic sugar -- it's less work for Proto.
In that case, no, I don't feel it should be required. However, I think this library is hard enough to grok that consistency of notation should be a primary goal in the user guide, so I would use the more consistent spelling. Can you get most of the efficiency back with a single template specialization on <_arg(_)> ? That's often the case in such situations.
Sort of. I would specialize when<X,Y(_)>, but I can't make an assumption here about whether Y is callable or not. It would still be an improvement, tho.
- I think you're missing a good example of why you'd do two different evaluations on the same expr. A classic might be regular expression evaluation and symbolic differentiation. E.g., ::
    x = eval(_1 * _1, normal(7));        // 49
    y = eval(_1 * _1, differentiate(7)); // 14

Symbolic differentiation?
d(x*x)/dx = 2x
Duh. Of course. Good suggestion.
* http://boost-sandbox.sourceforge.net/libs/proto/doc/html/boost_proto/users_g... ...

- "It can be used without needing to explicitly specify any arguments to the transform." Huh? What kind of arguments?

This is why _arg is a synonym for _arg(_). In this case, "_" is an argument to the transform.
Okay. I tend to think of placeholders as not being actual arguments.
I'm grasping for the right terminology here.
Because _arg is "callable", you can leave off _ and also _state and _visitor. They are implicit.
The point being that the language confused me. Do what you will with that fact; clarifying 50% of the things I complain about may make the other 50% understandable to me.
I don't yet know how best to talk about these things. An improvement would be to write a glossary of terms, use terms consistently, and refer people to the glossary frequently.
2. Can we replace ``_make_terminal(_arg)`` above with ``_expr``? If not, why not?
No, _expr is analogous to lambda::_1 ... whatever the first argument is, return that. (_state is analogous to lambda::_2 and _visitor is analogous to lambda::_3 --
Oooooh! Please, a statement up front about that!
That would help people familiar with Lambda, but might just confuse people who don't. I can say the same thing without referring to Lambda, tho.
they return the state and visitor parameters). So _expr(_arg) would be the same as _arg. It wouldn't create a new expression node, which is what _make_terminal(_arg) does.
_make_terminal is a typedef for functional::make_expr<tag::terminal>. I don't think I say that anywhere, though. :-/
There's a lot of that going around ;-)
Yeah, documentation is hard.
* http://boost-sandbox.sourceforge.net/libs/proto/doc/html/boost_proto/users_g... ...

- Does ``and_< T0,T1, ... Tn >`` *really* only apply ``T``\ *n*?

Yes. It can't apply all the transforms. The result of applying the first transform might have a type that the second transform can't make sense of. Transforms don't chain that way.
They could be chained by and_, could they not?
No, not in any meaningful way. Consider::

    and_<
        when<G1, T1>
      , when<G2, T2>
    >
T1 requires that any expression passed to it matches G1. Likewise for T2 and G2. Now, imagine that the result of applying T1 is some type that *doesn't* match G2 ... it might not even be an expression! If you tried to pass it to T2, you would be violating T2's preconditions. T2's transform will likely fail.
The default transform for and_ is admittedly a bit arbitrary and not terribly useful. But you can always override it with proto::when
I'm lost. Why override it when you don't have to use the arbitrary and not-terribly-useful thing in the first place?
Is it clearer now?
- wow, you totally lost me on this page. I can't understand why the stated behaviors of these transforms make sense (e.g. correspond to their names), and I can't understand why the usage of ``and_`` in ``UnwrapReference`` is an example of a transform and not a grammar element. The outer ``or_`` is a grammar element, right? When ``or_`` is used as a grammar element, aren't its arguments considered grammar elements also?
and_, or_ and not_ are both grammar elements and transforms.
Yes, I'm aware of that duality. My understanding is that how they are treated depends on context. My point is that in the UnwrapReference example, and_ is treated as a grammar element and not as a transform.
It is treated as *both* a grammar element and a transform. It's easier to see with or_ ::

    struct UnwrapReference
      : or_<
            // ... stuff
        >
    {};

Here, or_ is very clearly being used as a grammar element. But when you apply UnwrapReference's transform, you're actually applying or_'s transform, because UnwrapReference inherits its transform from or_. The case of and_ is the same::

    struct UnwrapReference
      : or_<
            // Pass through terminals that are not
            // reference_wrappers unchanged:
            and_<
                terminal<_>
              , not_<if_<is_reference_wrapper<_arg>()> >
            >
            // ... other stuff
        >
    {};

or_'s transform is simply to apply the transform associated with whichever subgrammar matched. So if the and_ subgrammar matched, its transform is applied. Hence, and_ is being used both as a grammar and as a transform.
That's doubly true because given its sub-elements, even the default transform associated with and_ has no interesting effects: whether it returned the result of the first, the last, or chained the transforms, you'd get the same answer.
In this case, yes. The only interesting thing being demonstrated here is that and_ can be legally used where a transform is needed.
* http://boost-sandbox.sourceforge.net/libs/proto/doc/html/boost_proto/users_g... ...

- This business of handling transforms specially when they can accept 3 arguments is hard to understand. Aside from the fact that special cases tend to make things harder to use, it's not clear what it's really doing or why. I guess you're saying that state and visitor are implicitly passed through when they can be?

Yes, and the expression, too.
It would be nice to have a logical overview up front that describes the transformation as a uniform process of operating on these three values at every matched node in the tree, and that you have these three specially-named placeholders that pick up those values... unless their values are overridden by supplying a round lambda expression.
I think I see what you're getting at. I may follow up with you offline at some point.
I can understand why you'd want something like that, but let's look at this more closely:
"For callable transforms that take 0, 1, or 2 arguments, special handling is done to see if the transform actually expects 3 arguments..."
Do you really mean, "for callable transforms that are *passed* 0, 1, or 2 arguments...?" Or maybe it's something more complicated, like "for callable transforms that *are written as though* they take 0, 1, or 2 arguments...?"
Yes, the latter.
OK, please clarify the language. My point is that I had to work all this stuff out for myself by writing about it.
Sure.
The following transforms are synonyms::

    _arg
    _arg()
    _arg(_)
    _arg(_, _state)
    _arg(_, _state, _visitor)
That is true not just for _arg but for any primitive transform. And for completeness (or just to make it more confusing?) you can use _expr instead of _ and it means the same thing here.
As I say above, using _arg instead of _arg(_,_state,_visitor) is more than just sugar. It's less work for Proto.
Great; the picture is becoming clearer. Let's get that clarity into the documentation.
Yes, I can't imagine why I didn't just say this in the first place. :-/
- Again the use of ``proto::callable`` without a prior explanation... oh! there it is, finally, in the footnote of the example! If you check for "callable with 3 args" using metaprogramming tricks anyway, why not do the same for "callable" in general?

Not sure I understand the question. Getting a little bleary-eyed myself.
The point is that, given everything you've written so far, at this point I wonder why I have to specialize is_callable (or derive a class from proto::callable) when you have (and use) a way to detect callability.
Ah. Those tricks I use to detect callability would cause types to instantiate. That wouldn't work for std::vector<foo(bar)>.
Or the "call" transform can be renamed "apply", and "function" can be "call".
That might be better.
Elsewhere, I use names from <functional>, like "multiplies" and "divides". So ... "calls"? Ick.
I don't care.
me neither ;-)
:-P
- Translating the example into terms I understand::

    make_pair(_arg(_arg1), _arg(_arg2))

becomes ::

    make_pair(_value(_left(_)), _value(_right(_)))

which looks a bit better to me.
typedef _arg _value;
You're good to go! :-)
I realize that; I'm suggesting the 2nd way makes an easier-to-grasp presentation.
I'm of 2 minds about this. First, I want to document "best practices", and if the first is better than the second, I don't want to be encouraging the second. The other issue is rather subtle: proto::_ is not a placeholder like mpl::_ is. It's a transform. Ditto for _left and _right. It's important for users to understand that they can write their own transforms and use them instead. There's nothing magic about proto::_ and no good reason to use it like this. _value(_left(_)) may *look* nice, but IMO it encourages people to think about this in a limiting way.
Consider a transform such as:
fusion::single_view< add_const<_> >( _ )
Do you see now? Gotta look for the nested ::type in add_const *after* the placeholder (er, nested transform) has been evaluated.
Sure. What I mean is that the option to have no nested ::type is merely convenient, and not even all that reliable in most cases (how do you know your vector<>'s implementor didn't add a member ::type)?
Oh, I see. You're suggesting to *always* require a nested ::type. Sure. But it's very convenient as-is.
if you just used boost::result_of in the documentation, I think maybe you'd skirt the whole issue.
Good point.
* http://boost-sandbox.sourceforge.net/libs/proto/doc/html/boost_proto/users_g...
The example doesn't even name pass_through directly. Can we do better?

I've never used pass_through explicitly, but I use it all the time implicitly, as in this example.
Then maybe it doesn't deserve so much space in the user guide.
But it's quite possibly the most important transform.
* http://boost-sandbox.sourceforge.net/libs/proto/doc/html/boost_proto/users_g... ...

- It's not clear to me why you need all this fancy footwork to define a special meaning for ``operator[]``. Isn't that what contexts are for?

So for::

    ( v2 + v3 )[ 2 ];
... you're saying to let proto build the expression tree representing the array access, and evaluate it lazily with a context. Seems reasonable.
Does that undermine the whole example?
Oh, I remember now::

    double d = (v2 + v3)[2];

Either op[] is special, or else there is an implicit conversion to double. I opted for the former.
- But I have to say, that use of the grammar to restrict the allowed operators is *way cool*. I just think it should have been shown *way earlier* ;-).

I can't show *all* the good bits first.
I actually disagree, for some reasonable definition of "first."
Then there'd be no reason to read further! :-)
It sounds a bit like you're saying it's your goal in the user guide to eventually teach the reader everything about the library, whether he wants to learn the details or not. I don't think you should be disappointed if she stops after she's learned to use the library powerfully but long before she learns many of the details she won't need.
I was being tongue-in-cheek again. :-) I'll use <tongue location="cheek"/> tags in the future.
* http://boost-sandbox.sourceforge.net/libs/proto/doc/html/boost_proto/users_g...
- "After Proto has calculated a new expression type, it checks the domains of the children expressions. They must match."
Does this impair DSEL interoperability?

It does. Thanks for bringing that issue up; I had forgotten about it. It's an open design question. What is the domain of an expression such as (A + B) where A and B are in different domains? There needs to be an arbitration mechanism, but Proto doesn't have one yet. I don't know what that would look like.
This is important, because interoperability is one of the most powerful arguments for DSELs. I think there are a few cases, at least, where you can make some default decisions. For example A(B) and A[B] both ought to be handled in A's domain in most cases.
Yes, that's how it works currently.
Other more symmetric expressions probably need some explicit wrappers:
A + in_domain<A>(B)
or something.
Ah! That makes sense, is very explicit, and would be very easy to implement. Thanks.
And I really don't want to have to explain PETE in Proto's docs.
I was thinking more along the lines of "_this_ roughly corresponds to _that_ but look how much easier _this other thing_ is in Proto."
OK -- Eric Niebler Boost Consulting www.boost-consulting.com