Re: [boost] [Phoenix] Some questions and notes...

26 Sep 2008

      On Fri, Sep 26, 2008 at 4:44 PM, Joel de Guzman
<joel@boost-consulting.com> wrote:
...
Giovanni Piero Deretta wrote:
...
...
IMO, it's not controversial. I've considered this approach a long
time ago. It's actually doable: evaluate immediately when there are no
placeholders in an expression. I'm not sure about the full effect
of this behavior, OTOH. Such things should be taken very carefully.
Mind you, the very impact of immediate evaluation on expressions like
above already confuses people. Sometimes, the effect is subtle
and is not quite obvious when you are dealing with complex
lambda expressions. I know, from experiences with ETs (prime
example is Spirit), that people get confused when an expression
is immediate or not. The classic example:
for_each(f, l, std::cout << 123 << _1)
Oops! std::cout << 123 is immediate. But that's just for starters.
With some heavy generic code, you can't tell by just looking at the
code which expressions are being evaluated immediately or lazily.
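
The pitfall in the classic example can be reproduced without Phoenix. The following is a minimal sketch under stated assumptions: `toy::placeholder`, `toy::arg1` and `toy::lazy_print` are hypothetical stand-ins, not the library's real `_1` or its expression types. Only `stream << placeholder` is lazy here, so `out << 123` runs once, at the point where the expression is built.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

namespace toy {

// Hypothetical stand-in for a Phoenix-style placeholder (not the real _1).
struct placeholder {};
const placeholder arg1 = {};

// Lazy node produced by "stream << arg1": it remembers the stream and
// writes its argument only when it is called.
struct lazy_print {
    std::ostream* os;
    template <class T>
    void operator()(const T& x) const { *os << x; }
};

// The only lazy overload: selected when the right operand is the placeholder.
inline lazy_print operator<<(std::ostream& os, placeholder) {
    lazy_print f = { &os };
    return f;
}

}  // namespace toy

// Returns the stream contents after applying the expression to {7, 8, 9}.
inline std::string run_classic_example() {
    std::ostringstream out;
    // "out << 123" contains no placeholder, so it is evaluated right here.
    toy::lazy_print f = out << 123 << toy::arg1;
    std::vector<int> v = {7, 8, 9};
    for (int x : v) f(x);  // only the "<< arg1" part runs per element
    return out.str();      // "123789", not "123712381239"
}
```

Under these assumptions `123` appears once in the output, which is the surprise described above: nothing in the call site marks which part of the expression was immediate.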
Ok, I'll start by saying that for me this is not just "it would be
great if...". I routinely use an extension to boost.lambda that allows
one to define lazy functions a-la phoenix, except that they are lazy only
if one or more of their parameters is a lambda expression.
Are you suggesting that you want phoenix functions to be "optionally
lazy" too? Currently, they are not. That can be done, but I need
more convincing.
I would love if they were. But I can live with a layer on top of that.
About the convincing part... hum I guess the only way is to try them
and see if one finds them convenient. For me it started as an
experiment that worked.
...
...
I usually assume that an expression is not lazy unless I see a
placeholder in the right position. I do not think that generic code
makes things worse: I carefully wrap all function parameters with the
equivalent unlambda [1] *before* I pass them to higher order
functions. In fact I always have to because in general I do not know
if a certain function is optionally lazy or not.
This is exactly the problem I see with generic code.
In general the compiler will loudly tell you if you get something
wrong (even if the errors might be unintelligible). Except of course
for something like the example below...
...
...
The biggest problem with optional laziness is in fact not in generic
code, but in simple top level imperative code: most of my lazy
function objects are pure functions, with the most common exception
the 'for_each' wrapper. Sometimes I forget to protect the function
argument with lambda[], which makes the whole function call a lambda
expression.
You always use the result of a pure function, so the compiler will
loudly complain if it gets a lambda expression instead of the expected
type, but it is not usually the case with for_each. So the code will
compile, but will be silently a nop.
I'm not sure I understand this part. Can you explain this with
simple examples?
Ok, let's assume for_each is optionally lazy:

   for_each(range, lambda[cout << arg1]);

will print all the elements. What if I forget the lambda?

  for_each(range, cout << arg1);

D'oh, now everything is a big unary lambda expression. It compiles,
but as the lambda is never evaluated, it is a nop. At least gcc
doesn't warn that the code doesn't do anything.
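
A minimal sketch of how this silent no-op arises, assuming a toy model of optional laziness. Everything in namespace `toy` is hypothetical and only mimics the behaviour described above; it is neither Boost.Lambda nor Phoenix. `for_each` runs immediately when given an ordinary function object and merely builds a suspended call when given a lambda expression.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <type_traits>
#include <vector>

namespace toy {

struct lazy_base {};                  // marks unevaluated lambda expressions
struct placeholder : lazy_base {};
const placeholder arg1 = {};

// "os << arg1": a lazy expression that prints its argument when called.
struct print_expr : lazy_base {
    std::ostream* os;
    explicit print_expr(std::ostream& o) : os(&o) {}
    template <class T> void operator()(const T& x) const { *os << x; }
};
inline print_expr operator<<(std::ostream& os, placeholder) { return print_expr(os); }

// lambda[expr]: wraps the expression in a plain function object,
// so it no longer counts as a lambda expression.
template <class Expr>
struct protected_fn {
    Expr expr;
    template <class T> void operator()(const T& x) const { expr(x); }
};
struct lambda_gen {
    template <class Expr>
    protected_fn<Expr> operator[](const Expr& e) const { return protected_fn<Expr>{e}; }
};
const lambda_gen lambda = {};

// Suspended for_each call. Evaluating it is left out of this toy,
// because the point is that nobody ever calls it.
template <class Range, class Expr>
struct lazy_for_each : lazy_base {
    const Range* range;
    Expr expr;
    lazy_for_each(const Range& r, const Expr& e) : range(&r), expr(e) {}
};

// Eager overload: f is an ordinary function object, so run now.
template <class Range, class F>
typename std::enable_if<!std::is_base_of<lazy_base, F>::value>::type
for_each(const Range& r, F f) { for (const auto& x : r) f(x); }

// Lazy overload: f is itself a lambda expression, so nothing runs.
template <class Range, class F>
typename std::enable_if<std::is_base_of<lazy_base, F>::value,
                        lazy_for_each<Range, F> >::type
for_each(const Range& r, F f) { return lazy_for_each<Range, F>(r, f); }

}  // namespace toy

inline std::string with_lambda() {
    std::ostringstream out;
    std::vector<int> v = {1, 2, 3};
    toy::for_each(v, toy::lambda[out << toy::arg1]);  // prints each element
    return out.str();
}

inline std::string without_lambda() {
    std::ostringstream out;
    std::vector<int> v = {1, 2, 3};
    toy::for_each(v, out << toy::arg1);  // builds a lazy expression, then drops it
    return out.str();
}
```

In this sketch `with_lambda()` yields "123" while `without_lambda()` yields an empty string, and both calls compile without complaint.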

If your lazy functions are pure functions, you will always use the
value (to pass it to another function or store it in a variable or
whatever):

  int i = map(range_of_ints, lambda[ arg1 * 2]).front();

If you forget the lambda, it will complain that the result of map
doesn't have a front() member.

C++0x auto will make things a bit trickier though:

  auto r2 =  map(range_of_ints, arg1 * 2);

The user wanted r2 to be a range, but, as he forgot lambda[], it is
actually a lambda expression. The compiler will probably complain when
he tries to use r2, but the error will be more incomprehensible than
usual.
...
...
...
...
Anyways, I can live with the current design, 'optional laziness' could
be built on top of phoenix lazy functions. My only complaint is that
the 'lambda[]' syntax is already taken.
I'd like to get convinced. Can you give me a nice use case
for this 'optional laziness' thing that cannot be done with
the current interface?
I find it very convenient to define functions that I use very often
(for example tuple accessors, generic algorithms, etc...) as
polymorphic function objects. I put a named instance in an header file
and I use them as normal functions. Except that I can pass them to
higher order functions without monomorphizing them. In addition I can
use the exact same name in a lambda expression without having to pull
in additional headers or namespaces. After a while it just feels
natural and you wish that all functions provided the same
capabilities.
The following snippet is taken from production code:

  map(
      ents.to_tokens(entities, tok),
      lambda[
          tuple(
              ll::bind(to_utf8, arg1)
            , newsid, quote_id, is_about
          )
      ]
  ) | copy(_, out);

Some explanations:
*   'a | f'  is equivalent to  'f(a)'. In practice I often use it to
chain range algorithms, but it can be used for anything.
*  '_' is similar to 'arg1', except that it can only be used as a
parameter to a lazy lambda function and the resulting  unary function
is no longer a lambda expression (so the 'lambdiness' doesn't
propagate).
map returns a pair of transform iterators, copy is the range
equivalent of std::copy and tuple is equivalent to make_tuple. All
three functions can be used both inside and outside of lambdas.
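
The `a | f` notation can be sketched in a few lines. This is an illustration only: `toy::pipeable` and `toy::make_pipeable` are hypothetical names, and the real code is built on boost.lambda and works on iterator ranges rather than on copied vectors.

```cpp
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

namespace toy {

// Marks a callable as usable on the right-hand side of '|', so the
// operator below does not apply to unrelated types.
template <class F>
struct pipeable { F f; };

template <class F>
pipeable<F> make_pipeable(F f) { return pipeable<F>{f}; }

// a | f  is defined as  f(a)
template <class A, class F>
auto operator|(A&& a, const pipeable<F>& p) -> decltype(p.f(std::forward<A>(a))) {
    return p.f(std::forward<A>(a));
}

}  // namespace toy

inline int pipe_demo() {
    auto twice = toy::make_pipeable(
        [](std::vector<int> v) -> std::vector<int> {
            for (int& x : v) x *= 2;
            return v;
        });
    auto sum = toy::make_pipeable(
        [](const std::vector<int>& v) { return std::accumulate(v.begin(), v.end(), 0); });

    std::vector<int> v = {1, 2, 3};
    return v | twice | sum;  // same as sum(twice(v)), reads left to right
}
```

Chaining this way keeps the data flow in reading order, which is the convenience argued for here; it adds no expressive power over nested calls.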
So, it is not really a question of power, just of convenience. You can
always have two functions, one lazy and one not, but I like to have
everything packaged in a single place.
Compile times are of course not pretty, and it requires lots of code to
roll this syntax on top of boost.lambda. I think that a port to
Phoenix will be much simpler and probably lighter at compile time.
That is pretty cool code. I'm still not sure of the implications
of all these though. I know for sure that people less smart than
you tend to get bitten by expressions that are intended to
be lazy but are actually immediate. Perhaps we can arrange
for an "optionally-lazy" layer on top of phoenix: it can be done,
phoenix is modular enough to have that layer.
It would be great.
...
In general though, I tend to avoid special cases. This
"optional laziness" is based on special casing depending
on some qualities of a lambda function.
Well, I guess that is a point of view.. as I see it, functions are
usually evaluated, unless some of the arguments are suspended: it is
not eager evaluation that is special, but laziness (or partial
application or whatever you want to call it).
...
This may be outside
our subject, but this same special casing is the reason why
I rejected lambda's design to have optional-reference-capture
on the LHS. For example, this is allowed in lambda:
int i = 0;
  i += _1;
But in Phoenix, it would have to be:
int i = 0;
  ref(i) += _1;
All variables are captured by value, always. I know that's
off the subject, but I'm sure you see what I mean.
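
The difference between the two capture modes can be made concrete with a toy model. This is a sketch under stated assumptions: `toy::val`, `toy::ref` and `toy::arg1` are hypothetical stand-ins written for this example, and the real libraries' expression types are far more general.

```cpp
#include <cassert>

namespace toy {

struct placeholder {};
const placeholder arg1 = {};

// val(x): copies x into the expression.
template <class T> struct value_node { T copy; };
template <class T> value_node<T> val(const T& x) { return value_node<T>{x}; }

// ref(x): stores only the address of x.
template <class T> struct reference_node { T* target; };
template <class T> reference_node<T> ref(T& x) { return reference_node<T>{&x}; }

// val(x) + arg1: adds the stored copy to the argument.
template <class T> struct plus_expr {
    T copy;
    T operator()(const T& a) const { return copy + a; }
};
template <class T>
plus_expr<T> operator+(value_node<T> v, placeholder) { return plus_expr<T>{v.copy}; }

// ref(x) += arg1: updates the original variable when called.
template <class T> struct add_assign_expr {
    T* target;
    T& operator()(const T& a) const { return *target += a; }
};
template <class T>
add_assign_expr<T> operator+=(reference_node<T> r, placeholder) {
    return add_assign_expr<T>{r.target};
}

}  // namespace toy

inline int value_capture_demo() {
    int i = 10;
    auto f = toy::val(i) + toy::arg1;  // the expression holds a copy of 10
    i = 100;                           // later changes are not seen by f
    return f(1);                       // 10 + 1
}

inline int reference_capture_demo() {
    int i = 0;
    auto f = toy::ref(i) += toy::arg1;  // the expression holds &i
    f(5);
    f(2);
    return i;                           // both calls modified the original
}
```

With an explicit `ref(i)` the reader can see at the call site that `i` will be modified, which is the argument against special-casing the left-hand side.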
I didn't even know that lambda captured by reference in this case (if
it is documented, I missed it)! I always use ref(i)... which begs
another question: could you add a shorthand syntax to tell that a
specific variable must be captured by reference? Lambda has it, but is
a bit cumbersome:

  int i =0;
  var_type<int> r_i = var(i);
  (_1 + r_i)(0);  //capture by ref

something like this is desirable (and in fact can be done easily, but
IMHO should be part of the library):

  byref<int> i = 0;
  (_1 + i)(0); // capture by ref

BTW, yet another reason for always requiring lambda[] or something
equivalent: the library could delay the decision of capturing by
reference or by value by default up to the lambda introducer (I think
I already mentioned something like this in the past):

  int i = 0;

  lambda[ i+= arg1]; //default capture by value
  lambda_r[ i += arg1]; //default capture by reference

It would make Phoenix similar to C++0x lambdas.
Without requiring lambda[]  [1] I do not see how this could be
implemented (especially if you want the default capture behavior to be
by value).

[1] or a different set of placeholders... messy!
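
For comparison, this is how the default capture mode is chosen at the lambda introducer in the lambdas that were standardized in C++11. The sketch uses only standard language features; the proposed `lambda[]` and `lambda_r[]` spellings are not implemented here.

```cpp
#include <cassert>

// [=] : default capture by value. The lambda works on its own copy of i
// ('mutable' is required to modify that copy), so the outer i is untouched.
inline int capture_by_value() {
    int i = 0;
    auto f = [=](int x) mutable -> int { i += x; return i; };
    f(5);
    return i;
}

// [&] : default capture by reference. The lambda modifies the outer i.
inline int capture_by_reference() {
    int i = 0;
    auto f = [&](int x) { i += x; };
    f(5);
    return i;
}
```

The decision is made once, at the introducer, and applies to every variable named in the body.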
...
...
...
...
...
BOOST_TEST((val(ptr)->*&Test::value)() == 1);
This doesn't compile
BOOST_TEST((arg1->*&Test::value)(test) == 1);
because 'test' is not a pointer. OTOH boost::bind(&Test::value,
arg1)(x) compiles for both x of type "Test&" and "Test*".
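
The pointer-versus-reference point can be checked with the standard library. This sketch swaps in `std::bind` for `boost::bind` and assumes a `Test` struct with an `int` data member named `value`; the actual test fixture is not shown in this thread.

```cpp
#include <cassert>
#include <functional>

// Assumed shape of the fixture: a plain data member.
struct Test {
    int value;
};

// bind accepts the object itself...
inline int bind_with_reference() {
    using namespace std::placeholders;
    Test test = {1};
    return std::bind(&Test::value, _1)(test);
}

// ...and also a pointer to it.
inline int bind_with_pointer() {
    using namespace std::placeholders;
    Test test = {1};
    Test* ptr = &test;
    return std::bind(&Test::value, _1)(ptr);
}

// The built-in operators are split: ->* needs a pointer on the left,
// .* needs an object.
inline int builtin_operators() {
    Test test = {1};
    Test* ptr = &test;
    return (ptr->*&Test::value) + (test.*&Test::value);
}
```

The built-in `->*` rejects a non-pointer left operand, which matches the compile error reported above for `arg1->*&Test::value` applied to `test`.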
Ah! Good point. But, hey, doesn't -> imply "pointer"? bind
OTOH does not imply pointer. But sure I get your point and it's
easy enough to implement. Is this a killer feature for you?
No, not really, in fact I can see many would consider it needless
obfuscation. I just thought that was a nicer notation than bind.
Anyways, -> doesn't necessarily imply pointer. See optional.
Hah, I don't want to get in that optional pointer semantics argument
again :P The least I can say is that I never liked it. OptionalPointee
(http://tinyurl.com/3fwlp9) does imply pointer.
Ok, as  I said, it is not a killer feature :)

I want to comment about switch_, but I'll do it by replying to another
thread. After that, time to write a review :)

-- 
gpd