Re: [boost] [Phoenix] Some questions and notes...

26 Sep 2008

      On Fri, Sep 26, 2008 at 3:06 AM, Joel de Guzman
<joel@boost-consulting.com> wrote:
...
Giovanni Piero Deretta wrote:
...
...
It's not broken. As Doug noted in his review, phoenix lambda
is like lambda protect (http://tinyurl.com/3sx7bo).
I would be perfectly fine if lambda[f] worked as protect(f), but it
is actually subtly different.
What I do not like is the extra '()' you have to use to actually get
the protected lambda:
int i = 0;
std::cout << protect(arg1)(i) ; // print 0
Have you tried it? I did, and I get a compiler error.
Yes I did. But I just found out that I had mixed lambda placeholders
with phoenix ones (in my own code I use the argN placeholders with
lambda too, for compatibility with boost.bind). It is interesting that
when using the arg1 phoenix placeholder with lambda.protect everything
compiles and gives me what I expected... weird.

Anyways, as I said in another email, I always confuse protect with
unlambda. I initially expected lambda[] to behave as unlambda.
...
...
I now understand why you need another evaluation round, and I see the
need for local variables in lambdas (I've missed them in boost.lambda,
and it was one of the reasons I was eagerly waiting for phoenix to be
reviewed).
My only objection is that a lambda[f] which doesn't have any local
variables should just return 'f' and not a nullary. In fact I think
this should be a global property of lambda expressions:
Let 'add' be a binary lazy function:
'add(arg1, 0)'
should return a unary function (as is currently the case).  OTOH:
'add(1, 2)'
should immediately be evaluated and not return a nullary function. In
practice, 'add' would be 'optionally lazy'. This is in fact not that
surprising: let's substitute add with its corresponding operator:
'arg1 + 0'
returns a unary function, but
'1 + 2'
is immediately evaluated. I know this is a bit controversial and would
probably require large code changes, but probably a review is the best
place to comment on design aspects.
IMO, it's not controversial. I've considered this approach a long
time ago. It's actually doable: evaluate immediately when there are no
placeholders in an expression. I'm not sure about the full effect
of this behavior, OTOH. Such things should be taken very carefully.
Mind you, the very impact of immediate evaluation on expressions like
above already confuses people. Sometimes, the effect is subtle
and is not quite obvious when you are dealing with complex
lambda expressions. I know, from experiences with ETs (prime
example is Spirit), that people get confused when an expression
is immediate or not. The classic example:
for_each(f, l, std::cout << 123 << _1)
Oops! std::cout << 123 is immediate. But that's just for starters.
With some heavy generic code, you can't tell by just looking at the
code which expressions are being evaluated immediately or lazily.
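To make the pitfall concrete, here is a minimal standalone sketch in C++17. It does not use Phoenix, Boost.Lambda or Spirit; the names 'placeholder', '_1' and 'lazy_print' are hypothetical stand-ins invented for this illustration, and the call uses the standard std::for_each signature rather than the one quoted above.

```cpp
#include <algorithm>
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

namespace demo_stream {

// Hypothetical placeholder type; stands in for a library's _1 / arg1.
struct placeholder {};
constexpr placeholder _1{};

// Lazy node: remembers the stream and prints its argument when called.
struct lazy_print {
    std::ostream* os;
    template <class T>
    void operator()(const T& x) const { *os << x; }
};

// Only a sub-expression that involves the placeholder becomes lazy.
inline lazy_print operator<<(std::ostream& os, placeholder) {
    return lazy_print{&os};
}

// Returns everything that ended up in the stream.
inline std::string immediate_vs_lazy() {
    std::ostringstream out;
    std::vector<int> v{7, 8, 9};
    // `out << 123` contains no placeholder, so it is an ordinary stream
    // insertion that runs once, before for_each is even entered.
    // Only the trailing `<< _1` is deferred and applied per element.
    std::for_each(v.begin(), v.end(), out << 123 << _1);
    return out.str();
}

} // namespace demo_stream
```

With this sketch the stream receives 123 a single time, followed by the three elements, rather than 123 in front of every element.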
Ok, I'll start by saying that for me this is not just "it would be
great if...". I routinely use an extension to boost.lambda that allows
one to define lazy functions a-la phoenix, except that they are lazy only
if one or more of their parameters is a lambda expression.
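A rough idea of what "lazy only if a parameter is a lambda expression" can look like, as a standalone C++17 sketch. This is not the actual boost.lambda extension and not Phoenix; 'placeholder', 'arg1', 'resolve' and 'add_fn' are hypothetical names, and only a single placeholder is supported to keep it short.

```cpp
#include <cassert>
#include <type_traits>

namespace demo_lazy {

// Hypothetical placeholder; stands in for arg1.
struct placeholder {};
constexpr placeholder arg1{};

template <class T>
inline constexpr bool is_placeholder_v =
    std::is_same_v<std::decay_t<T>, placeholder>;

// Substitute the call argument for a placeholder; keep bound values as-is.
template <class Bound, class Arg>
constexpr auto resolve(const Bound& bound, const Arg& arg) {
    if constexpr (is_placeholder_v<Bound>) return arg;
    else return bound;
}

// 'Optionally lazy' binary function object.
struct add_fn {
    template <class A, class B>
    constexpr auto operator()(A a, B b) const {
        if constexpr (is_placeholder_v<A> || is_placeholder_v<B>) {
            // At least one placeholder: return a unary function.
            return [a, b](auto x) { return resolve(a, x) + resolve(b, x); };
        } else {
            // No placeholder: evaluate immediately, like '1 + 2'.
            return a + b;
        }
    }
};
constexpr add_fn add{};

inline int optional_laziness_demo() {
    // add(1, 2) is a plain int, not a nullary function object.
    static_assert(std::is_same_v<decltype(add(1, 2)), int>);
    auto plus_zero = add(arg1, 0);     // unary function
    return plus_zero(41) + add(1, 2);
}

} // namespace demo_lazy
```

The choice between the two behaviours is made at compile time from the argument types alone, which is what lets the same name be used inside and outside a lambda expression.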

I usually assume that an expression is not lazy unless I see a
placeholder in the right position. I do not think that generic code
makes things worse: I carefully wrap all function parameters with the
equivalent unlambda [1] *before* I pass them to higher order
functions. In fact I always have to because in general I do not know
if a certain function is optionally lazy or not. The additional
advantages of always using lambda[] are that:

- a lambda[] stands out in the code better than a placeholder, so it
is clearer what is going on (if you think about it, pretty much every
other language that supports a lambda abstraction has a lambda
introducer).

- the rule to determine the scope of a placeholder is simpler: it
doesn't cross a lambda[] barrier.

The biggest problem with optional laziness is in fact not in generic
code, but in simple top level imperative code: most of my lazy
function objects are pure functions, the most common exception being
the 'for_each' wrapper. Sometimes I forget to protect the function
argument with lambda[], which makes the whole function call a lambda
expression.
You always use the result of a pure function, so the compiler will
loudly complain if it gets a lambda expression instead of the expected
type, but that is usually not the case with for_each. So the code will
compile, but will silently be a no-op.
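The silent no-op can be reproduced with a small standalone C++17 sketch. Everything here is hypothetical and only mimics the behaviour described: 'lambda_expr_tag' marks lambda expressions, 'add_to' plays the role of a lambda expression with a side effect, 'protect' plays the role my lambda[] has, and 'for_each' is an optionally lazy range wrapper.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

namespace demo_nop {

// Anything derived from this tag counts as a lambda expression.
struct lambda_expr_tag {};

template <class T>
inline constexpr bool is_lambda_expr_v =
    std::is_base_of_v<lambda_expr_tag, std::decay_t<T>>;

// A 'lambda expression' with a side effect: adds its argument to a total.
struct add_to : lambda_expr_tag {
    int* total;
    explicit add_to(int& t) : total(&t) {}
    void operator()(int x) const { *total += x; }
};

// Same call behaviour as the wrapped object, but without the tag.
template <class F>
struct protected_fn {
    F f;
    template <class... A>
    decltype(auto) operator()(A&&... a) const {
        return f(std::forward<A>(a)...);
    }
};
template <class F>
protected_fn<F> protect(F f) { return {std::move(f)}; }

// Result of calling for_each with a lambda expression: nothing has run yet.
template <class F>
struct deferred_for_each : lambda_expr_tag {
    F f;
    explicit deferred_for_each(F g) : f(std::move(g)) {}
};

// Optionally lazy: runs now unless the function argument is a lambda expression.
template <class Range, class F>
auto for_each(const Range& range, F f) {
    if constexpr (is_lambda_expr_v<F>) {
        return deferred_for_each<F>(std::move(f));
    } else {
        for (const auto& x : range) f(x);
    }
}

inline int sum_without_protect() {
    std::vector<int> v{1, 2, 3};
    int total = 0;
    demo_nop::for_each(v, add_to(total));           // compiles, result discarded
    return total;
}

inline int sum_with_protect() {
    std::vector<int> v{1, 2, 3};
    int total = 0;
    demo_nop::for_each(v, protect(add_to(total)));  // runs immediately
    return total;
}

} // namespace demo_nop
```

In the unprotected call the whole for_each becomes a deferred object that is thrown away, so the total is never touched, and nothing in the type system objects because the result is unused either way.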

[1] I have rolled my own lambda[] syntax for this (which in addition
makes a boost.lambda expression result_of compatible), and this is
why the behavior of phoenix::lambda surprised me.
...
...
Anyways, I can live with the current design, 'optional laziness' could
be built on top of phoenix lazy functions. My only complaint is that
the 'lambda[]' syntax is already taken.
I'd like to get convinced. Can you give me a nice use case
for this 'optional laziness' thing that cannot be done with
the current interface?
I find it very convenient to define functions that I use very often
(for example tuple accessors, generic algorithms, etc...) as
polymorphic function objects. I put a named instance in an header file
and I use them as normal functions. Except that I can pass them to
higher order functions without monomorphizing them. In addition I can
use the exact same name in a lambda expression without having to pull
in additional headers or namespaces. After a while it just feels
natural and you wish that all functions provided the same
capabilities.

The following snippet is taken from production code:

            map(
               ents.to_tokens(entities, tok),
               lambda[
                    tuple(
                         ll::bind(to_utf8, arg1)
                       , newsid, quote_id, is_about
                    )
                 ]
            ) | copy(_, out) ;

Some explanations:
*   'a | f'  is equivalent to  'f(a)'. In practice I often use it to
chain range algorithms, but it can be used for anything.
*  '_' is similar to 'arg1', except that it can only be used as a
parameter to a lazy lambda function and the resulting unary function
is no longer a lambda expression (so the 'lambdiness' doesn't
propagate).

map returns a pair of transform iterators, copy is the range
equivalent of std::copy and tuple is equivalent to make_tuple. All
three functions can be used both inside and outside of lambdas.

So, it is not really a question of power, just of convenience. You can
always have two functions, one lazy and one not, but I like to have
everything packaged in a single place.

Compile times are of course not pretty, and it takes a lot of code to
roll this syntax on top of boost.lambda. I think that a port to
Phoenix will be much simpler and probably lighter at compile time.
...
...
...
BOOST_TEST((val(ptr)->*&Test::value)() == 1);
This doesn't compile
BOOST_TEST((arg1->*&Test::value)(test) == 1);
because 'test' is not a pointer. OTOH boost::bind(&Test::value,
arg1)(x) compiles for both x of type "Test&" and "Test*".
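The same reference-or-pointer behaviour can be checked with std::bind, used here only as a standard-library stand-in for boost::bind (an assumption made so the sketch needs no Boost headers). The 'Test' struct below is a hypothetical minimal version with 'value' as a data member.

```cpp
#include <cassert>
#include <functional>

namespace demo_bind {

// Hypothetical minimal type; 'value' is a plain data member.
struct Test {
    int value = 1;
};

inline bool bind_accepts_reference_and_pointer() {
    using namespace std::placeholders;
    Test test;
    // Binding a pointer to member: the bound object is invoked through
    // the standard INVOKE rules, which accept an object or a pointer.
    auto get_value = std::bind(&Test::value, _1);
    bool via_reference = (get_value(test) == 1);   // x of type Test&
    bool via_pointer = (get_value(&test) == 1);    // x of type Test*
    return via_reference && via_pointer;
}

} // namespace demo_bind
```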
Ah! Good point. But, hey, doesn't -> imply "pointer"? bind
OTOH does not imply pointer. But sure I get your point and it's
easy enough to implement. Is this a killer feature for you?
No, not really, in fact I can see many would consider it needless
obfuscation. I just thought that was a nicer notation than bind.
Anyways, -> doesn't necessarily imply pointer. See optional. In fact I
think that -> is, in general, a good substitute for the lack of an
overloadable 'operator.'; sometimes I wish that reference_wrapper
provided it.
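A short C++17 sketch of that point, using std::optional in place of boost::optional (an assumption, to stay within the standard library). 'arrow_ref' is a hypothetical wrapper showing what a reference_wrapper with operator-> might look like; it is not an existing class.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>

namespace demo_arrow {

// Hypothetical reference wrapper that forwards member access through '->'.
template <class T>
class arrow_ref {
    T* p_;
public:
    explicit arrow_ref(T& t) : p_(&t) {}
    T* operator->() const { return p_; }
    T& get() const { return *p_; }
};

inline std::size_t total_length() {
    // optional is not a pointer, yet '->' reaches the contained value.
    std::optional<std::string> maybe{"phoenix"};
    std::string s = "lambda";
    arrow_ref<std::string> r(s);
    return maybe->size() + r->size();
}

} // namespace demo_arrow
```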
...
...
BTW, what if the member function is not nullary?
struct foo {
    void bar(int) {}
};
foo * x =...;
int y = 0;
((arg1->*&foo::bar)(arg2))(x, y);
The above (or any simple variation I could think of) doesn't compile.
Is there a way to make something like this work without using bind?
Not that it is very compelling, I'm just curious.
The examples show how. I don't know why your example does not compile.
Again, see /test/operator/member.cpp for examples on this. Here's
one that's not nullary:
struct Test
{
    int func(int n) const { return n; }
};
...
BOOST_TEST((val(ptr)->*&Test::func)(3)() == 3);
and I just added this test for you:
int i = 33;
BOOST_TEST((arg1->*&Test::func)(arg2)(cptr, i) == i);
Compiles fine.
Yes, it compiles fine... I think I was trying to stream out the result
of a void function, sorry for the noise ;).
Thanks.
...
...
...
...
Well, that's all for now, those questions are mostly to get the
discussion rolling, more to come.
An additional question: is it possible to make lambdas always
assignable? ATM, if you close around a reference (something which I
think is very common), the lambda expression is only
Not sure what you mean by "close around a reference".
Hum, "the lambda expression closure captures a reference to a local object"

int i = 2;
auto plus_i = arg1 + ll::ref(i);  //closes around i by reference

decltype(plus_i) y; // error, not default constructible
plus_i = plus_i; //error plus_i not assignable
...
...
CopyConstructible. This means that an iterator that holds a lambda
expression by value internally (think about filter_iterator or
transform iterator) technically no longer conforms to the Iterator
concept.
Good point.
In fact I think the problem is not limited to iterators. AFAIK
standard algorithms require that their functional parameters be
assignable.
...
...
A simple fix, I think, is internally converting all references to
reference wrappers which are assignable.
I wish it was that simple. Anyway, gimme some time to ponder on
the impact of this on the implementation. Perhaps an easier way
is to do as ref does: store references by pointer.
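The difference between holding a raw reference and holding it the way ref does can be seen on two hand-written closures. This is a standalone C++17 sketch with hypothetical types 'plus_ref' and 'plus_wrapped'; it says nothing about how Phoenix stores its captures internally.

```cpp
#include <cassert>
#include <functional>
#include <type_traits>

namespace demo_assign {

// Closure holding a plain reference: copy constructible, not assignable.
struct plus_ref {
    int& i;
    int operator()(int x) const { return x + i; }
};

// Same closure holding a std::reference_wrapper, which is itself assignable.
struct plus_wrapped {
    std::reference_wrapper<int> i;
    int operator()(int x) const { return x + i.get(); }
};

static_assert(std::is_copy_constructible_v<plus_ref>);
static_assert(!std::is_copy_assignable_v<plus_ref>);
static_assert(std::is_copy_assignable_v<plus_wrapped>);
// Neither variant is default constructible.
static_assert(!std::is_default_constructible_v<plus_wrapped>);

inline int rebinding_demo() {
    int a = 2;
    int b = 10;
    plus_wrapped f{std::ref(a)};
    plus_wrapped g{std::ref(b)};
    f = g;          // rebinds f to b; a itself is left untouched
    return f(1);
}

} // namespace demo_assign
```

Note that assignment rebinds the closure to another object rather than assigning through the reference, and that this alone does not make the closure default constructible.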
Sure.

BTW, boost.lambda has the same problem; I use a wrapper that hides the
lambda in an optional (another service provided by my lambda[]). With
in_place construction you can implement assignment as a copy
construction. But it adds overhead.
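A minimal sketch of that optional-based wrapper, written against C++17 std::optional and a C++11-style lambda rather than boost.lambda and boost::optional (both substitutions are assumptions made to keep the example standalone). 'assignable' is a hypothetical name, not the wrapper from my code.

```cpp
#include <cassert>
#include <optional>
#include <type_traits>
#include <utility>

namespace demo_assignable {

// Makes a copy-constructible callable default constructible and assignable
// by storing it in an optional and re-constructing it on assignment.
template <class F>
class assignable {
    std::optional<F> f_;   // the extra 'engaged' flag is the overhead
public:
    assignable() = default;                       // starts empty
    assignable(F f) : f_(std::move(f)) {}
    assignable(const assignable&) = default;

    // Assignment = destroy the old callable, copy-construct the new one.
    assignable& operator=(const assignable& other) {
        if (this != &other) {
            if (other.f_) f_.emplace(*other.f_);
            else f_.reset();
        }
        return *this;
    }

    // Calling an empty wrapper is undefined; callers must assign first.
    template <class... Args>
    decltype(auto) operator()(Args&&... args) const {
        return (*f_)(std::forward<Args>(args)...);
    }
};

inline int demo() {
    int i = 2;
    auto plus_i = [&i](int x) { return x + i; };  // closes around i by reference
    using closure_t = decltype(plus_i);
    static_assert(!std::is_default_constructible_v<closure_t>);
    static_assert(!std::is_copy_assignable_v<closure_t>);

    assignable<closure_t> y;          // default construction now works
    assignable<closure_t> f(plus_i);
    y = f;                            // assignment now works
    i = 40;                           // the closure still refers to i
    return y(2);
}

} // namespace demo_assignable
```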

-- 
gpd