Iterator adaptors <-> boost::bind interaction

The following code currently fails to compile because of a subtle interaction between bind-produced function objects and the current implementation of transform_iterator's 'dereference' function (the same applies to 'filter_iterator' and most likely the rest of the adaptors in the library): #include <boost/iterator/transform_iterator.hpp> #include <boost/bind.hpp> int add_two( int x ) { return x + 2; } int main() { int x[] = { 0 }; *boost::make_transform_iterator( boost::make_transform_iterator( x, boost::bind( &add_two, _1 ) ) , boost::bind( &add_two, _1 ) ); } The problem here is that the result of 'boost::bind( &add_two, _1 )' takes its argument by reference, while transform_iterator's 'dereference' implementation is: typename super_t::reference dereference() const { return m_f(*this->base()); } ^^^^^^^^^^^^^ In our case, '*this->base()' produces a temporary 'int' which cannot be directly passed to 'm_f'. Effectively, the issue prevents you from layering any iterator adaptors built with 'boost::bind' on top of 'transform_iterator', which, needless to say, is BAD. My sugesstion would be to replace the above with typename super_t::reference dereference() const { typename iterator_reference<Iterator>::type x( *this->base() ); return m_f( x ); } Comments? -- Aleksey Gurtovoy MetaCommunications Engineering

Aleksey Gurtovoy <agurtovoy@meta-comm.com> writes:
The problem here is that the result of 'boost::bind( &add_two, _1 )' takes its argument by reference, while transform_iterator's 'dereference' implementation is:
typename super_t::reference dereference() const { return m_f(*this->base()); } ^^^^^^^^^^^^^
In our case, '*this->base()' produces a temporary 'int' which cannot be directly passed to 'm_f'.
Effectively, the issue prevents you from layering any iterator adaptors built with 'boost::bind' on top of 'transform_iterator',
Only if the underlying iterator is an input iterator whose dereference returns by value, right?
which, needless to say, is BAD.
Regrettable, I agree.
My sugesstion would be to replace the above with
typename super_t::reference dereference() const { typename iterator_reference<Iterator>::type x( *this->base() ); return m_f( x ); }
If it passes all tests I have no objections. -- Dave Abrahams Boost Consulting www.boost-consulting.com

David Abrahams writes:
Aleksey Gurtovoy <agurtovoy@meta-comm.com> writes:
The problem here is that the result of 'boost::bind( &add_two, _1 )' takes its argument by reference, while transform_iterator's 'dereference' implementation is:
typename super_t::reference dereference() const { return m_f(*this->base()); } ^^^^^^^^^^^^^
In our case, '*this->base()' produces a temporary 'int' which cannot be directly passed to 'm_f'.
Effectively, the issue prevents you from layering any iterator adaptors built with 'boost::bind' on top of 'transform_iterator',
Only if the underlying iterator is an input iterator whose dereference returns by value, right?
Well, only if you talk in old iterator categories ;).
which, needless to say, is BAD.
Regrettable, I agree.
My sugesstion would be to replace the above with
typename super_t::reference dereference() const { typename iterator_reference<Iterator>::type x( *this->base() ); return m_f( x ); }
If it passes all tests I have no objections.
OK, thanks! -- Aleksey Gurtovoy MetaCommunications Engineering

Following up with more use cases/data before I forget the details. Aleksey Gurtovoy writes:
The following code currently fails to compile because of a subtle interaction between bind-produced function objects and the current implementation of transform_iterator's 'dereference' function (the same applies to 'filter_iterator' and most likely the rest of the adaptors in the library):
#include <boost/iterator/transform_iterator.hpp> #include <boost/bind.hpp>
int add_two( int x ) { return x + 2; }
int main() { int x[] = { 0 }; *boost::make_transform_iterator( boost::make_transform_iterator( x, boost::bind( &add_two, _1 ) ) , boost::bind( &add_two, _1 ) ); }
The problem here is that the result of 'boost::bind( &add_two, _1 )' takes its argument by reference, while transform_iterator's 'dereference' implementation is:
typename super_t::reference dereference() const { return m_f(*this->base()); } ^^^^^^^^^^^^^
In our case, '*this->base()' produces a temporary 'int' which cannot be directly passed to 'm_f'.
Effectively, the issue prevents you from layering any iterator adaptors built with 'boost::bind' on top of 'transform_iterator', which, needless to say, is BAD.
My sugesstion would be to replace the above with
typename super_t::reference dereference() const { typename iterator_reference<Iterator>::type x( *this->base() ); return m_f( x ); }
... which turns out to be not enough to make transform_iterator usable. For instance, this will still fail: #include <boost/iterator/transform_iterator.hpp> #include <boost/bind.hpp> #include <algorithm> int add_two( int x ) { return x + 2; } bool is_odd( int x ) { return x % 2; } int main() { int x[] = { 0 }; std::count_if( boost::make_transform_iterator( x, boost::bind( &add_two, _1 ) ) , boost::make_transform_iterator( x + 1, boost::bind( &add_two, _1 ) ) , boost::bind( &is_odd, _1 ) ); } I had to dig into our old in-house transform_iterator implementation to see why _we_ don't have this problem (it's been a long time). Translating into the boost::transform_iterator terms, basically, what we do is this: template <class UnaryFunction> struct function_object_result : mpl::if_< is_reference< typename UnaryFunction::result_type > , typename UnaryFunction::result_type , typename UnaryFunction::result_type const // ^^^^^ > { }; This takes care of both the original issue and the earlier snippet. Unfortunately, this is not the end of the story yet. Another iterator library's innovation is blending together transform and projection iterators and using tranformation object's return type to decide whether to return by value (conventional transform_iterator behavior) or by reference (conventional projection_iterator behavior). While conceptually appealing, I'm afraid the folding resulted in a seriously increased fragility of the user code. It became way too easy to write an innocently looking transformation that in fact invokes undefined behavior -- and the latter is not even diagnosed by the most compilers (neither at compile nor at run time). Consider, for instance, this: #include <boost/iterator/transform_iterator.hpp> #include <boost/bind.hpp> #include <algorithm> struct X { X( int i = 7 ) : i_( i ) {} ~X() { this->i_ = -1; } int const& get_i() const { return this->i_; } private: int i_; }; struct Y { X x; }; X make_x( int i ) { return X( i ); } int main() { int i[] = { 8 }; Y y[1]; // undefined behavior #1 (returning reference to a temporary) int r = *boost::make_transform_iterator( boost::make_transform_iterator( i, std::ptr_fun( &make_x ) ) , std::mem_fun_ref( &X::get_i ) ); assert( r == 8 ); // undefined behavior #2 (returning reference to a temporary) r = *boost::make_transform_iterator( boost::make_transform_iterator( y, boost::bind( &Y::x, _1 ) ) , std::mem_fun_ref( &X::get_i ) ); assert( r == 7 ); } IMO this is way too subtle and error-prone to leave it as is. Opinions? -- Aleksey Gurtovoy MetaCommunications Engineering

Aleksey Gurtovoy <agurtovoy@meta-comm.com> writes:
It became way too easy to write an innocently looking transformation that in fact invokes undefined behavior -- and the latter is not even diagnosed by the most compilers (neither at compile nor at run time). Consider, for instance, this:
#include <boost/iterator/transform_iterator.hpp> #include <boost/bind.hpp> #include <algorithm>
struct X { X( int i = 7 ) : i_( i ) {} ~X() { this->i_ = -1; } int const& get_i() const { return this->i_; }
private: int i_; };
struct Y { X x; }; X make_x( int i ) { return X( i ); }
int main() { int i[] = { 8 }; Y y[1];
// undefined behavior #1 (returning reference to a temporary) int r = *boost::make_transform_iterator( boost::make_transform_iterator( i, std::ptr_fun( &make_x ) ) , std::mem_fun_ref( &X::get_i ) );
assert( r == 8 );
// undefined behavior #2 (returning reference to a temporary) r = *boost::make_transform_iterator( boost::make_transform_iterator( y, boost::bind( &Y::x, _1 ) ) , std::mem_fun_ref( &X::get_i ) );
assert( r == 7 ); }
IMO this is way too subtle and error-prone to leave it as is. Opinions?
Well I'm afraid there's too much going on in that code for me to see where the problems are without some narrative. You could take that as an argument that it's too subtle, but personally I would never write something that involved and "just expect it to work," so I don't see it that way. Maybe you could add some commentary? -- Dave Abrahams Boost Consulting www.boost-consulting.com

David Abrahams writes:
Aleksey Gurtovoy <agurtovoy@meta-comm.com> writes:
It became way too easy to write an innocently looking transformation that in fact invokes undefined behavior -- and the latter is not even diagnosed by the most compilers (neither at compile nor at run time). Consider, for instance, this:
#include <boost/iterator/transform_iterator.hpp> #include <boost/bind.hpp> #include <algorithm>
struct X { X( int i = 7 ) : i_( i ) {} ~X() { this->i_ = -1; } int const& get_i() const { return this->i_; }
private: int i_; };
struct Y { X x; }; X make_x( int i ) { return X( i ); }
int main() { int i[] = { 8 }; Y y[1];
// undefined behavior #1 (returning reference to a temporary) int r = *boost::make_transform_iterator( boost::make_transform_iterator( i, std::ptr_fun( &make_x ) ) , std::mem_fun_ref( &X::get_i ) );
assert( r == 8 );
// undefined behavior #2 (returning reference to a temporary) r = *boost::make_transform_iterator( boost::make_transform_iterator( y, boost::bind( &Y::x, _1 ) ) , std::mem_fun_ref( &X::get_i ) );
assert( r == 7 ); }
IMO this is way too subtle and error-prone to leave it as is. Opinions?
Well I'm afraid there's too much going on in that code for me to see where the problems are without some narrative.
Understandable.
You could take that as an argument that it's too subtle,
Yes, that's my line of argument. That, and the fact that the code like this used to work as expected until the blend between transform and projection iterators.
but personally I would never write something that involved and "just expect it to work," so I don't see it that way.
It's only "involved" because it's a minimal test case stripped of the higher-level layers that make it usable in real life. Non-stripped version would be something like: transform_view( transform_view( y, boost::bind( &Y::x, _1 ) , std::mem_fun_ref( &X::get_i ) ) Is it more realistically looking for you, use case-wise?
Maybe you could add some commentary?
Sure. Both of the marked lines above invoke undefined behavior on dereference of the corresponding composite transform iterator because its 'operator*' by the iterator is basically equivalent to this: int const& operator*() const // ^^^^^^^^^^ { // returning a reference to a member of a temporary object // that will be destroyed at the end of the function block scope return make_x( *base() ).get_i(); // #1 // return X( base()->x ).get_i(); // #2 } -- Aleksey Gurtovoy MetaCommunications Engineering

Aleksey Gurtovoy writes:
Sure. Both of the marked lines above invoke undefined behavior on dereference of the corresponding composite transform iterator because its 'operator*' by the iterator is basically equivalent to this: ^^^^^^^^^^^^^^^ Strike this part, please.
-- Aleksey Gurtovoy MetaCommunications Engineering

Aleksey Gurtovoy <agurtovoy@meta-comm.com> writes:
Maybe you could add some commentary?
Sure. Both of the marked lines above invoke undefined behavior on dereference of the corresponding composite transform iterator because its 'operator*' by the iterator is basically equivalent to this:
int const& operator*() const // ^^^^^^^^^^ { // returning a reference to a member of a temporary object // that will be destroyed at the end of the function block scope return make_x( *base() ).get_i(); // #1 // return X( base()->x ).get_i(); // #2 }
Oh. Well, I'm not sure that separating transform_iterator and projection_iterator would help here. AFAICT this issue only really applies when using make_transform_iterator, and only when you're a. assuming a transform_iterator always returns by value and b. unclear on what your functions are doing. OR c. You expect transform_iterator to be a perfect drop-in replacement for the transform_iterator from the earlier Boost library, which I admit is a fair expectation. Maybe we need a make_transform_iterator variant that allows us to force a by-value return or something? -- Dave Abrahams Boost Consulting www.boost-consulting.com

David Abrahams writes:
Aleksey Gurtovoy <agurtovoy@meta-comm.com> writes:
Maybe you could add some commentary?
Sure. Both of the marked lines above invoke undefined behavior on dereference of the corresponding composite transform iterator because its 'operator*' by the iterator is basically equivalent to this:
int const& operator*() const // ^^^^^^^^^^ { // returning a reference to a member of a temporary object // that will be destroyed at the end of the function block scope return make_x( *base() ).get_i(); // #1 // return X( base()->x ).get_i(); // #2 }
Oh. Well, I'm not sure that separating transform_iterator and projection_iterator would help here.
It would help in a sense that the choice whether to return by value or by reference would be explicit, and the default (== transform_iterator) would be always safe. Note that I'm not talking about implementation here, just about what is presented to user at high level.
AFAICT this issue only really applies when using make_transform_iterator, and only when you're
a. assuming a transform_iterator always returns by value
and
b. unclear on what your functions are doing.
There is more to it: even if you _are_ clear what your functions are doing when you are writing the code, the trap is still there in the form of code fragility and ultimately one of the worst maintanance nightmares: suppose 'X::get_i' in my original example was written as struct X { ... int get_i() const { return this->i_; } // ^^^ ... }; Now, given this particular signature, the code I cited is perfectly valid and working, and I'd have hard time blaming anybody for writing something along those lines. Yet change the signature to return by reference (for instance, as an optimization measure) and you've got a hundred of places in your program invoking undefined behavior, and your compiler doesn't even warn you about any of them.
OR
c. You expect transform_iterator to be a perfect drop-in replacement for the transform_iterator from the earlier Boost library, which I admit is a fair expectation.
That, too.
Maybe we need a make_transform_iterator variant that allows us to force a by-value return or something?
I would say transform_iterator should return by value by default and there should be an explicit notation to request by reference semantics (if applicable). -- Aleksey Gurtovoy MetaCommunications Engineering

Aleksey Gurtovoy writes:
David Abrahams writes:
Aleksey Gurtovoy <agurtovoy@meta-comm.com> writes:
Maybe you could add some commentary?
Sure. Both of the marked lines above invoke undefined behavior on dereference of the corresponding composite transform iterator because its 'operator*' by the iterator is basically equivalent to this:
int const& operator*() const // ^^^^^^^^^^ { // returning a reference to a member of a temporary object // that will be destroyed at the end of the function block scope return make_x( *base() ).get_i(); // #1 // return X( base()->x ).get_i(); // #2 }
Oh. Well, I'm not sure that separating transform_iterator and projection_iterator would help here.
It would help in a sense that the choice whether to return by value or by reference would be explicit, and the default (== transform_iterator) would be always safe. Note that I'm not talking about implementation here, just about what is presented to user at high level.
AFAICT this issue only really applies when using make_transform_iterator, and only when you're
a. assuming a transform_iterator always returns by value
and
b. unclear on what your functions are doing.
There is more to it: even if you _are_ clear what your functions are doing when you are writing the code, the trap is still there in the form of code fragility and ultimately one of the worst maintanance nightmares: suppose 'X::get_i' in my original example was written as
struct X { ... int get_i() const { return this->i_; } // ^^^ ... };
Now, given this particular signature, the code I cited is perfectly valid and working, and I'd have hard time blaming anybody for writing something along those lines. Yet change the signature to return by reference (for instance, as an optimization measure) and you've got a hundred of places in your program invoking undefined behavior, and your compiler doesn't even warn you about any of them.
OR
c. You expect transform_iterator to be a perfect drop-in replacement for the transform_iterator from the earlier Boost library, which I admit is a fair expectation.
That, too.
Maybe we need a make_transform_iterator variant that allows us to force a by-value return or something?
I would say transform_iterator should return by value by default and there should be an explicit notation to request by reference semantics (if applicable).
David, any reply on the above? I'd like to come to some kind of resolution on the outlined problem since it is real and hurting. -- Aleksey Gurtovoy MetaCommunications Engineering

Aleksey Gurtovoy <agurtovoy@meta-comm.com> writes:
I would say transform_iterator should return by value by default and there should be an explicit notation to request by reference semantics (if applicable).
David, any reply on the above? I'd like to come to some kind of resolution on the outlined problem since it is real and hurting.
I don't think there's much danger in transform_iterator itself, is there? Isn't this all in make_transform_iterator? -- Dave Abrahams Boost Consulting www.boost-consulting.com

David Abrahams writes:
Aleksey Gurtovoy <agurtovoy@meta-comm.com> writes:
I would say transform_iterator should return by value by default and there should be an explicit notation to request by reference semantics (if applicable).
David, any reply on the above? I'd like to come to some kind of resolution on the outlined problem since it is real and hurting.
I don't think there's much danger in transform_iterator itself, is there? Isn't this all in make_transform_iterator?
Yes, it's in the 'make_transform_iterator's current semantics. I should have phrased the above as I would say 'make_transform_iterator' should by default return a 'transform_iterator' which 'operator*' always returns by value and there should be an explicit notation to request by reference semantics (if it's compatible with the transformation function's return type). -- Aleksey Gurtovoy MetaCommunications Engineering

Aleksey Gurtovoy <agurtovoy@meta-comm.com> writes:
David Abrahams writes:
Aleksey Gurtovoy <agurtovoy@meta-comm.com> writes:
I would say transform_iterator should return by value by default and there should be an explicit notation to request by reference semantics (if applicable).
David, any reply on the above? I'd like to come to some kind of resolution on the outlined problem since it is real and hurting.
I don't think there's much danger in transform_iterator itself, is there? Isn't this all in make_transform_iterator?
Yes, it's in the 'make_transform_iterator's current semantics. I should have phrased the above as
I would say 'make_transform_iterator' should by default return a 'transform_iterator' which 'operator*' always returns by value and there should be an explicit notation to request by reference semantics (if it's compatible with the transformation function's return type).
Okay, so the next question. Do we a. Pick a new name for this make_transform_iterator to avoid breaking existing uses b. Change the semantics to un-break older uses (and if so, what do we call the current semantics?) c. Something else? -- Dave Abrahams Boost Consulting www.boost-consulting.com

David Abrahams writes:
Aleksey Gurtovoy <agurtovoy@meta-comm.com> writes:
Yes, it's in the 'make_transform_iterator's current semantics. I should have phrased the above as
I would say 'make_transform_iterator' should by default return a 'transform_iterator' which 'operator*' always returns by value and there should be an explicit notation to request by reference semantics (if it's compatible with the transformation function's return type).
Okay, so the next question. Do we
a. Pick a new name for this make_transform_iterator to avoid breaking existing uses
b. Change the semantics to un-break older uses (and if so, what do we call the current semantics?)
c. Something else?
I'm strongly in favor of (b); in case I didn't make myself clear in http://article.gmane.org/gmane.comp.lib.boost.devel/124639, though, I maintain that relying on the specific transformation function's signature to get a particular dereference behavior that happens to work now -- that is, the current semantics, -- in majority of cases simply puts in a maintenance time bomb. Furthermore, I might be missing something, but I don't see a single compelling use case for this behavior except that "we can do that". You know when you want your iterator's 'operator*' to return by reference, and once you've established that, you want the compiler to enforce it for you. So, my proposal would be: 1) Revert 'make_transform_iterator' to pre-1.32 semantics (return by value). 2) Introduce 'make_reference_transform_iterator' which would explicitly ask for return by reference and enforce the corresponding signature of the transformation function. 3) Leave 'transform_iterator' intact, allowing for unforeseen use cases which would benefit from its current default behavior. -- Aleksey Gurtovoy MetaCommunications Engineering
participants (2)
-
Aleksey Gurtovoy
-
David Abrahams