Dear all,
It is my pleasure that Neil Groves' RangeEx library has been accepted
into boost. Congratulations Neil! There are quite a number of minor
issues that need to be resolved before the library is release ready,
see below for a summary.
Review statistics
-----------------
Full Reviews: 8.
Discussion: extensive.
I had the clear impression that everybody that participated in the
discussion were in favor of this library, albeit they did not have time
to submit a full review.
I did not hear a single statement saying that this library should be
rejected.
Thanks to everybody that participated in the review and its discussions.
Issue Summary
-------------
Below is given a list of topics that must be adressed before the
library can be included into boost. In general, we should try
to discuss them one at a time in seperate threads. Many people
suggested various extensions, new algorithms (e.g. from adope), etc.
**In general
we should not require Neil to add all these if he does not have the time
currently: the basic infrastructure, the concepts, the naming
and organization of the library is much more important. Then we can
add all these things later.**
1: documentaion
===============
The documentation should clearly reflect if an algorithm delegates
to a standard algorithm. If the algorithm is new, then it should
state complety and exception-safety. More examples would be welcome.
Sometimes the rationale and explanations could be improved, e.g for
operator|().
2: return type specification for find() etc
===========================================
There where no major objection to the mechanism, but some found
the syntax ugly. I believe the suggested syntax (e.g.)
boost::find[_f,_e]( range, x )
boost::find[_f+1,_e]( range, x )
was considered good by most.
3: namespace organization
=========================
Currently the library uses the namspaces
boost
boost::adaptors
Some raised concern about putting the algorithms straight in
the boost namespace before we have a general agreement on where
to put algorithms (e.g. in boost::algorithm or boost::ranges or whatever).
I think it will make sense to put each algorithm in its own header,
since we might have to include additional standard library headers
for calling algorithms that are implemented as member functions
(but see 7 below).
Feedback is most welcome.
4: naming convention for adaptors
=================================
The following have been proposed throughout the review
(here examplified with "transform"):
| transformed( fun )
| transform( fun )
| transform_view( fun )
| view::transform( fun )
There where no consensus during the review. Other libraries
seem to use the _view suffix, which is a strong argument
for that candidate
(in that case, I suggest to drop the adaptors namespace).
5: naming convention for the adaptor generators
===============================================
Many people like to be able to say make_transform_range(r,fun)
instead of r | transformed( fun ). There was no concensus
on how to name these functions. Here are some candidates:
make_transform_range( r, fun )
transformed( r, fun )
transform( r, fun ) (*)
transform_view( r, fun )
(*) was dislike by some because it has exactly the same spelling as
the algorithm transform. Many felt that the confusion is
too big if view generators are not named different from the actual
algorithms.
6: reintroducing _if variants of algorithms
===========================================
An important problem with the suggested approach
was the the iterator returned by an algorithm
after applying | filtered( pref ) are now filtered
iterators and so one needs to manually unwrap the return
iterator. This provided enough justification for reintroducing
these algorithms. (One could imagine that this unwrapping
was done automatically by a conversion operator such that
iterator i = boost::find( r | boost::filtered( pred ), x );
"just worked". However, the syntax is still slightly worse
than just using the original algorithm).
Generic programming is concerned with the use of orthogonal
concepts which put together yields all possible combinations
of algorithms. If we reintroduce all _if algorithms, we have to aks
ourselves "when does it end?". Should all new algorithms then
simply add _if variants? This seems very much against the spirit
of generic programming.
The problem with the iterators being changed (as in the find example)
seems to suggest the following guideline:
"If the algorithm returns an iterator, it makes sense to provide
an _if overload. Otherwise it does not."
Under this guideline, find_if() should be there, but
count_if() should not be provided.
7: should algorithms call member function when possible?
========================================================
I heard one strong voice against this. The reason was that
the algorithm implemented as member function often has quite
different guarantees w.r.t. complexity, reference and iterator
stabililty. I think this is a very strong argument. Also, we
can add this later if good arguments appear, but it is much
harder to take away. We also have to remember Scott Meyers
item "Beware of container independent code" which also suggest
that this is a bad idea.
At the very menimum, the original algorithm and the member function
should have similar semantics. This seems to suggest that
we can call set::lower_bound() from boost::lower_bound(). In that
case, the best way would probably be to add these as overloads
in the boost namespace:
template< class T, class A >
typename std::set<T,A>::iterator
lower_bound( std::set<T,A>& s );
template< class T, class A >
typename std::set<T,A>::const_iterator
lower_bound( const std::set<T,A>& s );
8: making return values of algorithms consistent/intuitive
==========================================================
For example, sort(r) returns the sorted range, but
sort_heap(r) does not. Similar issues for partial_sort().
Please go through all algorithms to see if this done correctly.
9: output range concept?
=========================
It appears that the only use for an output iterator in the library
was for boost::copy(rng,iter), and the only use for that function
was for "sinks" like std::ostream_iterator<T>(...).
This seems to be the only safety hole left in the library.
Here's an idea to how we might remove that: create a new "copy sink"
concept:
struct array_copy_sink
{
...
template< class Iterator >
copy( Iterator begin, Iterator end )
{ ... }
};
Then we might imagine
boost::copy( rng, boost::ostream_sink(std::cout) );
boost::copy( rng, boost::array_sink(an_array) );
boost::copy( rng, boost::ptr_sink(begin,end) );
boost::copy( rng, boost::front_insert_sink(a_deque) );
...
For optimal performance, we also need a "nonoverlapping copy sink"
concept, and an algorithm that expects that sink
(or that boost::copy() determines the type of the sink automatically).
10: automatic projection
=======================
Adope has algorithms overloaded such that projection is very easy:
struct my_type {
int member;
double another_member;
};
//...
my_type a[] = { { 1, 2.5 }, ... };
// sort the array by the first member:
sort(a, less(), &my_type::member);
my_type* i = lower_bound(a, 5, less(), &my_type::member);
We should be able to express this well, albeit not as concise, with
my_type* i =
boost::lower_bound( a | project( &my_type::member ), 5, less() );
where project( ... ) could simply return a transform iterator
constructed with boost::bind().
There is another problem, however, because now the return value is a
tranform_iterator of some form. Appending .base() would solve it,
albeit it is somewhat ugly.
Again this might raise the question of an implicit conversion.
I just want to say this is not a critical issue as we can always
add these overloads.
--- End of Issues ----
Again, I would like to thank everybody that participated in the review.
best regards
Thorsten, review manager