
My RangeEx review follows. I'm afraid that I have not been following much of the ongoing discussion of this library on the list so apologies if I simply repeat what other people are saying. Algorithms ---------- I have always found the verbosity of things like std::sort(container.begin(),container.end()) to be an embarrassing misfeature of the std library so it's good to be able to now write the much more sensible sort(container). Just a couple of comments: - It would be good if the docs were explicit about this being a thin wrapper around your existing std::sort, and identifying those algorithms that are not so thin. - Do we really want this in the top-level boost:: namespace? There have been discussions about sorting libraries for Boost who might also want boost::sort. Would it not be better to put these algorithms in a boost::range namespace? Then we move on to the algorithms where you can select what is returned using a template parameter. This is an interesting idea, but it's a little verbose and I can't think of any precedent for it in the std library or elsewhere in Boost. Being unfamiliar increases the learning curve for a feature. I was wondering if it would be possible to have something more like: boost::unique(rng).found instead of boost::unique<boost::return_found>(rng) Or maybe not; just a thought. You then have this example for sorting and removing duplicates: boost::erase( vec, boost::unique<boost::return_found_end>( boost::sort(vec) ) ); Err... yuk. Why can't I write: unique(vec); Or maybe unique(sort(vec)); or sort(vec); unique(vec); My point is that std::unique only has its "move duplicates to the end" behaviour because the conventional interface with a pair of iterators doesn't let you change the size of the container. Now that we have a unique() that takes a container we don't need that restriction. I recently tried to write a starts_with algorithm: template <typename CONTAINER> bool starts_with(const CONTAINER& a, const CONTAINER& b) { // does a start with b ? return std::equal(b.begin(), b.end(), a.begin()); } The problem with this is that it will crash if a is shorter than b, because std::equal takes only three not four iterators. The fix is to add an additional test: return a.size() >= b.size() && std::equal(b.begin(), b.end(), a.begin()); Unfortunately, for containers where size() is O(N) this makes the algorithm O(N+M), which is especially problematic if the common case if for a difference to be found early in the sequences. So I was curious to see if range::equal addresses this problem: it has the potential to do so since it has access to all four ends of the containers. According to the docs the complexity is "Linear. Exactly distance(rng) comparisons." which doesn't look hopeful, but what "rng" is it referring to? Looking at the source, however, it seems that it does have an optimal implementation. I think this needs a documentation fix. This also makes me wonder if these would be useful algorithms: bool longer(rng1,rng2); bool equal_length(rng1,rng2); bool size_greater_than(rng,size_t); bool size_equal_to(rng,size_t); Adaptors -------- The author seems to be very enthusiastic about chaining functions using operator|. I'm not inherently opposed to this idea but I don't think it should be the role of this library, RangeEx, to provide it. In my opinion this feature should be a general-purpose one. Example: y = sqrt ( sin (x) ); In natural language we may choose to describe that as "the square root of the sine of x" or "take the sine of x and then take the square root of that". If the second of those seems more "natural" to you then you may prefer to write: y = x | sin | sqrt; or maybe even y = (sin | sqrt) (x); I guess that someone who is better at metaprogramming than me could write a general-purpose operator| that can be used in this way. Here's a start: template <typename T1, typename T2, typename T3> struct chained_func { typedef function<T1(T2)> f1_t; typedef function<T2(T3)> f2_t; f1_t f1; f2_t f2; chained_func(f1_t f1_, f2_t f2_): f1(f1_), f2(f2_) {} T1 operator()(T3 arg) { return f1(f2(arg)); } }; template <typename T1, typename T2, typename T3> chained_func<T1,T2,T3> operator|(function<T1(T2)> f1, function<T2(T3)> f2) { return chained_func<T1,T2,T3>(f1,f2); } template <typename T1, typename T2, typename T3> T1 operator|(T3 val, chained_func<T1,T2,T3> cf) { return cf(val); } This feature could then be available for any purpose that the user prefers, including range adaptors, and RangeEx users who don't want to use it can ignore it entirely. (For all I know, maybe this already exists somewhere.) Even if you choose to keep operator| I suggest making the documentation less "evangelical". Quote: "Why do I prefer the operator| syntax? The answer is readbility:" and you go on to show one version that uses operator| and one that doesn't. I would argue that the version using operator| is _less_ readable simply because it uses operator|, and the reader will wonder why you're doing a bitwise OR between a vector and a, umm, a something else. Of course anyone who reads the documentation will understand it, but one of the best features of this library is that, as in the example of sort(container), the functions do "exactly what they say on the tin", and a casual reader of code using it doesn't need to have digested the docs first. Quote: * boost::copy_if( rng, pred, out ) * boost::count_if( rng, pred ) These algorithms are not defined and the reason is that both of them can be expressed by using the original algorithm togther with a Adaptor Generator: The fact that something can be expressed using a combination of other things is not a sufficient justification for leaving it out. For example, sort(container) could be expressed as sort(container.begin(),container.end()) yet you have still provided it. Why should I have to write: int n_zeros = count( container | boost::adaptors::filtered(_1==0) ); instead of int n_zeros = count_if(container, _1==0); ? Summary ------- - What is your evaluation of the design? I would prefer that operator| be factored out. Otherwise I am satisfied with the design. - What is your evaluation of the implementation? I have not properly studied the implementation, but I have not seen any problems in the code that I have looked at. - What is your evaluation of the documentation? Good. - What is your evaluation of the potential usefulness of the library? The basic feature of being able to pass containers to algorithms makes it useful enough to justify inclusion. - Did you try to use the library? With what compiler? Did you have any problems? I did some very basic experiments with gcc 4.1 on linux, with no problems. - How much effort did you put into your evaluation? A glance? A quick reading? In-depth study? A couple of hours. - Are you knowledgeable about the problem domain? Not especially. - Do you think the library should be accepted as a Boost library? I would prefer to see the operator| stuff factored out, as described above. But given the choice between including the library as-is or not including it at all I would choose to include it as is, since the algorithms alone are sufficiently useful to justify inclusion. Regards, Phil.

Phil Endecott wrote:
- It would be good if the docs were explicit about this being a thin wrapper around your existing std::sort, and identifying those algorithms that are not so thin.
Actually, I believe sort(c) calls c.sort() if it exists. So it's not just std::sort.
You then have this example for sorting and removing duplicates:
boost::erase( vec, boost::unique<boost::return_found_end>( boost::sort(vec) ) );
Err... yuk. Why can't I write:
unique(vec);
Or maybe
unique(sort(vec));
or
sort(vec); unique(vec);
My point is that std::unique only has its "move duplicates to the end" behaviour because the conventional interface with a pair of iterators doesn't let you change the size of the container. Now that we have a unique() that takes a container we don't need that restriction.
Not a container, a range. You can't remove elements from a range, only move those to the end and change the end. However, uniqued(vec) returns a new lazy range without duplicates (if the range is sorted), and unique_copy(vec) returns a new range without duplicates.

Mathias Gaunard wrote:
Phil Endecott wrote:
- It would be good if the docs were explicit about this being a thin wrapper around your existing std::sort, and identifying those algorithms that are not so thin.
Actually, I believe sort(c) calls c.sort() if it exists. So it's not just std::sort.
I don't see any evidence of that in boost/range/algorithm/sort.hpp. Where should I be looking? This would of course be a useful feature, and documenting whether it is or is not done would be useful.
unique(vec);
My point is that std::unique only has its "move duplicates to the end" behaviour because the conventional interface with a pair of iterators doesn't let you change the size of the container. Now that we have a unique() that takes a container we don't need that restriction.
Not a container, a range.
The fact that the container that I pass is converted to a range can be considered as an implementation detail for many applications of this library. There is a lot of stuff in C++ that is more complicated to use than it could be. We have all got used to how things are and know the reasons why, but it presents an obstacle to new users. This library reduces some of this complexity, which is really great. I suggest that we consider how much further we can go in this direction. Maybe algorithms that really take containers, rather than ranges, are beyond the scope of this library. Or maybe not.
However, uniqued(vec) returns a new lazy range without duplicates (if the range is sorted), and unique_copy(vec) returns a new range without duplicates.
Neither allows me to unique a container in-place though. Cheers, Phil.

Phil Endecott wrote:
Mathias Gaunard wrote:
Actually, I believe sort(c) calls c.sort() if it exists. So it's not just std::sort.
I don't see any evidence of that in boost/range/algorithm/sort.hpp. Where should I be looking? This would of course be a useful feature, and documenting whether it is or is not done would be useful.
It is done by the range_ex version in the sandbox. Maybe it has been dropped then, a rationale for that would be nice.
My point is that std::unique only has its "move duplicates to the end" behaviour because the conventional interface with a pair of iterators doesn't let you change the size of the container. Now that we have a unique() that takes a container we don't need that restriction.
Not a container, a range.
The fact that the container that I pass is converted to a range can be considered as an implementation detail for many applications of this library.
So you would want unique to have different semantics whether the passed range is a container or not? It would probably be best to have "unique" have the same semantics as std::unique and "unique_inplace" have the semantics you want. Also, for a container such as vector, should unique_inplace actually move the elements to remove to the end before erasing them, since that should be more efficient, or should it erase duplicates as they are found?
participants (2)
-
Mathias Gaunard
-
Phil Endecott