
Dear Vincente, On Sun, Feb 22, 2009 at 3:08 PM, vicente.botet <vicente.botet@wanadoo.fr>wrote:
----- Original Message ----- From: "Neil Groves" <neil@grovescomputing.com> To: <boost@lists.boost.org> Sent: Sunday, February 22, 2009 3:11 PM Subject: Re: [boost] Formal Review: Boost.RangeEx - missing algorithms
Dear Vicente,
On Sun, Feb 22, 2009 at 12:13 PM, vicente.botet <
vicente.botet@wanadoo.fr>wrote:
<snip> </snip>
May be it will be godd to add the equivalences. The user can always
include
them in their application if they found useful.
I'm not sure about this issue. I really want to obtain feedback from as many people as possible. I understand the appeal of providing everything that is in <algorithm> but I dislike setting what is, in essence, a bad example of how to write algorithms. My personal preference is to drop the inferior technique, this will hopefully make it easy to do the right thing, and hard to do the wrong thing. I am, of course, open to feedback on this issue. Ultimately the library is for it's users.
It's OK. I will wait from feedback from others also.
The _n operations are replaced if the input is a model of the RandomAccessRangeConcept with boost::copy( output, input | sliced(0, n) ) for example, but on reflection this can and must be improved before release. If I simply add a first_n adaptor then we can replace all _n algorithms with a superior adaptor for all valid ranges. Hence I propose to add something that would allow: boost::copy( output, input | first_n(n) );
I am pleased that you made me justify dropping the _n versions, because frankly my solution is currently insufficient to be a perfect substition.
Yes, somthing like that should work, but what will be the difference between first_n(n) and sliced(0, n)?
The key difference is that while sliced works perfectly fine for RandomAccessRange's it does not work for SinglePassRange's. It is possible to make a first_n work for all valid Ranges. Hence I can't correctly state that I have currently provided the tools to make the _n versions of the algorithms always replacable with a superior alternative. I shall, because I want to get rid of them, I should have provided a better alternative first.
Ok, I see.
As for is_sorted and is_heap, these seem like useful additions to the algorithm_ext.hpp.
And horror of horrors, the random_sample is missing due to no good justifiable reason what-so-ever. This I will address.
waiting for them.
There are surely hidden reasons to don't include some of them.
<snip>
I definately like the idea of carefully adding partitioning that is compatible with Boost.Range. I need to revist posts and suggestions made by others on this list. IIRC Dr. Arnold Schodl had some interesting ideas, and partitioning is actively being worked on.
But a dynamic one would also be interesting. What do you think?
The only comment I can make at this point is that it requires some investigation. There are others that have investigate partitioning in much more detail and I it would be silly for me to comment before studying their contributions. It appears to be something that can be extended and made compatible with range very easily without affecting the core. Hopefully I can still find enough time to make this happen before release.
Could you give me some pointers?
I have already found the thread in this ML, but I'm not sure we address the same partition concept. I need to reread it.
I do too!
I have just a last question, is there any difference between using directly a Range Range range;
using a sub_range boost::sub_range<Range> sub_rng(boost::begin(range), boost::end(range));
and using a sliced adaptor range | sliced(0, boost:size(range)) ?
The idea is that the algorithm may be written without regard for what actual range is being used. There are reasons to have iterator_range, sub_range and special returns from the adaptors. I believe these are all documented, so I'm going to be cheeky and ask that you look in the documentation for sub_range.
Sorry, I was not enough explicit. My question was related to performances. So is there any performance difference between ...
Actually having re-read my paragraph I state something I don't really mean. The user of the algorithm must be able to call it for any range that models a valid Range concept. In very few cases when implementing an algorithm you can benefit from using the sub_range as a tool to help build the algorithm. There is no performance difference between iterator_range, and sub_range to the best of my knowledge and belief. I have not yet measured so this is slightly speculative.
BTW, if I need to make a function that gets the i-th partition of a range divided on k-parts
template <typename Range> ??? get_partition(Range& range, unsigned i, unsigned parts) { std::size_t size = boost::size(range); if (i==parts - 1) return range | sliced(boost::begin(range_)+i*(size/parts_), boost::end(range_)); else return range | sliced(boost::begin(range_)+i*(size/parts_), boost::begin(range_)+((i+1)*(size/parts_))-1); }
Which will be the type of get_partition? Should this be done in another way?
I can't see any reason why you should not be able to do it this way. What this example shows is that I was wrong to state that the return type of adaptors did not need to be known. I will add the return type information into the documentation. The answer in this particular case is: iterator_range<typename range_iterator<Range>::type >
Indeed it is a core design goal that the user may provide free functions to allow other containers to work with the range library seamlessly. This indeed is exemplified by the code that allows ATL and MFC 'collections' to operate as ranges without requiring modification. The relevant documentation is here index.html -> reference -> extending the library.
The use of anything that is a valid model of the appropriate Range Concept must work. It's a defect if it doesn't.
I'm not sure if my example is a counter-example. I'm trying to define a parallel_sort function taking a Range as parameter. I have started with
template <typename Range> void parallel_sort(Range& range, unsigned cutoff=10000) { pool_type pool( boost::tp::poolsize( 2) ); x_sort fct( pool, cutoff); fct.execute(range); }
class x_sort { private: pool_type & pool_; unsigned cutoff_;
template <typename Range> void seq_( Range& range) { sort_fct()(range); }
template <typename Range> void par_( Range& range) { unsigned size = boost::size(range); if ( size <= cutoff_) return seq_( range); else { boost::sub_range<Range> left(boost::begin(range), boost::begin(range)+(size/2)); boost::sub_range<Range> right(boost::begin(range)+(size/2)+1, boost::end(range)); task_type task(pool_.submit( boost::bind(& x_sort::par_<Range>, boost::ref( * this), boost::ref(left)))); this->par_(right); task.wait(); inplace_merge_fct()(range, boost::begin(range)+(size/2)); } } public: x_sort( pool_type & pool, unsigned cutoff) : pool_( pool), cutoff_( cutoff) {} template <typename Range> void execute( boost::sub_range<Range>& range) { par_( range); } };
I think it's a good example of where the differing range types are not as easy to implement an algorithm as we would wish. The finished algorithm should be able to work with anything that is a model of a Range Concept.
The problem is that the instantiation of par_<Range> requests the instantiation of par_<boost::sub_range<Range>>, and ..., which fall in an recursive instantiation that do not finish. Should the types boost::sub_range<Range> and boost::sub_range<boost::sub_range<Range> > be convertibles? I have taken a workaround, create a sub_range before calling to the internal function execute.
template <typename Range> void parallel_sort(Range& range, unsigned cutoff=10000) { pool_type pool( boost::tp::poolsize( 2) ); x_sort fct( pool, cutoff); boost::sub_range<Range> rng(range); fct.execute(range); }
The internal templates have a boost::sub_range<Range> parameter instead of a Range, so the single tempalte instantiation is par_<boost::sub_range<Range> >.
class x_sort { private: pool_type & pool_; unsigned cutoff_;
template <typename Range> void seq_( boost::sub_range<Range>& range) { sort_fct()(range); }
template <typename Range> void par_( boost::sub_range<Range>& range) { unsigned size = boost::size(range); if ( size <= cutoff_) return seq_( range); else { boost::sub_range<Range> left(boost::begin(range), boost::begin(range)+(size/2)); boost::sub_range<Range> right(boost::begin(range)+(size/2)+1, boost::end(range)); // fork a new sub-action t1 in pool task_type task(pool_.submit( boost::bind(& x_sort::par_<Range>, boost::ref( * this), boost::ref(left)))); this->par_(right); task.wait(); inplace_merge_fct()(range, boost::begin(range)+(size/2)); } } public: x_sort( pool_type & pool, unsigned cutoff) : pool_( pool), cutoff_( cutoff) {}
template <typename Range> void execute( boost::sub_range<Range>& range) { par_( range); } };
Is this the correct way to use Range and sub_range?
Yes, this is exactly what I intend. The public algorithm exposes an interface that can accept any Range Concept, and the internal implementation has to, occasionally jump through some hoops to handle sub_range etc. I haven't been able to think of a way to eliminate this need. Fortunately for most algorithm implementations it is not necessary, and I think the extra function call is not a huge deal. If anyone knows a technique to avoid the need to use sub_range that simplifies matters, I would gladly use it.
Best, Vicente
Thanks, Neil Groves
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost