Re: [boost] Accelerating algorithms with SIMD - Segmented iterators and alternatives

13 Oct 2010

      On 13/10/10 11:44, Jeffrey Lee Hellrung, Jr. wrote:
...
On 10/13/2010 02:52 AM, Mathias Gaunard wrote:
...
On 12/10/10 09:06, David Abrahams wrote:
...
Nice for very linear things. Not so great for sort, binary_search,
reverse, rotate, ...
[...]
In any case, linear operations or filters represent in my opinion a
large enough spectrum of what people do with iterable ranges of data
that it could be worthwhile to have concepts that allow them to be
efficient, even if it cannot apply to other cases.
I could see this.
[...]
...
...
...
Of course, there is the drawback that it makes the code using the
iteration primitive somewhat more complicated: but so do segmented
iterators. There are also several other drawbacks, in particular
tied to the fact that it cannot be conceptually much more than a
forward iterator.
...nor even implement several of the algorithms that work with forward
iterators.
Do you have some examples?
find_if? adjacent_find?
Those return an iterator, so that makes it complicated to see what they 
would do.

Assuming they would return a pointer to the value, you could do 
(ignoring const issues):

template<typename Range, typename P>
struct find_if_f
{
   find_if_f(P p_) : p(p_) {}

   bool operator()(typename range_value<Range>::type& e)
   {
     if(p(e))
     {
       ptr = &e;
       return false;
     }
     return true;
   }

   P p;
   typename range_value<Range>::type* ptr;
};

template<typename Range, typename P>
typename range_value<Range>::type* find_if(Range& range, P p)
{
   find_if_f<Range, P> f(p);
   if(for_each_until(range, f))// for_each but stops when f returns false
     return f.ptr;
   return 0;
}
...
adjacent_difference?
You need to remember the previous element, but you don't have to copy to 
do that, you could just use a reference or pointer to it.
I would think a sufficiently good optimizer should take care of inlining 
that.
...
parallel_for_each?
Good point.
This requires the ability to subdivise the range, which would be 
feasible but inefficient, in particular due to the lack of random access.

  I
...
suppose (some of) these could be implemented using a for_each-style of
iteration, but would require caching (of the previously visited element,
in the case of adjacent_*) and exceptions (in the case of
find_if/adjacent_find)...which would seem to negate the performance
motivation. Or maybe I'm being too dense...
I'm not certain those really hamper performance (assuming the use of 
another mechanism instead of exceptions to stop the loop).
...
One might describe this as expressing iteration in terms of the visitor
pattern...? I honestly didn't understand what you meant between "push"
and "pull" before this snippet.
This referenced the initial conversation I had with Dave in another 
thread. It's not very practical terminology.
...
[...]
...
...
Although what you're describing is easier to make efficient for many
(perhaps even the majority of cases), it doesn't sound like it
actually has the same level of generality as segmented iterators do.
Segmented iterators still allow bidirectional support, indeed (I think?).
But I'm note sure it can do random access as well, if you consider
recursively segmented local iterators.
Seems to me segmented iterators and a visitation-based/push-based
iteration address fundamentally different problems. At least, I don't
see how segmented iterators help here. The problem that visitation-based
iteration seems to try to solve is improving iteration efficiency over
join(t)_range-like ranges...
Since those could be different segments, you wouldn't need to check 
which segment you are in at each iteration.
Actually, all local iterators for a a segmented iterator must be the 
same for all segments, but since a local iterator can be a segmented 
iterator itself, it can also have a local iterator which can be 
different. By recursion, you can therefore use different iterator types 
for each segment, but that's somewhat convoluted.
...
...
I believe writing a zip_iterator (or similar) would be no easier for
segmented iterators than for generators due to the arbitrary recursion.
What do you mean by generators in this context?
That's what I called my alternative to segmented iterators in the other 
thread, because I made them use an output iterator instead of a function 
object.

Any function that reads a range and passes each element to a template 
user-supplied object would fit the idea.
I call it push because the data is being pushed onto the user-supplied 
object.