
Edouard A wrote:
For example, for 4 threads you have :
parallel_sort<4> -> parallel_sort<2> -> parallel_sort<1> -> parallel_sort<1> -> parallel_sort<2> -> parallel_sort<1> -> parallel_sort<1>
This code is generated at compile time.
The one-thread version uses the fallback sorter, which is by default std::sort.
The default > 1 thread version merges the results of the underlying calls (there is a version with quicksort).
So basically it's something like this:

    thread t( sort(begin,mid) );
    sort(mid,end);
    t.join();
    merge(begin,mid,end);  // [*]

The trouble with this is that one of the threads will terminate before the other, e.g. because its data was easier to sort or was in more-local memory, and it will then do nothing until the other thread (and all its sub-threads) has also finished. It would be better to keep all threads occupied for as much of the time as possible, e.g. by dividing the problem up into chunks and giving the chunks to threads as and when they are ready to handle them.

You mentioned the Intel implementation before. Have you benchmarked it in comparison with your algorithm?

I guess that you may have benchmarked with random input data. This will tend to favour your method, as the input is uniformly difficult to sort. You should also benchmark with input that is less homogeneous, e.g. data that's mostly sorted but with local permutations. Is anyone aware of synthetic or real benchmark data sets for sorting?

[*] Of course, a fundamental issue for this algorithm is that even if the two sub-sorts are equally time-consuming, the final merge is still a sequential operation. But I just wondered whether it would be possible to have one thread start doing

    in_place_merge(begin,mid,end);

while the other thread starts at the other end, going backwards:

    in_place_merge(rbegin,reverse_iterator(mid),rend);

Of course they will crash into each other in the middle. But maybe it's possible to do this in some sort of safe lock-free manner?

Phil.
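P.S. For concreteness, here is a minimal sketch of the thread/sort/join/merge scheme discussed above, using C++11 threads. The name parallel_sort2 is made up for illustration (it is not the library's interface), and std::inplace_merge stands in for the merge step:

```cpp
#include <algorithm>
#include <thread>
#include <vector>

// Minimal two-thread version of the split-and-merge scheme:
// sort each half concurrently, then merge sequentially.
// parallel_sort2 is a hypothetical name for illustration only.
template <typename Iter>
void parallel_sort2(Iter begin, Iter end) {
    Iter mid = begin + (end - begin) / 2;
    std::thread t([=] { std::sort(begin, mid); });  // left half on a new thread
    std::sort(mid, end);                            // right half on this thread
    t.join();  // [*] this thread idles here if it finished its half first
    std::inplace_merge(begin, mid, end);            // sequential merge step
}
```

The join() line is exactly where the load-imbalance problem shows up: whichever thread finishes first waits.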
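The chunking suggestion could look something like the sketch below: threads pull fixed-size chunks from a shared atomic counter as they become free, so a fast thread keeps working instead of idling. chunked_sort is a hypothetical name, and only the sorting phase is load-balanced here; the chunks are still merged sequentially afterwards:

```cpp
#include <algorithm>
#include <atomic>
#include <thread>
#include <vector>

// Threads grab the next unsorted chunk from an atomic counter until
// none remain, then the sorted chunks are merged pairwise. Assumes
// chunk > 0. Illustrative only; a real version would parallelise the
// merge phase too.
void chunked_sort(std::vector<int>& v, std::size_t chunk, unsigned nthreads) {
    std::atomic<std::size_t> next(0);
    auto worker = [&] {
        for (;;) {
            std::size_t i = next.fetch_add(1) * chunk;
            if (i >= v.size()) return;  // no chunks left for this thread
            std::sort(v.begin() + i,
                      v.begin() + std::min(i + chunk, v.size()));
        }
    };
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < nthreads; ++t) pool.emplace_back(worker);
    for (auto& t : pool) t.join();
    // Fold the sorted chunks together (still a sequential bottleneck).
    for (std::size_t i = chunk; i < v.size(); i += chunk)
        std::inplace_merge(v.begin(), v.begin() + i,
                           v.begin() + std::min(i + chunk, v.size()));
}
```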
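For the "mostly sorted but with local permutations" benchmark input, one simple generator is to start from sorted data and shuffle small windows; the window size and seed below are arbitrary choices for illustration:

```cpp
#include <algorithm>
#include <random>
#include <vector>

// Builds 0..n-1 in order, then shuffles each fixed-size window, so the
// result is globally sorted but locally permuted.
std::vector<int> locally_permuted(std::size_t n, std::size_t window,
                                  unsigned seed) {
    std::vector<int> v(n);
    for (std::size_t i = 0; i < n; ++i) v[i] = static_cast<int>(i);
    std::mt19937 rng(seed);
    for (std::size_t i = 0; i + window <= n; i += window)
        std::shuffle(v.begin() + i, v.begin() + i + window, rng);
    return v;
}
```

An adaptive sequential sort can exploit such input, so it stresses a parallel sorter quite differently from uniform random data.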
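On the two-ended merge: one lock-free way to stop the threads crashing into each other is to give each side a fixed share of the output, so the forward pass emits exactly the first n/2 results and the backward pass the rest. With complementary tie-breaking (forward prefers the left run, backward prefers the right), the two passes consume disjoint prefixes/suffixes of the inputs. The sketch below merges into a separate buffer rather than in place, so it is weaker than the in_place_merge version asked about, but it shows the idea:

```cpp
#include <algorithm>
#include <thread>
#include <vector>

// Forward thread writes out[0 .. n/2), backward pass writes
// out[n/2 .. n); each stops after its quota, so no locking is needed.
// Forward takes from `a` on ties, backward takes from `b` on ties:
// the backward pass is then the exact reversal of the stable forward
// merge, so the consumed ranges never overlap.
void two_ended_merge(const std::vector<int>& a, const std::vector<int>& b,
                     std::vector<int>& out) {
    const std::size_t n = a.size() + b.size();
    out.resize(n);
    const std::size_t front = n / 2;  // forward thread's share

    std::thread fwd([&] {             // emit the `front` smallest elements
        std::size_t i = 0, j = 0;
        for (std::size_t k = 0; k < front; ++k)
            out[k] = (j == b.size() || (i < a.size() && a[i] <= b[j]))
                         ? a[i++] : b[j++];
    });
    // Backward pass on this thread: emit the n - front largest elements.
    std::size_t i = a.size(), j = b.size();
    for (std::size_t k = n; k > front; --k)
        out[k - 1] = (i == 0 || (j > 0 && b[j - 1] >= a[i - 1]))
                         ? b[--j] : a[--i];
    fwd.join();
}
```

Whether the same trick can be made safe for a genuinely in-place merge, where both sides mutate the same range, is exactly the open question above.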