Re: [boost] [review] [sort] Sort library review manager results

28 Nov 2014

      ...
On 28/11/14 13:55, Steven Ross wrote:
...
Here are some potential standard
algorithms that it might be nice to add in the future:
LSD radix sort, for fixed-length data where stability matters
Timsort, a stable comparison sort that is fast for mostly-sorted data and
is used by standard Java implementations.
A sorting network based comparison sort for small and fixed-size
datasets.
I've found these to be about twice as fast as Insertionsort.
k-way merge parallel sort
AFAIK, in the parallel world, merge sort and radix sort are the only two
algorithms used.
Without the special parallel optimizations though, they tend to
underperform compared to smarter algorithms.
Both merge sorting and radix or bucket sorting are used in parallel
sorting problems for dividing up and remerging the problem.  k-way merge
uses RAM and CPU efficiently by only requiring one final pass to merge
however many pieces (which makes it strongly preferred when parallel
sorting across multiple systems or on disk), where conventional mergesort
is usually slightly faster for in-memory parallel sorting, at the expense
of more CPU due to the extra passes.  In-memory, quicksort-based splitting,
or using an initial iteration of Spreadsort to break up the problem into
On Fri Nov 28 2014 at 12:45:37 PM Mathias Gaunard <
mathias.gaunard@ens-lyon.org> wrote:

pieces may be more efficient than plain mergesort.

It's also worth noting that quite different algorithms are usually employed
for the splitting up and remerging of the parallel problem as compared to
the sorting of each individual chunk.  I've seen plenty of parallel
implementations that use std::sort for sorting the individual chunks
assigned to a single thread, for example.

I'm not about to jump into a boost parallel sort implementation.  If
somebody writes a cross-platform implementation they'd like to share that
is speed-competitive and reasonably efficient, I'd be happy to take a look
at it.

What special parallel optimizations are you referring to?  SIMD operations?