On 28/11/14 13:55, Steven Ross wrote:
Here are some potential standard algorithms that it might be nice to add in the future: LSD radix sort, for fixed-length data where stability matters Timsort, a stable comparison sort that is fast for mostly-sorted data and is used by standard Java implementations. A sorting network based comparison sort for small and fixed-size datasets. I've found these to be about twice as fast as Insertionsort. k-way merge parallel sort
AFAIK, in the parallel world, merge sort and radix sort are the only two algorithms used. Without the special parallel optimizations though, they tend to underperform compared to smarter algorithms.
Both merge sorting and radix or bucket sorting are used in parallel sorting problems for dividing up and remerging the problem. k-way merge uses RAM and CPU efficiently by only requiring one final pass to merge however many pieces (which makes it strongly preferred when parallel sorting across multiple systems or on disk), where conventional mergesort is usually slightly faster for in-memory parallel sorting, at the expense of more CPU due to the extra passes. In-memory, quicksort-based splitting, or using an initial iteration of Spreadsort to break up the problem into
On Fri Nov 28 2014 at 12:45:37 PM Mathias Gaunard < mathias.gaunard@ens-lyon.org> wrote: pieces may be more efficient than plain mergesort. It's also worth noting that quite different algorithms are usually employed for the splitting up and remerging of the parallel problem as compared to the sorting of each individual chunk. I've seen plenty of parallel implementations that use std::sort for sorting the individual chunks assigned to a single thread, for example. I'm not about to jump into a boost parallel sort implementation. If somebody writes a cross-platform implementation they'd like to share that is speed-competitive and reasonably efficient, I'd be happy to take a look at it. What special parallel optimizations are you referring to? SIMD operations?