
Edouard A. wrote
I just finished the first working version of parallel_sort.
Current benchmarks on my Q6600 vs2008-64-bit default allocator:
2 threads : 160 % faster
4 threads : 260 % faster
So you're getting a super-linear speedup going from 1 to 2 processors? That seems odd. What size input does the benchmark relate to? I'm curious to know whether you still get a worthwhile speedup for relatively small inputs.
It can be heavily customized. I already offer the possibility to choose between quick sorting and merge sorting, and for quick sorting I offer two pivot algorithms : fast or secure. Then you can of course specify a predicate and a fallback sorting algorithm for when you run out of threads.
How about run-time selection between quicksort and mergesort, in the style of introsort? This seems hard to beat in uniprocessor applications.
Nevertheless, I think it's time to open up the code and start getting feedback.
Yes please. You haven't said anything about how it actually works... Phil.