
On Thu, Jul 3, 2008 at 9:36 AM, Phil Endecott <spam_from_boost_dev@chezphil.org> wrote:
I'll have a look at that. What do we think about "magic tuning parameters" in algorithms? My feeling is that things dependent on e.g. cache sizes (or number of processors etc) really need to be determined at run time to be useful to most people.
The values I picked are pretty good across most systems. If there is a standard, quick way to find the size of the processor cache, then I could consider using it. By far the most important parameter, MAX_SPLITS, depends on the page size: 9 corresponds to the number of bins (2^9) that take up 4096 bytes on a 32-bit system, which is a common page size. introsort looks like it has a tuning parameter of 16 in SGI's implementation. Most of my performance tests are with 40M elements over a 32-bit range, and my experience is that the speedup gradually increases between 10K and 10M elements, roughly as follows:
10K: no difference
100K: 10% faster
1M: 20% faster (not too far from your result)
10M: 30% faster (and 50+% speedup on some systems, or with some types of data and large datasets)
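For what it's worth, picking the split count from the page size at run time might look roughly like this. This is only a sketch, not the submitted code: the 8-byte bin layout and the POSIX sysconf call are assumptions.

#include <unistd.h>    // sysconf(_SC_PAGESIZE), POSIX
#include <cstddef>     // std::size_t

unsigned runtime_max_splits()
{
    long page = sysconf(_SC_PAGESIZE);               // commonly 4096
    if (page <= 0)
        page = 4096;                                 // fall back to the common case
    // Assume each bin holds a pointer plus a count (8 bytes on a 32-bit system).
    const long bin_bytes = static_cast<long>(sizeof(void*) + sizeof(std::size_t));
    unsigned splits = 0;
    while ((2L << splits) * bin_bytes <= page)       // 2^(splits+1) bins still fit in one page
        ++splits;
    return splits;                                   // 9 for 8-byte bins and a 4096-byte page
}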
Hmm. Actually my application is more typically sorting of the order of 1e4 to 1e5 elements, within a smaller range.
Then Spreadsort won't be very useful to you for that purpose, unless the range is small enough that bucketsort starts to become practical (range width on the order of n, or spiky data). Spreadsort is fast for small ranges, as is any decent bucketsort implementation, but it is designed for large n. If you have small n, std::sort is better, and between 1K and 1M elements the speedup isn't much; with 100M elements it starts to make a significant difference. That's the reason for the initial check: if the number of elements is less than some constant value, it falls back to std::sort.
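To illustrate the kind of check I mean (the names and the cutoff here are illustrative, not the actual source):

#include <algorithm>   // std::sort
#include <cstddef>     // std::ptrdiff_t
#include <vector>

template <class RandomAccessIter>
void integer_sort(RandomAccessIter first, RandomAccessIter last)
{
    const std::ptrdiff_t min_sort_size = 1000;       // illustrative cutoff, tunable
    if (last - first < min_sort_size) {
        // Too few elements for the bucketing overhead to pay off.
        std::sort(first, last);
    } else {
        // The real implementation would do the radix/bucket spread here;
        // std::sort stands in as a placeholder so this sketch stays self-contained.
        std::sort(first, last);
    }
}

int main()
{
    std::vector<int> v = {20000, 5, 300, 42, 7};
    integer_sort(v.begin(), v.end());                // small n: dispatches to std::sort
}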
I'm unsure how many real-world problems there are that sort tens or hundreds of millions of elements in main memory. 1e8 elements take ~a gigabyte if you have just one pointer associated with each. Sorting a dataset of this size from disk is a more common, but entirely different, problem. (But of course there are whole problem domains out there that I don't know anything about...)
For those of us who need fast sorts of gigabytes of data, it is very important. For those who don't, it doesn't make much difference, and they're probably not under the same performance constraints. With regard to on-disk sorting, I also developed a fast algorithm for that years ago (better than an order of magnitude improvement over conventional mergesort), but I'm not open-sourcing it at this time, so this isn't the place to discuss it. As an additional note, since it is radix-based, Spreadsort can readily run in parallel. Is that worth trying to put in Boost?

Steve