
Phil wrote:
Having said all that, personally I have little interest in an algorithm
that has no benefit for fewer than millions of items. What do other people think about that aspect?
I frequently sort gigabytes, tens of gigabytes, or even hundreds of gigabytes of geometry data as a prerequisite to running a scanline over the data. However, a faster sort would benefit me only if the speedup were reliably 2X or more. A 30% speedup in sort would, through Amdahl's Law, shrink to something under 10% overall (if sorting is, say, a third of the total runtime, a 1.3X sort speedup gives only about a 1.08X speedup for the whole run). I get a better speedup than that just by adopting each new version of the compiler. For me, the modest size and tenuous nature of the speedup (it depends upon architecture-specific constants) do not justify any effort to integrate it and carry a dependency on the library that provides it. I have been following the thread only out of academic interest.

I would end up using the algorithm only if it were integrated into std::sort and applied in a transparent manner where appropriate. Bringing the algorithm to Boost might be a good first step in that direction. Template metaprogramming techniques could be used to detect at compile time whether a data type fits the model for spreadsort. From that standpoint I'd like to see a wrapper around spreadsort/std::sort that selects the appropriate algorithm for arbitrary T and allows the user to be completely abstracted away from even knowing about the existence of spreadsort, roughly along the lines of the sketch below.

The next step from there would be to try to push it into the standard itself. I wouldn't expect that to ever happen, though, and I am not too optimistic about how worthwhile it would be to everyone if it did. The speedup is just not consistently large enough to be worth caring about in the context of an application that does more than just sorting data.
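Something like the following is what I have in mind. It is only a rough sketch: is_spreadsortable, smart_sort, and the spreadsort entry point are placeholder names I am making up for illustration, not the actual interface of the proposed library, and the trait is approximated with std::is_integral just to keep the example self-contained.

#include <algorithm>
#include <iterator>
#include <type_traits>
#include <vector>

// Placeholder trait: the real library would have to define which value
// types fit the spreadsort model (fixed-size integers, floats, strings,
// ...).  For this sketch I approximate "fits the model" with is_integral.
template <class T>
struct is_spreadsortable : std::is_integral<T> {};

// Hypothetical stand-in for the real spreadsort entry point; the actual
// hybrid radix sort would go here.
template <class RandomIt>
void spreadsort(RandomIt first, RandomIt last) {
    std::sort(first, last);
}

// Overloads selected by tag dispatch at compile time.
template <class RandomIt>
void sort_dispatch(RandomIt first, RandomIt last, std::true_type) {
    spreadsort(first, last);   // value type fits the model
}

template <class RandomIt>
void sort_dispatch(RandomIt first, RandomIt last, std::false_type) {
    std::sort(first, last);    // everything else: generic comparison sort
}

// User-facing wrapper: callers never need to know spreadsort exists.
template <class RandomIt>
void smart_sort(RandomIt first, RandomIt last) {
    typedef typename std::iterator_traits<RandomIt>::value_type value_type;
    sort_dispatch(first, last,
                  typename is_spreadsortable<value_type>::type());
}

int main() {
    std::vector<int>    a(1000, 42);    // dispatches to the spreadsort path
    std::vector<double> b(1000, 3.14);  // dispatches to std::sort
    smart_sort(a.begin(), a.end());
    smart_sort(b.begin(), b.end());
}

The dispatch happens entirely at compile time, so a type that does not fit the model never touches the spreadsort path, and callers just keep writing one sort call regardless of T.

Luke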