Just to be clear, Steven's code is not actually radix sort, it's bucket
sort, right? Which is great, but let's not get confused about what's what.
It is MSD radix sort, which is recursive bucket sort, combined with switching to std::sort once the problem is cut up small enough that std::sort has better worst-case performance. Conventional MSD radix sorts use insertion sort for small problems instead, and switch at much smaller data sizes, which can be problematic for performance with data that is unevenly distributed (conventional MSD radix sorts take about 4X as long as std::sort in this case).
Steven, I gave it a quick test but it failed when trying to include
. Looks like Robert Ramey changed the names of things around 1.37.0. (I tested with 1.53.0.) Changing it to fixed it up. Looks like <vector>, <cstring> and <limits> are unnecessary includes in some of the files?
Thanks, I'll clean those up.
Otherwise, it worked, and was much faster than std::sort(), so well done. :)
I'm not real sure about the interface, though: do they have to be different named functions (integer_, float_, string_)? Can't it be called bucket_sort() and specialized for different types? And reverse sorting, isn't that done by passing reverse iterators?
Reverse sorting could be done by passing reverse iterators, though that eliminates the possibility of using raw pointers, which are slightly faster with some of the compilers I tested. I'd be happy to remove the reverse calls if there is general agreement that the performance difference isn't a
Thanks. problem. With regards to different types, this covers three different cases: sorting by an integer or integer-like key of finite length, sorting a float correctly by casting it to an integer (tricky but fast), and sorting strings of variable length. Specializing by type isn't ideal as people often want to sort larger data types by a key, and the key could itself be a more complex data type that isn't an integer, float, or string, but supports the necessary operators.
Out of curiosity, I decided to tackle the problem and wrote an implementation of counting sort, which as you may know, is one way of implementing radix sort. The code is not ready for a formal review but I would really appreciate feedback on it, as it is written with a view to inclusion in Boost.
An LSD radix sort could make sense as an additional option, and counting sort would make sense as a part of that. LSD is substantially different from MSD radix sort, so it seems like a separate project to me. LSD radix sort is often slower than std::sort on real problems because it can't take advantage of easy problems. It also uses more memory. and doesn't make sense on variable-length data types. All that said, it is useful for some types of problems, such as when stability is a requirement.