Dear community, I humbly offer you for perusal and review, LSD radix sort (and counting sort): https://github.com/jeremy-murphy/algorithm/tree/radix_sort The tests and docs are unfinished but the library code is as complete as I can manage without comment from others. The new files of interest are in include/boost/algorithm/integer_sort. I have more code with a speed test and CMake files in a separate repository: https://github.com/jeremy-murphy/integer-sort But perhaps I should move all those files into a branch off my Boost.Algorithm fork? Probably. Anyway, thanks in advance for any feedback and comments. Cheers. Jeremy
I should mention that this code was intentionally written to the '98 standard under some assumptions that are probably false and I would be most happy to update it to '11 if it is agreed to be desirable.
I have more code with a speed test and CMake files in a separate repository:
I think it'd be nice to see the results of these speed tests, compared to std::sort etc. Philippe
On 30 March 2014 23:27, Philippe Vaucher
I think it'd be nice to see the results of these speed tests, compared to std::sort etc.
Philippe
Good idea. Here are the speed test results from my x86_64 laptop with gcc. Let me explain them a bit. There are two sets of data based on the distribution of the random numbers generated. The first is a poisson distribution, which was really just an easy way to get a small range (max - min) on the input values. The second is a uniform distribution, which is also the worst-case scenario. Then each data distribution is tested four times: unsigned char (h), short (t), int (j) and long (m). The first column is the log base 2 of the input size, n. Second column is time (t) divided by n, so time per element. Third column is something I am still working on, sorry, just ignore it for now. Fourth and fifth columns are how many times (x) faster radix sort is compared to std::sort and std::stable_sort. So you'll see, LSD radix sort performs pretty well except for on unsigned long with a large range of values. Oh and the blank rows are because I haven't got around to measuring n repetitions of a sort, so the blanks are because one execution of the sort is not measurable. Certainly there are improvements to be made but this is a crude and effective metric to begin with. I'm very curious to compare results with different architectures. Cheers. Jeremy Apologies if the formatting is a mess, it *should* look OK in a fixed-width font. *** poisson_distribution, mean = std::numeric_limits<T>::max() / 2 === Test (seed = 1,396,184,394, T = h). === log2(n) t/n (ns) c x sort x stable_sort ------------------------------------------------------------- 16 17 18 19 20 21 22 2.38 0.09 15.0 25.0 23 2.38 0.09 15.0 26.0 24 2.38 0.08 15.8 28.5 25 2.98 0.08 12.7 22.1 26 2.83 0.08 13.6 23.9 27 2.76 0.07 14.4 25.3 28 2.76 0.07 14.6 26.3 === Test (seed = 1,396,184,394, T = t). === log2(n) t/n (ns) c x sort x stable_sort ------------------------------------------------------------- 16 17 18 19 20 9.54 0.45 5.0 7.0 21 4.77 0.19 10.0 15.0 22 14.31 0.64 3.2 5.2 23 10.73 0.43 4.3 7.0 24 12.52 0.50 3.9 6.2 25 13.41 0.52 3.6 6.0 26 16.54 0.62 3.0 4.9 27 20.49 0.74 2.5 4.1 28 23.06 0.82 2.2 3.7 === Test (seed = 1,396,184,394, T = j). === log2(n) t/n (ns) c x sort x stable_sort ------------------------------------------------------------- 16 17 18 19 19.07 1.00 4.0 4.0 20 28.61 1.40 2.7 2.7 21 23.84 1.10 3.0 3.2 22 28.61 1.27 2.5 2.9 23 32.19 1.39 2.2 2.7 24 32.19 1.33 2.2 2.7 25 36.66 1.44 1.9 2.5 26 43.36 1.65 1.7 2.1 === Test (seed = 1,396,184,394, T = m). === log2(n) t/n (ns) c x sort x stable_sort ------------------------------------------------------------- 16 17 76.29 4.47 0.0 1.0 18 38.15 2.11 2.0 2.0 19 57.22 3.00 1.0 1.7 20 57.22 2.85 1.3 1.5 21 57.22 2.71 1.4 1.6 22 61.99 2.77 1.3 1.8 23 61.99 2.65 1.4 1.6 24 64.37 2.67 1.4 1.6 25 66.46 2.64 1.4 1.6 26 67.50 2.58 1.4 1.7 *** uniform_int_distribution, min = 0, k = std::numeric_limits<T>::max() === Test (seed = 1,396,184,394, T = h). === log2(n) t/n (ns) c x sort x stable_sort ------------------------------------------------------------- 16 17 18 19 19.07 1.00 2.0 3.0 20 21 4.77 0.19 9.0 12.0 22 4.77 0.18 8.5 13.5 23 3.58 0.13 11.7 18.0 24 6.56 0.25 6.6 10.0 25 7.75 0.28 5.7 9.0 26 8.05 0.31 5.6 8.9 27 11.77 0.41 3.9 6.2 28 15.87 0.54 3.0 4.7 === Test (seed = 1,396,184,394, T = t). === log2(n) t/n (ns) c x sort x stable_sort ------------------------------------------------------------- 16 17 18 19 20 9.54 0.45 7.0 8.0 21 9.54 0.43 7.0 9.0 22 11.92 0.50 5.6 7.2 23 15.50 0.65 4.2 5.8 24 17.29 0.71 3.9 5.3 25 19.67 0.76 3.4 5.0 26 24.74 0.92 2.7 3.9 27 31.66 1.15 2.2 3.1 28 36.47 1.29 1.9 2.8 === Test (seed = 1,396,184,394, T = j). === log2(n) t/n (ns) c x sort x stable_sort ------------------------------------------------------------- 16 17 76.29 4.47 0.0 1.0 18 19 19.07 1.00 3.0 4.0 20 19.07 0.95 3.5 4.0 21 33.38 1.57 2.3 2.6 22 35.76 1.59 2.2 2.4 23 38.15 1.65 2.2 2.4 24 41.13 1.71 2.1 2.4 25 53.35 2.12 1.7 1.9 26 62.73 2.38 1.5 1.6 === Test (seed = 1,396,184,394, T = m). === log2(n) t/n (ns) c x sort x stable_sort ------------------------------------------------------------- 16 17 76.29 4.47 1.0 0.0 18 38.15 2.11 2.0 2.0 19 76.29 4.00 1.0 1.0 20 57.22 2.85 1.3 1.7 21 71.53 3.38 1.1 1.3 22 81.06 3.68 1.0 1.2 23 85.83 3.70 1.0 1.2 24 103.71 4.29 0.9 1.0 25 122.49 4.88 0.8 0.9 26 131.28 5.04 0.7 0.9
Couple of quick things: I only meant "review" informally, in case anyone thought I was being presumptuous. :) Also, does anyone want to give me ssh access to a non-x86 architecture machine so that I can run some speed tests? :) Thanks, cheers. Jeremy
OK, I've pushed the speed test into a "speed_test" branch off of radix_sort, so don't worry about the other repo, everything is in the Boost.Algorithm fork.
There is a pull request open for comment in case anyone is interested.
https://github.com/boostorg/algorithm/pull/2
On 1 April 2014 14:25, Jeremy Murphy
OK, I've pushed the speed test into a "speed_test" branch off of radix_sort, so don't worry about the other repo, everything is in the Boost.Algorithm fork.
participants (2)
-
Jeremy Murphy
-
Philippe Vaucher