
Hi Phil, I haven't gotten around to making changes yet (weekends are when I have time), but I still have some feedback. - Use new and delete rather than *alloc and free.
- Can any exceptions occur? If so, is the dynamically allocated memory leaked? Could you use a smart pointer to avoid this?
Using *alloc, they should return NULL if allocation failed, and that situation is handled. With new and delete I'll have to catch memory exceptions (and write a constructor for the Bin structure) and probably use smart pointers. I could also convert to only using a fixed-size bin array, and putting this on the stack, as if you look at what the default constants do, the bin array ends up being of size MAX_SPLITS. If it doesn't impact performance, I'll be happy to do it the C++ way. Otherwise I'll have to investigate what's going on in detail.
- Does it work with uint64_t? I see some variables declared as 'long', which often can't store a uint64_t.
There exists the problem of how to identify and deal with negatives. I try to avoid forcing any particular return data type, but in the case of divMin and divMax, they are used in performance-critical loops, so they should be used as constants without any extra computation. They are already divided by 512, so an unsigned should fit inside a signed data type without trouble. Should I use an int64_t? - If std::vector<int>::iterator really is slower than int* (and I suggest
some investigation of the assembler to see why this is), then you could add a specialisation that converts vector::iterators into pointers.
Good point. I can look at that, though it could be due to my ancient system. I see the same performance problem with std::sort, and for a vector that isn't vector<bool> (why would anybody sort a vector<bool>?) I should be able to convert to a pointer. I haven't read assembler since college, and I don't even remember how to obtain the assembler for C++ source. I do also have a Windows development environment if finding assembler on that is easier.
- Is there a use-case for a pure radix sort, without the std::sort fallback? i.e. are there types that can't easily provide a comparison function, or distributions where the std::sort fallback is never useful?
A stable, external LSB Radix Sort could be templated by overriding the & operator, and passing in the key size in bits as an argument. This would be particularly useful to those who need a stable sort, and for those with small keys (32 bits or less) and memory to spare, it should be quite fast for large n, running in O(n(keysize/bincount)) operations, and without recursion. The ideal bincount would be close to Spreadsort's MAX_SPLITS, probably between 9 and 11. It would probably be best to have a radix_sort and a float_radix_sort. I make no claim to be an expert on LSB Radix sort, and would welcome someone else writing it. If I did write it, I'd probably take some fast open-source implementation and just template it. Steve