
Steven Ross wrote:
Would there be interest in an integer_sort call that is faster than std::sort? Spreadsort is an in-place hybrid radix-based and comparison-based algorithm with better worst-case performance and roughly 20-50% better average-case performance than introsort on random integer data. The worst-case advantage is algorithmic: Spreadsort takes the better of O(n*square_root(bit_sizeof(data) - log(n))) and O(n*log(n)) time. For 32-bit integers, that's O(n*square_root(32 - log(n))), so for a log(n) of 16 (65536 items), it's 4n (plus a small constant times n), vs. 16n for introsort. For larger n, the advantage over pure comparison-based algorithms increases. In effect, Spreadsort smoothly transitions from a pure MSB radix sort to std::sort on data cut up by a couple of iterations of MSB radix sort (reducing the size of the log(n) factor), depending upon the distribution and size of the data.
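To illustrate that transition, here is a rough sketch (not the actual Spreadsort implementation; real Spreadsort is in-place, recurses, and picks the bucket count adaptively, and the name one_pass_hybrid_sort is made up): one fixed 8-bit MSB pass into buckets, then a comparison sort within each bucket.

    #include <algorithm>
    #include <cstddef>
    #include <vector>
    #include <boost/cstdint.hpp>

    // One most-significant-bits pass, then std::sort within each bucket.
    // The single radix pass cuts the comparison stage down to sorting
    // many small buckets, which is where the log(n) factor shrinks.
    void one_pass_hybrid_sort(std::vector<boost::uint32_t>& data)
    {
        const unsigned top_bits = 8;                        // 256 buckets
        std::vector<std::vector<boost::uint32_t> > buckets(1u << top_bits);
        for (std::size_t i = 0; i < data.size(); ++i)
            buckets[data[i] >> (32 - top_bits)].push_back(data[i]);
        std::size_t out = 0;
        for (std::size_t b = 0; b < buckets.size(); ++b) {
            // buckets are already ordered relative to one another
            std::sort(buckets[b].begin(), buckets[b].end());
            for (std::size_t j = 0; j < buckets[b].size(); ++j)
                data[out++] = buckets[b][j];
        }
    }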
I could write a templated version of Spreadsort for the Boost library if there is interest. It would work for anything that supports the << and - operators.
Yes, this would interest me. I have a fixed-point class that it could probably be used with. I have a feeling that you need a little more than just << and -, though, don't you? Hmm, what would a (pure) radix sort need? Presumably, since yours is a hybrid method, you need everything that a comparison sort needs plus everything that a radix sort needs. Maybe we should start by implementing a (sketch of a) radix sort template.

Johan mentioned sorting a vector of [pointers to] structs by some member of that struct; I also have some wrappers around std::sort (e.g. below) that do this, and would be interested to see how you would implement it.

    #include <algorithm>
    #include <iterator>

    template <typename obj_t, typename data_member_ptr_t>
    struct compare_data_member {
        data_member_ptr_t data_member_ptr;
        compare_data_member(data_member_ptr_t data_member_ptr_)
            : data_member_ptr(data_member_ptr_) {}
        bool operator()(const obj_t& a, const obj_t& b) const {
            return a.*data_member_ptr < b.*data_member_ptr;
        }
    };

    template <typename iter_t, typename data_member_ptr_t>
    void sort_by(iter_t begin, iter_t end, data_member_ptr_t data_member_ptr)
    {
        // iterator_traits, rather than iter_t::value_type, so that this
        // also works when iter_t is a plain pointer
        typedef typename std::iterator_traits<iter_t>::value_type obj_t;
        std::sort(begin, end,
                  compare_data_member<obj_t, data_member_ptr_t>(data_member_ptr));
    }
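For clarity, here is how that wrapper gets used (the employee type is just a made-up example):

    #include <vector>

    struct employee {          // hypothetical record type
        int    id;
        double salary;
    };

    void sort_by_salary(std::vector<employee>& staff)
    {
        // sorts the whole records, comparing only the salary member
        sort_by(staff.begin(), staff.end(), &employee::salary);
    }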
It could also handle standard floats sorted as integers.
What exactly do you mean by this? As I understand it, if you treat an IEEE float as an integer for sorting it will almost work, but the negative range will be backwards; you need to do some sign-magnitude vs. two's-complement fixup to allow for it. But I bet standard C++ doesn't guarantee anything along those lines.
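Something like this fixup, I mean (a sketch assuming 32-bit IEEE 754 floats and a matching integer representation, which, as I say, standard C++ doesn't promise; float_to_key is my name for it):

    #include <cstring>
    #include <boost/cstdint.hpp>

    // Map a float's bit pattern to an unsigned integer whose unsigned
    // order matches the float's numeric order.
    boost::uint32_t float_to_key(float f)
    {
        boost::uint32_t bits;
        std::memcpy(&bits, &f, sizeof bits);   // avoid aliasing trouble
        if (bits & 0x80000000u)
            return ~bits;                // negative: reverse the whole range
        else
            return bits | 0x80000000u;   // non-negative: above all negatives
    }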
Functors supporting << and - could make it more general. Multi-dimensional sorting is also sped up by Spreadsort. I could also write a similar string_sort that should also be significantly faster than std::sort. This would look much like postal sort, just using introsort after a certain number of iterations. Is this of interest?
Yes, that too. I'm particularly interested in algorithmically-efficient sorting of strings that frequently have common prefixes, e.g. filenames, URLs, etc.; I think that would be a good case for this algorithm, wouldn't it?
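To make the postal-sort idea concrete, here is a hypothetical sketch (msd_string_sort and the threshold of 32 are my inventions, not Steven's design): bucket by the character at the current depth, recurse, and hand small buckets to std::sort. A shared prefix is consumed one character per level rather than being re-compared on every comparison.

    #include <algorithm>
    #include <cstddef>
    #include <string>
    #include <vector>

    void msd_string_sort(std::vector<std::string>& v, std::size_t depth = 0)
    {
        if (v.size() < 32) {                   // arbitrary fallback threshold
            std::sort(v.begin(), v.end());
            return;
        }
        // strings exhausted at this depth sort before everything else
        std::vector<std::string> done;
        std::vector<std::vector<std::string> > buckets(256);
        for (std::size_t i = 0; i < v.size(); ++i) {
            if (v[i].size() <= depth)
                done.push_back(v[i]);
            else
                buckets[(unsigned char)v[i][depth]].push_back(v[i]);
        }
        std::size_t out = 0;
        for (std::size_t i = 0; i < done.size(); ++i)
            v[out++] = done[i];
        for (std::size_t b = 0; b < buckets.size(); ++b) {
            msd_string_sort(buckets[b], depth + 1);   // recurse per bucket
            for (std::size_t i = 0; i < buckets[b].size(); ++i)
                v[out++] = buckets[b][i];
        }
    }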
I've recently been re-coding something where a std::map<> was using more memory than I wanted. I decided to instead put the data in a std::vector, sort it, and then search using the std::lower_bound/upper_bound algorithms. This works well, but I did notice std::sort using a lot of time in the profile, so an improved sort would be appreciated.

This is off-topic, but people might be interested: there are a couple of remaining problems with this sorted-vector implementation. Firstly, the access pattern when you search is horrible; it starts by reading one element in the middle, then one at the 1/4 or 3/4 point, and so on; probably each of those is a page or cache line that will never be accessed otherwise. This can be fixed by bit-reversed addressing (well, at least in the case where the vector size is a power of two, I think), and I'm considering a container adaptor that bit-reverses the index. It makes normal iteration slower, of course.

The other problem is that if my data is a key-value pair, I waste a lot of cache space on the value parts of the elements whose keys I only access (i.e. the nodes near the root of the tree). I could fix this by replacing the vector<pair<key,value>> with a pair<vector<key>,vector<value>>, but can I somehow get the memory layout of the latter with the syntax of the former?

Regards, Phil
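P.S. For the record, here is the layout half of that last question as a hypothetical sketch (soa_map is a made-up name; getting the pair<...> syntax on top would still need some kind of proxy container):

    #include <algorithm>
    #include <vector>

    // keys and values in parallel vectors; the binary search touches
    // only the key vector, so the cache lines near the "root" of the
    // implicit tree aren't polluted with values
    template <typename Key, typename Value>
    struct soa_map
    {
        std::vector<Key>   keys;    // invariant: sorted
        std::vector<Value> values;  // values[i] belongs to keys[i]

        const Value* find(const Key& k) const
        {
            typename std::vector<Key>::const_iterator it =
                std::lower_bound(keys.begin(), keys.end(), k);
            if (it == keys.end() || k < *it)
                return 0;                        // not found
            return &values[it - keys.begin()];
        }
    };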