Re: [boost] Proposed templated integer_sort

8 Jul 2008

      Hi Phil,

I haven't gotten around to making changes yet (weekends are when I have
time), but I still have some feedback.

- Use new and delete rather than *alloc and free.
...
- Can any exceptions occur?  If so, is the dynamically allocated memory
leaked?  Could you use a smart pointer to avoid this?
Using *alloc, they should return NULL if allocation failed, and that
situation is handled.   With new and delete I'll have to catch memory
exceptions (and write a constructor for the Bin structure) and probably use
smart pointers.  I could also convert to only using a fixed-size bin array,
and putting this on the stack, as if you look at what the default constants
do, the bin array ends up being of size MAX_SPLITS.
If it doesn't impact performance, I'll be happy to do it the C++ way.
Otherwise I'll have to investigate what's going on in detail.
...
- Does it work with uint64_t?  I see some variables declared as 'long',
which often can't store a uint64_t.
There exists the problem of how to identify and deal with negatives.  I try
to avoid forcing any particular return data type, but in the case of divMin
and divMax, they are used in performance-critical loops, so they should be
used as constants without any extra computation.  They are already divided
by 512, so an unsigned should fit inside a signed data type without trouble.
Should I use an int64_t?

- If std::vector<int>::iterator really is slower than int* (and I suggest
...
some investigation of the assembler to see why this is), then you could add
a specialisation that converts vector::iterators into pointers.
Good point.  I can look at that, though it could be due to my ancient
system.  I see the same performance problem with std::sort, and for a vector
that isn't vector<bool> (why would anybody sort a vector<bool>?) I should be
able to convert to a pointer.
I haven't read assembler since college, and I don't even remember how to
obtain the assembler for C++ source.
I do also have a Windows development environment if finding assembler on
that is easier.
...
- Is there a use-case for a pure radix sort, without the std::sort
fallback?  i.e. are there types that can't easily provide a comparison
function, or distributions where the std::sort fallback is never useful?
A stable, external LSB Radix Sort could be templated by overriding the &
operator, and passing in the key size in bits as an argument.  This would be
particularly useful to those who need a stable sort, and for those with
small keys (32 bits or less) and memory to spare, it should be quite fast
for large n, running in O(n(keysize/bincount)) operations, and without
recursion.  The ideal bincount would be close to Spreadsort's MAX_SPLITS,
probably between 9 and 11.  It would probably be best to have a radix_sort
and a float_radix_sort.  I make no claim to be an expert on LSB Radix sort,
and would welcome someone else writing it.  If I did write it, I'd probably
take some fast open-source implementation and just template it.

Steve

Re: [boost] Proposed templated integer_sort

Steven Ross