
In reply to Steve Watanabe: Thanks. I'd already tried bjam, but what I figured out is that a link error: /usr/bin/ld: unknown flag: --start-group is preventing the executable from being built, so the tests don't run. So instead I link the .o file that it was built by bjam, run it, and it runs my tests. This ought to just work. I presume that you are using gcc. What OS? I am using gcc 4.0.1 on Mac OSX10.4 (PPC). I attempted to install gcc4.3.2, and it failed, even though I added the libraries it stated it was dependent upon. Apple only provides binaries past gcc4.0 for 10.5, so I'm inclined to believe that 10.4 is missing necessary support in the OS for the latest gcc libraries. On Wed, Jan 14, 2009 at 9:51 AM, Phil Endecott < spam_from_boost_dev@chezphil.org> wrote:
<http://www.boostpro.com/vault/index.php?action=downloadfile&filename=algorithm_sorting.tar.gz&directory=&>Some first impressions follow. I will try to find time to look at it more carefully at some point but I can't promise.
You seem to have quite a lot of code that's duplicated for the functor and non-function versions. Can't this be eliminated by using e.g. std::less as a default for the compare template parameter? (Isn't this what std::sort does?)
In the version of STL I downloaded to look at, there is the functor version of std::sort, and the non-functor version. The code is entirely duplicated at all levels where comparison are used, because the functor adds overhead (it takes up space on the stack, if nothing else). This overhead is small (around 2-5%), but measurable. Considering how much effort I went to squeezing out every 1% improvement I could, I would consider it wasteful to accept that kind of penalty just to reduce code size in the library. I've combined functions between functor/non-functor versions where I could without sacrificing measurable performance. My philosophy is that I (as the library developer) go to the effort to make things as fast and flexible as possible, so that those who include the library don't have to worry about any of that.
I suggest that you get rid of the tab characters in the source. I personally find your lines rather long, but that's a matter of taste.
I'll make sure to remove the ones I missed, thanks for the reminder. I'll also shorten lines over 80 characters.
I think you can use things like typename RandomAccessIter::value_type instead of passing *first in order to determine data_type.
I tried that, but it didn't compile on my system (MacOSX PPC gcc4.0.1). *first works just the same, though I agree, ::value_type is prettier. I'll see if I can find out how to make it work on my system, and thus hopefully everybody's system, but no guarantees.
I'm not sure how useful your functor-versions are since they still seem to depend on the "integer-ness" or "float-ness" of the sorted type. Apart from sorting backwards, what use cases are there? Using functors would be more useful if it let you sort types that are neither integers nor floats; for example, fixed-point types or pointers-to-integers or structs containing a key field to be sorted on.
Did you look at my KeyPlusData sample or the index.html? integer_sort can sort anything with an integer or integer-convertible key float_sort can sort anything with a floating-point key string_sort can sort anything with a string key, including (though not tested yet) wide character strings That's what the functor versions are there for: sorting classes or structures based upon a key. All you have to do is define the correct functor. I believe that most single-key data types can be sorted by one of these three functions and the appropriate functors. Fixed-point types can sort as an integer of the same size in bytes. The only real requirement is that they must return a data type that when subtracted from an item of the same type returns something convertible to a size_t, that if (b - a) < (c - a) then b < c, and that bitshifting works as division by a power of two would, reducing the difference by a power of two per shift. Pointers-to-integers need a shift functor: *x >> offset; and a compare functor *x < *y. KeyPlusData shows an example of a struct with a key field.
I think that the key here is that you make assumptions about the return type of operator>> or the right_shift functor which are not true in those cases. It may be that you just need to rename that functor and document what it needs to do. I think it's really "get N bits from key". I would have thought that the floating-point algorithm would just be an instance of the generic algorithm with custom functors, with a bit of specialisation to cope with the reverse-order-for-negatives thing.
Positive floats can be sorted just like integers, with the right functor. Negative floats have to be sorted backwards. I didn't see any efficient way to write a functor to handle negative floats. Hence, a mostly separate algorithm, but positive float sorting shares code with integer_sort. The nice thing about how I wrote float_sort is that it sorts floats almost as fast as integers (I see differences of under 10%).
One of the things that your string sort does is to not compare known-equal initial substrings. Does anyone know if there is a variant of a conventional sort (e.g. quicksort) that does this?
I'm pretty sure that's a radix-sorting trick; it's incompatible with only taking a comparison function, and requires paying attention to the radix index. Plenty of radix sorts do it. The speedup I attained was on the order of 5%; more than enough to be worth doing, but not a huge gain. String comparisons are cache-friendly, so they're still fast relative to swap operations on my test systems.
Some more documentation of the performance benefit would be useful. How about some graphs?
Okay. It will vary substantially from system to system (such as my odd 28X float_sort speedup on Cygwin, where on my system it's roughly 2X as fast, or the minimal speed improvement of string_sort on Cygwin, vs 2X on my system), across different compilers, and depending upon the data distribution. I could graph runtime vs. file size on evenly distributed random data on multiple platforms for each of the 3 algorithms relative to std::sort. Does that sound like what you're looking for? My experience is that Spreadsort tends to be a little faster on unevenly distributed data than evenly distributed data, as it starts bucketsorting; my performance tests include various distributions, but they take a long time to run and it would be impractical and probably confusing to the reader to graph their results. I'll look at std::numeric_limits<>::is_iec559 and see if that does what I want.