
On Sun, Jul 5, 2009 at 1:01 AM, Vladimir Prus <vladimir@codesourcery.com> wrote:
Steven Ross wrote:
If anybody would like to look at my testcase, it is in the boost vault, inside algorithm_sorting.zip. Inside there, it is in libs/algorithm/sorting/samples/float_as_integer.cpp.
This version fails to compile for me, because it includes boost/static_warning.hpp, which does not seem to be present in my SVN HEAD tree. I've changed the include to boost/serialization/static_warning.hpp, but this can't be right. Am I missing something?
I downloaded boost_1_38_0, and I have boost/static_warning.hpp in my tree. If it would make it easier for others to compile, I'll shift the include to serialization.
To get bigger times that are less prone to noise, I have changed loopCount in your test to 10. Then, the time reported without -std is 1 second, with very small difference between each run. The time with -std is 43 seconds, again with negligible variation.
If I add:
cxxflags="-march=nocona -mfpmath=sse"
then -std runs in 16 seconds, while the time for your algorithm is the same 1 second. Of course, this is still a significant speedup, but smaller than the number you have reported, so a deeper look is probably reasonable.
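For context on what is being compared in these timings, here is a minimal sketch of the general "float as integer" idea: map each float to an integer key whose integer ordering matches the floating-point ordering. This is an illustration only, not the code in float_as_integer.cpp; the helper names float_to_ordered_key and sort_floats_by_integer_key are hypothetical, std::sort is used as a stand-in for the library's algorithm, and 32-bit IEEE-754 floats with no NaNs are assumed.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>
#include <vector>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes 32-bit floats");

// Hypothetical helper (not from the library): map a float to an unsigned
// key so that comparing keys as integers orders the floats numerically.
// Assumes IEEE-754 single precision and that no NaNs are present.
inline std::uint32_t float_to_ordered_key(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);  // well-defined bit reinterpretation
    // Negative values: flip every bit so larger magnitudes get smaller keys.
    // Non-negative values: set the sign bit so they sort above all negatives.
    return (bits & 0x80000000u) ? ~bits : (bits | 0x80000000u);
}

// Stand-in sort: std::sort driven purely by integer comparisons on the keys.
inline void sort_floats_by_integer_key(std::vector<float>& v) {
    std::sort(v.begin(), v.end(), [](float a, float b) {
        return float_to_ordered_key(a) < float_to_ordered_key(b);
    });
}
```

One caveat of this mapping: -0.0 gets a smaller key than +0.0, whereas a plain float comparison treats them as equal.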
Thanks for the suggestion, Vladimir; that's a substantial difference, and I will investigate. I also appreciate the verification that it's not just my system doing this, and I don't expect a 120X speedup across all tests and systems.
Incidentally, does your library only sort a collection of integers/floats, or does it also allow sorting an arbitrary collection where each item has an integer/float key? If so, can you point me at an example?
Yes. samples/keyplusdatasample.cpp for integer keys + string data. I see a 3X improvement for this test, much more than I get for just sorting integers on their own. samples/floatfunctorsort.cpp shows the functors you need to pass to float_sort, but the test just uses a float; you should be able to just replace DATATYPE and update the functors. I'll modify it to use key + data so there is an example of such usage. I will note that floatfunctorsort.cpp does not use my new copying approach, so it only gets around a 7X speedup, and possibly less with the optimizations you applied.
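Until that sample is updated, here is a rough sketch of what a float key + data element and its functors might look like. The names KeyedItem, lessthan, rightshift and sort_items are hypothetical, and the assumption that the functors are a right-shift on the key bits plus a less-than comparison is mine, not something confirmed in this thread. The float_sort call itself is deliberately not shown, since its exact signature is not given here; std::sort with the comparison functor is used as a stand-in.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

static_assert(sizeof(float) == sizeof(std::int32_t),
              "sketch assumes 32-bit floats");

// Hypothetical element type: a float key carrying a string payload.
struct KeyedItem {
    float key;
    std::string data;
};

// Comparison functor: orders elements by key only.
struct lessthan {
    bool operator()(const KeyedItem& a, const KeyedItem& b) const {
        return a.key < b.key;
    }
};

// Right-shift functor (assumed shape): reinterpret the key's bits as a
// signed integer and shift them right, as a radix-style sort would need
// in order to bucket elements by their high-order bits.
// Note: right-shifting a negative signed value is implementation-defined
// before C++20 (arithmetic shift on common compilers).
struct rightshift {
    std::int32_t operator()(const KeyedItem& x, unsigned offset) const {
        std::int32_t bits;
        std::memcpy(&bits, &x.key, sizeof bits);
        return bits >> offset;
    }
};

// Stand-in for the library call: sorts by key, moving the data with it.
inline void sort_items(std::vector<KeyedItem>& items) {
    std::sort(items.begin(), items.end(), lessthan());
}
```

The point of the sketch is only that the payload travels with the key: the sort never inspects data, so swapping in a different payload type should only require changing the struct.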