[sorting] Assembly language extensions to the algorithms

Lewis Hyatt wrote:
Are you sure about the quicksort speed difference? That seems like a huge difference. Did you have inlining enabled? Also, the current versions are not the final forms. I am implementing random-access iterators and user-provided functions.
Yes, this was on cygwin compiled with g++ -O3. One thing to keep in mind is that your routines are competing with the standard library, which can be highly optimized for whatever target platform the compiler was written for. Unless you have a dramatically improved algorithm, it is unlikely that your sorting routines will beat std::sort in general, even if they do on some platforms. As an example, suppose your goal was to write an equivalent of std::bitset which beat the standard implementation. It is unlikely that you could exceed the performance of std::bitset::count(), for instance, because most implementations use native assembly instructions to do this whenever possible. (e.g., gcc uses its built-in popcount function on platforms where it is available.) On the other hand, if you hand-coded a version with the best algorithm you could find, it might beat a few poor implementations, but in general it would be inferior. The only way to make it better in general would be to implement all these optimizations yourself for each target platform, but at that point, it's just not worth the time. I don't think it's true that std::sort, std::stable_sort, and std::partial_sort are the end of the story, but I think any work in a sorting library should focus on needs which are not already met. So far, only radix sort appears to fall in that category, but you would also need to demonstrate a case in which radix sort is preferable to std::sort. -lewis I think that you are right that each compiler's version of the stl has the advantage of assembly language. Since I know both that speed and features are important to developers, what would people think if I uploaded to vault x86 assembly language extensions to the algorithm for users to download. The whole algorithm will not be coded in assembly language(only the tight inner loop), but it could only be used on built in types. Would you use the extensions if they were supplied?
participants (1)
-
Sam Schetterer