
On Mon, Jul 6, 2009 at 11:07 AM, Vladimir Prus <vladimir@codesourcery.com> wrote:
Yes, I guess I was using denormalized numbers for the 23-bit case, but it's interesting that there is such a huge speed difference, and I doubt it's because they're denormalized; there is no reason to special-case denormalized sorting.
It is well possible that comparing denormalized numbers is slow.
Taking 10M elements from my 23-bits-of-randomness case and setting the top exponent bit to 1, so that the data is normalized:

    if (!(float_mem_cast<float, int>(array[v]) & 0x7f800000)) {
      // Make the top exponent bit high
      CAST_TYPE temp = 0x40000000 | float_mem_cast<float, int>(array[v]);
      memcpy(&(array[v]), &temp, sizeof(DATATYPE));
      //printf("new %e\n", array[v]);
    }

I see std::sort take 83.527s, as compared to 1.226s when sorted as integers. Denormalized numbers are not the problem.
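For clarity, by "sorted as integers" I mean the usual IEEE 754 key trick: map each float to an integer whose unsigned ordering matches the float ordering, and sort those keys with plain integer comparisons. A minimal sketch of that idea (the helper names are mine, not the exact benchmark code):

    #include <algorithm>
    #include <cstdint>
    #include <cstring>
    #include <vector>

    // Map an IEEE 754 float to a 32-bit unsigned key whose unsigned ordering
    // matches the float ordering: flip all bits of negatives, flip only the
    // sign bit of non-negatives.
    inline std::uint32_t float_key(float f) {
      std::uint32_t u;
      std::memcpy(&u, &f, sizeof(u));
      return (u & 0x80000000u) ? ~u : (u | 0x80000000u);
    }

    // Sort the floats by sorting their integer keys, then map back.
    void sort_as_integers(std::vector<float>& data) {
      std::vector<std::uint32_t> keys(data.size());
      for (std::size_t i = 0; i < data.size(); ++i)
        keys[i] = float_key(data[i]);
      std::sort(keys.begin(), keys.end());
      for (std::size_t i = 0; i < data.size(); ++i) {
        // Invert the key transform to recover the original bit pattern.
        std::uint32_t u = (keys[i] & 0x80000000u) ? (keys[i] & 0x7fffffffu) : ~keys[i];
        std::memcpy(&data[i], &u, sizeof(float));
      }
    }

The transform is order-preserving and invertible, so the final ordering is the same one std::sort would produce on the floats, but every comparison is a plain integer compare.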
Notably, normalized numbers also show the same problem if their exponent range is small relative to the number of elements; that happens either with a small number of distinct exponents or simply with a very large number of floats (as shown in my large test). What do you consider a reasonable range? I can easily make a test that excludes denormalized numbers.
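To make "small exponent range relative to the number of elements" concrete, test data of that shape can be generated roughly like this (just a sketch, not my actual test harness; the base exponent of 100, the seed, and the function name are arbitrary choices of mine):

    #include <cstdint>
    #include <cstring>
    #include <random>
    #include <vector>

    // Build `count` floats whose 23-bit mantissas are random but whose biased
    // exponents are drawn from only `exponent_count` distinct values (>= 1),
    // so the exponent range stays small relative to the element count.
    std::vector<float> make_narrow_exponent_data(std::size_t count,
                                                 std::uint32_t exponent_count) {
      std::mt19937 gen(42);
      std::uniform_int_distribution<std::uint32_t> mantissa(0, 0x7fffffu);             // 23 random bits
      std::uniform_int_distribution<std::uint32_t> exponent(100, 99 + exponent_count); // few exponents
      std::vector<float> data(count);
      for (float& f : data) {
        std::uint32_t bits = (exponent(gen) << 23) | mantissa(gen); // sign bit left at 0
        std::memcpy(&f, &bits, sizeof(f));
      }
      return data;
    }

Calling this with count = 10000000 and exponent_count = 4 reproduces the kind of distribution I described above: all values normalized, with only a handful of exponents shared across millions of elements.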
Even if the full exponent range is in use (which isn't realistic), wouldn't it be better to use an algorithm that doesn't blow up when n gets much larger than 10 million elements, and is still faster with a smaller number of elements?
I don't quite follow the above point. My argument is (and always was) that it's important to provide accurate measurements on the data that ordinary humans will use. Sorting 200'000'000 floats does not seem like a typical use case, so performance on that is probably less important. Just like sorting denormalized numbers.
Normal problems don't use the full space of possible exponents for floating-point numbers. Can you think of a benchmark that accurately reflects this reality?
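One possible shape for such a benchmark, just as a sketch (the normal distribution, sizes, and seed below are assumptions of mine, not a settled proposal): generate values whose exponents cluster the way real measurements do, and time std::sort on them.

    #include <algorithm>
    #include <chrono>
    #include <cstdio>
    #include <random>
    #include <vector>

    int main() {
      const std::size_t n = 10000000;
      std::mt19937 gen(1);
      // Normally distributed data uses only a narrow band of exponents,
      // which is closer to most real workloads than uniform random bits.
      std::normal_distribution<float> dist(0.0f, 1000.0f);
      std::vector<float> data(n);
      for (float& f : data) f = dist(gen);

      auto start = std::chrono::steady_clock::now();
      std::sort(data.begin(), data.end());
      auto stop = std::chrono::steady_clock::now();
      std::printf("std::sort on %zu floats: %.3f s\n", data.size(),
                  std::chrono::duration<double>(stop - start).count());
      return 0;
    }

The same generated data could also be fed through the integer-key path sketched earlier, so the two approaches can be compared on a distribution closer to what people actually sort.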