
Tomas Puverle wrote:
First of all, *thank you* for going to all this trouble and writing the code.
I needed to write some code to understand your use case. Benchmarks are always criticized, especially hastily created ones. However, it's important to code something, because "best" always implies a "for what". Doing this has enhanced my awareness of endianness issues (and I thought I already knew everything). I'm considering adding an endian_convert_in_place<endian_t, T>(T& x) function to my library, which will be implemented using std::swap. So *thank you* for raising my awareness; hopefully others are monitoring this thread as well.
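To make the idea concrete, here is a rough sketch of what such an in-place conversion might look like. It is only an illustration: the endian_t enum, the host_endian() helper, and the run-time host check are assumptions made for this example, not the library's actual interface.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <utility>  // std::swap

// Hypothetical tag type; the real endian_t in the library may look different.
enum class endian_t { little, big };

// Detect the host byte order at run time.
// Assumption: the host is either little or big endian, never mixed.
inline endian_t host_endian() {
    const std::uint16_t probe = 1;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);
    return first == 1 ? endian_t::little : endian_t::big;
}

// Convert x in place between the Stored byte order and the host byte order.
template <endian_t Stored, typename T>
void endian_convert_in_place(T& x) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "byte reversal only makes sense for trivially copyable types");
    if (Stored == host_endian()) return;  // same order: nothing to do
    unsigned char* p = reinterpret_cast<unsigned char*>(&x);
    // Swap bytes pairwise, working from both ends toward the middle.
    for (std::size_t i = 0, j = sizeof(T) - 1; i < j; ++i, --j) {
        std::swap(p[i], p[j]);
    }
}
```

A call such as endian_convert_in_place<endian_t::big>(value) would then reverse the bytes on a little-endian host and do nothing on a big-endian one.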
I have some problems with your tests.
I was going to make the same points as Robert, but he beat me to it. Additionally (I don't think this was mentioned), I don't think the verification should be part of the test, either.
As Robert pointed out, it should do something with the result, otherwise the whole thing could be optimized away by a clever compiler. Plus, the assert has alerted me to over-zealous optimizations. It's the same overhead for both tests, so it shouldn't affect the results.
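For illustration, one other way to keep the converted data "used" is to fold it into a value that the caller has to consume. This is just a sketch and not the code from my test, which relies on assert and memcmp; byte_swap32 and convert_and_checksum are hypothetical names standing in for whichever conversion is being measured.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the conversion under test.
inline std::uint32_t byte_swap32(std::uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Convert every element in place and fold the results into a checksum.
// Returning (and later printing or comparing) the checksum gives the
// compiler a visible use of the converted values.
inline std::uint32_t convert_and_checksum(std::vector<std::uint32_t>& data) {
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i < data.size(); ++i) {
        data[i] = byte_swap32(data[i]);
        sum ^= data[i];
    }
    return sum;
}
```

As long as both approaches pay for the same checksum, it adds the same cost to each and should not change how they compare.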
When the disk-data-file was in little endian, both approaches came in at around 6 seconds.
Both approaches seem to do nothing. That's as it should be. I'm just using Cygwin's "time" command to evaluate the results. So the times given represent program initialization/deinitialization; generating and writing the data file; opening, reading and closing the data file for each iteration; and verifying that the conversion was correct for each iteration using memcmp. I'm not trying to measure how fast each approach is in absolute terms, just how the resulting times compare given a common test app. The overhead should be the same in both cases.
This, I think, you'll find is the overhead of reading the file & verification. In the little endian data case, swap_in_place<>() does nothing! :)
Both approaches seem to do nothing in the same-endian case.
Swap-based: 18 seconds
Type-based: 14 seconds
This is unexpected...
Swap-then-copy is less efficient than just a reverse copy (in the general case). These differences are more significant than I expected. If we allow 9 seconds for the disk I/O and memcmp, then the overhead is approximately:
Type-based: 36% overhead, (14-9)/14
Swap-based: 50% overhead, (18-9)/18
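To show the difference I mean, here is a minimal sketch of the two strategies. The function names are made up for this example and neither is the actual code from either library. It assumes T is trivially copyable and that src and dst are distinct objects.

```cpp
#include <algorithm>  // std::reverse, std::reverse_copy
#include <cstddef>
#include <cstdint>
#include <cstring>

// Swap-based style: copy the bytes across, then reverse them in place.
// The destination bytes are written twice.
template <typename T>
void copy_then_swap(const T& src, T& dst) {
    std::memcpy(&dst, &src, sizeof(T));
    unsigned char* p = reinterpret_cast<unsigned char*>(&dst);
    std::reverse(p, p + sizeof(T));
}

// Reverse copy: each byte is read once and written once, already in its
// final position. src and dst must not overlap.
template <typename T>
void reverse_copy_bytes(const T& src, T& dst) {
    const unsigned char* s = reinterpret_cast<const unsigned char*>(&src);
    unsigned char* d = reinterpret_cast<unsigned char*>(&dst);
    std::reverse_copy(s, s + sizeof(T), d);
}
```

Both produce the same bytes in dst; the reverse copy simply gets there in one pass instead of two. Whether that accounts for the whole gap in the timings above is something only measurement can tell.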
Thanks again. If there is a performance discrepancy between swap (soon to be endian_cast<>) and the object-based approach, I will make sure to fix it, of course.
If you implement endian_cast<> using a reverse copy instead of swapping, you will see a performance improvement.

terry