
On 10.09.2009 at 21:41, Simonson, Lucanus J wrote:
> Hmm, I brought this exact idea to Boost about three years ago. I had min, abs and median of three. [code here] You can improve the efficiency for larger types by using arrays of pointers to the arguments instead of arrays of argument values.

I implemented that...
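For anyone following the thread, the pointer-array idea can be sketched roughly as below. This is only an illustration, not the code elided above: the name select_min is hypothetical, and whether the compiler really emits a branch-free sequence depends on the type, the compiler and the target.

```cpp
// Hypothetical helper, not taken from the original post.
// Selects the smaller argument by indexing into an array of pointers,
// so the arguments themselves are never copied (useful for larger types).
template <typename T>
const T& select_min(const T& a, const T& b) {
    const T* candidates[2] = { &a, &b };
    // (b < a) converts to 0 or 1, so b is chosen only when it is strictly
    // smaller; on ties the first argument is returned, like std::min.
    return *candidates[static_cast<int>(b < a)];
}
```

As with std::min, the result is a reference to one of the arguments, so it should not outlive them.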
> In my own testing the speedup for Prescott was 25% for min and 45% for median of three. I would expect less speedup than that on Core 2 processors because the pipeline is shorter.

The pipeline is shorter for a reason; it should not degrade overall performance (on the contrary, it should speed things up).
> Also there is a new instruction in Core 2 that allows the min to compile to a branchless instruction stream. You need to use the proper flags with the compiler to enable the use of new instructions. If you are compiling in max compatibility mode you are dropping performance on the floor. In our testing, branch-free min (as you propose) was no faster than a < b ? a : b; when compiling with the new instructions enabled for a Core 2 machine. Median of three was still 25% faster though. The response from Boost three years ago was that such low-level, architecture-specific optimizations have no place in portable libraries.

I wasn't here 3 years ago. However, as Stepanov once said, "great minds think alike". Anyway, thanks, I'll think about it and certainly try it out.
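For reference, one way a median of three can be written without data-dependent branches is a small lookup table indexed by the three comparison results. This is a sketch under my own assumptions, not the implementation discussed above; the name median3 is hypothetical, and no timing claim is attached to it.

```cpp
// Hypothetical helper, not taken from the original post.
// Packs the three comparison results into a 3-bit index and looks the
// median up in a table of pointers to the arguments.
template <typename T>
const T& median3(const T& a, const T& b, const T& c) {
    const unsigned idx = static_cast<unsigned>(a < b)
                       | (static_cast<unsigned>(b < c) << 1)
                       | (static_cast<unsigned>(a < c) << 2);
    // Indices 3 and 4 cannot occur for a consistent ordering; they are
    // filled with &b only so that every table slot is valid.
    const T* table[8] = { &b, &a, &c, &b, &b, &c, &a, &b };
    return *table[idx];
}
```

For example, median3(3, 1, 2) gives the comparison bits 0, 1, 0, so the index is 2 and the table selects c, which is 2.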
-- Pavel

P.S. Good luck with your polygon lib.