[accumulator][minmax_element] Perfomance comparsion different min max algorithm

Hi, I have made today a few tests to compare the different possibilities to search min,max of a value. I had some really strange results which I don't have expected. As environment I use VS2008 Express edition,not managed, defines are SECURE_SCL=0 and NDEBUG=1. In the comparsion I got the following timings: <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< min_element,max_element:73 minVal=0,maxVal=9999999 min_max: accumulator:2338 minVal=0,maxVal=9999999 min_max: minmax_element:86 minVal=0,maxVal=9999999 min/max handcoded:135 minVal=0,maxVal=9999999 <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< For me the strange things are: - that std::min_element and std::max_element is faster as boost::minmax_element - the accumulator library seems in this example really slow Best regards Hansjörg The source code was: #include <algorithm> #include <boost/accumulators/accumulators.hpp> #include <boost/accumulators/statistics.hpp> #include <boost/algorithm/minmax_element.hpp> #include <boost/array.hpp> #include <boost\chrono\chrono.hpp> using namespace boost; using namespace boost::chrono; using namespace std; typedef array<int,10000000> dataarray; dataarray data;

Hansi wrote:
I have made today a few tests to compare the different possibilities to search min,max of a value. I had some really strange results which I don't have expected. As environment I use VS2008 Express edition,not managed, defines are SECURE_SCL=0 and NDEBUG=1. <snip>
Did you turn on compiler optimizations? If not, the results are meaningless. If you did, please post the full source code. What you posted was incomplete. Thanks, -- Eric Niebler BoostPro Computing http://www.boostpro.com

Eric Niebler schrieb:
Did you turn on compiler optimizations? If not, the results are meaningless. If you did, please post the full source code. What you posted was incomplete.
sorry..yes I have turned on compiler optimization. #include <algorithm> #include <boost/accumulators/accumulators.hpp> #include <boost/accumulators/statistics.hpp> #include <boost/algorithm/minmax_element.hpp> #include <boost/array.hpp> #include <boost\chrono\chrono.hpp> using namespace boost; using namespace boost::chrono; using namespace std; typedef array<int,10000000> dataarray; dataarray data; int _tmain(int argc, _TCHAR* argv[]) { high_resolution_clock::time_point start = high_resolution_clock::now(); high_resolution_clock::time_point stop = high_resolution_clock::now(); for(unsigned int i = 0; i< data.size();i++) { data[i] = i; } start = high_resolution_clock::now(); dataarray::iterator resmin = std::min_element(data.begin(),data.end()); dataarray::iterator resmax = std::max_element(data.begin(),data.end()); stop = high_resolution_clock::now(); cout<<"min_element,max_element:"<<duration_cast<milliseconds>(stop-start).count()<<endl; cout<<"minVal="<<*resmin<<",maxVal="<<*resmax<<endl; boost::accumulators::accumulator_set<int, boost::accumulators::features<boost::accumulators::tag::min,boost::accumulators::tag::max>> acc; start = high_resolution_clock::now(); acc = std::for_each(data.begin(),data.end(),acc); stop = high_resolution_clock::now(); int minVal = boost::accumulators::extract_result< boost::accumulators::tag::min >(acc); int maxVal = boost::accumulators::extract_result< boost::accumulators::tag::max >(acc); cout<<"min_max: accumulator:"<<duration_cast<milliseconds>(stop-start).count()<<endl; cout<<"minVal="<<minVal<<",maxVal="<<maxVal<<endl; start = high_resolution_clock::now(); pair< vector<int>::iterator, vector<int>::iterator > result = boost::minmax_element(data.begin(), data.end()); stop = high_resolution_clock::now(); cout<<"min_max: minmax_element:"<<duration_cast<milliseconds>(stop-start).count()<<endl; cout<<"minVal="<<*result.first<<",maxVal="<<*result.second<<endl; minVal = (std::numeric_limits<int>::max)(); maxVal = (std::numeric_limits<int>::min)(); start = high_resolution_clock::now(); for(dataarray::iterator it = data.begin(); it!=data.end();it++) { minVal = (std::min)(minVal,*it); maxVal = (std::max)(maxVal,*it); } stop = high_resolution_clock::now(); cout<<"min/max handcoded:"<<duration_cast<milliseconds>(stop-start).count()<<endl; cout<<"minVal="<<minVal<<",maxVal="<<maxVal<<endl; return 0; }

Hansi wrote:
I have made today a few tests to compare the different possibilities to search min,max of a value. I had some really strange results which I don't have expected. As environment I use VS2008 Express edition,not managed, defines are SECURE_SCL=0 and NDEBUG=1.
In the comparsion I got the following timings:
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< min_element,max_element:73 minVal=0,maxVal=9999999 min_max: accumulator:2338 minVal=0,maxVal=9999999 min_max: minmax_element:86 minVal=0,maxVal=9999999 min/max handcoded:135 minVal=0,maxVal=9999999 <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
For me the strange things are: - that std::min_element and std::max_element is faster as boost::minmax_element - the accumulator library seems in this example really slow
I made some changes to your code. I switched from chrono to boost::timer because that measures CPU time, rather than wall-clock time (at least, I think that's what your chrono-based code was measuring; it seemed more variable). I made it test an array with random contents as well as one with numbers in order. I increased your array size by a factor of 10 to get more stable numbers. Variation attached. With that, on i686 Linux, gcc -O3 I get: sorted array: min_element,max_element:1.89 minVal=0,maxVal=99999999 min_max: accumulator:0.45 minVal=0,maxVal=99999999 min_max: minmax_element:0.48 minVal=0,maxVal=99999999 min/max handcoded:0.79 minVal=0,maxVal=99999999 random array: min_element,max_element:1.8 minVal=7,maxVal=2147483611 min_max: accumulator:0.36 minVal=7,maxVal=2147483611 min_max: minmax_element:0.6 minVal=7,maxVal=2147483611 min/max handcoded:0.71 minVal=7,maxVal=2147483611 And with icc -O3: sorted array: min_element,max_element:0.74 minVal=0,maxVal=99999999 min_max: accumulator:0.51 minVal=0,maxVal=99999999 min_max: minmax_element:0.48 minVal=0,maxVal=99999999 min/max handcoded:0.51 minVal=0,maxVal=99999999 random array: min_element,max_element:0.75 minVal=7,maxVal=2147483611 min_max: accumulator:0.51 minVal=7,maxVal=2147483611 min_max: minmax_element:0.68 minVal=7,maxVal=2147483611 min/max handcoded:0.51 minVal=7,maxVal=2147483611 I'm quite surprised by these numbers, in that: - gcc sometimes beats icc - gcc's accumulator looks significantly faster on random data than sorted data (indeed, on random it's the fastest thing of all). I'd guess that minmax_element is faster on sorted data because branch prediction works better there. Nevertheless, it's clearly very different from your results, in that we have opposite fastest and slowest methods. I suspect you're missing some crucial optimization (probably the inlining of a function somewhere in accumulator). John Bytheway

<snip>
With that, on i686 Linux, gcc -O3 I get:
sorted array: min_element,max_element:1.89 minVal=0,maxVal=99999999 min_max: accumulator:0.45 minVal=0,maxVal=99999999 min_max: minmax_element:0.48 minVal=0,maxVal=99999999 min/max handcoded:0.79 minVal=0,maxVal=99999999 random array: min_element,max_element:1.8 minVal=7,maxVal=2147483611 min_max: accumulator:0.36 minVal=7,maxVal=2147483611 min_max: minmax_element:0.6 minVal=7,maxVal=2147483611 min/max handcoded:0.71 minVal=7,maxVal=2147483611
And with icc -O3:
sorted array: min_element,max_element:0.74 minVal=0,maxVal=99999999 min_max: accumulator:0.51 minVal=0,maxVal=99999999 min_max: minmax_element:0.48 minVal=0,maxVal=99999999 min/max handcoded:0.51 minVal=0,maxVal=99999999 random array: min_element,max_element:0.75 minVal=7,maxVal=2147483611 min_max: accumulator:0.51 minVal=7,maxVal=2147483611 min_max: minmax_element:0.68 minVal=7,maxVal=2147483611 min/max handcoded:0.51 minVal=7,maxVal=2147483611
I'm quite surprised by these numbers, in that: - gcc sometimes beats icc - gcc's accumulator looks significantly faster on random data than sorted data (indeed, on random it's the fastest thing of all).
I'd guess that minmax_element is faster on sorted data because branch prediction works better there.
Nevertheless, it's clearly very different from your results, in that we have opposite fastest and slowest methods. I suspect you're missing some crucial optimization (probably the inlining of a function somewhere in accumulator).
Your results are really different from my one, but normally I would say more expected. I can't find any wrong settings at the moment in my project. May be it helps when I post the command line parameters: Compiler settings: /GL /D "WIN32" /D "_CONSOLE" /D "_SECURE_SCL=0" /D "NDEBUG=1" /D "_UNICODE" /D "UNICODE" /FD /EHsc /MD /Yu"stdafx.h" /Fp"Release\Statistic.pch" /Fo"Release\\" /Fd"Release\vc90.pdb" /W3 /nologo /c /Zi /TP /errorReport:prompt Linker settings /OUT:"D:\projects\sandbox\BoostTests\Release\Statistic.exe" /INCREMENTAL:NO /NOLOGO /MANIFEST /MANIFESTFILE:"Release\Statistic.exe.intermediate.manifest" /MANIFESTUAC:"level='asInvoker' uiAccess='false'" /DEBUG /PDB:"d:\projects\sandbox\BoostTests\release\Statistic.pdb" /SUBSYSTEM:CONSOLE /OPT:REF /OPT:ICF /LTCG /DYNAMICBASE:NO /MACHINE:X86 /ERRORREPORT:PROMPT kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib BTW if I compile your program with debug I get an compile error : error C2664: 'std::_Vector_iterator<_Ty,_Alloc>::_Vector_iterator(const std::_Vector_iterator<_Ty,_Alloc> &)': can't convert from 'int *const ' to 'const std::_Vector_iterator<_Ty,_Alloc> &' in the line std::pair< std::vector<int>::iterator, std::vector<int>::iterator > result = boost::minmax_element(data.begin(), data.end()); Any ideas? Thanks Hansjörg

On Tue, Feb 17, 2009 at 08:17:59AM +0100, Hansi wrote:
Nevertheless, it's clearly very different from your results, in that we have opposite fastest and slowest methods. I suspect you're missing some crucial optimization (probably the inlining of a function somewhere in accumulator).
May be it helps when I post the command line parameters: Compiler settings: /GL /D "WIN32" /D "_CONSOLE" /D "_SECURE_SCL=0" /D "NDEBUG=1" /D "_UNICODE" /D "UNICODE" /FD /EHsc /MD /Yu"stdafx.h" /Fp"Release\Statistic.pch" /Fo"Release\\" /Fd"Release\vc90.pdb" /W3 /nologo /c /Zi /TP /errorReport:prompt
You seem to be missing any optimization options. I suggest turning on /O2, which should make a huge difference to your results. -- Pray remember that Bacchus was a warrior, and that his armies had little fighting to do, because wherever he appeared he taught the cultivation of the vine to the grateful and submissive natives. -- J.B. Morton http://surreal.istic.org/ God save the King!

May be it helps when I post the command line parameters: Compiler settings: /GL /D "WIN32" /D "_CONSOLE" /D "_SECURE_SCL=0" /D "NDEBUG=1" /D "_UNICODE" /D "UNICODE" /FD /EHsc /MD /Yu"stdafx.h" /Fp"Release\Statistic.pch" /Fo"Release\\" /Fd"Release\vc90.pdb" /W3 /nologo /c /Zi /TP /errorReport:prompt
You seem to be missing any optimization options. I suggest turning on /O2, which should make a huge difference to your results.
for some reason Visual studio has displayed to me full optimizations.....but it was not in reality. I had to change it and change it back and then I head it really... min_element,max_element:238 minVal=0,maxVal=99999999 min_max: accumulator:149 minVal=0,maxVal=99999999 min_max: minmax_element:144 minVal=0,maxVal=99999999 min/max handcoded:228 minVal=0,maxVal=99999999 now much more as expected... Regards Hansjörg

Hansi wrote:
Hi,
I have made today a few tests to compare the different possibilities to search min,max of a value. I had some really strange results which I don't have expected. As environment I use VS2008 Express edition,not managed, defines are SECURE_SCL=0 and NDEBUG=1.
In the comparsion I got the following timings:
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< min_element,max_element:73 minVal=0,maxVal=9999999 min_max: accumulator:2338 minVal=0,maxVal=9999999 min_max: minmax_element:86 minVal=0,maxVal=9999999 min/max handcoded:135 minVal=0,maxVal=9999999 <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
For me the strange things are: - that std::min_element and std::max_element is faster as boost::minmax_element ...
Why does this surprise you? std::min() (or std::max) should be faster that boost::minmax(). boost::minmax() should be faster than subsequent invocation of std::min() and then std::max(). BR, Dmitry

Dmitry Goncharov schrieb:
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< min_element,max_element:73 minVal=0,maxVal=9999999 min_max: accumulator:2338 minVal=0,maxVal=9999999 min_max: minmax_element:86 minVal=0,maxVal=9999999 min/max handcoded:135 minVal=0,maxVal=9999999 <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
For me the strange things are: - that std::min_element and std::max_element is faster as boost::minmax_element ...
Why does this surprise you? std::min() (or std::max) should be faster that boost::minmax(). boost::minmax() should be faster than subsequent invocation of std::min() and then std::max().
minmax_element is slower as 2 subsequent calls of min_element,max_element. This is suprising...
participants (5)
-
Daniel Hulme
-
Dmitry Goncharov
-
Eric Niebler
-
Hansi
-
John Bytheway