Efficiacy of std::vector
Hi,
I was interested in the efficiacy of std::vector.
Using the test program below, I receive the following results.
The "simple" handling is equivalent with the C array access.
The [] and iterator access time of the vector element
is an order of magnitude worse, than that of the simple access.
This is not because of index limit control, it is done by at()
and its use increases that access time by a factor of three.
What is going on in the background here, that the implementation
is so ineffective?
( I did my tests with WinXP/cygwin/g++/time,
using different setting of the #define-s at the beginning)
Shall I expect similar ineffectivity with the other STL objects?
Regards
Janos
Empty cycle
real 0m2.173s
user 0m1.892s
sys 0m0.080s
STL=1, SIMPLE=1, size=0
real 0m4.917s
user 0m4.646s
sys 0m0.070s
STL=1, size=0
real 0m33.939s
user 0m33.658s
sys 0m0.090s
STL=1, ITERATOR=1, size=1000
real 0m34.059s
user 0m33.688s
sys 0m0.100s
STL=1, AT=1, size=1000
real 1m22.769s
user 1m22.368s
sys 0m0.090s
#define STL 1
#define SIMPLE 0
#define ITERATOR 1
#define AT 0
#define EMPTY 1
#if STL
#include <vector>
#endif
using namespace std;
#include <iostream>
#define ARR_SIZE 1000
#define CYCLES 1000000
main (int argc, char** argv)
{
long int i,j,*P;
#if EMPTY
cout << "Empty cycle";
#else
#if STL
#if ITERATOR || AT
std::vector<long int>::iterator pos;
std::vector<long int> arr(ARR_SIZE);
#else
std::vector <long int> arr;
arr.reserve(ARR_SIZE);
#endif
cout << "STL=1";
#if SIMPLE
cout << ", SIMPLE=1";
#endif
#if ITERATOR
cout << ", ITERATOR=1";
#endif
#if AT
cout << ", AT=1";
#endif
cout << ", size=" << arr.size();
#else
long int *arr;
arr = new long int[ARR_SIZE];
cout << "STL=0";
#endif
#endif
cout << endl;
#if SIMPLE && STL
P = &arr[0];
#endif
for(j=0;j
Janos Vegh wrote:
Hi,
I was interested in the efficiacy of std::vector. Using the test program below, I receive the following results. The "simple" handling is equivalent with the C array access. The [] and iterator access time of the vector element is an order of magnitude worse, than that of the simple access. This is not because of index limit control, it is done by at() and its use increases that access time by a factor of three. What is going on in the background here, that the implementation is so ineffective? ( I did my tests with WinXP/cygwin/g++/time, using different setting of the #define-s at the beginning) Shall I expect similar ineffectivity with the other STL objects?
What optimization options did you use? There should be little difference in performance as long as the compiler inlines the vector access functions. Regards, /Patrik
On 8/26/06, Patrik Jonsson
What optimization options did you use? There should be little difference in performance as long as the compiler inlines the vector access functions.
I just used the default settings, no extra switches. BTW, you can edit and run the test program, maybe you can conclude something interesting through varying the optimization. Anyhow, the difference between simple and [] access is too big for a simple optimization, I think so. Janos
On 8/26/06, Janos Vegh
I just used the default settings, no extra switches. BTW, you can edit and run the test program, maybe you can conclude something interesting through varying the optimization. Anyhow, the difference between simple and [] access is too big for a simple optimization, I think so.
main has to return int; Your code isn't legal C++. Also, it seems
like you end up using memory in the vector after only reserving it,
not resizing it.
Your empty cycle time makes it obvious you're not even doing the most
rudimentary of optimizations.
Here's what I get in g++ (g++ (GCC) 4.1.1 (Gentoo 4.1.1)) after making
a few changes while using -O3 -DNDEBUG as arguments to the compiler
(though -DNDEBUG didn't make any signifigant difference):
Empty cycle
real 0m0.003s
user 0m0.000s
sys 0m0.000s
STL=1, SIMPLE=1, size=1000
real 0m1.357s
user 0m1.168s
sys 0m0.008s
STL=1, size=1000
real 0m1.337s
user 0m1.172s
sys 0m0.016s
STL=1, ITERATOR=1, size=1000
real 0m1.315s
user 0m1.172s
sys 0m0.012s
STL=1, AT=1, size=1000
real 0m1.357s
user 0m1.176s
sys 0m0.004s
I really don't see an issue here.
~ Scott McMurray
Code used:
#define STL 1
#define SIMPLE 0
#define ITERATOR 0
#define AT 1
#define EMPTY 0
#if STL
#include <vector>
#endif
using namespace std;
#include <iostream>
#define ARR_SIZE 1000
#define CYCLES 1000000
int main (int argc, char** argv) {
long int i,j,*P;
#if EMPTY
cout << "Empty cycle";
#else
#if STL
#if ITERATOR || AT
std::vector<long int>::iterator pos;
#endif
std::vector<long int> arr(ARR_SIZE);
cout << "STL=1";
#if SIMPLE
cout << ", SIMPLE=1";
#endif
#if ITERATOR
cout << ", ITERATOR=1";
#endif
#if AT
cout << ", AT=1";
#endif
cout << ", size=" << arr.size();
#else
long int *arr;
arr = new long int[ARR_SIZE];
cout << "STL=0";
#endif
#endif
cout << endl;
#if SIMPLE && STL
P = &arr[0];
#endif
for(j=0;j
On 8/26/06, me22
main has to return int; Your code isn't legal C++. Also, it seems like you end up using memory in the vector after only reserving it, not resizing it.
Yes, a forgotten int, my mistake. After reserving the memory, I guess I can write the memory locations. I think neither of the above affects the subject. Your empty cycle time makes it obvious you're not even doing the most
rudimentary of optimizations.
a few changes while using -O3 -DNDEBUG as arguments to the compiler (though -DNDEBUG didn't make any signifigant difference):
Empty cycle real 0m0.003s user 0m0.000s sys 0m0.000s
STL=1, SIMPLE=1, size=1000 real 0m1.357s user 0m1.168s sys 0m0.008s
STL=1, size=1000 real 0m1.337s user 0m1.172s sys 0m0.016s
STL=1, ITERATOR=1, size=1000 real 0m1.315s user 0m1.172s sys 0m0.012s
STL=1, AT=1, size=1000 real 0m1.357s user 0m1.176s sys 0m0.004s
I really don't see an issue here.
I neither, except that it looks like that in the case of AT checking the lower and upper
My goal was to measure the time of the empty processing, in order to estimate the time of the net calculation, rather than to find out that for no operation I need no time. Here's what I get in g++ (g++ (GCC) 4.1.1 (Gentoo 4.1.1)) after making limits of the index needs no extra time(??). Otherwise, on my system $ g++ -oeffic2.exe effic2.cpp $ time ./effic2.exe STL=1, SIMPLE=1, ITERATOR=0, AT=0, size=1000 real 0m4.775s user 0m4.476s sys 0m0.080s $ g++ -o3 -oeffic2.exe effic2.cpp $ time ./effic2.exe STL=1, SIMPLE=1, ITERATOR=0, AT=0, size=1000 real 0m4.749s user 0m4.496s sys 0m0.060s and $ g++ -o3 -oeffic2.exe effic2.cpp $ time ./effic2.exe STL=1, SIMPLE=0, ITERATOR=0, AT=0, size=1000 real 0m33.413s user 0m32.076s sys 0m0.100s $ g++ -oeffic2.exe effic2.cpp $ time ./effic2.exe STL=1, SIMPLE=0, ITERATOR=0, AT=0, size=1000 real 0m34.554s user 0m33.668s sys 0m0.060s and $ g++ -oeffic2.exe effic2.cpp $ time ./effic2.exe STL=1, SIMPLE=0, ITERATOR=0, AT=1, size=1000 real 1m26.753s user 1m24.301s sys 0m0.080s $ g++ -o3 -oeffic2.exe effic2.cpp $ time ./effic2.exe STL=1, SIMPLE=0, ITERATOR=0, AT=1, size=1000 real 1m29.311s user 1m27.035s sys 0m0.050s After that you were so kind to run my test on your Linux, maybe I need to ask MingW people about this problem. Anyhow, thanks for replying and sorry for bothering the list with my problem Janos
Use g++ -O3 (not -o3) for optimization.
You might try posting this to comp.lang.c++ since this has nothing to
do with boost.
Chris
On 8/26/06, Janos Vegh
On 8/26/06, me22
wrote: main has to return int; Your code isn't legal C++. Also, it seems like you end up using memory in the vector after only reserving it, not resizing it.
Yes, a forgotten int, my mistake. After reserving the memory, I guess I can write the memory locations. I think neither of the above affects the subject.
Your empty cycle time makes it obvious you're not even doing the most rudimentary of optimizations.
My goal was to measure the time of the empty processing, in order to estimate the time of the net calculation, rather than to find out that for no operation I need no time.
Here's what I get in g++ (g++ (GCC) 4.1.1 (Gentoo 4.1.1)) after making a few changes while using -O3 -DNDEBUG as arguments to the compiler (though -DNDEBUG didn't make any signifigant difference):
Empty cycle real 0m0.003s user 0m0.000s sys 0m0.000s
STL=1, SIMPLE=1, size=1000 real 0m1.357s user 0m1.168s sys 0m0.008s
STL=1, size=1000 real 0m1.337s user 0m1.172s sys 0m0.016s
STL=1, ITERATOR=1, size=1000 real 0m1.315s user 0m1.172s sys 0m0.012s
STL=1, AT=1, size=1000 real 0m1.357s user 0m1.176s sys 0m0.004s
I really don't see an issue here.
I neither, except that it looks like that in the case of AT checking the lower and upper limits of the index needs no extra time(??).
Otherwise, on my system $ g++ -oeffic2.exe effic2.cpp $ time ./effic2.exe STL=1, SIMPLE=1, ITERATOR=0, AT=0, size=1000 real 0m4.775s user 0m4.476s sys 0m0.080s
$ g++ -o3 -oeffic2.exe effic2.cpp $ time ./effic2.exe STL=1, SIMPLE=1, ITERATOR=0, AT=0, size=1000 real 0m4.749s user 0m4.496s sys 0m0.060s
and
$ g++ -o3 - oeffic2.exe effic2.cpp $ time ./effic2.exe STL=1, SIMPLE=0, ITERATOR=0, AT=0, size=1000 real 0m33.413s user 0m32.076s sys 0m0.100s
$ g++ -oeffic2.exe effic2.cpp $ time ./effic2.exe STL=1, SIMPLE=0, ITERATOR=0, AT=0, size=1000 real 0m34.554s user 0m33.668s sys 0m0.060s
and
$ g++ -oeffic2.exe effic2.cpp $ time ./effic2.exe STL=1, SIMPLE=0, ITERATOR=0, AT=1, size=1000 real 1m26.753s user 1m24.301s sys 0m0.080s
$ g++ -o3 -oeffic2.exe effic2.cpp $ time ./effic2.exe STL=1, SIMPLE=0, ITERATOR=0, AT=1, size=1000 real 1m29.311s user 1m27.035s sys 0m0.050s
After that you were so kind to run my test on your Linux, maybe I need to ask MingW people about this problem.
Anyhow, thanks for replying and sorry for bothering the list with my problem
Janos
_______________________________________________ Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users
On 8/26/06, Janos Vegh
[snipped]
I neither, except that it looks like that in the case of AT checking the lower and upper limits of the index needs no extra time(??).
No measurable time you mean.
Otherwise, on my system $ g++ -oeffic2.exe effic2.cpp
It's here that you're making a mistake. You must use the optimization flags, otherwise your benchmark is measuring how slow your debug builds are, which isnt very important, IMO.
[snipped]
Janos
-- Felipe Magno de Almeida
This is most likely that you are not getting your operator[] inlined.
The function call overhead for such a small operation is going to be
very large. Optimizations are your friends.
On 8/26/06, Janos Vegh
On 8/26/06, me22
wrote: main has to return int; Your code isn't legal C++. Also, it seems like you end up using memory in the vector after only reserving it, not resizing it.
Yes, a forgotten int, my mistake. After reserving the memory, I guess I can write the memory locations. I think neither of the above affects the subject.
Your empty cycle time makes it obvious you're not even doing the most rudimentary of optimizations.
My goal was to measure the time of the empty processing, in order to estimate the time of the net calculation, rather than to find out that for no operation I need no time.
Here's what I get in g++ (g++ (GCC) 4.1.1 (Gentoo 4.1.1)) after making a few changes while using -O3 -DNDEBUG as arguments to the compiler (though -DNDEBUG didn't make any signifigant difference):
Empty cycle real 0m0.003s user 0m0.000s sys 0m0.000s
STL=1, SIMPLE=1, size=1000 real 0m1.357s user 0m1.168s sys 0m0.008s
STL=1, size=1000 real 0m1.337s user 0m1.172s sys 0m0.016s
STL=1, ITERATOR=1, size=1000 real 0m1.315s user 0m1.172s sys 0m0.012s
STL=1, AT=1, size=1000 real 0m1.357s user 0m1.176s sys 0m0.004s
I really don't see an issue here.
I neither, except that it looks like that in the case of AT checking the lower and upper limits of the index needs no extra time(??).
Otherwise, on my system $ g++ -oeffic2.exe effic2.cpp $ time ./effic2.exe STL=1, SIMPLE=1, ITERATOR=0, AT=0, size=1000 real 0m4.775s user 0m4.476s sys 0m0.080s
$ g++ -o3 -oeffic2.exe effic2.cpp $ time ./effic2.exe STL=1, SIMPLE=1, ITERATOR=0, AT=0, size=1000 real 0m4.749s user 0m4.496s sys 0m0.060s
and
$ g++ -o3 - oeffic2.exe effic2.cpp $ time ./effic2.exe STL=1, SIMPLE=0, ITERATOR=0, AT=0, size=1000 real 0m33.413s user 0m32.076s sys 0m0.100s
$ g++ -oeffic2.exe effic2.cpp $ time ./effic2.exe STL=1, SIMPLE=0, ITERATOR=0, AT=0, size=1000 real 0m34.554s user 0m33.668s sys 0m0.060s
and
$ g++ -oeffic2.exe effic2.cpp $ time ./effic2.exe STL=1, SIMPLE=0, ITERATOR=0, AT=1, size=1000 real 1m26.753s user 1m24.301s sys 0m0.080s
$ g++ -o3 -oeffic2.exe effic2.cpp $ time ./effic2.exe STL=1, SIMPLE=0, ITERATOR=0, AT=1, size=1000 real 1m29.311s user 1m27.035s sys 0m0.050s
After that you were so kind to run my test on your Linux, maybe I need to ask MingW people about this problem.
Anyhow, thanks for replying and sorry for bothering the list with my problem
Janos
_______________________________________________ Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users
Brian Budge writes:
This is most likely that you are not getting your operator[] inlined. The function call overhead for such a small operation is going to be very large. Optimizations are your friends. On 8/26/06, Janos Vegh
wrote: On 8/26/06, me22
wrote: [...]
Empty cycle real 0m0.003s user 0m0.000s sys 0m0.000s
STL=1, SIMPLE=1, size=1000 real 0m1.357s user 0m1.168s sys 0m0.008s
STL=1, size=1000 real 0m1.337s user 0m1.172s sys 0m0.016s [...]
A comment on this sort of thing: using a shell's "time" is merely an OK way to measure performance -- OK, but not great. I strongly suggest that the OP, and anybody else interested in such things, go get a recent verson of valgrind. The 'callgrind' tool counts the number of instructions execuded, which is a much more accurate and reliable measure. Also, get kcachegrind (a plug-in for KDE); together with callgrind, it'll tell you instruction counts on a line-by-line basis. (The usual caveats apply.) If you can't use valgrind on your platform, then I strongly suggest you aquire some other good profiling tool, such as VTune. Trust me, you'll get much, much better and more consistent results this way. ---------------------------------------------------------------------- Dave Steffen, Ph.D. Software Engineer IV Disobey this command! Numerica Corporation - Douglas Hofstadter dgsteffen at numerica dot us
Hi, I profiled some example code (not yours) using AQTime. Here is a screen shot of the results. You can see that in the inner loop I am accessing the elements using the [] operator, the array and the vector operations perform almost identically, accessing with at() is quite a bit more expensive. Stefan Janos Vegh wrote:
Hi,
I was interested in the efficiacy of std::vector. Using the test program below, I receive the following results. The "simple" handling is equivalent with the C array access. The [] and iterator access time of the vector element is an order of magnitude worse, than that of the simple access. This is not because of index limit control, it is done by at() and its use increases that access time by a factor of three. What is going on in the background here, that the implementation is so ineffective? ( I did my tests with WinXP/cygwin/g++/time, using different setting of the #define-s at the beginning) Shall I expect similar ineffectivity with the other STL objects?
Regards
Janos
Empty cycle real 0m2.173s user 0m1.892s sys 0m0.080s
STL=1, SIMPLE=1, size=0 real 0m4.917s user 0m4.646s sys 0m0.070s
STL=1, size=0 real 0m33.939s user 0m33.658s sys 0m0.090s
STL=1, ITERATOR=1, size=1000 real 0m34.059s user 0m33.688s sys 0m0.100s
STL=1, AT=1, size=1000 real 1m22.769s user 1m22.368s sys 0m0.090s
#define STL 1 #define SIMPLE 0 #define ITERATOR 1 #define AT 0 #define EMPTY 1 #if STL #include <vector> #endif using namespace std; #include <iostream> #define ARR_SIZE 1000 #define CYCLES 1000000 main (int argc, char** argv) { long int i,j,*P; #if EMPTY cout << "Empty cycle"; #else #if STL #if ITERATOR || AT std::vector<long int>::iterator pos; std::vector<long int> arr(ARR_SIZE); #else std::vector <long int> arr; arr.reserve(ARR_SIZE); #endif
cout << "STL=1"; #if SIMPLE cout << ", SIMPLE=1"; #endif #if ITERATOR cout << ", ITERATOR=1"; #endif #if AT cout << ", AT=1"; #endif cout << ", size=" << arr.size(); #else long int *arr; arr = new long int[ARR_SIZE]; cout << "STL=0"; #endif #endif cout << endl; #if SIMPLE && STL P = &arr[0]; #endif for(j=0;j
------------------------------------------------------------------------
_______________________________________________ Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users
-- Stefan Tarrant Senior Project Engineer C-CORE Captain Robert A. Bartlett Building Morrissey Road St. John's, NL Canada A1B 3X5 Celebrating 30 years of Providing Innovative Engineering Solutions, 1975-2005. Tel: 709 737-8372 General Fax: 709 737-4706 http://www.c-core.ca --------------------------------------------------------------------- This communication (including all attachments) is intended solely for the use of the person or persons to whom it is addressed and should be treated as a confidential C-CORE communication. If you are not the intended recipient, any use, distribution, printing, or copying of this email is strictly prohibited. If you received this email in error, please immediately delete it from your system and notify the originator. Your cooperation is appreciated.
participants (9)
-
Brian Budge
-
Chris Weed
-
Dave Steffen
-
Felipe Magno de Almeida
-
Janos Vegh
-
loufoque
-
me22
-
Patrik Jonsson
-
Stefan Tarrant