[boost.chrono] Slow down process (if specifing c++11)
Hi have tried the following code compiling it with gcc 4.6.3,
with four combinations:
1) -O3
2) -O3 (commeting out the chrono call)
3) -O3 -std=c++0x
4) -O3 -std=c++0x (commenting out the chrono call)
and I'm obtaining the following runtime values:
1) 8921 ms
2) 8915 ms
3) 9400 ms
4) 8933 ms
as you can see the combination: "c++11 and chrono call" slows down the entire
process. I have tried the same with gcc 4.8.1 and all times are more or less
the same.
At this point I'm not sure if it's a problem of chrono with gcc 4.6.3 or
simply an issue of 4.6.3 disappeared in 4.8.1 series.
Regards
Gaetano Mendola
Here the code I have used:
#include <iostream>
#include
Hi
I've seen something similar before. I'd check whether the call
boost::chrono triggers a fork() in C++11 mode. If it does, Linux will mark
your pages as copy-on-write and slow down the entire thing.
To make check whether that's the case move the memory allocation after the
call to chrono : the pages allocated after fork() won't be affected.
Regards,
Julien
On Mon, Aug 26, 2013 at 9:54 PM, Gaetano Mendola
Hi have tried the following code compiling it with gcc 4.6.3, with four combinations:
1) -O3 2) -O3 (commeting out the chrono call) 3) -O3 -std=c++0x 4) -O3 -std=c++0x (commenting out the chrono call)
and I'm obtaining the following runtime values:
1) 8921 ms 2) 8915 ms 3) 9400 ms 4) 8933 ms
as you can see the combination: "c++11 and chrono call" slows down the entire process. I have tried the same with gcc 4.8.1 and all times are more or less the same.
At this point I'm not sure if it's a problem of chrono with gcc 4.6.3 or simply an issue of 4.6.3 disappeared in 4.8.1 series.
Regards Gaetano Mendola
Here the code I have used:
#include <iostream> #include
#include #include int main() {
cpu_set_t myAffinityMask; CPU_ZERO( &myAffinityMask ); CPU_SET(0, &myAffinityMask ); sched_setaffinity(0, sizeof(myAffinityMask), &myAffinityMask);
volatile float* myMemoryA = new float[(1<<24)]; volatile float* myMemoryB = new float[(1<<24)];
struct timeval myStart; struct timeval myStop; struct timeval myResult;
boost::chrono::time_pointboost::chrono::steady_clock t1 = boost::chrono::high_resolution_clock::now();
gettimeofday(&myStart, 0);
for (size_t i = 0; i < (1<<24); ++i) { myMemoryA[i] = i; myMemoryB[i] = i+1; } delete []myMemoryA; delete []myMemoryB;
for (size_t j = 0; j < 100; ++j) { volatile float* myMemoryA = new float[(1<<24)]; volatile float* myMemoryB = new float[(1<<24)]; for (size_t i = 0; i < (1<<24); ++i) { myMemoryA[i] *= sqrtf(myMemoryB[i]); } delete []myMemoryA; delete []myMemoryB; } gettimeofday(&myStop, 0);
timersub(&myStop,&myStart,&myResult);
std::cout << "Time: " << myResult.tv_sec*1000 + myResult.tv_usec/1000.0 << std::endl; std::cout << "t1: " << t1 << std::endl; }
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
When I wrote with 4.8.1 the time are more or less the same I meant: all around 8900 ms, so basically no slow down. I have straced it and I don't see any fork involved. I have moved the chrono call around in the code and the times are: ============ gcc 4.6.3 ==================== as first operation in the main: 8900 ms before the gettimeofday: 9500 ms as last operations in the main: 10000 ms ============ gcc 4.8.1 ==================== as first operation in the main: 8900 ms before the gettimeofday: 8900 ms as last operations in the main: 8900 ms ============ clang 3.3 ==================== as first operation in the main: 9400 ms before the gettimeofday: 9400 ms as last operations in the main: 9400 ms I'm not sure if it's worth at this stage to open an issue with GCC people. Regards Gaetano Mendola On 08/26/2013 04:29 PM, Julien Nitard wrote:
Hi
I've seen something similar before. I'd check whether the call boost::chrono triggers a fork() in C++11 mode. If it does, Linux will mark your pages as copy-on-write and slow down the entire thing.
To make check whether that's the case move the memory allocation after the call to chrono : the pages allocated after fork() won't be affected.
Regards,
Julien
On Mon, Aug 26, 2013 at 9:54 PM, Gaetano Mendola
wrote: Hi have tried the following code compiling it with gcc 4.6.3, with four combinations:
1) -O3 2) -O3 (commeting out the chrono call) 3) -O3 -std=c++0x 4) -O3 -std=c++0x (commenting out the chrono call)
and I'm obtaining the following runtime values:
1) 8921 ms 2) 8915 ms 3) 9400 ms 4) 8933 ms
as you can see the combination: "c++11 and chrono call" slows down the entire process. I have tried the same with gcc 4.8.1 and all times are more or less the same.
At this point I'm not sure if it's a problem of chrono with gcc 4.6.3 or simply an issue of 4.6.3 disappeared in 4.8.1 series.
Regards Gaetano Mendola
Here the code I have used:
#include <iostream> #include
#include #include int main() {
cpu_set_t myAffinityMask; CPU_ZERO( &myAffinityMask ); CPU_SET(0, &myAffinityMask ); sched_setaffinity(0, sizeof(myAffinityMask), &myAffinityMask);
volatile float* myMemoryA = new float[(1<<24)]; volatile float* myMemoryB = new float[(1<<24)];
struct timeval myStart; struct timeval myStop; struct timeval myResult;
boost::chrono::time_pointboost::chrono::steady_clock t1 = boost::chrono::high_resolution_clock::now();
gettimeofday(&myStart, 0);
for (size_t i = 0; i < (1<<24); ++i) { myMemoryA[i] = i; myMemoryB[i] = i+1; } delete []myMemoryA; delete []myMemoryB;
for (size_t j = 0; j < 100; ++j) { volatile float* myMemoryA = new float[(1<<24)]; volatile float* myMemoryB = new float[(1<<24)]; for (size_t i = 0; i < (1<<24); ++i) { myMemoryA[i] *= sqrtf(myMemoryB[i]); } delete []myMemoryA; delete []myMemoryB; } gettimeofday(&myStop, 0);
timersub(&myStop,&myStart,&myResult);
std::cout << "Time: " << myResult.tv_sec*1000 + myResult.tv_usec/1000.0 << std::endl; std::cout << "t1: " << t1 << std::endl; }
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Le 26/08/13 15:54, Gaetano Mendola a écrit :
Hi have tried the following code compiling it with gcc 4.6.3, with four combinations:
1) -O3 2) -O3 (commeting out the chrono call) 3) -O3 -std=c++0x 4) -O3 -std=c++0x (commenting out the chrono call)
and I'm obtaining the following runtime values:
1) 8921 ms 2) 8915 ms 3) 9400 ms 4) 8933 ms
as you can see the combination: "c++11 and chrono call" slows down the entire process. I have tried the same with gcc 4.8.1 and all times are more or less the same.
At this point I'm not sure if it's a problem of chrono with gcc 4.6.3 or simply an issue of 4.6.3 disappeared in 4.8.1 series.
Regards Gaetano Mendola I don't know if changing the order could changes the results.
gettimeofday(&myStart, 0); boost::chrono::time_pointboost::chrono::steady_clock t1 = boost::chrono::high_resolution_clock::now(); and std::cout << "t1: " << t1 << std::endl; gettimeofday(&myStop, 0); timersub(&myStop,&myStart,&myResult); std::cout << "Time: " << myResult.tv_sec*1000 + myResult.tv_usec/1000.0 << std::endl; Or to be neutral execute either the chrono computation either the timeval one. HTH, Vicente
Here the code I have used:
#include <iostream> #include
#include #include int main() {
cpu_set_t myAffinityMask; CPU_ZERO( &myAffinityMask ); CPU_SET(0, &myAffinityMask ); sched_setaffinity(0, sizeof(myAffinityMask), &myAffinityMask);
volatile float* myMemoryA = new float[(1<<24)]; volatile float* myMemoryB = new float[(1<<24)];
struct timeval myStart; struct timeval myStop; struct timeval myResult;
boost::chrono::time_pointboost::chrono::steady_clock t1 = boost::chrono::high_resolution_clock::now();
gettimeofday(&myStart, 0);
for (size_t i = 0; i < (1<<24); ++i) { myMemoryA[i] = i; myMemoryB[i] = i+1; } delete []myMemoryA; delete []myMemoryB;
for (size_t j = 0; j < 100; ++j) { volatile float* myMemoryA = new float[(1<<24)]; volatile float* myMemoryB = new float[(1<<24)]; for (size_t i = 0; i < (1<<24); ++i) { myMemoryA[i] *= sqrtf(myMemoryB[i]); } delete []myMemoryA; delete []myMemoryB; } gettimeofday(&myStop, 0);
timersub(&myStop,&myStart,&myResult);
std::cout << "Time: " << myResult.tv_sec*1000 + myResult.tv_usec/1000.0 << std::endl; std::cout << "t1: " << t1 << std::endl; }
See my reply to Julien Ritard in the thread, just the simply call to boost chrono even as last operation in the main slows it down. Regards Gaetano Mendola On 08/28/2013 10:48 PM, Vicente J. Botet Escriba wrote:
Le 26/08/13 15:54, Gaetano Mendola a écrit :
Hi have tried the following code compiling it with gcc 4.6.3, with four combinations:
1) -O3 2) -O3 (commeting out the chrono call) 3) -O3 -std=c++0x 4) -O3 -std=c++0x (commenting out the chrono call)
and I'm obtaining the following runtime values:
1) 8921 ms 2) 8915 ms 3) 9400 ms 4) 8933 ms
as you can see the combination: "c++11 and chrono call" slows down the entire process. I have tried the same with gcc 4.8.1 and all times are more or less the same.
At this point I'm not sure if it's a problem of chrono with gcc 4.6.3 or simply an issue of 4.6.3 disappeared in 4.8.1 series.
Regards Gaetano Mendola I don't know if changing the order could changes the results.
gettimeofday(&myStart, 0);
boost::chrono::time_pointboost::chrono::steady_clock t1 = boost::chrono::high_resolution_clock::now();
and
std::cout << "t1: " << t1 << std::endl;
gettimeofday(&myStop, 0);
timersub(&myStop,&myStart,&myResult);
std::cout << "Time: " << myResult.tv_sec*1000 + myResult.tv_usec/1000.0 << std::endl;
Or to be neutral execute either the chrono computation either the timeval one.
HTH, Vicente
Here the code I have used:
#include <iostream> #include
#include #include int main() {
cpu_set_t myAffinityMask; CPU_ZERO( &myAffinityMask ); CPU_SET(0, &myAffinityMask ); sched_setaffinity(0, sizeof(myAffinityMask), &myAffinityMask);
volatile float* myMemoryA = new float[(1<<24)]; volatile float* myMemoryB = new float[(1<<24)];
struct timeval myStart; struct timeval myStop; struct timeval myResult;
boost::chrono::time_pointboost::chrono::steady_clock t1 = boost::chrono::high_resolution_clock::now();
gettimeofday(&myStart, 0);
for (size_t i = 0; i < (1<<24); ++i) { myMemoryA[i] = i; myMemoryB[i] = i+1; } delete []myMemoryA; delete []myMemoryB;
for (size_t j = 0; j < 100; ++j) { volatile float* myMemoryA = new float[(1<<24)]; volatile float* myMemoryB = new float[(1<<24)]; for (size_t i = 0; i < (1<<24); ++i) { myMemoryA[i] *= sqrtf(myMemoryB[i]); } delete []myMemoryA; delete []myMemoryB; } gettimeofday(&myStop, 0);
timersub(&myStop,&myStart,&myResult);
std::cout << "Time: " << myResult.tv_sec*1000 + myResult.tv_usec/1000.0 << std::endl; std::cout << "t1: " << t1 << std::endl; }
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
On 09/06/2013 11:28 AM, Gaetano Mendola wrote:
See my reply to Julien Ritard in the thread, just the simply call to boost chrono even as last operation in the main slows it down.
Regards Gaetano Mendola
In the mean time I have filled a bug with gcc: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58333 let see what they say. Regards Gaetano Mendola
On 08/28/2013 10:48 PM, Vicente J. Botet Escriba wrote:
Le 26/08/13 15:54, Gaetano Mendola a écrit :
Hi have tried the following code compiling it with gcc 4.6.3, with four combinations:
1) -O3 2) -O3 (commeting out the chrono call) 3) -O3 -std=c++0x 4) -O3 -std=c++0x (commenting out the chrono call)
and I'm obtaining the following runtime values:
1) 8921 ms 2) 8915 ms 3) 9400 ms 4) 8933 ms
as you can see the combination: "c++11 and chrono call" slows down the entire process. I have tried the same with gcc 4.8.1 and all times are more or less the same.
At this point I'm not sure if it's a problem of chrono with gcc 4.6.3 or simply an issue of 4.6.3 disappeared in 4.8.1 series.
Regards Gaetano Mendola I don't know if changing the order could changes the results.
gettimeofday(&myStart, 0);
boost::chrono::time_pointboost::chrono::steady_clock t1 = boost::chrono::high_resolution_clock::now();
and
std::cout << "t1: " << t1 << std::endl;
gettimeofday(&myStop, 0);
timersub(&myStop,&myStart,&myResult);
std::cout << "Time: " << myResult.tv_sec*1000 + myResult.tv_usec/1000.0 << std::endl;
Or to be neutral execute either the chrono computation either the timeval one.
HTH, Vicente
Here the code I have used:
#include <iostream> #include
#include #include int main() {
cpu_set_t myAffinityMask; CPU_ZERO( &myAffinityMask ); CPU_SET(0, &myAffinityMask ); sched_setaffinity(0, sizeof(myAffinityMask), &myAffinityMask);
volatile float* myMemoryA = new float[(1<<24)]; volatile float* myMemoryB = new float[(1<<24)];
struct timeval myStart; struct timeval myStop; struct timeval myResult;
boost::chrono::time_pointboost::chrono::steady_clock t1 = boost::chrono::high_resolution_clock::now();
gettimeofday(&myStart, 0);
for (size_t i = 0; i < (1<<24); ++i) { myMemoryA[i] = i; myMemoryB[i] = i+1; } delete []myMemoryA; delete []myMemoryB;
for (size_t j = 0; j < 100; ++j) { volatile float* myMemoryA = new float[(1<<24)]; volatile float* myMemoryB = new float[(1<<24)]; for (size_t i = 0; i < (1<<24); ++i) { myMemoryA[i] *= sqrtf(myMemoryB[i]); } delete []myMemoryA; delete []myMemoryB; } gettimeofday(&myStop, 0);
timersub(&myStop,&myStart,&myResult);
std::cout << "Time: " << myResult.tv_sec*1000 + myResult.tv_usec/1000.0 << std::endl; std::cout << "t1: " << t1 << std::endl; }
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
participants (3)
-
Gaetano Mendola
-
Julien Nitard
-
Vicente J. Botet Escriba