[boost.chrono] Slow down process (if specifing c++11)

Hi have tried the following code compiling it with gcc 4.6.3, with four combinations: 1) -O3 2) -O3 (commeting out the chrono call) 3) -O3 -std=c++0x 4) -O3 -std=c++0x (commenting out the chrono call) and I'm obtaining the following runtime values: 1) 8921 ms 2) 8915 ms 3) 9400 ms 4) 8933 ms as you can see the combination: "c++11 and chrono call" slows down the entire process. I have tried the same with gcc 4.8.1 and all times are more or less the same. At this point I'm not sure if it's a problem of chrono with gcc 4.6.3 or simply an issue of 4.6.3 disappeared in 4.8.1 series. Regards Gaetano Mendola Here the code I have used: #include <iostream> #include <boost/chrono.hpp> #include <sys/time.h> #include <math.h> int main() { cpu_set_t myAffinityMask; CPU_ZERO( &myAffinityMask ); CPU_SET(0, &myAffinityMask ); sched_setaffinity(0, sizeof(myAffinityMask), &myAffinityMask); volatile float* myMemoryA = new float[(1<<24)]; volatile float* myMemoryB = new float[(1<<24)]; struct timeval myStart; struct timeval myStop; struct timeval myResult; boost::chrono::time_point<boost::chrono::steady_clock> t1 = boost::chrono::high_resolution_clock::now(); gettimeofday(&myStart, 0); for (size_t i = 0; i < (1<<24); ++i) { myMemoryA[i] = i; myMemoryB[i] = i+1; } delete []myMemoryA; delete []myMemoryB; for (size_t j = 0; j < 100; ++j) { volatile float* myMemoryA = new float[(1<<24)]; volatile float* myMemoryB = new float[(1<<24)]; for (size_t i = 0; i < (1<<24); ++i) { myMemoryA[i] *= sqrtf(myMemoryB[i]); } delete []myMemoryA; delete []myMemoryB; } gettimeofday(&myStop, 0); timersub(&myStop,&myStart,&myResult); std::cout << "Time: " << myResult.tv_sec*1000 + myResult.tv_usec/1000.0 << std::endl; std::cout << "t1: " << t1 << std::endl; }

Hi I've seen something similar before. I'd check whether the call boost::chrono triggers a fork() in C++11 mode. If it does, Linux will mark your pages as copy-on-write and slow down the entire thing. To make check whether that's the case move the memory allocation after the call to chrono : the pages allocated after fork() won't be affected. Regards, Julien On Mon, Aug 26, 2013 at 9:54 PM, Gaetano Mendola <mendola@gmail.com> wrote:
Hi have tried the following code compiling it with gcc 4.6.3, with four combinations:
1) -O3 2) -O3 (commeting out the chrono call) 3) -O3 -std=c++0x 4) -O3 -std=c++0x (commenting out the chrono call)
and I'm obtaining the following runtime values:
1) 8921 ms 2) 8915 ms 3) 9400 ms 4) 8933 ms
as you can see the combination: "c++11 and chrono call" slows down the entire process. I have tried the same with gcc 4.8.1 and all times are more or less the same.
At this point I'm not sure if it's a problem of chrono with gcc 4.6.3 or simply an issue of 4.6.3 disappeared in 4.8.1 series.
Regards Gaetano Mendola
Here the code I have used:
#include <iostream> #include <boost/chrono.hpp> #include <sys/time.h> #include <math.h>
int main() {
cpu_set_t myAffinityMask; CPU_ZERO( &myAffinityMask ); CPU_SET(0, &myAffinityMask ); sched_setaffinity(0, sizeof(myAffinityMask), &myAffinityMask);
volatile float* myMemoryA = new float[(1<<24)]; volatile float* myMemoryB = new float[(1<<24)];
struct timeval myStart; struct timeval myStop; struct timeval myResult;
boost::chrono::time_point<boost::chrono::steady_clock> t1 = boost::chrono::high_resolution_clock::now();
gettimeofday(&myStart, 0);
for (size_t i = 0; i < (1<<24); ++i) { myMemoryA[i] = i; myMemoryB[i] = i+1; } delete []myMemoryA; delete []myMemoryB;
for (size_t j = 0; j < 100; ++j) { volatile float* myMemoryA = new float[(1<<24)]; volatile float* myMemoryB = new float[(1<<24)]; for (size_t i = 0; i < (1<<24); ++i) { myMemoryA[i] *= sqrtf(myMemoryB[i]); } delete []myMemoryA; delete []myMemoryB; } gettimeofday(&myStop, 0);
timersub(&myStop,&myStart,&myResult);
std::cout << "Time: " << myResult.tv_sec*1000 + myResult.tv_usec/1000.0 << std::endl; std::cout << "t1: " << t1 << std::endl; }
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

When I wrote with 4.8.1 the time are more or less the same I meant: all around 8900 ms, so basically no slow down. I have straced it and I don't see any fork involved. I have moved the chrono call around in the code and the times are: ============ gcc 4.6.3 ==================== as first operation in the main: 8900 ms before the gettimeofday: 9500 ms as last operations in the main: 10000 ms ============ gcc 4.8.1 ==================== as first operation in the main: 8900 ms before the gettimeofday: 8900 ms as last operations in the main: 8900 ms ============ clang 3.3 ==================== as first operation in the main: 9400 ms before the gettimeofday: 9400 ms as last operations in the main: 9400 ms I'm not sure if it's worth at this stage to open an issue with GCC people. Regards Gaetano Mendola On 08/26/2013 04:29 PM, Julien Nitard wrote:
Hi
I've seen something similar before. I'd check whether the call boost::chrono triggers a fork() in C++11 mode. If it does, Linux will mark your pages as copy-on-write and slow down the entire thing.
To make check whether that's the case move the memory allocation after the call to chrono : the pages allocated after fork() won't be affected.
Regards,
Julien
On Mon, Aug 26, 2013 at 9:54 PM, Gaetano Mendola <mendola@gmail.com> wrote:
Hi have tried the following code compiling it with gcc 4.6.3, with four combinations:
1) -O3 2) -O3 (commeting out the chrono call) 3) -O3 -std=c++0x 4) -O3 -std=c++0x (commenting out the chrono call)
and I'm obtaining the following runtime values:
1) 8921 ms 2) 8915 ms 3) 9400 ms 4) 8933 ms
as you can see the combination: "c++11 and chrono call" slows down the entire process. I have tried the same with gcc 4.8.1 and all times are more or less the same.
At this point I'm not sure if it's a problem of chrono with gcc 4.6.3 or simply an issue of 4.6.3 disappeared in 4.8.1 series.
Regards Gaetano Mendola
Here the code I have used:
#include <iostream> #include <boost/chrono.hpp> #include <sys/time.h> #include <math.h>
int main() {
cpu_set_t myAffinityMask; CPU_ZERO( &myAffinityMask ); CPU_SET(0, &myAffinityMask ); sched_setaffinity(0, sizeof(myAffinityMask), &myAffinityMask);
volatile float* myMemoryA = new float[(1<<24)]; volatile float* myMemoryB = new float[(1<<24)];
struct timeval myStart; struct timeval myStop; struct timeval myResult;
boost::chrono::time_point<boost::chrono::steady_clock> t1 = boost::chrono::high_resolution_clock::now();
gettimeofday(&myStart, 0);
for (size_t i = 0; i < (1<<24); ++i) { myMemoryA[i] = i; myMemoryB[i] = i+1; } delete []myMemoryA; delete []myMemoryB;
for (size_t j = 0; j < 100; ++j) { volatile float* myMemoryA = new float[(1<<24)]; volatile float* myMemoryB = new float[(1<<24)]; for (size_t i = 0; i < (1<<24); ++i) { myMemoryA[i] *= sqrtf(myMemoryB[i]); } delete []myMemoryA; delete []myMemoryB; } gettimeofday(&myStop, 0);
timersub(&myStop,&myStart,&myResult);
std::cout << "Time: " << myResult.tv_sec*1000 + myResult.tv_usec/1000.0 << std::endl; std::cout << "t1: " << t1 << std::endl; }
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

Le 26/08/13 15:54, Gaetano Mendola a écrit :
Hi have tried the following code compiling it with gcc 4.6.3, with four combinations:
1) -O3 2) -O3 (commeting out the chrono call) 3) -O3 -std=c++0x 4) -O3 -std=c++0x (commenting out the chrono call)
and I'm obtaining the following runtime values:
1) 8921 ms 2) 8915 ms 3) 9400 ms 4) 8933 ms
as you can see the combination: "c++11 and chrono call" slows down the entire process. I have tried the same with gcc 4.8.1 and all times are more or less the same.
At this point I'm not sure if it's a problem of chrono with gcc 4.6.3 or simply an issue of 4.6.3 disappeared in 4.8.1 series.
Regards Gaetano Mendola I don't know if changing the order could changes the results.
gettimeofday(&myStart, 0); boost::chrono::time_point<boost::chrono::steady_clock> t1 = boost::chrono::high_resolution_clock::now(); and std::cout << "t1: " << t1 << std::endl; gettimeofday(&myStop, 0); timersub(&myStop,&myStart,&myResult); std::cout << "Time: " << myResult.tv_sec*1000 + myResult.tv_usec/1000.0 << std::endl; Or to be neutral execute either the chrono computation either the timeval one. HTH, Vicente
Here the code I have used:
#include <iostream> #include <boost/chrono.hpp> #include <sys/time.h> #include <math.h>
int main() {
cpu_set_t myAffinityMask; CPU_ZERO( &myAffinityMask ); CPU_SET(0, &myAffinityMask ); sched_setaffinity(0, sizeof(myAffinityMask), &myAffinityMask);
volatile float* myMemoryA = new float[(1<<24)]; volatile float* myMemoryB = new float[(1<<24)];
struct timeval myStart; struct timeval myStop; struct timeval myResult;
boost::chrono::time_point<boost::chrono::steady_clock> t1 = boost::chrono::high_resolution_clock::now();
gettimeofday(&myStart, 0);
for (size_t i = 0; i < (1<<24); ++i) { myMemoryA[i] = i; myMemoryB[i] = i+1; } delete []myMemoryA; delete []myMemoryB;
for (size_t j = 0; j < 100; ++j) { volatile float* myMemoryA = new float[(1<<24)]; volatile float* myMemoryB = new float[(1<<24)]; for (size_t i = 0; i < (1<<24); ++i) { myMemoryA[i] *= sqrtf(myMemoryB[i]); } delete []myMemoryA; delete []myMemoryB; } gettimeofday(&myStop, 0);
timersub(&myStop,&myStart,&myResult);
std::cout << "Time: " << myResult.tv_sec*1000 + myResult.tv_usec/1000.0 << std::endl; std::cout << "t1: " << t1 << std::endl; }

See my reply to Julien Ritard in the thread, just the simply call to boost chrono even as last operation in the main slows it down. Regards Gaetano Mendola On 08/28/2013 10:48 PM, Vicente J. Botet Escriba wrote:
Le 26/08/13 15:54, Gaetano Mendola a écrit :
Hi have tried the following code compiling it with gcc 4.6.3, with four combinations:
1) -O3 2) -O3 (commeting out the chrono call) 3) -O3 -std=c++0x 4) -O3 -std=c++0x (commenting out the chrono call)
and I'm obtaining the following runtime values:
1) 8921 ms 2) 8915 ms 3) 9400 ms 4) 8933 ms
as you can see the combination: "c++11 and chrono call" slows down the entire process. I have tried the same with gcc 4.8.1 and all times are more or less the same.
At this point I'm not sure if it's a problem of chrono with gcc 4.6.3 or simply an issue of 4.6.3 disappeared in 4.8.1 series.
Regards Gaetano Mendola I don't know if changing the order could changes the results.
gettimeofday(&myStart, 0);
boost::chrono::time_point<boost::chrono::steady_clock> t1 = boost::chrono::high_resolution_clock::now();
and
std::cout << "t1: " << t1 << std::endl;
gettimeofday(&myStop, 0);
timersub(&myStop,&myStart,&myResult);
std::cout << "Time: " << myResult.tv_sec*1000 + myResult.tv_usec/1000.0 << std::endl;
Or to be neutral execute either the chrono computation either the timeval one.
HTH, Vicente
Here the code I have used:
#include <iostream> #include <boost/chrono.hpp> #include <sys/time.h> #include <math.h>
int main() {
cpu_set_t myAffinityMask; CPU_ZERO( &myAffinityMask ); CPU_SET(0, &myAffinityMask ); sched_setaffinity(0, sizeof(myAffinityMask), &myAffinityMask);
volatile float* myMemoryA = new float[(1<<24)]; volatile float* myMemoryB = new float[(1<<24)];
struct timeval myStart; struct timeval myStop; struct timeval myResult;
boost::chrono::time_point<boost::chrono::steady_clock> t1 = boost::chrono::high_resolution_clock::now();
gettimeofday(&myStart, 0);
for (size_t i = 0; i < (1<<24); ++i) { myMemoryA[i] = i; myMemoryB[i] = i+1; } delete []myMemoryA; delete []myMemoryB;
for (size_t j = 0; j < 100; ++j) { volatile float* myMemoryA = new float[(1<<24)]; volatile float* myMemoryB = new float[(1<<24)]; for (size_t i = 0; i < (1<<24); ++i) { myMemoryA[i] *= sqrtf(myMemoryB[i]); } delete []myMemoryA; delete []myMemoryB; } gettimeofday(&myStop, 0);
timersub(&myStop,&myStart,&myResult);
std::cout << "Time: " << myResult.tv_sec*1000 + myResult.tv_usec/1000.0 << std::endl; std::cout << "t1: " << t1 << std::endl; }
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

On 09/06/2013 11:28 AM, Gaetano Mendola wrote:
See my reply to Julien Ritard in the thread, just the simply call to boost chrono even as last operation in the main slows it down.
Regards Gaetano Mendola
In the mean time I have filled a bug with gcc: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58333 let see what they say. Regards Gaetano Mendola
On 08/28/2013 10:48 PM, Vicente J. Botet Escriba wrote:
Le 26/08/13 15:54, Gaetano Mendola a écrit :
Hi have tried the following code compiling it with gcc 4.6.3, with four combinations:
1) -O3 2) -O3 (commeting out the chrono call) 3) -O3 -std=c++0x 4) -O3 -std=c++0x (commenting out the chrono call)
and I'm obtaining the following runtime values:
1) 8921 ms 2) 8915 ms 3) 9400 ms 4) 8933 ms
as you can see the combination: "c++11 and chrono call" slows down the entire process. I have tried the same with gcc 4.8.1 and all times are more or less the same.
At this point I'm not sure if it's a problem of chrono with gcc 4.6.3 or simply an issue of 4.6.3 disappeared in 4.8.1 series.
Regards Gaetano Mendola I don't know if changing the order could changes the results.
gettimeofday(&myStart, 0);
boost::chrono::time_point<boost::chrono::steady_clock> t1 = boost::chrono::high_resolution_clock::now();
and
std::cout << "t1: " << t1 << std::endl;
gettimeofday(&myStop, 0);
timersub(&myStop,&myStart,&myResult);
std::cout << "Time: " << myResult.tv_sec*1000 + myResult.tv_usec/1000.0 << std::endl;
Or to be neutral execute either the chrono computation either the timeval one.
HTH, Vicente
Here the code I have used:
#include <iostream> #include <boost/chrono.hpp> #include <sys/time.h> #include <math.h>
int main() {
cpu_set_t myAffinityMask; CPU_ZERO( &myAffinityMask ); CPU_SET(0, &myAffinityMask ); sched_setaffinity(0, sizeof(myAffinityMask), &myAffinityMask);
volatile float* myMemoryA = new float[(1<<24)]; volatile float* myMemoryB = new float[(1<<24)];
struct timeval myStart; struct timeval myStop; struct timeval myResult;
boost::chrono::time_point<boost::chrono::steady_clock> t1 = boost::chrono::high_resolution_clock::now();
gettimeofday(&myStart, 0);
for (size_t i = 0; i < (1<<24); ++i) { myMemoryA[i] = i; myMemoryB[i] = i+1; } delete []myMemoryA; delete []myMemoryB;
for (size_t j = 0; j < 100; ++j) { volatile float* myMemoryA = new float[(1<<24)]; volatile float* myMemoryB = new float[(1<<24)]; for (size_t i = 0; i < (1<<24); ++i) { myMemoryA[i] *= sqrtf(myMemoryB[i]); } delete []myMemoryA; delete []myMemoryB; } gettimeofday(&myStop, 0);
timersub(&myStop,&myStart,&myResult);
std::cout << "Time: " << myResult.tv_sec*1000 + myResult.tv_usec/1000.0 << std::endl; std::cout << "t1: " << t1 << std::endl; }
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
participants (3)
-
Gaetano Mendola
-
Julien Nitard
-
Vicente J. Botet Escriba