Hello there.
I am implementing matrix multiplication and am trying to have my stuff run as fast as possible. But then I noticed something strange. Using Xcode,
resultmatrix = prod(matrix1,matrix2);
is 10 to 16 times slower (with 100x100 matrices) than simply doing something like this:
for(...) for(...) for(..) multiply_things();
And 20 to 40 times slower than a simple threaded approach.
Using Visual Studio, prod() is 2 times faster than naive multiplication.
This is on a release build, with
#define BOOST_UBLAS_NDEBUG 1
#define NDEBUG 1
So what gives?
On 30 Jun 2010, at 16:55, Raphaël Amiot wrote:
Hello there.
I am implementing matrix multiplication and am trying to have my stuff run as fast as possible. But then I noticed something strange. Using Xcode,
resultmatrix = prod(matrix1,matrix2);
is 10 to 16 times slower (with 100x100 matrices) than simply doing something like this:
for(...) for(...) for(..) multiply_things();
And 20 to 40 times slower than a simple threaded approach.
Using visual studio, it's 2 times faster than naive multiplication.
This is on a release build, with
#define BOOST_UBLAS_NDEBUG 1
#define NDEBUG 1
So what gives?
Could you provide a small complete program which shows your problem?
Chris
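For readers following the thread, a small timing harness for the naive side of the comparison could look roughly like the sketch below. It is not the original poster's program: `Matrix`, `multiply_naive` and `average_ms` are hypothetical names chosen for illustration, the matrices are assumed to be square and stored row-major in a `std::vector<double>`, and only the standard library is used so that it builds without Boost.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical names for illustration; not taken from the poster's code.
using Matrix = std::vector<double>;  // square n x n matrix, row-major

// Naive triple-loop product: returns c = a * b for n x n matrices.
Matrix multiply_naive(const Matrix& a, const Matrix& b, std::size_t n) {
    assert(a.size() == n * n && b.size() == n * n);
    Matrix c(n * n, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < n; ++k) {
                sum += a[i * n + k] * b[k * n + j];
            }
            c[i * n + j] = sum;
        }
    }
    return c;
}

// Calls f() `repeats` times and returns the average wall-clock time
// per call in milliseconds.
template <typename F>
double average_ms(F&& f, int repeats) {
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < repeats; ++r) {
        f();
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count() / repeats;
}
```

To compare against uBLAS, the same `average_ms` wrapper could be given a lambda containing `resultmatrix = prod(matrix1,matrix2);`. Keeping the result in a variable that is read afterwards helps stop the optimizer from discarding the work being timed.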
Code here. You'll need to change the path to boost headers in the includes to make it run.
Part that runs way too slow:
chrono.begin();
for(int i=0; i
_______________________________________________
Boost-users mailing list
Boost-users@lists.boost.org
http://lists.boost.org/mailman/listinfo.cgi/boost-users
participants (2)
- Christopher Jefferson
- Raphaël Amiot