
Hello,

I did some performance tests with uBLAS and MTL. Here are some results for those who are interested.

The following tests were run:

1) uBLAS dense matrix multiplication:
   - row_major * row_major: ures.assign(prod(ur, ur))
   - row_major * column_major: ures.assign(prod(ur, uc))
   - column_major * row_major: ures.assign(prod(uc, ur))
2) MTL dense matrix multiplication:
   - row_major * row_major: mult(mr, mr, mres)
   - row_major * column_major: mult(mr, mc, mres)
   - column_major * row_major: mult(mc, mr, mres)
3) C matrix multiplication (basic multiplication algorithm for a C array).
4) uBLAS sparse matrix multiplication (20% non-zero elements):
   - row_major * row_major: ures.assign(prod(ur, ur))
   - row_major * column_major: ures.assign(prod(ur, uc))
   - column_major * row_major: ures.assign(prod(uc, ur))
   ures is a dense matrix.
5) MTL sparse matrix multiplication (20% non-zero elements):
   - row_major * row_major: mult(mr, mr, mres)
   - row_major * column_major: mult(mr, mc, mres)
   - column_major * row_major: mult(mc, mr, mres)
   mres is a dense matrix.

All tests were run on Windows 2000. I used gcc 3.2 with the -O3 optimization flag and Boost 1.29.

Here are some results:

1) uBLAS and MTL have approximately the same performance for dense matrix multiplication. uBLAS is a bit faster with small matrices (< 50-100), MTL is faster with large ones (> 100).
2) Multiplication of plain C arrays is 5-6 times faster than both uBLAS and MTL. (!!!)
3) If I use my own simple mult function to multiply uBLAS or MTL matrices:

   template <typename Mat>
   void mat_mat_mult(const Mat& m1, const Mat& m2, Mat& res, int rank)
   {
       for (int i = 0; i < rank; ++i)
           for (int j = 0; j < rank; ++j) {
               res(i, j) = 0;
               for (int k = 0; k < rank; ++k)
                   res(i, j) += m1(i, k) * m2(k, j);
           }
   }

   it works 2 times faster than the native uBLAS and MTL implementations of multiplication. (Iterator overhead?)
4) Dense matrix performance doesn't depend on row orientation for either uBLAS or MTL.
5) MTL sparse matrix multiplication doesn't depend on row orientation and is in any case much (2 times) faster than the best case for uBLAS.
6) uBLAS gives the best performance for sparse matrices in the row_major * column_major case (it is 3 times faster than column_major * row_major).
7) When I tried the latest uBLAS from Boost CVS, it worked much (up to 2 times) slower for sparse matrices than 1.29.

I would need to do other tests to draw any real conclusions, but preliminarily it seems to me that the abstraction penalty for both uBLAS and MTL is too big.

Alexei.
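For readers who want to reproduce the comparison between the generic routine and the plain C-array baseline, here is a minimal sketch. It does not use uBLAS or MTL: SimpleMatrix is a hypothetical stand-in that only provides the operator()(i, j) access the template needs, and c_array_mult is an assumed form of the "basic multiplication algorithm", since the post does not show the C version.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a dense row-major matrix (not the uBLAS or MTL
// class). It offers only the operator()(i, j) access the generic routine uses.
class SimpleMatrix {
public:
    explicit SimpleMatrix(int rank)
        : rank_(rank), data_(static_cast<std::size_t>(rank) * rank, 0.0) {}

    double& operator()(int i, int j) { return data_[index(i, j)]; }
    double operator()(int i, int j) const { return data_[index(i, j)]; }

private:
    std::size_t index(int i, int j) const {
        return static_cast<std::size_t>(i) * static_cast<std::size_t>(rank_) +
               static_cast<std::size_t>(j);
    }
    int rank_;
    std::vector<double> data_;
};

// Generic multiplication as given in the post. res must not alias m1 or m2,
// because each res(i, j) is zeroed before the inputs are read.
template <typename Mat>
void mat_mat_mult(const Mat& m1, const Mat& m2, Mat& res, int rank) {
    for (int i = 0; i < rank; ++i)
        for (int j = 0; j < rank; ++j) {
            res(i, j) = 0;
            for (int k = 0; k < rank; ++k)
                res(i, j) += m1(i, k) * m2(k, j);
        }
}

// Assumed baseline: the same triple loop on flat row-major C arrays.
void c_array_mult(const double* a, const double* b, double* res, int rank) {
    assert(a != res && b != res);  // output must not alias an input
    for (int i = 0; i < rank; ++i)
        for (int j = 0; j < rank; ++j) {
            double sum = 0.0;
            for (int k = 0; k < rank; ++k)
                sum += a[i * rank + k] * b[k * rank + j];
            res[i * rank + j] = sum;
        }
}
```

Timing both routines on the same sizes, with the same compiler flags, gives a rough idea of how much the matrix abstraction itself costs.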

Hi Alexei, you wrote:
Here are some results: 1) uBLAS and MTL have approximately the same performance for dense matrix multiplication. uBLAS is a bit faster with small matrices (< 50-100), MTL is faster with large ones (> 100). 2) Multiplication of plain C arrays is 5-6 times faster than both uBLAS and MTL. (!!!)
This is too much abstraction penalty for uBLAS. Did you define -DNDEBUG (enabling expression templates and disabling bounds and type checks)?
3) If I use my own simple mult function to multiply uBLAS or MTL matrices:
template <typename Mat>
void mat_mat_mult(const Mat& m1, const Mat& m2, Mat& res, int rank)
{
    for (int i = 0; i < rank; ++i)
        for (int j = 0; j < rank; ++j) {
            res(i, j) = 0;
            for (int k = 0; k < rank; ++k)
                res(i, j) += m1(i, k) * m2(k, j);
        }
}
It works 2 times faster than the native uBLAS and MTL implementations of multiplication. (Iterator overhead?)
uBLAS normally doesn't use iterators when multiplying dense matrices.
4) Dense matrix performance doesn't depend on row orientation for either uBLAS or MTL.
Maybe if your matrices fit into the cache, otherwise one probably should use blocked operations.
5) MTL sparse matrix multiplication doesn't depend on row orientation and is in any case much (2 times) faster than the best case for uBLAS. 6) uBLAS gives the best performance for sparse matrices in the row_major * column_major case (it is 3 times faster than column_major * row_major). 7) When I tried the latest uBLAS from Boost CVS, it worked much (up to 2 times) slower for sparse matrices than 1.29.
There are more debug runtime checks for sparse matrices now (see the remark regarding NDEBUG).
I would need to do other tests to draw any real conclusions, but preliminarily it seems to me that the abstraction penalty for both uBLAS and MTL is too big.
Best regards Joerg
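Joerg's remark about blocked operations can be illustrated with a small sketch. This is a generic tiled multiplication on flat row-major arrays, not the uBLAS implementation; the function name blocked_mult and the block parameter are assumptions made for the example, and a suitable block size depends on the cache of the machine in question.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// C = A * B for n x n row-major matrices, computed tile by tile so that the
// parts of A, B and C being worked on are more likely to stay in cache.
void blocked_mult(const std::vector<double>& a, const std::vector<double>& b,
                  std::vector<double>& c, int n, int block) {
    assert(n > 0 && block > 0);
    const std::size_t total = static_cast<std::size_t>(n) * n;
    assert(a.size() == total && b.size() == total && c.size() == total);

    std::fill(c.begin(), c.end(), 0.0);

    for (int ii = 0; ii < n; ii += block) {
        const int i_end = std::min(ii + block, n);
        for (int kk = 0; kk < n; kk += block) {
            const int k_end = std::min(kk + block, n);
            for (int jj = 0; jj < n; jj += block) {
                const int j_end = std::min(jj + block, n);
                // Multiply one tile of A by one tile of B and accumulate
                // the partial products into the matching tile of C.
                for (int i = ii; i < i_end; ++i)
                    for (int k = kk; k < k_end; ++k) {
                        const double aik = a[static_cast<std::size_t>(i) * n + k];
                        for (int j = jj; j < j_end; ++j)
                            c[static_cast<std::size_t>(i) * n + j] +=
                                aik * b[static_cast<std::size_t>(k) * n + j];
                    }
            }
        }
    }
}
```

For matrices that already fit in cache the tiling should make little difference, which would be consistent with the orientation-independent results reported above.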

Thanks Joerg,
This is too much abstraction penalty for uBLAS. Did you define -DNDEBUG (enabling expression templates and disabling bounds and type checks)?
Oh yeah. NDEBUG really makes a difference. Now dense matrix performance rocks. I was a bit confused, because usually a special flag is used to compile the debug version, not the release one. But it works too, as long as one knows about the trick.
I will test more with sparse matrices.
Regards.
Alexei.
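The "trick" is the standard C and C++ convention: assert() from <cassert> is active by default and is compiled out when NDEBUG is defined. The sketch below shows how a per-element check of that kind disappears in a release build. CheckedVector and sum are hypothetical names for illustration; this is not how uBLAS implements its checks, only the same general mechanism.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container whose element access is bounds-checked with assert().
// Without NDEBUG every access pays for the comparison; with -DNDEBUG the
// assert expands to nothing and the access is a plain array index.
class CheckedVector {
public:
    explicit CheckedVector(std::size_t n) : data_(n, 0.0) {}

    double& operator[](std::size_t i) {
        assert(i < data_.size());  // removed when NDEBUG is defined
        return data_[i];
    }
    double operator[](std::size_t i) const {
        assert(i < data_.size());  // removed when NDEBUG is defined
        return data_[i];
    }
    std::size_t size() const { return data_.size(); }

private:
    std::vector<double> data_;
};

// A tight loop over the container: in a debug build each iteration also
// runs the bounds check above, which is where the slowdown comes from.
double sum(const CheckedVector& v) {
    double total = 0.0;
    for (std::size_t i = 0; i < v.size(); ++i)
        total += v[i];
    return total;
}
```

In an inner loop such as a matrix product, checks like this run once per element access, so leaving them enabled can easily dominate the running time.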

I know this is going to sound terribly naive, but how does one define DNDEBUG? I am using VS.NET and Borland Enterprise 6.0. Thanks!
Jon Agiato
JonAgiato@nyc.rr.com
----- Original Message -----
From: alexei_novakov
To: Boost-Users@yahoogroups.com
Sent: Friday, November 22, 2002 4:15 PM
Subject: [Boost-Users] Re: uBLAS, MTL performance tests.
Thanks Joerg,
This is too much abstraction penalty for uBLAS. Did you define -DNDEBUG (enabling expression templates and disabling bounds and type checks)?
Oh yeah. NDEBUG really makes a difference. Now dense matrix performance rocks. I was a bit confused, because usually a special flag is used to compile the debug version, not the release one. But it works too, as long as one knows about the trick.
I will test more with sparse matrices.
Regards.
Alexei.
Info: http://www.boost.org
Wiki: http://www.crystalclearsoftware.com/cgi-bin/boost_wiki/wiki.pl
Unsubscribe: mailto:boost-users-unsubscribe@yahoogroups.com
Your use of Yahoo! Groups is subject to the Yahoo! Terms of Service.

--- Jon Agiato wrote:
I know this is going to sound terribly naive, but how does one define DNDEBUG? I am using VS.NET and Borland Enterprise 6.0. Thanks!
Add -DNDEBUG to the compiler's command-line arguments; the -D option defines the macro NDEBUG. Unfortunately I don't know where exactly you can edit the command-line arguments in VC.NET.
Alexei.
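If it is unclear whether the flag actually reached the compiler, a small check like the following may help. The helper names ndebug_defined and assertions_active are hypothetical. The command lines in the comment are the usual forms for gcc and for Microsoft's cl; where an IDE exposes the same setting is not covered in this thread.

```cpp
#include <cassert>

// Typical ways to pass the macro on the command line:
//   g++ -O3 -DNDEBUG file.cpp
//   cl /O2 /DNDEBUG file.cpp

// True when the compiler saw NDEBUG for this translation unit.
inline bool ndebug_defined() {
#ifdef NDEBUG
    return true;
#else
    return false;
#endif
}

// True when assert() still evaluates its argument. The assignment inside
// the assert only happens if assertions are compiled in.
inline bool assertions_active() {
    bool active = false;
    assert((active = true));
    return active;
}
```

Printing both values from a test program shows which configuration was built: with NDEBUG defined before <cassert> is included, ndebug_defined() is true and assertions_active() is false, and the other way round otherwise.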

Thanks for the response, Alexei. I'll look into it and report back in case anyone else is interested.
Jon Agiato
JonAgiato@nyc.rr.com
participants (4)
- Alexei Novakov
- alexei_novakov
- jhr.walter@t-online.de
- Jon Agiato