[boost] performance of a linear algebra/matrix library

6 May 2010

      hi all

i work on a linear algebra library with an intent to propose it for
inclusion into boost collection of libs

so my question is: how fast such a lib should be to make everybody
happy?

i found that to increase the performance of a generic linear algebra
library one must complicate the implementation to a very high degree
(though this statement should be obvious to everyone)

i struggled with poor performance of matrix multiplication and finally
identified the bottleneck

i improved the implementation (seriosly complicating it, i'm afraid i
will forget how it works in a month) and now it performs some 33%
slower than C code for virtually any size of matrix

the C code is as follows:

  typedef double type;
  const size_t n = 42;
  type
    *a = new type[n*n],
    *a1 = new type[n*n],
    *a2 = new type[n*n];
  //filling 'a1' and 'a2'
  for (size_t i = 0; i<n; ++i)
  {
    type * const a_i = a + i*n;
    for (size_t j = 0; j<n; ++j)
      a_i[j] = 0.;
    for (size_t k = 0; k<n; ++k)
    {
      type *a = a_i;
      const type a1_ik = a1[i*n + k];
      const type *a2_k = a2 + k*n;
      for (size_t j = 0; j<n; ++j, ++a, ++a2_k)
        *a += a1_ik*(*a2_k);
    }
  }

the C++ code is just

  matrix<type> m(n, n), m1(n, n), m2(n, n);
  //filling 'm1' and 'm2'
  m.assign(m1, m2);

more detailed results:
'n' is the dimensions of matrices, 'ratio' is runtime of C++ code
divided by runtime of C code (time_cpp/time_c)

n     |    2 |    4 |    8 |   16 |   32 |   64 |  128 |  256 |  512 | 1024 | 2048
ratio | 1,77 | 1,57 | 1,34 | 1,12 | 1,05 | 1,12 | 1,32 | 1,33 | 1,23 | 1,23 | 1,23

as you can see starting with 8 by 8 matices C++ code is roughly 33%
slower than C code

is that enough to make you (the reader) happy?

--
Pavel

[boost] performance of a linear algebra/matrix library

DE