Re: [boost] [local] Help for the Alternatives section

3 Apr 2011

On Sun, Apr 3, 2011 at 1:23 PM, Lorenzo Caminiti <lorcaminiti@gmail.com> wrote:
...
I have run a (hopefully more correct) benchmark of Boost.Local's
performance against the alternative methods -- please check my
work :)
In summary:
1) Boost.Phoenix, global functors, and local functors run in ~15s.
2) Boost.Lambda runs in ~40s.
3) Boost.Local runs in ~53s.
4) I don't have a C++0x lambda compiler so I could not benchmark C++0x lambdas.
I don't know why Boost.Lambda takes longer than the methods in 1).
Boost.Local seems to take longer than 1) because of the extra virtual
function call introduced by the trick that allows passing the local
struct as a template parameter -- I will follow up with another email
to explain this point in detail.
This is the trick used by Boost.Local to call a function defined
within a local struct from a functor passed as template parameter:

1) The local struct inherits from abstract_function and implements operator().
2) The local function is of a non-local type `function` so it can be
passed as a template parameter.
3) The functor `function` calls abstract_function's virtual operator()
which in turn calls the local struct implementation of operator().

The following code shows the idea:

#include <iostream>
#include <vector>
#include <algorithm>

#define N 10000

template<typename R, typename A0>
struct abstract_function {
    virtual R operator()(A0) = 0;
};

template<typename R, typename A0>
struct function {
    function(abstract_function<R, A0>& ref): ptr_(&ref) {}
    R operator()(A0 a0) { return (*ptr_)(a0); }
private:
    abstract_function<R, A0>* ptr_;
};

int main() {
    double sum = 0.0;
    int factor = 10;

    struct add_function: abstract_function<void, const double&> {
        add_function(double& _sum, const int& _factor):
                sum_(_sum), factor_(_factor) {}
        void operator()(const double& num) { return body(num, sum_, factor_); }
    private:
        double& sum_;
        const int& factor_;
        void body(const double& num, double& sum, const int& factor) {
            sum += factor * num;
        }
    };
    add_function functor_add(sum, factor);
    function<void, const double&> add(functor_add);

    std::vector<double> v(N * 100);
    std::fill(v.begin(), v.end(), 10);

    for (size_t n = 0; n < N; ++n) {
//        for (size_t i = 0; i < v.size(); ++i) {
//            functor_add(v[i]); // (1)
//            add(v[i]);  // (2)
//        }
        std::for_each(v.begin(), v.end(), add); // (3) OK add as tparam!
    }

    std::cout << sum << std::endl;
    return 0;
}

The call at line (3) goes through function::operator() --<<virtual>>-->
abstract_function::operator() --> add_function::operator(), so the
local struct's operator() is called.

A) Now if I comment out line (3) and uncomment the for-loop and line (2):

    ...
    for (size_t n = 0; n < N; ++n) {
        for (size_t i = 0; i < v.size(); ++i) {
//            functor_add(v[i]); // (1)
            add(v[i]);  // (2)
        }
//        std::for_each(v.begin(), v.end(), add); // (3) OK add as tparam!
    }
    ...

This runs in 49s.

B) But if I use line (1) instead of (2):

    ...
    for (size_t n = 0; n < N; ++n) {
        for (size_t i = 0; i < v.size(); ++i) {
            functor_add(v[i]); // (1)
//            add(v[i]);  // (2)
        }
//        std::for_each(v.begin(), v.end(), add); // (3) OK add as tparam!
    }
    ...

This runs in 15s!!

As far as I can see, line (1) is faster because it does not have the
overhead of the virtual function call that line (2) has:

A) Line (2) call: function::operator() --<<virtual>>-->
abstract_function::operator() --> add_function::operator() (runs in
49s)
B) Line (1) call: add_function::operator() (runs in 15s)

What do you think? Is there any way I can make A) faster (i.e.,
with a run time similar to B))?

Thank you very much.

--
Lorenzo