Sorry, I somehow overlooked that... Probably too late already... It is true that an array argument to a function is a pointer. Sorry.
I have tested the code with gcc under Mac OS X: the function taking 3 doubles as arguments was NOT necessarily faster than the one taking a boost::array parameter. It might not be worth implementing the macro-based approach...
Here is my test app, compiled with the -O3 optimization flag. (The timings below are from a dual-core 2.4 GHz machine with 4 GB RAM.)
//============================================================================
// Name : CppTest.cpp
// Description : Timing comparison of ways to pass three doubles to a function
//============================================================================
#include <iostream>
#include <numeric>
#include <boost/array.hpp>
#include <boost/progress.hpp>
using namespace std;
typedef boost::array<double, 3> array_type;
// Array passed by value: the copy is modified in place.
array_type::value_type sum_arr_copy(array_type a)
{
    return (a[0]+=a[1])+=a[2];
}

// Array passed by const reference.
array_type::value_type sum_carr(array_type const& a)
{
    return a[0]+a[1]+a[2];
}

array_type::value_type sum_carr_temp_result(array_type const& a)
{
    double result=a[0];
    return (result+=a[1])+=a[2];
}

// NOTE: the initial value 0 is an int, so std::accumulate sums in int.
// This is what causes the wrong result reported below.
array_type::value_type sum_carr_accumulate(array_type const& a)
{
    return std::accumulate(a.begin(), a.end(), 0);
}

// Despite its name, this also takes a const reference, so it measures
// the same thing as sum_carr_accumulate.
array_type::value_type sum_arr_copy_accumulate(array_type const& a)
{
    return std::accumulate(a.begin(), a.end(), 0);
}

double sum_doubles_copy(double d1, double d2, double d3)
{
    return d1+d2+d3;
}

double sum_doubles_copy_optimized(double d1, double d2, double d3)
{
    return (d1+=d2)+=d3;
}

double sum_doubles_copy_temp_result(double d1, double d2, double d3)
{
    double result=d1;
    return (result+=d2)+=d3;
}

// d1 and d2 are passed by const reference; d3 is passed by value.
double sum_doubles_ref_temp_result(double const& d1, double const& d2, double const d3)
{
    double result=d1;
    return (result+=d2)+=d3;
}
// Sorry for the macro code. I was too lazy to copy and paste it...
#define ARR_P() a
#define DBL_P() d1,d2,d3
#define DO_TEST(function, param_seq) \
{ \
    cout << "--------------------------\n" \
         << #function "\n"; \
    double result=0; \
    { \
        boost::progress_timer t; \
        for(unsigned long i=0; i<times; ++i) \
            result+=function(param_seq()); \
    } \
    cout << "result: " << result << "\n"; \
}

int main() {
    // All bits set: 4294967295 where unsigned long is 32 bits wide.
    const unsigned long times=~0;
    array_type a = {1.0, 2.3, 3.33};
    double d1 = 1.0, d2 = 2.3, d3 = 3.33;
    DO_TEST(sum_arr_copy, ARR_P);
    DO_TEST(sum_carr, ARR_P);
    DO_TEST(sum_arr_copy_accumulate, ARR_P);
    DO_TEST(sum_carr_temp_result, ARR_P);
    DO_TEST(sum_carr_accumulate, ARR_P);
    DO_TEST(sum_doubles_copy, DBL_P);
    DO_TEST(sum_doubles_copy_optimized, DBL_P);
    DO_TEST(sum_doubles_copy_temp_result, DBL_P);
    DO_TEST(sum_doubles_ref_temp_result, DBL_P);
    cout << "tests done" << endl;
    return 0;
}
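Please note that the accumulate variants produce a wrong result. The initial value 0 passed to std::accumulate is an int, so the sum is carried out in int and every partial sum is truncated (1, 3, 6 instead of 1.0, 3.3, 6.63).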
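The difference comes down to the type of the initial value. Below is a minimal sketch of the two variants side by side; std::array is used only as a stand-in for boost::array so the snippet needs no Boost headers, and the names array3, sum_int_init and sum_double_init are made up for this illustration.

```cpp
#include <array>
#include <numeric>

// Stand-in for boost::array<double, 3> (assumption, to avoid a Boost dependency).
typedef std::array<double, 3> array3;

// Initial value is the int literal 0: the accumulator has type int,
// so each partial sum is converted back to int and loses its fraction.
inline double sum_int_init(array3 const& a)
{
    return std::accumulate(a.begin(), a.end(), 0);
}

// Initial value is the double literal 0.0: the sum is carried out in double.
inline double sum_double_init(array3 const& a)
{
    return std::accumulate(a.begin(), a.end(), 0.0);
}
```

With the values from the test app (1.0, 2.3, 3.33) the first function returns 6 and the second roughly 6.63, which matches the two different totals in the results.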
Resulting timings after running each test 4294967295 times:
--------------------------
sum_arr_copy
65.19 s
result: 2.84756e+10
--------------------------
sum_carr
14.99 s
result: 2.84756e+10
--------------------------
sum_arr_copy_accumulate
59.97 s
result: 2.57698e+10
--------------------------
sum_carr_temp_result
14.97 s
result: 2.84756e+10
--------------------------
sum_carr_accumulate
59.84 s
result: 2.57698e+10
--------------------------
sum_doubles_copy
15.00 s
result: 2.84756e+10
--------------------------
sum_doubles_copy_optimized
14.98 s
result: 2.84756e+10
--------------------------
sum_doubles_copy_temp_result
14.92 s
result: 2.84756e+10
--------------------------
sum_doubles_ref_temp_result
14.93 s
result: 2.84756e+10
tests done
I hope that helps and saves you from developing a solution that might not be worth the effort.
Good Luck!
Ovanes
AMDG
Hicham Mouline wrote:
> After some thought, here is more precisely what I'd like to have...
> I apologize that it is quite different from the initial problem:
>
> template<int n> class Tree {
> static double sum(); //
> };
>
>
> If the user instantiates tree<2>, he should get:
>
> template<> class Tree<2> {
> static double sum(double d1, double d2);
> };
>
> template<> class Tree<3> {
> static double sum(double d1, double d2, double d3);
> };
> etc etc...
>
>
> so that in user code, for e.g.:
>
> double d= Tree<4>::sum(d1, d2, d3, d4);
>
> should compile.
>
>
> Is it possible for me to just define the template Tree for the n-case
> without the 2- and 3- specializations?
>
Ah. You still need the preprocessor, but you can rearrange the
definitions slightly.
(untested)
template<int N>
struct TreeSumImpl;
#define TREE_SUM_DEF(z, n, data)\
template<>\
struct TreeSumImpl<n> {\
static double sum(BOOST_PP_ENUM_PARAMS_Z(z, n, double arg)) { ... }\
};
BOOST_PP_REPEAT(20, TREE_SUM_DEF, ~)
template<int N>
struct Tree : TreeSumImpl<N> {
// other code
};
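To make the idea concrete, here is a hand-written sketch of roughly what the generated specializations look like for n = 2 and n = 3. The parameter names arg0, arg1, ... follow from how BOOST_PP_ENUM_PARAMS appends the index to its last argument. The function bodies (plain addition) are an assumption for illustration only, since the post above leaves them elided.

```cpp
// Primary template is declared but never defined, so Tree<N> only
// compiles for values of N that have a specialization.
template<int N>
struct TreeSumImpl;

// Roughly what the macro would generate for n == 2.
// Body is hypothetical: the original leaves it as "...".
template<>
struct TreeSumImpl<2> {
    static double sum(double arg0, double arg1) { return arg0 + arg1; }
};

// Roughly what the macro would generate for n == 3 (body again hypothetical).
template<>
struct TreeSumImpl<3> {
    static double sum(double arg0, double arg1, double arg2)
    {
        return arg0 + arg1 + arg2;
    }
};

// Tree inherits the static sum() with the matching number of parameters.
template<int N>
struct Tree : TreeSumImpl<N> {
    // other code
};
```

User code can then call Tree<2>::sum(d1, d2) or Tree<3>::sum(d1, d2, d3), and only the code that differs per arity has to live in the preprocessor-generated part.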
In Christ,
Steven Watanabe