Hi,
2) It is further suggested that if foo() is a (simple?) nested function, say with T=double and
double foo(double x) { return 2*x; }
then one should do 'functional composition', i.e. create the functor for the previous point 1) by
boost::bind(std::multiplies<double>(),2,_1)
Replace all that by just 2*_1
As you say, all this is very trivial; it was meant to help me understand why Karlsson does exactly this, i.e. replaces something like 2*_1 with boost::bind(std::multiplies<double>(),2,_1). (It's on page 256 in my copy, in the chapter "Library 9: Bind".) So I was trying to see whether the compiler transforms the latter into the former, which seemingly does not happen. (That leaves me a little nervous about what boost::bind will do to you in the end anyway.)
The way you did your iteration in your original post isn't really the best way to iterate, although most compilers will optimize that.
As said, the code was not meant to be optimal; rather, I was (naively) hoping that either case 1 or 2 would be comparable to case 3.
At least for this little program, the old-style code is faster by roughly a factor of 2 compared to the boost::bind approach. (Compiled with gcc version 4.3.2, optimized.)
I ran your test. Case 3 runs in 0.65s, case 1 and 2 in 0.82s. That's with GCC 4.4.3.
That is really strange. For me, case 3 runs in ~1.8...1.9s, and cases 1 and 2 in 0.81...0.84s. (I compile with gcc -O3 -Wall -lstdc++, nothing special.) This leaves me even more nervous.
Comparing 3 and 1 is not even fair, since 3 does the unnecessary sin and cos calculations like 2, but the compiler seems to optimize these.
Well, that is why I explicitly included both cases 1 and 2 in the code and already commented on this there: to show what you seem to have found yourself, namely that the number of calls to cos and sin does not seem to account for the runtime of case 1 vs. 2, since for me, as for you, cases 1 and 2 need approximately the same time. Do you suggest that if I looked at the assembler output I'd find only one sin and one cos call for cases 1 and 2, but more calls in case 3? Again, this would leave me nervous about boost::bind and gcc.

Moreover, I have included case 0, where I perform no access to the vector<T> at all and only loop over the sin(), cos(), and random() stuff. For me this case runs in ~0.16...0.19s, i.e. it seems that the 'old-fashioned' C things are definitely not the problem, at least with gcc.

Finally, I don't understand at all why for you case 3 runs even faster than cases 1 and 2. All this is very 'miraculous', and I understand even less now when to go for functional composition with boost::bind ;)

wb