Re: [Boost-users] Container iteration macro that is equivalent to handcoded iteration?

11 Oct 2007

Erik wrote:
...
Or is it possible to configure BOOST_FOREACH to be as efficient as my macro?

Michael Marcin wrote:
I don't know but that is a good question.
I considered using BOOST_FOREACH until I checked its generated output... 
which was worse than std::for_each with a boost::bind which was worse 
than std::for_each with a hand coded functor which was worse than a hand 
coded for loop like yours above.

Eric Niebler wrote:
I won't deny that the abstraction penalty of BOOST_FOREACH is not zero,
but have either of you actually measured the overhead? I have, and I
found BOOST_FOREACH to be about 5% slower than the equivalent hand-coded
loop when compiler optimizations are turned on. It's really very small,
and that 5% buys you a lot of expressivity. YMMV.

I have not made any timings, but looked at the number of instructions in
the assembly. I assume this corresponds to code size. The good news is
that with g++-4.2.0 -O3 there appears to be no abstraction penalty at
all. BOOST_FOREACH is equivalent to the handcoded loop and my macro.
They both have 21 instructions (which appear to be equivalent; some of
them are reordered and some have their parameters swapped). But with -Os
it is possible to get only 20 instructions with handcoded/my macro,
while BOOST_FOREACH actually gives 57 instructions (-Os giving more
instructions than -O3 for ANY code looks like a compiler bug to me). And
then I tried -O2 of course. handcoded/my macro yield 21 instructions
while BOOST_FOREACH yields 31.

With g++-4.1.2 handcoded/my macro gives the same result as with
g++-4.2.0. But with BOOST_FOREACH it is another story. At -O3
BOOST_FOREACH yields as many as 28 instructions, at -O2 35 instructions
and at -Os 90 instructions.

So I suppose that as long as everyone uses g++-4.2.0 (or higher, I
suppose, but I did not test that) and -O3 it should be fine to use
BOOST_FOREACH. Those who rely on getting small code with
-Os can forget about using BOOST_FOREACH with either of those compiler
versions.

So it seems like the gcc people did quite a good job of optimizing this
very complex hack. I suppose I will suggest that we start using
BOOST_FOREACH in our project. We already use -O3 for release builds and
in a few months (or maybe a year) almost everyone will have at least
gcc-4.2.0. But it would be interesting if others could verify my
measurements and maybe test other examples of code. I did some tests and
got the exact same results for the following use of BOOST_FOREACH (and
the corresponding use of my macro and a handcoded loop):
//////////////////////////////////////////////////////
#include <boost/foreach.hpp>

#include <vector>

void f(float);
struct A {
   void g() const;
   std::vector<float> const v;
};

void A::g() const {
   BOOST_FOREACH(float i, v)
      f(i);
}
//////////////////////////////////////////////////////
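
For anyone who wants to repeat the comparison: the hand-coded loop is not shown in this message, so the following is only a sketch of what such a loop might look like for the same struct. The constructor, the recording body of f() and the global vector named seen are my additions for illustration and are not part of the test case above. Caching end() in a local variable is also an assumption about how the loop was written.

```cpp
#include <vector>

// Hypothetical stand-in for the f(float) that is only declared in the
// example above; it records its arguments so the loop can be checked.
std::vector<float> seen;
void f(float x) { seen.push_back(x); }

struct A {
   // Constructor added for illustration so that the const member can be
   // initialized; the original test case does not have one.
   explicit A(std::vector<float> const& values) : v(values) {}
   void g() const;
   std::vector<float> const v;
};

// Hand-written counterpart of:  BOOST_FOREACH(float i, v) f(i);
void A::g() const {
   for (std::vector<float>::const_iterator it = v.begin(), end = v.end();
        it != end; ++it) {
      float i = *it;   // copy the element, as "float i" does in the macro form
      f(i);
   }
}
```

To compare instruction counts the way described above, f() would have to stay a bare declaration in the translation unit, as in the original test case, so that the compiler cannot inline it into the loop.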