Container iteration macro that is equivalent to handcoded iteration?
data:image/s3,"s3://crabby-images/8ac67/8ac674cdc2e195af5da24e64f987283bf48693a8" alt=""
I have a lot of handcoded loops that look something like this:
//////////////////////////////////////////////////////////////////////////////
#include <vector>
void f(float);
void g(std::vector<float> const & v) {
std::vector<float>::const_iterator const v_end = v.end();
for (std::vector<float>::const_iterator it = v.begin(); it != v_end; ++it)
f(*it);
}
//////////////////////////////////////////////////////////////////////////////
I need to replace it with something equivalent but simpler. I tried
BOOST_FOREACH (from boost-1.34):
//////////////////////////////////////////////////////////////////////////////
#include
From this I conclude that my macro is as good as a handcoded loop. (Although I of course wonder why the compiler chose to move "movl 4(%eax), %esi" further down and swap the parameters of cmpl from "%ebx, %esi" to "%esi, %ebx".)
But my macro is not as versatile as BOOST_FOREACH. Is there any way to improve it? In particular, is it possible get rid of the macro parameter value_type? Is it possible to make it work with other containers than vectors, as long as they define const_iterator/iterator, begin, end and value_type? Something like this maybe: ////////////////////////////////////////////////////////////////////////////// #define iterate_container_const(it, v) \ for \ (struct { \ typeof(v)::const_iterator current; \ typeof(v)::const_iterator const end; \ typeof(v)::value_type const & operator*() {return *current;} \ } it = {v.begin(), v.end()}; \ it.current != it.v_end; \ ++it.current) \ #include <vector> void f(float); void g(std::vector<float> const & v, std::list<float> const & l) { iterate_containter_const(it, v) f(*it); iterate_containter_const(it, l) f(*it); } ////////////////////////////////////////////////////////////////////////////// Or is it possible to configure BOOST_FOREACH to be as efficient as my macro?
data:image/s3,"s3://crabby-images/6cb8a/6cb8ab24223832e0e3097d239b5684350994d26f" alt=""
Erik wrote:
I have a lot of handcoded loops that look something like this: ////////////////////////////////////////////////////////////////////////////// #include <vector>
void f(float); void g(std::vector<float> const & v) { std::vector<float>::const_iterator const v_end = v.end(); for (std::vector<float>::const_iterator it = v.begin(); it != v_end; ++it) f(*it); } //////////////////////////////////////////////////////////////////////////////
I need to replace it with something equivalent but simpler. I tried BOOST_FOREACH (from boost-1.34): ////////////////////////////////////////////////////////////////////////////// #include
#include <vector>
void f(float); void g(std::vector<float> const & v) { BOOST_FOREACH(float i, v) f(i); } //////////////////////////////////////////////////////////////////////////////
But when I compared the assembly output of this with that of the handcoded version I discovered that the boost version is more complicated, so I tried to create something better myself, with the requirement that the generated code is not allowed to be any more complicated than that of the handcoded loop. The following is the best I could think of right now: ////////////////////////////////////////////////////////////////////////////// #define iterate_vector_const(value_type, it, v) \ for \ (struct { \ std::vector
::const_iterator current; \ std::vector ::const_iterator const end; \ value_type const & operator*() {return *current;} \ } it = {v.begin(), v.end()}; \ it.current != it.end; \ ++it.current) \ #include <vector>
void f(float); void g(std::vector<float> const & v) { iterate_vector_const(float, it, v) f(*it); } //////////////////////////////////////////////////////////////////////////////
I looked at the difference between my macro and the handcoded loop: $ diff -U2 iteration-handcoded-g++-4.2.0-O3.S iteration-iterate_vector-g++-4.2.0-O3.S --- iteration-handcoded-g++-4.2.0-O3.S 2007-10-07 19:38:56.000000000 +0200 +++ iteration-iterate_vector-g++-4.2.0-O3.S 2007-10-07 20:00:53.000000000 +0200 @@ -1,3 +1,3 @@ - .file "iteration-handcoded.cc" + .file "iteration-iterate_vector.cc" .text .align 2 @@ -18,7 +18,7 @@ .LCFI4: movl 8(%ebp), %eax - movl 4(%eax), %esi movl (%eax), %ebx - cmpl %ebx, %esi + movl 4(%eax), %esi + cmpl %esi, %ebx je .L4 .p2align 4,,7
From this I conclude that my macro is as good as a handcoded loop. (Although I of course wonder why the compiler chose to move "movl 4(%eax), %esi" further down and swap the parameters of cmpl from "%ebx, %esi" to "%esi, %ebx".)
But my macro is not as versatile as BOOST_FOREACH. Is there any way to improve it? In particular, is it possible get rid of the macro parameter value_type? Is it possible to make it work with other containers than vectors, as long as they define const_iterator/iterator, begin, end and value_type? Something like this maybe: ////////////////////////////////////////////////////////////////////////////// #define iterate_container_const(it, v) \ for \ (struct { \ typeof(v)::const_iterator current; \ typeof(v)::const_iterator const end; \ typeof(v)::value_type const & operator*() {return *current;} \ } it = {v.begin(), v.end()}; \ it.current != it.v_end; \ ++it.current) \
#include <vector>
void f(float); void g(std::vector<float> const & v, std::list<float> const & l) { iterate_containter_const(it, v) f(*it); iterate_containter_const(it, l) f(*it); } //////////////////////////////////////////////////////////////////////////////
Or is it possible to configure BOOST_FOREACH to be as efficient as my macro?
I don't know but that is a good question. I considered using BOOST_FOREACH until I checked its generated output... which was worse than std::for_each with a boost::bind which was worse than std::for_each with a hand coded functor which was worse than a hand coded for loop like yours above. - Michael Marcin p.s. Compilers make me sad
data:image/s3,"s3://crabby-images/459b0/459b05c510e36271c5487efcfc0bde5e3554adf1" alt=""
Michael Marcin wrote:
Erik wrote:
Or is it possible to configure BOOST_FOREACH to be as efficient as my macro?
I don't know but that is a good question.
I considered using BOOST_FOREACH until I checked its generated output... which was worse than std::for_each with a boost::bind which was worse than std::for_each with a hand coded functor which was worse than a hand coded for loop like yours above.
I won't deny that the abstraction penalty of BOOST_FOREACH is not zero, but have either of you actually measured the overhead? I have, and I found BOOST_FOREACH to be about 5% slower than the equivalent hand-coded loop when compiler optimizations are turned on. It's really very small, and that 5% buys you a lot of expressivity. YMMV. -- Eric Niebler Boost Consulting www.boost-consulting.com
data:image/s3,"s3://crabby-images/58c09/58c0952898ca00532e5d2618d64b370cc90a8b9b" alt=""
-----Original Message----- From: boost-users-bounces@lists.boost.org [mailto:boost-users- bounces@lists.boost.org] On Behalf Of Eric Niebler Sent: Wednesday, October 10, 2007 3:04 PM To: boost-users@lists.boost.org Subject: Re: [Boost-users] Container iteration macro that is equivalent tohandcoded iteration?
Michael Marcin wrote:
Erik wrote:
Or is it possible to configure BOOST_FOREACH to be as efficient as
my macro?
I don't know but that is a good question.
I considered using BOOST_FOREACH until I checked its generated output... which was worse than std::for_each with a boost::bind which was worse than std::for_each with a hand coded functor which was worse than a hand coded for loop like yours above.
I won't deny that the abstraction penalty of BOOST_FOREACH is not zero, but have either of you actually measured the overhead? I have, and I found BOOST_FOREACH to be about 5% slower than the equivalent hand-coded loop when compiler optimizations are turned on. It's really very small, and that 5% buys you a lot of expressivity. YMMV.
-- Eric Niebler Boost Consulting www.boost-consulting.com _______________________________________________ Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users
Where does the 5% come from? Which part of BOOST_FOREACH (which is way to advanced for me to understand) is the part that causes the most trouble with compilers? Is it the extra if statement trick that seems to be used to initialize the variable? -- John
data:image/s3,"s3://crabby-images/459b0/459b05c510e36271c5487efcfc0bde5e3554adf1" alt=""
John Femiani wrote:
From: Eric Niebler
I won't deny that the abstraction penalty of BOOST_FOREACH is not zero, but have either of you actually measured the overhead? I have, and I found BOOST_FOREACH to be about 5% slower than the equivalent hand-coded loop when compiler optimizations are turned on. It's really very small, and that 5% buys you a lot of expressivity. YMMV.
Where does the 5% come from? Which part of BOOST_FOREACH (which is way to advanced for me to understand) is the part that causes the most trouble with compilers? Is it the extra if statement trick that seems to be used to initialize the variable?
There is a nested for loop that executes once per iteration, and it's only there so that the iteration variable can be a reference. Its presence incurs an extra test and set of a hidden bool at each iteration. And on some compilers, there are a couple of extra Boolean checks before entering the loop. (Aside: please try to limit the amount of text you quote in your messages.) -- Eric Niebler Boost Consulting www.boost-consulting.com
data:image/s3,"s3://crabby-images/8ac67/8ac674cdc2e195af5da24e64f987283bf48693a8" alt=""
Michael Marcin wrote:
Erik wrote:
Or is it possible to configure BOOST_FOREACH to be as efficient as my macro? I don't know but that is a good question.
I considered using BOOST_FOREACH until I checked its generated output... which was worse than std::for_each with a boost::bind which was worse than std::for_each with a hand coded functor which was worse than a hand coded for loop like yours above.
I won't deny that the abstraction penalty of BOOST_FOREACH is not zero, but have either of you actually measured the overhead? I have, and I found BOOST_FOREACH to be about 5% slower than the equivalent hand-coded loop when compiler optimizations are turned on. It's really very small, and that 5% buys you a lot of expressivity. YMMV I have not made any timings, but looked at the number of instructions in
Eric Niebler skrev:
the assembly. I assume this corresponds to code size. The good news is
that with g++-4.2.0 -O3 there appears to be no abstraction penalty at
all. BOOST_FOREACH is equivalent to the handcoded loop and my macro.
They both have 21 instructions (which appear to be equivalent; some of
them are reordered and some have their parameters swapped). But with -Os
it is possible to get only 20 instructions with handcoded/my macro,
while BOOST_FOREACH actually gives 57 instructions (-Os giving more
instructions than -O3 for ANY code looks like a compiler bug to me). And
then I tried -O2 of course. handcoded/my macro yield 21 instructions
while BOOST_FOREACH yields 31.
With g++-4.1.2 handcoded/my macro gives the same result as with
g++-4.2.0. But with BOOST_FOREACH it is another story. At -O3
BOOST_FOREACH yields as much as 28 instructions, at -O2 35 instructions
and at -Os 90 instructions.
So I suppose that as long as everyone uses g++-4.2.0 (or higher I i
suppose but did not test) and -O3 it should be fine to use
BOOST_FOREACH. Those who rely on getting small code with
-Os can forget about using BOOST_FOREACH with any of those compiler
versions.
So it seems like the gcc people did a quite good job of optimizing this
very complex hack. I suppose I will suggest that we start using
BOOST_FOREACH in our project. We already use -O3 for release builds and
in a few months (or maybe a year) almost everyone will have at least
gcc-4.2.0. But it would be interesting if others could verify my
measurements and maybe test other examples of code. I did some tests and
got the exact same results for the following use of BOOST_FOREACH (and
the corresponding use of my macro and a handcoded loop):
//////////////////////////////////////////////////////
#include
data:image/s3,"s3://crabby-images/767fc/767fc7a1aac0195406cf316fa3358d4381b30e69" alt=""
On 10/11/07, Erik
I have not made any timings, but looked at the number of instructions in the assembly. I assume this corresponds to code size. The good news is that with g++-4.2.0 -O3 there appears to be no abstraction penalty at all. BOOST_FOREACH is equivalent to the handcoded loop and my macro. They both have 21 instructions (which appear to be equivalent; some of them are reordered and some have their parameters swapped). But with -Os it is possible to get only 20 instructions with handcoded/my macro, while BOOST_FOREACH actually gives 57 instructions (-Os giving more instructions than -O3 for ANY code looks like a compiler bug to me). And
$ man gcc -----8<----- -Os Optimize for size. -Os enables all -O2 optimizations that do not typically increase code size. It also performs further optimiza‐ tions designed to reduce code size. ----8<------ The result of combining optimization flags for a specific program is unknown. The -Ox options of gcc use some combinations that work well most of the time, but not all the time ! Unfortunately, choosing the best combination of optimization flags is an unsolved problem and depends of both the code and the data set. So, if the optimizations don't work for your code, it's not inevitably a bug of your compiler. P.S.: If you have to select a good set of optimization flags, there are softwares that aim is to automatize this selection... Regards, -- Johan
data:image/s3,"s3://crabby-images/8ac67/8ac674cdc2e195af5da24e64f987283bf48693a8" alt=""
I did some tests and got the exact same results for the following use of BOOST_FOREACH (and the corresponding use of my macro and a handcoded loop): ////////////////////////////////////////////////////// #include
#include <vector>
void f(float); struct A { void g() const; std::vector<float> const v; };
void A::g() const { BOOST_FOREACH(float i, v) f(i); } ///////////////////////////////////////////////////// Of course I got the exact same results with this code, because v is the first member of A, so A::v has the same address as A. When I realized
Erik skrev:
this I modified the testcase:
//////////////////////////////////////////////////////
#include
data:image/s3,"s3://crabby-images/6c5e8/6c5e8355a1099045fd81360a7a2c99dbfc837d03" alt=""
Erik wrote:
Unfortunately it does not look so good for BOOST_FOREACH after this modification. It will increase to 24 instructions, while the handcoded is still only 21. I created a script (attached) to test the examples systematically with different compiler versions and optimization levels.
Forgive me if this was answered in an earlier post, but how complex is the loop body, including the complexity of any functions it calls? If the total is more than a few lines long, contains several function calls, or performs even moderatley complex arithmetic, then three extra instructions in the for loop's header may not matter. Do you have access to a profiler? They often surprise me when I look at a report showing what parts of my program are actually using the most processor time (or wall clock time, if that is more important to you).
data:image/s3,"s3://crabby-images/cf6aa/cf6aa9b0ff60e1e77a1e1a2d15aefb2207ffe99c" alt=""
On 10/11/07, Andrew Holden
[snip]
Do you have access to a profiler? They often surprise me when I look at a report showing what parts of my program are actually using the most processor time (or wall clock time, if that is more important to you).
Profiling is the only way to optimize. Anything else is just wizardry. Regards, -- Felipe Magno de Almeida
data:image/s3,"s3://crabby-images/8ac67/8ac674cdc2e195af5da24e64f987283bf48693a8" alt=""
Andrew Holden skrev:
Erik wrote:
Unfortunately it does not look so good for BOOST_FOREACH after this modification. It will increase to 24 instructions, while the handcoded is still only 21. I created a script (attached) to test the examples systematically with different compiler versions and optimization levels.
Forgive me if this was answered in an earlier post, The answer is available by executing the test script in my previous post and then running the diff command that I showed in that post: diff -dU2 iterate_vector-member-g++-4.2.0-O3.s BOOST_FOREACH-member-g++-4.2.0-O3.s|kompare -
but how complex is the loop body, including the complexity of any functions it calls? The size of the loop body increases from 6 to 7 instructions and the code before the loop increases with 2 instructions.
data:image/s3,"s3://crabby-images/0425d/0425d767771932af098628cd72e2ccd4040cb8a0" alt=""
Erik wrote:
Andrew Holden skrev:
Erik wrote:
Unfortunately it does not look so good for BOOST_FOREACH after this modification. It will increase to 24 instructions, while the handcoded is still only 21.
Forgive me if this was answered in an earlier post, but how complex is the loop body, including the complexity of any functions it calls?
The answer is available by executing the test script in my previous post and then running the diff command that I showed in that post: The size of the loop body increases from 6 to 7 instructions and the code before the loop increases with 2 instructions.
No, that's not what Andrew asked. He was asking: in a *real* loop in your real application, wouldn't the work being done by the loop completely dwarf the loop-control overhead? In production code, how many loops do we write with such trivial bodies that the loop-control overhead makes a measurable difference?
data:image/s3,"s3://crabby-images/758ed/758ed636272ddc947a4ce1398eb6dee6f687ebf4" alt=""
completely dwarf the loop-control overhead? In production code, how many loops do we write with such trivial bodies that the loop-control overhead makes a measurable difference?
I was going to comment that cache and memory organiztion usually are most relevant to real performance but since you mention this, I have run into counter-examples. However, these usually can unroll so the loop penalty goes to zero :) Adding code size optimization request would presumably discourage a compiler from trying this.
From: Nat Goodspeed
Reply-To: boost-users@lists.boost.org To: boost-users@lists.boost.org Subject: Re: [Boost-users] Container iteration macro that is equivalent to handcoded iteration? Date: Thu, 11 Oct 2007 12:48:21 -0400 Erik wrote:
Andrew Holden skrev:
Erik wrote:
Unfortunately it does not look so good for BOOST_FOREACH after this modification. It will increase to 24 instructions, while the handcoded is still only 21.
Forgive me if this was answered in an earlier post, but how complex is the loop body, including the complexity of any functions it calls?
The answer is available by executing the test script in my previous post and then running the diff command that I showed in that post: The size of the loop body increases from 6 to 7 instructions and the code before the loop increases with 2 instructions.
No, that's not what Andrew asked. He was asking: in a *real* loop in your real application, wouldn't the work being done by the loop completely dwarf the loop-control overhead? In production code, how many loops do we write with such trivial bodies that the loop-control overhead makes a measurable difference? _______________________________________________ Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users
_________________________________________________________________ Capture the missing critters! Play Search Queries and earn great prizes. http://club.live.com/search_queries.aspx?icid=sq_hotmailtextlink1_oct
data:image/s3,"s3://crabby-images/6c5e8/6c5e8355a1099045fd81360a7a2c99dbfc837d03" alt=""
Erik wrote:
Erik wrote:
Unfortunately it does not look so good for BOOST_FOREACH after this modification. It will increase to 24 instructions, while the handcoded is still only 21. I created a script (attached) to test the examples systematically with different compiler versions and optimization levels.
Forgive me if this was answered in an earlier post, The answer is available by executing the test script in my
Andrew Holden skrev: previous post and then running the diff command that I showed in that post: diff -dU2 iterate_vector-member-g++-4.2.0-O3.s BOOST_FOREACH-member-g++-4.2.0-O3.s|kompare -
but how complex is the loop body, including the complexity of any functions it calls? The size of the loop body increases from 6 to 7 instructions and the code before the loop increases with 2 instructions.
Okay. I see that the loop body contains a call to a function "f", which
is not defined. Here is the code from one of the tests:
//Begin BOOST_FOREACH-parameter.cc
void f(float);
#include
data:image/s3,"s3://crabby-images/8ac67/8ac674cdc2e195af5da24e64f987283bf48693a8" alt=""
Andrew Holden skrev:
Erik wrote:
Andrew Holden skrev:
Erik wrote:
Unfortunately it does not look so good for BOOST_FOREACH after this modification. It will increase to 24 instructions, while the handcoded is still only 21. I created a script (attached) to test the examples systematically with different compiler versions and optimization levels. Forgive me if this was answered in an earlier post,
The answer is available by executing the test script in my previous post and then running the diff command that I showed in that post: diff -dU2 iterate_vector-member-g++-4.2.0-O3.s BOOST_FOREACH-member-g++-4.2.0-O3.s|kompare -
but how complex is the loop body, including the complexity of any functions it calls?
The size of the loop body increases from 6 to 7 instructions and the code before the loop increases with 2 instructions.
Okay. I see that the loop body contains a call to a function "f", which is not defined. Here is the code from one of the tests:
//Begin BOOST_FOREACH-parameter.cc void f(float); #include
#include <vector> void g(const std::vector<float> & v) {
BOOST_FOREACH(float i, v) f(i); } //End BOOST_FOREACH-parameter.cc
This is the test case where BOOST_FOREACH actually has no overhead with the right compiler version and optimization level. To study the problem, look at BOOST_FOREACH-member instead.
We would need to know the contents of f before we can really discuss the consequences of the extra instructions in BOOST_FOREACH. No we do not. The compiler does not need to know the contents of f to generate that code. Neither do we need to know the contents of f to discuss the code that the compiler generates in this case. I deliberately left f undefined so that loop efficiency can be studied in isolation.
The advantage of BOOST_FOREACH is isolated to the loops, so to weight the advantages against the disadvantages, we must study the disadvantages in isolation as well. Sure, any relatively small disadvantage in a program construct can be hidden by adding everything else that the program needs to do to the comparison. Suppose that 2 out of 3 loops become 12 byte longer with BOOST_FOREACH and that there are 2 000 loops in a program. That means 16kB. Add to that somewhat slower execution. Not a huge disadvantage assuming that the whole program is several MB, but then BOOST_FOREACH is just a small building block among others. But I still hope that this can be fixed by improving g++ (or possible BOOST_FOREACH), so that using BOOST_FOREACH will be strictly better and not a tradeoff.
data:image/s3,"s3://crabby-images/758ed/758ed636272ddc947a4ce1398eb6dee6f687ebf4" alt=""
No we do not. The compiler does not need to know the contents of f to generate that code. Neither do we need to know the contents of f to discuss the code that the compiler generates in this case. I deliberately left f undefined so that loop efficiency can be studied in isolation.
If you don't write something the compiler could inline,I wouldn't worry about things that are on the order of a subroutine call. If you put something specific there, then the compiler has lots of options. Again, for very simple things you could inline yourself, for more complicated things, in almost every case I've written, memory access patterns dominate performance. A few cache misses can kill you. Intel had some good references, you can skim their site for recent stuff: http://www.google.com/search?hl=en&q=ia32+performance+optimization+site%3Aintel.com Anyone have a recent Intel compiler? I know in one case I had some hand written SSE code with real clever register usage specialized for a certain wavelet transform. As I recall, the naively written general c++ code executed in about the same time- I don't remember specifically why, but most of the Vtune results, IIRC, pointed to memory limits and not instruction count.
From: Erik
Reply-To: boost-users@lists.boost.org To: boost-users@lists.boost.org Subject: Re: [Boost-users] Container iteration macro that is equivalent to handcoded iteration? Date: Fri, 12 Oct 2007 01:10:07 +0200
_________________________________________________________________ Make every IM count. Download Messenger and join the im Initiative now. Its free. http://im.live.com/messenger/im/home/?source=TAGHM
data:image/s3,"s3://crabby-images/b4e66/b4e6618abd88571690777d58d3e735c7f53bb18c" alt=""
on Thu Oct 11 2007, "Mike Marchywka"
No we do not. The compiler does not need to know the contents of f to generate that code. Neither do we need to know the contents of f to discuss the code that the compiler generates in this case. I deliberately left f undefined so that loop efficiency can be studied in isolation.
If you don't write something the compiler could inline,I wouldn't worry about things that are on the order of a subroutine call.
Furthermore, if you do write something inlineable, the compiler may be able to dispense with additional overhead from BOOST_FOREACH that cannot be legally removed without knowing the full loop body. -- Dave Abrahams Boost Consulting http://www.boost-consulting.com
data:image/s3,"s3://crabby-images/58c09/58c0952898ca00532e5d2618d64b370cc90a8b9b" alt=""
-----Original Message----- From: boost-users-bounces@lists.boost.org [mailto:boost-users- bounces@lists.boost.org] On Behalf Of Michael Marcin Sent: Wednesday, October 10, 2007 1:07 PM To: boost-users@lists.boost.org Subject: Re: [Boost-users] Container iteration macro that is equivalent tohandcoded iteration?
Erik wrote:
I have a lot of handcoded loops that look something like this:
//////////////////////////////////////////////////////////////////////// //
////
#include <vector>
void f(float); void g(std::vector<float> const & v) { std::vector<float>::const_iterator const v_end = v.end(); for (std::vector<float>::const_iterator it = v.begin(); it != v_end; ++it) f(*it); }
//////////////////////////////////////////////////////////////////////// //
////
I need to replace it with something equivalent but simpler. I tried BOOST_FOREACH (from boost-1.34):
//////////////////////////////////////////////////////////////////////// //
////
#include
#include <vector>
void f(float); void g(std::vector<float> const & v) { BOOST_FOREACH(float i, v) f(i); }
//////////////////////////////////////////////////////////////////////// //
////
But when I compared the assembly output of this with that of the handcoded version I discovered that the boost version is more complicated, so I tried to create something better myself, with the requirement that the generated code is not allowed to be any more complicated than that of the handcoded loop. The following is the
best I
could think of right now:
//////////////////////////////////////////////////////////////////////// //
////
#define iterate_vector_const(value_type, it, v) \ for \ (struct { \ std::vector
::const_iterator current; \ std::vector ::const_iterator const end; \ value_type const & operator*() {return *current;} \ } it = {v.begin(), v.end()}; \ it.current != it.end; \ ++it.current) \ #include <vector>
void f(float); void g(std::vector<float> const & v) { iterate_vector_const(float, it, v) f(*it); }
////
I looked at the difference between my macro and the handcoded loop: $ diff -U2 iteration-handcoded-g++-4.2.0-O3.S
iteration-iterate_vector- g++-4.2.0-O3.S
--- iteration-handcoded-g++-4.2.0-O3.S 2007-10-07 19:38:56.000000000 +0200 +++ iteration-iterate_vector-g++-4.2.0-O3.S 2007-10-07 20:00:53.000000000 +0200 @@ -1,3 +1,3 @@ - .file "iteration-handcoded.cc" + .file "iteration-iterate_vector.cc" .text .align 2 @@ -18,7 +18,7 @@ .LCFI4: movl 8(%ebp), %eax - movl 4(%eax), %esi movl (%eax), %ebx - cmpl %ebx, %esi + movl 4(%eax), %esi + cmpl %esi, %ebx je .L4 .p2align 4,,7
From this I conclude that my macro is as good as a handcoded loop. (Although I of course wonder why the compiler chose to move "movl 4(%eax), %esi" further down and swap the parameters of cmpl from "%ebx, %esi" to "%esi, %ebx".)
But my macro is not as versatile as BOOST_FOREACH. Is there any way to improve it? In particular, is it possible get rid of the macro
value_type? Is it possible to make it work with other containers
//////////////////////////////////////////////////////////////////////// // parameter than
vectors, as long as they define const_iterator/iterator, begin, end and value_type? Something like this maybe:
//////////////////////////////////////////////////////////////////////// //
////
#define iterate_container_const(it, v) \ for \ (struct { \ typeof(v)::const_iterator current; \ typeof(v)::const_iterator const end; \ typeof(v)::value_type const & operator*() {return *current;} \ } it = {v.begin(), v.end()}; \ it.current != it.v_end; \ ++it.current) \
#include <vector>
void f(float); void g(std::vector<float> const & v, std::list<float> const & l) { iterate_containter_const(it, v) f(*it); iterate_containter_const(it, l) f(*it); }
//////////////////////////////////////////////////////////////////////// //
////
Or is it possible to configure BOOST_FOREACH to be as efficient as
my macro?
I don't know but that is a good question.
I considered using BOOST_FOREACH until I checked its generated output... which was worse than std::for_each with a boost::bind which was worse than std::for_each with a hand coded functor which was worse than a hand coded for loop like yours above.
- Michael Marcin
p.s. Compilers make me sad
_______________________________________________ Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users
I think you will need to pass SOME type unless you are comfortable with BOOST_AUTO or BOOST_TYPEOF. I think the problem is on some compilers they require each type to be 'registered'. Maybe if you use a NEXT macro as well things will work out, because you can put multiple braces in the next macro. #define FOREACH(X, COL, COLT) \ { \ COLT& __col__ = (COL); \ for (iterator_type<COLT>::type __i__ = begin(__col__); \ __i__ != end(__col__);\ ++__i__) \ { \ X = *__i__; \ { #define NEXT() \ }\ }\ } I haven't tried this, but it seems like one could do: #define FOREACH(X, COL) \ { \ BOOST_TYPEOF(COL)& __col__ = (COL); \ BOOST_AUTO(__cur__, boost::begin(__col__)); BOOST_AUTO(__end__, boost::end(__col__)); for (; __cur__ != __end__; ++__cur__) \ { \ X = *__cur__; \ { #define NEXT() \ }\ }\ } Then you could use it like: vector<int> vec; FOREACH(int x, vec){ //do something with x }NEXT(); Or if BOOST_AUTO doesn't work it is vector<int> vec; //vector<int> does not have a comma in the type name! FOREACH(int x, vec, vector<int>) { //do something with x }NEXT(); I have no idea how well BOOST_AUTO will work though. You would also need a TPL version I think. IN my own code I have a special concern because I deal with half_edge structures where begin()==end() and I need to do a bottom testing loop. In these cases I made iteration macros for each container: I.e. MY_FOREACH_VERTEX_EDGE(e, vertex){ //e is an edge iterator }MY_NEXT_VERTEX_EDGE() I found that to be readable. I did not use BGL stuff because I did not know how to deal with begin() == end(). -- John
participants (10)
-
Andrew Holden
-
David Abrahams
-
Eric Niebler
-
Erik
-
Felipe Magno de Almeida
-
Johan Oudinet
-
John Femiani
-
Michael Marcin
-
Mike Marchywka
-
Nat Goodspeed