On Fri, 2010-04-09 at 00:56 -0400, Brett Gmoser wrote:
> The documentation clearly says that v(x, y) is slower than iteration.

Interestingly, I've just been benchmarking the various GIL image traversal methods. I find the coordinate access method very convenient, but I wanted to get some idea of how much improvement I should realistically expect from converting code to use iterators.

Results were obtained on standard Debian Lenny on an Intel Core i7, compiled with -march=core2 -mfpmath=sse -msse4.1 -O3 -DNDEBUG; the main bits of code are appended to this email.

    GIL coord access            1120.9697 Megapixels/s
    GIL row iterator access     1293.0218 Megapixels/s
    GIL image iterator access     77.9477 Megapixels/s

I was pretty surprised just how efficient the v(x,y) accessor is, and even more surprised by how inefficient the whole-image iterator is!

Inspecting the assembler, the inner loop of v(x,y) looks like:

    .L768:
        xorb    (%rcx,%rdx), %sil
        movq    %rax, %rdx
        incq    %rax
        cmpq    %r8, %rax
        jne     .L768

which is very lean, but not quite as good as the inner loop of the row iterator:

    .L734:
        xorb    (%rdx,%rax), %cl
        incq    %rax
        cmpq    %rax, %r8
        jg      .L734

However, the inner loop of the all-image iterator is:

    .L696:
        movzbl  (%rcx), %eax
        incq    %rdx
        leaq    1(%rcx,%rbp), %rcx
        xorl    %eax, %r10d
    .L708:
        testq   %rdx, %rdx
        jne     .L696
        cmpq    %rcx, %rbx
        jne     .L696

which is a bit more complicated, although it seems remarkable that it runs ~15 times slower than the other methods.

What I took away from this:

- Avoid the all-image iterator like the plague (although I don't really understand how it manages to be quite so spectacularly slow).
- You need to be pretty desperate for performance to convert working, basically fast enough coordinate-access code to iterators.
- Compilers can do a pretty nice job with GIL classes. I've used other image classes which leave far more to run-time (e.g. virtual function calls), and you basically have to "unload" the class information into pointers and ints and do it all yourself to get performant inner loops.
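
For anyone who wants to experiment with the three traversal shapes without Boost, here is a minimal standalone sketch. Everything in it is an assumption made for illustration: GrayView, hash_coord, hash_rows and hash_flat are hypothetical names, not GIL types or functions, and the flat loop is only a guess at the kind of per-pixel bookkeeping a whole-image iterator needs when rows may be padded. It is not taken from GIL's source, and it is not offered as an explanation of the full ~15x gap.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for an 8-bit gray image view (NOT a GIL type).
struct GrayView {
    const std::uint8_t* data;   // first pixel of the first row
    std::ptrdiff_t width;       // pixels per row
    std::ptrdiff_t height;      // number of rows
    std::ptrdiff_t row_bytes;   // distance between row starts, >= width

    // Coordinate access, analogous in spirit to v(x, y).
    std::uint8_t operator()(std::ptrdiff_t x, std::ptrdiff_t y) const {
        return data[y * row_bytes + x];
    }
    // Pointer to the first pixel of row y.
    const std::uint8_t* row_begin(std::ptrdiff_t y) const {
        return data + y * row_bytes;
    }
};

// Nested loops with coordinate access.
std::uint8_t hash_coord(const GrayView& v) {
    std::uint8_t h = 0;
    for (std::ptrdiff_t y = 0; y < v.height; ++y)
        for (std::ptrdiff_t x = 0; x < v.width; ++x)
            h = static_cast<std::uint8_t>(h ^ v(x, y));
    return h;
}

// Nested loops with a per-row pointer; the row change happens in the outer loop.
std::uint8_t hash_rows(const GrayView& v) {
    std::uint8_t h = 0;
    for (std::ptrdiff_t y = 0; y < v.height; ++y) {
        const std::uint8_t* p = v.row_begin(y);
        for (std::ptrdiff_t x = 0; x < v.width; ++x, ++p)
            h = static_cast<std::uint8_t>(h ^ *p);
    }
    return h;
}

// Single flat loop: the end-of-row test and padding skip run for every pixel.
std::uint8_t hash_flat(const GrayView& v) {
    assert(v.row_bytes >= v.width);
    std::uint8_t h = 0;
    const std::ptrdiff_t total = v.width * v.height;
    std::ptrdiff_t offset = 0;  // byte offset of the current pixel
    std::ptrdiff_t x = 0;       // column of the current pixel
    for (std::ptrdiff_t n = 0; n < total; ++n) {
        h = static_cast<std::uint8_t>(h ^ v.data[offset]);
        ++offset;
        if (++x == v.width) {                 // reached the end of a row
            x = 0;
            offset += v.row_bytes - v.width;  // skip any row padding
        }
    }
    return h;
}
```

All three functions should return the same value for the same view, including one whose row_bytes is larger than its width, so they can be timed against each other on equal terms.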

-----
// Traverse each image with nested loops and coordinate access v(x,y).
BOOST_AUTO_TEST_CASE(coord_access_benchmark)
{
  unsigned char hash=0;
  scoped_timer t("GIL coord access",images().size(),"Megapixels");
  for (images_t::const_iterator it=images().begin();it!=images().end();++it)
  {
    const boost::gil::gray8c_view_t v=boost::gil::const_view(**it);
    for (int y=0;y<v.height();++y)
      for (int x=0;x<v.width();++x)
        hash=(hash^v(x,y));
  }
  force_result=hash;
}

// Traverse each image row by row, using an x_iterator within each row.
BOOST_AUTO_TEST_CASE(row_iterator_access_benchmark)
{
  unsigned char hash=0;
  scoped_timer t("GIL row iterator access",images().size(),"Megapixels");
  for (images_t::const_iterator it=images().begin();it!=images().end();++it)
  {
    const boost::gil::gray8c_view_t v=boost::gil::const_view(**it);
    for (int y=0;y<v.height();++y)
    {
      boost::gil::gray8c_view_t::x_iterator p=v.row_begin(y);
      for (int x=0;x<v.width();++x,++p)
        hash=(hash^*p);
    }
  }
  force_result=hash;
}

// Traverse each image with a single whole-image iterator.
BOOST_AUTO_TEST_CASE(image_iterator_access_benchmark)
{
  unsigned char hash=0;
  scoped_timer t("GIL image iterator access",images().size(),"Megapixels");
  for (images_t::const_iterator it=images().begin();it!=images().end();++it)
  {
    const boost::gil::gray8c_view_t v=boost::gil::const_view(**it);
    for (boost::gil::gray8c_view_t::iterator p=v.begin();p!=v.end();++p)
    {
      hash=(hash^*p);
    }
  }
  force_result=hash;
}
-----

Hope that's of interest to some.

Tim