[GIL] Some questions about 2D iteration
data:image/s3,"s3://crabby-images/c23ca/c23ca736dc17dbd15de7e7afb98db53ac3363676" alt=""
Hello everybody,
I've been toying with GIL a bit for the last few days in my spare time,
and I've got a few questions that I can't seem to find answers to in the
documentation. The following function which I've written works, but I'd
like to get a better understanding of how to use iterators to accomplish
the same task (the documentation says that view(x, y) = z is
inefficient), as well as a few other things. The job of this function
is to crop an image, discarding any white background. In other words,
find the top-most, left-most, right-most, and bottom-most pixel that
isn't the background color (white), and crop the image around those
coordinates.
template
data:image/s3,"s3://crabby-images/bdc6d/bdc6d58d359cc9389a17fab9e95d2f2f1c70fdd4" alt=""
Hi Brett, don't have much time right now. Have you read the tutorial?
http://www.boost.org/doc/libs/1_42_0/libs/gil/doc/html/giltutorial.html
I'll read your email for thoroughly tomorrow.
Regards,
Christian
On Wed, Mar 31, 2010 at 5:33 PM, Brett Gmoser
Hello everybody,
I've been toying with GIL a bit for the last few days in my spare time, and I've got a few questions that I can't seem to find answers to in the documentation. The following function which I've written works, but I'd like to get a better understanding of how to use iterators to accomplish the same task (the documentation says that view(x, y) = z is inefficient), as well as a few other things. The job of this function is to crop an image, discarding any white background. In other words, find the top-most, left-most, right-most, and bottom-most pixel that isn't the background color (white), and crop the image around those coordinates.
template
void crop_image(ViewType& src, boost::scoped_ptr<ImageType>& dst) const { int top = 0; int right = 0; int bottom = 0; int left = 0; // Find the top, right, bottom, and left extents. First, find the top. for(int y=0; y < src.height(); ++y) { for(typename ViewType::x_iterator xpos = src.row_begin(y); xpos != src.row_end(y); ++xpos) { if(rgb8_pixel_t(255,255,255) != *xpos) { top = y; break; } } if(top) break; }
// ... Do similar things for finding the right, bottom, and left extents.
// We should now have the rectangle of the image, and it should be easy to copy everything over. dst.reset(new ImageType(1 + (right - left), 1 + (bottom - top))); for(int x = left; x <= right; ++x) for(int y = top; y <= bottom; ++y) view(*dst)((x - left), (y - top)) = src(x, y); }
So, my first question is - is there a better way to do that first chore of finding each extent? Such as a way to use std::find or std::find_if, keeping in mind that I have to do two of the four sides in reverse (bottom-to-top to find bottom-most, right-to-left to find right-most)?
The next question is about the final loop which copies from the source view to the newly created destination image. It seems that using iterators, I could do something like this, assuming that I had the above loops find iterator positions for all four extents (left, top, bottom, and right are iterators):
// Start the copying process by creating a column iterator for the destination table beginning at column 0. // The top iterator starts at the top-most extent - and we cycle through the source image in a top-to-bottom, // left-to-right fashion. for(typename ImageType::view_t::y_iterator dst_y = view(*dst).col_begin(0); top <= bottom; ++dst_y,++top) { // The left iterator needs to be initialized with whatever row we're on - this appears to be impossible. There doesn't // seem to be any suitable way to do this, so consider it pseudocode. typename ViewType::x_iterator src_x = left.y();
// The destination X iterator also needs to know what Y position to begin on. You'd think you would be able to initialize // it with an iterator position, but it doesn't seem that you can. Again, this doesn't actually compile (it's the // view(*dst).row_begin(y) part). for(typename ImageType::view_t::x_iterator dst_x = view(*dst).row_begin(y); // Here, I need to be able to tell if src_x has hit the right extent. src_x <= right doesn't seem it will work, because // again right doesn't know it's Y position. src_x != right;
// And increment both src and destination iterators. ++src_x,++dst_x ) { // Finally do the should-be-simple task of assigning the source pixel to the destination pixel. *dst_x = *src_x; } }
If you read the comments, most of this doesn't seem to be possible. I've seen that there is a "transform_pixel_positions" algorithm that may do something close to what I want, but unfortunately I cannot find any documentation on it (other than mentioning that it exists, and a short example that really doesn't help - one of the few examples actually provided by the documentation). It seems to pass a unary argument to a functor, but that doesn't seem useful if I don't have both the X and Y coordinates, or an X iterator and Y iterator. Also, nowhere can I find where it says what the argument passed to the functor actually /is/.
So after I figure these few things out, I'm hoping to be able to speed up my program a little bit. Right now everything is pretty slow using that view(x, y) method. I'd appreciate any help that anybody can give, thanks!
Brett
_______________________________________________ Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users
data:image/s3,"s3://crabby-images/c23ca/c23ca736dc17dbd15de7e7afb98db53ac3363676" alt=""
On 3/31/2010 5:48 PM, Christian Henning wrote:
Hi Brett, don't have much time right now. Have you read the tutorial?
http://www.boost.org/doc/libs/1_42_0/libs/gil/doc/html/giltutorial.html
I'll read your email for thoroughly tomorrow.
Regards, Christian
Christian, Thanks. I've read the tutorial (or at least glanced over most of it), but I honestly can't find anything (besides possibly the transform_pixel_positions algorithm) that would do what I want to do. Also, all of the X/Y 2d loops seem to use a numeric loop for the outer loop (I was hoping to be able to use iterators for both, since it would be easier to go from left-to-right and top-to-bottom extents, where left isn't necessarily 0 and top isn't necessarily 0). Thanks again, Brett
data:image/s3,"s3://crabby-images/bdc6d/bdc6d58d359cc9389a17fab9e95d2f2f1c70fdd4" alt=""
Hi Brett,
Thanks. I've read the tutorial (or at least glanced over most of it), but I honestly can't find anything (besides possibly the transform_pixel_positions algorithm) that would do what I want to do. Also, all of the X/Y 2d loops seem to use a numeric loop for the outer loop (I was hoping to be able to use iterators for both, since it would be easier to go from left-to-right and top-to-bottom extents, where left isn't necessarily 0 and top isn't necessarily 0).
Sorry for getting back so late. Do you still have problems? If though, please send me some sample code that I can work with. Regards, Christian
data:image/s3,"s3://crabby-images/c23ca/c23ca736dc17dbd15de7e7afb98db53ac3363676" alt=""
On 3/31/2010 5:48 PM, Christian Henning wrote:
Hi Brett, don't have much time right now. Have you read the tutorial?
http://www.boost.org/doc/libs/1_42_0/libs/gil/doc/html/giltutorial.html
I'll read your email for thoroughly tomorrow.
Regards, Christian
Christian, I'm sorry to bother you. Have you happened to have any time to look at my question more thoroughly? :) Thanks, Brett
data:image/s3,"s3://crabby-images/6ad2e/6ad2e6fa09e4a89330505265bee6ea8e6f4113b8" alt=""
Brett, For your first question, it's probable that using 1d iterators would not speed things, as they're less performant for 2d traversal. To gain some speed, you could try storing the results of 'src.height()' and 'src.row_end(y)' into variables, this would allow the compiler to know for sure that these values are constant. Also note that using numerical values are not really a problem for your algorithm reusability : you could feed it with, say, a view transformed with 'flipped_up_down_viewhttp://www.boost.org/doc/libs/1_42_0/libs/gil/doc/html/g_i_l_0163.html' so it could operate bottom-up, and so on. For your second question, I'd look at creating a 'subimage_view' from your source view, and then use 'copy_pixels' to copy it to your destination image. This should be pretty close to optimal gil usage. Cheers, Nicolas.
data:image/s3,"s3://crabby-images/c23ca/c23ca736dc17dbd15de7e7afb98db53ac3363676" alt=""
On 4/7/2010 2:39 AM, Nicolas Lelong wrote:
Brett,
For your first question, it's probable that using 1d iterators would not speed things, as they're less performant for 2d traversal. To gain some speed, you could try storing the results of 'src.height()' and 'src.row_end(y)' into variables, this would allow the compiler to know for sure that these values are constant.
Also note that using numerical values are not really a problem for your algorithm reusability : you could feed it with, say, a view transformed with 'flipped_up_down_view http://www.boost.org/doc/libs/1_42_0/libs/gil/doc/html/g_i_l_0163.html' so it could operate bottom-up, and so on.
For your second question, I'd look at creating a 'subimage_view' from your source view, and then use 'copy_pixels' to copy it to your destination image. This should be pretty close to optimal gil usage.
Nicolas, Thanks so much for your help! Your suggestions really helped me out. I was able to use the subimg_view's for the moving around in the destination image (I hadn't thought of it that way), and the flipped_up_down view stuff really helped to make my "find extents" algorithm more generic (I wound up using the roated_90cw, rotated_90ccw, rotated_180 views). The only thing I'm confused about is that you say that using 1d iterators wouldn't help much (and, btw, I was able to use the iterators correctly now that I didn't need to add to the x/y coordinates in the destination view). The documentation clearly says that v(x, y) is slower than iteration. Would you mind clarifying what you mean? What I'm doing now, and what a lot of the samples do, is something like this: for(int x = 0; x < width, ++x) for(y_iterator y = src.y_begin(x); y != src.y_end(x); ++y) /* ... */ My confusion earlier was that apparently you cannot use iterators for both the inner and outer loop - one of them needs to be an integer, which doesn't seem to be an issue. It makes sense that v(x, y) would be slower because it needs to calculate both the x and y position. Thanks again for your help! I'm a lot better off now having understood your response to my question. Brett
data:image/s3,"s3://crabby-images/6ad2e/6ad2e6fa09e4a89330505265bee6ea8e6f4113b8" alt=""
Brett,
glad I could help ! The only thing I'm confused about is that you say that using 1d iterators
wouldn't help much (and, btw, I was able to use the iterators correctly now that I didn't need to add to the x/y coordinates in the destination view). The documentation clearly says that v(x, y) is slower than iteration. Would you mind clarifying what you mean? What I'm doing now, and what a lot of the samples do, is something like this:
for(int x = 0; x < width, ++x) for(y_iterator y = src.y_begin(x); y != src.y_end(x); ++y) /* ... */
Actually, you're approach is the good one. I should not have said '1d iterators', its confusing. I only wanted to warn you that according to http://www.boost.org/doc/libs/1_42_0/libs/gil/doc/html/giltutorial.html#OneD...: *GIL image views provide **begin()** and **end()** methods that return one
dimensional pixel iterators which iterate over each pixel in the view, left to right and top to bottom. They do a proper "carriage return" - they skip any unused bytes at the end of a row. As such, they are slightly suboptimal, because they need to keep track of their current position with respect to the end of the row. Their increment operator performs one extra check (are we at the end of the row?), a check that is avoided if two nested loops are used instead.*
So, what I meant is to avoid using something like : for (iterator it = src.begin(); it != src.end(); ++it) /* ... */ which is often suboptimal to traverse the pixels of a view. Just one comment about your loop, as you mentioned a need for performance, be careful about the order of the loops, as the performance of your traversal may vary with the memory layout of the pixel data of your view. Most of the time, its recommended to iterate over 'y' in the outer loop : for(int y = 0; y < height, ++x) for(x_iterator x = src.row_begin(x), end = src.row_end(); x != end; ++x) /* ... */ This allows, considering a typical image memory layout, to traverse raw images in a cache coherent way. Note also that calling row_end() at every pixel may not be nicely optimized away by your compiler. Best regards, Nicolas.
data:image/s3,"s3://crabby-images/c23ca/c23ca736dc17dbd15de7e7afb98db53ac3363676" alt=""
Nicolas,
glad I could help !
I'm glad you saw my message! Sorry it took me a day or two to get back into the swing of this project, I was worried you wouldn't see my reply just because I wanted to thank you. The previous one and this one was a huge help.
Actually, you're approach is the good one. I should not have said '1d iterators', its confusing. I only wanted to warn you that according to http://www.boost.org/doc/libs/1_42_0/libs/gil/doc/html/giltutorial.html#OneD... :
Actually, '1d iterators' makes perfect sense now that I understand what you mean (my fault for not understanding earlier). I can definitely see why that would be less efficient. I guess there is a padding to the right and bottom side of the image (I think you graphics guys call it the "stride"?), since by the actual data layout of the image (just an array) you would think the 1d iteration would actually be faster as long as you didn't need to know your exact X/Y coordinate.
Just one comment about your loop, as you mentioned a need for performance, be careful about the order of the loops, as the performance of your traversal may vary with the memory layout of the pixel data of your view. Most of the time, its recommended to iterate over 'y' in the outer loop :
for(int y = 0; y < height, ++x) for(x_iterator x = src.row_begin(x), end = src.row_end(); x != end; ++x) /* ... */
This allows, considering a typical image memory layout, to traverse raw images in a cache coherent way. Note also that calling row_end() at every pixel may not be nicely optimized away by your compiler.
Again, a most excellent tip! I had a bunch with x as the outer loop, I'll switch them over tomorrow, as well as the calling row_end() on every pixel. Thanks again for your help. Developing this small project (a CAPTCHA, if I hadn't mentioned) has been fun with GIL - it's rare I get to do something outside of my normal day-to-day stuff, and this is what makes programming fun. That said, if anybody on this list reading this happens to be in need of a CAPTCHA generator using GIL (I didn't find anything that was already made using C++ and/or GIL), I'd be happy to release my code into the public domain. Some samples of the same word generated a bunch of times is here: http://www.codexterous.com/captchas/ - and it's actually much improved now (more features added, such as the anti-segmentation line is waved). Brett
data:image/s3,"s3://crabby-images/63515/635154bb02cb5e24a2eda06b9480c22bdc6093aa" alt=""
On Fri, 2010-04-09 at 00:56 -0400, Brett Gmoser wrote:
The documentation clearly says that v(x, y) is slower than iteration.
Interestingly, I've just been benchmarking the various GIL image
traversal methods, as I find the coordinate access method very
convenient, but I wanted to get some idea of how much improvement I
should realistically expect from converting code to use iterators.
Results obtained on standard Debian Lenny on an Intel Core i7, compiled
with -march=core2 -mfpmath=sse -msse4.1 -O3 -DNDEBUG ; the main bits of
code appended to this email.
GIL coord access 1120.9697 Megapixels/s
GIL row iterator access 1293.0218 Megapixels/s
GIL image iterator access 77.9477 Megapixels/s
I was pretty surprised just how efficient the v(x,y) accessor is,
and even more surprised by how inefficient the whole-image iterator is!
Inspecting the assember, the inner loop of v(x,y) looks like:
.L768:
xorb (%rcx,%rdx), %sil
movq %rax, %rdx
incq %rax
cmpq %r8, %rax
jne .L768
which is very lean, but not quite as good as the inner loop of the row
iterator:
.L734:
xorb (%rdx,%rax), %cl
incq %rax
cmpq %rax, %r8
jg .L734
However, the inner loop of the all-image iterator is:
.L696:
movzbl (%rcx), %eax
incq %rdx
leaq 1(%rcx,%rbp), %rcx
xorl %eax, %r10d
.L708:
testq %rdx, %rdx
jne .L696
cmpq %rcx, %rbx
jne .L696
which is a bit more complicated, although it seems remarkable it runs
~15 times slower than the other methods.
What I took away from this:
- Avoid the all-image iterator like the plague (although I don't really
understand how it manages to be quite so spectacularly slow).
- You need to be pretty desperate for performance to convert working
and basically fast enough coordinate-access based code to iterators.
- Compilers can do a pretty nice job with GIL classes. I've used other
image classes which leave far more to run-time (e.g virtual function
calls) and you have to basically "unload" the class information to
pointers and ints and do it all yourself to get performant inner loops.
-----
BOOST_AUTO_TEST_CASE(coord_access_benchmark)
{
unsigned char hash=0;
scoped_timer t("GIL coord access",images().size(),"Megapixels");
for (images_t::const_iterator it=images().begin();it!=images().end();++it)
{
const boost::gil::gray8c_view_t v=boost::gil::const_view(**it);
for (int y=0;y
participants (4)
-
Brett Gmoser
-
Christian Henning
-
Nicolas Lelong
-
Tim Day