[gil::io] Feedback for scanline_read_iterator

Christian Henning

12 Feb 2013 12 Feb '13

1:13 a.m.

Hi all, I'm getting close to finally commit the new io extension for boost::gil. As it was required from the boost review I have implemented a basic image_read_iterator to read large images. I would like to get some feedback on my interface and maybe on the actual implementation, as well. I must admit this is my first iterator that I have designed and implemented from scratch with some help by other people. The latest io code can be found here via svn: http://code.google.com/p/gil-contributions/source/browse/#svn%2Ftrunk%2Fgil_... A basic usage of the code is here: http://pastebin.com/3E7KpfwE Thanks! Christian

Show replies by date

Mateusz Loskot

14 Feb 14 Feb

12:04 p.m.

On 12 February 2013 01:13, Christian Henning <chhenning@gmail.com> wrote:

...

I'm getting close to finally commit the new io extension for boost::gil. As it was required from the boost review I have implemented a basic image_read_iterator to read large images.

Hi Christian, It's great to see it rolling forward.

...

I would like to get some feedback on my interface and maybe on the actual implementation, as well. I must admit this is my first iterator that I have designed and implemented from scratch with some help by other people.

I have update my clone of GIL IOv2 and briefly played with your example a bit [1] using VS 2012 on Windows 8. FYI, your example compiles and runs well. [1] https://github.com/mloskot/boost-gil-workshop A few questions from me: 1) Are we speaking of image_read_iterator or scanline_read_iterator? 2) What is the difference between two templates in two different headers: /boost/gil/extension/io_new/scanline_read_iterator.hpp /boost/gil/extension/io_new/detail/image_read_iterator.hpp 3) scanline_read_iterator is an input iterator, do you also plan output one too? 4) The meat of your example: reader_t reader = make_scanline_reader( file_name, bmp_tag() ); byte_t* buffer = new byte_t[ reader._scanline_length ]; scanline_read_iterator<reader_t> it = scanline_read_iterator<reader_t>(reader, buffer); scanline_read_iterator<reader_t> end = scanline_read_iterator<reader_t>(); int row = 0; while( it != end ) { *it; // ? copy_pixels( interleaved_view(reader._info._width, 1, (image_t::view_t::x_iterator) buffer, reader._scanline_length) , subimage_view(view(dst), 0, row, reader._info._width, 1)); ++row; } a) Initially, I was wondering why not to hide the buffer behind the reader. But, I think the proposed composition of iterator from reader and buffer given separately gives better flexibility. b) reader_t::begin(buffer) could be added, for convenience c) Any particular reason you don't use result of the iterator deference directly? (image_t::view_t::x_iterator) *it d) By the way, have you tested writing the dst into a file? Finally, question not related directly to iterators, but to GIL interface in general. GIL functions require to cast various objects to pointers or iterators, like in the example above: (image_t::view_t::x_iterator) buffer IMO, it's weak link and it's the point where users mostly get into trouble with GIL. Would it be possible to at least document rules for such conversions? What are the basic steps to decide on particular conversion of image buffer or view to certain type of pointer/iterator/location. Obviously, it generally depends on image layout and configuration. The trouble for me so far was that to figure out valid and required conversions, one has to look at concrete types generated, frequently dig deep into GIL code. Shortly, would it be possible to describe systematic way of deciding about these conversions in GIL? Thanks for the good work you do for GIL! Best regards, -- Mateusz Loskot, http://mateusz.loskot.net

Mateusz Loskot

2:42 p.m.

On 14 February 2013 12:04, Mateusz Loskot <mateusz@loskot.net> wrote:

...

On 12 February 2013 01:13, Christian Henning <chhenning@gmail.com> wrote:

...
I'm getting close to finally commit the new io extension for boost::gil. As it was required from the boost review I have implemented a basic image_read_iterator to read large images.

Hi Christian,

It's great to see it rolling forward.

...
I would like to get some feedback on my interface and maybe on the actual implementation, as well. I must admit this is my first iterator that I have designed and implemented from scratch with some help by other people.

I have update my clone of GIL IOv2 and briefly played with your example a bit [1] using VS 2012 on Windows 8. FYI, your example compiles and runs well.

[1] https://github.com/mloskot/boost-gil-workshop

A few questions from me: [...]

5) Have you considered boost::iterator_facade to avoid much of the boilerplate? Best regards, -- Mateusz Loskot, http://mateusz.loskot.net

Christian Henning

15 Feb 15 Feb

10:16 p.m.

...

5) Have you considered boost::iterator_facade to avoid much of the boilerplate?

I have no clue about iterator::facade and I believe I couldn't figure out how to make it work for me. Christian

Dave Abrahams

16 Feb 16 Feb

4:52 a.m.

on Fri Feb 15 2013, Christian Henning <chhenning-AT-gmail.com> wrote:

...

...
5) Have you considered boost::iterator_facade to avoid much of the boilerplate?

I have no clue about iterator::facade and I believe I couldn't figure out how to make it work for me.

Aww, really? It's got an excellent tutorial. http://www.boost.org/doc/libs/1_53_0/libs/iterator/doc/iterator_facade.html#... -- Dave Abrahams

Christian Henning

18 Feb 18 Feb

5:13 p.m.

Hi Dave,

...

Aww, really? It's got an excellent tutorial. http://www.boost.org/doc/libs/1_53_0/libs/iterator/doc/iterator_facade.html#...

I'm not disagreeing that interator_facade has excellent tutorials. I just never really understood how it could help me. It's entirely my fault. Christian

Christian Henning

15 Feb 15 Feb

10:14 p.m.

Hi Mateusz,

...

I have update my clone of GIL IOv2 and briefly played with your example a bit [1] using VS 2012 on Windows 8. FYI, your example compiles and runs well.

[1] https://github.com/mloskot/boost-gil-workshop

That's good to know!

...

A few questions from me:

1) Are we speaking of image_read_iterator or scanline_read_iterator?

scanline_read_iterator. I decided to rename the iterator to a name that represents better the actual behavior. Turns out there are many ways to read an image.

...

2) What is the difference between two templates in two different headers: /boost/gil/extension/io_new/scanline_read_iterator.hpp /boost/gil/extension/io_new/detail/image_read_iterator.hpp

scanline_read_iterator.hpp is obsolete and has been removed.

...

3) scanline_read_iterator is an input iterator, do you also plan output one too?

Not before the release. But logically I don't think there is much interest. What do you think?

...

4) The meat of your example:

reader_t reader = make_scanline_reader( file_name, bmp_tag() ); byte_t* buffer = new byte_t[ reader._scanline_length ]; scanline_read_iterator<reader_t> it = scanline_read_iterator<reader_t>(reader, buffer); scanline_read_iterator<reader_t> end = scanline_read_iterator<reader_t>(); int row = 0; while( it != end ) { *it; // ?

copy_pixels( interleaved_view(reader._info._width, 1, (image_t::view_t::x_iterator) buffer, reader._scanline_length) , subimage_view(view(dst), 0, row, reader._info._width, 1));

++row; }

a) Initially, I was wondering why not to hide the buffer behind the reader. But, I think the proposed composition of iterator from reader and buffer given separately gives better flexibility.

Agreed.

...

b) reader_t::begin(buffer) could be added, for convenience

Mhmm, you think so?

...

c) Any particular reason you don't use result of the iterator deference directly?

(image_t::view_t::x_iterator) *it

I don't need to in my example. The dereference operator returns the buffer which we already have access to.

...

d) By the way, have you tested writing the dst into a file?

I'll add that to my test. Just to be on the save side.

...

Finally, question not related directly to iterators, but to GIL interface in general. GIL functions require to cast various objects to pointers or iterators, like in the example above:

(image_t::view_t::x_iterator) buffer

IMO, it's weak link and it's the point where users mostly get into trouble with GIL. Would it be possible to at least document rules for such conversions?

I agree that gil's documentation is somewhat lacking and that the learning curve is quite steep. On the other side gil does a lot behind the scene and sometimes I find it amazing that it just works! Now I cannot really comment on the reasoning why the initial developers designed parts the way they did. But for x_iterator I believe they did it well. Somehow you have to tell interleaved_view what kind of view you will create. It's also very handy when dealing for instance, with rgb and bgr images and generic algorithms. A next version of gil will be better documented to at least boost standard. That's at least a goal of me.

...

What are the basic steps to decide on particular conversion of image buffer or view to certain type of pointer/iterator/location.

??? You lost me here.

...

Obviously, it generally depends on image layout and configuration. The trouble for me so far was that to figure out valid and required conversions, one has to look at concrete types generated, frequently dig deep into GIL code.

Any example?

...

Shortly, would it be possible to describe systematic way of deciding about these conversions in GIL?

What conversion? For instance there is no conversion when dealing with "compatible" images, like rgb and bgr. Conversion takes place when changing the color space. Thanks again for your insight. Christian

Mateusz Loskot

16 Feb 16 Feb

12:54 a.m.

On 15 February 2013 22:14, Christian Henning <chhenning@gmail.com> wrote:

...

Mateusz Loskot <mateusz@loskot.net> wrote:

...
3) scanline_read_iterator is an input iterator, do you also plan output one too?

Not before the release. But logically I don't think there is much interest. What do you think?

I don't know if there is any interest, but my question was triggered by the symmetry pursuit. It could make a logical addition. Just asking, not expecting it

...

...
b) reader_t::begin(buffer) could be added, for convenience

Mhmm, you think so?

With this suggestion, I hoped to provoke some wider discussion about such interface. Not very common prototype of begin() function, is it? It seems convenient as it hides the "boring" boilerplate, analogously to std::make_pair and other such functions.

...

...
c) Any particular reason you don't use result of the iterator deference directly?

(image_t::view_t::x_iterator) *it

I don't need to in my example. The dereference operator returns the buffer which we already have access to.

Yes, I understand that, but IMHO it makes the purpose of this iterator obscure. You know the internal details of the iterator, so you use the fact that the iterator simply advances the buffer pointer. You simply use the iteration side effect here. But, I think most of users who don't dig how this iterator is implemented, will wonder why the result of derefence is not used. I think, discoverability is important and most users will try to apply the knowledge about iterators they already have got. Is there similar analogy to this in std:: or other commonly used iterators that I've missed?

...

...
Finally, question not related directly to iterators, but to GIL interface in general. GIL functions require to cast various objects to pointers or iterators, like in the example above:

(image_t::view_t::x_iterator) buffer

IMO, it's weak link and it's the point where users mostly get into trouble with GIL. Would it be possible to at least document rules for such conversions?

I agree that gil's documentation is somewhat lacking and that the learning curve is quite steep. On the other side gil does a lot behind the scene and sometimes I find it amazing that it just works!

Now I cannot really comment on the reasoning why the initial developers designed parts the way they did. But for x_iterator I believe they did it well. Somehow you have to tell interleaved_view what kind of view you will create. It's also very handy when dealing for instance, with rgb and bgr images and generic algorithms.

I believe these sequence of posts exchanged during the GIL review explains it quite well, with most important part from the third link being "We do indeed cast the user-supplied channel* to pixel_t*, so that we can construct an interleaved view of the user-supplied memory. " http://lists.boost.org/Archives/boost/2006/10/112053.php http://lists.boost.org/Archives/boost/2006/10/112059.php http://lists.boost.org/Archives/boost/2006/10/112064.php but...

...

...
What are the basic steps to decide on particular conversion of image buffer or view to certain type of pointer/iterator/location.

??? You lost me here.

Simple question users frequently wonder about (mostly on SO, AFAIR): How do I decide in particular case between casting to x_iterator or rgb8_pixel_t*?

...

...
Shortly, would it be possible to describe systematic way of deciding about these conversions in GIL?

What conversion? For instance there is no conversion when dealing with "compatible" images, like rgb and bgr. Conversion takes place when changing the color space.

I meant conversions by casting, a type conversion within C++ domain. Not color/channel/pixel conversions, not raster domain. Best regards, -- Mateusz Loskot, http://mateusz.loskot.net

Christian Henning

11:30 p.m.

Hi Mateusz,

...

...
...
b) reader_t::begin(buffer) could be added, for convenience

Mhmm, you think so?

With this suggestion, I hoped to provoke some wider discussion about such interface. Not very common prototype of begin() function, is it? It seems convenient as it hides the "boring" boilerplate, analogously to std::make_pair and other such functions.

Could you give me some prototype code?

...

...
...
c) Any particular reason you don't use result of the iterator deference directly?

(image_t::view_t::x_iterator) *it

I don't need to in my example. The dereference operator returns the buffer which we already have access to.

Yes, I understand that, but IMHO it makes the purpose of this iterator obscure. You know the internal details of the iterator, so you use the fact that the iterator simply advances the buffer pointer. You simply use the iteration side effect here. But, I think most of users who don't dig how this iterator is implemented, will wonder why the result of derefence is not used.

I think, discoverability is important and most users will try to apply the knowledge about iterators they already have got. Is there similar analogy to this in std:: or other commonly used iterators that I've missed?

I don't wanna create an "obscure" iterator and try my best to make sure it's easy to use and also to behave as expected. I know I'm not an expert in the field but I'm here to learn. Anyway, there are two thing I need to accomplish when reading image and the common iterator interface doesn't seem to support it naturally: 1) read a scanline, via dereference operator 2) don't read the scanline but still move the internal scanline counter, via ++operator I'm open to better ideas!

...

...
I agree that gil's documentation is somewhat lacking and that the learning curve is quite steep. On the other side gil does a lot behind the scene and sometimes I find it amazing that it just works!

Now I cannot really comment on the reasoning why the initial developers designed parts the way they did. But for x_iterator I believe they did it well. Somehow you have to tell interleaved_view what kind of view you will create. It's also very handy when dealing for instance, with rgb and bgr images and generic algorithms.

I believe these sequence of posts exchanged during the GIL review explains it quite well, with most important part from the third link being

"We do indeed cast the user-supplied channel* to pixel_t*, so that we can construct an interleaved view of the user-supplied memory. "

http://lists.boost.org/Archives/boost/2006/10/112053.php http://lists.boost.org/Archives/boost/2006/10/112059.php http://lists.boost.org/Archives/boost/2006/10/112064.php

but...

...
...
What are the basic steps to decide on particular conversion of image buffer or view to certain type of pointer/iterator/location.

??? You lost me here.

Simple question users frequently wonder about (mostly on SO, AFAIR):

How do I decide in particular case between casting to x_iterator or rgb8_pixel_t*?

For rgb8_image_t the x_iterator is rgb8_pixel_t*. x_iterator you use when you don't know the image_t or view_t during compile time. Does that make sense? Regards, Christian

Mateusz Loskot

17 Feb 17 Feb

10:46 p.m.

On 16 February 2013 23:30, Christian Henning <chhenning@gmail.com> wrote:

...

Hi Mateusz,

...
...
...
b) reader_t::begin(buffer) could be added, for convenience

Mhmm, you think so?

With this suggestion, I hoped to provoke some wider discussion about such interface. Not very common prototype of begin() function, is it? It seems convenient as it hides the "boring" boilerplate, analogously to std::make_pair and other such functions.

Could you give me some prototype code?

...

From top of my head, using reader template not necessarily exactly defined in GIL:

template<template Device, template Tag> struct scanline_reader { typedef typename scanline_reader<Device, Tag> reader_type: // deduce accurate type of buffer from reader's Device scanline_read_iterator<reader_type> begin(const char* buffer) { return scanline_read_iterator<reader_type>(*this, buffer); } };

...

...
...
...
c) Any particular reason you don't use result of the iterator deference directly?

(image_t::view_t::x_iterator) *it

I don't need to in my example. The dereference operator returns the buffer which we already have access to.

Yes, I understand that, but IMHO it makes the purpose of this iterator obscure. You know the internal details of the iterator, so you use the fact that the iterator simply advances the buffer pointer. You simply use the iteration side effect here. But, I think most of users who don't dig how this iterator is implemented, will wonder why the result of derefence is not used.

I think, discoverability is important and most users will try to apply the knowledge about iterators they already have got. Is there similar analogy to this in std:: or other commonly used iterators that I've missed?

I don't wanna create an "obscure" iterator and try my best to make sure it's easy to use and also to behave as expected. I know I'm not an expert in the field but I'm here to learn.

In fact, I have tried to provoke some opinions on my "judgement of obscurity" above, if my impression is isolated or others would confirm it.

...

Anyway, there are two thing I need to accomplish when reading image and the common iterator interface doesn't seem to support it naturally:

1) read a scanline, via dereference operator 2) don't read the scanline but still move the internal scanline counter, via ++operator

Sure, it is not uncommon, that advance operation is very 'lightweight', but the real work on data (fetch, copy) occurs during dereference operation. (continuing my other question, not related to the scanline iterator)

...

...
How do I decide in particular case between casting to x_iterator or rgb8_pixel_t*?

For rgb8_image_t the x_iterator is rgb8_pixel_t*. x_iterator you use when you don't know the image_t or view_t during compile time.

In other words, for every type of image or view, x_iterator is a valid pixel iterator. It follows, that for every type of image, this form of cast is always valid: boost::gil::interleaved_view(my_width, my_height, (my_view::x_iterator)pixels, my_rowsize_in_bytes) Then, I see no point in "teaching" users to use hard-wired rgb8_pixel_t* as pixel iterator when the x_iterator is the valid yet generic type of conversion here. And, it does not matter that in scope of example we know that the x_iterator is in fact rgb8_pixel_t*, this knowledge should not be used as we have a generic type available for that purpose.. My point is this: if examples in GIL use explicit type like rgb8_pixel_t*, then in consequence this knowledge is useless for users who work with different type of images. Best regards, -- Mateusz Loskot, http://mateusz.loskot.net

Christian Henning

18 Feb 18 Feb

5:16 p.m.

Hi Mateusz,

...

...
Could you give me some prototype code?

From top of my head, using reader template not necessarily exactly defined in GIL:

template<template Device, template Tag> struct scanline_reader { typedef typename scanline_reader<Device, Tag> reader_type:

// deduce accurate type of buffer from reader's Device scanline_read_iterator<reader_type> begin(const char* buffer) { return scanline_read_iterator<reader_type>(*this, buffer); } };

OK, I can see that would be a valid code addition.

...

My point is this: if examples in GIL use explicit type like rgb8_pixel_t*, then in consequence this knowledge is useless for users who work with different type of images.

Yes, that should be fixed. Christian

Phil Endecott

4:52 p.m.

Hi Christian, Christian Henning wrote:

...

...
3) scanline_read_iterator is an input iterator, do you also plan output one too?

Not before the release. But logically I don't think there is much interest. What do you think?

The example that I originally presented for needing line-by-line access was to read in a very large image and cut it into tiles, without needing RAM proportional to the image size. There is an exact converse case where you might want to combine many smaller images into one larger one, without needing enough RAM for the whole image. Consider, for example, the final step in a panorama-stitching program like hugin: it's entirely reasonable to want to do this on a memory-constrained platform like a camera or phone.

...

...
4) The meat of your example:

reader_t reader = make_scanline_reader( file_name, bmp_tag() ); byte_t* buffer = new byte_t[ reader._scanline_length ]; scanline_read_iterator<reader_t> it = scanline_read_iterator<reader_t>(reader, buffer); scanline_read_iterator<reader_t> end = scanline_read_iterator<reader_t>(); int row = 0; while( it != end ) { *it; // ?

copy_pixels( interleaved_view(reader._info._width, 1, (image_t::view_t::x_iterator) buffer, reader._scanline_length) , subimage_view(view(dst), 0, row, reader._info._width, 1));

++row; }

...

...
c) Any particular reason you don't use result of the iterator deference directly?

I agree that the way the iterator is dereferenced in that code ("*it;") is non-obvious. I can't think of any other case where dereferencing an iterator has such significant side-effects i.e. it writes into an implied buffer and advances itself. More conventional code might look like: buffer_t buffer = ......; iterator begin(.....); iterator end(); for( iterator it = begin; it != end; ++it ) { const buffer_t& b = *it; copy_pixels( .... ); } Here I am explicitly (a) incrementing the iterator, and (b) using the result of the dereference. I think you can do that and still get the same performance. Regards, Phil.

Mateusz Loskot

5:05 p.m.

On 18 February 2013 16:52, Phil Endecott <spam_from_boost_dev@chezphil.org> wrote:

...

Christian Henning wrote:

...
...
3) scanline_read_iterator is an input iterator, do you also plan output one too?

Not before the release. But logically I don't think there is much interest. What do you think?

The example that I originally presented for needing line-by-line access was to read in a very large image and cut it into tiles, without needing RAM proportional to the image size.

Phil, Thanks for bringing this subject. It would be an important use case for myself too. I have assumed, at some point we may achieve this by the following: * add tile_read_iterator which would dereference to tiles as views * use scanline_read_iterator to access lines of current view (a memory of tile data) I'm thinking along the lines of TIFF spec, perhaps similar approach could be based applied to strips-based access combined with scanline_read_iteartor.

...

...
...
c) Any particular reason you don't use result of the iterator deference directly?

I agree that the way the iterator is dereferenced in that code ("*it;") is non-obvious. I can't think of any other case where dereferencing an iterator has such significant side-effects i.e. it writes into an implied buffer and advances itself. More conventional code might look like:

buffer_t buffer = ......; iterator begin(.....); iterator end(); for( iterator it = begin; it != end; ++it ) { const buffer_t& b = *it; copy_pixels( .... ); }

Here I am explicitly (a) incrementing the iterator, and (b) using the result of the dereference. I think you can do that and still get the same performance.

Thanks for completing the picture, I agree with that. Best regards, -- Mateusz Loskot, http://mateusz.loskot.net

Mateusz Loskot

20 Feb 20 Feb

4:24 p.m.

On 18 February 2013 17:05, Mateusz Loskot <mateusz@loskot.net> wrote:

...

On 18 February 2013 16:52, Phil Endecott <spam_from_boost_dev@chezphil.org> wrote:

...
Christian Henning wrote:

...
...
3) scanline_read_iterator is an input iterator, do you also plan output one too?

Not before the release. But logically I don't think there is much interest. What do you think?

The example that I originally presented for needing line-by-line access was to read in a very large image and cut it into tiles, without needing RAM proportional to the image size.

Phil,

Thanks for bringing this subject. It would be an important use case for myself too. I have assumed, at some point we may achieve this by the following:

* add tile_read_iterator which would dereference to tiles as views * use scanline_read_iterator to access lines of current view (a memory of tile data)

I'm thinking along the lines of TIFF spec, perhaps similar approach could be based applied to strips-based access combined with scanline_read_iteartor.

I'm wondering, would that tiling scheme be irrelevant or plain unreasonable? I'd appreciate critique on these ideas Would anyone? Best regards, -- Mateusz Loskot, http://mateusz.loskot.net

Christian Henning

18 Feb 18 Feb

5:31 p.m.

Hi Phil,

...

...
...
3) scanline_read_iterator is an input iterator, do you also plan output one too?

Not before the release. But logically I don't think there is much interest. What do you think?

The example that I originally presented for needing line-by-line access was to read in a very large image and cut it into tiles, without needing RAM proportional to the image size. There is an exact converse case where you might want to combine many smaller images into one larger one, without needing enough RAM for the whole image. Consider, for example, the final step in a panorama-stitching program like hugin: it's entirely reasonable to want to do this on a memory-constrained platform like a camera or phone.

You bring up a valid point and now I agree there should be scaline_write_iterator.

...

I agree that the way the iterator is dereferenced in that code ("*it;") is non-obvious. I can't think of any other case where dereferencing an iterator has such significant side-effects i.e. it writes into an implied buffer and advances itself. More conventional code might look like:

buffer_t buffer = ......; iterator begin(.....); iterator end(); for( iterator it = begin; it != end; ++it ) { const buffer_t& b = *it; copy_pixels( .... ); }

OK, my code sample wasn't clear. The dereference operator does return a reference to the buffer which I ignored in my code.

...

Here I am explicitly (a) incrementing the iterator, and (b) using the result of the dereference. I think you can do that and still get the same performance.

Mhmm, the ++operator will skip a scanline which is very format specific and the dereference operator will read a scanline which of course is also very format specific. So I guess your for loop you presented above wont work as expected with my current implementation. The difference to a "normal" iterator is that I have two distinct functionalities that I need to squeeze into the standard iterator interface. I would like to hear your opinion on how this could be accomplished! Thanks, Christian

Phil Endecott

8:02 p.m.

Christian Henning wrote:

...

The difference to a "normal" iterator is that I have two distinct functionalities that I need to squeeze into the standard iterator interface.

I don't follow; what two functionalities?

...

I would like to hear your opinion on how this could be accomplished!

I imagine your line-by-line reading iterator would look something like (PSEUDO-CODE!!!): class line_by_line_iterator { buffer_t& buffer; public: const buffer_t& operator*() const { return buffer; } void operator++() { format_specific_read_one_line_into(buffer); } }; I'd do it using iterator_facade. A case that this fails to manage efficiently is when your format-specific implementation has an efficient way to skip forward without actually decoding. This is not something that I've ever had to worry about, but I think you could do it easily enough by adding a flag and doing the read lazily in operator*. Regards, Phil.

Christian Henning

8:46 p.m.

Hi Phil,

...

I don't follow; what two functionalities?

1) read scanline 2) skip scanline

...

I imagine your line-by-line reading iterator would look something like (PSEUDO-CODE!!!):

class line_by_line_iterator { buffer_t& buffer;

public: const buffer_t& operator*() const { return buffer; } void operator++() { format_specific_read_one_line_into(buffer); } };

I'd do it using iterator_facade.

A case that this fails to manage efficiently is when your format-specific implementation has an efficient way to skip forward without actually decoding. This is not something that I've ever had to worry about, but I think you could do it easily enough by adding a flag and doing the read lazily in operator*.

you mean pass a boolean into operator*? Thanks, Christian

Nathan Ridge

8:56 p.m.

...

...
I imagine your line-by-line reading iterator would look something like (PSEUDO-CODE!!!):

class line_by_line_iterator { buffer_t& buffer;

public: const buffer_t& operator*() const { return buffer; } void operator++() { format_specific_read_one_line_into(buffer); } };

I'd do it using iterator_facade.

A case that this fails to manage efficiently is when your format-specific implementation has an efficient way to skip forward without actually decoding. This is not something that I've ever had to worry about, but I think you could do it easily enough by adding a flag and doing the read lazily in operator*.

you mean pass a boolean into operator*?

Surely not, as the signature of an overloaded operator is fixed :) I think Phil means rather than do the read in operator++, just set a flag in operator++ that the read should be done, and actually do it in operator* (which then clears the flag). Then if someone calls operator++ again without calling operator* (which you can detect by the flag being set in operator++), you can do a skip in operator++, and thus avoid decoding the line you didn't need. Regards, Nate

Christian Henning

11:33 p.m.

Hi Nathan,

...

...
you mean pass a boolean into operator*?

Surely not, as the signature of an overloaded operator is fixed :)

I think Phil means rather than do the read in operator++, just set a flag in operator++ that the read should be done, and actually do it in operator* (which then clears the flag). Then if someone calls operator++ again without calling operator* (which you can detect by the flag being set in operator++), you can do a skip in operator++, and thus avoid decoding the line you didn't need.

Interesting idea. I'll update the code. Christian

Christian Henning

19 Feb 19 Feb

12:23 a.m.

...

...
I think Phil means rather than do the read in operator++, just set a flag in operator++ that the read should be done, and actually do it in operator* (which then clears the flag). Then if someone calls operator++ again without calling operator* (which you can detect by the flag being set in operator++), you can do a skip in operator++, and thus avoid decoding the line you didn't need.

Interesting idea. I'll update the code.

Actually that doesn't work. How would the user signal to just skip a scanline? I cannot add a parameter to operator++. scanline_read_iterator< Reader >& operator++( bool read_scanline = true ) { //to something } Christian

Jeffrey Lee Hellrung, Jr.

12:50 a.m.

On Mon, Feb 18, 2013 at 4:23 PM, Christian Henning <chhenning@gmail.com>wrote:

...

...
...
I think Phil means rather than do the read in operator++, just set a flag in operator++ that the read should be done, and actually do it in operator* (which then clears the flag). Then if someone calls operator++ again without calling operator* (which you can detect by the flag being set in operator++), you can do a skip in operator++, and thus avoid decoding the line you didn't need.

Interesting idea. I'll update the code.

Actually that doesn't work. How would the user signal to just skip a scanline? I cannot add a parameter to operator++.

scanline_read_iterator< Reader >& operator++( bool read_scanline = true ) { //to something }

The "do something" is in operator*, no? - Jeff

Christian Henning

12:54 a.m.

...

...
Actually that doesn't work. How would the user signal to just skip a scanline? I cannot add a parameter to operator++.

scanline_read_iterator< Reader >& operator++( bool read_scanline = true ) { //to something }

The "do something" is in operator*, no?

Correct. I changed the code now so that the user will have to skip() to jump over scanlines. I think that's a good solution. If the user doesn't call skip the current scanline will be read. Christian

Jeffrey Lee Hellrung, Jr.

12:58 a.m.

On Mon, Feb 18, 2013 at 4:54 PM, Christian Henning <chhenning@gmail.com>wrote:

...

...
...
Actually that doesn't work. How would the user signal to just skip a scanline? I cannot add a parameter to operator++.

scanline_read_iterator< Reader >& operator++( bool read_scanline = true ) { //to something }

The "do something" is in operator*, no?

Correct. I changed the code now so that the user will have to skip() to jump over scanlines. I think that's a good solution. If the user doesn't call skip the current scanline will be read.

So...doing 2 operator++'s in a row isn't logically equivalent? - Jeff

Christian Henning

1:03 a.m.

...

So...doing 2 operator++'s in a row isn't logically equivalent?

Right now, operator++ only sets the _read_scanline flag. So, doing 2 operator++ wont do anything different. Do have something in mind? Here is a link to the current version. http://svn.boost.org/svn/boost/trunk/boost/gil/extension/io/detail/scanline_... Christian

Christian Henning

1:04 a.m.

On Mon, Feb 18, 2013 at 8:03 PM, Christian Henning <chhenning@gmail.com> wrote:

...

...
So...doing 2 operator++'s in a row isn't logically equivalent?

Right now, operator++ only sets the _read_scanline flag. So, doing 2 operator++ wont do anything different.

Do have something in mind?

Here is a link to the current version.

http://svn.boost.org/svn/boost/trunk/boost/gil/extension/io/detail/scanline_...

Here is a link to my test code http://pastebin.com/6HnS9nc9

Jeffrey Lee Hellrung, Jr.

1:46 a.m.

On Mon, Feb 18, 2013 at 5:03 PM, Christian Henning <chhenning@gmail.com>wrote:

...

...
So...doing 2 operator++'s in a row isn't logically equivalent?

Right now, operator++ only sets the _read_scanline flag. So, doing 2 operator++ wont do anything different.

Do have something in mind?

Here is a link to the current version.

http://svn.boost.org/svn/boost/trunk/boost/gil/extension/io/detail/scanline_...

My first reaction is, it doesn't look like the copy constructor will do the right thing (or is it okay to do _reader->clean_up() multiple times?). Secondly, the behavior of operator* and operator++ doesn't read right, but I'm not intimately familiar with what the implementation is suppose to do. Here they are (simplified and reformatted): reference operator*() { if( _read_scanline ) { _reader->read( _buffer, _pos ); } else { _reader->skip( _buffer, _pos ); } ++_pos; _read_scanline = false; return _buffer; } /// Pre-Increment Operator scanline_read_iterator& operator++() { _read_scanline = true; return *this; } So, operator* effectively advances the iterator, while operator++...doesn't? Huh? (Do I have that transcribed correctly?) What I *think* you want is for operator++ to conditionally call _reader->skip (if the previous operator++ had not been followed by a operator*), while operator* should conditionally call _reader->read (depending on if there had been an intervening operator++ after the previous operator*). So, I think you'd need 2 boolean flags. Does that make sense? (I might objectively not be making sense, I'm not sure.) - Jeff

Christian Henning

2:01 a.m.

...

My first reaction is, it doesn't look like the copy constructor will do the right thing (or is it okay to do _reader->clean_up() multiple times?).

clean_up is format specific and so it could call a 3rd party library function. Better not to allow multiple calls.

...

So, operator* effectively advances the iterator, while operator++...doesn't? Huh? (Do I have that transcribed correctly?)

I fixed that already operator++ is now solely advancing the internal counter.

...

What I *think* you want is for operator++ to conditionally call _reader->skip (if the previous operator++ had not been followed by a operator*), while operator* should conditionally call _reader->read (depending on if there had been an intervening operator++ after the previous operator*). So, I think you'd need 2 boolean flags.

I need to sleep over this. Skipping and reading are both format specific ( jpeg for instance still reads when skipping ) and so I need to know when to read and when to skip. Just increasing the counter doesn't do it. So, you might be right with suggesting that two boolean flags are needed. Thanks a lot for your insight! Christian

Christian Henning

20 Feb 20 Feb

1:33 a.m.

...

What I *think* you want is for operator++ to conditionally call _reader->skip (if the previous operator++ had not been followed by a operator*), while operator* should conditionally call _reader->read (depending on if there had been an intervening operator++ after the previous operator*). So, I think you'd need 2 boolean flags.

Does that make sense? (I might objectively not be making sense, I'm not sure.)

I think your thinking makes sense! I have updated the code here: http://svn.boost.org/svn/boost/trunk/boost/gil/extension/io/detail/scanline_... If you could have a look that would be very kind. Thanks, Christian

Jeffrey Lee Hellrung, Jr.

4:15 a.m.

On Tue, Feb 19, 2013 at 5:33 PM, Christian Henning <chhenning@gmail.com>wrote:

...

...
What I *think* you want is for operator++ to conditionally call _reader->skip (if the previous operator++ had not been followed by a operator*), while operator* should conditionally call _reader->read (depending on if there had been an intervening operator++ after the previous operator*). So, I think you'd need 2 boolean flags.

Does that make sense? (I might objectively not be making sense, I'm not sure.)

I think your thinking makes sense! I have updated the code here:

http://svn.boost.org/svn/boost/trunk/boost/gil/extension/io/detail/scanline_...

If you could have a look that would be very kind.

I had a quick look, and I immediately noticed that you (still, I guess?) are incrementing the _pos member upon dereferencing the first time...is there some reason you don't increment it unconditionally in operator++? I'm guessing the end result is the same, given the pair of boolean flags encoding the state, but it's a "code surprise", which increases the cognitive overhead to read it :) - Jeff

Jeffrey Lee Hellrung, Jr.

4:40 a.m.

On Feb 19, 2013 8:15 PM, "Jeffrey Lee Hellrung, Jr." < jeffrey.hellrung@gmail.com> wrote:

...

On Tue, Feb 19, 2013 at 5:33 PM, Christian Henning <chhenning@gmail.com>

...

...
...
What I *think* you want is for operator++ to conditionally call _reader->skip (if the previous operator++ had not been followed by a operator*), while operator* should conditionally call _reader->read (depending on if there had been an intervening operator++ after the previous operator*). So, I think you'd need 2 boolean flags.

Does that make sense? (I might objectively not be making sense, I'm not sure.)

I think your thinking makes sense! I have updated the code here:

http://svn.boost.org/svn/boost/trunk/boost/gil/extension/io/detail/scanline_...

...
If you could have a look that would be very kind.

I had a quick look, and I immediately noticed that you (still, I guess?) are incrementing the _pos member upon dereferencing the first time...is

wrote: there some reason you don't increment it unconditionally in operator++? I'm guessing the end result is the same, given the pair of boolean flags encoding the state, but it's a "code surprise", which increases the cognitive overhead to read it :)

...

- Jeff

Oh, and after thinking about this a little, effectively incrementing your position within operator* will make operator== behave strangely :/ - Jeff

Christian Henning

3:14 p.m.

Hi Jeff,

...

...
I had a quick look, and I immediately noticed that you (still, I guess?)

are incrementing the _pos member upon dereferencing the first time...is there some reason you don't increment it unconditionally in operator++? I'm guessing the end result is the same, given the pair of boolean flags encoding the state, but it's a "code surprise", which increases the cognitive overhead to read it :)

...
Oh, and after thinking about this a little, effectively incrementing your position within operator* will make operator== behave strangely :/

I don't remember why I was thinking that increasing the pos inside the operator* was a good idea. I have to sit down at home tonight or tomorrow and go over my tests again. Will let you know when I have the next version. Thanks again for your insight, Christian

Christian Henning

22 Feb 22 Feb

1:09 a.m.

...

I had a quick look, and I immediately noticed that you (still, I guess?) are incrementing the _pos member upon dereferencing the first time...is there some reason you don't increment it unconditionally in operator++? I'm guessing the end result is the same, given the pair of boolean flags encoding the state, but it's a "code surprise", which increases the cognitive overhead to read it :)

This issue has been fixed. I don't need to increase the position anymore in the operator*. Thanks! Christian

Mateusz Loskot

19 Feb 19 Feb

1:13 a.m.

AFAIU, the proposal is to support equivalent to this: scanline_read_iterator it =... std::advance (it, 7); // no I/O performed auto& buf = *it; // fetch 7th scanline data now -- Mateusz Loskot, http://mateusz.loskot.net (Sent from mobile, apology for top-posting or broken quotes) On Feb 19, 2013 12:54 AM, "Christian Henning" <chhenning@gmail.com> wrote:

...

...
...
Actually that doesn't work. How would the user signal to just skip a scanline? I cannot add a parameter to operator++.

scanline_read_iterator< Reader >& operator++( bool read_scanline = true ) { //to something }

The "do something" is in operator*, no?

Correct. I changed the code now so that the user will have to skip() to jump over scanlines. I think that's a good solution. If the user doesn't call skip the current scanline will be read.

Christian

_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

Christian Henning

1:23 a.m.

On Mon, Feb 18, 2013 at 8:13 PM, Mateusz Loskot <mateusz@loskot.net> wrote:

...

AFAIU, the proposal is to support equivalent to this:

scanline_read_iterator it =... std::advance (it, 7); // no I/O performed auto& buf = *it; // fetch 7th scanline data now

Ok, I just need to increase the position inside operator++. But this brings up an interesting point. What would be the difference_type of a scanline_read_iterator? Christian

Jeffrey Lee Hellrung, Jr.

1:49 a.m.

On Mon, Feb 18, 2013 at 5:23 PM, Christian Henning <chhenning@gmail.com>wrote:

...

On Mon, Feb 18, 2013 at 8:13 PM, Mateusz Loskot <mateusz@loskot.net> wrote:

...
AFAIU, the proposal is to support equivalent to this:

scanline_read_iterator it =... std::advance (it, 7); // no I/O performed auto& buf = *it; // fetch 7th scanline data now

Ok, I just need to increase the position inside operator++. But this brings up an interesting point. What would be the difference_type of a scanline_read_iterator?

_pos is an int (presently), but I'm guessing it can only take nonnegative values (other than the sentinel -1 value)? If so, int as the difference type should be sufficient, right? - Jeff

Michael Marcin

20 Feb 20 Feb

4:36 p.m.

Jeffrey Lee Hellrung, Jr. wrote:

...

On Mon, Feb 18, 2013 at 5:23 PM, Christian Henning <chhenning@gmail.com>wrote:

...
On Mon, Feb 18, 2013 at 8:13 PM, Mateusz Loskot <mateusz@loskot.net> wrote:

...
AFAIU, the proposal is to support equivalent to this:

scanline_read_iterator it =... std::advance (it, 7); // no I/O performed auto& buf = *it; // fetch 7th scanline data now

Ok, I just need to increase the position inside operator++. But this brings up an interesting point. What would be the difference_type of a scanline_read_iterator?

_pos is an int (presently), but I'm guessing it can only take nonnegative values (other than the sentinel -1 value)? If so, int as the difference type should be sufficient, right?

I'm not sure exactly what _reader->clean_up(); does but the destructor seems likely wrong. If you make a copy of the iterator and the original or the copy gets destructed it calls clean_up while the other copy may still be in use. Shouldn't clean_up just be part of the scanline_reader destructor and not the iterator? The set_reader and set_buffer member functions seem unnecessary and confuse things. I think the checks at the top of the dereference operator are unnecessary. You're reporting programming errors with exceptions. if( _reader == NULL ) { io_error( "Reader cannot be null for this operation." ); } if( _buffer == NULL ) { io_error( "Buffer cannot be null for this operation." ); } Same in increase_pos if( this->_reader == NULL ) { io_error("Reader cannot be null for this operation."); } The process of generating the buffer and supplying it to the iterator feels strange to me but I could be wrong. There is an equal member function that has nothing to do with operator==. Seems like equal should just go away. I don't see why end has to be constructed in that manner. It would seem to me that an end iterator constructed with the reader makes the implementation a whole lot simpler. I would prefer an interface where you just call reader.begin() and reader.end() to get your iterators. You'll no longer need a sentinel value for past-the-end, _pos could be initialized to reader->_info._height. Speaking of which is _info._height part of the Reader concept? Is the Reader concept documented anywhere I couldn't find it but I only did a cursory search. I don't really understand the second parameter to _reader->skip and _reader->read. It seems to be ignored for everything except the TIFF reader. Which leads me to believe that if TIFF reader needs to keep track of its current row it should do so internally as all the other readers track their internally (seemingly implicitly). Also I'm not sure but given the interface it seems that TIFF skip could maybe be a noop if the row is passed to read still or just a simple increment of an internal row counter. Note also that I think the requirement on InputIterator concept that Postincrement and dereference (*i++) be valid forces lazy evaluation. If you did eager reading/skipping somehow in increment you'd break that expression. Here is a sketch of a possible implementation. http://codepad.org/9hjn3SpQ

Michael Marcin

21 Feb 21 Feb

2:36 a.m.

Michael Marcin wrote:

...

I'm not sure exactly what _reader->clean_up(); does but the destructor seems likely wrong. If you make a copy of the iterator and the original or the copy gets destructed it calls clean_up while the other copy may still be in use. Shouldn't clean_up just be part of the scanline_reader destructor and not the iterator?

The set_reader and set_buffer member functions seem unnecessary and confuse things.

I think the checks at the top of the dereference operator are unnecessary. You're reporting programming errors with exceptions.

if( _reader == NULL ) { io_error( "Reader cannot be null for this operation." ); } if( _buffer == NULL ) { io_error( "Buffer cannot be null for this operation." ); }

Same in increase_pos if( this->_reader == NULL ) { io_error("Reader cannot be null for this operation."); }

The process of generating the buffer and supplying it to the iterator feels strange to me but I could be wrong.

There is an equal member function that has nothing to do with operator==. Seems like equal should just go away.

I don't see why end has to be constructed in that manner. It would seem to me that an end iterator constructed with the reader makes the implementation a whole lot simpler. I would prefer an interface where you just call reader.begin() and reader.end() to get your iterators. You'll no longer need a sentinel value for past-the-end, _pos could be initialized to reader->_info._height.

Speaking of which is _info._height part of the Reader concept? Is the Reader concept documented anywhere I couldn't find it but I only did a cursory search.

I don't really understand the second parameter to _reader->skip and _reader->read. It seems to be ignored for everything except the TIFF reader. Which leads me to believe that if TIFF reader needs to keep track of its current row it should do so internally as all the other readers track their internally (seemingly implicitly). Also I'm not sure but given the interface it seems that TIFF skip could maybe be a noop if the row is passed to read still or just a simple increment of an internal row counter.

Note also that I think the requirement on InputIterator concept that Postincrement and dereference (*i++) be valid forces lazy evaluation. If you did eager reading/skipping somehow in increment you'd break that expression.

Here is a sketch of a possible implementation. http://codepad.org/9hjn3SpQ

I don't particularly love the class name. An istream_iterator iterates over the elements of the stream. A directory_iterator iterates over the contents of a directory. A vector::iterator iterates over the elements of the vector. A scanline_read_iterator iterates over the scanlines of an image. One might reasonably expect a scanline_read_iterator to iterate over the elements of a scanline. I could be wrong and I don't really have a better name aside from maybe scanline_reader::iterator. (reader_t::iterator in your example code) Also scanline_read_iterator.hpp is in the detail folder. If this class isn't meant for public consumption completely ignore my naming concerns.

Christian Henning

2:41 p.m.

Hi Michael,

...

I don't particularly love the class name.

An istream_iterator iterates over the elements of the stream. A directory_iterator iterates over the contents of a directory. A vector::iterator iterates over the elements of the vector.

A scanline_read_iterator iterates over the scanlines of an image.

One might reasonably expect a scanline_read_iterator to iterate over the elements of a scanline.

I could be wrong and I don't really have a better name aside from maybe scanline_reader::iterator. (reader_t::iterator in your example code)

Also scanline_read_iterator.hpp is in the detail folder. If this class isn't meant for public consumption completely ignore my naming concerns.

Thanks a lot for your insight! I hope I'll some time tonight to address all your points. Christian

Christian Henning

22 Feb 22 Feb

2:27 a.m.

...

I don't particularly love the class name.

An istream_iterator iterates over the elements of the stream. A directory_iterator iterates over the contents of a directory. A vector::iterator iterates over the elements of the vector.

A scanline_read_iterator iterates over the scanlines of an image.

One might reasonably expect a scanline_read_iterator to iterate over the elements of a scanline.

I choose that name since I can imagine several types of image iterators. Like a tile_read_iterator, multiple_scanline_read_iterator, etc.

...

I could be wrong and I don't really have a better name aside from maybe scanline_reader::iterator. (reader_t::iterator in your example code)

Reader class now has a typedef iterator_t. But still, what would I call the scanline_read_iterator class?

...

Also scanline_read_iterator.hpp is in the detail folder. If this class isn't meant for public consumption completely ignore my naming concerns.

Users are meant to include format specific header, like bmp_all.hpp, or tiff_write.hpp, or png_read.hpp. So, I put all the other headers in the detail folder even so some of them are not inside the detail namespace. Christian

Christian Henning

2:22 a.m.

Hi Michael,

...

I'm not sure exactly what _reader->clean_up(); does but the destructor seems likely wrong. If you make a copy of the iterator and the original or the copy gets destructed it calls clean_up while the other copy may still be in use. Shouldn't clean_up just be part of the scanline_reader destructor and not the iterator?

I have removed clean_up. It was not used by any image format implementation.

...

The set_reader and set_buffer member functions seem unnecessary and confuse things.

Done.

...

I think the checks at the top of the dereference operator are unnecessary. You're reporting programming errors with exceptions.

if( _reader == NULL ) { io_error( "Reader cannot be null for this operation." ); } if( _buffer == NULL ) { io_error( "Buffer cannot be null for this operation." ); }

Same in increase_pos if( this->_reader == NULL ) { io_error("Reader cannot be null for this operation."); }

Done.

...

The process of generating the buffer and supplying it to the iterator feels strange to me but I could be wrong.

I moved the buffer creation internally.

...

There is an equal member function that has nothing to do with operator==. Seems like equal should just go away.

OK.

...

I don't see why end has to be constructed in that manner. It would seem to me that an end iterator constructed with the reader makes the implementation a whole lot simpler. I would prefer an interface where you just call reader.begin() and reader.end() to get your iterators. You'll no longer need a sentinel value for past-the-end, _pos could be initialized to reader->_info._height.

The readers now have a begin() and end() members.

...

Speaking of which is _info._height part of the Reader concept? Is the Reader concept documented anywhere I couldn't find it but I only did a cursory search.

No there is no Reader concept. What do you have in mind?

...

I don't really understand the second parameter to _reader->skip and _reader->read. It seems to be ignored for everything except the TIFF reader. Which leads me to believe that if TIFF reader needs to keep track of its current row it should do so internally as all the other readers track their internally (seemingly implicitly). Also I'm not sure but given the interface it seems that TIFF skip could maybe be a noop if the row is passed to read still or just a simple increment of an internal row counter.

Skipping can be different for different formats. jpeg for instance still reads the scanline when skipping. For now, I think, we are good.

...

Note also that I think the requirement on InputIterator concept that Postincrement and dereference (*i++) be valid forces lazy evaluation. If you did eager reading/skipping somehow in increment you'd break that expression.

Yes, the postincrement operator is missing. I'll have to look up how it's usually implemented.

...

Here is a sketch of a possible implementation. http://codepad.org/9hjn3SpQ

I see you're using iterator_facade. I was a little afraid of using it since I have never used it before. What do you think of the updated implementation. I think we are getting closer to what we want! If you have the time, would you mind and have another look? Thank you very much for your insight. It's been very helpful! Regards, Christian

Jeffrey Lee Hellrung, Jr.

2:36 a.m.

On Thu, Feb 21, 2013 at 6:22 PM, Christian Henning <chhenning@gmail.com>wrote:

...

Hi Michael,

[...]

...

...
Note also that I think the requirement on InputIterator concept that Postincrement and dereference (*i++) be valid forces lazy evaluation. If you did eager reading/skipping somehow in increment you'd break that expression.

Yes, the postincrement operator is missing. I'll have to look up how it's usually implemented.

...
Here is a sketch of a possible implementation. http://codepad.org/9hjn3SpQ

I see you're using iterator_facade. I was a little afraid of using it since I have never used it before.

If you use iterator_facade, you wouldn't have to worry about defining the post-increment operator :) I don't think iterator_facade is difficult to read up on, it's just using CRTP to define some of the more boilerplate functions of an iterator in terms of others, which you define. E.g., the post-increment operator is implemented in terms of the pre-increment operator, the latter of which you define. operator!= is defined in terms of operator== for you as well. - Jeff

Michael Marcin

5:59 p.m.

New subject: [boost] [gil::io] Feedback for scanline_read_iterator

Christian Henning <chhenning <at> gmail.com> writes:

...

...
The process of generating the buffer and supplying it to the iterator feels strange to me but I could be wrong.

I moved the buffer creation internally.

I think this might be even worse unfortunately, if the iterator is copied, such as with (*i++) it's going to make a full copy of the buffer. Now that readers have a begin and end maybe the reader could own the buffer?

...

Skipping can be different for different formats. jpeg for instance still reads the scanline when skipping. For now, I think, we are good.

Sure that makes sense, but the interface lies IMO. If I have jpeg and I call skip( buffer, 0 ) it's not going to be skipping over scanline 0 (unless it happens to be the first call to read/skip), it's just skipping forward one scanline, the second parameter is unused. Similarly the TIFF reader seems to read skipped scanlines unnecessarily as skip forwards to read but the TIFF interface supports reading any scanline you want directly with TIFFReadScanline.

...

...
Note also that I think the requirement on InputIterator concept that Postincrement and dereference (*i++) be valid forces lazy evaluation. If you did eager reading/skipping somehow in increment you'd break that expression.

Yes, the postincrement operator is missing. I'll have to look up how it's usually implemented.

...
Here is a sketch of a possible implementation. http://codepad.org/9hjn3SpQ

I see you're using iterator_facade. I was a little afraid of using it since I have never used it before.

iterator_facade is super duper simple. You just implement a handful of private member functions, specify the appropriate iterator tag and it generates the rest of the iterator for you with no hassles. I highly recommend trying it.

...

What do you think of the updated implementation. I think we are getting closer to what we want! If you have the time, would you mind and have another look?

I'm not confident your increment operator is correct. I don't think a correct increment can call skip/read for an input_iterator and yours can still call skip. I think this is a better implementation (recreated here with more comments and without the iterator_facade so as not to confuse anything) scanline_read_iterator( Reader& reader ) : _pos( 0 ) , _last_pos( -1 ) { } scanline_read_iterator< Reader >& operator++() { // simply advance _pos, that's all we need to do ++_pos; return (*this); } reference operator*() { // if this is the first dereference, or we've advanced // pos since the last dereference we have work to do // otherwise buffer already contains the scanline data for _pos if ( _last_pos != _pos ) { // advance last_pos skipping any lines until we reach _pos // if we reach _pos after one increment we don't need to do // any skipping and we can just read as it's the next scanline while ( ++_last_pos != _pos ) { _reader->skip( _buffer, _last_pos ); } _reader->read( _buffer, _pos ); } return _buffer; }

Christian Henning

24 Feb 24 Feb

9:07 p.m.

Hi Michael,

...

...
I moved the buffer creation internally.

I think this might be even worse unfortunately, if the iterator is copied, such as with (*i++) it's going to make a full copy of the buffer. Now that readers have a begin and end maybe the reader could own the buffer?

I see two ways, apart from moving the buffer into the reader. I could use a shared ptr or the user has to pass the buffer into the constructor. I lean towards the shared pointer. Do you have an opinion?

...

...
Skipping can be different for different formats. jpeg for instance still reads the scanline when skipping. For now, I think, we are good.

Sure that makes sense, but the interface lies IMO. If I have jpeg and I call skip( buffer, 0 ) it's not going to be skipping over scanline 0 (unless it happens to be the first call to read/skip), it's just skipping forward one scanline, the second parameter is unused.

Similarly the TIFF reader seems to read skipped scanlines unnecessarily as skip forwards to read but the TIFF interface supports reading any scanline you want directly with TIFFReadScanline.

As far as I can tell tiff needs the row number to be read and also tiff rows cannot be read in random order. Jpeg also needs to be read in order which means there is no skipping. Now, because of that and to ensure a common interface among all scanline_readers I need to pass the pos into the skip function.

...

iterator_facade is super duper simple. You just implement a handful of private member functions, specify the appropriate iterator tag and it generates the rest of the iterator for you with no hassles. I highly recommend trying it.

Ok, I agree it's foolish of me not to make use of interator_facade. I have committed the new version. Thanks again for your insight. Christian

Jeffrey Lee Hellrung, Jr.

25 Feb 25 Feb

1:01 a.m.

On Sun, Feb 24, 2013 at 1:07 PM, Christian Henning <chhenning@gmail.com>wrote:

...

Hi Michael,

...
...
I moved the buffer creation internally.

I think this might be even worse unfortunately, if the iterator is copied, such as with (*i++) it's going to make a full copy of the buffer. Now that readers have a begin and end maybe the reader could own the buffer?

I see two ways, apart from moving the buffer into the reader. I could use a shared ptr or the user has to pass the buffer into the constructor. I lean towards the shared pointer. Do you have an opinion?

Does that mean all copies of an iterator share the same buffer? Won't that screw things up if iterators which share the same buffer are dereferenced from different positions in the image? Sounds like the thing to do is to create the buffer on-demand upon dereferencing (and retain it for further dereferences until the iterator is destroyed), and when an iterator is copied, the buffer isn't. Or, maybe having a shared buffer is okay if your iterators are only advertised as input iterators or (equivalently, I think) readable + incrementable. - Jeff - Jeff

Michael Marcin

5:22 a.m.

Jeffrey Lee Hellrung, Jr. wrote:

...

On Sun, Feb 24, 2013 at 1:07 PM, Christian Henning <chhenning@gmail.com>wrote:

...
Hi Michael,

...
...
I moved the buffer creation internally.

I think this might be even worse unfortunately, if the iterator is copied, such as with (*i++) it's going to make a full copy of the buffer. Now that readers have a begin and end maybe the reader could own the buffer?

I see two ways, apart from moving the buffer into the reader. I could use a shared ptr or the user has to pass the buffer into the constructor. I lean towards the shared pointer. Do you have an opinion?

Does that mean all copies of an iterator share the same buffer? Won't that screw things up if iterators which share the same buffer are dereferenced from different positions in the image?

I don't think that is a valid use case for an InputIterator.

Michael Marcin

5:25 a.m.

Michael Marcin wrote:

...

Jeffrey Lee Hellrung, Jr. wrote:

...
On Sun, Feb 24, 2013 at 1:07 PM, Christian Henning <chhenning@gmail.com>wrote:

...
Hi Michael,

...
...
I moved the buffer creation internally.

I think this might be even worse unfortunately, if the iterator is copied, such as with (*i++) it's going to make a full copy of the buffer. Now that readers have a begin and end maybe the reader could own the buffer?

I see two ways, apart from moving the buffer into the reader. I could use a shared ptr or the user has to pass the buffer into the constructor. I lean towards the shared pointer. Do you have an opinion?

Does that mean all copies of an iterator share the same buffer? Won't that screw things up if iterators which share the same buffer are dereferenced from different positions in the image?

I don't think that is a valid use case for an InputIterator.

Jeffrey Lee Hellrung, Jr. wrote:

...

Or, maybe having a shared buffer is okay if your iterators are only advertised as input iterators or (equivalently, I think) readable + incrementable.

Ooops somhow missed this. Yes I believe it is an InputIterator.

Jeffrey Lee Hellrung, Jr.

5:50 a.m.

On Sun, Feb 24, 2013 at 9:25 PM, Michael Marcin <mike.marcin@gmail.com>wrote:

...

Michael Marcin wrote:

...
Jeffrey Lee Hellrung, Jr. wrote:

...
On Sun, Feb 24, 2013 at 1:07 PM, Christian Henning <chhenning@gmail.com>wrote:

Hi Michael,

...
I moved the buffer creation internally.

...
...
I think this might be even worse unfortunately, if the iterator is

copied, such

...
as with (*i++) it's going to make a full copy of the buffer. Now that

readers

...
have a begin and end maybe the reader could own the buffer?

I see two ways, apart from moving the buffer into the reader. I could use a shared ptr or the user has to pass the buffer into the constructor. I lean towards the shared pointer. Do you have an opinion?

Does that mean all copies of an iterator share the same buffer? Won't that screw things up if iterators which share the same buffer are dereferenced from different positions in the image?

I don't think that is a valid use case for an InputIterator.

Jeffrey Lee Hellrung, Jr. wrote:

...
Or, maybe having a shared buffer is okay if your iterators are only advertised as input iterators or (equivalently, I think) readable + incrementable.

Ooops somhow missed this.

Yes I believe it is an InputIterator.

Yeah that's the category that makes the most sense. - Jeff

Mateusz Loskot

1:17 a.m.

On 24 February 2013 21:07, Christian Henning <chhenning@gmail.com> wrote:

...

...
...
Skipping can be different for different formats. jpeg for instance still reads the scanline when skipping. For now, I think, we are good.

Sure that makes sense, but the interface lies IMO. If I have jpeg and I call skip( buffer, 0 ) it's not going to be skipping over scanline 0 (unless it happens to be the first call to read/skip), it's just skipping forward one scanline, the second parameter is unused.

Similarly the TIFF reader seems to read skipped scanlines unnecessarily as skip forwards to read but the TIFF interface supports reading any scanline you want directly with TIFFReadScanline.

As far as I can tell tiff needs the row number to be read and also tiff rows cannot be read in random order.

It's not precise, because rows of some TIFFs can be accessed in random order. The spec confirms that too: "random access to individual scanlines can only be provided when data is not stored in a compressed format, or when the number of rows in a strip of image data is set to one (RowsPerStrip is one)." http://www.remotesensing.org/libtiff/libtiff.html Best regards, -- Mateusz Loskot, http://mateusz.loskot.net

Michael Marcin

6:55 a.m.

Christian Henning wrote:

...

As far as I can tell tiff needs the row number to be read and also tiff rows cannot be read in random order. Jpeg also needs to be read in order which means there is no skipping.

Now, because of that and to ensure a common interface among all scanline_readers I need to pass the pos into the skip function.

Seems like all the more reason to manage the line counter internally to the TIFF reader. Nothing else afaik needs it as it's implicit in the other readers. And you need to have a detailed knowledge of the internal state of the tiff reader in order to pass in any correct value. Mateusz Loskot wrote:

...

It's not precise, because rows of some TIFFs can be accessed in random order. The spec confirms that too:

"random access to individual scanlines can only be provided when data is not stored in a compressed format, or when the number of rows in a strip of image data is set to one (RowsPerStrip is one)."

http://www.remotesensing.org/libtiff/libtiff.html

Even more of a reason if certain tiff formats get efficiency boosts by managing skips differently (noops for uncompressed, reads for compressed).

Christian Henning

9:36 p.m.

Hi Mateusz,

...

...
As far as I can tell tiff needs the row number to be read and also tiff rows cannot be read in random order.

It's not precise, because rows of some TIFFs can be accessed in random order. The spec confirms that too:

"random access to individual scanlines can only be provided when data is not stored in a compressed format, or when the number of rows in a strip of image data is set to one (RowsPerStrip is one)."

http://www.remotesensing.org/libtiff/libtiff.html

Yes, you're correct uncompressed images can be read in random order. I'll add such optimization. Christian

Phil Endecott

19 Feb 19 Feb

10:27 a.m.

Christian Henning wrote:

...

...
...
I think Phil means rather than do the read in operator++, just set a flag in operator++ that the read should be done, and actually do it in operator* (which then clears the flag). Then if someone calls operator++ again without calling operator* (which you can detect by the flag being set in operator++), you can do a skip in operator++, and thus avoid decoding the line you didn't need.

Interesting idea. I'll update the code.

Actually that doesn't work. How would the user signal to just skip a scanline? I cannot add a parameter to operator++.

PSEUDO-CODE!!!! class iterator { buffer_t& buffer; size_t pending_advance_rows; public: const buffer_t& operator*() { while (pending_advance_rows > 1) { format_specific_skip_row(); --pending_advance_rows; } if (pending_advance_rows == 1) { format_specific_read_row_into(buffer); --pending_advance_rows; } return buffer; } void operator++() { ++pending_advance_rows; } }; or in iterator-facade-style: class iterator: public iterator_facade<........> { buffer_t& buffer; size_t pending_advance_rows; const buffer_t& dereference() { while (pending_advance_rows > 1) { format_specific_skip_row(); --pending_advance_rows; } if (pending_advance_rows == 1) { format_specific_read_row_into(buffer); --pending_advance_rows; } return buffer; } void increment() { ++pending_advance_rows; } };

Christian Henning

20 Feb 20 Feb

1:39 a.m.

Hi Phil,

...

PSEUDO-CODE!!!!

class iterator { buffer_t& buffer; size_t pending_advance_rows;

public: const buffer_t& operator*() { while (pending_advance_rows > 1) { format_specific_skip_row(); --pending_advance_rows; } if (pending_advance_rows == 1) { format_specific_read_row_into(buffer); --pending_advance_rows; } return buffer; }

void operator++() { ++pending_advance_rows; } };

I think the scanline_read_iterator is a little more complicated. I have uploaded the lateted "iteration" here: http://svn.boost.org/svn/boost/trunk/boost/gil/extension/io/detail/scanline_... It's crucial that the reader's "read" and "skip" functions are called appropriately. These two functions behave very differently for different image formats. For instance two calls to operator* should not result in two reads. Or when a user does std::advance( it, 20 ) the iterator needs to call "skip" 20 times without any "reads". Jeffery suggested to use two booleans to get the order of calls correct. Let's hope I got it right this time. Thanks, Christian

4531

Age (days ago)

4544

Last active (days ago)

List overview

Download

51 comments

7 participants

participants (7)

Christian Henning
Dave Abrahams
Jeffrey Lee Hellrung, Jr.
Mateusz Loskot
Michael Marcin
Nathan Ridge
Phil Endecott