GIL Review - PixelDereferenceAdaptors vs. DataAccessors

Hi again,

the issue of PixelDereferenceAdaptors vs. DataAccessors has been intensively discussed during this review, especially the issues arising when the PixelDereferenceAdaptor must support both rvalue and lvalue use. To get objective data on this, I want to do some experiments comparing the concepts. But to my big surprise, I couldn't find any PixelDereferenceAdaptor in the public part of GIL which supports lvalue use.

I'd appreciate it if the GIL authors could provide an implementation of a mutable interleaved_rgb_image_view(a_planar_rgb_image) or point me to the right construct.

Thanks
Ulli

--
Ullrich Koethe
Universitaet Hamburg / University of Hamburg
FB Informatik / Dept. of Informatics
AB Kognitive Systeme / Cognitive Systems Group
Vogt-Koelln-Str. 30, D - 22527 Hamburg, Germany
Phone: +49 (0)40 42883-2573
Fax: +49 (0)40 42883-2572
Email: u.koethe@computer.org, koethe@informatik.uni-hamburg.de
WWW: http://kogs-www.informatik.uni-hamburg.de/~koethe/

Hi Ulli,

We don't use a PixelDereferenceAdaptor to model planar organization. We only need the adaptor in cases where we want to perform some arbitrary transformation upon dereferencing.

For planar images, say RGB, we have a planar iterator (planar_ptr) containing three pointers inside, and upon dereferencing it returns a planar reference proxy (planar_ref) that contains three references. It behaves like a native C reference. For example, you can use its operator= to assign it an RGB pixel value and it will properly modify the three locations.

I know exactly what you are after :-) You want an example where we do some arbitrary transformation upon dereferencing and we want to be able to assign to the result, i.e. to run the transformation "backwards". You want to show that doing so is very tricky and involved.
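The planar reference proxy described above can be sketched as follows. This is only an illustration under stated assumptions: the names rgb_value, planar_rgb_ref and planar_rgb_ptr are hypothetical and are not GIL's actual planar_ref and planar_ptr, which are more general.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical names for illustration only; not GIL's planar_ref/planar_ptr.
struct rgb_value { std::uint8_t r, g, b; };

// Proxy reference: holds one reference per plane and stands in for rgb_value&.
struct planar_rgb_ref {
    std::uint8_t& r;
    std::uint8_t& g;
    std::uint8_t& b;

    // Assigning a pixel value writes to the three separate locations.
    planar_rgb_ref& operator=(const rgb_value& v) {
        r = v.r; g = v.g; b = v.b;
        return *this;
    }
    // Reading gathers the three channels into one value.
    operator rgb_value() const { return rgb_value{r, g, b}; }
};

// Planar iterator: three pointers advanced in lock step.
struct planar_rgb_ptr {
    std::uint8_t* r;
    std::uint8_t* g;
    std::uint8_t* b;

    planar_rgb_ref operator*() const { return planar_rgb_ref{*r, *g, *b}; }
    planar_rgb_ptr& operator++() { ++r; ++g; ++b; return *this; }
    bool operator!=(const planar_rgb_ptr& o) const { return r != o.r; }
};

// Fills three planes through the proxy and reads the last pixel back.
inline rgb_value fill_planar_demo() {
    std::vector<std::uint8_t> red(4), green(4), blue(4);
    planar_rgb_ptr it{red.data(), green.data(), blue.data()};
    planar_rgb_ptr end{red.data() + 4, green.data() + 4, blue.data() + 4};
    for (; it != end; ++it)
        *it = rgb_value{10, 20, 30};
    planar_rgb_ptr last{&red[3], &green[3], &blue[3]};
    rgb_value v = *last;
    return v;
}
```

No transformation is involved here: the proxy only redirects reads and writes to three memory locations, which is why planar layout does not need a dereference adaptor.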
> But to my big surprise, I couldn't find any PixelDereferenceAdaptor in the public part of GIL which supports lvalue use.
...which leads to my first argument: that cases where you need to do so (in the context of image processing) simply don't occur frequently in practice. We could think of some really fancy scenarios where you need to do so, but I hope you will agree that these are not mainstream by any means. (To get the nth channel of a memory-based image we simply change the pixel type and the iterator step.)

Now, suppose we find a need to do so. You will say that implementing this is very tricky, and I will agree. It is tricky and involved, and someone might call it "hacky", but I think it is possible. Here is how I would approach this:

Create an object that acts as a proxy reference:
- It has a conversion operator that converts it to the value type. This implements the "read" direction.
- It also has an operator= that takes a value type. This implements the "write" direction.

This basically corresponds to what you call DataAccessor, except that you provide a different interface. Not so tricky after all, but trickier than doing a DataAccessor.

Note, however, that the difficulty is on the side of the library designer/extender, who is expected to have higher expertise with the library, whereas the benefits are on the side of the _user_. My principles are that it is OK to make the designer's job much harder, if this leads to even small benefits on the side of the user.

And what are those benefits?

First, education. People are familiar with iterators, but not so with data accessors. They have to learn a new concept, as simple as it may be, and learn to recognize and apply the new interface every time they iterate over the pixels of an image.

Second, reuse. The promise of generic programming can best be realized by agreeing to build on the same concepts. PixelAccessors may be appealing (I think they are in many ways) but unfortunately people have already invested a lot of effort in writing algorithms that deal just with iterators. If I want to copy one image to another I could just use std::copy. If I want to rotate the pixels 180 degrees, I can use std::reverse. Of course, as Thorsten pointed out, GIL often provides performance overloads. But the important thing is, we don't have to; the algorithms still work. Besides, in the cases where we do, like std::copy, we typically still delegate to STL and call their std::copy. We don't explicitly call memmove, we call std::copy with PODs and STL turns it into memmove. We want to delegate as much as we can to standard components because they could be better tested and optimized.

And the story, of course, does not end with the STL. If I want to find the largest and smallest pixels in my image I can just use boost::minmax_element. You may say that each of these is a trivial algorithm and easy to just make a version that works with Vigra, but as a collection they become a lot, and it is an open-ended set of algorithms: by conforming to the standard, GIL will automatically benefit from any new algorithms that people may provide in the future. Besides, I am not sure they are all trivial. I haven't looked into Boost Graph, but since its algorithms follow standard iterator convention, I wouldn't be surprised if it is easier to combine with GIL and make algorithms like Graph Cut that have recently become popular for image segmentation. A graph cut would be quite non-trivial to reimplement.

Of course, in cases where you need to use standard components you could argue that you can make one from iterator and DataAccessor (essentially the DereferenceAdaptor approach). But once you do that, you will have two ways of doing the same thing; isn't removing DataAccessor going to simplify things then?

Third, standardization directly decreases the cost of maintaining the library. Writing algorithms like vigra::transformLine and vigra::transformLineIf is the least of the work. You will have to document them, maintain them, extend them, debug them, port them, performance-optimize them... instead of having this be handled by someone else.

Fourth, the user code is simplified. Instead of this:

    template <class SrcIterator, class SrcAccessor,
              class DestIterator, class DestAccessor, class Functor>
    void transformLine(SrcIterator s, SrcIterator send, SrcAccessor src,
                       DestIterator d, DestAccessor dest, Functor const & f)
    {
        for(; s != send; ++s, ++d)
            dest.set(f(src(s)), d);
    }

You just need this:

    template <typename Src, typename Dst>
    void copy(Src first, Src last, Dst dst)
    {
        for (; first != last; ++first, ++dst)
            *dst = *first;
    }

Fewer template parameters, simpler interfaces, shorter code, easier to read, less opportunity to introduce bugs.

Fifth, conceptually it makes sense to combine the Accessor and the Iterator into one, because what they really represent is an iterator over a range of TransformedPixel. Its value type really is TransformedPixel.

To summarize my points:
- the cases where lvalue is needed are rare
- while tricky, it is possible to implement this
- the burden is on the side of library designer/extender and is done once
- the benefit is on the side of the library user. There are many users, and their threshold of familiarity with the library is lower.
- the benefits include:
  - less education. Users are familiar with the iterator concepts. They need to learn about accessors
  - reuse of other generic components, like STL and boost libraries
  - standardization helps decrease maintenance and bug fixing cost
  - the user code is simplified. Expressions are cleaner. Functions take fewer arguments
  - conceptually it makes sense to combine the Accessor and the Iterator

And finally, DataAccessor is an interesting idea worthy of investigation, but it spans beyond images. The right way to approach this, in my opinion, is to try to make the case for DataAccessors as an addition to the standard. If DataAccessors are accepted, lots of my objections will no longer hold.

Lubomir
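The proxy-reference approach outlined in this message (a conversion operator for the "read" direction, an operator= for the "write" direction) might look roughly like the sketch below. The transformation chosen here (8-bit storage presented as a float in [0, 1]) and the names unit_float_ref and unit_float_iterator are assumptions made purely for illustration; they are not part of GIL or VIGRA.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <vector>

// Hypothetical proxy reference: storage is 8-bit, the visible value is a float.
class unit_float_ref {
    std::uint8_t* p_;
public:
    explicit unit_float_ref(std::uint8_t* p) : p_(p) {}
    unit_float_ref(const unit_float_ref&) = default;

    // "Read" direction: run the transformation forwards.
    operator float() const { return *p_ / 255.0f; }

    // "Write" direction: run it backwards (clamp, scale, round).
    unit_float_ref& operator=(float v) {
        float c = std::min(1.0f, std::max(0.0f, v));
        *p_ = static_cast<std::uint8_t>(c * 255.0f + 0.5f);
        return *this;
    }
    // Proxy-to-proxy assignment copies the value, not the pointer.
    unit_float_ref& operator=(const unit_float_ref& o) {
        return *this = static_cast<float>(o);
    }
};

// Hypothetical iterator whose operator* returns the proxy above.
class unit_float_iterator {
    std::uint8_t* p_;
public:
    using iterator_category = std::input_iterator_tag;  // yields a proxy, not float&
    using value_type = float;
    using difference_type = std::ptrdiff_t;
    using pointer = void;
    using reference = unit_float_ref;

    explicit unit_float_iterator(std::uint8_t* p = nullptr) : p_(p) {}
    reference operator*() const { return unit_float_ref(p_); }
    unit_float_iterator& operator++() { ++p_; return *this; }
    unit_float_iterator operator++(int) { unit_float_iterator t(*this); ++p_; return t; }
    bool operator==(const unit_float_iterator& o) const { return p_ == o.p_; }
    bool operator!=(const unit_float_iterator& o) const { return p_ != o.p_; }
};

// Writes floats into 8-bit storage with a plain std::copy, no accessor argument.
inline std::vector<std::uint8_t> write_through_proxy_demo() {
    std::vector<float> src = {0.0f, 0.5f, 1.0f};
    std::vector<std::uint8_t> bytes(src.size());
    std::copy(src.begin(), src.end(), unit_float_iterator(bytes.data()));
    return bytes;
}
```

The extra work sits in the two small classes; the call site is an unmodified standard algorithm, which is the trade-off argued for in the summary above.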

Hi Lubomir, thanks for the long reply. I really appreciate this. I think we have both made our points clear. Now I'd like to hear the opinion of others...
> And finally, DataAccessor is an interesting idea worthy of investigation, but it spans beyond images. The right way to approach this, in my opinion, is to try to make the case for DataAccessors as an addition to the standard. If DataAccessors are accepted, lots of my objections will no longer hold.
Exactly. But my favourite solution would be to extend the language so that one could implement separate lvalue and rvalue versions of operator* and operator[] (similar to the pre- and postfix increment operator). But I don't know whether this is possible, especially considering backward compatibility.

Best regards
Ulli

On Wed, Oct 18, 2006 at 11:58:55AM -0700, Lubomir Bourdev wrote:
[about iterator/accessor vs. iterator-only]
> Second, reuse. The promise of generic programming can best be realized by agreeing to build on the same concepts. PixelAccessors may be appealing (I think they are in many ways) but unfortunately people have invested already a lot of effort in writing algorithms that deal just with iterators. If I want to copy two images I could just use std::copy. If I want to rotate the pixels 180 degrees, I can use std::reverse. Of course, as Thorsten pointed out, GIL often provides performance overloads. But the important thing is, we don't have to; the algorithms still work. Besides, in the cases where we do, like std::copy, we typically still delegate to STL and call their std::copy. We don't explicitly call memmove, we call std::copy with PODs and STL turns it into memmove. We want to delegate as much as we can to standard components because they could be better tested and optimized. And the story, of course, does not end with the STL. If I want to find the largest and smallest pixels in my image I can just use boost::minmax_element. You may say that each of these is a trivial algorithm and easy to just make a version that works with Vigra, but as a collection they become a lot, and it is an open-ended set of algorithms: By conforming to the standard GIL will automatically benefit from any new algorithms that people may provide in the future.
I agree - with STL-style iterators being a well-established concept, using them provides many benefits. But since the separation into DataAccessor/Iterator (or PropertyMap/Cursor, see http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1873.html) so nicely separates concerns (I think this is *especially* apparent in the image pixel domain - packed pixel, planar channels, tiled or synthetic images), I'd still love to see this concept used in GIL.

From a plain user's perspective, I find something like http://opensource.adobe.com/gil/packed_pixel.hpp just a little bit too involved - I'd expect the common set of pixel types to be supported natively in GIL (I acknowledge you already offered to include packed pixel support).

Both items taken together, what do you think about the following idea: model GIL's core functionality according to DataAccessor/Iterator, and provide an STL-style RandomAccessTraversalIterator as a wrapper on top of that? Wouldn't that make for a truly generic framework, where I'd only need to provide something along the lines of your pack/unpack_channel_fn functions and a bit of glue to add my own, custom packed pixel type? Towards the algorithms, you stay compatible, to the inside you're extremely flexible.

Then, Vigra & AGG...

From how it appears to me (and what you and others stated), GIL's selling points are currently pixel and image view abstractions, not so much the actual algorithms. As a user, I don't have too much use for that, I'm more interested in turn-key solutions like a Canny edge detector, or in filtered image transformations (likewise, I'm a user of Spirit, not so much of MPL). Thus, I feel it's crucial that, supposing GIL is accepted into boost, it gets a set of common algorithms soon. Luckily, Vigra was also offered to boost, and you and Ulli have proof-of-concept code showing how to integrate them. I'd therefore really like to see Vigra's algorithmic half in boost, too.

Ulli, do you think that's feasible?

Cheers,

-- Thorsten
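The layering proposed in this message (a DataAccessor/Iterator core with an STL-style iterator wrapped on top) could be sketched along the following lines. Everything here is hypothetical: rgb8, rgb565_accessor and accessor_iterator are illustrative names, the RGB565 layout is an assumed example of a custom packed pixel, and the wrapper is a plain incrementable iterator rather than a full RandomAccessTraversalIterator.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <vector>

struct rgb8 { std::uint8_t r, g, b; };

// Hypothetical DataAccessor for 16-bit RGB565 storage: only pack/unpack live here.
struct rgb565_accessor {
    rgb8 get(const std::uint16_t* it) const {
        std::uint16_t v = *it;
        return rgb8{static_cast<std::uint8_t>(((v >> 11) & 0x1F) << 3),
                    static_cast<std::uint8_t>(((v >> 5) & 0x3F) << 2),
                    static_cast<std::uint8_t>((v & 0x1F) << 3)};
    }
    void set(const rgb8& p, std::uint16_t* it) const {
        *it = static_cast<std::uint16_t>(((p.r >> 3) << 11) | ((p.g >> 2) << 5) | (p.b >> 3));
    }
};

// Generic glue: turns any (base iterator, accessor) pair into an STL-style iterator.
template <class BaseIt, class Accessor, class Value>
class accessor_iterator {
    BaseIt it_;
    Accessor acc_;
public:
    // Proxy returned by operator*: reads call get(), writes call set().
    class proxy {
        BaseIt it_;
        Accessor acc_;
    public:
        proxy(BaseIt it, Accessor acc) : it_(it), acc_(acc) {}
        proxy(const proxy&) = default;
        operator Value() const { return acc_.get(it_); }
        proxy& operator=(const Value& v) { acc_.set(v, it_); return *this; }
        proxy& operator=(const proxy& o) { return *this = static_cast<Value>(o); }
    };

    using iterator_category = std::input_iterator_tag;  // yields a proxy, not Value&
    using value_type = Value;
    using difference_type = std::ptrdiff_t;
    using pointer = void;
    using reference = proxy;

    accessor_iterator(BaseIt it, Accessor acc) : it_(it), acc_(acc) {}
    reference operator*() const { return proxy(it_, acc_); }
    accessor_iterator& operator++() { ++it_; return *this; }
    accessor_iterator operator++(int) { accessor_iterator t(*this); ++it_; return t; }
    bool operator==(const accessor_iterator& o) const { return it_ == o.it_; }
    bool operator!=(const accessor_iterator& o) const { return it_ != o.it_; }
};

// Packs 8-bit RGB pixels into RGB565 words using an unmodified std::copy.
inline std::vector<std::uint16_t> pack_demo() {
    std::vector<rgb8> src = {rgb8{255, 255, 255}, rgb8{255, 0, 0}, rgb8{0, 0, 255}};
    std::vector<std::uint16_t> packed(src.size());
    using it_t = accessor_iterator<std::uint16_t*, rgb565_accessor, rgb8>;
    std::copy(src.begin(), src.end(), it_t(packed.data(), rgb565_accessor()));
    return packed;
}
```

In this sketch, adding another packed format would only require a new accessor with get and set; the wrapper and the algorithms stay unchanged.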

Quoting Thorsten Behrens <th.behrens@gmx.net>:
> Then, Vigra & AGG...
>
> From how it appears to me (and what you and others stated), GIL's selling points are currently pixel and image view abstractions, not so much the actual algorithms. As a user, I don't have too much use for that, I'm more interested in turn-key solutions like a Canny edge detector, or in filtered image transformations (likewise, I'm a user of Spirit, not so much of MPL). Thus, I feel it's crucial that, supposed GIL is accepted into boost, it gets a set of common algorithms soon. Luckily, Vigra was also offered to boost, and you and Ulli have proof-of-concept code how to integrate them. I'd therefore really like to see Vigra's algorithmic half in boost, too.
>
> Ulli, do you think that's feasible?
Yes, images alone are not of much use. I have been thinking about integration possibilities during the last week. We should try to come up with an evolutionary path to integrate things. For now, DataAccessor and PixelDereferenceAdaptors can happily live together (you can apply an accessor to an adaptor, no problem). In the long run, I'd like to apply some lessons learned from VIGRA. For example, I would like a design that can be extended naturally to 3-dimensional and higher dimensional images (a lot of algorithm reuse is possible across dimensions, if it is done right).

Best regards
Ulli

PS. I'll be away for a week and cannot follow the discussion closely.

> Both items taken together, what do you think about the following idea: model GIL's core functionality according to DataAccessor/Iterator, and provide an STL-style RandomAccessTraversalIterator as a wrapper on top of that?
Hi Thorsten,

As a user of GIL I don't see why you should care whether it internally uses DataAccessor/Iterator or STL-style iterators. All you should care about is how easy it is to use and extend GIL. In particular, if you have a DataAccessor+Iterator, can you easily make a GIL iterator and image view from it?

For immutable DataAccessors, it is trivial. You can think of GIL's PixelDereferenceAdaptor almost like a DataAccessor. It has an application operator that takes anything you want (for example a pixel reference) and must return something convertible to PixelValueConcept. To attach a PDA (PixelDereferenceAdaptor) to a given View, you simply do this:

    typename View::template add_deref<PDA>::type newView =
        View::template add_deref<PDA>::make(myView, myPDA);

That's it! Now if you want to save the even pixels of the first channel of the view, and apply your custom transformation, you do this:

    jpeg_write_view("out.jpg",
        subsampled_view(nth_channel_view(newView, 0), 2,2));

With the DataAccessor/Iterator style, piping of views is not as nice. You will have to duplicate the piping chain for the accessor as well:

    jpeg_write_view("out.jpg",
        subsampled_view(nth_channel_view(newView, 0), 2,2),
        subsampled_accessor(nth_channel_accessor(newAcc, 0), 2,2));

Adapting a mutable DataAccessor requires a bit more work, and we can certainly add some utilities to make it easier, but I am having a hard time thinking of a practical example where you need this.

Thorsten wrote:
> From how it appears to me (and what you and others stated), GIL's selling points are currently pixel and image view abstractions, not so much the actual algorithms. As a user, I don't have too much use for that, I'm more interested in turn-key solutions like a Canny edge detector...

Ulli also wrote:

> Yes, images alone are not of much use.
Please don't forget that computer vision is a niche domain. There is a much broader domain of basic image manipulation, like loading an image from file and converting it to a format that your window manager can display, or making a thumbnail out of an image. Far more people need basic utilities like the above. People like you who need a Canny edge detector, or even know what it means, are a minority. Lubomir
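The immutable case called trivial earlier in this message (a function object applied on every dereference) can be illustrated without GIL at all. The sketch below is an assumption-laden stand-in: deref_iterator and halve_fn are hypothetical names, and the adaptor works on ints instead of pixels so that the block stays self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical read-only dereference adaptor: any callable applied on read.
struct halve_fn {
    int operator()(int v) const { return v / 2; }
};

// Wraps a base iterator; operator* returns fn(*base) by value, so it is rvalue-only.
template <class BaseIt, class Fn>
class deref_iterator {
    BaseIt it_;
    Fn fn_;
public:
    using iterator_category = std::input_iterator_tag;
    using value_type = decltype(std::declval<const Fn&>()(*std::declval<const BaseIt&>()));
    using difference_type = std::ptrdiff_t;
    using pointer = void;
    using reference = value_type;

    deref_iterator(BaseIt it, Fn fn) : it_(it), fn_(fn) {}
    reference operator*() const { return fn_(*it_); }
    deref_iterator& operator++() { ++it_; return *this; }
    deref_iterator operator++(int) { deref_iterator t(*this); ++it_; return t; }
    bool operator==(const deref_iterator& o) const { return it_ == o.it_; }
    bool operator!=(const deref_iterator& o) const { return it_ != o.it_; }
};

// Sums the transformed values with a standard algorithm; no accessor is passed along.
inline int sum_halved_demo() {
    std::vector<int> v = {2, 4, 6};
    using it_t = deref_iterator<std::vector<int>::const_iterator, halve_fn>;
    return std::accumulate(it_t(v.cbegin(), halve_fn()), it_t(v.cend(), halve_fn()), 0);
}
```

Because the transformation travels inside the iterator, further adaptors can be stacked on top without a parallel chain of accessors, which is the piping argument made above.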

On Mon, Oct 23, 2006 at 03:06:44PM -0700, Lubomir Bourdev wrote:
> As a user of GIL I don't see why you should care if it internally uses DataAccessor/Iterator or STL-style iterators. All you should care about is how easy it is to use and extend GIL. In particular, if you have a DataAccessor+Iterator can you make easily from it a GIL iterator and image view?
>
> For immutable DataAccessors, it is trivial. You can think of GIL's PixelDereferenceAdaptor almost like a DataAccessor.

[snip]

> Adapting a mutable DataAccessor requires a bit more work, and we can certainly add some utilities to make it easier, but I am having a hard time thinking of a practical example where you need this.
Hi Lubomir, I agree with your first point, and gave an example for the second (removing much of the boilerplate from your packed pixel implementation). To the user, GIL's iterators can stay exactly as they are - to the inside, they are composed of an iterator and (optionally) a DataAccessor. See it as a way to ease addition of new in-memory pixel representations.
> Please don't forget that computer vision is a niche domain. There is a much broader domain of basic image manipulation, like loading an image from file and converting it to a format that your window manager can display, or making a thumbnail out of an image.

True as well - but there's a large grey area between the two, and your last example already touches one. Generating a nicely downscaled thumbnail requires low-pass filtering, transforming an image generally requires interpolation. For both tasks, Vigra contains a whole family of algorithms (recursive filters and convolution; linear, B-spline, Catmull-Rom, and CosCot interpolation).

> Far more people need basic utilities like the above. People like you who need a Canny edge detector, or even know what it means, are a minority.

Admittedly, the Canny edge detector was a bad example. But I'm otherwise totally convinced that with languages like SVG, UIs like Aqua or Vista, stuff like blurring, convolution, turbulence or morphological operations are bread-and-butter effects for today's graphics programmers (cf. SVG's filters). And you probably know very well that stuff like OCR, semi-automatic segmentation and face recognition is going down-market (not that Vigra provides much of that yet - but there's non-linear diffusion) - all of that is computer vision at its best. ;-)

So, really, I don't see the issue here - GIL provides the sound foundations, Vigra a wealth of algorithms. Looks like a perfect fit to me.

Cheers,

-- Thorsten