Re: [boost] Review of Generic Image Library (GIL) begins today, Oct 5, 2006

From: Thorsten Behrens [mailto:th.behrens@gmx.net]
Pixel accessors make explicit what's worked-around by GIL's PixelDereferenceAdaptor, namely that getting to the place where the pixel is stored and operating on the content of that in-memory representation are orthogonal concepts. Vigra algorithms take a pair of pixel iterators and accessors for source and destination. Thus, to be able to work on a 565 packed RGB pixel type, one can reuse the int16 pixel iterator, and provide a 565 pixel accessor. Very easy, very clean.
You could do the same with GIL. You could use a 16-bit unsigned pixel iterator (or image view) and attach to it a PixelDereferenceAdaptor that can provide an RGB 565 interface. I disagree with your characterization of PixelDereferenceAdaptor as a "work-around". Pixel dereference adaptors are a powerful way to apply functional programming at the level of defining the color of a pixel with given coordinates. You can pipe them together similar to the way you can pipe image views. For example piping RGB-CMYK adaptor over nth_channel adaptor allows you to get the Cyan channel of an RGB image.
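To make the idea concrete, here is a minimal standalone sketch that does not use GIL itself. The names `Rgb565`, `Rgb565Deref` and `DerefIterator` are invented for this illustration, and the bit layout (red in the top 5 bits, green in the middle 6, blue in the low 5) is an assumption. It only shows the general shape: a plain 16-bit iterator does the positioning, and a separate function object decides how the stored word is interpreted.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical unpacked pixel value; channels hold 5, 6 and 5 bits.
struct Rgb565 {
    std::uint8_t r, g, b;
};

// Hypothetical dereference adaptor: interprets a raw 16-bit word as RGB 565.
// Assumed layout: rrrrrggg gggbbbbb.
struct Rgb565Deref {
    Rgb565 operator()(std::uint16_t raw) const {
        return Rgb565{static_cast<std::uint8_t>((raw >> 11) & 0x1F),
                      static_cast<std::uint8_t>((raw >> 5) & 0x3F),
                      static_cast<std::uint8_t>(raw & 0x1F)};
    }
};

// Iterator wrapper: movement is delegated to the underlying iterator,
// interpretation of the stored value is delegated to the adaptor.
template <typename Iter, typename Deref>
class DerefIterator {
public:
    DerefIterator(Iter it, Deref fn) : it_(it), fn_(fn) {}

    auto operator*() const { return fn_(*it_); }  // requires C++14
    DerefIterator& operator++() { ++it_; return *this; }
    bool operator!=(const DerefIterator& other) const { return it_ != other.it_; }

private:
    Iter it_;
    Deref fn_;
};
```

Because the adaptor returns by value, this particular sketch is read-only; a writable variant would need to return some kind of reference or proxy.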
that getting to the place where the pixel is stored and operating on the content of that in-memory representation are orthogonal concepts.
I agree, and GIL provides orthogonal ways of extending them. You can use iterator and locator adapters (such as PixelStepIterator) to define coordinate space transformations. Separately, you can use pixel dereference adaptors to define color space transformations. You can combine them to create views such as the upside-down view (spatial transformation) of the green channel (color space transformation) of a given image view.
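The upside-down view of the green channel can be sketched without GIL as follows. This is only an illustration of keeping the two transformations separate; `Image`, `FlipUpDown`, `GreenChannel` and `View` are hypothetical names, not GIL types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Rgb8 {
    std::uint8_t r, g, b;
};

// Hypothetical minimal image with row-major interleaved storage.
struct Image {
    std::size_t width;
    std::size_t height;
    std::vector<Rgb8> pixels;

    const Rgb8& at(std::size_t x, std::size_t y) const {
        assert(x < width && y < height);
        return pixels[y * width + x];
    }
};

// Coordinate-space transformation: maps row y to the mirrored row.
struct FlipUpDown {
    std::size_t height;
    std::size_t operator()(std::size_t y) const { return height - 1 - y; }
};

// Color-space transformation: reduces an RGB pixel to its green channel.
struct GreenChannel {
    std::uint8_t operator()(const Rgb8& p) const { return p.g; }
};

// A view combines one of each; either part can be swapped independently.
template <typename RowMap, typename Deref>
struct View {
    const Image* img;
    RowMap row_map;
    Deref deref;

    auto operator()(std::size_t x, std::size_t y) const {  // requires C++14
        return deref(img->at(x, row_map(y)));
    }
};
```

Replacing `FlipUpDown` leaves the color handling untouched, and replacing `GreenChannel` leaves the addressing untouched, which is the orthogonality being discussed.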
When it comes to image processing algorithms (I know that this is currently kind of a weak spot for GIL - but ultimately, of how much use is this lib without at least some basic processing functionality), I'd love to see promotion traits used. Otherwise, supporting mixed types (float and int, etc) for color channels generically would become rather hard...
I agree promotion traits are useful, but they don't completely solve the problem. It is not always possible to provide generic definitions for the appropriate type to hold the result of an arithmetic operation. There are often multiple different choices which vary between speed and precision. For example, double may be a more precise holder of the result of float + float, but often float is a reasonable holder. It really depends on the requirements for your task. This is why GIL's policy right now is to allow the client to specify the type of the result as a template parameter. Either way, if we end up using promotion traits, the proper place to place them will be together with imaging algorithms, in the future "numerics" extension. Lubomir
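A small sketch of the "client specifies the result type" policy, written against plain `std::vector` rather than GIL. The function name `channel_mean` is hypothetical. The caller picks `Result`, so the speed/precision trade-off stays with the caller instead of being fixed by a promotion trait.

```cpp
#include <cassert>
#include <vector>

// The accumulator and result type is an explicit template parameter,
// chosen by the caller rather than deduced from the channel type.
template <typename Result, typename Channel>
Result channel_mean(const std::vector<Channel>& values) {
    assert(!values.empty());
    Result total = Result();
    for (const Channel& v : values) {
        total += static_cast<Result>(v);
    }
    return total / static_cast<Result>(values.size());
}
```

For 8-bit inputs 200 and 201, `channel_mean<int>` truncates to 200 while `channel_mean<double>` gives 200.5; which one is appropriate depends on the task.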

On Mon, Oct 09, 2006 at 03:36:01PM -0700, Lubomir Bourdev wrote:
From: Thorsten Behrens [mailto:th.behrens@gmx.net]
Pixel accessors make explicit what's worked-around by GIL's PixelDereferenceAdaptor, namely that getting to the place where the pixel is stored and operating on the content of that in-memory representation are orthogonal concepts.
You could do the same with GIL. You could use a 16-bit unsigned pixel iterator (or image view) and attach to it a PixelDereferenceAdaptor that can provide an RGB 565 interface.
I disagree with your characterization of PixelDereferenceAdaptor as a "work-around". Pixel dereference adaptors are a powerful way to apply functional programming at the level of defining the color of a pixel with given coordinates. You can pipe them together similar to the way you can pipe image views. For example piping RGB-CMYK adaptor over nth_channel adaptor allows you to get the Cyan channel of an RGB image.
Hi Lubomir, ok, must have missed that aspect of the image view concept. So, if I get all of that right, things work nicely as long as the pixel is able to hand out an l-value for the individual channels to assign to (looking at channel_assigns_t now), with the image views wrapping pixel_t appropriately. Looking further, am I right that having pixel types that advertise a value type they hold no internal representation for (e.g. 1,2,4,8 bpp grey scale packed pixel formats; or palette images, that present a RGB truecolor interface, and internally perform palette lookups) pose problems? Insofar, as then proxy objects are needed, similar to the vector<bool> kludge?
I agree promotion traits are useful, but they don't completely solve the problem. It is not always possible to provide generic definitions for the appropriate type to hold the result of an arithmetic operation. There are often multiple different choices which vary between speed and precision. For example, double may be a more precise holder of the result of float + float, but often float is a reasonable holder. It really depends on the requirements for your task.
This is why GIL's policy right now is to allow the client to specify the type of the result as a template parameter. Either way, if we end up using promotion traits, the proper place to place them will be together with imaging algorithms, in the future "numerics" extension.
Fair enough - if you provide the flexibility to override the traits on an algorithm-by-algorithm base, that's even better. Cheers, -- Thorsten

Thorsten wrote:
Hi Lubomir,
ok, must have missed that aspect of the image view concept. So, if I get all of that right, things work nicely as long as the pixel is able to hand out an l-value for the individual channels to assign to (looking at channel_assigns_t now), with the image views wrapping pixel_t appropriately.
Being able to provide an l-value for the individual channels is not a requirement of PixelDereferenceAdaptor. For example, the virtual image view of the Mandelbrot set (see tutorial) is created with a model of PixelDereferenceAdaptor which does not allow for writing (i.e. changing the values of the Mandelbrot set). Thus pixel dereference adaptors may be mutable or immutable (see the concept description). When you attach an immutable dereference adaptor to a view, you will get back an immutable (read-only) image view. nth_channel_deref_fn is another model of PixelDereferenceAdaptor that, given a pixel, returns a grayscale pixel of its n-th channel. Its mutability is determined by the mutability of its source. When you use it over a Mandelbrot set, for example, you get an immutable grayscale view. But if you apply it over a regular mutable view, you get back a mutable grayscale view.
Looking further, am I right that having pixel types that advertise a value type they hold no internal representation for (e.g. 1,2,4,8 bpp grey scale packed pixel formats; or palette images, that present a RGB truecolor interface, and internally perform palette lookups) pose problems?
GIL pixels/channels should never advertise that they have larger capacity than they actually do. The capacity is a property of the channel type (see channel_max_value). If you use a PixelDereferenceAdaptor to transform a gray16 pixel to an rgb565 pixel, your pixel dereference adaptor defines the type of the result. That type is picked up by constructs that use it. So your iterator/locator/image view that has the above adaptor will report that their pixel type is rgb565 pixel, and if you ask for the max value of the first channel you should get 31 (assuming you have defined everything properly).
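The value 31 follows directly from the channel width: an n-bit unsigned channel can hold at most 2^n - 1. A tiny sketch, with `packed_channel_max` as a hypothetical helper (this is not GIL's `channel_max_value`):

```cpp
// Largest value representable in an unsigned channel of the given width.
// Assumes 0 < bits < 32.
constexpr unsigned packed_channel_max(unsigned bits) {
    return (1u << bits) - 1u;
}

// For an RGB 565 pixel: 5-bit red and blue, 6-bit green.
static_assert(packed_channel_max(5) == 31, "5-bit channel tops out at 31");
static_assert(packed_channel_max(6) == 63, "6-bit channel tops out at 63");
```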
Insofar, as then proxy objects are needed, similar to the vector<bool> kludge?
Yes, GIL uses a reference proxy object to represent a reference to a planar pixel. There are a number of implications to this, among which is that all GIL pixel iterators cannot be Random Access Iterators, but are Random Access Traversal Iterators. And yes, it is not possible to create a class that behaves like a native C reference in all contexts, but we feel that the benefit of being able to abstract away the fact that the image could be planar or interleaved is well worth using a proxy reference class. Lubomir
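To show why a planar pixel needs a proxy, here is a minimal standalone sketch. `PlanarRgbRef` and `PlanarImage` are hypothetical names and the design is simplified; it is not GIL's actual planar reference class. The three channels of one pixel live in three separate arrays, so no single `Rgb8&` exists that could refer to them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Rgb8 {
    std::uint8_t r, g, b;
};

// Proxy standing in for "a reference to an RGB pixel" whose channels
// are stored in three different planes.
class PlanarRgbRef {
public:
    PlanarRgbRef(std::uint8_t& r, std::uint8_t& g, std::uint8_t& b)
        : r_(r), g_(g), b_(b) {}

    // Assignment writes through to the three planes. It is const because
    // the proxy itself is not modified, only the channels it refers to.
    const PlanarRgbRef& operator=(const Rgb8& p) const {
        r_ = p.r;
        g_ = p.g;
        b_ = p.b;
        return *this;
    }

    // Reading gathers the channels into a pixel value.
    operator Rgb8() const { return Rgb8{r_, g_, b_}; }

private:
    std::uint8_t& r_;
    std::uint8_t& g_;
    std::uint8_t& b_;
};

struct PlanarImage {
    std::vector<std::uint8_t> red, green, blue;

    explicit PlanarImage(std::size_t n) : red(n), green(n), blue(n) {}

    // Returns the proxy by value, not a true reference.
    PlanarRgbRef operator[](std::size_t i) {
        assert(i < red.size());
        return PlanarRgbRef(red[i], green[i], blue[i]);
    }
};
```

`img[1] = Rgb8{1, 2, 3};` works as expected, but `Rgb8& ref = img[1];` does not compile, which is the kind of limitation being acknowledged here.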

On Tue, Oct 10, 2006 at 11:16:39AM -0700, Lubomir Bourdev wrote:
Thorsten wrote:
Looking further, am I right that having pixel types that advertise a value type they hold no internal representation for (e.g. 1,2,4,8 bpp grey scale packed pixel formats; or palette images, that present a RGB truecolor interface, and internally perform palette lookups) pose problems?
GIL pixels/channels should never advertise that they have larger capacity than they actually do.
How else would you then implement paletted images?
Insofar, as then proxy objects are needed, similar to the vector<bool> kludge?
Yes, GIL uses a reference proxy object to represent a reference to a planar pixel. There are a number of implications to this, among which is that all GIL pixel iterators cannot be Random Access Iterators, but are Random Access Traversal Iterators. And yes, it is not possible to create a class that behaves like a native C reference in all contexts, but we feel that the benefit of being able to abstract away the fact that the image could be planar or interleaved is well worth using a proxy reference class.
I'm not questioning the necessity to abstract away internal memory representations - quite the contrary. My point is, that using pixel accessors as an explicit concept achieves this goal in a much clearer fashion, without all the drawbacks of reference proxy objects. Accessors just provide this proverbial extra level of indirection that's needed for most of the 'interesting' pixel formats. From a design POV, I'd favor a solution that models this aspect of the problem domain explicitly, if only for the fear that reverting to proxies could be limiting in certain cases (be it compiler idiosyncrasies or hard language limits - sorry, no concrete examples, just a gut feeling). Cheers, -- Thorsten

Thorsten wrote:
GIL pixels/channels should never advertise that they have larger capacity than they actually do.
How else would you then implement paletted images?
First, a bit of a background for people who don't know what paletted (more often called indexed) images are. An indexed image stores the color as 1-dimensional index to a lookup table that contains the actual values of the color. For example, consider this lookup table:

    0 [255   0   0] // red
    1 [128 128 128] // gray
    2 [  0 128   0]
    3 [  0   0   0] // black
    ...

Then three pixels with values [0 3 1] correspond to [red black gray] pixels. One obvious advantage to indexed images is that they provide a more compact representation of the data.

The way indexed images can be modeled in GIL is using a PixelDereferenceAdaptor over grayscale image of the appropriate channel depth. Specifically:

1. Create an immutable pixel adapter that references the lookup table. Its application operator takes an index and returns an RGB pixel value. Here is a rough synopsis for a concrete version; you can easily make it more generic and improve it in other ways:

    struct indexed_pixel_deref_fn {
        typedef rgb8_pixel_t value_type;
        static const bool is_mutable=false;

        indexed_pixel_deref_fn(const rgb8_pixel_t* table);

        rgb8_pixel_t operator()(const gray8_pixel_t& index) const {
            return _table[index];
        }
        ....
    };

2. Create your indexed image by attaching your pixel dereference adapter on a regular grayscale image:

    typedef gray8_view_t::add_deref<indexed_pixel_deref_fn> indexed_factory_t;
    typedef indexed_factory_t::type rgb8_indexed_view_t;

    rgb8_indexed_view_t indexed_view=
        indexed_factory_t::make(my_gray_view, indexed_pixel_deref_fn(my_index_table));

Now your indexed_view will behave like a regular 8-bit RGB interleaved view. It will be immutable (unless you make an advanced dereference adapter that can do some sort of reverse lookup). But you should be able to copy it to a non-indexed rgb view, compute its gradient, make n-th channel view of its green channel, save it to a file, etc. And to your earlier question, notice that the indexed view does not advertise that it has a larger capacity. Lubomir
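For readers without GIL at hand, the lookup itself can be tried out with a standalone sketch. `IndexedDeref` and `materialize` are hypothetical names, and the table is the four-entry example from the message above; none of this is GIL API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Rgb8 {
    std::uint8_t r, g, b;
};

// Immutable adaptor: maps a stored index to the color in the lookup table.
// The table is referenced, not copied, so it must outlive the adaptor.
class IndexedDeref {
public:
    IndexedDeref(const Rgb8* table, std::size_t size)
        : table_(table), size_(size) {}

    Rgb8 operator()(std::uint8_t index) const {
        assert(index < size_);
        return table_[index];
    }

private:
    const Rgb8* table_;
    std::size_t size_;
};

// Applies the adaptor to every stored index, producing true-color pixels.
template <typename Deref>
std::vector<Rgb8> materialize(const std::vector<std::uint8_t>& indices,
                              const Deref& deref) {
    std::vector<Rgb8> out;
    out.reserve(indices.size());
    for (std::uint8_t i : indices) {
        out.push_back(deref(i));
    }
    return out;
}
```

With the example table, the indices [0 3 1] come out as red, black and gray, matching the description above.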

On Wed, Oct 11, 2006 at 02:17:30PM -0700, Lubomir Bourdev wrote:
Thorsten wrote:
GIL pixels/channels should never advertise that they have larger capacity than they actually do.
How else would you then implement paletted images?
[snip]
The way indexed images can be modeled in GIL is using a PixelDereferenceAdaptor over grayscale image of the appropriate channel depth. Specifically:
1. Create an immutable pixel adapter that references the lookup table. Its application operator takes an index and returns an RGB pixel value. Here is a rough synopsis for a concrete version; you can easily make it more generic and improve it in other ways:
[snip]
Now your indexed_view will behave like a regular 8-bit RGB interleaved view. It will be immutable (unless you make an advanced dereference adapter that can do some sort of reverse lookup). But you should be able to copy it to a non-indexed rgb view, compute its gradient, make n-th channel view of its green channel, save it to a file, etc.
Understood (but that's how I thought it would work).
And to your earlier question, notice that the indexed view does not advertise that it has a larger capacity.
Well, once you have a mutable indexed view as an operation's destination (that's again the interesting case for me, since it requires a reference proxy object), one could certainly tell that 'advertising more capacity' as physically available - after all, performing a nearest color lookup or dithering involves data loss. There's not much difference between an adaptor silently chopping off some bits from the color channels, and this reverse lookup operation. And what about the pixel type / memory representation dichotomy and the pixel accessor solution for that? Cheers, -- Thorsten

-----Original Message-----
From: boost-bounces@lists.boost.org [mailto:boost-bounces@lists.boost.org] On Behalf Of Thorsten
Sent: Wednesday, October 11, 2006 10:06 PM
To: boost@lists.boost.org
Subject: Re: [boost] GIL & pixel accessors (was: Review of Generic Image Library (GIL) begins today, Oct 5, 2006)
On Wed, Oct 11, 2006 at 02:17:30PM -0700, Lubomir Bourdev wrote:
Thorsten wrote:
GIL pixels/channels should never advertise that they have larger capacity than they actually do.
How else would you then implement paletted images?
[snip]
The way indexed images can be modeled in GIL is using a PixelDereferenceAdaptor over grayscale image of the appropriate channel depth. Specifically:
1. Create an immutable pixel adapter that references the lookup table. Its application operator takes an index and returns an RGB pixel value. Here is a rough synopsis for a concrete version; you can easily make it more generic and improve it in other ways:
[snip]
Now your indexed_view will behave like a regular 8-bit RGB interleaved view. It will be immutable (unless you make an advanced dereference adapter that can do some sort of reverse lookup). But you should be able to copy it to a non-indexed rgb view, compute its gradient, make n-th channel view of its green channel, save it to a file, etc.
Understood (but that's how I thought it would work).
And to your earlier question, notice that the indexed view does not advertise that it has a larger capacity.
Well, once you have a mutable indexed view as an operation's destination (that's again the interesting case for me, since it requires a reference proxy object), one could certainly tell that 'advertising more capacity' as physically available - after all, performing a nearest color lookup or dithering involves data loss. There's not much difference between an adaptor silently chopping off some bits from the color channels, and this reverse lookup operation.
We are now outside the GIL realm and in the realm of best engineering practices. You could do a number of things:
1. Disallow modifying indexed RGB images.
2. Have your dereference adaptor search for a perfect match, and throw an exception upon failure.
3. Switch to the nearest match, with a lossy transformation. I wouldn't recommend this, but you could do it.
4. Allow modifying the data only through the grayscale view, i.e. modifying the indices directly.
5. Define a custom channel model, which will make your RGB indexed image incompatible with other RGB images and will prevent copying from one to another. Define channel conversion to allow your indexed image to be assigned another image only via color conversion, which is allowed to be lossy.
And what about the pixel type / memory representation dichotomy and the pixel accessor solution for that?
I am trying to access Vigra's web page to read more about accessors before commenting on this but it seems to be down... In the meantime, I want to clarify a possible confusion: using PixelDereferenceAdaptor does not imply using a pixel reference proxy. They are not really related. The nth_channel_view of a typical memory-based image uses a PixelDereferenceAdaptor but a built-in C reference. GIL employs pixel reference proxy only to represent a reference to a planar pixel. Lubomir
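Option 2 from the list in this message (search for a perfect match, throw on failure) could look roughly like the following. This is a standalone sketch with hypothetical names, using a linear search and `std::domain_error` as an arbitrary choice of exception; it is not part of GIL.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

struct Rgb8 {
    std::uint8_t r, g, b;
};

// Reverse lookup for writing into an indexed image: returns the index of
// the palette entry that matches the color exactly, or throws if there is
// none. No nearest-color search is attempted, so the write is never lossy.
inline std::uint8_t exact_palette_index(const std::vector<Rgb8>& palette,
                                        const Rgb8& color) {
    assert(palette.size() <= 256);  // indices must fit in 8 bits
    for (std::size_t i = 0; i < palette.size(); ++i) {
        const Rgb8& entry = palette[i];
        if (entry.r == color.r && entry.g == color.g && entry.b == color.b) {
            return static_cast<std::uint8_t>(i);
        }
    }
    throw std::domain_error("color is not in the palette");
}
```

A mutable indexed adaptor built on this would store the returned index in the underlying grayscale pixel, so any failure is reported to the caller instead of being silently approximated.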
participants (2): Lubomir Bourdev, Thorsten