Re: [boost] GIL - late review

2 Nov 2006

      Hi Ulli, 

Thanks for your review!

Ulli wrote:
...
I vote to accept GIL into boost after revision. In addition, 
I consider 
addition of an algorithm library in the not-too-distant 
future mandatory 
in order to realize the full potential of the image classes.
As I already mentioned, we also would like very much to see image
processing algorithms added to GIL, whether or not it is accepted in
Boost.
In particular, the Vigra algorithms provide a good starting point.
Although they could be invoked through a compatibility layer, as you and
I demonstrated, it would be great if they are rewritten in native GIL,
because we don't believe a compatibility layer can handle all possible
image abstrations.
If you and/or others would like to volunteer to do that, you can expect
and extra level of support from us.

We also are looking for volunteers to implement other algorithms, such
as the ones in OpenCV. It would be great to also have a GIL-layer over
Intel's IPP library. Also, providing interfaces to AGG would be
terrific.
Providing parallel versions of image processing algorithms would be
great. GPU support too.
More I/O formats too.
...
...
- What is your evaluation of the design?
I like the design of the image classes (including the 
dynamic_image) and 
the heavy use of views as a major algorithm interface. The various 
iteration abstractions are also quite powerful. The proof-of-concept 
implementation of the VIGRA interoperability layer was quite 
encouraging.
I'd like to see the following additions to the image class:
- constructors that allow copy construction of the pixels values
You mean deep-copy the images? The image class has a copy constructor
which deep-copy's the images.
...
- resize() functions with default construction and copy 
construction of 
the newly allocated pixel values (without resize(), the default 
constructor image() is pointless).
There is a function called resize_clobber_image. Is that what you are
looking for?
If the images are of different dimensions, it constructs an image of the
target size and swaps them.
It does not copy the pixels. This could be done manually with
copy_pixels if needed.
We are looking for a better name, but resize_image implies the pixels
are copied (and truncated) as with vector resize. We didn't add copying
of the pixels because doing so is often not necessary, and it is unclear
what the copy policy should be exactly. (copy into top left? Truncate?)
...
I'm less convinced of the pixel classes (which, as I argued 
in another 
post, should be renamed into pixel_value or just value). 
IMHO, a pixel 
class without a set of operations specific to this particular 
pixel type 
is not very useful. As far as I can tell, there are not even 
conversion 
functions between RGB, LAB and HSB (and, unlike the RGB-CMYK 
conversion, 
these are too expensive to be implemented as pixel adapters).
I recommend to either
- drop the specific color classes for the time being and use 
a general 
static_vector<3, T> together with  accordingly changed 
pixel_value_tag 
concepts, or
- add a set of functions justifying the existence of the 
color classes.
Why does VIGRA have a separate class for RGB Value then? What is special
about RGB?

There are lots of functions that take the color information into
account. Color information goes down to the very basics. For example,
any per-channel operation that takes two or more pixels operates on
semantic channels, not physical channels. These will do the right thing:

bool eq=equal_pixels(rgb8_view, bgr8_view);
copy_pixels(rgb8_view, bgr8_view);
rgb8_pix = bgr8_pix;

rgb8_image_t img(bgr8_img);

Color information also is useful for compile-time type safety. GIL
constructs that don't have the same base color space are incompatible
and cannot be copy-constructed, assigned and compared to each other.
These will generate a compile error:

copy_pixels(lab8_view, rgb8_view);  // error
rgb8_pix = lab8_pix;  // error

This level of compile-time safety is not possible if you only consider
the type and number of channels.

Of course, there are a variety of color conversion functions:

copy_and_convert_pixels(rgb8_view, cmyk16_view);

cmyk16_pixel_t c16;
color_convert(rgb8_pixel_t(255,0,0), c16);

You are right that some color conversion combinations are not yet
provided. This has nothing to do with the performance to do color
conversion, but more with the fact that there is a combinatorial
explosion in the number of conversions - every color space to every
other.
Color conversion between RGB and LAB could be implemented just like it
is done between, say, RGB and CMYK.
Is there any performance advantage to implementing it differently?

I agree that sometimes it doesn't make sense to have a specific color
space. Sometimes you really want to think of the data as image of pixel
with N channels of type T. GIL provides models of anonymous N-channel
homogeneous pixels, which have "device-N" color space. Here are the
types of 5-channel anonymous 16-bit unsigned integral pixels and image
views:

dev5n16_pixel_t p(0,1,3,2,3);
dev5n16_view_t view5;

Their color space is devicen_t<N> where N is the number of channels.
(We are open to suggestions for a better name)
...
Regarding pixel type conversion and mixed-valued functions: 
It was already 
pointed out that the resulting code bloat can easily get out 
of control, 
especially in the context of the dynamic_image. I read 
Lubomir's OOPSLA
Which, by the way, I noticed is posted here:
http://sms.cs.chalmers.se/bibliography/proceedings/2006-LCSD.pdf
...
paper and think that it only offers a partial solution - it 
only covers 
the case were different type combinations actually create the 
same binary 
code. IMHO, this is not the most problematic case.
Consider just the scalar pixel types 'unsigned byte', 'signed short', 
'unsigned short', 'int', and 'float'. All of them are 
regularly occuring 
in VIGRA algorithms, and no combination will be binary equal 
to any other.
As described in my paper, the types can be equivalent in the context of
the algorithm.
For example, for bitwise-copy, copying "signed short to signed short"
and copying "unsigned short to unsigned short" can be treated as
equivalent.
...
What is needed is a consistent set of coercion rules. For example, 
'unsigned char' x 'unsigned char' might be implemented natively (for 
speed), while 'unsigned char' x 'float' will be implemented 
by a coercion 
of the first pixel to 'float', followed by a native 'float' x 'float' 
operation.
These rules, together with appropriate result_of types, can be easily 
formulated in a set a traits classes, and should also be 
customizable on a 
per-function-call basis. Coercion may not be a prerequisite for the 
usefulness of the core part of GIL, but it is critical for the 
dynamic_image and the future image algorithm library 
(including the color 
conversion function suggested above).
I can see how automatically converting one type to another can be
useful, but I am not sure how this relates to the topic of
dynamic_image.
Whether or not the type of the image is specified at run time, and
whether or not you want to convert one type to another are two
orthogonal things... unless I am misunderstanding you.
...
...
- What is your evaluation of the implementation?
It is mostly OK. I would like the following improvements:
- there are several comments saying 'potential performance 
problem' (or 
equivalent) in the code. The code should be changed to 
eliminate these 
problems because the affected functions are performance critical.
I agree, on a case by case basis. We are trying to get what we believe
are the bottlenecks as efficient as possible. It would be quite a job to
make all parts of GIL completely optimal. Very often extra efficiency
comes at the cost of code size, extra stuff in data structures, or
decreasing the performance in other places of the code... Sometimes the
tradeoff is not worth it.

For example, currently view.end() is not very efficient as it is
implemented as "return begin()+size()".
If you say:
for (it=view.begin(); it!=view.end(); ++it) {...}

You may suffer a performance hit there. But how do we fix this? Store
the end iterator inside the view? Then the view class will grow and
contain less than minimum info... Unlike std::vector implementations, we
need to keep the dimensions explicitly as they cannot be inferred simply
by end() - begin(), which is 1D.
...
- C-style casts and reinterpret_casts should be eliminated 
(if this is 
impossible in the latter case, the casts should be guarded by 
static_assert to do the right thing).
I would love to eliminate them but I don't see how. Static asserts are
always a good idea.
...
- proper initialization of the pixel values should be added (e.g. by 
uninitialized_fill).
Already on our todo list.
...
- I didn't check in detail whether all functions (especially 
constructors 
and destructors) are fully exception safe. The authors should make a 
statement about this.
Unless there are bugs, everything should be exception-safe.
...
- code readability should be improved (more line-breaks).
- the test suite should be made part of the distribution.
It is posted on our download page.

Thanks again for your review.

Lubomir

Re: [boost] GIL - late review

Lubomir Bourdev