
Hi Ulli,
No, I was unclear. I mean that it should be possible to provide an initial pixel value (which is copied into all pixels), like
some_image_t img(a_size, an_initial_pixel_value);
That's a good idea. Will do.
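For illustration, a fill constructor along those lines could look like this (a minimal sketch with a hypothetical image class and pixel type, mirroring std::vector's (count, value) constructor; not actual GIL code):

```cpp
#include <cstddef>
#include <vector>

// Hypothetical pixel and image types, for illustration only.
struct rgb8_pixel_t {
    unsigned char r, g, b;
    bool operator==(const rgb8_pixel_t& o) const {
        return r == o.r && g == o.g && b == o.b;
    }
};

class some_image_t {
    std::size_t width_, height_;
    std::vector<rgb8_pixel_t> pixels_;
public:
    // The (size, initial value) constructor proposed above: every pixel
    // is copy-constructed from the given initial value.
    some_image_t(std::size_t w, std::size_t h, const rgb8_pixel_t& initial)
        : width_(w), height_(h), pixels_(w * h, initial) {}

    const rgb8_pixel_t& operator()(std::size_t x, std::size_t y) const {
        return pixels_[y * width_ + x];
    }
    std::size_t width() const { return width_; }
    std::size_t height() const { return height_; }
};
```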
- resize() functions with default construction and copy
construction
of the newly allocated pixel values (without resize(), the default constructor image() is pointless).
There is a function called resize_clobber_image. Is that what you are looking for?
Probably. But not being a native English speaker, I have no idea what 'clobber' stands for. In addition, I'd expect resize() to be a member function of the image class, similar to std::vector. I agree that the old pixel values need not be preserved. The above comment about an initial pixel value applies to resize() as well:
some_image.resize(new_size, an_initial_pixel_value);
The policy should be: the member function image::resize() just resizes the memory without copying pixel values, whereas the free functions resize...() interpolate (or drop) pixel values, but assume the right amount of memory to be already allocated.
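That split could be sketched as follows, with hypothetical names and a simple vector-backed gray image (illustrative only, not actual GIL or VIGRA code): the member resize() reallocates without preserving values, while a free resampling function assumes the destination is already allocated.

```cpp
#include <cstddef>
#include <vector>

// Minimal gray image for illustration.
class gray_image {
    std::size_t w_ = 0, h_ = 0;
    std::vector<unsigned char> px_;
public:
    gray_image() = default;
    // Member resize(): reallocates storage only; old pixel values are
    // NOT preserved. All pixels are set to the given initial value.
    void resize(std::size_t w, std::size_t h, unsigned char initial = 0) {
        w_ = w; h_ = h;
        px_.assign(w * h, initial);
    }
    std::size_t width() const { return w_; }
    std::size_t height() const { return h_; }
    unsigned char& at(std::size_t x, std::size_t y) { return px_[y * w_ + x]; }
    const unsigned char& at(std::size_t x, std::size_t y) const { return px_[y * w_ + x]; }
};

// Free function (hypothetical name): resamples src into dst by nearest
// neighbor, assuming dst already has the desired dimensions allocated.
void resize_nearest(const gray_image& src, gray_image& dst) {
    for (std::size_t y = 0; y < dst.height(); ++y)
        for (std::size_t x = 0; x < dst.width(); ++x)
            dst.at(x, y) = src.at(x * src.width() / dst.width(),
                                  y * src.height() / dst.height());
}
```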
We deliberately did not make the interface the same as vector::resize, because the behavior is different: the vector copies the values on resize, whereas gil::image doesn't. This is why we use the name "resize_clobber_image". It is also a global function, to make external adaptation easier. For example, dynamic images already provide an overload. There are many functions that have the same interface for dynamic images and templated ones, so you can write code that can be instantiated with either:

    template <typename MetaImage>
    void flip_image(const char* file_name) {
        MetaImage img;
        jpeg_read_image(file_name, img);
        jpeg_write_view(file_name, flip_left_right_view(const_view(img)));
        resize_clobber_image(img, 100, 100);
        ...
    }

Of course, we could have done the same by keeping resize_clobber inside the image class and providing one for the dynamic_image too... I don't have a strong opinion. Do people think we should move all global functions like this to be part of the image class? Currently they are:

    resize_clobber_image(img)
    view(img)
    const_view(img)
    get_height(img or view)
    get_width(img or view)
    get_dimensions(img or view)
    get_num_channels(img or view)

Except for the first one, all these global functions have member method equivalents. We could:
1. make image::resize_clobber a member function, to be consistent,
2. or get rid of the global functions,
3. or make the member functions private, so people only use the global ones.
Also, any suggestions for a better name than resize_clobber_image? (Just not resize, to avoid confusion with vector's resize behavior.)
Why does VIGRA have a separate class for RGB Value then?
What is special
about RGB?
RGB has some specific functions (e.g. red(), luminance()), and I'm not arguing against an RGB value class. I'm just saying that
1. It may not be appropriate to represent each and every color model by its own class.
Color models are an open class, so we cannot represent all of them even if we wanted. This applies to just about any other part of GIL. Our goal is to provide a comprehensive set of the most frequently used models.
2. If a color model does not have specific functions (in the current version of the code base), a specific class is definitely not warranted.
This applies especially to the lab and hsb classes in the current GIL version.
We could get rid of them. Or we could provide color conversion for them.
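To make the point concrete, functions like red() and luminance() are what justify a dedicated RGB value class. A minimal sketch (the channel layout and the BT.601 luminance weights are my assumptions here, not taken from either library):

```cpp
// Illustrative RGB value class. An RGB-specific operation such as
// luminance() is what a generic N-channel vector would not provide.
struct rgb_value {
    unsigned char red_, green_, blue_;

    unsigned char& red()   { return red_; }
    unsigned char& green() { return green_; }
    unsigned char& blue()  { return blue_; }

    // Weighted luminance (ITU-R BT.601 weights, assumed for illustration).
    double luminance() const {
        return 0.299 * red_ + 0.587 * green_ + 0.114 * blue_;
    }
};
```

A color model without such specific operations would gain nothing from an analogous class, which is the argument being made about lab and hsb above.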
There are lots of functions that take the color information into account. Color information goes down to the very basics.
Yes, but there are solutions other than providing a new class, e.g. a string tag in the image class. It is not as type-safe, but it is simpler and more flexible. Type safety may not be the main consideration here, because type errors in color space conversions are easily visible during testing.
It is not just type safety. My rule is to push complexity down and resolve it at the building blocks. In my experience that results in a simpler and more flexible high-level design. For example, GIL pixels are smart enough to properly handle channel permutations:

    rgb8_pix = bgr8_pix;

By resolving the channel permutation once, down at the pixel level, we no longer need to worry about it every time we deal with pixels, in algorithms like copy_pixels, equal_pixels, etc.

A similar case is dealing with the planar vs. interleaved organization. We could have treated planar images as separate 1-channel images and left the burden on each higher-level algorithm to handle them explicitly (or, what is worse, not support them at all!). Instead, GIL's approach is to resolve this down at the fundamental pixel and pixel iterator models. As a result, you can write an open set of higher-level GIL algorithms without having to deal with planar/interleaved complexity. (If this reminds you of the iterator adaptor vs. data accessor discussion, it is not an accident - this is the same discussion.)

The same principles could apply down to the channel level. The channel could be made smart enough to know its range, which allows us to write channel-level algorithms that convert one channel to another. Having smart channels allows us to write color conversion routines that deal just with color, and delegate the channel conversion down to the channels:

    template <typename T1, typename T2>
    struct color_converter_default_impl<T1, gray_t, T2, rgb_t> {
        template <typename P1, typename P2>
        void operator()(const P1& src, P2& dst) const {
            dst.red = dst.green = dst.blue = channel_convert<T2>(src.gray);
        }
    };

Unfortunately, propagating the same principles to the channels brings us to a dilemma:

1. We want to use simple built-in types to represent channels. Wrapping the channel in a class may result in a potential abstraction penalty (although we haven't verified that it does).

2. At the same time, we want the channels to be smart, i.e. to associate traits types, such as min value and max value, with the channel type, so that we can implement channel-level operations like channel_convert.

3. The same built-in type may be used for semantically different channels. For example, float often has an operating range of 0..1, but if you want to use high dynamic range images you may want to use floats with different ranges. I don't know of a way in C++ to make an "alias" of a given type that allows us to associate different traits with it. For example, I'd like to say something like:

    alias float_dyn_range = float;
    alias float_01 = float;
    template <> struct channel_traits<float_01> {...};
    template <> struct channel_traits<float_dyn_range> {...};

Right now GIL hard-codes the commonly used types to their commonly used ranges (unsigned char is 0..255, float is 0..1, etc.). If you want a different range, you can make a wrapper class and define the traits for it. We are open to alternative suggestions.
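The wrapper-class workaround mentioned above could look like this (a sketch with hypothetical names; GIL's actual channel_traits machinery differs):

```cpp
// Hypothetical wrappers giving float distinct identities so that
// different traits can be attached to the same built-in type.
struct float_01  { float value; };   // operating range 0..1
struct float_dyn { float value; };   // unconstrained range

template <typename C> struct channel_traits;

template <> struct channel_traits<float_01> {
    static float min_value() { return 0.0f; }
    static float max_value() { return 1.0f; }
};
template <> struct channel_traits<unsigned char> {
    static unsigned char min_value() { return 0; }
    static unsigned char max_value() { return 255; }
};

// Channel conversion that rescales via the traits, sketching how a
// channel_convert-style function can delegate range knowledge to the
// channel type instead of hard-coding it.
inline unsigned char to_uchar(float_01 c) {
    float lo = channel_traits<float_01>::min_value();
    float hi = channel_traits<float_01>::max_value();
    float t  = (c.value - lo) / (hi - lo);
    return static_cast<unsigned char>(
        t * channel_traits<unsigned char>::max_value() + 0.5f);
}
```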
You are right that some color conversion combinations are not yet provided. This has nothing to do with the performance to do color conversion, but more with the fact that there is a combinatorial explosion in the number of conversions - every color space to every other.
That's another area where coercion is needed. By providing rules for multi-step conversion, the actual conversion code can be generated automatically from a few basic building blocks.
You mean, writing conversion to/from a common color space? Although this is appealing, the two problems I see with this are performance (it is faster to convert A to B than A to C and C to B) and loss of precision (there is no common color space that includes the gamut of all others)
Color conversion between RGB and LAB could be implemented just like it is done between, say, RGB and CMYK. Is there any performance advantage to implementing it differently?
No, but the RGB-LAB conversion is so expensive that the only practically relevant use of an lab_view is
copy_pixels(lab_view(some_rgb_image), view(lab_image));
Naive users may not be aware of the cost and may pass an lab_view as an argument to convolve(), say. Therefore, I'd prefer a functor-based solution in that case, such as
transform_image(view(rgb_image), view(lab_image), rgb2lab());
This cannot be accidentally misused.
True, but at the same time you can write a view pipeline that ends up executing faster than if you were to copy to an intermediate buffer. Consider this:

    jpeg_write_view("out.jpg",
        color_converted_view<lab8_pixel_t>(subsampled_view(my_view, 2, 2)));

This performs color conversion only for the even pixels, and it is faster than having to copy to an intermediate buffer. I am in favor of letting the programmer choose what is best rather than imposing one strategy, which in my opinion is the style of C++ and STL as compared to other languages.
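The laziness argument can be demonstrated without GIL: a view that applies a function on element access converts only the elements actually visited. A toy sketch (all names are made up; a counter stands in for the expensive color conversion):

```cpp
#include <cstddef>
#include <vector>

// Counts how many "color conversions" are actually performed.
static int conversions = 0;

int expensive_convert(int v) { ++conversions; return v * 2; }

// A lazy, subsampled "view": elements are converted only when read,
// so a pipeline over it never touches the skipped elements.
struct subsampled_converted_view {
    const std::vector<int>* src;
    std::size_t step;
    std::size_t size() const { return (src->size() + step - 1) / step; }
    int operator[](std::size_t i) const {
        return expensive_convert((*src)[i * step]);
    }
};
```

An eager design would convert all elements into an intermediate buffer first; here the conversion count equals the number of elements read, not the size of the source.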
As described in my paper, the types can be equivalent in
the context of
the algorithm. For example, for bitwise-copy, copying "signed short to signed short" and copying "unsigned short to unsigned short" can be treated as equivalent.
Yes, but a copy between uniform types is about the only example where this works.
It works for any bitwise-identical operation - copy construction, assignment, equality comparison. But I agree these are fairly limited.
I can see how automatically converting one type to another can be useful, but I am not sure how this relates to the topic of dynamic_image.
Suppose you want to support something like:
transform(dynamic_image1, dynamic_image2, dynamic_image3, _1 + _2);
for a large set of possible pixel types. Then your method of code-bloat reduction by binary compatibility applies only to very few instances. One also needs coercion, i.e. one splits transform() into two parts:
- converting all input into one of a few supported forms (e.g. leaving uniform-type input untouched, but converting mixed-type input to the highest input type)
- doing the actual computation with only the few supported type combinations.
This way, many conversion functions are needed (but they are simple and don't bloat the code much), whereas the complicated functions are instantiated only for a few type combinations.
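The two-stage scheme described above can be sketched with std::variant standing in for a dynamically-typed image (illustrative only; GIL's dynamic_image machinery is different):

```cpp
#include <cstdint>
#include <variant>
#include <vector>

using dyn_channels = std::variant<std::vector<std::uint8_t>,
                                  std::vector<std::uint16_t>>;

// Stage 1: cheap per-type conversion functions. Coerce everything to
// the "highest" supported form (here: uint16_t). These are the many
// simple instantiations that do not bloat the code much.
std::vector<std::uint16_t> coerce(const dyn_channels& d) {
    if (auto p = std::get_if<std::vector<std::uint8_t>>(&d))
        return std::vector<std::uint16_t>(p->begin(), p->end());
    return std::get<std::vector<std::uint16_t>>(d);
}

// Stage 2: the complicated algorithm is instantiated for a single
// type combination instead of all N*N mixed-type pairs.
std::vector<std::uint16_t> add(const dyn_channels& a, const dyn_channels& b) {
    std::vector<std::uint16_t> x = coerce(a), y = coerce(b), out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i)
        out[i] = static_cast<std::uint16_t>(x[i] + y[i]);
    return out;
}
```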
Now I understand what you mean. But I still think that whether you convert one type to another, and whether you apply that operation on statically-specified or dynamically-specified types, are two orthogonal things.

First of all, in GIL we rarely allow directly mixing images of incompatible types. They must have the same base color space and the same set of channels to be used in most binary algorithms, such as copy and equality comparison. For compatible images there is no need for type coercion and no loss of precision. For example, copy_pixels(rgb8, cmyk16) will not compile. If you use runtime images and call copy_pixels on two incompatible images, you will get an exception. If you want to use incompatible views in algorithms like copy_pixels, you have to state explicitly how you want the mapping to be done:

    copy_pixels(rgb8_v, color_converted_view<rgb8_pixel_t>(cmyk16));
    copy_pixels(nth_channel_view(rgb8_v, 2), gray8_v);

Now, it is possible that a generic GIL algorithm needs to take views that are incompatible. In this case, I agree that type coercion may be an option, depending on the specifics of the algorithm. But whether or not the algorithm internally uses type coercion is orthogonal to whether we happen to instantiate it for static types or for dynamic types, right?

Lubomir