
Hi Lubomir,

Lubomir Bourdev wrote:
First of all, in GIL we rarely allow direct mixing images of incompatible types together.
What are the criteria for 'incompatible'?
Compatible images must allow for lossless conversion back and forth. All others are incompatible.
That's a nice rule, but it's not the rule of the underlying C/C++ language, where "compatible" means more or less that an implicit conversion between two types is defined. Our default behavior in VIGRA resembles this as far as possible, with the exception that the default conversions include clamping and rounding where appropriate. In practice, both definitions of compatibility will probably work equally well as long as conversions can be customized. I still think that a default implicit conversion makes life easier in many situations, without causing unpleasant surprises. The customization problem is especially hard if the conversion happens deep inside a nested operation (which has possibly been created by some automatic function/functor composition mechanism), when an intermediate type must be converted back to some fixed type, e.g. the result type (recall that intermediate types usually differ from the result types in VIGRA, because that allows us to round/clamp only once).
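To make the point concrete, here is a minimal sketch of what such a default conversion could look like; the function name is hypothetical and this is not VIGRA's actual API:

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical default conversion in the spirit described above:
// an implicit conversion, but with rounding and clamping where appropriate.
inline std::uint8_t convert_to_uint8(float v)
{
    float rounded = std::round(v);                    // round to nearest
    float clamped = std::clamp(rounded, 0.0f, 255.0f); // clamp to target range
    return static_cast<std::uint8_t>(clamped);
}
```

With this rule, convert_to_uint8(300.7f) yields 255 and convert_to_uint8(-4.2f) yields 0, instead of the implementation-defined garbage a bare static_cast would produce for out-of-range values.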
I agree -- in cases where you need arithmetic operations, you need to worry about specifying intermediate types, and there is loss of precision.
Arithmetic operations are the bread and butter of image processing as I know it. The same applies to filters, edge detectors, interest operators etc. Loss of precision occurs only when the intermediate or result types are chosen badly.
How about image::recreate(width,height)?
I like this name -- it says exactly what's happening. Please remember that

    image::recreate(width, height, initial_pixel_value)
    image::recreate(size_object)
    image::recreate(size_object, initial_pixel_value)

should also be defined.
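The requested overload set could be sketched as follows; the class skeleton, the size2d type, and the vector-backed storage are all hypothetical stand-ins, not GIL's actual code:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical size type standing in for "size_object" above.
struct size2d { std::size_t width, height; };

template <class Pixel>
class image
{
    std::size_t w_ = 0, h_ = 0;
    std::vector<Pixel> data_;
public:
    // the four overloads discussed in the thread
    void recreate(std::size_t w, std::size_t h)                { w_ = w; h_ = h; data_.assign(w * h, Pixel()); }
    void recreate(std::size_t w, std::size_t h, const Pixel& p){ w_ = w; h_ = h; data_.assign(w * h, p); }
    void recreate(const size2d& s)                             { recreate(s.width, s.height); }
    void recreate(const size2d& s, const Pixel& p)             { recreate(s.width, s.height, p); }

    std::size_t width()  const { return w_; }
    std::size_t height() const { return h_; }
    const Pixel& operator()(std::size_t x, std::size_t y) const { return data_[y * w_ + x]; }
};
```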
It comes down to some basic principles:
1. We want equality comparison to be an equivalence relation.
2. We also want equality comparison and copying to be defined on the same set of types.
3. The two operations should work as expected. In particular, after a=b, you can assert that a==b.
I hope you will agree that these are fundamental rules that hold in mathematics and violating them can lead to unintuitive and bug-prone systems.
Unfortunately, the CPU itself violates rule 3: a double in a register and the same double written to memory and read back need no longer compare equal - a behavior that is really hard to debug :-(
But the first principle requires that the types be compatible (i.e. there must be a one-to-one correspondence between them). To see why, let's assume we can define operator== between int and float, and define it to round to the nearest int when comparing an int to a float.
No, never do that. Mixed type expressions are always coerced to the highest of the types involved or to an even higher type. Otherwise, you will really get unpleasant surprises. (OK, your code is only an illustration, I know).
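For illustration, the rounding comparison from the thought experiment above can be made concrete; round_eq is a hypothetical stand-in for the proposed operator==:

```cpp
#include <cmath>

// Hypothetical mixed-type comparison that rounds the float to the
// nearest int, as in the example above. It cannot be an equivalence
// relation: 0.6f and 1.4f both compare equal to 1, yet 0.6f != 1.4f,
// so transitivity fails.
inline bool round_eq(int i, float f)
{
    return i == static_cast<int>(std::lround(f));
}
```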
You may define operator== to not round but "promote" the int to a float. That will make your equality comparison an equivalence relation. The problem then will shift to rule 3 because you cannot do the same promotion when copying. Consider this:
    float a = 5.1;
    int b = a;
    assert(b == a); // fails!
I don't see this as a surprise -- after all, we have performed a lossy assignment in between. Type promotion is a well understood operation.
That is, "a=b" should be defined for exactly the same types for which "a==b" is defined. Therefore, copy should only be defined between compatible images.
I like this rule. So, the fundamental question is: should a = b imply a == b? What are the language gurus saying about this? Niklaus Wirth and Bertrand Meyer are certainly in favour of the implication, whereas the C/C++ inventors opted against it. In practice, it amounts to two questions:

1. Will a default implicit conversion be useful enough to tolerate its potential for surprises?
2. Can one design the system so that customized conversions can be conveniently configured, even deep inside a nested call?

I'd like to hear the opinion of others about this.
Regardless of what we do, there is always a notion of the operating range of a channel, whether it is implicit or explicit.
I don't agree. For example, when you use Gaussian filters to compute derivatives or the structure tensor etc. the result has no obvious range - it depends on the image data and operator in a non-trivial way. For example, the minimal and maximal possible values of a Gaussian first derivative are proportional to 1/sigma, but sigma (the scale of the Gaussian filter) is usually only known at runtime. It is the whole point of floating point (as opposed to fixed point) to get rid of these range considerations.
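The 1/sigma dependence is easy to check numerically. The sketch below (my own illustration, not VIGRA code) computes the peak response of a first-derivative-of-Gaussian filter to a unit step edge; analytically this is G_sigma(0) = 1/(sigma*sqrt(2*pi)), so the output range shrinks like 1/sigma as the runtime parameter sigma grows:

```cpp
#include <cmath>

// Peak response of a first-derivative-of-Gaussian filter to a unit step
// edge, computed by integrating g'(t) over t in (-inf, 0]. Analytically
// this equals the Gaussian's peak value 1/(sigma*sqrt(2*pi)).
double peak_step_response(double sigma)
{
    const double pi = 3.14159265358979323846;
    const double dx = sigma / 1000.0;       // fine sampling step
    double peak = 0.0;
    for (double t = -6.0 * sigma; t <= 0.0; t += dx) {
        double g  = std::exp(-t * t / (2.0 * sigma * sigma))
                    / (sigma * std::sqrt(2.0 * pi));
        double dg = -t / (sigma * sigma) * g;  // derivative of the Gaussian
        peak += dg * dx;
    }
    return peak;
}
```

For sigma = 1 this gives roughly 0.40, for sigma = 2 roughly 0.20: the "natural" range of the result is data- and parameter-dependent, which is exactly why a fixed channel range is unhelpful here.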
You are essentially suggesting that the range of a channel be defined at run time, rather than at compile time, right?
More precisely, we avoid explicitly defined ranges, until some operation (e.g. display) requires an explicit range. Remember that it is no longer slow to do all image processing in float.
But you still need to know the range for many operations. For example, converting between additive and subtractive color spaces requires inverting the channel, which requires knowing its range.
Not necessarily. In many cases, one can simply negate the channel value(s) and worry about range remapping later, when the required output format is known. Likewise, if you don't require fixed ranges, you may perform out-of-gamut computations without loss of precision. You can't display these values, but they are mostly intermediate results anyway. Thus, I strongly argue that the range 0...1 for floating point values should be dropped as the default behavior.
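The two strategies can be contrasted in a small sketch; all names here are hypothetical illustrations, not GIL or VIGRA code:

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Fixed-range inversion: requires knowing the channel maximum up front.
inline float invert_fixed(float v, float max) { return max - v; }

// Range-free alternative: just negate now ...
inline float negate(float v) { return -v; }

// ... and remap only when an output format with a known range is required,
// e.g. for display.
inline std::uint8_t remap_to_uint8(float v, float lo, float hi)
{
    float t = (v - lo) / (hi - lo);  // normalize to [0,1]
    return static_cast<std::uint8_t>(
        std::lround(std::clamp(t, 0.0f, 1.0f) * 255.0f));
}
```

Both routes agree once the final remapping is applied: inverting in a fixed [0,1] range and negating followed by remapping from [-1,0] produce the same 8-bit output, but the second route never commits to a range during the intermediate computation.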
GIL's dynamic image doesn't do anything fancy. It simply instantiates the algorithm with all possible types and selects the right one at run time. If your source runtime image can be of type 1 or 2, and your destination can be A,B or C, when you invoke copy_pixels, it will instantiate:
copy_pixels(1,A) ... copy_pixels(2,C)
and switch to the correct one at run-time.
That's all well and good. But in practice, the dynamic image might well support more than three types, and the system will need operations with more than two arguments. Then a combinatorial explosion occurs unless a powerful coercion mechanism is provided.
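The dispatch mechanism under discussion can be sketched with std::variant; this is a modern illustration of the idea, not GIL's actual implementation. With n runtime types and k runtime arguments, std::visit instantiates the kernel n^k times -- here 2 source types times 3 destination types = 6 instantiations:

```cpp
#include <cstdint>
#include <variant>
#include <vector>

// Hypothetical "dynamic images": a closed set of possible pixel types.
using src_image = std::variant<std::vector<std::uint8_t>,
                               std::vector<float>>;
using dst_image = std::variant<std::vector<std::uint8_t>,
                               std::vector<std::uint16_t>,
                               std::vector<float>>;

// The statically-typed kernel, instantiated once per type combination.
template <class S, class D>
void copy_pixels(const std::vector<S>& src, std::vector<D>& dst)
{
    dst.assign(src.begin(), src.end());  // element-wise implicit conversion
}

// The runtime dispatcher: selects the right instantiation at run time.
void copy_pixels(const src_image& src, dst_image& dst)
{
    std::visit([](const auto& s, auto& d) { copy_pixels(s, d); }, src, dst);
}
```

Already with two arguments the instantiation count is the product of the type-set sizes; a three-argument operation over the same sets would need 2*3*3 = 18 instantiations, which is the combinatorial explosion referred to above.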
In some contexts specific color models are widely used and others are not needed. But each color model was invented because it was needed somewhere. I see little harm in providing an extensive set of color models.
That's not what I was arguing against. I was arguing against _pretending_ support for a color space when just an accordingly named class is provided, but no operations.
In my opinion, tiled images are a different story; they cannot just be abstracted out and hidden under the rug the way planar/interleaved images can.
I'm not so pessimistic. I have some ideas about how algorithms could be easily prepared for handling tiled storage formats.

Best regards
Ulli

--
Ullrich Koethe
Universitaet Hamburg / University of Hamburg
FB Informatik / Dept. of Informatics
AB Kognitive Systeme / Cognitive Systems Group
Phone: +49 (0)40 42883-2573   Fax: +49 (0)40 42883-2572
Vogt-Koelln-Str. 30, D-22527 Hamburg, Germany
Email: u.koethe@computer.org / koethe@informatik.uni-hamburg.de
WWW: http://kogs-www.informatik.uni-hamburg.de/~koethe/