
Hi Lubomir,

Lubomir Bourdev wrote:
First of all, in GIL we rarely allow direct mixing images of incompatible types together.
What are the criteria for 'incompatible'?
Compatible images must allow for lossless conversion back and forth. All others are incompatible.
That's a nice rule, but it's not the rule of the underlying C/C++ language, where "compatible" means more or less that an implicit conversion between two types is defined. Our default behavior in VIGRA resembles this as far as possible, with the exception that the default conversions include clamping and rounding where appropriate. In practice, both definitions of compatibility will probably work equally well as long as conversions can be customized. I still think that a default implicit conversion makes life easier in many situations, without causing unpleasant surprises. The customization problem is especially hard if the conversion happens deep inside a nested operation (which has possibly been created by some automatic function/functor composition mechanism), when an intermediate type must be converted back to some fixed type, e.g. the result type (recall that intermediate types usually differ from the result types in VIGRA, because that allows us to round/clamp only once).
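To make the point concrete, here is a minimal sketch of what such a default conversion could look like; the function name is hypothetical and this is not VIGRA's actual API:

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical default conversion in the spirit described above:
// an implicit conversion, but with rounding and clamping where appropriate.
inline std::uint8_t convert_to_uint8(float v)
{
    float rounded = std::round(v);                    // round to nearest
    float clamped = std::clamp(rounded, 0.0f, 255.0f); // clamp to target range
    return static_cast<std::uint8_t>(clamped);
}
```

With this rule, convert_to_uint8(300.7f) yields 255 and convert_to_uint8(-4.2f) yields 0, instead of the implementation-defined garbage a bare static_cast would produce for out-of-range values.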
I agree -- in cases where you need arithmetic operations, you need to worry about specifying intermediate types, and there is loss of precision.
Arithmetic operations are the bread and butter of image processing as I know it. The same applies to filters, edge detectors, interest operators etc. Loss of precision occurs only when the intermediate or result types are chosen badly.
How about image::recreate(width,height)?
I like this name -- it says exactly what's happening. Please remember that

    image::recreate(width, height, initial_pixel_value)
    image::recreate(size_object)
    image::recreate(size_object, initial_pixel_value)

should also be defined.
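The requested overload set could be sketched as follows; the class skeleton, the size2d type, and the vector-backed storage are all hypothetical stand-ins, not GIL's actual code:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical size type standing in for "size_object" above.
struct size2d { std::size_t width, height; };

template <class Pixel>
class image
{
    std::size_t w_ = 0, h_ = 0;
    std::vector<Pixel> data_;
public:
    // the four overloads discussed in the thread
    void recreate(std::size_t w, std::size_t h)                { w_ = w; h_ = h; data_.assign(w * h, Pixel()); }
    void recreate(std::size_t w, std::size_t h, const Pixel& p){ w_ = w; h_ = h; data_.assign(w * h, p); }
    void recreate(const size2d& s)                             { recreate(s.width, s.height); }
    void recreate(const size2d& s, const Pixel& p)             { recreate(s.width, s.height, p); }

    std::size_t width()  const { return w_; }
    std::size_t height() const { return h_; }
    const Pixel& operator()(std::size_t x, std::size_t y) const { return data_[y * w_ + x]; }
};
```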
It comes down to some basic principles:
1. We want equality comparison to be an equivalence relation.
2. We also want equality comparison and copying to be defined on the same set of types.
3. The two operations should work as expected. In particular, after a=b, you can assert that a==b.
I hope you will agree that these are fundamental rules that hold in mathematics and violating them can lead to unintuitive and bug-prone systems.
Unfortunately, the CPU itself violates rule 3: a double in a register and the same double written to memory and read back need no longer compare equal - a behavior that is really hard to debug :-(
But the first principle requires that the types be compatible (i.e. there must be a one-to-one correspondence between them). To see why, let's assume we can define operator== between int and float, and define it to round to the nearest int when comparing an int to a float.
No, never do that. Mixed type expressions are always coerced to the highest of the types involved or to an even higher type. Otherwise, you will really get unpleasant surprises. (OK, your code is only an illustration, I know).
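For illustration, the rounding comparison from the thought experiment above can be made concrete; round_eq is a hypothetical stand-in for the proposed operator==:

```cpp
#include <cmath>

// Hypothetical mixed-type comparison that rounds the float to the
// nearest int, as in the example above. It cannot be an equivalence
// relation: 0.6f and 1.4f both compare equal to 1, yet 0.6f != 1.4f,
// so transitivity fails.
inline bool round_eq(int i, float f)
{
    return i == static_cast<int>(std::lround(f));
}
```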
You may define operator== to not round but "promote" the int to a float. That will make your equality comparison an equivalence relation. The problem then will shift to rule 3 because you cannot do the same promotion when copying. Consider this:
    float a = 5.1;
    int b = a;
    assert(b == a); // fails!
I don't see this as a surprise -- after all, we have performed a lossy assignment in between. Type promotion is a well understood operation.
That is, "a=b" should be defined for exactly the same types for which "a==b" is defined. Therefore, copy should only be defined between compatible images.
I like this rule. So, the fundamental question is: should a = b imply a == b? What are the language gurus saying about this? Niklaus Wirth and Bertrand Meyer are certainly in favour of the implication, whereas the C/C++ inventors opted against it. In practice, it amounts to two questions:

1. Will a default implicit conversion be useful enough to tolerate its potential for surprises?
2. Can one design the system so that customized conversions can be conveniently configured, even deep inside a nested call?

I'd like to hear the opinion of others about this.
Regardless of what we do, there is always a notion of the operating range of a channel, whether it is implicit or explicit.
I don't agree. For example, when you use Gaussian filters to compute derivatives or the structure tensor etc. the result has no obvious range - it depends on the image data and operator in a non-trivial way. For example, the minimal and maximal possible values of a Gaussian first derivative are proportional to 1/sigma, but sigma (the scale of the Gaussian filter) is usually only known at runtime. It is the whole point of floating point (as opposed to fixed point) to get rid of these range considerations.
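The 1/sigma dependence is easy to check numerically. The sketch below (my own illustration, not VIGRA code) computes the peak response of a first-derivative-of-Gaussian filter to a unit step edge; analytically this is G_sigma(0) = 1/(sigma*sqrt(2*pi)), so the output range shrinks like 1/sigma as the runtime parameter sigma grows:

```cpp
#include <cmath>

// Peak response of a first-derivative-of-Gaussian filter to a unit step
// edge, computed by integrating g'(t) over t in (-inf, 0]. Analytically
// this equals the Gaussian's peak value 1/(sigma*sqrt(2*pi)).
double peak_step_response(double sigma)
{
    const double pi = 3.14159265358979323846;
    const double dx = sigma / 1000.0;       // fine sampling step
    double peak = 0.0;
    for (double t = -6.0 * sigma; t <= 0.0; t += dx) {
        double g  = std::exp(-t * t / (2.0 * sigma * sigma))
                    / (sigma * std::sqrt(2.0 * pi));
        double dg = -t / (sigma * sigma) * g;  // derivative of the Gaussian
        peak += dg * dx;
    }
    return peak;
}
```

For sigma = 1 this gives roughly 0.40, for sigma = 2 roughly 0.20: the "natural" range of the result is data- and parameter-dependent, which is exactly why a fixed channel range is unhelpful here.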
You are essentially suggesting that the range of a channel be defined at run time, rather than at compile time, right?
More precisely, we avoid explicitly defined ranges, until some operation (e.g. display) requires an explicit range. Remember that it is no longer slow to do all image processing in float.
But you still need to know the range for many operations. For example, converting between additive and subtractive color spaces requires inverting the channel, which requires knowing its range.
Not necessarily. In many cases, one can simply negate the channel value(s) and worry about range remapping later, when the required output format is known. Likewise, if you don't require fixed ranges, you may perform out-of-gamut computations without loss of precision. You can't display these values, but they are mostly intermediate results anyway. Thus, I strongly argue that the range 0...1 for floating point values should be dropped as the default behavior.
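The two strategies can be contrasted in a small sketch; all names here are hypothetical illustrations, not GIL or VIGRA code:

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Fixed-range inversion: requires knowing the channel maximum up front.
inline float invert_fixed(float v, float max) { return max - v; }

// Range-free alternative: just negate now ...
inline float negate(float v) { return -v; }

// ... and remap only when an output format with a known range is required,
// e.g. for display.
inline std::uint8_t remap_to_uint8(float v, float lo, float hi)
{
    float t = (v - lo) / (hi - lo);  // normalize to [0,1]
    return static_cast<std::uint8_t>(
        std::lround(std::clamp(t, 0.0f, 1.0f) * 255.0f));
}
```

Both routes agree once the final remapping is applied: inverting in a fixed [0,1] range and negating followed by remapping from [-1,0] produce the same 8-bit output, but the second route never commits to a range during the intermediate computation.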
GIL's dynamic image doesn't do anything fancy. It simply instantiates the algorithm with all possible types and selects the right one at run time. If your source runtime image can be of type 1 or 2, and your destination can be A,B or C, when you invoke copy_pixels, it will instantiate:
copy_pixels(1,A) ... copy_pixels(2,C)
and switch to the correct one at run-time.
That's all well and good. But in practice, the dynamic image might well support more than three types, and the system will need operations with more than two arguments. Then a combinatorial explosion occurs unless a powerful coercion mechanism is provided.
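The dispatch mechanism under discussion can be sketched with std::variant; this is a modern illustration of the idea, not GIL's actual implementation. With n runtime types and k runtime arguments, std::visit instantiates the kernel n^k times -- here 2 source types times 3 destination types = 6 instantiations:

```cpp
#include <cstdint>
#include <variant>
#include <vector>

// Hypothetical "dynamic images": a closed set of possible pixel types.
using src_image = std::variant<std::vector<std::uint8_t>,
                               std::vector<float>>;
using dst_image = std::variant<std::vector<std::uint8_t>,
                               std::vector<std::uint16_t>,
                               std::vector<float>>;

// The statically-typed kernel, instantiated once per type combination.
template <class S, class D>
void copy_pixels(const std::vector<S>& src, std::vector<D>& dst)
{
    dst.assign(src.begin(), src.end());  // element-wise implicit conversion
}

// The runtime dispatcher: selects the right instantiation at run time.
void copy_pixels(const src_image& src, dst_image& dst)
{
    std::visit([](const auto& s, auto& d) { copy_pixels(s, d); }, src, dst);
}
```

Already with two arguments the instantiation count is the product of the type-set sizes; a three-argument operation over the same sets would need 2*3*3 = 18 instantiations, which is the combinatorial explosion referred to above.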
In some contexts specific color models are widely used and others are not needed. But each color model was invented because it was needed somewhere. I see little harm in providing an extensive set of color models.
That's not what I was arguing against. I was arguing against _pretending_ support for a color space when just an accordingly named class is provided, but no operations.
In my opinion, tiled images are a different story; they cannot just be abstracted out and hidden under the rug the way planar/interleaved images can.
I'm not so pessimistic. I have some ideas about how algorithms could be easily prepared for handling tiled storage formats.

Best regards
Ulli

--
Ullrich Koethe
Universitaet Hamburg / University of Hamburg
FB Informatik / Dept. of Informatics
AB Kognitive Systeme / Cognitive Systems Group
Phone: +49 (0)40 42883-2573   Fax: +49 (0)40 42883-2572
Vogt-Koelln-Str. 30, D-22527 Hamburg, Germany
Email: u.koethe@computer.org / koethe@informatik.uni-hamburg.de
WWW: http://kogs-www.informatik.uni-hamburg.de/~koethe/