
Hi Ulli,
I like to think of images in terms of Abstract Data Types. My rules for member functions vs. free functions are therefore:
- functions that retrieve the internal state of a single image or change this state belong to the class (i.e. resize(), get_height() etc.)
- functions that involve several images and create new images (or rather write the values of a pre-allocated destination) belong outside the class, e.g. transform() and convolve()
- functions that have potentially open-ended semantics belong outside the class (e.g. view creation).
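The rules above can be sketched in a few lines. This is a hypothetical minimal image class, not the actual VIGRA or GIL interface; the names (image, recreate, copy) are illustrative only:

```cpp
#include <cassert>
#include <vector>

class image {
public:
    image(int w, int h) : width_(w), height_(h), data_(w * h, 0) {}

    // Member functions: query or change the state of a single image.
    int width()  const { return width_; }
    int height() const { return height_; }
    void recreate(int w, int h) {      // replaces resize(); old contents are lost
        width_ = w; height_ = h;
        data_.assign(w * h, 0);
    }

    unsigned char& at(int x, int y)       { return data_[y * width_ + x]; }
    unsigned char  at(int x, int y) const { return data_[y * width_ + x]; }

private:
    int width_, height_;
    std::vector<unsigned char> data_;
};

// Free function: involves several images and writes into a pre-allocated
// destination, so by the rules above it lives outside the class.
void copy(const image& src, image& dst) {
    assert(src.width() == dst.width() && src.height() == dst.height());
    for (int y = 0; y < src.height(); ++y)
        for (int x = 0; x < src.width(); ++x)
            dst.at(x, y) = src.at(x, y);
}
```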
I have no problem with following these rules.
Also, any suggestions for a better name than resize_clobber_image? (Just not resize, to avoid confusion with vector resize behavior.)
Yes, please. IMHO image::resize() is ok, the semantic difference to std::vector is no problem for me, but others' opinions may be different.
My rule of thumb: when I provide functionality that is similar but not identical to existing functionality, I go out of my way to indicate that they are different. This is why GIL calls 2D 'iterators' locators, although they are almost identical to traditional iterators. Similarly, I prefer names like resize_and_clobber(), resize_and_invalidate(), create_with_new_dimensions(), recreate(), or reset() to resize(). What do other people think?
2. If a color model does not have specific functions (in the current version of the code base), a specific class is definitely not warranted.
This applies especially to the lab and hsb classes in the current GIL version.
We could get rid of them. Or we could provide color conversion for them.
The first possibility might be ok for the first release, but eventually the second will be required. Still, the question is: is a color conversion function sufficient to warrant a class of its own? (This is a general question, not targeted against GIL.)
I think it depends on your application. In some contexts specific color models are widely used and others are not needed. But each color model was invented because it was needed somewhere. I see little harm in providing an extensive set of color models. You don't have to use them if you don't want to - you could just use the anonymous model.
This is a good thing, and VIGRA tries the same (e.g. interleaved vs. planar images can be abstracted by means of accessors). But the problem is also open-ended. For example, what about tiled images (necessary for very large files that don't fit into memory, and for better cache locality)?
In my opinion tiled images are a different story; they cannot be just abstracted out and hidden under the rug the way planar/interleaved images can. If you want your algorithm to work tile-by-tile, you have to redesign it with this requirement in mind. And some algorithms are inherently less tile-friendly than others. It may be possible to create a virtual image that tries to predict and adapt to the access pattern of a generic algorithm, but my feeling is that it won't do much better than simply letting the virtual memory of your OS handle it.
In my experience, it doesn't cost much (if anything), at least for integral underlying types. As an analogous example: a pure pointer and an iterator class which contains just a pure pointer behave identically on all systems I tried.
We have some internal performance benchmarks showing that for complex algorithms (like sort), representing the vector iterator as a pointer wrapped in a class results in suboptimal performance on some compilers. Compilers are good enough for simple algorithms, but in more fancy contexts may fail to keep the wrapped pointer in a register. But I haven't looked into specifics.
Right now GIL hard-codes the commonly used types to their commonly used ranges - float is 0..1.
Really? I didn't realize this and don't like it at all as a default setting. IMHO, the pleasant property of floats is that one (mostly) doesn't need to take care of ranges and loss of precision.
Think of these as part of the channel traits specifying the legal range of the channel, which is often smaller than its physical range. You are, of course, free to ignore them (and it may make sense to do so in intermediate computations), but they are implicitly always defined. Having them explicitly defined makes certain operations much easier, such as converting from one channel type to another. Not having them defined would force you to propagate them to every function that might need them. For example channel_convert will need to be provided the minimum and maximum value for both source and destination, and so will the color conversion functions, the color converted view, etc.
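This idea can be sketched as traits carrying an explicit legal range per channel type, which may be smaller than the physical range (float is 0..1). The names below mirror the spirit of what is described, but are assumptions, not GIL's actual channel_traits interface:

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical channel traits: each channel type carries its legal range.
template <typename T> struct channel_traits;

template <> struct channel_traits<std::uint8_t> {
    static constexpr double min_value = 0.0;
    static constexpr double max_value = 255.0;
};
template <> struct channel_traits<std::uint16_t> {
    static constexpr double min_value = 0.0;
    static constexpr double max_value = 65535.0;
};
template <> struct channel_traits<float> {
    static constexpr double min_value = 0.0;   // legal range 0..1,
    static constexpr double max_value = 1.0;   // not the physical float range
};

// Because both ranges come from the traits, channel_convert needs no extra
// min/max parameters: it rescales from one legal range to the other.
template <typename Dst, typename Src>
Dst channel_convert(Src s) {
    double unit = (s - channel_traits<Src>::min_value) /
                  (channel_traits<Src>::max_value - channel_traits<Src>::min_value);
    double d = channel_traits<Dst>::min_value +
               unit * (channel_traits<Dst>::max_value - channel_traits<Dst>::min_value);
    if constexpr (std::is_integral<Dst>::value)
        return static_cast<Dst>(d + 0.5);   // round to nearest for integers
    else
        return static_cast<Dst>(d);
}
```

Without the traits, every caller of channel_convert (and of the color conversion functions built on it) would have to pass four extra range arguments.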
Some color conversions are defined by means of the rgb or xyz color space as intermediate spaces anyway. Others are concatenations of linear transforms, which are easily optimized at compile-time or run-time. Loss of precision is a non-issue if the right temporary type is used for intermediate results.
I was talking about loss of precision due to out-of-gamut representations. Those could result in large errors, as the result of color conversion can only lie in the intersection of the gamuts of all color spaces.
But I still think whether you convert one type to another, and whether you apply that operation on statically-specified or dynamically-specified types are two orthogonal things.
Yes, but when types are statically specified, you usually know what to expect, so the number of supported types is fairly limited. In contrast, the dynamic type must (statically) support any type combination that could happen. Thus, the two things are linked in practice.
GIL's dynamic image doesn't do anything fancy. It simply instantiates the algorithm with all possible types and selects the right one at run time. If your source runtime image can be of type 1 or 2, and your destination can be A, B or C, then when you invoke copy_pixels it will instantiate copy_pixels(1,A), copy_pixels(1,B), copy_pixels(1,C), copy_pixels(2,A), copy_pixels(2,B), copy_pixels(2,C), and switch to the correct one at run-time. It has an optional mode to reduce code bloat by collapsing identical instantiations and representing binary algorithms using a single switch statement (instead of two nested ones).
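The mechanism can be sketched with std::variant; this is a stripped-down assumption of how such dispatch works, not GIL's actual implementation (which predates std::variant), and the gray8_t/gray16_t types are placeholders:

```cpp
#include <cassert>
#include <cstdint>
#include <variant>
#include <vector>

// The runtime "any image" is a variant over the statically supported types.
using gray8_t   = std::vector<std::uint8_t>;
using gray16_t  = std::vector<std::uint16_t>;
using any_image = std::variant<gray8_t, gray16_t>;

// The statically typed algorithm; one instantiation per type combination.
template <typename Src, typename Dst>
void copy_pixels(const Src& src, Dst& dst) {
    dst.assign(src.begin(), src.end());
}

// The dynamic wrapper: std::visit instantiates the algorithm for every
// source/destination combination and switches to the right one at run time.
void copy_pixels(const any_image& src, any_image& dst) {
    std::visit([](const auto& s, auto& d) { copy_pixels(s, d); }, src, dst);
}
```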
First of all, in GIL we rarely allow directly mixing images of incompatible types.
What are the criteria for 'incompatible'?
Compatible images must allow for lossless conversion back and forth. All others are incompatible.
For compatible images there is no need for type coercion and no loss of precision.
Loss of precision and out-of-range values become important issues as soon as arithmetic operations enter the picture -- you only want to round once, at the end of the computation, and not in every intermediate step.
I agree - in cases where you need arithmetic operations, you need to worry about specifying intermediate types, and there is loss of precision.
For example copy_pixels(rgb8, cmyk16) will not compile.
What's incompatible: rgb vs cmyk or 8 bit vs 16 bit?
Both the color space and the channel type.
If you want to use incompatible views in algorithms like copy_pixels, you have to state explicitly how you want the mapping to be done:
copy_pixels(rgb8_v, color_converted_view<rgb8_pixel_t>(cmyk16));
copy_pixels(nth_channel_view(rgb8_v,2), gray8_v);
Fair enough, but why not make the most common case the default? It worked well for us.
There are some fundamental operations that can be performed at the level of channels, pixels, image views and images. They are:
- copy construction
- copy assignment
- equality comparison
We would like to have them defined only between compatible channels/pixels/views/images. Otherwise some simple operations are not mathematically consistent. For example, equality comparison should be an equivalence relation, but if you define it for incompatible types, it is no longer transitive...
Lubomir
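The transitivity failure can be shown concretely. The eq() helper below is a deliberately ill-advised assumption (equality between incompatible channel types defined via a lossy downconversion), not a GIL function:

```cpp
#include <cassert>
#include <cstdint>

// Lossy downconversion from 16-bit to 8-bit (divide by 257, round to nearest).
std::uint8_t to8(std::uint16_t v) {
    return static_cast<std::uint8_t>((v + 128) / 257);
}

// A plausible but ill-advised mixed-type equality: compare in the smaller type.
bool eq(std::uint16_t a, std::uint8_t b) { return to8(a) == b; }
bool eq(std::uint8_t a, std::uint16_t b) { return a == to8(b); }

// Two distinct 16-bit values both round to the same 8-bit value, so
// eq(y1, a) and eq(a, y2) hold even though y1 != y2: not transitive.
```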