
Lubomir Bourdev wrote:
Of course, we could have done the same by keeping resize_clobber inside the image class and providing one for dynamic_image too... I don't have a strong opinion. Do people think we should move all global functions like this to be part of the image class? Currently they are:
resize_clobber_image(img)
view(img)
const_view(img)
get_height(img or view)
get_width(img or view)
get_dimensions(img or view)
get_num_channels(img or view)
Except for the first one, all these global functions have member method equivalents. We could:
1. make image::resize_clobber a method, to be consistent
2. get rid of the global functions
3. make the member functions private, so people only use the global ones
I like to think of images in terms of Abstract Data Types. My rules for member functions vs. free functions are therefore:
- functions that retrieve or change the internal state of a single image belong to the class (e.g. resize(), get_height())
- functions that involve several images and create new images (or rather write the values into a pre-allocated destination) belong outside the class, e.g. transform() and convolve()
- functions that have potentially open-ended semantics belong outside the class (e.g. view creation)
There are some border cases, for example norm() or the most basic views. These might be provided either way, or even both ways.
Also, any suggestions for a better name than resize_clobber_image? (Just not resize, so it is not confused with the std::vector resize behavior.)
Yes, please. IMHO image::resize() is ok, the semantic difference to std::vector is no problem for me, but others' opinions may be different.
Color models are an open class, so we cannot represent all of them even if we wanted. This applies to just about any other part of GIL. Our goal is to provide a comprehensive set of the most frequently used models.
But this doesn't answer the question which ones warrant their own class, and which ones are just instances of static_vector.
2. If a color model does not have specific functions (in the current version of the code base), a specific class is definitely not warranted.
This applies especially to the lab and hsb classes in the current GIL version.
We could get rid of them. Or we could provide color conversion for them.
The first possibility might be ok for the first release, but eventually the second will be required. Still, the question is: is a color conversion function sufficient to warrant a class of its own? (This is a general question, not targeted against GIL.)
It is not just type safety. My rule is to push complexity down and resolve it at the level of the building blocks. In my experience that results in a simpler and more flexible high-level design.
Yes, but it may create complexity of its own (e.g. code bloat, endless compiles, unguarded reinterpret_casts). It is probably necessary to decide this on a case-by-case basis.
For example, GIL pixels are smart enough to properly handle channel permutations:
rgb8_pix = bgr8_pix;
So does VIGRA.
A similar case is for dealing with the planar vs interleaved organization. We could have treated planar images as separate 1-channel images and leave the burden on each higher level algorithm to handle them explicitly (or, what is worse, not support them at all!). Instead, GIL's approach is to resolve this down at the fundamental pixel and pixel iterator models. As a result, you can write an open set of higher level GIL algorithms without having to deal with planar/interleaved complexity.
This is a good thing, and VIGRA tries the same (e.g. interleaved vs. planar images can be abstracted by means of accessors). But the problem is also open-ended. For example, what about tiled images (necessary for very large files that don't fit into memory, and for better cache locality)?
(If this reminds you of the iterator adaptor vs data accessor discussion, it is not an accident - this is the same discussion)
I don't agree with this - the distinction between iterator adaptor and accessor is mostly syntactic, provided the proxy idiom works as expected (which it didn't 5 years back :-).
The same principles could apply down to the channel levels. The channel could be made smart enough to know its range, which allows us to write channel-level algorithms that can convert one channel to another. Having smart channels allows us to write the color conversion routines that just deal with color, and delegate the channel conversion down to the channels:
This is actually an idea we were contemplating in VIGRA as well. A general solution would be nice.
1. We want to use simple built-in types to represent channels. Wrapping the channel in a class may incur an abstraction penalty (although we haven't verified that it does).
In my experience, it doesn't cost much (if anything), at least for integral underlying types. As an analogous example: a pure pointer and an iterator class which contains just a pure pointer behave identically on all systems I tried.
2. At the same time, we want the channels to be smart, i.e. to associate traits types, such as min value and max value, to the channel type, so that we can implement channel-level operations like channel_convert.
No problem with a wrapper class.
3. The same built-in types may be used for semantically different channels. For example, float often has an operating range of 0..1, but if you want to use high dynamic range images you may want to use floats with different ranges.
No problem with a wrapper class.
Right now GIL hard-codes the commonly used types to their commonly used ranges - float is 0..1.
Really? I didn't realize this and don't like it at all as a default setting. IMHO, the pleasant property of floats is that one (mostly) doesn't need to take care of ranges and loss of precision. In fact, whenever time and memory constraints allow, I try to do _all_ image processing in float. Thanks to modern CPUs, this is not a big performance penalty anymore, results are far superior, and algorithms become much simpler. Speed is mainly limited by the fact that only a quarter as many pixels fit into the cache (compared to unsigned char). (It's a different story when MMX/SSE optimization comes into play). What's really a performance killer is mixed expressions between integer and float, and the integer/float conversion. This makes good coercion rules all the more important.
You mean, writing conversion to/from a common color space? Although this is appealing, the two problems I see with this are performance (it is faster to convert A to B than A to C and C to B) and loss of precision (there is no common color space that includes the gamut of all others)
Some color conversions are defined by means of the rgb or xyz color space as intermediate spaces anyway. Others are concatenations of linear transforms, which are easily optimized at compile-time or run-time. Loss of precision is a non-issue if the right temporary type is used for intermediate results.
True, but at the same time you can write a view pipeline that ends up executing faster than if you were to copy to an intermediate buffer. Consider this:
jpeg_write_view("out.jpg", color_converted_view<lab8_pixel_t>(subsampled_view(my_view, 2,2)));
This performs color conversion only for the even pixels. And it is faster than having to copy to an intermediate buffer.
Do you have objective numbers for this claim? Unfortunately I don't, but I've sometimes seen the intermediate buffer version run faster, because the processor was able to make more efficient use of caches, internal pipelining, and register allocation. So it is probably very difficult to come up with general rules.
I am in favor of letting the programmer choose what is best rather than imposing one strategy, which in my opinion is the style of C++ and STL as compared to other languages.
Definitely, but the library should also avoid unpleasant surprises.
Now I understand what you mean. But I still think whether you convert one type to another, and whether you apply that operation on statically-specified or dynamically-specified types are two orthogonal things.
Yes, but when types are statically specified, you usually know what to expect, so the number of supported types is fairly limited. In contrast, the dynamic type must (statically) support any type combination that could happen. Thus, the two things are linked in practice.
First of all, in GIL we rarely allow direct mixing of images of incompatible types.
What are the criteria for 'incompatible'?
For compatible images there is no need for type coercion and no loss of precision.
Loss of precision and out-of-range values become important issues as soon as arithmetic operations enter the picture -- you only want to round once, at the end of the computation, and not in every intermediate step.
For example copy_pixels(rgb8, cmyk16) will not compile.
What's incompatible: rgb vs cmyk or 8 bit vs 16 bit?
If you want to use incompatible views in algorithms like copy_pixels, you have to state explicitly how you want the mapping to be done:
copy_pixels(rgb8_v, color_converted_view<rgb8_pixel_t>(cmyk16));
copy_pixels(nth_channel_view(rgb8_v,2), gray8_v);
Fair enough, but why not make the most common case the default? It worked well for us.

Regards
Ulli
--
Ullrich Koethe
Universitaet Hamburg / University of Hamburg
FB Informatik / Dept. of Informatics
AB Kognitive Systeme / Cognitive Systems Group
Phone: +49 (0)40 42883-2573
Fax: +49 (0)40 42883-2572
Vogt-Koelln-Str. 30, D - 22527 Hamburg, Germany
Email: u.koethe@computer.org, koethe@informatik.uni-hamburg.de
WWW: http://kogs-www.informatik.uni-hamburg.de/~koethe/