
Hi Ulli,
No, I was unclear. I mean that it should be possible to provide an initial pixel value (which is copied into all pixels), like
some_image_t img(a_size, an_initial_pixel_value);
That's a good idea. Will do.
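For illustration, a fill constructor along those lines could look like this (a minimal sketch with a hypothetical image class and pixel type, mirroring std::vector's (count, value) constructor; not actual GIL code):

```cpp
#include <cstddef>
#include <vector>

// Hypothetical pixel and image types, for illustration only.
struct rgb8_pixel_t {
    unsigned char r, g, b;
    bool operator==(const rgb8_pixel_t& o) const {
        return r == o.r && g == o.g && b == o.b;
    }
};

class some_image_t {
    std::size_t width_, height_;
    std::vector<rgb8_pixel_t> pixels_;
public:
    // The (size, initial value) constructor proposed above: every pixel
    // is copy-constructed from the given initial value.
    some_image_t(std::size_t w, std::size_t h, const rgb8_pixel_t& initial)
        : width_(w), height_(h), pixels_(w * h, initial) {}

    const rgb8_pixel_t& operator()(std::size_t x, std::size_t y) const {
        return pixels_[y * width_ + x];
    }
    std::size_t width() const { return width_; }
    std::size_t height() const { return height_; }
};
```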
- resize() functions with default construction and copy
construction
of the newly allocated pixel values (without resize(), the default constructor image() is pointless).
There is a function called resize_clobber_image. Is that what you are looking for?
Probably. But not being a native English speaker, I have no idea what 'clobber' stands for. In addition, I'd expect resize() to be a member function of the image class, similar to std::vector. I agree that the old pixel values need not be preserved. The above comment about an initial pixel value applies to resize() as well:
some_image.resize(new_size, an_initial_pixel_value);
The policy should be: the member function image::resize() just resizes the memory without copying pixel values, whereas the free functions resize...() interpolate (or drop) pixel values, but assume the right amount of memory to be already allocated.
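That split could be sketched as follows, with hypothetical names and a simple vector-backed gray image (illustrative only, not actual GIL or VIGRA code): the member resize() reallocates without preserving values, while a free resampling function assumes the destination is already allocated.

```cpp
#include <cstddef>
#include <vector>

// Minimal gray image for illustration.
class gray_image {
    std::size_t w_ = 0, h_ = 0;
    std::vector<unsigned char> px_;
public:
    gray_image() = default;
    // Member resize(): reallocates storage only; old pixel values are
    // NOT preserved. All pixels are set to the given initial value.
    void resize(std::size_t w, std::size_t h, unsigned char initial = 0) {
        w_ = w; h_ = h;
        px_.assign(w * h, initial);
    }
    std::size_t width() const { return w_; }
    std::size_t height() const { return h_; }
    unsigned char& at(std::size_t x, std::size_t y) { return px_[y * w_ + x]; }
    const unsigned char& at(std::size_t x, std::size_t y) const { return px_[y * w_ + x]; }
};

// Free function (hypothetical name): resamples src into dst by nearest
// neighbor, assuming dst already has the desired dimensions allocated.
void resize_nearest(const gray_image& src, gray_image& dst) {
    for (std::size_t y = 0; y < dst.height(); ++y)
        for (std::size_t x = 0; x < dst.width(); ++x)
            dst.at(x, y) = src.at(x * src.width() / dst.width(),
                                  y * src.height() / dst.height());
}
```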
We deliberately did not make the interface the same as vector::resize, because the behavior is different: the vector copies the values on resize, whereas gil::image doesn't. This is why we use the name "resize_clobber_image". It is also a global function, to make external adaptation easier. For example, dynamic images already provide an overload. There are many functions that have the same interface for dynamic images and templated ones, so you can write code that can be instantiated with either:

    template <typename MetaImage>
    void flip_image(const char* file_name) {
        MetaImage img;
        jpeg_read_image(file_name, img);
        jpeg_write_view(file_name, flip_left_right_view(const_view(img)));
        resize_clobber_image(img, 100, 100);
        ...
    }

Of course, we could have done the same by keeping resize_clobber inside the image class and providing one for the dynamic_image too... I don't have a strong opinion. Do people think we should move all global functions like this to be part of the image class? Currently they are:

    resize_clobber_image(img)
    view(img)
    const_view(img)
    get_height(img or view)
    get_width(img or view)
    get_dimensions(img or view)
    get_num_channels(img or view)

Except for the first one, all these global functions have member method equivalents. We could:
1. make image::resize_clobber a member function, to be consistent,
2. or get rid of the global functions,
3. or make the member functions private, so people only use the global ones.
Also, any suggestions for a better name than resize_clobber_image? (Just not resize, to avoid confusion with vector's resize behavior.)
Why does VIGRA have a separate class for RGB Value then?
What is special
about RGB?
RGB has some specific functions (e.g. red(), luminance()), and I'm not arguing against an RGB value class. I'm just saying that
1. It may not be appropriate to represent each and every color model by its own class.
Color models are an open class, so we cannot represent all of them even if we wanted. This applies to just about any other part of GIL. Our goal is to provide a comprehensive set of the most frequently used models.
2. If a color model does not have specific functions (in the current version of the code base), a specific class is definitely not warranted.
This applies especially to the lab and hsb classes in the current GIL version.
We could get rid of them. Or we could provide color conversion for them.
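To make the point concrete, functions like red() and luminance() are what justify a dedicated RGB value class. A minimal sketch (the channel layout and the BT.601 luminance weights are my assumptions here, not taken from either library):

```cpp
// Illustrative RGB value class. An RGB-specific operation such as
// luminance() is what a generic N-channel vector would not provide.
struct rgb_value {
    unsigned char red_, green_, blue_;

    unsigned char& red()   { return red_; }
    unsigned char& green() { return green_; }
    unsigned char& blue()  { return blue_; }

    // Weighted luminance (ITU-R BT.601 weights, assumed for illustration).
    double luminance() const {
        return 0.299 * red_ + 0.587 * green_ + 0.114 * blue_;
    }
};
```

A color model without such specific operations would gain nothing from an analogous class, which is the argument being made about lab and hsb above.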
There are lots of functions that take the color information into account. Color information goes down to the very basics.
Yes, but there are solutions other than providing a new class, e.g. a string tag in the image class. It is not as type-safe, but it is simpler and more flexible. Type safety may not be the main consideration here, because type errors in color space conversions are easily visible during testing.
It is not just type safety. My rule is to push complexity down and resolve it at the building blocks. In my experience that results in a simpler and more flexible high-level design. For example, GIL pixels are smart enough to properly handle channel permutations:

    rgb8_pix = bgr8_pix;

By resolving the channel permutation once, down at the pixel level, we no longer need to worry about it every time we deal with pixels, in algorithms like copy_pixels, equal_pixels, etc.

A similar case is dealing with the planar vs. interleaved organization. We could have treated planar images as separate 1-channel images and left the burden on each higher-level algorithm to handle them explicitly (or, what is worse, not support them at all!). Instead, GIL's approach is to resolve this down at the fundamental pixel and pixel iterator models. As a result, you can write an open set of higher-level GIL algorithms without having to deal with planar/interleaved complexity. (If this reminds you of the iterator adaptor vs. data accessor discussion, it is not an accident - this is the same discussion.)

The same principles could apply down to the channel level. The channel could be made smart enough to know its range, which allows us to write channel-level algorithms that convert one channel to another. Having smart channels allows us to write color conversion routines that deal just with color, and delegate the channel conversion down to the channels:

    template <typename T1, typename T2>
    struct color_converter_default_impl<T1, gray_t, T2, rgb_t> {
        template <typename P1, typename P2>
        void operator()(const P1& src, P2& dst) const {
            dst.red = dst.green = dst.blue = channel_convert<T2>(src.gray);
        }
    };

Unfortunately, propagating the same principles to the channels brings us to a dilemma:

1. We want to use simple built-in types to represent channels. Wrapping the channel in a class may result in a potential abstraction penalty (although we haven't verified that it does).

2. At the same time, we want the channels to be smart, i.e. to associate traits types, such as min value and max value, with the channel type, so that we can implement channel-level operations like channel_convert.

3. The same built-in type may be used for semantically different channels. For example, float often has an operating range of 0..1, but if you want to use high dynamic range images you may want to use floats with different ranges. I don't know of a way in C++ to make an "alias" of a given type that allows us to associate different traits with it. For example, I'd like to say something like:

    alias float_dyn_range = float;
    alias float_01 = float;
    template <> struct channel_traits<float_01> {...};
    template <> struct channel_traits<float_dyn_range> {...};

Right now GIL hard-codes the commonly used types to their commonly used ranges (unsigned char is 0..255, float is 0..1, etc.). If you want a different range, you can make a wrapper class and define the traits for it. We are open to alternative suggestions.
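The wrapper-class workaround mentioned above could look like this (a sketch with hypothetical names; GIL's actual channel_traits machinery differs):

```cpp
// Hypothetical wrappers giving float distinct identities so that
// different traits can be attached to the same built-in type.
struct float_01  { float value; };   // operating range 0..1
struct float_dyn { float value; };   // unconstrained range

template <typename C> struct channel_traits;

template <> struct channel_traits<float_01> {
    static float min_value() { return 0.0f; }
    static float max_value() { return 1.0f; }
};
template <> struct channel_traits<unsigned char> {
    static unsigned char min_value() { return 0; }
    static unsigned char max_value() { return 255; }
};

// Channel conversion that rescales via the traits, sketching how a
// channel_convert-style function can delegate range knowledge to the
// channel type instead of hard-coding it.
inline unsigned char to_uchar(float_01 c) {
    float lo = channel_traits<float_01>::min_value();
    float hi = channel_traits<float_01>::max_value();
    float t  = (c.value - lo) / (hi - lo);
    return static_cast<unsigned char>(
        t * channel_traits<unsigned char>::max_value() + 0.5f);
}
```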
You are right that some color conversion combinations are not yet provided. This has nothing to do with the performance to do color conversion, but more with the fact that there is a combinatorial explosion in the number of conversions - every color space to every other.
That's another area where coercion is needed. By providing rules for multi-step conversion, the actual conversion code can be generated automatically from a few basic building blocks.
You mean, writing conversion to/from a common color space? Although this is appealing, the two problems I see with this are performance (it is faster to convert A to B than A to C and C to B) and loss of precision (there is no common color space that includes the gamut of all others)
Color conversion between RGB and LAB could be implemented just like it is done between, say, RGB and CMYK. Is there any performance advantage to implementing it differently?
No, but the RGB-LAB conversion is so expensive that the only practically relevant use of an lab_view is
copy_pixels(lab_view(some_rgb_image), view(lab_image));
Naive users may not be aware of the cost and may pass an lab_view as an argument to convolve(), say. Therefore, I'd prefer a functor-based solution in that case, such as
transform_image(view(rgb_image), view(lab_image), rgb2lab());
This cannot be accidentally misused.
True, but at the same time you can write a view pipeline that ends up executing faster than if you were to copy to an intermediate buffer. Consider this:

    jpeg_write_view("out.jpg",
        color_converted_view<lab8_pixel_t>(subsampled_view(my_view, 2, 2)));

This performs color conversion only for the even pixels, and it is faster than having to copy to an intermediate buffer. I am in favor of letting the programmer choose what is best rather than imposing one strategy, which in my opinion is the style of C++ and STL as compared to other languages.
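The laziness argument can be demonstrated without GIL: a view that applies a function on element access converts only the elements actually visited. A toy sketch (all names are made up; a counter stands in for the expensive color conversion):

```cpp
#include <cstddef>
#include <vector>

// Counts how many "color conversions" are actually performed.
static int conversions = 0;

int expensive_convert(int v) { ++conversions; return v * 2; }

// A lazy, subsampled "view": elements are converted only when read,
// so a pipeline over it never touches the skipped elements.
struct subsampled_converted_view {
    const std::vector<int>* src;
    std::size_t step;
    std::size_t size() const { return (src->size() + step - 1) / step; }
    int operator[](std::size_t i) const {
        return expensive_convert((*src)[i * step]);
    }
};
```

An eager design would convert all elements into an intermediate buffer first; here the conversion count equals the number of elements read, not the size of the source.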
As described in my paper, the types can be equivalent in
the context of
the algorithm. For example, for bitwise-copy, copying "signed short to signed short" and copying "unsigned short to unsigned short" can be treated as equivalent.
Yes, but a copy between uniform types is about the only example where this works.
It works for any bitwise-identical operation - copy construction, assignment, equality comparison. But I agree these are fairly limited.
I can see how automatically converting one type to another can be useful, but I am not sure how this relates to the topic of dynamic_image.
Suppose you want to support something like:
transform(dynamic_image1, dynamic_image2, dynamic_image3, _1 + _2);
for a large set of possible pixel types. Then your method of code-bloat reduction by binary compatibility applies only to very few instances. One also needs coercion, i.e. one splits transform() into two parts:
- converting all input into one of a few supported forms (e.g. leaving uniform-type input untouched, but converting mixed-type input to the highest input type)
- doing the actual computation with only the few supported type combinations.
This way, many conversion functions are needed (but they are simple and don't bloat the code much), whereas the complicated functions are instantiated only for a few type combinations.
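The two-stage scheme described above can be sketched with std::variant standing in for a dynamically-typed image (illustrative only; GIL's dynamic_image machinery is different):

```cpp
#include <cstdint>
#include <variant>
#include <vector>

using dyn_channels = std::variant<std::vector<std::uint8_t>,
                                  std::vector<std::uint16_t>>;

// Stage 1: cheap per-type conversion functions. Coerce everything to
// the "highest" supported form (here: uint16_t). These are the many
// simple instantiations that do not bloat the code much.
std::vector<std::uint16_t> coerce(const dyn_channels& d) {
    if (auto p = std::get_if<std::vector<std::uint8_t>>(&d))
        return std::vector<std::uint16_t>(p->begin(), p->end());
    return std::get<std::vector<std::uint16_t>>(d);
}

// Stage 2: the complicated algorithm is instantiated for a single
// type combination instead of all N*N mixed-type pairs.
std::vector<std::uint16_t> add(const dyn_channels& a, const dyn_channels& b) {
    std::vector<std::uint16_t> x = coerce(a), y = coerce(b), out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i)
        out[i] = static_cast<std::uint16_t>(x[i] + y[i]);
    return out;
}
```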
Now I understand what you mean. But I still think that whether you convert one type to another, and whether you apply that operation on statically-specified or dynamically-specified types, are two orthogonal things.

First of all, in GIL we rarely allow directly mixing images of incompatible types. They must have the same base color space and the same set of channels to be used in most binary algorithms, such as copy and equality comparison. For compatible images there is no need for type coercion and no loss of precision. For example, copy_pixels(rgb8, cmyk16) will not compile. If you use runtime images and call copy_pixels on two incompatible images, you will get an exception. If you want to use incompatible views in algorithms like copy_pixels, you have to state explicitly how you want the mapping to be done:

    copy_pixels(rgb8_v, color_converted_view<rgb8_pixel_t>(cmyk16));
    copy_pixels(nth_channel_view(rgb8_v, 2), gray8_v);

Now, it is possible that a generic GIL algorithm needs to take views that are incompatible. In this case, I agree that type coercion may be an option, depending on the specifics of the algorithm. But whether or not the algorithm internally uses type coercion is orthogonal to whether we happen to instantiate it for static types or for dynamic types, right?

Lubomir