
Hi Ulli, Thanks for your review! Ulli wrote:
I vote to accept GIL into boost after revision. In addition, I consider addition of an algorithm library in the not-too-distant future mandatory in order to realize the full potential of the image classes.
As I already mentioned, we also would like very much to see image processing algorithms added to GIL, whether or not it is accepted in Boost. In particular, the Vigra algorithms provide a good starting point. Although they could be invoked through a compatibility layer, as you and I demonstrated, it would be great if they are rewritten in native GIL, because we don't believe a compatibility layer can handle all possible image abstrations. If you and/or others would like to volunteer to do that, you can expect and extra level of support from us. We also are looking for volunteers to implement other algorithms, such as the ones in OpenCV. It would be great to also have a GIL-layer over Intel's IPP library. Also, providing interfaces to AGG would be terrific. Providing parallel versions of image processing algorithms would be great. GPU support too. More I/O formats too.
- What is your evaluation of the design?
I like the design of the image classes (including the dynamic_image) and the heavy use of views as a major algorithm interface. The various iteration abstractions are also quite powerful. The proof-of-concept implementation of the VIGRA interoperability layer was quite encouraging.
I'd like to see the following additions to the image class: - constructors that allow copy construction of the pixels values
You mean deep-copy the images? The image class has a copy constructor which deep-copy's the images.
- resize() functions with default construction and copy construction of the newly allocated pixel values (without resize(), the default constructor image() is pointless).
There is a function called resize_clobber_image. Is that what you are looking for? If the images are of different dimensions, it constructs an image of the target size and swaps them. It does not copy the pixels. This could be done manually with copy_pixels if needed. We are looking for a better name, but resize_image implies the pixels are copied (and truncated) as with vector resize. We didn't add copying of the pixels because doing so is often not necessary, and it is unclear what the copy policy should be exactly. (copy into top left? Truncate?)
I'm less convinced of the pixel classes (which, as I argued in another post, should be renamed into pixel_value or just value). IMHO, a pixel class without a set of operations specific to this particular pixel type is not very useful. As far as I can tell, there are not even conversion functions between RGB, LAB and HSB (and, unlike the RGB-CMYK conversion, these are too expensive to be implemented as pixel adapters).
I recommend to either - drop the specific color classes for the time being and use a general static_vector<3, T> together with accordingly changed pixel_value_tag concepts, or - add a set of functions justifying the existence of the color classes.
Why does VIGRA have a separate class for RGB Value then? What is special about RGB? There are lots of functions that take the color information into account. Color information goes down to the very basics. For example, any per-channel operation that takes two or more pixels operates on semantic channels, not physical channels. These will do the right thing: bool eq=equal_pixels(rgb8_view, bgr8_view); copy_pixels(rgb8_view, bgr8_view); rgb8_pix = bgr8_pix; rgb8_image_t img(bgr8_img); Color information also is useful for compile-time type safety. GIL constructs that don't have the same base color space are incompatible and cannot be copy-constructed, assigned and compared to each other. These will generate a compile error: copy_pixels(lab8_view, rgb8_view); // error rgb8_pix = lab8_pix; // error This level of compile-time safety is not possible if you only consider the type and number of channels. Of course, there are a variety of color conversion functions: copy_and_convert_pixels(rgb8_view, cmyk16_view); cmyk16_pixel_t c16; color_convert(rgb8_pixel_t(255,0,0), c16); You are right that some color conversion combinations are not yet provided. This has nothing to do with the performance to do color conversion, but more with the fact that there is a combinatorial explosion in the number of conversions - every color space to every other. Color conversion between RGB and LAB could be implemented just like it is done between, say, RGB and CMYK. Is there any performance advantage to implementing it differently? I agree that sometimes it doesn't make sense to have a specific color space. Sometimes you really want to think of the data as image of pixel with N channels of type T. GIL provides models of anonymous N-channel homogeneous pixels, which have "device-N" color space. Here are the types of 5-channel anonymous 16-bit unsigned integral pixels and image views: dev5n16_pixel_t p(0,1,3,2,3); dev5n16_view_t view5; Their color space is devicen_t<N> where N is the number of channels. (We are open to suggestions for a better name)
Regarding pixel type conversion and mixed-valued functions: It was already pointed out that the resulting code bloat can easily get out of control, especially in the context of the dynamic_image. I read Lubomir's OOPSLA
Which, by the way, I noticed is posted here: http://sms.cs.chalmers.se/bibliography/proceedings/2006-LCSD.pdf
paper and think that it only offers a partial solution - it only covers the case were different type combinations actually create the same binary code. IMHO, this is not the most problematic case.
Consider just the scalar pixel types 'unsigned byte', 'signed short', 'unsigned short', 'int', and 'float'. All of them are regularly occuring in VIGRA algorithms, and no combination will be binary equal to any other.
As described in my paper, the types can be equivalent in the context of the algorithm. For example, for bitwise-copy, copying "signed short to signed short" and copying "unsigned short to unsigned short" can be treated as equivalent.
What is needed is a consistent set of coercion rules. For example, 'unsigned char' x 'unsigned char' might be implemented natively (for speed), while 'unsigned char' x 'float' will be implemented by a coercion of the first pixel to 'float', followed by a native 'float' x 'float' operation.
These rules, together with appropriate result_of types, can be easily formulated in a set a traits classes, and should also be customizable on a per-function-call basis. Coercion may not be a prerequisite for the usefulness of the core part of GIL, but it is critical for the dynamic_image and the future image algorithm library (including the color conversion function suggested above).
I can see how automatically converting one type to another can be useful, but I am not sure how this relates to the topic of dynamic_image. Whether or not the type of the image is specified at run time, and whether or not you want to convert one type to another are two orthogonal things... unless I am misunderstanding you.
- What is your evaluation of the implementation?
It is mostly OK. I would like the following improvements:
- there are several comments saying 'potential performance problem' (or equivalent) in the code. The code should be changed to eliminate these problems because the affected functions are performance critical.
I agree, on a case by case basis. We are trying to get what we believe are the bottlenecks as efficient as possible. It would be quite a job to make all parts of GIL completely optimal. Very often extra efficiency comes at the cost of code size, extra stuff in data structures, or decreasing the performance in other places of the code... Sometimes the tradeoff is not worth it. For example, currently view.end() is not very efficient as it is implemented as "return begin()+size()". If you say: for (it=view.begin(); it!=view.end(); ++it) {...} You may suffer a performance hit there. But how do we fix this? Store the end iterator inside the view? Then the view class will grow and contain less than minimum info... Unlike std::vector implementations, we need to keep the dimensions explicitly as they cannot be inferred simply by end() - begin(), which is 1D.
- C-style casts and reinterpret_casts should be eliminated (if this is impossible in the latter case, the casts should be guarded by static_assert to do the right thing).
I would love to eliminate them but I don't see how. Static asserts are always a good idea.
- proper initialization of the pixel values should be added (e.g. by uninitialized_fill).
Already on our todo list.
- I didn't check in detail whether all functions (especially constructors and destructors) are fully exception safe. The authors should make a statement about this.
Unless there are bugs, everything should be exception-safe.
- code readability should be improved (more line-breaks).
- the test suite should be made part of the distribution.
It is posted on our download page. Thanks again for your review. Lubomir