
Hi Ulli, Thanks for taking the time to review GIL. [...]
1. I like the concept of views to adapt images to algorithms' needs. Users can understand this quite easily. But there is also a danger of overuse (this is pointed out in the tutorial): doing all transformations on the fly may be very inefficient. For example, conversion between RGB and CMYK in an adaptor is OK, but RGB to LAB is not. For the latter cases, the GIL authors propose an idiom
copy_pixels(some_expensive_view(source), some_other_view(destination));
A traditional STL style solution would probably be easier:
std::transform(source.begin(), source.end(), destination.begin(), SomeFunctor());
(since implementing a view is more involved than implementing a functor, at least in terms of reading documentation).
They are not quite equivalent. Views can transform both the domain and the range of images. For example, a view may change the way its horizontal and/or vertical iterators advance, and it may also perform a transformation upon dereferencing (which is what your SomeFunctor does). We find it appealing to represent this as a single object (the view) rather than splitting it into multiple parts, such as ranges and accessors. Yes, I know these are conceptually different transformations, and they are implemented in different classes; they are just combined in the view. This is appealing because a view conceptually represents just an abstract image, and you can pipe views together nicely.

Another difference is performance: copy_pixels(src,dst) is usually faster than the STL equivalent std::copy(src.begin(), src.end(), dst.begin()). This is because the one-dimensional iterators are slower (they have to deal with potential padding at the end of rows), and also because we provide performance specializations.
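For example, piping views looks something like this (a minimal sketch using the names from the current GIL sources; make_gray_thumbnail is just an illustration, and dst must match the dimensions of the composed view):

#include <boost/gil.hpp>
using namespace boost::gil;

// Subsample, mirror, and color-convert on the fly. Nothing is computed
// until copy_pixels() makes a single pass over the composed view.
void make_gray_thumbnail(const rgb8_view_t& src, const gray8_view_t& dst) {
    copy_pixels(
        color_converted_view<gray8_pixel_t>(
            flipped_left_right_view(
                subsampled_view(src, 2, 2))),
        dst);   // dst dimensions must equal those of the subsampled view
}

This is exactly the copy_pixels(some_expensive_view(source), destination) idiom from the tutorial: the expensive work happens once, during the copy, not on every subsequent access.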
On the other hand, a PixelDereferenceAdapter is semantically also a view, so the use of the view idea could be extended.
PixelDereferenceAdapter is not a view. It is a function object that is invoked upon pixel access, similar to your pixel accessor. You can easily attach it to any iterator, locator, or view.
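In the current sources the attachment points are dereference_iterator_adaptor (for iterators) and the view's add_deref member (for whole views). Here is a rough sketch of a read-only adapter; luminance_deref_fn is just an illustration, and the exact set of required nested typedefs may differ slightly between GIL versions:

#include <boost/gil.hpp>
using namespace boost::gil;

// Computes luminance on every read; the source pixels are untouched.
struct luminance_deref_fn {
    typedef luminance_deref_fn  const_t;          // const version of the adapter
    typedef gray8_pixel_t       value_type;
    typedef value_type          reference;        // returned by value (computed)
    typedef value_type          const_reference;
    typedef const rgb8_pixel_t& argument_type;
    typedef reference           result_type;
    static const bool is_mutable = false;

    result_type operator()(const rgb8_pixel_t& p) const {
        return gray8_pixel_t(
            (30*at_c<0>(p) + 59*at_c<1>(p) + 11*at_c<2>(p)) / 100);
    }
};

// Attaching it to a whole view:
//   rgb8_image_t  img(320, 240);
//   gray8_image_t out(img.dimensions());
//   copy_pixels(rgb8c_view_t::add_deref<luminance_deref_fn>::make(
//                   const_view(img), luminance_deref_fn()),
//               view(out));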
The interpretation of pixel adaptors as views is more apparent when data accessors are used (they work essentially like the image views). The difference from the PixelDereferenceAdapter is mainly syntactic (using operator* rather than get/set functions), and it was already pointed out that this is not without problems (reference proxies are required for operator*, which are more difficult to implement than data accessors and may sometimes exhibit unexpected behavior).
I understand your motivation for using pixel adaptors, but in my opinion the benefits of the PixelDereferenceAdapter outweigh its disadvantages. I will elaborate, but first I would like to hear from others on the thread we started on this subject.
2. I like the idea of handling unknown input formats by means of a variant pixel. However, it remains to be shown whether this solution is indeed practical for algorithm implementation. For example, when a binary operation is called for a variant with 4 possible types, the compiler is forced to generate 16 algorithm instantiations (for all pairs) for every result type. There are a number of design alternatives (for example coercion to a common type) which are probably better.
GIL incorporates a mechanism for actively reducing the code bloat due to variant instantiations. I will present this approach at the LCSD workshop at the upcoming OOPSLA conference, which is less than a week away: http://lcsd.cs.tamu.edu/2006/
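The combinatorics you describe are real. To make them concrete, a binary operation over two 4-type variants looks roughly like this with the dynamic_image extension (the spelling here follows the MPL-era sources; copy_convert_op and copy_any are just illustrations, and newer releases have renamed some of this):

#include <boost/gil.hpp>
#include <boost/gil/extension/dynamic_image/dynamic_image_all.hpp>
#include <boost/mpl/vector.hpp>
using namespace boost::gil;

typedef any_image<boost::mpl::vector4<
    gray8_image_t, gray16_image_t, rgb8_image_t, rgb16_image_t> > any_image_t;

struct copy_convert_op {
    typedef void result_type;
    template <typename SrcV, typename DstV>
    void operator()(const SrcV& src, const DstV& dst) const {
        copy_and_convert_pixels(src, dst);   // assumes matching dimensions
    }
};

// A binary apply_operation over two 4-type variants instantiates the
// functor for all 4 x 4 = 16 combinations of runtime view types.
void copy_any(const any_image_t& src, any_image_t& dst) {
    apply_operation(const_view(src), view(dst), copy_convert_op());
}

The reduction mechanism targets precisely this blowup, roughly by collapsing instantiations whose types are layout-compatible; the details are in the LCSD paper.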
3. I don't like that the input/output image format (jpeg, png etc.) must be hardcoded in the program (by using jpeg_read_image() etc.).
We initially had combined I/O methods like:

template <typename Image>
void read_image(const char* file, Image& img) {
    switch (image_type_from_file_name(file)) {
        case kJPEG: jpeg_read_image(file, img); break;
        case kTIFF: tiff_read_image(file, img); break;
        ...
    }
}

We decided to remove them for three reasons. First, the set of file formats is open-ended, and listing them in a switch statement is not very flexible. Second, the above would require people to link against all the I/O libraries when they may only need (or have) one of them. And third, there are plans to put into ASL a more advanced GIL I/O that allows for dynamic registration of file format modules. Of course, the above methods are trivial to implement, and people may want to write them for convenience; on the client side it is OK to be less generic.
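The dynamic registration could look roughly like this. This is purely a sketch of the idea, not the actual ASL interface; all the names below are made up:

#include <map>
#include <string>
#include <stdexcept>

// File-format modules register a reader keyed by file extension, so the
// core library never hard-codes a switch over formats.
template <typename Image>
struct image_reader_registry {
    typedef void (*reader_fn)(const char*, Image&);

    static void add(const std::string& ext, reader_fn fn) {
        table()[ext] = fn;
    }
    static void read(const char* file, Image& img) {
        std::string ext = extension_of(file);
        typename std::map<std::string, reader_fn>::const_iterator it =
            table().find(ext);
        if (it == table().end())
            throw std::runtime_error("no reader registered for ." + ext);
        it->second(file, img);
    }

private:
    static std::map<std::string, reader_fn>& table() {
        static std::map<std::string, reader_fn> t;   // one table per image type
        return t;
    }
    static std::string extension_of(const std::string& path) {
        std::string::size_type dot = path.rfind('.');
        return dot == std::string::npos ? std::string() : path.substr(dot + 1);
    }
};

// A JPEG module would then register itself once at startup, e.g.:
//   image_reader_registry<rgb8_image_t>::add("jpg",
//       &jpeg_read_image<rgb8_image_t>);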
4. GIL has no (public) test suite whatsoever.
We have a regression suite, though it is not on the web site. You will get it if you install GIL through ASL.
5. A GIL image can only be used with POD pixel types, because the allocated memory is never initialized (by uninitialized_fill() or equivalent). Neither is the destructor for individual pixels called before deallocation.
This was an item on our TODO list, but I don't know how it fell through the cracks. We will put it back on the list. Thanks! Good catch!
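For reference, the fix amounts to the standard two-phase pattern (a sketch in plain standard C++, not GIL's actual allocation code; create_pixels/destroy_pixels are made-up names, and T is the pixel type):

#include <memory>
#include <cstddef>

template <typename T>
T* create_pixels(std::size_t n, const T& initial) {
    std::allocator<T> alloc;
    T* p = alloc.allocate(n);                       // raw, uninitialized memory
    try {
        std::uninitialized_fill(p, p + n, initial); // run the constructors
    } catch (...) {
        alloc.deallocate(p, n);
        throw;
    }
    return p;
}

template <typename T>
void destroy_pixels(T* p, std::size_t n) {
    for (std::size_t i = 0; i != n; ++i)
        p[i].~T();                                  // run the destructors
    std::allocator<T>().deallocate(p, n);
}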
6. GIL may have alignment problems. For example, the variant template allocates internal memory as "char[MAX_SIZE]" but then reinterpret_casts it to an image type.
MAX_SIZE is determined by taking the maximum of sizeof applied to each element of the MPL vector of image types with which the variant is instantiated. So it is different for every variant, and the appropriate size is computed at compile time.
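In today's C++ the same computation, plus the alignment guarantee that your point asks about, fits in a few lines (a sketch, not the actual variant code; variant_storage is a made-up name):

#include <algorithm>
#include <cstddef>

template <typename... Ts>
struct variant_storage {
    static constexpr std::size_t max_size  = std::max({sizeof(Ts)...});
    static constexpr std::size_t max_align = std::max({alignof(Ts)...});

    // alignas makes the buffer suitably aligned for every T, so the
    // reinterpret_cast on access is no longer an alignment hazard.
    alignas(max_align) unsigned char buffer[max_size];
};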
Casts from "signed short *" to "pixel<int16s, rgb_t> *" and similar (see the tutorial) are also problematic
This cast is necessary to cross from the outside world (a pointer to the beginning of the image) into GIL. It is not obvious to me how to avoid it. How does VIGRA handle the case where the user constructs an 8-bit RGB image by providing VIGRA with a raw pointer to the first pixel?
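Concretely, GIL confines the cast to the construction of the view (a sketch using the current type names; process_raw_rgb16s is just an illustration, and it assumes signed short is 16-bit):

#include <boost/gil.hpp>
#include <cstddef>
using namespace boost::gil;

// Wrap raw interleaved signed-16-bit RGB memory in a GIL view. The cast
// from the raw channel pointer to a pixel pointer happens exactly once,
// at the boundary; everything downstream uses the typed view.
void process_raw_rgb16s(signed short* raw,
                        std::ptrdiff_t width, std::ptrdiff_t height,
                        std::ptrdiff_t row_size_in_bytes) {
    rgb16s_view_t v = interleaved_view(
        width, height,
        reinterpret_cast<rgb16s_pixel_t*>(raw),
        row_size_in_bytes);
    // ... any GIL algorithm can now run on v ...
}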
7. Is transform_pixel_position() able to handle image borders properly and efficiently (page 7 of the tutorial)? If not, it is used incorrectly there (and may cause segfaults).
No, it doesn't handle border conditions. You actually found a typo in the tutorial: the function that contains transform_pixel_position should be called x_gradient_unguarded, to indicate this. As the tutorial shows, x_gradient_unguarded is called from x_gradient using a subimage_view that excludes the first and last column.
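Reconstructed from the tutorial, the guarded/unguarded split looks roughly like this (with the transform_pixel_position call replaced by an explicit loop for clarity):

#include <boost/gil.hpp>
#include <cassert>
using namespace boost::gil;

// Assumes both horizontal neighbors exist. On the subimage view below,
// x-1 == -1 and x == width() still resolve to valid parent-image memory.
void x_gradient_unguarded(const gray8c_view_t& src, const gray8s_view_t& dst) {
    for (int y = 0; y < src.height(); ++y)
        for (int x = 0; x < src.width(); ++x)
            dst(x, y) = (src(x - 1, y) - src(x + 1, y)) / 2;
}

void x_gradient(const gray8c_view_t& src, const gray8s_view_t& dst) {
    assert(src.dimensions() == dst.dimensions());
    x_gradient_unguarded(
        subimage_view(src, 1, 0, src.width() - 2, src.height()),
        subimage_view(dst, 1, 0, dst.width() - 2, dst.height()));
}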
8. The cached_location_t is a nice, elegant optimization.
9. Generic algorithms that work efficiently for both planar and interleaved multi-channel images are not as easy as the authors imply.
Why not? (see below)
On page 5 of the tutorial, the authors recommend transform_channels() in order to unroll the inner loop over the channels. But this optimization is useless for planar images - in this case, one wants to work on one channel at a time, i.e. the channel loop must be the outer rather than the inner loop.
I wouldn't call it useless for planar images. Just the opposite: this code illustrates one of the nice abstractions in GIL, namely that you can write the algorithm once and have it work for both planar and interleaved images. There is typically no loss in efficiency even though the three planes are separate, because modern architectures can sustain many simultaneous cache lines.

Of course, this is not always the case. The optimal performance depends on the exact algorithm, the hardware, and the image size, and there may be contexts in which processing the planes one at a time is noticeably faster. In that case a planar-only specialization of the algorithm can be provided. The point is that by using GIL you are not forced to provide a planar version of every algorithm. We have many versions of the same imaging algorithms at Adobe: ones that work on planar images and ones that work on interleaved images. Our hope is that GIL will allow these to be consolidated.

____

We have been discussing the accessors/DereferenceAdaptors and promotion-traits issues in separate threads. Aside from these differences with VIGRA, do you see any issues with the GIL interfaces that would prevent porting the VIGRA algorithms to GIL, or that would affect their performance? If so, which interfaces are missing, and which algorithms need them?

Thank you again for your review!

Thanks,
Lubomir
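P.S. To make the planar/interleaved point concrete, here is a minimal sketch (current GIL names, C++14 lambdas for brevity; invert is just an illustration). The same function body runs unchanged on both memory layouts:

#include <boost/gil.hpp>
using namespace boost::gil;

// One generic implementation; View may be interleaved or planar.
template <typename View>
void invert(const View& v) {
    for_each_pixel(v, [](auto&& p) {             // for every pixel
        static_for_each(p, [](auto& ch) {        // for every channel
            ch = 255 - ch;
        });
    });
}

int main() {
    rgb8_image_t        interleaved(64, 64);     // RGBRGBRGB... in one buffer
    rgb8_planar_image_t planar(64, 64);          // three separate channel planes
    fill_pixels(view(interleaved), rgb8_pixel_t(10, 20, 30));
    fill_pixels(view(planar),      rgb8_pixel_t(10, 20, 30));
    invert(view(interleaved));
    invert(view(planar));                        // same code, different layout
}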