
Hi Ulli, Thanks for taking the time to review GIL. [...]
1. I like the concept of views to adapt images to algorithms' needs. Users can understand this quite easily. But there is also a danger of overuse (this is pointed out in the tutorial): doing all transformations on the fly may be very inefficient. For example, conversion between RGB and CMYK in an adaptor is OK, but RGB to LAB is not. For the latter cases, the GIL authors propose an idiom
copy_pixels(some_expensive_view(source), some_other_view(destination));
A traditional STL style solution would probably be easier:
std::transform(source.begin(), source.end(), destination.begin(), SomeFunctor());
(since implementing a view is more involved than implementing a functor, at least in terms of reading documentation).
They are not quite equivalent. Views can transform both the domain and the range of images. For example, a view may change the way its horizontal and/or vertical iterators advance, and it may also perform a transformation upon dereferencing (which is what your SomeFunctor does). We find it appealing to represent this as a single object (the view) rather than splitting it into multiple parts, such as ranges and accessors. Yes, I know these are conceptually different transformations, and they are implemented in different classes; they are just combined in the view. This is appealing because a view conceptually represents just an abstract image, and you can pipe views together nicely.

Another difference is performance: copy_pixels(src,dst) is usually faster than the STL equivalent std::copy(src.begin(), src.end(), dst.begin()). This is because the one-dimensional iterators are slower (they have to deal with potential padding at the end of rows), and also because we provide performance specializations.
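For example, piping views looks something like this (a minimal sketch using the names from the current GIL sources; make_gray_thumbnail is just an illustration, and dst must match the dimensions of the composed view):

#include <boost/gil.hpp>
using namespace boost::gil;

// Subsample, mirror, and color-convert on the fly. Nothing is computed
// until copy_pixels() makes a single pass over the composed view.
void make_gray_thumbnail(const rgb8_view_t& src, const gray8_view_t& dst) {
    copy_pixels(
        color_converted_view<gray8_pixel_t>(
            flipped_left_right_view(
                subsampled_view(src, 2, 2))),
        dst);   // dst dimensions must equal those of the subsampled view
}

This is exactly the copy_pixels(some_expensive_view(source), destination) idiom from the tutorial: the expensive work happens once, during the copy, not on every subsequent access.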
On the other hand, a PixelDereferenceAdapter is semantically also a view, so the use of the view idea could be extended.
PixelDereferenceAdapter is not a view. It is a function object that is invoked upon pixel access, similar to your pixel accessor. You can easily attach it to any iterator, locator, or view.
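In the current sources the attachment points are dereference_iterator_adaptor (for iterators) and the view's add_deref member (for whole views). Here is a rough sketch of a read-only adapter; luminance_deref_fn is just an illustration, and the exact set of required nested typedefs may differ slightly between GIL versions:

#include <boost/gil.hpp>
using namespace boost::gil;

// Computes luminance on every read; the source pixels are untouched.
struct luminance_deref_fn {
    typedef luminance_deref_fn  const_t;          // const version of the adapter
    typedef gray8_pixel_t       value_type;
    typedef value_type          reference;        // returned by value (computed)
    typedef value_type          const_reference;
    typedef const rgb8_pixel_t& argument_type;
    typedef reference           result_type;
    static const bool is_mutable = false;

    result_type operator()(const rgb8_pixel_t& p) const {
        return gray8_pixel_t(
            (30*at_c<0>(p) + 59*at_c<1>(p) + 11*at_c<2>(p)) / 100);
    }
};

// Attaching it to a whole view:
//   rgb8_image_t  img(320, 240);
//   gray8_image_t out(img.dimensions());
//   copy_pixels(rgb8c_view_t::add_deref<luminance_deref_fn>::make(
//                   const_view(img), luminance_deref_fn()),
//               view(out));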
The interpretation of pixel adaptors as views is more apparent when data accessors are used (they work essentially like the image views). The difference from the PixelDereferenceAdapter is mainly syntactic (using operator* rather than get/set functions), and it was already pointed out that this is not without problems (reference proxies are required for operator*, which are more difficult to implement than data accessors and may sometimes exhibit unexpected behavior).
I understand your motivation for using pixel adaptors, but in my opinion the benefits of the PixelDereferenceAdapter outweigh its disadvantages. I will elaborate, but first I would like to hear from others on the thread we started on this subject.
2. I like the idea of handling unknown input formats by means of a variant pixel. However, it remains to be shown whether this solution is indeed practical for algorithm implementation. For example, when a binary operation is called for a variant with 4 possible types, the compiler is forced to generate 16 algorithm instantiations (for all pairs) for every result type. There are a number of design alternatives (for example coercion to a common type) which are probably better.
GIL incorporates a mechanism for actively reducing the code bloat due to variant instantiations. I will present this approach at the LCSD workshop at the upcoming OOPSLA conference, which is less than a week away: http://lcsd.cs.tamu.edu/2006/
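The combinatorics you describe are real. To make them concrete, a binary operation over two 4-type variants looks roughly like this with the dynamic_image extension (the spelling here follows the MPL-era sources; copy_convert_op and copy_any are just illustrations, and newer releases have renamed some of this):

#include <boost/gil.hpp>
#include <boost/gil/extension/dynamic_image/dynamic_image_all.hpp>
#include <boost/mpl/vector.hpp>
using namespace boost::gil;

typedef any_image<boost::mpl::vector4<
    gray8_image_t, gray16_image_t, rgb8_image_t, rgb16_image_t> > any_image_t;

struct copy_convert_op {
    typedef void result_type;
    template <typename SrcV, typename DstV>
    void operator()(const SrcV& src, const DstV& dst) const {
        copy_and_convert_pixels(src, dst);   // assumes matching dimensions
    }
};

// A binary apply_operation over two 4-type variants instantiates the
// functor for all 4 x 4 = 16 combinations of runtime view types.
void copy_any(const any_image_t& src, any_image_t& dst) {
    apply_operation(const_view(src), view(dst), copy_convert_op());
}

The reduction mechanism targets precisely this blowup, roughly by collapsing instantiations whose types are layout-compatible; the details are in the LCSD paper.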
3. I don't like that the input/output image format (jpeg, png etc.) must be hardcoded in the program (by using jpeg_read_image() etc.).
We initially had combined I/O methods like:

template <typename Image>
void read_image(const char* file, Image& img) {
    switch (image_type_from_file_name(file)) {
        case kJPEG: jpeg_read_image(file, img); break;
        case kTIFF: tiff_read_image(file, img); break;
        ...
    }
}

We decided to remove them for three reasons. First, the set of file formats is open-ended, and listing them in a switch statement is not very flexible. Second, the above would require people to link against all the I/O libraries when they may only need (or have) one of them. And third, there are plans to put into ASL a more advanced GIL I/O that allows for dynamic registration of file format modules. Of course, the above methods are trivial to implement, and people may want to write them for convenience; on the client side it is OK to be less generic.
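The dynamic registration could look roughly like this. This is purely a sketch of the idea, not the actual ASL interface; all the names below are made up:

#include <map>
#include <string>
#include <stdexcept>

// File-format modules register a reader keyed by file extension, so the
// core library never hard-codes a switch over formats.
template <typename Image>
struct image_reader_registry {
    typedef void (*reader_fn)(const char*, Image&);

    static void add(const std::string& ext, reader_fn fn) {
        table()[ext] = fn;
    }
    static void read(const char* file, Image& img) {
        std::string ext = extension_of(file);
        typename std::map<std::string, reader_fn>::const_iterator it =
            table().find(ext);
        if (it == table().end())
            throw std::runtime_error("no reader registered for ." + ext);
        it->second(file, img);
    }

private:
    static std::map<std::string, reader_fn>& table() {
        static std::map<std::string, reader_fn> t;   // one table per image type
        return t;
    }
    static std::string extension_of(const std::string& path) {
        std::string::size_type dot = path.rfind('.');
        return dot == std::string::npos ? std::string() : path.substr(dot + 1);
    }
};

// A JPEG module would then register itself once at startup, e.g.:
//   image_reader_registry<rgb8_image_t>::add("jpg",
//       &jpeg_read_image<rgb8_image_t>);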
4. GIL has no (public) test suite whatsoever.
We have a regression suite, though it is not on the web site. You will get it if you install GIL through ASL.
5. A GIL image can only be used with POD pixel types, because the allocated memory is never initialized (by uninitialized_fill() or equivalent). Neither is the destructor for individual pixels called before deallocation.
This was an item on our TODO list, but I don't know how it fell through the cracks. We will put it back on the list. Thanks! Good catch!
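For reference, the fix amounts to the standard two-phase pattern (a sketch in plain standard C++, not GIL's actual allocation code; create_pixels/destroy_pixels are made-up names, and T is the pixel type):

#include <memory>
#include <cstddef>

template <typename T>
T* create_pixels(std::size_t n, const T& initial) {
    std::allocator<T> alloc;
    T* p = alloc.allocate(n);                       // raw, uninitialized memory
    try {
        std::uninitialized_fill(p, p + n, initial); // run the constructors
    } catch (...) {
        alloc.deallocate(p, n);
        throw;
    }
    return p;
}

template <typename T>
void destroy_pixels(T* p, std::size_t n) {
    for (std::size_t i = 0; i != n; ++i)
        p[i].~T();                                  // run the destructors
    std::allocator<T>().deallocate(p, n);
}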
6. GIL may have alignment problems. For example, the variant template allocates internal memory as "char[MAX_SIZE]" but then reinterpret_casts it to an image type.
MAX_SIZE is determined by taking the maximum of sizeof applied to each element of the MPL vector of image types with which the variant is instantiated. So it is different for every variant, and the appropriate size is computed at compile time.
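In today's C++ the same computation, plus the alignment guarantee that your point asks about, fits in a few lines (a sketch, not the actual variant code; variant_storage is a made-up name):

#include <algorithm>
#include <cstddef>

template <typename... Ts>
struct variant_storage {
    static constexpr std::size_t max_size  = std::max({sizeof(Ts)...});
    static constexpr std::size_t max_align = std::max({alignof(Ts)...});

    // alignas makes the buffer suitably aligned for every T, so the
    // reinterpret_cast on access is no longer an alignment hazard.
    alignas(max_align) unsigned char buffer[max_size];
};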
Casts from "signed short *" to "pixel<int16s, rgb_t> *" and similar (see the tutorial) are also problematic
This cast is necessary to cross from the outside world (a pointer to the beginning of the image) into GIL. It is not obvious to me how to avoid it. How does VIGRA handle the case where the user constructs an 8-bit RGB image by providing VIGRA with a raw pointer to the first pixel?
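Concretely, GIL confines the cast to the construction of the view (a sketch using the current type names; process_raw_rgb16s is just an illustration, and it assumes signed short is 16-bit):

#include <boost/gil.hpp>
#include <cstddef>
using namespace boost::gil;

// Wrap raw interleaved signed-16-bit RGB memory in a GIL view. The cast
// from the raw channel pointer to a pixel pointer happens exactly once,
// at the boundary; everything downstream uses the typed view.
void process_raw_rgb16s(signed short* raw,
                        std::ptrdiff_t width, std::ptrdiff_t height,
                        std::ptrdiff_t row_size_in_bytes) {
    rgb16s_view_t v = interleaved_view(
        width, height,
        reinterpret_cast<rgb16s_pixel_t*>(raw),
        row_size_in_bytes);
    // ... any GIL algorithm can now run on v ...
}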
7. Is transform_pixel_position() able to handle image borders properly and efficiently (page 7 of the tutorial)? If not, it is used incorrectly there (and may cause segfaults).
No, it doesn't handle border conditions. You actually found a typo in the tutorial: the function that contains transform_pixel_position should be called x_gradient_unguarded, to indicate this. As the tutorial shows, x_gradient_unguarded is called from x_gradient using a subimage_view that excludes the first and last column.
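Reconstructed from the tutorial, the guarded/unguarded split looks roughly like this (with the transform_pixel_position call replaced by an explicit loop for clarity):

#include <boost/gil.hpp>
#include <cassert>
using namespace boost::gil;

// Assumes both horizontal neighbors exist. On the subimage view below,
// x-1 == -1 and x == width() still resolve to valid parent-image memory.
void x_gradient_unguarded(const gray8c_view_t& src, const gray8s_view_t& dst) {
    for (int y = 0; y < src.height(); ++y)
        for (int x = 0; x < src.width(); ++x)
            dst(x, y) = (src(x - 1, y) - src(x + 1, y)) / 2;
}

void x_gradient(const gray8c_view_t& src, const gray8s_view_t& dst) {
    assert(src.dimensions() == dst.dimensions());
    x_gradient_unguarded(
        subimage_view(src, 1, 0, src.width() - 2, src.height()),
        subimage_view(dst, 1, 0, dst.width() - 2, dst.height()));
}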
8. The cached_location_t is a nice, elegant optimization.
9. Generic algorithms that work efficiently for both planar and interleaved multi-channel images are not as easy as the authors imply.
Why not? (see below)
On page 5 of the tutorial, the authors recommend transform_channels() in order to unroll the inner loop over the channels. But this optimization is useless for planar images - in this case, one wants to work on one channel at a time, i.e. the channel loop must be the outer rather than the inner loop.
I wouldn't call it useless for planar images. Just the opposite: this code illustrates one of the nice abstractions in GIL, namely that you can write the algorithm once and have it work for both planar and interleaved images. There is typically no loss in efficiency even though the three planes are separate, because modern architectures can sustain many simultaneous cache lines.

Of course, this is not always the case. The optimal performance depends on the exact algorithm, the hardware, and the image size, and there may be contexts in which processing the planes one at a time is noticeably faster. In that case a planar-only specialization of the algorithm can be provided. The point is that by using GIL you are not forced to provide a planar version of every algorithm. We have many versions of the same imaging algorithms at Adobe: ones that work on planar images and ones that work on interleaved images. Our hope is that GIL will allow these to be consolidated.

____

We have been discussing the accessors/DereferenceAdaptors and promotion-traits issues in separate threads. Aside from these differences with VIGRA, do you see any issues with the GIL interfaces that would prevent porting the VIGRA algorithms to GIL, or that would affect their performance? If so, which interfaces are missing, and which algorithms need them?

Thank you again for your review!

Thanks,
Lubomir
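P.S. To make the planar/interleaved point concrete, here is a minimal sketch (current GIL names, C++14 lambdas for brevity; invert is just an illustration). The same function body runs unchanged on both memory layouts:

#include <boost/gil.hpp>
using namespace boost::gil;

// One generic implementation; View may be interleaved or planar.
template <typename View>
void invert(const View& v) {
    for_each_pixel(v, [](auto&& p) {             // for every pixel
        static_for_each(p, [](auto& ch) {        // for every channel
            ch = 255 - ch;
        });
    });
}

int main() {
    rgb8_image_t        interleaved(64, 64);     // RGBRGBRGB... in one buffer
    rgb8_planar_image_t planar(64, 64);          // three separate channel planes
    fill_pixels(view(interleaved), rgb8_pixel_t(10, 20, 30));
    fill_pixels(view(planar),      rgb8_pixel_t(10, 20, 30));
    invert(view(interleaved));
    invert(view(planar));                        // same code, different layout
}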