
- What is your evaluation of the documentation?
Not good, really. It consists of two parts: the tutorial and the design guide. The first is too superficial, and the second is both superficial and too detailed in places. It is easier to say what the documentation of this, or any other generic library, should contain: 1) a description of the domain covered (which is equal to the abstract mathematical model);
The domain covered (if I understand you correctly) is providing abstractions that decouple image representations from the algorithms applied to images, which allows us to write imaging algorithms once and have them work for images of any representation. It is specified:
- in the abstract
- in the first paragraph of opensource.adobe.com/gil
- in slide 4 of the presentation video
- in section 1 of the design guide
2) introductory examples of simple uses of the library;
Simple examples of use of the library are provided:
- throughout the tutorial
- throughout the video presentation
- in section 15 of the design document
- in gil_sample_code.cpp
- in numeric_example.cpp
3) definitions of concepts _with_ examples of their implementations and of the adaptation of existing types to model the concepts.
Examples of implementations of each concept are listed after the concept is introduced. See the "implementation" subsections in the design guide.
This section covers the important question of how one can extend the library. It is too hard to harvest this information from the current concept definitions in the GIL docs.
We have an entire section 16 of the design guide to help with extending GIL. In addition, the Virtual Image Views section of the tutorial shows you an example of how to create arbitrary virtual images.
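For illustration, here is a rough sketch of such a virtual view, modeled on the tutorial's Mandelbrot example (checkers_fn is our hypothetical name; headers and namespaces may differ between GIL releases):

    #include <boost/gil/gil_all.hpp>
    #include <cstddef>
    using namespace boost::gil;

    // Hypothetical deref function: maps a 2D coordinate to a pixel value,
    // so the "image" needs no pixel storage at all.
    struct checkers_fn {
        typedef point2<std::ptrdiff_t> point_t;
        typedef checkers_fn   const_t;
        typedef gray8_pixel_t value_type;
        typedef value_type    reference;
        typedef value_type    const_reference;
        typedef point_t       argument_type;
        typedef reference     result_type;
        static const bool is_mutable = false;

        result_type operator()(const point_t& p) const {
            // 8x8-pixel checkerboard: white where the tile parities match.
            return ((p.x / 8 + p.y / 8) % 2) ? gray8_pixel_t(255)
                                             : gray8_pixel_t(0);
        }
    };

    typedef virtual_2d_locator<checkers_fn, false> locator_t;
    typedef image_view<locator_t> checkers_view_t;

    // A 512x512 view whose pixels are computed on demand.
    checkers_view_t checkers(checkers_view_t::point_t(512, 512),
                             locator_t(locator_t::point_t(0, 0),
                                       locator_t::point_t(1, 1),
                                       checkers_fn()));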
Mathematically speaking, an image is a vector-valued function from an N-dimensional space into an M-dimensional one. In other words, we have a set of points in one N-dimensional space (let us call it S(N)) and a corresponding set of points in another M-dimensional space (let us call it S'(M), where ' marks the target space). In short form, the same can be formulated as:
Image := S(N) -> S'(M)
where ":=" means "by definition"
For example, for a 2D RGB image, N == 2 and M == 3.
Your definition is an abstract definition of a function. It applies just as well to images as to any other function, so you could use it to describe any other library that deals with mathematics in some way. While our concepts allow for images of arbitrary dimensions, GIL provides models exclusively for images whose domain is two-dimensional (N=2) and whose range can have an arbitrary dimension (M).
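Concretely, the N=2, M=3 case maps onto an ordinary GIL image type (a minimal sketch; exact headers vary by GIL release):

    #include <boost/gil/gil_all.hpp>
    using namespace boost::gil;

    int main() {
        // Image := S(2) -> S'(3): from 2D integer coordinates (N = 2)
        // to 3-channel RGB values (M = 3).
        rgb8_image_t img(640, 480);                  // a discrete 640x480 domain
        view(img)(10, 20) = rgb8_pixel_t(255, 0, 0); // one point's value in S'(3)
    }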
To complete this definition, we need to add that each dimension in both spaces is discretized and finite on some interval. For example, it is [0,255] for a one-byte channel in GIL terms.
Setting aside the fact that the machine representation of any number is inherently discrete, discretizing the domains makes the concept too narrow. Floating-point channels are not discrete, and you could also define virtual image views whose coordinates are not integral types.
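For instance, a float-channel image (a sketch; rgb32f_image_t is the Boost.GIL typedef for 32-bit float channels):

    #include <boost/gil/gil_all.hpp>
    using namespace boost::gil;

    int main() {
        // Channels are 32-bit floats in [0,1]; the range is continuous,
        // not a small discrete interval such as [0,255].
        rgb32f_image_t img(64, 64);
        view(img)(0, 0) = rgb32f_pixel_t(0.25f, 0.5f, 0.75f);
    }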
After defining the abstract object we can play with - the Image - we can define transformations on such objects, of various kinds.
The most general transformation can be expressed like this:
Transform := {Img1} -> {Img2}
where Img1 := S(N1) -> S'(M1) and Img2 := S(N2) -> S'(M2), and {} denotes a set in the mathematical sense, so {Img1} is a set of images.
In other words, all attributes can change: the dimensions of both the source and target spaces (S and S'), and the number of images (let us call it |{Img}|) in the source and destination image sets.
This is just an abstract definition of an arbitrary transformation. Yes, I believe everything GIL does could be described as a subset of this formulation, but so could any other math library. GIL provides transformations that change:
- the domain: subsampled_view, transposed_view, ...
- the range: color_converted_view, nth_channel_view, ...
Of course, everything stays in the context of 2D images.
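For example, these factories compose into lazy pipelines (a sketch using the views named above):

    #include <boost/gil/gil_all.hpp>
    using namespace boost::gil;

    int main() {
        rgb8_image_t img(640, 480);

        // Domain transformations (change S(2), leave the pixels alone) and
        // range transformations (change S'(M) on the fly) compose lazily;
        // nothing is computed until the pixels are actually read:
        gray8_image_t out(240, 320);
        copy_pixels(
            subsampled_view(                            // domain: keep every 2nd pixel
                color_converted_view<gray8_pixel_t>(    // range: 3 -> 1 channels
                    transposed_view(const_view(img))),  // domain: swap x and y
                2, 2),
            view(out));
    }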
The reference to the STL is not by chance. The part of STL that defines and operates on Random Access Sequences has the same abstraction behind it as the one formulated here, but only for the case when N is identical to 1. Wait a minute! STL operates on objects of an arbitrary type T, not on linear spaces of dimension M as in our formulation here. Yes, but there is a one-to-one relation between the two. (Each class object can be seen as the set of values that it holds, and this set of values can be seen as a set of coordinates in our M-dimensional space.)
Note that even for the case when N == 1, STL doesn't cover all kinds of transformations possible, which is really sad; GIL should be better in this respect. Moreover, GIL can (and maybe should) supersede STL in the business of Random Access Sequence transformations. I want to claim that GIL should provide primitives to express any kind of transformation possible.
Oleg - these are pretty ambitious goals, and thank you for your trust that GIL should be the library to provide them. Our goals are much more humble, and we would like to stay within the image processing domain. If you have concrete ways to improve STL that you can put into code, I encourage you to provide your own library. Once you have the library, we will consider the option of using your transformations for GIL, and I am sure other library developers will be interested too. I am curious: what do you think are the primitives that STL lacks and that would allow for expressing any kind of transformation possible?
I claim that GIL has to remove the algorithm concept completely and leave only the View, with a new meaning - it is not only a lightweight image representation; from now on it is a representation of the Transformation abstraction. A View would be invoked when, and only when, it is assigned to an Image: Image img = some_view; // all the computation happens here
You are obviously a fan of functional programming. While functional programming has its appeal, providing functional programs that match the efficiency of imperative programs in every context is out of the question. GIL allows image transformations to be expressed as combinations of functional programming (by creating image views) and imperative programming (via GIL algorithms). While we have suggestions of where to use each (by providing a set of both) we leave the ultimate choice to the user's judgment.
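To make the two styles concrete (a sketch of ours; grayscale conversion is just an arbitrary example):

    #include <boost/gil/gil_all.hpp>
    using namespace boost::gil;

    int main() {
        rgb8_image_t src(640, 480);
        gray8_image_t dst(src.dimensions());

        // Functional style: compose a lazy view; the conversion runs only
        // when the view is read, here inside copy_pixels.
        copy_pixels(color_converted_view<gray8_pixel_t>(const_view(src)),
                    view(dst));

        // Imperative style: call an eager GIL algorithm directly.
        copy_and_convert_pixels(const_view(src), view(dst));
    }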
Here you should say that a View already is a Transformation in GIL. Yes, but from the docs I've realized that its potential is not clearly understood by the GIL authors. I claim that the TransformationView should become the central part of GIL; almost all GIL programming, use, and extension should be centered around the View concept.
Again, it is a matter of your preference and your tolerance for the associated performance overhead.
Next, a very important issue: laziness produces overhead without some kind of memoization technique. I claim that GIL should have one built in. The GIL docs currently contain examples of suboptimal algorithms, suboptimal precisely because of this absence of memoization support. I mean the examples where one needs to apply non-trivial views at each step in order to increase performance, which is: 1) suboptimal, 2) ugly, and 3) wasteful of memory.
The same examples could be rewritten by hand, without GIL, in a much more efficient way. It is not a good conclusion for a library that claims:
"Performance. Speed has been instrumental to the design of the library. The generic algorithms provided in the library are comparable in speed to hand-coding the algorithm for a specific image type."
There is no magical memoization technique that could match the performance of imperative programming, because what you call the TransformationView is not in control of the algorithm. It doesn't know the pattern of access of the pixels. That pattern is determined by the type of algorithm and by the transformation views that precede it. One example that Dave points out is whether to memoize the value at all. If it will be used again, it makes sense to memoize it. Otherwise it could be a huge waste of memory and time.
Please be specific. What example do you have in mind that could be written by hand, without GIL, in a much more efficient way?
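To illustrate the tradeoff, here is a hand-rolled version of what such memoization amounts to (a sketch, not a GIL facility): materializing an expensive view into a temporary image pays memory and one full pass, which wins only if the cached pixels are read many times - exactly the access-pattern information the view itself does not have.

    #include <boost/gil/gil_all.hpp>
    using namespace boost::gil;

    int main() {
        rgb8_image_t src(640, 480);

        // Lazy: every read through a color-converted view redoes the
        // conversion, so k reads per pixel cost k conversions per pixel.

        // Hand-rolled memoization: convert once into a buffer...
        gray8_image_t cache(src.dimensions());
        copy_pixels(color_converted_view<gray8_pixel_t>(const_view(src)),
                    view(cache));

        // ...then later passes read const_view(cache) as cheaply as raw
        // pixels, at the price of an extra image's worth of memory.
    }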
My final claim is about the representativeness of the implementations of concepts in GIL. I claim that they are insufficient. For example, sub-byte channels should be implemented, to ensure that GIL's concepts are good enough to handle them. In particular, I'm very interested to see an implementation of a 1-bpp color space, which is very common in OCR development.
I can understand that. But we have provided models for the vast majority of cases, even for image views that represent an arbitrary function. Certainly it would be nice to have models for everything that people might ever need but at some point we have to draw the line.
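For reference, here is roughly how a 1-bpp image could be spelled with bit-aligned pixel support (a sketch assuming Boost.GIL's bit_aligned_image1_type metafunction; availability depends on the GIL version):

    #include <boost/gil/gil_all.hpp>
    using namespace boost::gil;

    int main() {
        // One bit per pixel: eight pixels packed into each byte.
        typedef bit_aligned_image1_type<1, gray_layout_t>::type gray1_image_t;

        gray1_image_t img(64, 64);
        gray1_image_t::view_t v = view(img);

        // Writes go through a bit-aligned channel reference.
        at_c<0>(v(3, 5)) = 1; // set a pixel
        at_c<0>(v(3, 6)) = 0; // clear a pixel
    }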
I hope that my comments will help the authors to make GIL an order of magnitude better than STL is now.
Again, thanks for the ambitious goals you have for GIL. Regarding the concept definitions that you propose, please be concrete about what specific changes to the GIL code you suggest, if any. Regarding your ideas for improving STL and providing a fully optimized way to do functional programming, we encourage you to put your ideas into code - that is the best way of specifying what they are. Once you do that, we will have something more concrete to discuss. Thanks for your review.
Lubomir & Hailin