[boost] GIL Review

30 Oct 2006

Apologies for the late review.  Aside from the voting, I hope it will 
still be of some use.

I appreciate the efforts of the authors, as well as the opportunity to 
provide feedback.

What is your evaluation of the design?
--------------------------------------
Elegant, but it takes a slightly simplistic view of images, in some regards.

I'm concerned that GIL's color spaces combine too many distinct 
concepts: memory layout, channel ordering, channel type & range, and a 
more pure definition of a color space (e.g. basis vectors, primaries, 
transfer function, etc.).  In practice, this may require some users to 
define a large number of GIL color spaces.

For example, a program or library that handles encoding/decoding of 
MPEG-4 video (non-studio profiles) has to deal with as many as 6 
variants of YUV, 4 transfer functions, and two different scales of 
sample values (without getting into things like n-bit profile).  In 
addition to that, professional video production systems will also have 
to deal with a variety of linear, non-linear, and log-scale RGB 
formats.  Add RGBA, and you also have to deal with whether Alpha is 
premultiplied.  Combined with a few different channel orderings and data 
layouts, I fear the result is such a multiplicity of combinations that 
the core purpose of GIL's color space construct would be defeated.

Perhaps this is simply at odds with GIL's goal of uncompromising 
performance.  Still, I think the library shouldn't simply exclude such 
cases.  There should be ways to trade various amounts of efficiency for 
various amounts of runtime flexibility.
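
To illustrate the kind of tradeoff I mean, here is a minimal sketch in which 
the color-related properties are described by one small value type that can 
be fixed at compile time or filled in at runtime.  All of the names here 
(format_desc, static_format, same_format, and the enumerators) are 
hypothetical and are not part of GIL; the set of properties is deliberately 
incomplete.

```cpp
// Hypothetical, orthogonal descriptions of properties that a single
// GIL color space type currently has to stand in for. Not GIL API.
enum class color_model { rgb, ycbcr_bt601, ycbcr_bt709 };
enum class transfer_fn { linear, gamma, log };
enum class alpha_mode  { none, straight, premultiplied };

// Runtime form: a plain value that can be read from a file header
// or a bitstream and compared or dispatched on while running.
struct format_desc {
    color_model model;
    transfer_fn transfer;
    alpha_mode  alpha;
    bool        full_range;  // false = reduced "video" range samples
};

constexpr bool same_format(const format_desc& a, const format_desc& b) {
    return a.model == b.model && a.transfer == b.transfer &&
           a.alpha == b.alpha && a.full_range == b.full_range;
}

// Compile-time form: the same properties as template arguments, so an
// algorithm that knows them statically pays nothing at runtime.
template <color_model M, transfer_fn T, alpha_mode A, bool FullRange>
struct static_format {
    static constexpr format_desc desc() {
        return format_desc{M, T, A, FullRange};
    }
};

// Example alias (an assumption made for illustration only).
using studio_709 = static_format<color_model::ycbcr_bt709,
                                 transfer_fn::gamma,
                                 alpha_mode::none, false>;

static_assert(same_format(studio_709::desc(), studio_709::desc()),
              "a static format matches its own runtime description");
```

The point is only that one description can serve both a statically typed 
fast path and a runtime-dispatched slow path, rather than requiring a 
distinct color space type per combination.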

While I generally agree with Lubomir about the numerical aspects of 
promotion & conversion traits (i.e. the safe default will be overkill, 
in most cases), I do think they could have both significant runtime and 
structural advantages.  The runtime advantages would be mostly avoidance 
of extra conversion steps and may require a per-algorithm override, so 
that the policy can be tweaked on a case-by-case basis.  The structural 
advantages would come from being able to establish a semantic 
relationship between the values of different channel types.  In order to 
get this right, I think channel types would rarely be bare built-in 
types, such as int and float.  Instead, you might use a unique type that 
represents its value as a float, but which can have its own 
specializations of traits such as channel_traits<T>::zero_value(), 
channel_traits<T>::unity_gain_value(), and 
channel_traits<T>::saturated_value() (i.e. the "white" value - often 
different from the largest value representable by the type!).  I should 
concede that I haven't developed this idea very far, though it may 
provide a foundation for addressing some of the concerns raised by Dr. 
Reese.
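
A rough sketch of what I have in mind follows.  The wrapper types, the 
rescale_channel helper, and the particular values chosen (16 and 235 for 
8-bit video-range luma) are my own assumptions for illustration; none of 
this is GIL's actual interface, and I am guessing at a sensible meaning for 
unity_gain_value() by making it equal to the white value.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical strong channel types (not part of GIL).
struct video_luma8  { std::uint8_t v; };  // 8-bit video range, nominally 16..235
struct linear_float { float v; };         // float, nominally 0.0..1.0

template <typename T> struct channel_traits;  // specialized per channel type

template <> struct channel_traits<video_luma8> {
    static constexpr video_luma8 zero_value()       { return video_luma8{16}; }
    static constexpr video_luma8 unity_gain_value() { return video_luma8{235}; }
    // "White" is 235, not 255, the largest value the type can hold.
    static constexpr video_luma8 saturated_value()  { return video_luma8{235}; }
};

template <> struct channel_traits<linear_float> {
    static constexpr linear_float zero_value()       { return linear_float{0.0f}; }
    static constexpr linear_float unity_gain_value() { return linear_float{1.0f}; }
    static constexpr linear_float saturated_value()  { return linear_float{1.0f}; }
};

// Map a sample from Src's [zero, saturated] interval onto Dst's.
// No clamping is done, so out-of-range inputs are not handled here.
template <typename Dst, typename Src>
Dst rescale_channel(Src s) {
    using dst_rep = decltype(Dst::v);
    const float sz = channel_traits<Src>::zero_value().v;
    const float sw = channel_traits<Src>::saturated_value().v;
    const float dz = channel_traits<Dst>::zero_value().v;
    const float dw = channel_traits<Dst>::saturated_value().v;
    const float t  = (static_cast<float>(s.v) - sz) / (sw - sz);
    float r = dz + t * (dw - dz);
    if (std::is_integral<dst_rep>::value) r += 0.5f;  // round to nearest
    return Dst{static_cast<dst_rep>(r)};
}
```

With traits like these, a generic conversion can map black to black and 
white to white without knowing anything else about either representation, 
which is the semantic relationship I was referring to.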

Is there any way to create views of an image that include a subset of 
the channels (for subsets larger than 1), besides color_converted_view?  
Or is there some way color_converted_view might come for free, in this 
case (I didn't get a chance to look into that)?  IMO, this is pretty 
important for processing RGBA images since, as Ullrich Koethe points 
out, it's often necessary to treat Alpha differently than RGB (or YUV).  
Same goes for Z-buffers, and a variety of other synthetic and captured 
channels one might imagine.
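
For concreteness, this is the sort of thing I mean, written by hand for the 
simplest case of 8-bit interleaved RGBA.  It is only a sketch: 
rgb_of_rgba_view, rgb_ref, and invert_rgb are hypothetical names, and I do 
not know whether or how GIL could express the same thing with its own view 
and locator machinery.

```cpp
#include <cstddef>
#include <cstdint>

// Proxy exposing only the color channels of one RGBA pixel.
struct rgb_ref {
    std::uint8_t& r;
    std::uint8_t& g;
    std::uint8_t& b;
};

// Non-owning view over interleaved 8-bit RGBA samples that hides Alpha.
class rgb_of_rgba_view {
public:
    rgb_of_rgba_view(std::uint8_t* data, std::size_t pixel_count)
        : data_(data), count_(pixel_count) {}

    std::size_t size() const { return count_; }

    rgb_ref operator[](std::size_t i) const {
        std::uint8_t* p = data_ + 4 * i;  // 4 samples per pixel: R, G, B, A
        return rgb_ref{p[0], p[1], p[2]};
    }

private:
    std::uint8_t* data_;
    std::size_t   count_;
};

// An algorithm written against the RGB subset cannot touch Alpha.
inline void invert_rgb(rgb_of_rgba_view view) {
    for (std::size_t i = 0; i < view.size(); ++i) {
        rgb_ref px = view[i];
        px.r = static_cast<std::uint8_t>(255 - px.r);
        px.g = static_cast<std::uint8_t>(255 - px.g);
        px.b = static_cast<std::uint8_t>(255 - px.b);
    }
}
```

Nothing is copied or converted here; the subset view is just different 
addressing over the same memory, which is why I would hope it could come 
for free.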

Finally, GIL seems to lack any explicit notion of non-uniform sample 
structures.  In video, 4:2:2, 4:1:1, and 4:2:0 are ubiquitous.  An image 
representation that can efficiently support uniform presentation of 
channels with different sample frequencies and phases is important for 
high-performance video processing applications.  I accept that this is 
beyond GIL's intended scope, though I'm citing it as a problem I had 
hoped GIL would solve.  While I believe a synthetic view could be used 
to provide 4:4:4 access to images with other sample structures, such a 
mechanism would likely constitute a severe performance/quality tradeoff 
- at least for a serious video processing application.
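
As a sketch of the synthetic view I am describing, consider planar 4:2:0 
data presented as if it were 4:4:4.  The class and member names below are 
hypothetical and this is not GIL code.  It uses nearest-neighbor chroma 
lookup and ignores chroma sample phase entirely, which is exactly the 
quality compromise I am worried about.

```cpp
#include <cstddef>
#include <cstdint>

struct ycbcr_pixel { std::uint8_t y, cb, cr; };

// Read-only 4:4:4 presentation of planar 4:2:0 data: one Cb and one Cr
// sample is shared by each 2x2 block of luma samples.
class yuv420_as_444_view {
public:
    yuv420_as_444_view(const std::uint8_t* y, const std::uint8_t* cb,
                       const std::uint8_t* cr,
                       std::size_t width, std::size_t height)
        : y_(y), cb_(cb), cr_(cr), width_(width), height_(height) {}

    std::size_t width()  const { return width_; }
    std::size_t height() const { return height_; }

    ycbcr_pixel operator()(std::size_t x, std::size_t y) const {
        // Chroma planes are half resolution, rounded up for odd widths.
        const std::size_t chroma_width = (width_ + 1) / 2;
        const std::size_t c = (y / 2) * chroma_width + (x / 2);
        // Nearest-neighbor: no interpolation, no chroma siting correction.
        return ycbcr_pixel{y_[y * width_ + x], cb_[c], cr_[c]};
    }

private:
    const std::uint8_t* y_;
    const std::uint8_t* cb_;
    const std::uint8_t* cr_;
    std::size_t width_, height_;
};
```

Every pixel access recomputes a chroma index, and a version that 
interpolated chroma properly would cost considerably more per access, so an 
algorithm that is aware of the sample structure will generally do better 
than one that goes through a view like this.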

What is your evaluation of the implementation?
----------------------------------------------
Elegant.  Good focus on performance & generality.

I had a few stylistic issues that probably don't bear much discussion, 
here.  Mostly concerning line length (207 columns!?!) and type names 
(pixel<> vs. color<> - I agree with Fernando; as well as more minor issues).

What is your evaluation of the documentation?
---------------------------------------------
I found the design guide a bit terse, in places.  I'm not sure it's a 
good stand-alone resource for providing a high-level understanding of 
the library.  I may not be a good judge, given my familiarity with the 
problem domain, and the fact that I first watched the Breeze presentation.

Not to contradict others' criticisms, but I did find the Doxygen docs a 
helpful aid, when trying to navigate the header files.

The Breeze presentation provided a great starting point.

Overall, I feel the written documentation isn't quite up to the standard 
of other Boost library docs.  I expect this may be an issue for some 
first time users of the library, and relatively new users trying to find 
answers to specific questions.

What is your evaluation of the potential usefulness of the library?
-------------------------------------------------------------------
It's probably useful enough to warrant acceptance, as is.  Again, its 
facilities for color conversion are hampered by the heavy overloading 
of the color space concept.

Did you try to use the library?  With what compiler?  Did you have any 
problems?
--------------------------------------------------------------------------------
No.

How much effort did you put into your evaluation? A glance? A quick 
reading? In-depth study?
--------------------------------------------------------------------------------------------
Watched the Breeze presentation.  Read the Design guide.  Read all of 
the review-related discussions on the mailing list.  Looked at many of 
the header files & Doxygen docs.

Are you knowledgeable about the problem domain?
-----------------------------------------------
Yes.  Seven of my ten years of professional experience have been focused 
on development of high performance software for: photorealistic 3D 
rendering, video compression, film & video post-production, and computer 
vision.  Prior to that, computer graphics was one of my primary interests.

Overall strengths
-----------------
* Heavy use of templates & focus on matching the performance
  of non-generic code.
* Use of STL idioms and compatibility with STL and Boost.
* Good generalization of most concepts.
* I like the focus on image containers, access, and conversions.
  These are applicable to nearly all graphics & imaging libraries
  and applications, whereas such a universal subset of image
  algorithms is virtually non-existent.

Weaknesses
----------
* Conflation of too many distinct concepts in color spaces.
* In order to better address the problem of providing a unified
  interface over different image representations, more attention
  should be given to the semantics of certain pixel & channel
  values.  This may help make type conversion & promotion more
  tractable and may result in more intuitive algorithm behavior.

Do you think the library should be accepted as a Boost library?
---------------------------------------------------------------
Yes.  I feel the shortcomings mentioned above and by others do not 
prevent GIL from being either usable or useful.  However, its usefulness (and 
perhaps usability) could be greatly enhanced, if these issues could be 
addressed more comprehensively.  In particular, I hope the concepts of 
color space, data layout, and channel type can be better separated, at 
some point.

Matt

Matt Gruenke