
Greg Reese, Ph.D.
Senior Research Computing Specialist
Research Computing Support Group
Gaskill Hall, Room 352
Miami University

Dr. Reese wrote the following review of the GIL library (I've pasted it without modification below):

* * *

Review of Generic Image Library (GIL)

Generic Image Library Design Guide, Version 1.01, October 2, 2006
Generic Image Library Tutorial, Version 1.01, October 2, 2006
Code downloaded October 11, 2006

SHORT REVIEW

What is your evaluation of the design?

Apart from my major concern (see below) the design seems to meet its goals of generality without loss of speed.

What is your evaluation of the implementation?

Good implementation, especially having everything done in headers, with no library necessary. Too much reliance on proxies may lessen use of STL though.

What is your evaluation of the documentation?

Excellent documentation. When released, I suggest adding a lot of examples and checking the spelling in the docs.

What is your evaluation of the potential usefulness of the library?

Unfortunately, I think it's going to be tough to get people to use it. One obvious drawback is the lack of algorithms. This will change with time though. I think a bigger drawback is going to be the highly complex appearance of the code. Applications programmers are going to look at it and faint, even when told that although the internal construction of the code is complicated, its use is not. The authors have put in typedefs for some common image and pixel types, and I encourage them to add even more typedefs and hide as much of the complexity as possible.

I think the library will be most useful in projects that use images of many different types of pixels. In this case the tradeoff between the generality of the library and its complexity is beneficial. In other cases there may have to be a real marketing push. The authors have made a good start in their video tutorial by answering the top ten excuses for not using GIL.

Did you try to use the library?
With what compiler? Did you have any problems?

Tried to use it with Borland C++ Builder 2006 but couldn't because of compilation problems with Boost. Used it with no problems in Microsoft Visual C++ 2005 Express Edition.

How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?

A good amount of effort. Read the tutorial and design guides pretty carefully, watched the online video guide and ran the two sample programs.

Are you knowledgeable about the problem domain?

Yes.

LONG REVIEW

First of all, I'd like to thank Messrs. Bourdev and Jin for writing GIL. They have obviously put a lot of effort and thought into the library. My thanks to them and Adobe also for making it open source. I hope the authors find the comments below constructive. Because of the complexity of the software and the limited time to review, some of the problems I point out may in fact be problems in my understanding and not in the library itself. If that's the case - great! It makes those non-existent problems much easier to solve.

Major Concern

My major concern is the lack of requirements/concepts on arithmetic between Pixels, i.e., pixel classes/structures. For example, in the Guide there are numerous code snippets for the computation of a gradient. These take the form of

    c = (a - b) / 2

where a and b are uint8s and c is an int8. Unless the subtraction operator is defined in some special way not mentioned in the Guide, all of these examples are wrong. The difference of a and b can easily be a negative number but it will be interpreted as a uint8, giving a weird result. Since the most common pixels are uint8 grays and RGB colors made of triplets of uint8s, this is a serious concern. When performing arithmetic with uint8s in image processing, the results need to be saturated, i.e., values above 255 need to be set to 255 and values below 0 need to be set to 0.
One way to do this is to define the binary arithmetic operators so that they convert to int16, do the arithmetic between the two operands, saturate the values if necessary and convert back to uint8. If this technique is used, the requirement for this behavior should be placed on Pixel.

Note too that the operator actions depend on the data type. In many applications, it's okay to assume that there won't be any over/underflow when doing arithmetic on int16s because the number of terms and the pixels' values are usually small. The same holds true for not converting floats to doubles. However, it wouldn't be unusual for a program to want to convert int16s to int32s, so that needs to be an option.

Unfortunately, simply defining binary arithmetic between Pixels is not sufficient. For example, suppose that we've defined binary addition between uint8s as above, so that saturation is handled correctly. Now consider finding the average of two uint8s a and b, each equal to 255. The answer is obviously 255. However, if we use the formula average = (a + b) / 2, a and b will sum to 255 because of saturation, and 255/2 yields a value for the average of 127.

This problem with pixel arithmetic doesn't affect GIL per se because it only represents images, not actions on them. However, having a completely generic image representation that doesn't accommodate pixel arithmetic makes the representation useless for image processing. Moreover, now that the authors have finished the bulk of the work on the image software and are starting on the algorithms to process images, the majority of those algorithms will involve pixel arithmetic, so a general concept of arithmetic between Pixels should now be a concern.

Minor concerns

* The I/O library really should have facilities to read image headers.
Suggestions/questions

* The channels provide their min and max values, but it would be nice if they could also provide the min and max values that the data they hold can attain, i.e., how many bits of data there are, not of data storage. For example, MRIs usually have a 12-bit data range but are stored in 16-bit pixels.

* Can GIL accommodate images with a "virtual range", i.e., in which the x-y range the user accesses is different from the x-y range of the actual data? For example, the Fourier Transform of an n x n image is an image of n x n (complex) numbers. However, half the transformed numbers are redundant and are not stored. Thus it would be useful to let the user access the transformed image as if it were n x n but let the image representation retrieve the appropriate value if the user requests one that is not in the stored range. Perhaps this can be done through some sort of pixel iterator adaptor. How much work would be entailed for all of the library to work on images like these with a virtual range?

* How ubiquitous are proxies in GIL? They can't be used in STL algorithms, which is a big drawback.

* The use of piped views for processing makes me nervous. For example, the Tutorial states that one way to compute the y-gradient is to rotate an image 90°, compute the x-gradient, and rotate it back. First of all, this only works for kernels that have that rotational symmetry, e.g., square or circular kernels. It doesn't work with rectangular kernels. Secondly, even if the kernels have the correct symmetry, the process won't yield usable results if the spatial resolution in the x- and y-directions is different, as happens often with scanners. No computations done by the library should be performed in this indirect manner, or at a minimum, the documentation should clearly warn about these potential problems.

Finally, one possible extension to the design that may be helpful is to incorporate the idea of an image region.
This is a generalization of the subimage in GIL but is not limited to a rectangle. Image regions are useful because they help carry out a typical chain of image processing, which goes something like this:

1. Image enhancement - after the image is acquired, it is processed to bring out information of interest, e.g., made brighter, made sharper, converted from gray to color or vice versa. This kind of processing is done by the algorithms that the GIL people are starting to write.

2. Image segmentation - the image is divided into a moderately small number of regions, with all pixels in a region representing one physical quantity, e.g., diseased plants, a particular metal, cancerous cells, etc.

3. Image representation and description - possible connection of regions into objects, e.g., all cancerous cells in these three segments are part of the same cancer; descriptions of groups of pixels by other measurements, e.g., boundary length, circularity, etc.

4. Image interpretation - deciding what physical objects the region representations correspond to.

For example, consider an industrial CT (computed tomography) scan of an engine. It's easy to distinguish the air from the engine metal, so the image may be enhanced by using fewer pixel values to represent air than metal. The picture could then be segmented into regions representing air and different kinds of metals or other engine materials. These segments could then possibly be connected into objects, perhaps depending on the distance the segments are separated by, their constituent materials, and the intervening material (air or metal). Their properties can also be computed, e.g., size, shape, boundary brightness. Finally, some sort of artificial intelligence software could decide if a region is a carburetor, valve, etc. In all cases, it's often necessary to refer back to the pixels in the original image that are denoted by a particular segment or region.
In the above example, it would be reasonable for the CT image to have 8 or 10 bits per pixel (bpp), a segmented image to have 1 bpp (a crude segmentation into air and metal) or 4 bpp (air and 15 kinds of metal or engine material), and a representation image to have 3 or 4 bpp.

Anyway, some characteristics of a GIL region could be:

o The user should be able to define any region (connected or not, any shape) in the original image.

o The pixel type can be different from that of the original image.

o The image regions should be "stackable", i.e., more than one at a time referring to the original image, like channels in a Pixel.

o The region should have iterators that travel only over that region but that can access pixel values of the region and pixel values of the original image that correspond to the region.

There are various ways to accomplish this region stuff. One way may be by appropriately defining the current subimage structure (a view, I believe) and iteration over it, which is why I brought this topic up.

Once again, thanks to the authors for their hard work. If there are questions, I can be contacted at: reesegj at muohio.edu

Greg Reese

Tom Brinkman wrote:
Greg Reese, Ph.D. Senior Research Computing Specialist Research Computing Support Group Gaskill Hall, Room 352 Miami University
[snip review]
If there are questions, I can be contacted at: reesegj at muohio.edu
Very interesting perspective! One thing that's not clear in his review is that he did not answer this very crucial question:
Please always explicitly state in your review, whether you think the library should be accepted into Boost.
If it's there, I must've missed it. Anyway, that being said, I hope people would join in the discussion rather than simply forward their reviews and ask for a private discussion through an email address. I think that's not the way the Boost culture works.

Regards,
--
Joel de Guzman
http://www.boost-consulting.com
http://spirit.sf.net

Anyway, that being said, I hope people would join in the discussion rather than simply forward their reviews and ask for a private discussion through an email address. I think that's not the way the Boost culture works.
Things are not as nefarious as they seem. I was not asking for a private discussion and am happy to have my review made public but was foiled by a technical glitch. (Or perhaps, since this was my first Boost post, it was a user error...) I joined the Boost developers' list and submitted my review to it on Friday the 13th. (Hmm, could this be the problem?) I received an email confirming my submission and saying that my email

"Is being held until the list moderator can review it for approval.

The reason it is being held:

    Post to moderated list

Either the message will get posted to the list, or you will receive notification of the moderator's decision."

Unfortunately, although I have been receiving daily Boost digests since I joined the list, I neither saw my review posted nor heard from a moderator. In any case, since the review period was about to expire, Tom Brinkman was kind enough to post the review for me.
One thing that's not clear in his review is that he did not answer this very crucial question:
Please always explicitly state in your review, whether you think the library should be accepted into Boost.
My bad. I answered all of the bulleted questions in the review announcement, but missed this other, more important question. My answer is: yes, GIL should be included in Boost.
If there are questions, I can be contacted at: reesegj at muohio.edu
Very interesting perspective!
I thought the reviews just went to the authors and was actually trying to help them out by volunteering to answer any questions they had about what I said. I didn't realize the reviews were posted on the list and discussed by the whole community. Now I know. Again, I apologize for my procedural breaches - I was a Boost-list virgin. I will now keep tabs on this thread for a little while and post responses, this time publicly! Thanks.

Greg Reese

Greg Reese wrote:
Anyway, that being said, I hope people would join in the discussion rather than simply forward their reviews and ask for a private discussion through an email address. I think that's not the way the Boost culture works.
Things are not as nefarious as they seem. I was not asking for a private discussion and am happy to have my review made public but was foiled by a technical glitch. (Or perhaps, since this was my first Boost post, it was a user error...) I joined the Boost developers' list and submitted my review to it on Friday the 13th. (Hmm, could this be the problem?) I received an email confirming my submission and saying that my email
[snip]
Please always explicitly state in your review, whether you think the library should be accepted into Boost.
My bad. I answered all of the bulleted questions in the review announcement, but missed this other, more important question. My answer is: yes, GIL should be included in Boost.
If there are questions, I can be contacted at: reesegj at muohio.edu

Very interesting perspective!
I thought the reviews just went to the authors and was actually trying to help them out by volunteering to answer any questions they had about what I said. I didn't realize the reviews were posted on the list and discussed by the whole community. Now I know.
Again, I apologize for my procedural breaches - I was a Boost-list virgin. I will now keep tabs on this thread for a little while and post responses, this time publicly!
Thank you very much for the clarification. Certainly I misunderstood, and I apologize too for that. Your review was very enlightening and informative.

Regards,
--
Joel de Guzman
http://www.boost-consulting.com
http://spirit.sf.net

Dr. Reese,

Thank you for your detailed review. I am going to post my comments even though I am not sure if you check the Boost thread.

Greg Reese wrote:
Major Concern
My major concern is the lack of requirements/concepts on arithmetic between Pixels, i.e., pixel classes/structures. For example, in the Guide there are numerous code snippets for the computation of a gradient. These take the form of

    c = (a - b) / 2
where a and b are uint8s and c is an int8. Unless the subtraction operator is defined in some special way not mentioned in the Guide, all of these examples are wrong. The difference of a and b can easily be a negative number but it will be interpreted as a uint8, giving a weird result. Since the most common pixels are uint8 grays and RGB colors made of triplets of uint8's, this is a serious concern.
We agree, in general. GIL core simply gives you access to the channels of a pixel in their raw form. When you write an algorithm, you have to be very careful and take into account that the channels can be of any type. But that is true in C++ in general: it will let you assign a signed char to an unsigned char with no complaint. The result will be strange, and you'd better know what you are doing.

As for the example in the tutorial, the types used are as follows:

    unsigned char a, b;
    signed char c;
    c = (a - b) / 2;

This does not result in overflow and works properly. We used this example because the goal of the tutorial was to give a very basic introduction to GIL (things like pixel navigation and views), not to write the real algorithm that computes the gradient. (There are many other problems with the x_gradient algorithm in the tutorial that were intentionally overlooked.)

But in general, you are right: when writing image processing algorithms we need some facilities for dealing with channel arithmetic. However, I claim that they should be part of the numeric extension, the GIL extension that deals with algorithms. In particular, we should have some fundamental atomic channel-level operations defined. For example, we could have the above atomic operation "half_difference" and have specializations for, say, integral channels, to use a shift instead of division. Then the formula could look, for example, like this:

    c = channel_convert<DstChannel>(channel_halfdiff(a, b));
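The reason the tutorial snippet works is worth spelling out: C++'s usual arithmetic conversions promote both unsigned char operands to int before the subtraction, so the intermediate difference can go negative without wrapping. A small sketch (the helper name is mine, not a GIL function):

```cpp
#include <cassert>

// Usual arithmetic conversions: both unsigned char operands are promoted
// to int before the subtraction, so the difference can go negative and
// the division is carried out in int, not in unsigned char.
int half_difference(unsigned char a, unsigned char b) {
    return (a - b) / 2;
}
```

Since the result always lies in [-127, 127], assigning it to a signed char afterwards is also safe.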
When performing arithmetic with uint8s in image processing, the results need to be saturated, i.e., values above 255 need to be set to 255 and values below 0 need to be set to 0. One way to do this is to define the binary arithmetic operators so that they convert to int16, do the arithmetic between the two operands, saturate the values if necessary and convert back to uint8. If this technique is used, the requirement for this behavior should be placed on Pixel.
Agreed - the above atomic channel-level algorithms should take into account saturation.
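The widen-saturate-narrow scheme Dr. Reese describes is easy to sketch. These are hypothetical helpers, not actual GIL functions; the numeric extension may spell them differently:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Widen to int, operate, clamp to the uint8 range, narrow back.
// Hypothetical helpers; not part of GIL itself.
std::uint8_t saturating_add(std::uint8_t a, std::uint8_t b) {
    int sum = int(a) + int(b);                          // cannot wrap in int
    return static_cast<std::uint8_t>(std::min(sum, 255));
}

std::uint8_t saturating_sub(std::uint8_t a, std::uint8_t b) {
    int diff = int(a) - int(b);                         // may go negative
    return static_cast<std::uint8_t>(std::max(diff, 0));
}
```

Widening to int (rather than int16 specifically) is enough here because the sum of two uint8s always fits.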
Note too that the operator actions depend on the data type. In many applications, it's okay to assume that there won't be any over/underflow when doing arithmetic on int16s because the number of terms and the pixels' values are usually small. The same holds true for not converting floats to doubles. However, it wouldn't be unusual for a program to want to convert int16s to int32s, so that needs to be an option.
Unfortunately, simply defining binary arithmetic between Pixels is not sufficient. For example, suppose that we've defined binary addition between uint8s as above, so that saturation is handled correctly. Now consider finding the average of two uint8s a and b, each equal to 255. The answer is obviously 255. However, if we use the formula average = (a + b) / 2, a and b will sum to 255 because of saturation, and 255/2 yields a value for the average of 127.
Agreed - having binary arithmetic between pixels is insufficient.
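The averaging pitfall is easy to reproduce with a saturating add: clamping before the division destroys information, whereas doing the whole computation in a wider type does not. A sketch (all helper names are mine, purely illustrative):

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Saturating add: widen, clamp to 255, narrow (hypothetical helper).
std::uint8_t sat_add(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(std::min(int(a) + int(b), 255));
}

// Pitfall: saturate first, then halve. avg(255, 255) comes out as 127.
std::uint8_t average_saturated(std::uint8_t a, std::uint8_t b) {
    return sat_add(a, b) / 2;
}

// Correct: compute in a wider type, then narrow; no clamping is needed
// because the true average of two uint8s always fits in a uint8.
std::uint8_t average_widened(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>((int(a) + int(b)) / 2);
}
```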
This problem with pixel arithmetic doesn't affect GIL per se because it only represents images, not actions on them.
Exactly!
However, having a completely generic image representation that doesn't accommodate pixel arithmetic makes the representation useless for image processing. Moreover, now that the authors have finished the bulk of work on the image software and are starting on the algorithms to process them, the majority of the algorithms will involve pixel arithmetic, so a general concept of arithmetic between Pixels should now be a concern.
Agreed - one of the very first things to do when writing image processing algorithms, is defining those atomic operations.
* The channels provide their min and max values, but it would be nice if they could also provide the min and max values that the data they hold can attain, i.e., how many bits of data there are, not of data storage. For example, MRIs usually have a 12-bit data range but are stored in 16-bit pixels.
You could use std::numeric_limits<Channel>::max() for this.
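Note that std::numeric_limits reports the maximum of the storage type, not of the data range, so the 12-bit-data-in-16-bit-pixels MRI case would still need a separate, user-supplied trait. A quick illustration:

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// numeric_limits describes the storage type only: a 12-bit MRI stored in
// 16-bit pixels still reports 65535 here, not 4095.
template <typename Channel>
long channel_storage_max() {
    return static_cast<long>(std::numeric_limits<Channel>::max());
}
```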
* Can GIL accommodate images with a "virtual range", i.e., in which the x-y range the user accesses is different from the x-y range of the actual data? For example, the Fourier Transform of an n x n image is an image of n x n (complex) numbers. However, half the transformed numbers are redundant and are not stored. Thus it would be useful to let the user access the transformed image as if it were n x n but let the image representation retrieve the appropriate value if the user requests one that is not in the stored range. Perhaps this can be done through some sort of pixel iterator adaptor. How much work would be entailed for all of the library to work on images like these with a virtual range?
It would be possible to do that using the virtual image abstraction. It allows you to define the pixel value given coordinates x and y. You could have a function that holds the original image and performs an arbitrary processing/caching to return the value at (x,y).
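The idea can be sketched without relying on GIL's actual virtual-image API: a view whose pixels come from a function of (x, y) that consults the stored data or reconstructs redundant values on demand. All names here are illustrative, not GIL's:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// A "virtual" view: pixel values are produced on demand by a function of
// (x, y) rather than read from a dense buffer.
struct virtual_view {
    int width, height;
    std::function<double(int, int)> pixel;
    double operator()(int x, int y) const { return pixel(x, y); }
};

// Example: only the left half of each row is stored; the right half is
// reconstructed by mirroring, loosely analogous to recovering the
// redundant half of a Fourier transform. `stored` must outlive the view.
virtual_view mirrored_view(const std::vector<std::vector<double>>& stored,
                           int full_width) {
    return virtual_view{
        full_width, static_cast<int>(stored.size()),
        [&stored, full_width](int x, int y) {
            int kept = static_cast<int>(stored[y].size());
            return x < kept ? stored[y][x] : stored[y][full_width - 1 - x];
        }};
}
```

The caller sees a full-width image while only half the data exists in memory; whether GIL's virtual image abstraction can be made to iterate efficiently over such a view is the open question Dr. Reese raises.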
* How ubiquitous are proxies in GIL? They can't be used in STL algorithms, which is a big drawback.
Proxies are used to represent a planar image. You are, of course, free to treat a planar image as N single-channel images.
* The use of piped views for processing makes me nervous. For example, the Tutorial states that one way to compute the y-gradient is to rotate an image 90°, compute the x-gradient, and rotate it back. First of all this only works for kernels that have that rotational symmetry, e.g., square or circular kernels. It doesn't work with rectangular kernels. Secondly, even if the kernels have the correct symmetry, the process won't yield usable results if the spatial resolution in the x- and y-directions is different, as happens often with scanners. No computations done by the library should be performed in this indirect manner, or at a minimum, the documentation should clearly warn about these potential problems.
Agreed. Again, this was just used to illustrate what rotated90cw_view does. We don't recommend people actually write y-gradient by reusing x-gradient with rotated views. There are other severe problems, like performance, besides the problems you mention.
Finally, one possible extension to the design that may be helpful is to incorporate the idea of an image region.
Image regions are an interesting concept, and worth considering further. But there could be different models depending on the access-pattern requirements and density. For example, an image view plus a bitmask may make sense for representing dense, localized regions, while a vector of pixel coordinates may be more appropriate for sparse regions. So it makes sense to let this concept emerge, driven by the concrete algorithms.

Thanks again for the detailed review!

Lubomir
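The two region models Lubomir mentions can be sketched side by side. The names and shapes here are mine, purely illustrative of the design space, not proposed GIL types:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Dense model: a bounding rectangle plus a bitmask, one flag per pixel.
// Suits compact, localized regions.
struct mask_region {
    int x0, y0, width, height;          // bounding rectangle in the image
    std::vector<bool> mask;             // width * height flags, row-major
    bool contains(int x, int y) const {
        int lx = x - x0, ly = y - y0;
        return lx >= 0 && lx < width && ly >= 0 && ly < height &&
               mask[static_cast<std::size_t>(ly) * width + lx];
    }
};

// Sparse model: just the member coordinates. Suits scattered regions.
struct coord_region {
    std::vector<std::pair<int, int>> pixels;   // (x, y) pairs
    std::size_t size() const { return pixels.size(); }
};
```

A region iterator in either model could carry a pointer to the original image view, which would satisfy the review's requirement that regions give access to the underlying pixels.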

Lubomir Bourdev wrote:
Dr. Reese,
Thank you for your detailed review. I am going to post my comments even though I am not sure if you check the boost thread.
Thank you!

With all due respect to Dr. Reese, I think posting a review and then asking to be emailed for replies is not right. The review is a public affair. We would also like to read the replies and exchanges that ensue after a reviewer posts his review. A reviewer should also be responsible for answering and replying to the questions and answers related to his review _on_list_, in as much as the ones being reviewed (Lubomir et al.) try as best as they can to answer and reply to the reviews. It's not a one-way street.

Regards,
--
Joel de Guzman
http://www.boost-consulting.com
http://spirit.sf.net
participants (4)

- Greg Reese
- Joel de Guzman
- Lubomir Bourdev
- Tom Brinkman