
Hi Phil, thanks for your email.
I'm not currently using gil, though now that it has been released in 1.35 I will probably convert some of my existing code to use it. However, I have used the C++ interfaces to libjpeg in a couple of other libraries - ImageMagick and DirectFB - and I thought I'd have a look at your code to see how it compares. My application involves decompressing large JPEGs from digital cameras which are displayed on a pan/zoom user interface, which needs to be responsive despite running on somewhat slow hardware. Here are some comments, some of which are feature requests and others are really just questions for someone who might know a bit more about libjpeg than me:
DCT type: libjpeg lets you select one of three different DCT implementations via cinfo.dct_method: an accurate integer method (JDCT_ISLOW), a faster but less accurate integer method (JDCT_IFAST) and a floating point method (JDCT_FLOAT). I have found the faster algorithm to be visually indistinguishable, and it makes decoding about 1/4 to 1/3 faster overall, so I now use it by default. It would be good to have some way to enable this.
Interesting. The new IO interface defines an image_read_info type for each file format. The user can use those structures either to get information from the image that is unimportant for the reader or to set some values that the reader can use while reading. Your suggestion would fall into the latter. Let me do some research on how to use this flag. As the default I would suggest using the more expensive but more accurate setting. But of course the documentation will describe such optimizations.
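As a rough illustration of how such a setting could sit in a per-format settings structure, here is a minimal sketch. Everything in it is hypothetical: jpeg_read_settings, dct_choice and prefers_speed are made-up names and are not the gil IO interface. The enum only mirrors the three choices libjpeg exposes through cinfo.dct_method, and the default is the accurate one, as suggested above.

```cpp
#include <cassert>

// Hypothetical mirror of the three libjpeg DCT choices; not part of gil.
enum class dct_choice {
    integer_slow,   // accurate integer method (default)
    integer_fast,   // faster, less accurate integer method
    floating_point  // floating point method
};

// Hypothetical per-format read settings, loosely modelled on the
// image_read_info idea: the user sets values, the reader consumes them.
struct jpeg_read_settings {
    dct_choice dct = dct_choice::integer_slow; // accurate unless the user opts out
};

// Returns true when the caller explicitly traded accuracy for speed.
inline bool prefers_speed(const jpeg_read_settings& s) {
    return s.dct == dct_choice::integer_fast;
}
```

A reader built on libjpeg would translate the chosen value into cinfo.dct_method before starting decompression; that glue is left out here because it depends on the final interface.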
Scaled decoding: libjpeg has a mode in which it will scale the image down to 1/2, 1/4 or 1/8 of its native size while decoding (see cinfo.scale_num and cinfo.scale_denom). This is much less expensive than doing the full decode and then scaling. It would be good to have some way of using this.
That would be another candidate to include in image_read_info.
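To make the scaled decoding idea concrete, the sketch below picks the largest reduction (1/2, 1/4 or 1/8) that still leaves the image at least as big as the area it will be shown in. The names image_size, scaled_size and pick_denominator are hypothetical helpers, not libjpeg or gil functions, and the round-up of the scaled dimensions is an assumption about how libjpeg computes its output size.

```cpp
#include <cassert>
#include <cstdint>

struct image_size {
    std::uint32_t width;
    std::uint32_t height;
};

// Size after scaling by 1/denom, rounding up (assumed to match libjpeg).
inline image_size scaled_size(image_size full, std::uint32_t denom) {
    assert(denom == 1 || denom == 2 || denom == 4 || denom == 8);
    return image_size{(full.width + denom - 1) / denom,
                      (full.height + denom - 1) / denom};
}

// Largest denominator in {1, 2, 4, 8} whose scaled image still covers
// the wanted size in both dimensions.
inline std::uint32_t pick_denominator(image_size full, image_size wanted) {
    std::uint32_t best = 1;
    for (std::uint32_t d = 2; d <= 8; d *= 2) {
        const image_size s = scaled_size(full, d);
        if (s.width < wanted.width || s.height < wanted.height) {
            break; // any further reduction would be too small
        }
        best = d;
    }
    return best;
}
```

With libjpeg the result would go into cinfo.scale_denom (with cinfo.scale_num set to 1) before decompression starts.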
Partial image decoding: I see that you have some code to extract a portion of an image, but that it does this by reading and discarding the unwanted parts. You should certainly be able to avoid doing this for the parts after the wanted portion, as you note in the comments. I have been trying to work out how to avoid doing this work for the earlier parts of the image. It should be possible to skip the expensive DCT for these parts as libjpeg has a mode in which it will perform only Huffman decoding and returns DCT coefficients (jpeg_read_coefficients). However, I've not found a way to ask it to do the DCT (and subsequent steps) for the wanted subset of those values. I believe that it's also possible to skip lines in the JPEG data (i.e. not even doing the Huffman decoding) by looking for an FFDA marker, but I don't see anything in libjpeg to support that.
I believe the smallest entity to read is a scanline. Because of compression there is no random access for scanlines or parts of a scanline, so skipping whole scanlines or parts of scanlines might be problematic. It will require some research to find the right way. I totally agree on minimizing expensive operations.
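The part that is easy today, stopping as soon as the last wanted scanline has been read, can be sketched without any libjpeg specifics. In the sketch below the scanline source is just a callable; read_rows, crop_result and the scanline alias are hypothetical names. Rows before the wanted range are still read and discarded, which is exactly the cost discussed above.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

using scanline = std::vector<unsigned char>;

struct crop_result {
    std::vector<scanline> rows;  // the kept rows, top to bottom
    std::size_t lines_read = 0;  // how many scanlines were pulled from the source
};

// Keeps rows [first, last). read_next is assumed to deliver scanlines in
// order from the top of the image, one per call.
inline crop_result read_rows(const std::function<void(scanline&)>& read_next,
                             std::size_t first, std::size_t last) {
    assert(first <= last);
    crop_result out;
    scanline buffer;
    for (std::size_t y = 0; y < last; ++y) { // never reads past the last wanted row
        read_next(buffer);
        ++out.lines_read;
        if (y >= first) {
            out.rows.push_back(buffer);      // rows before 'first' are discarded
        }
    }
    return out;
}
```

With libjpeg itself, abandoning the rest of the image after the last wanted row should be a matter of calling jpeg_abort_decompress instead of reading to the end.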
MMX/SSE: It should be possible to get a significant speedup for the DCT using MMX or SSE instructions on x86 machines. Versions of libjpeg that do this have existed, but it seems that they were buggy and not well maintained; Debian stopped shipping theirs.
What I have noticed is that I needed to recompile the lib on each Windows machine. I haven't researched it thoroughly, but if you compile it yourself using MMX it might work on one machine and fail on another. Does anyone here have experience to share?
Byte packing: I recall that the output from libjpeg is a packed sequence of red/green/blue values, which you typically then unpack to 32 bits per pixel. Wouldn't it be nice if libjpeg would save it in this format?
libjpeg? Not sure if I can follow here. I think libjpeg only supports interleaved images. When reading a scanline we have a std::vector with the values which then are memcpy'ed into gil::image in the case we only read and not convert. gil's copy_pixels should do that automatically.
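For reference, the unpacking step mentioned in the question looks roughly like this when done by hand. This is only a sketch: unpack_rgb is a hypothetical helper, and the 0x00RRGGBB layout is an assumption, since the layout a particular display surface expects may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Expands tightly packed R,G,B bytes into one 32-bit 0x00RRGGBB value per pixel.
inline std::vector<std::uint32_t> unpack_rgb(const std::vector<std::uint8_t>& packed) {
    assert(packed.size() % 3 == 0); // whole pixels only
    std::vector<std::uint32_t> out;
    out.reserve(packed.size() / 3);
    for (std::size_t i = 0; i < packed.size(); i += 3) {
        const std::uint32_t r = packed[i];
        const std::uint32_t g = packed[i + 1];
        const std::uint32_t b = packed[i + 2];
        out.push_back((r << 16) | (g << 8) | b);
    }
    return out;
}
```

Doing this per scanline costs an extra pass over the data, which is why having the decoder write the padded format directly would be attractive.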
Rotation: It's possible to rotate a jpeg image inexpensively by fiddling with the DCT coefficients. But I find the same situation as for partial image decoding: having asked libjpeg to decode to DCT coefficients, and then having fiddled with them to effect the rotation, I can see no function that will complete the decoding to pixel values. Maybe I'm missing something.
I will put this feature on my todo list. But there are more pressing issues right now.
In-memory jpegs: If I get my jpeg data from somewhere other than a file, e.g. from some sort of database, or from an in-memory cache or an mmap()ed file, do you have a way to use it?
I remember doing such things at one point in my past. It was for a jpeg streaming application. It worked pretty well and the source code should be somewhere on google groups. I haven't thought about such features for the new IO interface yet. There are more pressing problems right now, most notably too many template instantiations. Thanks a lot for the valuable input. Christian