
Christian wrote:
might know a bit more about libjpeg than me:
DCT type: libjpeg lets you select one of three different DCT implementations via cinfo.dct_method: integer, normal floating point and a faster floating point algorithm that is less accurate. I have found the output of this faster algorithm to be visually indistinguishable, and it makes decoding about 1/4 to 1/3 faster overall, so I now use it by default. It would be good to have some way to enable this.
Interesting. The new IO interface defines an image_read_info type for each file format. The user can use those structures either to get information from the image that is unimportant for the reader, or to set values that the reader can use while reading. <snip>
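To make the idea of a per-format settings structure concrete, here is a minimal sketch. All names in it (dct_method, jpeg_read_settings, libjpeg_constant) are hypothetical and are not the actual types of the new IO interface. The only libjpeg facts assumed are the J_DCT_METHOD constant names from jpeglib.h: JDCT_ISLOW, JDCT_IFAST and JDCT_FLOAT.

```cpp
#include <cassert>
#include <string>

// Hypothetical: a format-independent way to name the DCT choice.
enum class dct_method { integer_slow, integer_fast, floating_point };

// Hypothetical JPEG-specific settings a user could fill in before reading.
struct jpeg_read_settings {
    dct_method dct = dct_method::integer_slow;  // accurate default
};

// Name of the libjpeg constant a reader would assign to cinfo.dct_method.
// Returned as text so the sketch compiles without libjpeg installed.
inline std::string libjpeg_constant(dct_method m) {
    switch (m) {
        case dct_method::integer_slow:   return "JDCT_ISLOW";
        case dct_method::integer_fast:   return "JDCT_IFAST";
        case dct_method::floating_point: return "JDCT_FLOAT";
    }
    assert(false && "unknown dct_method");
    return "";
}
```

A real reader would presumably assign the matching constant to cinfo.dct_method before starting decompression; the string mapping here only stands in for that step.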
Scaled decoding: libjpeg has a mode in which it will scale the image down to 1/2, 1/4 or 1/8 of its native size while decoding (see cinfo.scale_num and cinfo.scale_denom). This is much less expensive than doing the full decode and then scaling. It would be good to have some way of using this.
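A small sketch of how a caller might pick the scale factor. It assumes that libjpeg rounds the scaled dimensions up, and that cinfo.scale_num and cinfo.scale_denom are set after jpeg_read_header and before jpeg_start_decompress. The helper choose_scale_denom is hypothetical and not part of libjpeg.

```cpp
#include <cassert>

// Output dimension when decoding at 1/denom scale, rounded up.
inline unsigned scaled_dimension(unsigned size, unsigned denom) {
    assert(denom == 1 || denom == 2 || denom == 4 || denom == 8);
    return (size + denom - 1) / denom;
}

// Hypothetical helper: the largest denominator in {1, 2, 4, 8} whose
// scaled output is still at least min_width x min_height.
inline unsigned choose_scale_denom(unsigned width, unsigned height,
                                   unsigned min_width, unsigned min_height) {
    unsigned best = 1;
    for (unsigned denom = 2; denom <= 8; denom *= 2) {
        if (scaled_dimension(width, denom) >= min_width &&
            scaled_dimension(height, denom) >= min_height) {
            best = denom;
        }
    }
    return best;
}
```

For a thumbnail view, a reader could call choose_scale_denom with the thumbnail size, decode at that scale, and do only the small remaining resize itself.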
I think the tiff format, which I believe was designed to be a kind of meta-format on top of jpeg and others, has a huge slew of 'tags' that you can peruse on the web. I think each tag corresponds to some piece of info that can be read from or written to the underlying image. These might be useful for providing the kinds of parameters that could/should be read or written.
Partial image decoding: I see that you have some code to extract a portion of an image, but that it does this by reading and discarding the unwanted parts. You should certainly be able to avoid doing this for the parts after the wanted portion, as you note in the comments. I have been trying to work out how to avoid doing this work for the earlier parts of the image. It should be possible to skip the expensive DCT for these parts as libjpeg has a mode in which it will perform only Huffman decoding and returns DCT coefficients (jpeg_read_coefficients). However, I've not found a way to ask it to do the DCT (and subsequent steps) for the wanted subset of those values. I believe that it's also possible to skip lines in the JPEG data (i.e. not even doing the Huffman decoding) by looking for an FFDA marker, but I don't see anything in libjpeg to support that.
I believe the smallest entity to read is a scanline. Because of compression there is no random access to scanlines or parts of a scanline, so skipping whole scanlines or parts of scanlines might be problematic. It will require some research to find the right ways. I totally agree on minimizing expensive operations.
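The scanline constraint can be illustrated with a mock sequential source. This is only a sketch: mock_scanline_source and read_region are hypothetical stand-ins, not libjpeg or IO-extension code. It shows that rows above the wanted region still have to be decoded and thrown away, while rows below it never need to be requested.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for a compressed stream: rows can only be produced in order.
struct mock_scanline_source {
    std::size_t width;
    std::size_t height;
    std::size_t next_row;  // number of rows decoded so far

    mock_scanline_source(std::size_t w, std::size_t h)
        : width(w), height(h), next_row(0) {}

    // "Decodes" the next row; pixel value = row * width + column.
    std::vector<int> read_row() {
        assert(next_row < height);
        std::vector<int> row(width);
        for (std::size_t x = 0; x < width; ++x)
            row[x] = static_cast<int>(next_row * width + x);
        ++next_row;
        return row;
    }
};

// Extracts the w x h region whose top-left corner is (x, y).
std::vector<std::vector<int> > read_region(mock_scanline_source& src,
                                           std::size_t x, std::size_t y,
                                           std::size_t w, std::size_t h) {
    assert(x + w <= src.width && y + h <= src.height);
    std::vector<std::vector<int> > region;
    // Rows above the region: decoded and discarded (no random access).
    for (std::size_t r = 0; r < y; ++r) src.read_row();
    // Wanted rows: decode, then keep only the requested columns.
    for (std::size_t r = 0; r < h; ++r) {
        std::vector<int> row = src.read_row();
        std::vector<int>::iterator first =
            row.begin() + static_cast<std::ptrdiff_t>(x);
        region.emplace_back(first, first + static_cast<std::ptrdiff_t>(w));
    }
    // Rows below the region are simply never read.
    return region;
}
```

After the call, src.next_row tells how many rows were actually decoded, which is y + h rather than the full image height.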
I have always thought that an image reader/writer should act a lot like an image that holds memory for the pixels, except that it could be lazy about when it allocated. You would create one from a file, but it would only read the header. You would ask it for its size or other properties, or request a view to part of the pixels. It would load what was needed (possibly more) into memory. You would be able to pass a visitor to some operations and it would call the visitor at various phases of decoding with format specific info, which may be an array of coefficients, a scan line, a single-channel-view, or a view to a tile that is smaller than the region requested. This way somebody could customize handling of various kinds of format specific info in a generic way, or just print progress (% complete) messages.

For example:

    jpeg_image_t jpeg(filename);
    size_t w = get(jpeg, tags::width);
    size_t h = get(jpeg, tags::height);
    double compression_ratio = get(jpeg, tags::compression_ratio);

    struct my_visitor {
        // Swallow unhandled events
        template<class Event, class Data>
        void operator()(Data & data, Event const& event) {}

        template<class View>
        void operator()(View & tile, events::on_tile) {
            // Do something ....
        }
    };

    // Request a portion to be loaded into memory somewhere
    jpeg_image_t::view_t roi = jpeg.lock_view(x, y, width, height, my_visitor());

I haven't considered all the details, but this kind of interface seems like it would be good, if it works.

-- John
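A compilable variation on the visitor idea, as a sketch only. Everything in it is hypothetical: mock_image stands in for a lazy format-specific image, tile_view for a real pixel view, and the events are reduced to on_scanline and on_tile. No actual decoding happens; the point is how a visitor can handle one event and swallow the rest.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace events {
struct on_scanline {};
struct on_tile {};
}

// Hypothetical tile description: a run of rows inside the requested region.
struct tile_view {
    std::size_t first_row;
    std::size_t row_count;
};

// Mock lazy image: holds only the size until a region is requested.
class mock_image {
public:
    mock_image(std::size_t w, std::size_t h) : width_(w), height_(h) {}
    std::size_t width() const { return width_; }
    std::size_t height() const { return height_; }

    // "Loads" the region in tiles of tile_rows rows, notifying the visitor
    // once per scanline and once per completed tile.
    template <class Visitor>
    std::vector<int> lock_view(std::size_t x, std::size_t y,
                               std::size_t w, std::size_t h,
                               Visitor& visitor, std::size_t tile_rows = 4) {
        assert(x + w <= width_ && y + h <= height_ && tile_rows > 0);
        std::vector<int> pixels(w * h, 0);
        for (std::size_t row = 0; row < h; row += tile_rows) {
            std::size_t count = (h - row < tile_rows) ? h - row : tile_rows;
            for (std::size_t r = 0; r < count; ++r) {
                std::size_t line = y + row + r;
                visitor(line, events::on_scanline());
            }
            tile_view tile = {row, count};
            visitor(tile, events::on_tile());
        }
        return pixels;
    }

private:
    std::size_t width_;
    std::size_t height_;
};

// Counts tiles and ignores every other event.
struct counting_visitor {
    std::size_t tiles = 0;

    // Swallow unhandled events.
    template <class Data, class Event>
    void operator()(Data&, Event const&) {}

    void operator()(tile_view&, events::on_tile) { ++tiles; }
};
```

The non-template overload wins for on_tile events, while scanline events fall through to the catch-all template, so a visitor only has to spell out the events it cares about. A progress reporter would be the same shape with a percentage printed in the on_tile handler.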