Re: [boost] GIL io_new review

11 Dec 2010

      Hi Lubomir, apologies in advance for not covering all the points...short on
time ;)

"Lubomir Bourdev" <lbourdev@adobe.com> wrote in message
news:286B66C8-D267-417C-9442-58E4A491EBFD@adobe.com...
...
All we can hope for is push the boundary as much as we can, and leave
the rest for the next update.
As is often the case...however there are more future-proof and less
future-proof designs/interfaces (judged by how much can you
change/add/remove without breaking compatibility and/or performance)...Which
I think is what you also stressed below:
...
What is important in the first release is to get the design right.
Once we have a solid foundation we can add a lot of functionality quickly
and transparently.
...
On the issue of using classes vs free functions, I don't think there is a
huge difference either way. The the new_io code borrows the style from
GIL, which uses free functions as algorithms (if you call I/O
"algorithms").
GIL in turn borrows this style from the STL. Using functions one can
extend
functionality with overloading (but there are ways to do this with classes
too)
and it is nicer to be able to read/write a file with one line, without
having to
first construct a reader/writer and then call it. But these are minor
quibbles.
I understand you had limited time to go through all the, now really numerous
posts, but I think I've shown (in the pre-review discussion) that these are
not minor quibbles...The free function interface has issues both in terms of
performance and in terms flexibility/power of access to the backend that it
gives...So far shown attempts to fix those where all shifts towards public
classes/objects representing the image to be read/written (and/or further
increase in the number of parameters)...

Furthermore, the classes approach can also provide a oneliner interface as
io2 already does (as shown in the pre-review discussion) using static member
functions...it is provided for all backends in a single place by the CRTP
base class...so you can for example simply write libpng_image::read( ... )
or libtiff_image::write( ... ) which will construct the local reader/writer
object for you and do the rest of the required work....
...
On the issue of performance, I don't think one test with one file format
being
30% slower proves that there are serious performance issues. My guess is
that
Christian has been focusing on adding functionality and never got around
to
work on performance improvements, which usually means that we can get a
lot of speedup by doing some simple optimizations in the inner loops.
Actually there were two file formats (JPEG and PNG) and the speed difference
was ~37.5% (the size difference was ~30%)...In fact simply looking
at/stepping through the implementation of io and io_new (which I've
elaborated in more detail previously) is enough to 'see' the inefficiency
w/o testing but I did the test anyway...

I simply cannot follow the logic of making a priori slow software (by
ignoring obvious inefficiencies) and then making it fast later (unless of
course for the later selling of some 'clever' corporate marketing fog which
obviously is not the case here)...
There would be no need for serious performance improvements if it weren't
for the, unfortunately so prevalent, 'habit'/'school of coding' that simply
by default uses STL containers/constructs for just about anything without
even blinking an eye...
And the injudicious use of STL containers coupled with the inherent template
bloat are the major reasons why io_new (and the current io) are slow (not
counting the issues inherent in the interface/design)...

In this context I actually do not blame Christian almost at all for this but
the 'climate' that fosters this "Way"...

It is no wonder that 'we live in the age' of several-megabyte-single-dialog
software if 'every' library author thinks it OK to add a few tens or even a
couple of hundreds of unnecessary kilobytes of code because it supposedly
will not be noticeable because of disk access or some similar 'by-the-book'
excuse (that is rarely actually tested)...

ps. Ideally, in most cases there should be no 'inner-loop' in GIL.IO (other
than those within the backend library itself)...
...
On the issue of using/not using streams, I don't have a strong opinion,
except that stream support for I/O was one of the highest requested
functionalities in GIL's review. Domagoj, you should have been more
vocal at that time. Christian simply responds to the wishlist of the
Boost community. In general, I feel that performance might be a bit
overhyped in recent discussions. The image I/O operations are often
dominated by disk access and by compression, etc. I don't think
avoiding a copy of the pixels would make much of a difference in the
majority of uses. That said, we do want to make sure GIL I/O is as fast
as it can be.
There are assumptions in this which you must/should test before you start
believing in them...I have already done numerous tests and analysis (not
only for GIL.IO) that prove those assumptions wrong...

That iostreams are evil is a simple _fact_ demonstrated countless times by
numerous people...And it seems there is no longer any point in arguing this
fact as it is simply ignored by people that seem to have some sort of a
priori trust in the STL (and you can't rationally argue with 'blind faith')
as if it is some 'holy artefact' of the C++ language when in fact it is
sometimes its stumbling block...

When I discuss/consider 'efficiency' I always consider both code speed and
size (as this also affects the speed even if sometimes not of the program
itself when tested/considered in isolation but of the whole system affected
by N such 'bloating decisions'...i.e. link the result of sizeof( Windows
Vista installation folder ) to its performance...)...In this light it is IMO
wrong for any library author to make any bloat-inducing decisions based on
assumptions of the type 'oh the OS call I make will always be slower than
anything I can possibly do'...By that rationale Boost.ASIO could then
allocate random std::strings before making a network call 'that will surely
be slower than anything done in a local process'...

With that said, I have no problem with providing an iostream interface (as
an additional option, that has no influence on other code) only with
providing _only_ an iostream interface (which AFAICT is currently the case
for in memory images...luckily not for all of io_new)...
...
...on the basis that there is another alternative libarary (Domagoj's
library).
I don't think this is a good idea. I looked briefly at his library; there
are things
I like about it but also things I don't. There are certainly a lot of
things that
are missing. I will refrain from discussing it because it is not under
review
Of course, there are things I don't like in it :) As you said it is not
under review ;)
...
This is a huge effort on behalf of Christian, he produced some great
and very useful work and a reject will be demoralizing after so many
years of work, and is unjustified given that fundamentally he has the
right design.
Of course, I agree fully (except for the last part, that io_new has a
fundamentally right design)...
My vote 'no' was not a 'rejection' vote: as I've stated numerous times I
think Christian has done a truly massive piece of work...but...a massive
piece of work can also require a massive review process (as this is turning
out to be)...of which it can be expected to give a few 'unsatisfactory'
marks in the process before reaching a successful end...much like a mentor
might not simply reject a student's thesis but return it for a work up...

The 'massiveness' of the thing in our hands is further pointed out by the
fact that we have so far mostly or even exclusively seen reviews of only the
IO extension which comprises only a part of Christian's work (which I
suspect might also be 'frustrating' for Christian...as even a negative
review is often better than no review at all)...
...
- Support for different backends. It appears that this can be done simply
the
same way a file format can be extended, i.e. have "png_libpng" and
"png_lodepng"
tags. It is not immediately obvious to me why this would be a bad design
choice;
if someone disagrees they should elaborate.
For example adding another set of tags could further complicate
things/implementation (even maybe slowing compilation) with another required
tag->implementation mapping...Directly using the backend wrapper (e.g.
libjpeg_image::... ) seems IMHO simpler with no adverse effects...
...
Finally, here is an idea that I like, but may be too much work to do in
the current release:
I would like to have a GIL virtual image view associated with a given
file. That would
allow reading/writing to simply be copy_pixels, and reading into a
different format to
be copy_and_convert_pixels. Reading a given ROI could simply be done by
creating a
subimage_view. The cool thing about this is one could in theory make
changes to
images without even loading them in memory. The caviat is that doing this
right and
efficiently is very hard, requires clever caching, plus a wrong access
pattern can
result in horribly bad performance.
Actually this is what I tried to do at the very beginning when I started to
work on io2 with the GDI+ backend (the gp_view classes at the end of
gp_image.hpp, it is relatively easy to do this with GDI+ and WIC) but I
stopped after an initial prototype version as I of course had to catch up
and concentrate on a zilion other features either already present in io_new
or simply of a higher priority...

ps. There is also one important aspect (that has AFAICT only been briefly
mentioned so far) and that is of a header only library...having dealt with a
good number of different backends (and their various 'quirks') and being
able to extract/produce a fair amount of non-template code I'd also vote for
a change to a non-header-only library for various reasons...

-- 
"What Huxley teaches is that in the age of advanced technology, spiritual
devastation is more likely to come from an enemy with a smiling face than
from one whose countenance exudes suspicion and hate."
Neil Postman

Re: [boost] GIL io_new review

Domagoj Saric