[boost] Re: IOStreams formal review start

30 Aug 2004

      ...
On 8/28/04 8:09 PM, "Jeff Garland" <jeff@crystalclearsoftware.com> wrote:
...
Today (August 28th, 2004) is the start of the formal review of the Iostreams
library by Jonathan Turkanis. I will be serving as review manager.  Note
"Daryle Walker" <darylew@hotmail.com> wrote:

that
...
...
this is a somewhat unusual situation in that we have several libraries that
overlap in the same area, so comments related to the MoreIo overlap are
needed.
http://home.comcast.net/~jturkanis/iostreams/
Thanks for the review!
...
Some concerns over what qualifies:
1.  Aren't memory-mapped files and file descriptors highly platform
specific?
Yes, just like threads, sockets and directory iteration.
...
Code that works with them would have to be non-portable, so I
don't think they're appropriate for Boost.
It achieves portability the same way boost.thread and boost.filesystem do: by
having separate implementations for different systems. See
http://www.boost.org/more/imp_vars.htm ("Implementation variations").
...
2.  This library does what a lot of other text-I/O libraries do, try to fit
in "kewl" compression schemes.  The problem is that the types of compression
here are binary oriented; they convert between sets of byte streams.
However, characters are not bytes (although characters, like other types,
are stored as bytes).
Are you saying there are problems with the implementation of the compression
filters, e.g., that they make unwarranted assumptions about 'char'? If so,
please let me know. I'm sure it can be fixed.
...
There are issues of character-to-byte-sequence
conversion.  These issues, and binary I/O itself, should _not_ be snuck in
through a text-I/O component.  Worse, we have an upcoming library
(Serialization) that also has to deal with binary I/O stuff, so we should
have a binary I/O strategy.  That means that the compression stuff should be
skipped for now.
...
(If someone says that this library is supposed to be text and/or binary
I/O, that's even worse!  The two types should be distinct and not just
glommed together.)
I don't see the iostream framework as relating to text streams only: streams
can handle text and binary. In some cases, you want text and binary to work
together. E.g., suppose you have a compressed text file ("essay.z") and you want
to read a 'family-friendly' version of it. You can do so as follows:

    filtering_istream in;
    in.push(regex_filter(regex("damn"), "darn"));
    in.push(zlib_decompressor());
    in.push(file_source("essay.z"));
    // read from in.

Isn't this perfectly natural and convenient? What's wrong with using the
decompressor and the regex filter in the same chain?
...
Now to the review itself:
I thought it had already started ;-)
...
1.  Actual code using this library is very slick and easy to set up.  This
ease of use/set-up also applies to the plug-in filters and/or resources.
Thanks.
...
The end-user experience is so awesome that we could just say "ship it"
and go home.  However, the experience is the only good point.  That ease
comes with a nasty little price tag, or more accurately, a nasty BIG price
tag.
The core question on keeping this library is:
Besides filtering/chaining, is there any part of the interface that
    isn't already covered by the standard stream (buffer) system?
Can I rephrase this as follows: InputFilters and OutputFilters are a useful
addition to the standard library, but Sources and Sinks just duplicate
functionality alread present? If this is not your point please correct me.

There are two main resons to write Sources and Sinks instead of stream buffers:

1. Sources and Sinks and sinks express just the core functionality of a
component. Usually you have to implement just one or two functions with very
natural interfaces. You don't have to worry about buffering or about putting
back characters. I would have thought it would be obvious that it's easier to
write:

        template<typename Ch>
        struct null_buf {
            typedef Ch char_type;
            typedef sink_tag category;
            void write(const Ch*, std::streamsize) { }
        };

than to write your null_buf, which is 79 lines long.

2. Sources and sinks can be reused in cases where standard streams and stream
buffers are either unnecessary or are not the appropriate abstraction. For
example, suppose you want to write the concatenation of three files to a string.
You can do so like this:

    string s;
    boost::io::copy(
        concatenate(
            file_source("file1"),
            file_source("file2"),
            file_source("file3")
        ),
        back_insert_resource(s)
    );

This is IMO clearer and possibly much more efficient than:

    string s;
    ofstream file1("file1");
    ofstream file2("file2");
    ofstream file3("file3");
    ostringstream out;
    out << file1.rdbuf();
    out << file2.rdbuf();
    out << file3.rdbuf();
    s = out.str();

(Note: concatenate is unimplemented, pending resolution of some open issues such
as the return type of 'write').

Another example, for the future, might be resources for asynchronous and
multiplexed i/o. (See 'Future Directions', http://tinyurl.com/6r8p2.)
...
The whole framework seems like "I/O done 'right'", a "better" implementation
of the ideas/concepts shown in the standard I/O framework.
I'd say thanks here if 'right' and 'better' weren't in quotes ;-)
...
The price is a
code size many times larger than the conventional system,
Are you talking about the size of the libray or the size of the generated code?
Most of the filter and resource infrastructure, as found in headers such as
<boost/io/io_traits.hpp>, <boost/io/operations.hpp> and
<boost/io/detail/resource_adapter.hpp>, should compile down to nothing with
optimizations enabled. A typical instantation of streambuf_facade should be only
slightly larger, if at all, than a typical hand-written stream buffer. The main
difference is that more protected virtual function may be implemented than are
actually needed. Even this can be fixed, if necessary; at this point it seems
like a premature optimization, unless you have some data.
...
and a large chunk
of it is a "poor man's" reflection system.
Do you mean the i/o categories? This follows the example of the standard library
and the boost iterator library. It's better than reflection, since you can't get
accidental conformance.
...
1a. An issue that appeared during the previous review (my More-I/O library)
was the necessity of the stream base classes that could wrap custom
stream-buffer classes.  The complaint was that the wrappers couldn't be a
100% solution because they can't forward constructors, so a final derived
class has to be made the manually carries over the constructors.  Well, that
is true, mainly because C++ generally doesn't provide any member forwarding
(besides in a limited way with inheritance).  The sample stream-buffer in
More-I/O generally had added-value member functions attached, that perform
inspection or (limited) reconfiguration.  Those member functions also have
to be manually carried over to the final derived stream class.  The point
(after all this exposition) is that the More-I/O framework at least
acknowledges support for value-added member functions.  The Iostreams
framework seems to totally ignore the issue!  (So we could have a 90%
solution for a quarter of the work, but triple the #included code length!)
With a streambuf_facade or stream_facade you can access the underlying resource
directly using operators * and ->. E.g.,

    stream_facade<tcp_resource> tcp("www.microsoft.com", 80);
    ...
    if (tcp->input_closed()) {
       ...
    }

Maybe I should stress this more in the documentation. (I imagine some people
won't like the use of operators * and -> here, but these can be replaced by a
member functions such as resource().)
...
2.  Another issue that appeared during the More-I/O review was that plug-ins
for Iostreams could be used for other ultimate sources/sinks unrelated to
standard streams.  What would those be?  If a system in question supports
Standard C++, it probably uses the Standard I/O framework for (text) I/O.
If the system doesn't/can't use Standard I/O, it has to use a custom I/O
framework.  To use it with Iostreams framework plug-ins, an adapter would
have to be made for the custom I/O routines.  In that case, an adapter could
be made for the Standard I/O framework instead. (As I said in the prelude,
I'm skipping over binary-I/O concerns.)
I don't follow this.
...
To sum it up:
I would REJECT the library (at least for now).  Maybe chain-filtering
    stream-buffer and stream base classes could be made instead?
Sorry to hear that -- I hope I can change your mind.

Jonathan