Re: [boost] [rfc] I/O Library Design

20 Jun 2007

Sebastian Redl wrote:
> ...
> A few weeks ago, a discussion that followed the demonstration of the
> binary_iostream library made me think about the standard C++ I/O and
> what I would expect from an I/O model.
> Now I have a preliminary design document ready and would like to have
> some feedback from the Boost community on it.

Hi Sebastian,

This is an interesting document and you have obviously put a lot of 
work into it.  My few thoughts follow.  I can't claim to have great 
insight into this problem, but there have been more than a couple of 
times when the limitations of what is currently available have struck me.

** Formatting of user-defined types often broken in practice.

The ability to write overloaded functions to format user-defined types 
for text I/O is attractive in theory, but in practice it always lets me 
down somewhere.  My main complaint is that neither of these works:

typedef std::set<thing> things_t;
operator<<(things_t things) { .... }  // doesn't work because things_t is a typedef

uint8_t i;
cout << i;  // doesn't work: uint8_t is usually a typedef for unsigned char, so this prints a character

When I do have a class, I often find that there is more than one way in 
which I'd like to format it, but there is only one operator<< to 
overload.  And often I want to put the result of the formatting into a 
string, not a stream.

So for all of these reasons I have more explicit to_str() functions in 
my code than operator<<s.
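
A minimal sketch of the uint8_t problem and of the explicit to_str() 
style described above.  The name to_str comes from this post; the 
signatures, the helper via_stream, and the use of int as a stand-in for 
`thing` are assumptions made only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <sstream>
#include <string>

// What operator<< does with a uint8_t: where uint8_t is a typedef for
// unsigned char (the usual case), the stream inserts a character.
std::string via_stream(std::uint8_t v) {
    std::ostringstream os;
    os << v;
    return os.str();
}

// Hypothetical explicit formatter: widen first so the value prints as a number.
std::string to_str(std::uint8_t v) {
    std::ostringstream os;
    os << static_cast<unsigned>(v);
    return os.str();
}

// Hypothetical formatter for a container typedef; int stands in for `thing`.
typedef std::set<int> things_t;
std::string to_str(const things_t& things) {
    std::ostringstream os;
    os << '{';
    for (things_t::const_iterator it = things.begin(); it != things.end(); ++it) {
        if (it != things.begin()) os << ", ";
        os << *it;
    }
    os << '}';
    return os.str();
}

// Usage; the first assertion assumes uint8_t is unsigned char and ASCII.
void example() {
    assert(via_stream(65) == "A");
    assert(to_str(std::uint8_t(65)) == "65");
}
```

Because to_str() returns a string, a second formatting of the same type 
is just another function with a different name, which operator<< cannot 
offer.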

** lexical_cast<> uses streams, should be reversed.

Currently we implement formatters that output to streams.  We implement 
lexical_cast using stringstreams.  Surely it would be preferable to 
implement formatters as specialisations of lexical_cast to a string (or 
character sequence / output iterator / whatever) and to implement 
formatted output to streams on top of that.  I suppose you could argue 
that the stream model is better for very large amounts of output since 
you don't accumulate it all in a temporary string, but I've never 
encountered a case where that would matter.
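
The reversed layering might look roughly like this: the formatter 
writes characters to an output iterator, and both the string 
conversion and the stream operator are thin layers on top of it.  This 
is only a sketch; put_text, to_str and point are hypothetical names, and 
to_str stands in for what a lexical_cast specialisation would do.

```cpp
#include <cassert>
#include <iterator>
#include <ostream>
#include <sstream>
#include <string>

// Core formatter for int: writes decimal digits to any output iterator,
// without going through a stream.
template <typename OutIt>
OutIt put_text(OutIt out, int value) {
    char digits[24];
    int n = 0;
    // Work in unsigned arithmetic so the most negative int is handled too.
    unsigned int u = value < 0 ? 0u - static_cast<unsigned int>(value)
                               : static_cast<unsigned int>(value);
    do { digits[n++] = static_cast<char>('0' + u % 10); u /= 10; } while (u != 0);
    if (value < 0) *out++ = '-';
    while (n > 0) *out++ = digits[--n];
    return out;
}

// Hypothetical user-defined type and its formatter.
struct point { int x; int y; };

template <typename OutIt>
OutIt put_text(OutIt out, const point& p) {
    *out++ = '(';
    out = put_text(out, p.x);
    *out++ = ',';
    out = put_text(out, p.y);
    *out++ = ')';
    return out;
}

// Conversion to string, built on the formatter.
template <typename T>
std::string to_str(const T& value) {
    std::string s;
    put_text(std::back_inserter(s), value);
    return s;
}

// Stream output, also built on the formatter rather than the other way round.
std::ostream& operator<<(std::ostream& os, const point& p) {
    put_text(std::ostreambuf_iterator<char>(os), p);
    return os;
}

void example() {
    point p = {3, -14};
    assert(to_str(p) == "(3,-14)");
}
```

Writing through an output iterator also avoids the temporary string 
when the destination is a stream, which addresses the large-output 
objection.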

** Formatting state has the wrong scope

Spot the mistake here:

cout << "address of buffer = 0x" << hex << p;

Yes, I forgot to << dec afterwards, so in some totally different part 
of the program when I write

cout << "DEBUG: x=" << x;

and it prints '10', I think "10?  should be 16!" and spend ages debugging.

But reverting to dec might not be the right thing to do depending on 
what the caller was in the middle of doing, so I really want to 
save/restore the formatting state.  And if I throw or do a premature 
return I still want the formatting state to be reverted:

void f() {
   scoped_fmt_state guard(cout,hex);  // named object: an unnamed temporary would be destroyed immediately
   cout << ....;
   if (...) throw;
   cout << .....;
}
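
One way such a guard could be written, as a sketch only: the class 
below saves the stream's format flags, applies a manipulator, and 
restores the flags in its destructor, so an exception or early return 
still reverts them.  The constructor signature and the demo function 
are assumptions.  It saves only the flags, not width, precision or 
fill.  (Boost.IO's state savers, such as boost::io::ios_flags_saver, 
cover the save/restore half of this.)

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <string>

// Hypothetical RAII guard for a stream's format flags.
class scoped_fmt_state {
public:
    scoped_fmt_state(std::ios_base& stream,
                     std::ios_base& (*manip)(std::ios_base&))
        : stream_(stream), saved_(stream.flags()) {
        manip(stream_);  // e.g. std::hex
    }
    ~scoped_fmt_state() { stream_.flags(saved_); }  // runs on throw and return

private:
    scoped_fmt_state(const scoped_fmt_state&);             // non-copyable
    scoped_fmt_state& operator=(const scoped_fmt_state&);  // non-assignable
    std::ios_base& stream_;
    std::ios_base::fmtflags saved_;
};

// Writes in hex inside the guarded scope, optionally leaving it by exception,
// then writes 16 again once the guard has been destroyed.
std::string demo(bool fail) {
    std::ostringstream os;
    try {
        scoped_fmt_state guard(os, std::hex);
        os << 255 << ' ';
        if (fail) throw 1;
        os << 16 << ' ';
    } catch (int) {
    }
    os << 16;  // decimal again
    return os.str();
}

void example() {
    assert(demo(false) == "ff 10 16");
    assert(demo(true) == "ff 16");
}
```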

Hmm, I think that's too much work.  I'd be happy with NO formatting 
state in the stream, and to use explicit formatting when I want it:

    cout << hex(x);
OR cout << format("%08x",x);
OR printf("%08x",x);

(No, I don't really use printf() in C++ code.  But it does have its 
strengths; it's by far the best way to output a uint8_t.  And it _is_ 
type safe if you are using a compiler that treats it as special.)
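
A sketch of the first option, explicit formatting that leaves no state 
behind in the stream.  It is spelled as_hex here rather than hex to 
avoid clashing with std::hex; the name, the hex_value struct and the 
min_digits parameter are assumptions.

```cpp
#include <cassert>
#include <ios>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical wrapper: carries the value and a minimum digit count.
struct hex_value {
    unsigned long value;
    int min_digits;
};

inline hex_value as_hex(unsigned long v, int min_digits = 1) {
    hex_value h = {v, min_digits};
    return h;
}

// Formats the digits by hand, so the stream's base flag is never touched.
std::ostream& operator<<(std::ostream& os, const hex_value& h) {
    static const char digits[] = "0123456789abcdef";
    std::string text;
    unsigned long v = h.value;
    do {
        text.insert(text.begin(), digits[v & 0xF]);
        v >>= 4;
    } while (v != 0);
    while (static_cast<int>(text.size()) < h.min_digits) {
        text.insert(text.begin(), '0');  // zero padding, like "%08x"
    }
    return os << text;
}

void example() {
    std::ostringstream os;
    os << as_hex(255, 8) << ' ' << 16;  // the 16 is still printed in decimal
    assert(os.str() == "000000ff 16");
}
```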

** Too much disconnect between POSIX file descriptors and std::streams

I have quite a lot of code that uses sockets and serial ports, does 
ioctls on file descriptors, and things like that.  So I have a 
FileDescriptor class that wraps a file descriptor with methods that 
implement simple error-trapping wrappers around the POSIX function calls.

Currently, there's a strong separation between what I can do to a 
FileDescriptor (i.e. reads and writes) and what I can do to a stream.  
There is no reason why this has to be the case.  It should be possible 
to add buffering to a FileDescriptor *and only add buffering*, and it 
should be possible to do formatted I/O on a non-buffered FileDescriptor.

In other words:

class ReadWriteThing { .... };
class FileDescriptor : public ReadWriteThing { .... };
class Stream : public ReadWriteThing { .... };

FileDescriptor fd("192.168.1.1:80");  // a socket
int i=1234;
fd << "GET " << i << "\r\n";   // Unbuffered write, text formatting.

Stream s("foo.bin");   // a file, with a buffering layer
for (int i=0; i<1000; ++i) {
   short r = f();
   s.write(r);   // Buffered, non-formatted binary write.
}
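
A rough sketch of how that layering could be expressed.  ReadWriteThing 
is the name used above; MemorySink, Buffered, write_bytes and 
write_binary are hypothetical.  MemorySink stands in for an unbuffered 
FileDescriptor and counts calls where a real one would make a write() 
system call.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Minimal byte-sink interface.
class ReadWriteThing {
public:
    virtual ~ReadWriteThing() {}
    virtual void write_bytes(const char* data, std::size_t n) = 0;
};

// Stand-in for an unbuffered FileDescriptor: each call is one "system call".
class MemorySink : public ReadWriteThing {
public:
    MemorySink() : calls(0) {}
    void write_bytes(const char* data, std::size_t n) {
        bytes.append(data, n);
        ++calls;
    }
    std::string bytes;
    int calls;
};

// Adds buffering and nothing else, on top of any ReadWriteThing.
class Buffered : public ReadWriteThing {
public:
    Buffered(ReadWriteThing& next, std::size_t capacity)
        : next_(next), capacity_(capacity) {}
    ~Buffered() { flush(); }
    void write_bytes(const char* data, std::size_t n) {
        buffer_.append(data, n);
        if (buffer_.size() >= capacity_) flush();
    }
    void flush() {
        if (!buffer_.empty()) {
            next_.write_bytes(buffer_.data(), buffer_.size());
            buffer_.clear();
        }
    }
private:
    ReadWriteThing& next_;
    std::size_t capacity_;
    std::string buffer_;
};

// Text formatting works on any ReadWriteThing, buffered or not.
template <typename T>
ReadWriteThing& operator<<(ReadWriteThing& out, const T& value) {
    std::ostringstream os;
    os << value;
    const std::string text = os.str();
    out.write_bytes(text.data(), text.size());
    return out;
}

// Unformatted binary write of a plain value, in host byte order.
template <typename T>
void write_binary(ReadWriteThing& out, const T& value) {
    out.write_bytes(reinterpret_cast<const char*>(&value), sizeof value);
}

void example() {
    MemorySink fd;
    fd << "GET " << 1234 << "\r\n";  // unbuffered: one write per item
    assert(fd.bytes == "GET 1234\r\n" && fd.calls == 3);
}
```

Formatting and buffering are independent here: wrapping the sink in 
Buffered changes how many writes reach it, not what can be written.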

** Character sets need support

This is a hugely complex area which native English speakers are 
uniquely unqualified to talk about.

I think that a starting point would be for someone to write a Boost 
interface to iconv (I have an example that makes functors for iconv 
conversions), and to write a tagged-string class that knows its 
encoding (either a compile-time type tag or a run-time enumeration tag 
or both).  Ideally we'd spend a couple of years getting used to using 
that, and then consider how it can best integrate with IO.
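
As a small illustration of the compile-time variant of the tagged 
string, here is a sketch in which the encoding is part of the type.  
All names (tagged_string, latin1_tag, utf8_tag, to_utf8) are 
hypothetical, and the hand-written ISO-8859-1 to UTF-8 conversion only 
stands in for what an iconv wrapper would provide.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical compile-time encoding tags.
struct latin1_tag {};
struct utf8_tag {};

// A byte string that carries its encoding in its type, so strings in
// different encodings cannot be assigned or compared by accident.
template <typename Encoding>
class tagged_string {
public:
    explicit tagged_string(const std::string& bytes) : bytes_(bytes) {}
    const std::string& bytes() const { return bytes_; }
private:
    std::string bytes_;
};

// Every ISO-8859-1 byte maps to the Unicode code point of the same value,
// so bytes of 0x80 and above become a two-byte UTF-8 sequence.
inline tagged_string<utf8_tag> to_utf8(const tagged_string<latin1_tag>& in) {
    std::string out;
    const std::string& src = in.bytes();
    for (std::size_t i = 0; i < src.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(src[i]);
        if (c < 0x80) {
            out += static_cast<char>(c);
        } else {
            out += static_cast<char>(0xC0 | (c >> 6));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return tagged_string<utf8_tag>(out);
}

void example() {
    // "caf" followed by e-acute (0xE9 in ISO-8859-1, 0xC3 0xA9 in UTF-8).
    tagged_string<latin1_tag> in(std::string("caf\xE9"));
    assert(to_utf8(in).bytes() == "caf\xC3\xA9");
}
```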

Regards,

Phil.

Phil Endecott