
Sebastian Redl wrote:
A few weeks ago, a discussion that followed the demonstration of the binary_iostream library made me think about the standard C++ I/O and what I would expect from an I/O model.
Now I have a preliminary design document ready and would like to have some feedback from the Boost community on it.
Hi Sebastian,

This is an interesting document and you have obviously put a lot of work into it. My few thoughts follow. I can't claim to have great insight into this problem, but there have been more than a couple of times when the limitations of what is currently available have struck me.

** Formatting of user-defined types is often broken in practice

The ability to write overloaded functions to format user-defined types for text I/O is attractive in theory, but in practice it always lets me down somewhere. My main complaint is that neither of these works:

    typedef std::set<thing> things_t;
    operator<<(things_t things) { .... }  // doesn't work because things_t is a typedef

    uint8_t i;
    cout << i;  // doesn't work because uint8_t is actually a char

When I do have a class, I often find that there is more than one way in which I'd like to format it, but there is only one operator<< to overload. And often I want to put the result of the formatting into a string, not a stream. So for all of these reasons I have more explicit to_str() functions in my code than operator<<s.

** lexical_cast<> uses streams; it should be the reverse

Currently we implement formatters that output to streams, and we implement lexical_cast using stringstreams. Surely it would be preferable to implement formatters as specialisations of lexical_cast to a string (or character sequence / output iterator / whatever) and to implement formatted output to streams on top of that. I suppose you could argue that the stream model is better for very large amounts of output, since you don't accumulate it all in a temporary string, but I've never encountered a case where that would matter.

** Formatting state has the wrong scope

Spot the mistake here:

    cout << "address of buffer = 0x" << hex << p;

Yes, I forgot to <<dec<< afterwards, so in some totally different part of the program when I write

    cout << "DEBUG: x=" << x

and it prints '10', I think "10? should be 16!" and spend ages debugging.
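To make the stickiness concrete, here is a minimal sketch. The function name sticky_hex_demo is made up for illustration, and it writes to a std::ostringstream rather than cout only so that the result can be inspected:

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: two unrelated "writers" share one stream.
std::string sticky_hex_demo() {
    std::ostringstream out;

    // First writer: switches the stream to hex and never switches back.
    out << "addr=0x" << std::hex << 255;

    // Second writer, possibly in a different part of the program:
    // it asks for nothing special and expects decimal output.
    out << " x=" << 16;

    return out.str();
}
```

The returned string is "addr=0xff x=10": the second writer inherited hex from the first, so 16 comes out as 10.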
But reverting to dec might not be the right thing to do, depending on what the caller was in the middle of doing, so I really want to save/restore the formatting state. And if I throw or do a premature return I still want the formatting state to be reverted:

    void f() {
        scoped_fmt_state s(cout,hex);  // named object, so it lives until the end of f()
        cout << ....;
        if (...) throw;
        cout << .....;
    }

Hmm, I think that's too much work. I'd be happy with NO formatting state in the stream, and to use explicit formatting when I want it:

    cout << hex(x);
OR
    cout << format("%08x",x);
OR
    printf("%08x",x);

(No, I don't really use printf() in C++ code. But it does have its strengths; it's by far the best way to output a uint8_t. And it _is_ type safe if you are using a compiler that treats it as special.)

** Too much disconnect between POSIX file descriptors and std::streams

I have quite a lot of code that uses sockets and serial ports, does ioctls on file descriptors, and things like that. So I have a FileDescriptor class that wraps a file descriptor with methods that implement simple error-trapping wrappers around the POSIX function calls. Currently, there's a strong separation between what I can do to a FileDescriptor (i.e. reads and writes) and what I can do to a stream. There is no reason why this has to be the case. It should be possible to add buffering to a FileDescriptor *and only add buffering*, and it should be possible to do formatted I/O on a non-buffered FileDescriptor. In other words:

    class ReadWriteThing { .... };
    class FileDescriptor: public ReadWriteThing { .... };
    class Stream: public ReadWriteThing { .... };

    FileDescriptor fd("192.168.1.1:80");  // a socket
    int i=1234;
    fd << "GET " << i << "\r\n";  // Unbuffered write, text formatting.

    Stream s("foo.bin");  // a file, with a buffering layer
    for (int i=0; i<1000; ++i) {
        short r = f();
        s.write(r);  // Buffered, non-formatted binary write.
    }

** Character sets need support

This is a hugely complex area which native English speakers are uniquely unqualified to talk about.
I think that a starting point would be for someone to write a Boost interface to iconv (I have an example that makes functors for iconv conversions), and to write a tagged-string class that knows its encoding (either a compile-time type tag or a run-time enumeration tag or both). Ideally we'd spend a couple of years getting used to using that, and then consider how it can best integrate with I/O.

Regards,

Phil.