
Phil Endecott wrote:
** Formatting of user-defined types often broken in practice.
The ability to write overloaded functions to format user-defined types for text I/O is attractive in theory, but in practice it always lets me down somewhere. My main complaint is that neither of these works:
typedef std::set<thing> things_t;
std::ostream& operator<<(std::ostream& os, const things_t& things) { .... } // doesn't work because things_t is a typedef
I see no specific reason why that would fail, as long as there isn't an operator << for std::set<thing> somewhere already. It's even legal, I think, because std::set<thing> depends on a type not in namespace std. (You can't overload for std::set<int>, for example, by the rules of the standard.)
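For illustration, something along these lines should compile and do what you want, with thing standing in for your own type; the typedef is irrelevant, since overload resolution only ever sees the underlying type std::set<thing>:

#include <iostream>
#include <set>

namespace myns {
    struct thing {
        int value;
        bool operator<(const thing& rhs) const { return value < rhs.value; }
    };

    typedef std::set<thing> things_t;   // just another name for std::set<thing>

    // Found via argument-dependent lookup because thing lives in myns.
    std::ostream& operator<<(std::ostream& os, const things_t& things)
    {
        os << '{';
        for (things_t::const_iterator it = things.begin(); it != things.end(); ++it)
            os << ' ' << it->value;
        return os << " }";
    }
}

int main()
{
    myns::things_t ts;
    myns::thing t = { 42 };
    ts.insert(t);
    std::cout << ts << '\n';   // prints "{ 42 }"
}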
uint8_t i; cout << i; // doesn't work because uint8_t is actually a char
Yes, that's annoying. In my opinion, it's a defect in the standard that unsigned and signed char are treated as characters instead of small integers. Characters are what plain char is for.
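The usual workaround today is to force an integer promotion before the value reaches operator<<, e.g.:

#include <iostream>
#include <stdint.h>

int main()
{
    uint8_t i = 65;
    std::cout << i << '\n';                        // prints 'A': uint8_t is (usually) unsigned char
    std::cout << static_cast<unsigned>(i) << '\n'; // prints 65
    std::cout << +i << '\n';                       // unary + promotes to int, also prints 65
}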
When I do have a class, I often find that there is more than one way in which I'd like to format it, but there is only one operator<< to overload. And often I want to put the result of the formatting into a string, not a stream.
I have an idea for a formatting system that should address all these issues. Basically, a format string would be able to specify, in an extensible and type-safe way, how to format an object. The format string would be used to look up a formatter in some sort of registry.
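Just to sketch the direction (the names and the string-keyed lookup below are purely illustrative, not the actual design):

#include <iostream>
#include <map>
#include <string>

// A "formatter" for T is just a function that writes a T to a stream.
template <typename T>
struct formatter_registry {
    typedef void (*formatter)(std::ostream&, const T&);
    static std::map<std::string, formatter>& table()
    {
        static std::map<std::string, formatter> t;
        return t;
    }
    static void add(const std::string& name, formatter f) { table()[name] = f; }
};

// Look up a named formatter for the value's type; fall back to operator<<.
template <typename T>
void format_to(std::ostream& os, const std::string& name, const T& value)
{
    typedef std::map<std::string, typename formatter_registry<T>::formatter> map_t;
    const map_t& m = formatter_registry<T>::table();
    typename map_t::const_iterator it = m.find(name);
    if (it != m.end())
        it->second(os, value);
    else
        os << value;
}

struct date { int year, month, day; };

std::ostream& operator<<(std::ostream& os, const date& d)
{ return os << d.day << '/' << d.month << '/' << d.year; }

void iso_date(std::ostream& os, const date& d)
{ os << d.year << '-' << d.month << '-' << d.day; }

int main()
{
    formatter_registry<date>::add("iso", &iso_date);
    date d = { 2007, 12, 2 };
    format_to(std::cout, "iso", d); std::cout << '\n';      // 2007-12-2
    format_to(std::cout, "default", d); std::cout << '\n';  // 2/12/2007 via operator<<
}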
** lexical_cast<> uses streams; it should be the reverse.
Currently we implement formatters that output to streams. We implement lexical_cast using stringstreams. Surely it would be preferable to implement formatters as specialisations of lexical_cast to a string (or character sequence / output iterator / whatever) and to implement formatted output to streams on top of that. I suppose you could argue that the stream model is better for very large amounts of output since you don't accumulate it all in a temporary string, but I've never encountered a case where that would matter.
I have written in another post why I think the stream interface is better. Efficiency is one part of the issue. Another is that the code is simpler that way for the library implementer, and the difference is transparent for the library user. Also, it means that it's easier to switch the string type used (something that is not uncommon).
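For illustration, string output falls out of a stream-based formatter in a few lines; to_string below is just an ad-hoc helper doing essentially what lexical_cast does internally with a stringstream:

#include <iostream>
#include <sstream>
#include <string>

struct point { int x, y; };

// The one formatter the user writes: stream-based.
std::ostream& operator<<(std::ostream& os, const point& p)
{ return os << '(' << p.x << ',' << p.y << ')'; }

// String conversion layered on top of the stream formatter.
template <typename T>
std::string to_string(const T& value)
{
    std::ostringstream os;
    os << value;
    return os.str();
}

int main()
{
    point p = { 3, 4 };
    std::cout << p << '\n';         // stream output
    std::string s = to_string(p);   // string output needs no second formatter
    std::cout << s << '\n';
}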
** Formatting state has the wrong scope
void f() { scoped_fmt_state(cout,hex); cout << ....; if (...) throw; cout << .....; }
Hmm, I think that's too much work. I'd be happy with NO formatting state in the stream, and to use explicit formatting when I want it:
cout << hex(x); OR cout << format("%08x",x); OR fprintf(stdout,"%08x",x);
I absolutely agree. Stateful formatting is generally not good. The only state that should be involved in formatting is the locale in use.
(And it _is_ type-safe if you are using a compiler that treats it as special.)
... _and_ if you use a string literal as the formatting string. Far from guaranteed, especially when localizing.
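For what it's worth, a stateless hex(x) in that spirit can be written today as a small wrapper whose inserter restores the stream flags afterwards; hex_fmt is just an illustrative name chosen to avoid clashing with std::hex:

#include <iostream>
#include <ios>

// Wrapper carrying a value that should be printed in hex.
template <typename T>
struct hex_fmt_t { T value; };

template <typename T>
hex_fmt_t<T> hex_fmt(T value) { hex_fmt_t<T> h = { value }; return h; }

// The inserter changes the base for this one value only and restores the
// old flags afterwards, so no formatting state lingers in the stream.
template <typename T>
std::ostream& operator<<(std::ostream& os, const hex_fmt_t<T>& h)
{
    std::ios::fmtflags old = os.flags();
    os << std::hex << std::showbase << h.value;
    os.flags(old);
    return os;
}

int main()
{
    int x = 255;
    std::cout << hex_fmt(x) << ' ' << x << '\n';   // "0xff 255": decimal is back automatically
}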
** Too much disconnect between POSIX file descriptors and std::streams
I cannot bring myself to see this specific issue as a defect. Addressing it would mean platform coupling.
I have quite a lot of code that uses sockets and serial ports, does ioctls on file descriptors, and things like that. So I have a FileDescriptor class that wraps a file descriptor with methods that implement simple error-trapping wrappers around the POSIX function calls.
Is there any specific reason you cannot implement a streambuffer that acts on a file descriptor? A streambuffer, despite its name, doesn't have to buffer data.
Currently, there's a strong separation between what I can do to a FileDescriptor (i.e. reads and writes) and what I can do to a stream. There is no reason why this has to be the case. It should be possible to add buffering to a FileDescriptor *and only add buffering*, and it should be possible to do formatted I/O on a non-buffered FileDescriptor.
Yes. It is possible now. It should be easier with my system.
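For reference, a minimal unbuffered output streambuf over a POSIX descriptor is only a few lines; this is a bare sketch with next to no error handling, not part of any proposed library:

#include <ostream>
#include <streambuf>
#include <unistd.h>   // write()

// Every character handed to this streambuf is passed straight to write();
// no buffer area is set up, so nothing is actually buffered.
class fd_streambuf : public std::streambuf {
public:
    explicit fd_streambuf(int fd) : fd_(fd) {}
protected:
    virtual int_type overflow(int_type c)
    {
        if (c == traits_type::eof())
            return traits_type::not_eof(c);
        char ch = traits_type::to_char_type(c);
        return ::write(fd_, &ch, 1) == 1 ? c : traits_type::eof();
    }
    virtual std::streamsize xsputn(const char* s, std::streamsize n)
    {
        return ::write(fd_, s, n);
    }
private:
    int fd_;
};

int main()
{
    fd_streambuf buf(1);        // 1 == STDOUT_FILENO
    std::ostream out(&buf);
    out << "hello from a file descriptor\n";
}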
** Character sets need support
This is a hugely complex area which native English speakers are uniquely unqualified to talk about.
Luckily, I'm not a native English speaker. I have some experience with the issues involved, although my experience is limited to German umlauts. I have experienced the pains of unexpected encoding use in web applications. This is why I really, really think that all C++ types involved in text handling need to be tagged with the encoding used.
I think that a starting point would be for someone to write a Boost interface to iconv (I have an example that makes functors for iconv conversions), and to write a tagged-string class that knows its encoding (either a compile-time type tag or a run-time enumeration tag or both). Ideally we'd spend a couple of years getting used to using that, and then consider how it can best integrate with IO.
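To illustrate the compile-time variant of the tag (only a sketch with invented names; real conversions between encodings would go through iconv or similar):

#include <iostream>
#include <string>

// Empty structs serve as compile-time encoding tags.
struct latin1 {};
struct utf8 {};

// A string that knows its encoding at compile time.
template <typename Encoding>
class tagged_string {
public:
    explicit tagged_string(const std::string& bytes) : bytes_(bytes) {}
    const std::string& bytes() const { return bytes_; }
private:
    std::string bytes_;
};

// Mixing encodings without an explicit conversion refuses to compile.
template <typename Encoding>
tagged_string<Encoding> operator+(const tagged_string<Encoding>& a,
                                  const tagged_string<Encoding>& b)
{
    return tagged_string<Encoding>(a.bytes() + b.bytes());
}

int main()
{
    tagged_string<utf8>   a("Gr\xC3\xBC\xC3\x9F");   // "Grüß" in UTF-8
    tagged_string<latin1> b("Gr\xFC\xDF");           // the same text in Latin-1
    tagged_string<utf8>   c = a + a;                  // fine
    // tagged_string<utf8> d = a + b;                 // error: encodings differ
    std::cout << c.bytes().size() << " bytes\n";
}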
I don't want to wait that long ;) I have in fact considered this issue and have drawn the outline of such a character handling and conversion library. In fact, a subset of it is absolutely needed for the text layer of my I/O plans.

Sebastian Redl