[iostream] Device::read return inconsistency

The example in the documentation for the Source concept in Boost.Iostreams says:

    std::streamsize read(char* s, std::streamsize n)
    {
        // Read up to n characters from the input
        // sequence into the buffer s, returning
        // the number of characters read, or -1
        // to indicate end-of-sequence.
    }

(The same text appears in the 'valid expressions / semantics' table, which is probably more definitive.)

What does a return of less than n actually mean? Nowhere on the documentation pages for the Source concept or the read function template does it say what a return value in the range [0,n) means. From the comment in the example, it would appear that read is permitted to return less than n if some smaller number of characters is currently available, /without/ implying that a subsequent call will hit eof. This seems to be confirmed by the use of non_blocking_adapter in places in the library where you want code to block until the n bytes are available, and to guarantee eof (or, I guess, an error) for a subsequent call.

However, the comment in the BidirectionalDevice example seems to disagree:

    std::streamsize read(char* s, std::streamsize n)
    {
        // Reads up to n characters from the input
        // sequence into the buffer s, returning the number
        // of characters read. Returning a value less than n
        // indicates end-of-sequence.
    }

(Again, the same text appears in the 'valid expressions / semantics' table.)

So which is it? I don't believe it is the library's intention for a [0,n) return to have different meanings in BidirectionalDevice and Source. I haven't located any code in the library itself that assumes a non-zero return of less than n signifies eof (though I haven't searched all that hard), but third-party code and/or any other boost libraries that use Boost.Iostreams may well do so. However, there do seem to be places where a return of zero is taken to mean eof. For example, indirect_streambuf::underflow seems to treat zero as eof.

At the very least the documentation needs fixing to be consistent. Ideally it should also document what a return of 0 means (presumably undefined behaviour unless n == 0) and whether a return in the range (0,n) implies that the next call to read will return eof (probably not).

(Also, a minor typo: in the example on the Sink concept documentation, write should return std::streamsize, not void.)

Richard Smith
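
To make the Source-concept reading concrete, here is a minimal sketch of a caller that treats a short read as "more may follow" and only -1 as end-of-sequence. It does not use Boost; chunked_source and read_fully are hypothetical names invented for this illustration, and the handling of a 0 return (give up rather than spin) is an assumption, since that is exactly the case the documentation leaves open.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <ios>      // std::streamsize
#include <string>

// Hypothetical source (not part of Boost.Iostreams): hands out at most
// `chunk` characters per call and returns -1 only once the data is exhausted.
struct chunked_source {
    std::string data;
    std::size_t pos;
    std::streamsize chunk;

    chunked_source(const std::string& d, std::streamsize c)
        : data(d), pos(0), chunk(c) {}

    std::streamsize read(char* s, std::streamsize n) {
        if (pos == data.size())
            return -1;  // end-of-sequence
        std::streamsize avail = static_cast<std::streamsize>(data.size() - pos);
        std::streamsize amt = std::min(n, std::min(chunk, avail));
        std::memcpy(s, data.data() + pos, static_cast<std::size_t>(amt));
        pos += static_cast<std::size_t>(amt);
        return amt;  // may be less than n without implying eof
    }
};

// Keeps reading until n characters have arrived or the source reports -1.
// Assumption: a return of 0 means "nothing available right now", and the
// loop stops instead of spinning.
template <typename Source>
std::streamsize read_fully(Source& src, char* s, std::streamsize n) {
    std::streamsize total = 0;
    while (total < n) {
        std::streamsize amt = src.read(s + total, n - total);
        if (amt == -1)
            return total != 0 ? total : -1;  // eof before anything was read
        if (amt == 0)
            break;                           // ambiguous case, see above
        total += amt;
    }
    return total;
}
```

Under the BidirectionalDevice wording, by contrast, the first short return from chunked_source::read would already have to be taken as end-of-sequence, so a caller written against that wording would stop after the first chunk.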

Richard Smith wrote:
The example in the documentation for the Source concept in Boost.Iostreams says:
    std::streamsize read(char* s, std::streamsize n)
    {
        // Read up to n characters from the input
        // sequence into the buffer s, returning
        // the number of characters read, or -1
        // to indicate end-of-sequence.
    }
(The same text appears in 'valid expressions / semantics' table, which is probably more definitive.)
What does a return of less than n actually mean?
I think the tutorial example for the multi-char shell comments filter gives an example of one such case (c == WOULD_BLOCK). See:
http://www.boost.org/doc/libs/1_39_0/libs/iostreams/doc/tutorial/multichar_f...

The meaning of WOULD_BLOCK is mentioned in the section where they create this same filter as a regular input filter:
http://www.boost.org/doc/libs/1_39_0/libs/iostreams/doc/tutorial/shell_comme...

eg wrote:
Richard Smith wrote:
The example in the documentation for the Source concept in Boost.Iostreams says:
    std::streamsize read(char* s, std::streamsize n)
    {
        // Read up to n characters from the input
        // sequence into the buffer s, returning
        // the number of characters read, or -1
        // to indicate end-of-sequence.
    }
(The same text appears in 'valid expressions / semantics' table, which is probably more definitive.)
What does a return of less than n actually mean?
I think the tutorial example for the multi-char shell comments filter gives an example of one such case (c == WOULD_BLOCK).
OK. So the implementation of iostreams::get(Source) assumes that a return of 0 from Source::read() means that input would block. However, indirect_streambuf::underflow calls Source::read() and assumes that a return of 0 means EOF. I've just done a brief test to verify this. If 0 is treated as WOULD_BLOCK, then the following code should spin; conversely, if 0 is treated as EOF, then it should exit immediately.

    #include <cstdio>    // EOF
    #include <iostream>
    #include <boost/iostreams/categories.hpp>
    #include <boost/iostreams/stream.hpp>

    namespace io = boost::iostreams;

    // A source that never produces data and never reports -1.
    struct my_source {
        typedef char char_type;
        typedef io::source_tag category;
        std::streamsize read(char*, std::streamsize) { return 0; }
    };

    int main()
    {
        io::stream<my_source> in(( my_source() ));
        int c;
        while ( ( c = in.get() ) != EOF )
            std::cout.put( char(c) );
    }

Running it with boost 1.39, it exits immediately, and adding some diagnostics makes it clear that read() is called once.

The iostreams::get(Source) code eventually calls the following code from lines 174-180 of iostreams/read.hpp:

    char_type c;
    std::streamsize amt;
    return (amt = t.read(&c, 1)) == 1 ?
        traits_type::to_int_type(c) :
        amt == -1 ?
            traits_type::eof() :
            traits_type::would_block();

Clearly read() returning 0 would result in WOULD_BLOCK, but so too would any negative return other than -1. On most systems WOULD_BLOCK == -2 (though this is dependent on your C library): is the intention that read() should return WOULD_BLOCK if it blocks? If that's the case, a lot of other code breaks. To take just one example, filter/aggregate.hpp, lines 120-2, implicitly assumes that -1 is the only negative value that can be returned. And this is far from alone.

The only other possibility is that the Source concept leaves a return of 0 (and most probably of anything less than n) undefined, and that various parts of the library rely on different interpretations to get their documented behaviour. If that was the intention, I sincerely hope the library would have failed review.

I have several patches that fix this in different ways, depending on whichever behaviour is desired, but in the absence of documentation specifying the intended behaviour of a return in the [0, n) range, I'm not sure which I should be testing and submitting.

Richard
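
The mapping performed by the ternary quoted from read.hpp can be checked in isolation. The following is a standalone sketch, not the library's code: classify_get, eof_value and would_block_value are hypothetical names, and the value EOF - 1 for the would-block marker is an assumption chosen to match the "typically -2" remark above.

```cpp
#include <cstdio>   // EOF
#include <ios>      // std::streamsize

// Assumed values for illustration only.
const int eof_value = EOF;
const int would_block_value = EOF - 1;

// Mirrors the shape of the quoted ternary: amt is what a one-character
// Source::read(&c, 1) call returned, c is the character it stored.
inline int classify_get(std::streamsize amt, char c) {
    return amt == 1  ? static_cast<int>(static_cast<unsigned char>(c))
         : amt == -1 ? eof_value
                     : would_block_value;  // 0 and every other negative value
}
```

With this shape, classify_get(0, c) and classify_get(-7, c) both yield the would-block marker, which is the behaviour described above: only a return of exactly -1 is ever reported as end-of-sequence on this path.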

Richard Smith wrote:
I have several patches that fix this in different ways, depending on whichever behaviour is desired, but in the absence of documentation specifying the intended behaviour of a return in the [0, n) range, I'm not sure which I should be testing and submitting.
Good sleuthing. I think we need Jonathan Turkanis, the author, or some other iostreams expert to reply here.

participants (2)
- eg
- Richard Smith