(Another) socket streams library

Hello, all. I see everybody has been talking about a networking library for a long while. Checking the wiki, I see that my own project has escaped attention. So, here it goes for the examination of the experts: http://socketstream.sourceforge.net/ -- Pedro Lamarão

pedro.lamarao@mndfck.org wrote:
Hello, all.
I see everybody has been talking about a networking library for a long while. Checking the wiki, I see that my own project has escaped attention. So, here it goes for the examination of the experts:
Socket streams have not been forgotten - see package iostream in http://www.highscore.de/boost/net/packages.png. However there hasn't been much talk about it lately as we were trying to get the big picture first (at least I :-). Right now I am also more interested in level 0 of the network library but had a glance at the documentation of your library and have a few questions:
- Why is your socket streams library better than other socket streams libraries? (I ask to understand better the design and goals of your library.)
- Do your socket streams support blocking and non-blocking I/O?
- Is it possible to use your socket streams without them throwing exceptions? (I ask as I see a class socketstream::exception.)
- As far as I can see your socket streams are not limited to stream sockets - you can use datagram or even raw sockets? I haven't thought about this yet but I am not sure if the interface of I/O streams makes sense for datagram and raw sockets?
Boris

Boris wrote: [SNIP]
- Why is your socket streams library better than other socket streams libraries? (I ask to understand better the design and goals of your library.)
Well, I haven't thought about that, really -- this is much more of a research project to me. I posted the link because the networking discussion on this list seemed endless and my project had escaped attention. What I am trying to achieve with this project is "easy semantics"; there are a number of normal socket operations that, by the nature of the C interface, end up being just boring -- like the constant (sockaddr*) casting. A "socket" class seems natural, since the interface is typical: a resource descriptor and operations on that resource. And the "stream" classes provide a means for "protocol message" classes to offer easy semantics for the natural "serialization" and "deserialization" operations to and from a network "source". I don't know why, or if, my project is better than anyone else's, really, but I see everyone trying to solve the big problem, not the small ones.
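For illustration only, a minimal sketch of the kind of "easy semantics" described here -- a thin wrapper that hides the descriptor and the (sockaddr*) cast. The class and member names are invented for this example and are not the actual socketstream interface:

    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <unistd.h>
    #include <stdexcept>

    // Hypothetical wrapper: resource descriptor plus operations on that resource.
    class tcp_socket {
    public:
        tcp_socket() : fd_(::socket(AF_INET, SOCK_STREAM, 0)) {
            if (fd_ == -1) throw std::runtime_error("socket");
        }
        ~tcp_socket() { if (fd_ != -1) ::close(fd_); }

        // The cast to sockaddr* lives here, once, instead of at every call site.
        void connect(const sockaddr_in& addr) {
            if (::connect(fd_, reinterpret_cast<const sockaddr*>(&addr), sizeof addr) == -1)
                throw std::runtime_error("connect");
        }

        int descriptor() const { return fd_; }

    private:
        int fd_;
        tcp_socket(const tcp_socket&);            // non-copyable
        tcp_socket& operator=(const tcp_socket&);
    };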
- Do your socket streams support blocking and non-blocking I/O?
Not yet, I believe; although I haven't experimented with that much. I'm still undecided if it's best to choose some int_type value to become "would block" like there's eof() or if it would be best to have a would_block() method in the stream classes. That's really one of the next things in my list, now that "resolver result container" has some form.
- Is it possible to use your socket streams without them throwing exceptions? (I ask as I see a class socketstream::exception.)
Yes. Exceptions are thrown by the "basic socket" class, instead of returning (-1). socket_exception tries to ease the errno checking.
- As far as I can see your socket streams are not limited to stream sockets - you can use datagram or even raw sockets? I haven't thought about this yet but I am not sure if the interface of I/O streams makes sense for datagram and raw sockets?
Not yet, I think. I'm still thinking about what a "datagram stream" would mean... My best thought has been something like a "datagram" class offering operator>> and operator<< together with sendmsg and recvmsg... But I don't really know. The code on Sourceforge is usually not the latest; I prefer to develop here: https://mndfck.org/svn/socketstream/ There's some new stuff there I really like, especially the "resolver result container" thing. Please take a look and tell me what you think. There are also new examples for echo, chargen, discard, time, quote, and daytime "protocol servers". I'll write an example IRC server soon, but that'll probably take a while -- I still have to graduate. :-D -- Pedro Lamarão

https://mndfck.org/svn/socketstream/trunk/example/time/session.h Buffer overflow here: char s[3]; this->time(s); where ::time indexes s[0..3]. There's even a comment: // assume we really got a string of the appropriate size Overall this seems like a nice implementation of blocking, iostream-able socket classes and related plumbing. I know different folks are looking for different things in a network library. The example programs are all well written (modulo that bug above) and concise, which is a tribute to your implementation. At the same time, it is missing asynchronous and/or non-blocking operations and any means for doing single-threaded I/O multiplexing (e.g. select/poll/etc). If non-blocking I/O were a possibility, I don't think throwing exceptions on EWOULDBLOCK would be performant. -- Caleb Epstein caleb dot epstein at gmail dot com

On Thu, 21 Apr 2005 20:08:20 -0400, Caleb Epstein wrote
https://mndfck.org/svn/socketstream/trunk/example/time/session.h
Buffer overflow here:
char s[3]; this->time(s);
where ::time indexes s[0..3]. There's even a comment:
// assume we really got a string of the appropriate size
This kind of code needs to be banished from all socket examples and libraries. We need a buffer type whose size the "network infrastructure" can know and possibly even resize if needed. These kinds of assumed/fixed-size buffers are bad design -- simply unacceptable in my mind for a modern C++ library. I notice we don't have a buffer concept in any of our net/socket writeups on the wiki. I think that's a big omission. I also wonder if the abstraction doesn't already exist --> std::basic_streambuf. Let the socket class write into the streambuf and then you can trivially wrap a stream around it to do sophisticated I/O if you wish -- or simply pull out the raw chars.... Jeff
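To make the streambuf idea concrete, a rough, hypothetical sketch of what "let the socket class write into the streambuf" could look like; the socketbuf class below is invented for illustration and kept unbuffered for brevity:

    #include <streambuf>
    #include <istream>
    #include <sys/socket.h>

    class socketbuf : public std::streambuf {
    public:
        explicit socketbuf(int fd) : fd_(fd), ch_(0) {}
    protected:
        // one byte at a time, purely to keep the sketch short
        virtual int_type overflow(int_type c) {
            if (traits_type::eq_int_type(c, traits_type::eof()))
                return traits_type::not_eof(c);
            char ch = traits_type::to_char_type(c);
            return ::send(fd_, &ch, 1, 0) == 1 ? c : traits_type::eof();
        }
        virtual int_type underflow() {
            if (::recv(fd_, &ch_, 1, 0) != 1) return traits_type::eof();
            setg(&ch_, &ch_, &ch_ + 1);
            return traits_type::to_int_type(ch_);
        }
    private:
        int  fd_;
        char ch_;
    };

    // The socket class deals only in streambufs; a user who wants formatted
    // I/O wraps a stream around it:
    //   socketbuf buf(fd);
    //   std::iostream s(&buf);
    //   s << "hello" << std::flush;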

Jeff Garland wrote:
This kind of code needs to be banished from all socket examples and libraries. We need a buffer type whose size the "network infrastructure" can know and possibly even resize if needed. These kinds of assumed/fixed-size buffers are bad design -- simply unacceptable in my mind for a modern C++ library.
I notice we don't have a buffer concept in any of our net/socket writeups on the wiki. I think that's a big omission. I also wonder if the abstraction doesn't already exist --> std::basic_streambuf. Let the socket class write into the streambuf and then you can trivially wrap a stream around it to do sophisticated i/o if you wish -- or simply pull out the raw chars....
So basically you propose something like using basic_streambuf<unsigned char> or basic_streambuf<char> in the interfaces instead of void*, size? The reason I haven't got any buffer is that I don't want to impose a buffering strategy or interface onto the user. Isn't the streambuf interface more related to I/O with locales and char_traits? Wouldn't a simpler concept do -- just a buffer with checked read/write and size/resize? /Michel

On Fri, 22 Apr 2005 08:50:03 +0200, Michel André wrote
So basically you propose something like using basic_streambuf<unsigned char> or basic_streambuf<char> in the interfaces instead of void*, size?
Yes.
The reason I haven't got any buffer is that I don't want to impose a buffering strategy or interface onto the user.
Socket data needs to be buffered somehow, and there might be good reasons why the application programmer would want some control over that buffer. But using unsized/unsafe void*, char* is a choice to be avoided. As for imposing, the interface to control the buffer options should be 'optional', with suitable defaults provided by the library.
Isn't the streambuf interface more related to I/O with locales and char_traits?
The interface is almost exclusively about seeking, getting, and putting into a buffer. It does have a locale as well...
Wouldn't a simpler concept do -- just a buffer with checked read/write and size/resize?
Sure, you could probably create a simpler buffer concept, but basic_streambuf is already documented and comparatively well understood. It has the significant advantage that an iostream can be wrapped around it for formatted reading and writing. It's a way of separating the I/O issues so that the socket classes only use buffers and then users can decide to use streams on top -- or not. A combination approach might be to write a minimal buffer concept that could then be implemented as a derivative of basic_streambuf, but could have non-streambuf versions for those that want something else. Anyway, it's just an idea. My main point is that the unsafe, unknown-size char buffer decision is just not acceptable. To me the issue is big enough to vote against a library that doesn't address it... Jeff ps: sorry I don't have more time to follow these discussions. I haven't really read all the posts and proposals that have been discussed recently :-(
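One possible shape for the minimal buffer concept mentioned above, purely as an illustration and not a proposed interface -- a streambuf-backed class could model the same concept underneath:

    #include <cstddef>
    #include <vector>

    // Hypothetical minimal buffer: checked access plus size/resize, nothing more.
    class net_buffer {
    public:
        explicit net_buffer(std::size_t n) : data_(n) {}

        std::size_t size() const          { return data_.size(); }
        void        resize(std::size_t n) { data_.resize(n); }

        unsigned char*       data()       { return data_.empty() ? 0 : &data_[0]; }
        const unsigned char* data() const { return data_.empty() ? 0 : &data_[0]; }

    private:
        std::vector<unsigned char> data_;
    };

    // A socket receive could then be declared as
    //   std::size_t receive(net_buffer& b);
    // so the "network infrastructure" always knows the buffer's real size.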

Jeff Garland wrote:
[...] ps: sorry I don't have more time to follow these discussions. I haven't really read all the posts and proposals that have been discussed recently :-(
Yeah, it's quite difficult to follow everything. I try not to get lost and to follow the big picture, but I find myself repeating things all the time. :) Boris

Jeff Garland wrote:
Socket data needs to be buffered somehow, and there might be good reasons why the application programmer would want some control over that buffer. But using unsized/unsafe void*, char* is a choice to be avoided. As for imposing, the interface to control the buffer options should be 'optional', with suitable defaults provided by the library.
Even at layer 0, a facade over the socket API? I don't really see the absolute need for a buffer concept in a socket library; at some level we just need to shuffle the bytes back and forth and want to avoid copying the data too many times along the way.
The interface is almost exclusively about seeking, getting, and putting into a buffer. It does have a locale as well...
Wouldn't a simpler concept do -- just a buffer with checked read/write and size/resize?
Sure, you could probably create a simpler buffer concept, but basic_streambuf is already documented and comparatively well understood.
Comparatively it was ;) -- it isn't the most beautiful or simple interface in the standard library, in my view at least ;), not to mention the naming.
It has the significant advantage that an iostream can be wrapped around it for formatted reading and writing. It's a way of separating the I/O issues so that the socket classes only use buffers and then users can decide to use streams on top -- or not. A combination approach might be to write a minimal buffer concept that could be then implemented as a derivative of basic_streambuf, but could have non-streambuf versions for those that want something else.
Do you have any specific ideas about what the interface to the socket library would look like and how the concept would be used?
Anyway, it's just an idea. My main point is that the unsafe, unknown-size char buffer decision is just not acceptable. To me the issue is big enough to vote against a library that doesn't address it...
What about basic_ostream::write and basic_istream::read? /Michel

Caleb Epstein wrote:
[...] concise, which is a tribute to your implementation. At the same time, it is missing asynchronous and/or non-blocking operations and any means for doing single-threaded I/O multiplexing (e.g. select/poll/etc).
But these don't belong in a streams library anyway - at least I wouldn't know how to support these I/O models without changing the stream interface.
If non-blocking I/O were a possibility, I don't think throwing exceptions on EWOULDBLOCK would be performant.
It should be possible to use socket streams with and without exceptions - then we come close to std::iostreams (and I think this is the goal of a socket streams library). As there is no flag in std::iostreams to indicate EWOULDBLOCK, you could use an existing one like failbit or introduce a new one, e.g. wouldblockbit. Boris

On 4/22/05, Boris <boris@gtemail.net> wrote:
Caleb Epstein wrote:
[...] concise, which is a tribute to your implementation. At the same time, it is missing asynchronous and/or non-blocking operations and any means for doing single-threaded I/O multiplexing (e.g. select/poll/etc).
But these don't belong in a streams library anyway - at least I wouldn't know how to support these I/O models without changing the stream interface.
No, not in a streams library, but they DO belong in a socket library. This implementation has both, but no non-blocking support that I can see at either level. The error code EWOULDBLOCK is handled (and causes an exception), but there seems to be no way to put a socket into non-blocking mode to cause it to be generated in the first place. -- Caleb Epstein caleb dot epstein at gmail dot com

Caleb Epstein wrote:
On 4/22/05, Boris <boris@gtemail.net> wrote:
Caleb Epstein wrote:
[...] concise, which is a tribute to your implementation. At the same time, it is missing asynchronous and/or non-blocking operations and any means for doing single-threaded I/O multiplexing (e.g. select/poll/etc).
But these don't belong in a streams library anyway - at least I wouldn't know how to support these I/O models without changing the stream interface.
No, not in a streams library, but they DO belong in a socket library. This implementation has both, but no non-blocking support that I can see at either level. The error code EWOULDBLOCK is handled (and causes an exception), but there seems to be no way to put a socket into non-blocking mode to cause it to be generated in the first place.
I agree with you, Caleb. Please have a look at the package structure where I try to sort things out: http://www.highscore.de/boost/net/packages.png Maybe the packages can be rearranged but they should give a complete picture and include the requirements you were talking about. Boris

Caleb Epstein wrote:
https://mndfck.org/svn/socketstream/trunk/example/time/session.h
Buffer overflow here:
char s[3]; this->time(s);
where ::time indexes s[0..3]. There's even a comment:
// assume we really got a string of the appropriate size
Yeah, that should be a 4. Thanks.
Overall this seems like a nice implementation of blocking, iostream-able socket classes and related plumbing. I know different folks are looking for different things in a network library. The example programs are all well written (modulo that bug above) and concise, which is a tribute to your implementation. At the same time, it is missing asynchronous and/or non-blocking operations and any means for doing single-threaded I/O multiplexing (e.g. select/poll/etc).
If non-blocking I/O were a possibility, I don't think throwing exceptions on EWOULDBLOCK would be performant.
I don't think it is necessary to throw an exception from the stream to signal it, for the same reason the stream can signal other kinds of errors without throwing exceptions. The standard basic_ios base class is even nice enough to offer the exceptions(iostate) "property" for you to choose whether or not you want exceptions to be thrown. Nothing new here. I believe a non-blocking mode is possible for std::iostream, but I'm unsure of how... elegant it would be to use. But should we offer the possibility for the user to set socket options, or merely offer a blocking(bool) method? Also, how should the stream object behave in case of EWOULDBLOCK? Set failbit? Even if we use a wouldblockbit, we still need to set failbit, as the only thing clear is that the operation was *not* successful. Then, what would the programmer need to check if the code gets to the else?
    protocol_message m;
    if (stream >> m) {
    }
    else {
        // what is the cause of the failure?
    }
That'll probably need one or two more if's to test all the status bits. It's ugly, and I hope there's some other, more interesting, way. -- Pedro Lamarão
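As an illustration of the state-checking problem described here, a small self-contained sketch; protocol_message is a stand-in type, and the comments mark the hypothetical wouldblockbit that does not exist in std::iostreams:

    #include <istream>

    // Stand-in message type; assume its extractor sets failbit on a bad read.
    struct protocol_message { /* ... */ };
    std::istream& operator>>(std::istream& is, protocol_message&) { return is; }

    void read_one(std::istream& stream)
    {
        stream.exceptions(std::ios_base::badbit);   // throw only on hard errors

        protocol_message m;
        if (stream >> m) {
            // complete message extracted
        } else if (stream.eof()) {
            // peer closed the connection
        } else {
            // failbit: a parse error -- or, with a hypothetical wouldblockbit,
            // perhaps just "try again later"; telling these apart is exactly
            // the extra if-testing called ugly above
        }
    }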

On 4/22/05, pedro.lamarao@mndfck.org <pedro.lamarao@mndfck.org> wrote:
Caleb Epstein wrote:
If non-blocking I/O were a possibility, I don't think throwing exceptions on EWOULDBLOCK would be performant.
I don't think it is necessary to throw an exception from the stream to signal it, for the same reason the stream can signal other kinds of errors without throwing exceptions. The standard basic_ios base class is even nice enough to offer the exceptions(iostate) "property" for you to choose whether or not you want exceptions to be thrown. Nothing new here.
But from my reading of your code, you explicitly throw whenever the system calls return -1. Please correct me if I am wrong.
I believe a non-blocking mode is possible for std::iostream, but I'm unsure of how... elegant it would be to use.
I'm not suggesting we implement non-blocking iostreams (I'll leave that to Jonathan :-), but the lowest level basic_socket implementation ought to support it. I don't see how it does in your implementation.
But should we offer the possibility for the user to set socket options, or merely offer a blocking(bool) method?
At the basic_socket level, all socket functionality should be accessible. It's all there for a reason, so if you leave something out, someone is guaranteed to complain.
Also, how should the stream object behave in case of EWOULDBLOCK? Set failbit? Even if we use a wouldblockbit, we still need to set failbit, as the only thing clear is that the operation was *not* successful.
I would not recommend combining non-blocking I/O with streams, at least not in any way that becomes visible to the user of the stream. I'm just advocating that the lowest level API should provide this functionality. Streams operate at a higher level. -- Caleb Epstein caleb dot epstein at gmail dot com

Caleb Epstein wrote:
[...]
Also, how should the stream object behave in case of EWOULDBLOCK? Set failbit? Even if we use a wouldblockbit, we still need to set failbit, as the only thing clear is that the operation was *not* successful.
I would not recommend combining non-blocking I/O with streams, at least not in any way that becomes visible to the user of the stream.
As far as I can see EWOULDBLOCK is just another reason for a stream to fail. If you use << or >> with streams you must be prepared for failure anyway. The only difference with EWOULDBLOCK is that you have to call << or >> again. However, when I think about it I wonder how reasonable non-blocking I/O streams are at all, as they don't support multiplexing - the application would need to call << or >> again and again?
I'm just advocating that the lowest level API should provide this functionality. Streams operate at a higher level.
After following most of the discussions I think this is broad consensus. That's why http://www.highscore.de/boost/net/packages.png looks like it does. Boris

From: "Boris" <boris@gtemail.net>
Caleb Epstein wrote:
[...]
Also, how should the stream object behave in case of EWOULDBLOCK? Set failbit? Even if we use a wouldblockbit, we still need to set failbit, as the only thing clear is that the operation was *not* successful.
I would not recommend combining non-blocking I/O with streams, at least not in any way that becomes visible to the user of the stream.
As far as I can see EWOULDBLOCK is just another reason for a stream to fail. If you use << or >> with streams you must be prepared for failure anyway. The only difference with EWOULDBLOCK is that you have to call << or >> again. However, when I think about it I wonder how reasonable non-blocking I/O streams are at all, as they don't support multiplexing - the application would need to call << or >> again and again?
How would the user level code know which insertion or extraction to repeat in an expression like this: s << a << b << c << d; It seems that there should be a setting that dictates how many times to retry, have the library retry that number of times and, if it still fails, throw an exception. -- Rob Stewart stewart@sig.com Software Engineer http://www.sig.com Susquehanna International Group, LLP using std::disclaimer;
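A sketch of the retry-count idea, with the caveat (made explicit later in the thread) that it assumes a failed insertion wrote nothing at all; the helper name is invented for illustration:

    #include <ostream>
    #include <string>

    template <class T>
    std::ostream& retrying_insert(std::ostream& s, const T& value, int max_tries)
    {
        for (int i = 0; i < max_tries; ++i) {
            if (s << value)
                return s;                    // insertion succeeded
            s.clear();                       // assume the failure was "would block"
        }
        throw std::ios_base::failure("insertion kept blocking");
    }

    // usage, instead of s << a << b:
    //   retrying_insert(s, a, 3);
    //   retrying_insert(s, b, 3);
    // Note: if an insertion can complete *partially*, retrying like this may
    // resend part of the data -- the problem discussed below.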

Rob Stewart wrote:
From: "Boris" <boris@gtemail.net>
Caleb Epstein wrote:
[...]
Also, how should the stream object behave in case of EWOULDBLOCK? Set failbit? Even if we use a wouldblockbit, we still need to set failbit, as the only thing clear is that the operation was *not* successful.
I would not recommend combining non-blocking I/O with streams, at least not in any way that becomes visible to the user of the stream.
As far as I can see EWOULDBLOCK is just another reason for a stream to fail. If you use << or >> with streams you must be prepared for failure anyway. The only difference with EWOULDBLOCK is that you have to call << or >> again. However, when I think about it I wonder how reasonable non-blocking I/O streams are at all, as they don't support multiplexing - the application would need to call << or >> again and again?
How would the user level code know which insertion or extraction to repeat in an expression like this:
s << a << b << c << d;
Socket streams supporting non-blocking I/O would require the user to check the result of I/O operations. However this should already be done with today's std::iostreams. If s is eg. of type std::ofstream and a call to << fails you have the same problem, don't you? Boris

From: "Boris" <boris@gtemail.net>
Rob Stewart wrote:
From: "Boris" <boris@gtemail.net>
How would the user level code know which insertion or extraction to repeat in an expression like this:
s << a << b << c << d;
Socket streams supporting non-blocking I/O would require the user to check the result of I/O operations. However this should already be done with today's std::iostreams. If s is eg. of type std::ofstream and a call to << fails you have the same problem, don't you?
Yes and no. When writing to a file, you can inspect its contents and determine what's missing. You can also delete it and rewrite it from the beginning. With sockets, you can't seek back to determine where you left off. Instead, clients must establish a protocol that permits resending a block of data, even if the receiver has had no problems reading data to that point. IOW, if a given message required more than one TCP block, the receiver will have read some number of blocks and will expect more to complete the current message. The protocol will have to recognize that the sender restarted the current message, throw away the currently read data (from the successful blocks), all because the sender couldn't figure out which insertion (<<) failed. Do you intend to impose that requirement on clients? Typically, when trying to do async I/O, one does one datum at a time in order to know whether that datum is sent or not. Sending multiple data in a row before checking for success leads to the problem I described above. So, allowing streaming for async I/O creates problems because you can chain the insertions or extractions and only check the status after all have been attempted. Put another way, I don't think the two can be reconciled nicely. -- Rob Stewart stewart@sig.com Software Engineer http://www.sig.com Susquehanna International Group, LLP using std::disclaimer;

Rob Stewart wrote:
From: "Boris" <boris@gtemail.net>
Rob Stewart wrote:
From: "Boris" <boris@gtemail.net>
How would the user level code know which insertion or extraction to repeat in an expression like this:
s << a << b << c << d;
Socket streams supporting non-blocking I/O would require the user to check the result of I/O operations. However this should already be done with today's std::iostreams. If s is eg. of type std::ofstream and a call to << fails you have the same problem, don't you?
Yes and no. When writing to a file, you can inspect its contents and determine what's missing. You can also delete it and rewrite it from the beginning. With sockets, you can't seek back to determine where you left off. Instead, clients must establish a protocol that permits resending a block of data, even if the receiver has had no problems reading data to that point.
IOW, if a given message required more than one TCP block, the receiver will have read some number of blocks and will expect more to complete the current message. The protocol will have to recognize that the sender restarted the current message, throw away the currently read data (from the successful blocks), all because the sender couldn't figure out which insertion (<<) failed. Do you intend to impose that requirement on clients?
I was under the false assumption that "s << a" either completes successfully or fails completely. But you are right: If a part of object a is sent we don't know. Thanks for jumping in! I agree now that socket streams can only support blocking I/O. Boris

From: "Boris" <boris@gtemail.net> Rob Stewart wrote:
From: "Boris" <boris@gtemail.net>
Rob Stewart wrote:
From: "Boris" <boris@gtemail.net>
How would the user level code know which insertion or extraction to repeat in an expression like this:
s << a << b << c << d;
Socket streams supporting non-blocking I/O would require the user to check the result of I/O operations. However this should already be done with today's std::iostreams. If s is eg. of type std::ofstream and a call to << fails you have the same problem, don't you?
Yes and no. When writing to a file, you can inspect its contents and determine what's missing. You can also delete it and rewrite it from the beginning. With sockets, you can't seek back to determine where you left off. Instead, clients must establish a protocol that permits resending a block of data, even if the receiver has had no problems reading data to that point.
IOW, if a given message required more than one TCP block, the receiver will have read some number of blocks and will expect more to complete the current message. The protocol will have to recognize that the sender restarted the current message, throw away the currently read data (from the successful blocks), all because the sender couldn't figure out which insertion (<<) failed. Do you intend to impose that requirement on clients?
I was under the false assumption that "s << a" either completes successfully or fails completely. But you are right: If a part of object a is sent we don't know. Thanks for jumping in! I agree now that socket streams can only support blocking I/O.
It is possible to get the condition you describe, too, but my example was a typical insertion expression in which multiple insertions are done on one line and the result isn't checked until after: s << a << b << c << d; Ignoring the possibility that any of those insertions only partially completed because of UDT insertion operators, for example, you don't know which insertion (of a, b, c, or d) failed if the result of the expression is false (i.e., if s.good() is false). Note that you can use unformatted input/output functions on streams. Those could reasonably offer async I/O since they effectively just forward to the underlying streambuf for I/O. However, you still have to figure out how to report EWOULDBLOCK. -- Rob Stewart stewart@sig.com Software Engineer http://www.sig.com Susquehanna International Group, LLP using std::disclaimer;
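For reference, a short example of the unformatted calls mentioned above -- read()/write() bypass the formatting layer and talk to the streambuf more or less directly, but EWOULDBLOCK would still have to be reported through the stream state somehow:

    #include <istream>
    #include <ostream>

    void copy_block(std::istream& in, std::ostream& out)
    {
        char buf[512];
        in.read(buf, sizeof buf);              // unformatted read
        std::streamsize got = in.gcount();     // how many bytes actually arrived
        if (got > 0)
            out.write(buf, got);               // unformatted write
        out.flush();
    }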

Rob Stewart wrote:
[...]
I was under the false assumption that "s << a" either completes successfully or fails completely. But you are right: If a part of object a is sent we don't know. Thanks for jumping in! I agree now that socket streams can only support blocking I/O.
It is possible to get the condition you describe, too, but my example was a typical insertion expression in which multiple insertions are done on one line and the result isn't checked until after:
s << a << b << c << d;
Ignoring the possibility that any of those insertions only partially completed because of UDT insertion operators, for example, you don't know which insertion (of a, b, c, or d) failed if the result of the expression is false (i.e., if s.good() is false).
I understand your point but still think this is true for any stream. There might be other ways to find out where insertion breaks (like in your example with a file), but then of course you leave the stream interface and don't use standard methods. However I withdraw the idea of non-blocking I/O support in socket streams anyway as we can't know with the standard interface if "s << a" fails partially.
Note that you can use unformatted input/output functions on streams. Those could reasonably offer async I/O since they effectively just forward to the underlying streambuf for I/O. However, you still have to figure out how to report EWOULDBLOCK.
I don't know how many developers require non-blocking unformatted I/O functions. I think most developers like streams because of code like "s << a << b << c << d". It's probably not worth the effort to support non-blocking unformatted I/O functions. If anyone thinks differently he should tell us now! :) Boris

On Tue, Apr 26, 2005 at 02:31:43AM +0300, Boris wrote:
Rob Stewart wrote: [snip]
However I withdraw the idea of non-blocking I/O support in socket streams anyway as we can't know with the standard interface if "s << a" fails partially.
Whether it is non-blocking or not is, AFAICS, irrelevant. If a is the size of a char then a is not in the buffer and the failure is because of the buffer write. If a is larger than a char then some part of a is in the buffer up to the point where the buffer overflows. If you know the size of the buffer, where you are in the buffer, and the size of a, then you know how much of a is in the buffer. However, only a fool would allow the buffer to overflow across a single <<.
I don't know how many developers require non-blocking unformatted I/O functions. I think most developers like streams because of code like "s << a << b << c << d". It's probably not worth the effort to support non-blocking unformatted I/O functions. If anyone thinks differently he should tell us now! :)
Blocking or non-blocking, IIUC, is irrelevant. And you want formatted I/O to do marshalling, which is the main advantage of iostreams. /ikh

Iain K. Hanson wrote: Hi Iain,
On Tue, Apr 26, 2005 at 02:31:43AM +0300, Boris wrote:
Rob Stewart wrote: [snip]
However I withdraw the idea of non-blocking I/O support in socket streams anyway as we can't know with the standard interface if "s << a" fails partially.
Whether it is non-blocking or not is, AFAICS, irrelevant. If a is the size of a char then a is not in the buffer and the failure is because of the buffer write.
In this early stage of the network library I primarily think about interfaces and behaviors and not about buffers. As blocking and non-blocking functions are quite similar (in their interface and behavior) I wanted to explore whether socket streams can support both I/O models. I understand now that it doesn't work with the standard interface of streams. I see that you are talking about the underlying buffer and want to point out additional problems, if I understand you correctly. I didn't get down to the underlying buffer, as I had already stopped at the interface. Boris
[...]

From: "Iain K. Hanson" <ikh@hansons.demon.co.uk>
On Tue, Apr 26, 2005 at 02:31:43AM +0300, Boris wrote:
However I withdraw the idea of non-blocking I/O support in socket streams anyway as we can't know with the standard interface if "s << a" fails partially.
Whether it is non-blocking or not is, AFAICS, irrelevant. If a is the size of a char then a is not in the buffer and the failure is because of the buffer write. If a is larger than a char then some part of a is in the buffer up to the point where the buffer overflows. If you know the size of the buffer, where you are in the buffer, and the size of a, then you know how much of a is in the buffer.
Async is an issue because you have to deal with the EWOULDBLOCK condition. However, you are raising a new issue regarding the possibility that an internal buffer of the data to be written on the socket can't be written. That can mean that a portion of an object written with << will be in the buffer, but there's no way to know which portion. Have I got your point right?
However, only a fool would allow the buffer to overflow across a single <<.
Statements like this aren't helpful. Try, "The mistake is allowing the buffer to overflow across a single <<." And then follow it up with guidance on the approach that should be taken. -- Rob Stewart stewart@sig.com Software Engineer http://www.sig.com Susquehanna International Group, LLP using std::disclaimer;

On Tue, 2005-04-26 at 12:52 -0400, Rob Stewart wrote:
From: "Iain K. Hanson" <ikh@hansons.demon.co.uk>
However, only a fool would allow the buffer to overflow across a single <<.
Statements like this aren't helpful.
Try, "The mistake is allowing the buffer to overflow across a single <<." And then follow it up with guidance on the approach that should be taken.
My apologies. I was just getting rather frustrated. However, that is not really a good excuse for intemperate language. My apologies to the group. /ikh

On Tue, 2005-04-26 at 12:52 -0400, Rob Stewart wrote:
From: "Iain K. Hanson" <ikh@hansons.demon.co.uk>
On Tue, Apr 26, 2005 at 02:31:43AM +0300, Boris wrote:
Async is an issue because you have to deal with the EWOULDBLOCK condition. However, you are raising a new issue regarding the possibility that an internal buffer of the data to be written on the socket can't be written. That can mean that a portion of an object written with << will be in the buffer, but there's no way to know which portion. Have I got your point right?
O.k. The buffer is absolutely necessary otherwise, as I have posted previously, you will invoke Nagle, delayed ack, and slow start. The buffer size must be specified by the user and is ideally some multiple of MSS. There are a few application layer protocols that just ship bytes over the wire such as the data channel of FTP. Most applications have a protocol or message structure. The prudent network programmer works out the size of the largest message and makes their buffer at least that large. They can then stream the data to the stream which can marshal the data into the buffer. You would then do an explicit flush ( std::endl ) and it is this operation and only this that can create socket errors. You will want mechanisms similar to those in std::iostream for dealing with errors i.e. both exceptions and failbit but adapted for network programming. Of course, it is the responsibility of the prudent network programmer to check the stream status if they have disabled exceptions and to always explicitly flush the stream and not allow overflow to do it for them. If perchance you were to stream out a uint32_t and overflow the buffer, then even if your program was not keeping track of the number of bytes written, the streambuf obviously knows how many bytes it contains. But doing this is, as I have said before, an error IMHO. /ikh
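Sketched as code, the discipline described above -- marshal into a sufficiently large streambuf, flush explicitly, and treat the flush as the only point where socket errors can appear. The function and its framing message format are hypothetical:

    #include <ostream>
    #include <string>

    // Assumes the stream's buffer was sized to hold the largest message,
    // so the insertions below cannot trigger an implicit overflow.
    bool send_quote(std::ostream& s, const std::string& quote)
    {
        s << quote.size() << ' ' << quote;   // marshalling only fills the buffer
        s << std::flush;                     // the single point where ::send happens
        return !s.fail();                    // or rely on exceptions() being set
    }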

Iain K. Hanson wrote:
On Tue, 2005-04-26 at 12:52 -0400, Rob Stewart wrote:
From: "Iain K. Hanson" <ikh@hansons.demon.co.uk>
On Tue, Apr 26, 2005 at 02:31:43AM +0300, Boris wrote:
Async is an issue because you have to deal with the EWOULDBLOCK condition. However, you are raising a new issue regarding the possibility that an internal buffer of the data to be written on the socket can't be written. That can mean that a portion of an object written with << will be in the buffer, but there's no way to know which portion. Have I got your point right?
O.k. The buffer is absolutely necessary otherwise, as I have posted previously, you will invoke Nagle, delayed ack, and slow start.
The buffer size must be specified by the user and is ideally some multiple of MSS.
There are a few application layer protocols that just ship bytes over the wire such as the data channel of FTP. Most applications have a protocol or message structure.
The prudent network programmer works out the size of the largest message and makes their buffer at least that large. They can then stream the data to the stream which can marshal the data into the buffer. You would then do an explicit flush ( std::endl ) and it is this operation and only this that can create socket errors.
You will want mechanisms similar to those in std::iostream for dealing with errors i.e. both exceptions and failbit but adapted for network programming.
Of course, it is the responsibility of the prudent network programmer to check the stream status if they have disabled exceptions and to always explicitly flush the stream and not allow overflow to do it for them.
If perchance you were to stream out a uint32_t and overflow the buffer, then even if your program was not keeping track of the number of bytes written, the streambuf obviously knows how many bytes it contains. But doing this is, as I have said before, an error IMHO.
Yes, the buffer is absolutely necessary, but it does not need to be contiguous in memory: it can be composed from smaller buffers. These smaller buffers can be used by a streambuffer. When the streambuffer detects that the buffer is full, it does not immediately write it out; it simply queues it in a buffer vector and gets a new empty buffer. When the size of the buffer vector reaches an ideal size (MSS or MTU), it is written using vectored I/O (writev or sendmsg). This way the streambuffer user does not need to be aware of buffer size requirements. A symmetric approach can be used for reading. This is especially useful if you read data from disk and want to write it out after appending a small header: if you don't want to copy the header and file data into a single buffer, you have to do two writes OR use vectored I/O. This boils down to one single rule: when reading, read as much as possible (i.e. until we would block); when writing, write as much as possible. If user code wants to see writing and reading as small operations, it is free to do so. This rule is also useful when using edge-triggered readiness notification APIs (epoll and kqueue in edge-triggered mode). -- Giovanni P. Deretta
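A rough sketch of the vectored write described above, using POSIX writev; error handling and the streambuf plumbing are omitted, and the helper name is invented:

    #include <sys/types.h>
    #include <sys/uio.h>     // writev, iovec
    #include <cstddef>
    #include <vector>

    typedef std::vector<std::vector<char> > buffer_queue;   // filled, queued buffers

    // Hand all queued buffers to the kernel in one call, without first
    // copying them into a single contiguous block. Assumes no queued
    // buffer is empty.
    ssize_t flush_queue(int fd, const buffer_queue& q)
    {
        if (q.empty()) return 0;
        std::vector<iovec> iov(q.size());
        for (std::size_t i = 0; i < q.size(); ++i) {
            iov[i].iov_base = const_cast<char*>(&q[i][0]);
            iov[i].iov_len  = q[i].size();
        }
        return ::writev(fd, &iov[0], static_cast<int>(iov.size()));
    }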

On Wed, Apr 27, 2005 at 07:34:25PM +0200, Giovanni P. Deretta wrote:
Iain K. Hanson wrote:
[snip]
Yes, the buffer is absolutely necessary, but it does not need to be contiguous in memory: it can be composed from smaller buffers. These smaller buffers can be used by a streambuffer. When the streambuffer detects that the buffer is full, it does not immediately write it out; it simply queues it in a buffer vector and gets a new empty buffer. When the size of the buffer vector reaches an ideal size (MSS or MTU), it is written using vectored I/O (writev or sendmsg). This way the streambuffer user does not need to be aware of buffer size requirements.
Hi Giovanni, Yes, I am aware of scatter / gather but I was simplifying in order to get my point across. However, even in scatter / gather each individual buffer must be contiguous. The idea of using a concept is that there can be more than one class that implements the concept, so not just vectors but also boost::array and unsigned char [], and there may be other containers. BTW char, signed char, and unsigned char are 3 different types and only unsigned char satisfies the requirements for networking portably. I'm not at all sure that you can absolve the user from buffer size concerns but I also don't know that you're wrong. I don't personally have a strong interest in an iostream interface, or rather I have not until this recent discussion. I always assumed that the criticisms of an iostream interface for sockets were valid. As I have thought about it more I am inclined to the view that the overhead could be constant and that is interesting.
A symmetric approach can be used for reading. This is especially useful if you read data from disk and want to write it out after appending a small header: if you don't want to copy the header and file data into a single buffer, you have to do two writes OR use vectored I/O.
This boils down to one single rule: when reading, read as much as possible (i.e. until we would block); when writing, write as much as possible. If user code wants to see writing and reading as small operations, it is free to do so. This rule is also useful when using edge-triggered readiness notification APIs (epoll and kqueue in edge-triggered mode).
Agreed. But I have great difficulty in understanding why anyone would want to do edge-triggered epoll / kqueue / dev/poll. regards /ikh

Iain K. Hanson wrote:
Yes, the buffer is absolutely necessary, but it does not need to be contiguous in memory: it can be composed from smaller buffers. These smaller buffers can be used by a streambuffer. When the streambuffer detects that the buffer is full, it does not immediately write it out; it simply queues it in a buffer vector and gets a new empty buffer. When the size of the buffer vector reaches an ideal size (MSS or MTU), it is written using vectored I/O (writev or sendmsg). This way the streambuffer user does not need to be aware of buffer size requirements.
Hi Giovanni, Yes, I am aware of scatter / gather but I was simplifying in order to get my point across. However, even in scatter / gather each individual buffer must be contiguous. The idea of using a concept is that there can be
Of course the single buffers are contiguous. The point here is that you don't need to flush every time a single buffer is full. You keep filling buffers until you explicitly flush the stream (std::endl).
more than one class that implements the concept, so not just vectors but also boost::array and unsigned char [], and there may be other containers.
I do not see how this is related to the problem. Using multiple buffers is an implementation detail that lets the user be oblivious to transfer-size optimizations.
BTW char, signed char, and unsigned char are 3 different types and only unsigned char satisfies the requirements for networking portably.
Right, I didn't think of this.
I'm not at all sure that you can absolve the user from buffer size concerns but I also don't know that you're wrong. I don't personally have a strong interest in an iostream interface, or rather I have not until this recent discussion.
Neither do I, but I think that once you have a working underlying networking framework, it will not be hard to fit iostreams on top of it.
I always assumed that the criticisms of an iostream interface for sockets were valid. As I have thought about it more I am inclined to the view that the overhead could be constant and that is interesting.
A symmetric approach can be used for reading. This is especially useful if you read data from disk and want to write it out after appending a small header: if you don't want to copy the header and file data into a single buffer, you have to do two writes OR use vectored I/O.
This boils down to one single rule: when reading, read as much as possible (i.e. until we would block); when writing, write as much as possible. If user code wants to see writing and reading as small operations, it is free to do so. This rule is also useful when using edge-triggered readiness notification APIs (epoll and kqueue in edge-triggered mode).
Agreed. But I have great difficulty in understanding why anyone would want to do edge-triggered epoll / kqueue / dev/poll.
Well why not, it might actually simplify some code. -- Giovanni P. Deretta

On Mon, Apr 25, 2005 at 11:45:06PM +0300, Boris wrote:
Rob Stewart wrote:
From: "Boris" <boris@gtemail.net>
Rob Stewart wrote:
From: "Boris" <boris@gtemail.net>
How would the user level code know which insertion or extraction to repeat in an expression like this:
s << a << b << c << d;
Socket streams supporting non-blocking I/O would require the user to check the result of I/O operations. However this should already be done with today's std::iostreams. If s is eg. of type std::ofstream and a call to << fails you have the same problem, don't you?
Yes and no. When writing to a file, you can inspect its contents and determine what's missing. You can also delete it and rewrite it from the beginning. With sockets, you can't seek back to determine where you left off. Instead, clients must establish a protocol that permits resending a block of data, even if the receiver has had no problems reading data to that point.
IOW, if a given message required more than one TCP block, the receiver will have read some number of blocks and will expect more to complete the current message. The protocol will have to recognize that the sender restarted the current message, throw away the currently read data (from the successful blocks), all because the sender couldn't figure out which insertion (<<) failed. Do you intend to impose that requirement on clients?
I was under the false assumption that "s << a" either completes successfully or fails completely. But you are right: If a part of object a is sent we don't know. Thanks for jumping in! I agree now that socket streams can only support blocking I/O.
Why do people think that just because you are using an iostream interface you can break the basic rules of TCP programming!!! Using iostreams there must be a buffer ( usually a streambuf ). The buffer size *must* be equal to the MSS / path MTU. As the application protocol writer you have to know how much you have written to the buffer *at all times* and you must know when an overflow / flush will happen. Errors can *only* happen when the buffer writes to the socket ( overflow/flush ); therefore if a programmer does his/her job correctly there will not be multiple operator << traversing an overflow boundary. This is irrespective of sync/async. Break the rules in network programming and you are up the proverbial creek without a paddle. And C++ can not protect you against all kinds of errors. We can only make it safer, not safe. /ikh

Iain K. Hanson wrote:
[...]
I was under the false assumption that "s << a" either completes successfully or fails completely. But you are right: If a part of object a is sent we don't know. Thanks for jumping in! I agree now that socket streams can only support blocking I/O.
Why do people think that just because you are using an iostream interface you can break the basic rules of TCP programming!!!
You apparently didn't read the part where I wrote that I made a false assumption. Not everyone can be a genius. Boris
[...]

On Tue, Apr 26, 2005 at 03:22:13AM +0300, Boris wrote:
You apparently didn't read the part where I wrote that I made a false assumption. Not everyone can be a genius.
Boris, my comments were not aimed at you personally but at everyone on this and related threads. In networking, there is no encapsulation and layering is only an abstract concept. People who don't understand this intrinsically really don't understand what a network library needs to be about. /ikh

At Monday 2005-04-25 18:48, Iain K. Hanson wrote:
On Tue, Apr 26, 2005 at 03:22:13AM +0300, Boris wrote:
You apparently didn't read the part where I wrote that I made a false assumption. Not everyone can be a genius.
Boris, my comments were not aimed at you personally but at everyone on this and related threads.
In networking, there is no encapsulation and layering is only an abstract concept.
People who don't understand this intrinsically really don't understand what a network library needs to be about.
When we finally get around to making the network behave like a telephone, where I can "call" a destination (an application on another system) and stream stuff both ways, _then_ the network library will be done. Until then, it's just a bunch of people talking about how difficult it is and how nobody understands.
/ikh
Victor A. Wagner Jr. http://rudbek.com The five most dangerous words in the English language: "There oughta be a law"

From: "Iain K. Hanson" <ikh@hansons.demon.co.uk>
Boris, my comments were not aimed at you personally but at everyone on this and related threads.
In networking, there is no encapsulation and layering is only an abstract concept.
People who don't understand this intrinsically really don't understand what a network library needs to be about.
Instead of attacking, why don't you help? If you think someone needs more information in order to correctly assist in developing the library, give some information or pointers to information that will help them. Your recent posts turn me off to participating in the thread, regardless of whether I can offer anything you and others might find useful. -- Rob Stewart stewart@sig.com Software Engineer http://www.sig.com Susquehanna International Group, LLP using std::disclaimer;

On Tue, 2005-04-26 at 12:55 -0400, Rob Stewart wrote:
From: "Iain K. Hanson" <ikh@hansons.demon.co.uk>
Boris, my comments were not aimed at you personally but at everyone on this and related threads.
In networking, there is no encapsulation and layering is only an abstract concept.
People who don't understand this intrinsically really don't understand what a network library needs to be about.
Instead of attacking, why don't you help? If you think someone needs more information in order to correctly assist in developing the library, give some information or pointers to information that will help them.
Your recent posts turn me off to participating in the thread, regardless of whether I can offer anything you and others might find useful.
The point I was trying to make is that while you can have encapsulation and information hiding in C++, these do not really exist in network programming. You have to be aware of *all* of the protocols that sit beneath you or they will come up and bite you. Unfortunately, my tolerance for teaching networking 101 is limited. /ikh

From: "Iain K. Hanson" <ikh@hansons.demon.co.uk>
Using iostreams there must be a buffer ( usually a streambuf ). The buffer size *must* be equal to the MSS / path MTU.
As the application protocol writer you have to know how much you have written to the buffer *at all times* and you must know when an overflow / flush will happen.
Errors can *only* happen when the buffer writes to the socket ( overflow/flush ); therefore if a programmer does his/her job correctly there will not be multiple operator << traversing an overflow boundary. This is irrespective of sync/async.
According to this, there is no way to use streams with sockets. Otherwise, clients of the stream interface would need to keep track of the size of formatted output of each object inserted on the stream to avoid overflow between insertion operators. How can they do that? -- Rob Stewart stewart@sig.com Software Engineer http://www.sig.com Susquehanna International Group, LLP using std::disclaimer;

----- Original Message ----- From: "Rob Stewart" <stewart@sig.com> To: <boost@lists.boost.org> Cc: <boost@lists.boost.org> Sent: Wednesday, April 27, 2005 4:57 AM Subject: Re: [boost] Re: Re: Re: Re: (Another) socket streams library [snip]
As the application protocol writer you have to know how much you have written to the buffer *at all times* and you must know when an overflow / flush will happen.
Errors can *only* happen when the buffer writes to the socket ( overflow/flush ); therefore if a programmer does his/her job correctly there will not be multiple operator << traversing an overflow boundary. This is irrespective of sync/async.
According to this, there is no way to use streams with sockets. Otherwise, clients of the stream interface would need to keep track of the size of formatted output of each object inserted on the stream to avoid overflow between insertion operators. How can they do that?
Yeah, tricky stuff. After several attempts with varying levels of success the following is the most complete and efficient that I have come up with. This is specifically to deal with "streaming" over async sockets. This may ramble a bit but hopefully with purpose :-) The insertion operator (<<) is pre-defined for all "standard" or "built-in" types. Templates are defined for the standard containers. This is consistent with the approach taken by many, including Boost serialization. Insertion operators are defined for any application types. These have the appearance of:
    stream & operator<<( stream &s, application_type &a )
    {
        s << a.member_1;
        s << a.member_2;
        s << a.member_3;
        return s;
    }
Nothing new there :-) But (!) rather than the expected conversion to a formatted byte stream, these operators transform the application object to its generic form (i.e. a variant) and place it on an outgoing queue of variants. This activity occurs in the "application zone", a name coined to separate application object activity from the behind-the-scenes processing of network/socket events. In the "socket zone" a "WRITE" notification advises that it's a "good time to write". The socket code manages the transfer of those queued variants to a byte buffer suitable for sending. This activity obviously requires the conversion of variants to some byte-by-byte encoding, or format. The task of this "transfer" code is to present optimal blocks to the network API, taking its raw material from the queue of outgoing variants. The buffering strategy at this point is really the essence of this solution. The underlying container for my buffering is a std::vector<char>. While my buffer does not contain a block of optimal network size I attempt to take a variant off the queue. If the queue is non-empty I "stream" a variant onto the end of the buffer, i.e. append the formatted representation of the variant using vector<>::push_back. This specifically allows the buffer to temporarily contain more than the network system requires. This may sound like a crippling design decision but in practice it works very well. It brings (IMO) the huge advantage of allowing the complete "streaming" of any application object. A block is written to the network. The amount accepted by the network is used to adjust a "skip" value, the number of leading buffered bytes that don't need to be written again. The last piece of this solution is a "wrap" operation that tidies the buffer up at the end of a phase of writing (i.e. the socket code handling a "WRITE" notification). If the remaining bytes in the buffer are fewer than an optimal network block, they are all shifted down to the zero-th position and the skip value is reset to zero. The underlying vector will (worst case) be as big as the largest streamed application object plus odd bits and pieces totalling less than an optimal network block, i.e. if application objects tend to "max out" at 8K then the vector may approach 16K in size. Of course "reserve" can be used to minimize some initial memory shuffling. Hopefully this sketch is enough to describe the somewhat bizarre solution I have arrived at to deal with the conflicting requirements in this area. Trying to couple the application event of streaming an object to the network event of writing a byte block is (only my opinion :-) doomed. I would even go as far as saying that this is true for all streaming, i.e. to files. But that's another story. And don't even start me on input streams. Cheers.
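A compressed, hypothetical sketch of the skip/wrap bookkeeping described above (names invented; the variant queue and the formatting step are left out):

    #include <sys/types.h>
    #include <sys/socket.h>
    #include <cstddef>
    #include <vector>

    class out_buffer {
    public:
        out_buffer() : skip_(0) {}

        // append the formatted bytes of one streamed application object;
        // the buffer may temporarily hold more than one network block
        void append(const char* p, std::size_t n) {
            bytes_.insert(bytes_.end(), p, p + n);
        }

        // on a WRITE notification: offer the pending bytes to the socket
        // and advance the skip value by however much the network accepted
        void write_some(int fd) {
            if (skip_ >= bytes_.size()) return;
            ssize_t sent = ::send(fd, &bytes_[skip_], bytes_.size() - skip_, 0);
            if (sent > 0) skip_ += static_cast<std::size_t>(sent);
        }

        // the "wrap": shift the unsent tail to the front, reset skip
        void wrap() {
            bytes_.erase(bytes_.begin(),
                         bytes_.begin() + static_cast<std::ptrdiff_t>(skip_));
            skip_ = 0;
        }

    private:
        std::vector<char> bytes_;
        std::size_t       skip_;
    };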

Scott Woods wrote:
[...] Yeah, tricky stuff. After several attempts with varying levels of success the following is the most complete and efficient that I have come up with. This is specifically to deal with "streaming" over async sockets. This may ramble a bit but hopefully with purpose :-)
If I understand correctly your socket streams support blocking I/O. However I/O methods are guaranteed to return immediately as data is copied to the so-called "application zone" where it waits until it is sent to the network? Boris
[...]

Hi Boris, ----- Original Message ----- From: "Boris" <boris@gtemail.net> To: <boost@lists.boost.org> Sent: Wednesday, April 27, 2005 9:47 PM Subject: [boost] Re: Re: Re: Re: Re: (Another) socket streams library
Scott Woods wrote:
[...] Yeah, tricky stuff. After several attempts with varying levels of success the following is the most complete and efficient that I have come up with. This is specifically to deal with "streaming" over async sockets. This may ramble a bit but hopefully with purpose :-)
If I understand correctly your socket streams support blocking I/O. However I/O methods are guaranteed to return immediately as data is copied to the so-called "application zone" where it waits until it is sent to the network?
Yes to the first question (i.e. blocking I/O). Not clear on what you were asking in the second. I'll try the shotgun approach and hope I hit something :-)

Streaming operations might be characterized as "application-object-centric" (and rightly so!). Operations in the "application zone" involve the sending of application objects to a "stream". Async I/O operations might be characterized as "network-event-centric". Code responding to network notifications (e.g. ready-to-send) is focused on making as much of the opportunity as possible (e.g. presenting the maximum amount of data to the network API).

The data flow sketched out in my previous mail has two "buffers". The first lives between the application and network zones. The second between the "network zone" and the actual network API (e.g. ::send). Full justification for the "double buffering" is probably out of scope here; it does relate to the general need for async behaviour "intra-process". The point of presenting it at all was more about the uncoupling of the streaming output operation from the actual I/O operation.

Relating the streaming of an application object to a subsequent I/O error is very difficult. I specifically include the detection of such errors on "flush". After taking another approach, not only has my mood improved but I now even consider the possibility that relating the application streaming operation to the ultimate ::send/::write is... wrong?

My approach to input is (inevitably? ;-) completely different. There are no "operator>>( stream &, application_object & )"'s to match the output operators; the design is asymmetric. This is for the simple reason that a function with such a signature implies blocking; it must wait for potentially multiple network reads to complete the application_object. The design is fully async.

Did I hit anything? Cheers.
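To make the first of those two buffers concrete, a tiny sketch with invented names (the conversion from application type to variant is elided): the application-zone operator does nothing but queue the generic form.

    #include <deque>

    // 'variant' stands in for the generic form; its real definition is elided.
    struct variant { /* ... */ };

    struct stream
    {
        std::deque<variant> outgoing;   // buffer 1: application zone -> socket zone
    };

    // application zone: transform-and-queue -- returns immediately, no network call
    stream & operator<<( stream &s, const variant &v )
    {
        s.outgoing.push_back( v );
        return s;
    }

Buffer 2 is the skip/wrap byte vector sketched earlier; the socket-zone code drains "outgoing" into it when a WRITE notification arrives.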

Hi Scott,

Scott Woods wrote:
[...]
If I understand correctly your socket streams support blocking I/O. However I/O methods are guaranteed to return immediately as data is copied to the so-called "application zone" where it waits until it is sent to the network?
Yes to the first question (i.e. blocking I/O).
Not clear on what you were asking in the second. I'll try the shotgun approach and hope I hit something :-)
I try again, too. :-) What I wanted to say is that the blocking I/O methods of your socket stream always return immediately. They block but don't need to wait because they just copy data to another application buffer. When the library user calls your operator<<, he knows the call will return immediately and not after e.g. 10 seconds. Is this correct?
[...] My approach to input is (inevitably? ;-) completely different. There are no "operator>>( stream &, application_object & )"'s to match the output operators; the design is asymmetric. This is for the simple reason that a function with such a signature implies blocking; it must wait for potentially multiple network reads to complete the application_object. The design is fully async.
If there is no operator>> don't you think you make socket streams less useful for library users? Isn't the idea of socket streams that library users who are familiar with the interface of iostreams can start sending and receiving data on the network without knowing too much about the details? If you write "std::cin >> a;" you know it will stop your program until something is entered. I'd think it makes sense if socket streams behave similarly? Boris

Not clear on what you were asking in the second. I'll try the shotgun approach and hope I hit something :-)
I try again, too. :-) What I wanted to say is that the blocking I/O methods of your socket stream always return immediately. They block but don't need to wait because they just copy data to another application buffer. When the library user calls your operator<<, he knows the call will return immediately and not after e.g. 10 seconds. Is this correct?
Yes.
My approach to input is (inevitably? ;-) completely different. There are no "operator>>( stream &, application_object & )"'s to match the output operators; the design is asymmetric. This is for the simple reason that a function with such a signature implies blocking; it must wait for potentially multiple network reads to complete the application_object. The design is fully async.
If there is no operator>> don't you think you make socket streams less useful for library users? Isn't the idea of socket streams that library users who are familiar with the interface of iostreams can start sending and receiving data on the network without knowing too much about the details? If you write "std::cin >> a;" you know it will stop your program until something is entered. I'd think it makes sense if socket streams behave similarly?
Tricky to answer (i.e. "less useful"). There are probably several "right" answers. Hopefully I have one of them ;-)

In certain scenarios it would be reasonable to provide "stream >> a"'s that block as necessary. I would like to say that those scenarios exist in "simple" applications but that would be too easy. And there is certainly value in being able to write such code in small test programs.

The difficulties begin when you _truly_ need async input, e.g. for reading of application objects off an async socket. What mechanism can we use? I think it's significant that there is no standard answer to this. The argument over whether the blocking version should exist is possibly separate from the fact that the non-blocking one doesn't?

The essence of my solution (which is only one of the "right" ones ;-) starts with a class, say "SMTP_server_session" (the following is a major simplification);

    class SMTP_server_session
    {
        stream remote;
        // ...
        void some_method()
        {
            remote << SMTP_reject();
        }
    };

This includes a method that calls the stream output operator (<<). The _output_ is shown to originate from a class because _input_ is always directed at that same class (well, instance of course). So the originating object always has the following method;

    class SMTP_server_session
    {
        // ...
        void operator()( variant & );
    };

If I've judged it right then you can imagine the data flow. All receiving is achieved generically, i.e. the low-level input code deals in variants. On completion of a variant, which may take one or more network blocks, it is presented to a "session owner", as above. The session performs the conversion to application types.

So to achieve a solution to async I/O I've had to develop a minimal async framework. For me you can't have one without the other. I suspect that this is the root of the difficulty with async I/O. Boost discussions seem to bounce away from the framework issue because it appears adjunct to some, irrelevant to others. Understandable reactions.

BTW, conversion to application types looks like this;

    application_type & operator>>( variant &v, application_type &t )
    {
        variant_array &a = v;
        a[ 0 ] >> t.member_1;
        a[ 1 ] >> t.member_2;
        a[ 2 ] >> t.member_3;
        return t;
    }

Templates are pre-defined for all the standard types and containers. All variant-to-application-type operators are defined to implement "move" semantics, e.g. the string that is allocated by the low-level socket reading code is the same string that is eventually used at application level; it effectively "moves up" the software stack thanks to "swap". <sigh> Well it was nice when I got it all working but probably meaningless in this discussion. Or maybe not...

The significant thing about the above "input" operator is that it is non-blocking; a complete variant has already been recognised by the low-level code. Cheers.
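A guess at the shape of that "move via swap" operator (the member name "text" is assumed; pre-C++11 vintage, so swap stands in for move):

    #include <string>

    struct variant
    {
        std::string text;   // assumed member: the string decoded by the socket code
    };

    // the string allocated down in the socket reading code becomes the
    // application-level string without a copy -- it "moves up" via swap
    std::string & operator>>( variant &v, std::string &member )
    {
        member.swap( v.text );
        return member;
    }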

Scott Woods wrote:
[...] In certain scenarios it would be reasonable to provide "stream >> a"'s that block as necessary. I would like to say that those scenarios exist in "simple" applications but that would be too easy. And there is certainly value in being able to write such code in small test programs.
The difficulties begin when you _truly_ need async input, e.g. for reading of application objects off an async socket. What mechanism can we use? I think it's significant that there is no standard answer to this. The argument over whether the blocking version should exist is possibly separate from the fact that the non-blocking one doesn't?
I agree with what you say. However, I think support for async I/O doesn't belong in socket streams, because the stream interface doesn't support async I/O by default. Instead of adding async I/O support to streams, I think async I/O should be provided by another package. If I think about a library user, he might want to use async I/O. The network library could then provide an ACE-like package with full async I/O support. If the library user however wants to use socket streams, I think his decision is based on the interface he knows. Then he has to accept that only blocking I/O is supported, as with other streams. Whoever wants to use async I/O has to understand how async I/O is supported anyway. But whoever wants to use streams probably doesn't want to learn a new interface. Or do you think there are other reasons why a library user would like to use socket streams? Boris
[...]

Boris wrote:
[snip]
of application objects off an async socket. What mechanism can we use? I think it's significant that there is no standard answer to this. The argument over whether the blocking version should exist is possibly separate from the fact that the non-blocking one doesn't?
I agree with what you say. However I think support for async I/O doesn't belong to socket streams because the stream interface doesn't support async I/O by default. Instead of adding async I/O support to streams I think async I/O should be provided by another package.
If I think about a library user he might want to use async I/O. The network library could then provide an ACE-like package with full async I/O support. If the library user however wants to use socket streams I think his decision is based on the interface he knows. Then he has to accept that only blocking I/O is supported as with other streams, too. Whoever wants to use async I/O has to understand how async I/O is supported anyway. But whoever wants to use streams probably doesn't want to learn a new interface. Or do you
Yes. That's the "async flow" library that's been mentioned in this or a related thread.
think there are other reasons why a library user would like to use socket streams?
Yes. The separation of async into a separate library/framework is the way to go. While I agree with your comments, they read as if sync/async is a developer's choice; I'm assuming that's accidental? I feel as if we have clarified one thing: there can be no iostream input operators over async sockets (the function signature is blocking). Which leaves the output operators, and whether iostream output operators over async sockets have any value? Cheers.

Scott Woods wrote:
[...] I feel as if we have clarified one thing; there can be no iostream input operators over async sockets (the function signature is blocking). Which leaves the output operators and whether iostream output operators over async has any value?
The async implementation you proposed can be supported by blocking I/O functions as far as I understand (or should I say blocking O functions :-), as the whole asynchronicity is hidden behind the interface. The only difference from the standard blocking I/O functions is that your implementation guarantees that blocking I/O functions return immediately (which is of course no problem and might be useful for library users). Boris

Hi Boris,

As you said somewhere, there has been "heavy discussion". Almost sounds like something that parents might disapprove of.
Scott Woods wrote:
[...] I feel as if we have clarified one thing; there can be no iostream input operators over async sockets (the function signature is blocking). Which leaves the output operators and whether iostream output operators over async has any value?
The async implementation you proposed can be supported by blocking I/O
Was that the wee table of libs in a previous message?
functions as far as I understand (or should I say blocking O functions :-) as the whole asynchronicity is hidden behind the interface. The only difference to the standard blocking I/O functions is that your implementation guarantees that blocking I/O function return immediately (which is of course no problem and might be useful for library users).
Nathan and I have been through a recent exchange of mail. It's been interesting and for me at least there has been a conclusion. Iostreaming over sockets running in asynchronous mode is possible. It's just not a goal for me. That's not because I don't need asynchronous software; in fact, quite the opposite. An asynchronous application is more difficult to write using streams over async sockets. At best.

Taking a simplistic telephony example: a PBX is establishing a call between two phones. Imagine these phones are in communication with the PBX software over separate TCP connections (i.e. sockets in async mode). Using the iostream model there will be calls such as;

    calling_phone >> telephony_message;

If "calling_phone" is the istream for one of the TCP connections then the caller (of operator>>) is committing itself to the completion of the next message from that phone. What happens if a message arrives from the other phone, on the other stream, before a message is completed from "calling_phone"?

IIUC, Nathan's approach (and an approach consistent with traditional iostream software) would involve two threads - one for each phone stream. Received messages would be transferred to the PBX component as a distinct step. In my approach, the messages are transferred directly to the PBX component as a normal function of the async framework. There is code that processes inbound socket data. But that code is generic (i.e. it doesn't know about "telephony_message" types) and, having completed a message, it directs it to the relevant party according to info retained in the framework. Cheers.
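A rough sketch of that dispatch, with invented names throughout (the variant and conversion machinery are placeholders of the kind used earlier):

    struct variant { /* generic form assembled by the low-level socket code */ };
    struct telephony_message { /* application type */ };

    // conversion as sketched earlier; body elided
    void operator>>( variant &, telephony_message & ) {}

    // The framework calls this whenever a complete variant has been assembled
    // on either phone's connection -- nothing ever blocks waiting on one phone.
    struct pbx_session
    {
        void operator()( int phone_fd, variant &msg )
        {
            telephony_message m;
            msg >> m;   // non-blocking: the variant is already complete
            // ... advance the call state for whichever phone spoke first
            (void) phone_fd;
        }
    };

The point is that the blocking question never arises: the session is handed whichever message completed first, regardless of which phone it came from.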

On Tue, 2005-04-26 at 12:57 -0400, Rob Stewart wrote:
From: "Iain K. Hanson" <ikh@hansons.demon.co.uk>
Using iostreams there must be a buffer (usually a streambuf). The buffer size *must* be equal to the MSS / path MTU.
As the application protocol writer you have to know how much you have written to the buffer *at all times* and you must know when an overflow / flush will happen.
Errors can *only* happen when the buffer writes to the socket (overflow/flush); therefore, if a programmer does his/her job correctly, there will not be multiple operator<< traversing an overflow boundary. This is irrespective of sync/async.
According to this, there is no way to use streams with sockets. Otherwise, clients of the stream interface would need to keep track of the size of formatted output of each object inserted on the stream to avoid overflow between insertion operators. How can they do that?
No. But see my previous post for an explanation of how it works. BTW, the only formatting I think a socket stream should do is marshaling, and that does not affect the number of bytes written when doing things in network byte order. Whether you could use streams with, say, XDR / BER / DER / PER / CDR marshaling I don't know (at the moment), and given that I am working on socket wrappers at the moment, it is not something I intend to give a great deal of consideration just now. /ikh
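For readers less familiar with the streambuf mechanics being referred to here, a generic sketch (not taken from any of the libraries under discussion) of why errors can only surface at overflow/flush -- the socket is touched nowhere else; a short send is treated as an error purely for brevity:

    #include <cstddef>
    #include <streambuf>
    #include <sys/socket.h>

    class socketbuf : public std::streambuf
    {
        int  fd_;
        char buf_[ 1460 ];   // assumed MSS-sized buffer
    public:
        explicit socketbuf( int fd ) : fd_( fd ) { setp( buf_, buf_ + sizeof buf_ ); }
    protected:
        int_type overflow( int_type c )
        {
            if ( sync() == -1 )
                return traits_type::eof();              // the error shows up here
            if ( !traits_type::eq_int_type( c, traits_type::eof() ) )
                sputc( traits_type::to_char_type( c ) );
            return traits_type::not_eof( c );
        }
        int sync()
        {
            std::ptrdiff_t n = pptr() - pbase();
            if ( n > 0 && ::send( fd_, pbase(), n, 0 ) != n )
                return -1;                              // ...or here, on flush
            setp( buf_, buf_ + sizeof buf_ );
            return 0;
        }
    };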

Caleb Epstein wrote:
I don't think it is necessary to throw an exception from the stream to signal it, for the same reason the stream can signal other kinds of errors without throwing exceptions. The standard basic_ios base class is even nice enough to offer the exceptions(iostate) "property" for you to choose if you want or don't want exceptions to be thrown. Nothing new here.
But from my reading of your code, you explicitly throw whenever the system calls return -1. Please correct me if I am wrong.
That's inside the basic_socket class; the point there is to support some kind of easy, low level socket operation. Thus, let's throw instead of constantly checking for (-1). But inside the streambuf, all exceptions are caught. I haven't worked much more on this aspect of the streambuf yet, but here the point is never to throw. Even if we want exceptions to be thrown, that should be handled by basic_ios -- setting failbit. At least, that's what's currently on my mind.
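For reference, the standard opt-in looks like this (a generic example; "sock" is assumed to be some istream-derived socket stream):

    #include <istream>

    void enable_throwing( std::istream &sock )
    {
        sock.exceptions( std::ios_base::failbit | std::ios_base::badbit );
        // from now on, an operation that sets failbit or badbit throws
        // std::ios_base::failure instead of just flagging the stream state
    }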
I believe a non-blocking mode is possible for std::iostream, but I'm unsure of how... elegant it would be to use it.
I'm not suggesting we implement non-blocking iostreams (I'll leave that to Jonathan :-), but the lowest level basic_socket implementation ought to support it. I don't see how it does in your implementation.
It doesn't, because I didn't bother to add a wrapper to fcntl(). That's easy to do, though, and would be consistent with the purpose of the basic_socket class. But you can always pass MSG_DONTWAIT as the flag parameter of the receive and send methods. That will provide you with a non-blocking call.
But should we offer the possibility for the user to set socket options, or merely offer a blocking(bool) method?
At the basic_socket level, all socket functionality should be accessible. It's all there for a reason, so if you leave something out, someone is guaranteed to complain.
Agreed. The point of the exposed "socket" class is to provide an easy entry point to what I would call "hybrid" networking code -- so that someone can easily substitute raw system calls to use "socket" objects without much code rewriting. -- Pedro Lamarão

On 4/22/05, pedro.lamarao@mndfck.org <pedro.lamarao@mndfck.org> wrote:
But you can always pass MSG_DONTWAIT as the flag parameter of the receive and send methods. That will provide you with a non-blocking call.
This is not universally supported. Linux has it but, e.g. Solaris does not. -- Caleb Epstein caleb dot epstein at gmail dot com
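A small sketch of the portable alternative: instead of per-call flags, put the descriptor itself into non-blocking mode with fcntl (standard POSIX, nothing library-specific):

    #include <fcntl.h>

    int set_nonblocking( int fd )
    {
        int flags = ::fcntl( fd, F_GETFL, 0 );
        if ( flags == -1 )
            return -1;
        return ::fcntl( fd, F_SETFL, flags | O_NONBLOCK );
    }
    // afterwards ::recv/::send return -1 with errno set to EAGAIN (or
    // EWOULDBLOCK) instead of blocking, whatever flags the call itself passes.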
participants (11)
- Boris
- Caleb Epstein
- Giovanni P. Deretta
- Iain K. Hanson
- Iain K. Hanson
- Jeff Garland
- Michel André
- pedro.lamarao@mndfck.org
- Rob Stewart
- Scott Woods
- Victor A. Wagner Jr.