[asio] Socket closed notification?

Nicola Michael Gutberlet

14 Dec 2009

5:52 p.m.

Hello everybody, I'm using TCP sockets and wonder, if there's a possibility to get an automatic notification when one side of a socket is closed (even if the other side is crashed / not closed properly). Thanks in advance for any help. Regards, Nicola

Jonathan Franklin

14 Dec

10:43 p.m.

On Mon, Dec 14, 2009 at 10:52 AM, Nicola Michael Gutberlet <nicola.gutberlet@hhi.fraunhofer.de> wrote:

...

I’m using TCP sockets and wonder, if there’s a possibility to get an automatic notification when one side of a socket is closed (even if the other side is crashed / not closed properly).

You generally have to do a read on the socket to reliably detect when the other side has closed (or crashed). Jon
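
A minimal sketch of what "do a read" means at the socket level. It uses plain POSIX calls rather than Asio so that it stays self-contained; `peer_has_closed` is a made-up helper name, and `MSG_DONTWAIT` is assumed to be available (Linux and the BSDs). In Asio the same end-of-stream condition is reported by the read functions as `boost::asio::error::eof`.

```cpp
#include <cassert>
#include <cerrno>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical helper: peeks at the socket without blocking and without
// consuming data. Returns true if the read side reports that the connection
// is gone, false if data is pending or nothing has been reported yet.
bool peer_has_closed(int fd) {
    assert(fd >= 0);  // precondition: a valid, connected stream socket
    char byte;
    ssize_t n = recv(fd, &byte, 1, MSG_PEEK | MSG_DONTWAIT);
    if (n > 0) return false;  // data pending, so nothing to report yet
    if (n == 0) return true;  // end of stream: the peer closed its side
    // n < 0: "would block" just means no news; anything else is a real error
    // such as ECONNRESET or ETIMEDOUT.
    return !(errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR);
}
```

Note that a peer which has silently dropped off the network leaves this returning false indefinitely, because nothing ever arrives to report.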

Scott Gifford

15 Dec

3:50 a.m.

Jonathan Franklin <franklin.jonathan@gmail.com> writes:

...

On Mon, Dec 14, 2009 at 10:52 AM, Nicola Michael Gutberlet <nicola.gutberlet@hhi.fraunhofer.de> wrote:

...
I'm using TCP sockets and wonder, if there's a possibility to get an automatic notification when one side of a socket is closed (even if the other side is crashed / not closed properly).

You generally have to do a read on the socket to reliably detect when the other side has closed (or crashed).

And a write, too, or else use TCP keepalives. Otherwise a crashed client which is never sending anything will not send any error indication for you to read. ----Scott.

Andrew Maclean

3:58 a.m.

Do you have any ideas on how to do this within the ASIO framework? Andrew On Tue, Dec 15, 2009 at 2:50 PM, Scott Gifford <sgifford@suspectclass.com> wrote:

...

Jonathan Franklin <franklin.jonathan@gmail.com> writes:

...
On Mon, Dec 14, 2009 at 10:52 AM, Nicola Michael Gutberlet <nicola.gutberlet@hhi.fraunhofer.de> wrote:

...
I’m using TCP sockets and wonder, if there’s a possibility to get an automatic notification when one side of a socket is closed (even if the other side is crashed / not closed properly).

You generally have to do a read on the socket to reliably detect when the other side has closed (or crashed).

And a write, too, or else use TCP keepalives. Otherwise a crashed client which is never sending anything will not send any error indication for you to read.

----Scott. _______________________________________________ Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users

-- ___________________________________________ Andrew J. P. Maclean Centre for Autonomous Systems The Rose Street Building J04 The University of Sydney 2006 NSW AUSTRALIA Ph: +61 2 9351 3283 Fax: +61 2 9351 7474 URL: http://www.acfr.usyd.edu.au/ ___________________________________________

Scott Gifford

4:17 a.m.

Andrew Maclean <andrew.amaclean@gmail.com> writes:

...

On Tue, Dec 15, 2009 at 2:50 PM, Scott Gifford <sgifford@suspectclass.com> wrote:

...
Jonathan Franklin <franklin.jonathan@gmail.com> writes:

...
On Mon, Dec 14, 2009 at 10:52 AM, Nicola Michael Gutberlet <nicola.gutberlet@hhi.fraunhofer.de> wrote:

...
I'm using TCP sockets and wonder, if there's a possibility to get an automatic notification when one side of a socket is closed (even if the other side is crashed / not closed properly).

You generally have to do a read on the socket to reliably detect when the other side has closed (or crashed).

And a write, too, or else use TCP keepalives. Otherwise a crashed client which is never sending anything will not send any error indication for you to read.

Do you have any ideas on how to do this within the ASIO framework?

Sure, just use any of the read functions and see if they return an error. In my application, I use read_until(), but any function that reads should do. Jonathan Franklin <franklin.jonathan@gmail.com> writes: [...]

...

Unfortunately, write isn't reliable since it will always succeed until you fill up your send buffer.

Well, OK, maybe a write and a flush. Certainly there are other buffers, but typically a TCP implementation will time out at some point if it has some pending data which is not acknowledged, and typically once data is flushed to the TCP layer it will try to write it and begin waiting for its timeout. ----Scott.

Jonathan Franklin

4:43 a.m.

On Mon, Dec 14, 2009 at 9:17 PM, Scott Gifford <sgifford@suspectclass.com> wrote:

...

Andrew Maclean <andrew.amaclean@gmail.com> writes:

...
On Tue, Dec 15, 2009 at 2:50 PM, Scott Gifford <sgifford@suspectclass.com> wrote:

...
Jonathan Franklin <franklin.jonathan@gmail.com> writes: Sure, just use any of the read functions and see if they return an error. In my application, I use read_until(), but any function that reads should do.

If the remote app closes unexpectedly, your read will return EOF.

...

...
Unfortunately, write isn't reliable since it will always succeed until you fill up your send buffer.

Well, OK, maybe a write and a flush. Certainly there are other buffers, but typically a TCP implementation will time out at some point if it has some pending data which is not acknowledged, and typically once data is flushed to the TCP layer it will try to write it and begin waiting for its timeout.

Yeah, it's not reliable, and can take *many* writes. Jon

Scott Gifford

5:07 a.m.

Jonathan Franklin <franklin.jonathan@gmail.com> writes:

...

On Mon, Dec 14, 2009 at 9:17 PM, Scott Gifford <sgifford@suspectclass.com> wrote:

...
Andrew Maclean <andrew.amaclean@gmail.com> writes:

...
On Tue, Dec 15, 2009 at 2:50 PM, Scott Gifford <sgifford@suspectclass.com> wrote:

...
Jonathan Franklin <franklin.jonathan@gmail.com> writes: Sure, just use any of the read functions and see if they return an error. In my application, I use read_until(), but any function that reads should do.

If the remote app closes unexpectedly, your read will return EOF.

...

From when I tested this a while back, my notes indicate I got one of these errors for an unclean shutdown (timeout or a TCP RST packet):

boost::system::errc::no_such_file_or_directory
boost::asio::error::shut_down

but it's been a while and I don't remember clearly under exactly what circumstances.

...

...
...
Unfortunately, write isn't reliable since it will always succeed until you fill up your send buffer.

Well, OK, maybe a write and a flush. Certainly there are other buffers, but typically a TCP implementation will time out at some point if it has some pending data which is not acknowledged, and typically once data is flushed to the TCP layer it will try to write it and begin waiting for its timeout.

Yeah, it's not reliable, and can take *many* writes.

The TCP stack may have to do many writes, but the app should only have to write once and do a flush to detect an unresponsive other side. At the very least, this is a technique I have consistently seen recommended, and has always worked for me. ----Scott.

Jonathan Franklin

5:08 p.m.

On Mon, Dec 14, 2009 at 10:07 PM, Scott Gifford <sgifford@suspectclass.com> wrote:

...

Jonathan Franklin <franklin.jonathan@gmail.com> writes:

...
If the remote app closes unexpectedly, your read will return EOF.

...
From when I tested this a while back, my notes indicate I got one of these errors for an unclean shutdown (timeout or a TCP RST packet):

On a read, ASIO will immediately give you an EOF. On a write, you will eventually get an error code that indicates the remote socket has closed. Something along the lines of: "An established connection was aborted by the software in your host machine"

...

...
Yeah, it's not reliable, and can take *many* writes.

The TCP stack may have to do many writes, but the app should only have to write once and do a flush to detect an unresponsive other side.

The application *may* have to do many writes. Your mileage will vary, depending on many variables, such as whether the remote host is on the same network, etc. IME, if the remote process lives on the same box, a single application call to write() will most likely immediately detect the downed host. If the remote process lives on another box on the same network, then it will most likely take at least *two* application writes to detect, even when using TCP keep-alives. Note that if I can't always detect a downed host with a single call (e.g. to write()), then I don't consider it a reliable method.

...

At the very least, this is a technique I have consistently seen recommended, and has always worked for me.

The only reliable way to detect the remote host dropping off, is to attempt a read. This is the consistent recommendation by socket programming experts. Please refer to _Effective TCP/IP Programming_, Tip 19; and Stevens. I have attached some toy applications that you can play with. Experiment with running the server on the local host, another box on the same network, and if possible, out on the internet. Jon

Scott Gifford

5:25 p.m.

Jonathan Franklin <franklin.jonathan@gmail.com> writes: [...]

...

...
At the very least, this is a technique I have consistently seen recommended, and has always worked for me.

The only reliable way to detect the remote host dropping off, is to attempt a read. This is the consistent recommendation by socket programming experts. Please refer to _Effective TCP/IP Programming_, Tip 19; and Stevens.

Sure, I agree completely that attempting a read is necessary, but IME it is not sufficient. You must additionally send data, either with an OS write or TCP keepalive, to detect a completely unresponsive peer (i.e. one which has fallen off the network). The only way to detect an unresponsive peer is via a timeout, and with no data to send, there is nothing to time out. I also agree that write() won't always return an error, but it should attempt to send data, which will cause the TCP layer to wait for an acknowledgement of that data. If that times out, the TCP layer should detect an error on the socket, and a subsequent call to read() should return an error. ----Scott.
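
The sequence described here (send something, then let the read side deliver the verdict) might be sketched as below. This is an illustration using POSIX calls rather than Asio. `probe_peer`, `PeerState` and the one-byte `'P'` ping are hypothetical, and the peer is assumed to answer a ping with at least one byte. `MSG_NOSIGNAL` is Linux-specific.

```cpp
#include <cassert>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

enum class PeerState { Alive, Closed, Silent };

// Hypothetical probe: send one application-level ping byte, then wait up to
// timeout_ms for the read side to report data, EOF or an error.
PeerState probe_peer(int fd, int timeout_ms) {
    assert(fd >= 0);
    const char ping = 'P';  // assumed no-op message understood by the peer
    // MSG_NOSIGNAL: a broken connection yields EPIPE instead of SIGPIPE.
    if (send(fd, &ping, 1, MSG_NOSIGNAL) < 0)
        return PeerState::Closed;

    // A successful send() only means the byte was queued locally, so the
    // answer has to come from the read side.
    pollfd pfd{fd, POLLIN, 0};
    int ready = poll(&pfd, 1, timeout_ms);
    if (ready == 0) return PeerState::Silent;  // no reply and no error yet
    if (ready < 0) return PeerState::Closed;   // simplification: no EINTR retry

    char reply;
    ssize_t n = recv(fd, &reply, 1, 0);
    return n > 0 ? PeerState::Alive : PeerState::Closed;  // 0 is EOF, <0 an error
}
```

`Silent` is deliberately separate from `Closed`: the caller decides how many silent probes to tolerate before giving up on the connection.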

Jonathan Franklin

6:22 p.m.

On Tue, Dec 15, 2009 at 10:25 AM, Scott Gifford <sgifford@suspectclass.com> wrote:

...

Jonathan Franklin <franklin.jonathan@gmail.com> writes: Sure, I agree completely that attempting a read is necessary, but IME it is not sufficient. You must additionally send data, either with an OS write or TCP keepalive, to detect a completely unresponsive peer (i.e. one which has fallen off the network). The only way to detect an unresponsive peer is via a timeout, and with no data to send, there is nothing to time out.

I also agree that write() won't always return an error, but it should attempt to send data, which will cause the TCP layer to wait for an acknowledgement of that data. If that times out, the TCP layer should detect an error on the socket, and a subsequent call to read() should return an error.

I think we're agreeing, but not being clear enough for each other, or the OP. The OP is interested in detecting a closed (or crashed) remote socket. There are 2 scenarios to consider:

1. The OP has no control over the application protocol, and there is no application-level ping or ACK mechanism built-in. In this case, the application cannot send any data outside of the "normal" operation (e.g. can't actively try to detect whether the remote host is still there). The application must rely on read() returning an EOF when it is notified that the remote socket has closed (e.g. by the remote system, the TCP keep-alive mechanism, an ICMP message from an intermediate router, etc). If the remote host is hard-down (blue screened, cable cut, etc), and there is no TCP keep-alive, then you're pretty much hosed. The only possibility would be to add an application-level timeout to the read, e.g. reset your timer each time you read data. Kill the socket when the timeout occurs. However, this may not be an option for your use case.

2. There is an application-level ping/ACK mechanism available (the OP may need to add it). In this case, the "ping" is sent to the hard-down remote host. The write() call will not fail, and it may take many write() calls to generate a failure. However, as soon as the TCP stack times out the send (right about when the writes will begin to fail), the read() call will immediately return an EOF.

In neither case can one rely on write() failing. In case 2, one *can* rely on the read() eventually returning EOF. The worst-case scenario for case 1 will never detect the downed remote host. However, attempting to send data in case 1 under "normal" operation will generate an EOF from read(), but not a failure in write().

I prefer timing out "inactive" connections to sending "heart-beat" messages, when possible. Jon
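
The application-level read timeout from scenario 1 (reset the timer on every read, kill the socket when it fires) comes down to a small piece of bookkeeping. The sketch below shows only that bookkeeping, using the standard library. `IdleTimer` and its member names are hypothetical, and the current time is passed in explicitly so the logic can be exercised without waiting.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: tracks how long a connection has been silent.
class IdleTimer {
public:
    using Clock = std::chrono::steady_clock;

    IdleTimer(Clock::duration limit, Clock::time_point now)
        : limit_(limit), last_activity_(now) {
        assert(limit > Clock::duration::zero());
    }

    // Call whenever a read delivers data: restarts the idle period.
    void on_data(Clock::time_point now) { last_activity_ = now; }

    // True once no data has arrived for at least the configured limit;
    // the caller is then expected to close the socket.
    bool expired(Clock::time_point now) const {
        return now - last_activity_ >= limit_;
    }

private:
    Clock::duration limit_;
    Clock::time_point last_activity_;
};
```

In an Asio program the same idea would typically be expressed with a timer object such as `boost::asio::deadline_timer` that is re-armed from each read handler, with the timer's handler closing the socket.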

Scott Gifford

16 Dec

3:26 a.m.

Jonathan Franklin <franklin.jonathan@gmail.com> writes:

...

On Tue, Dec 15, 2009 at 10:25 AM, Scott Gifford <sgifford@suspectclass.com> wrote:

...
Sure, I agree completely that attempting a read is necessary [...] I also agree that write() won't always return an error, [...]

...

I think we're agreeing, but not being clear enough for each other, or the OP.

Yes, everything you say in this message is spot on. Hopefully this exchange has been helpful to the OP at least. :-) -----Scott.

Nicola Michael Gutberlet

17 Dec

8:44 a.m.

Thank you all for your explanations. They are very helpful so I think I'm going to get it implemented a sensible way. Best regards, Nicola

Jonathan Franklin

15 Dec

4:03 a.m.

On Mon, Dec 14, 2009 at 8:50 PM, Scott Gifford <sgifford@suspectclass.com> wrote:

...

Jonathan Franklin <franklin.jonathan@gmail.com> writes:

...
On Mon, Dec 14, 2009 at 10:52 AM, Nicola Michael Gutberlet wrote:

You generally have to do a read on the socket to reliably detect when the other side has closed (or crashed).

And a write, too, or else use TCP keepalives.

Unfortunately, write isn't reliable since it will always succeed until you fill up your send buffer. Jon

Nicola Michael Gutberlet

9:26 a.m.

First of all: thanks for all the fast responses - this mailing list seems to work properly ;-) Scott Gifford:

...

[...] or else use TCP keepalives.

Can you briefly explain how to use "keepalives" and how they work / react in case of a closed other side? Don't worry, I'm also going to check out "keepalives" on my own. But help is always welcome ;-) Thanks, Nicola

Scott Gifford

4 p.m.

"Nicola Michael Gutberlet" <nicola.gutberlet@hhi.fraunhofer.de> writes:

...

First of all: thanks for all the fast responses - this mailing list seems to work properly ;-)

Scott Gifford:

...
[...] or else use TCP keepalives.

Can you briefly explain how to use "keepalives" and how they work / react in case of a closed other side?

See this example: http://www.boost.org/doc/libs/1_41_0/doc/html/boost_asio/reference/socket_ba...

Basically, with this option set, your TCP implementation will periodically send a packet that effectively has zero length but requires acknowledgement from the other side, triggering the timeout counter to start. How frequently keepalive packets are sent is determined by your TCP implementation; it is typically set on a system-wide basis, and the idle time before the first probe commonly defaults to about two hours.

The effect of this is the same as you writing to the stream periodically, except that the other side doesn't see any writes at the application layer. That is, you still have to read from the socket to detect an error.

If the application protocol you are using on top of TCP supports it, sending some kind of no-op command periodically will give you a bit more control over timing.

Hope this helps! ----Scott.
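
For reference, a sketch of what enabling keepalives looks like at the socket-option level, using POSIX calls rather than Asio. `enable_keepalive` and `keepalive_enabled` are hypothetical helper names, and the per-socket timing options shown are Linux-specific; other systems use different names or only the system-wide setting. In Asio the on/off switch is the `boost::asio::socket_base::keep_alive` option passed to the socket's `set_option()`.

```cpp
#include <cassert>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

// Hypothetical helper: turns on keepalive probes for a TCP socket and, on
// Linux, overrides the system-wide timing for this socket only.
// idle_s: idle seconds before the first probe; interval_s: seconds between
// probes; probes: unanswered probes before the connection is dropped.
bool enable_keepalive(int fd, int idle_s, int interval_s, int probes) {
    assert(fd >= 0);
    int on = 1;
    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on) != 0)
        return false;
#ifdef __linux__
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof idle_s) != 0 ||
        setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval_s, sizeof interval_s) != 0 ||
        setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &probes, sizeof probes) != 0)
        return false;
#else
    (void)idle_s; (void)interval_s; (void)probes;  // system defaults apply
#endif
    return true;
}

// Reads the SO_KEEPALIVE flag back so the setting can be checked.
bool keepalive_enabled(int fd) {
    int value = 0;
    socklen_t len = sizeof value;
    return getsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &value, &len) == 0 &&
           value != 0;
}
```

With, say, `enable_keepalive(fd, 60, 10, 5)`, a dead peer would be noticed after roughly 60 + 5 * 10 seconds of silence on Linux, and the failure still only surfaces through a pending or subsequent read.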
