Linux TCP session stays ESTABLISHED when interface is removed

Hi all, I'm experiencing an odd phenomenon when using Boost ASIO 1.46.1. (I know it's older, but this was inherited.) Under normal operation, everything is fine. I can create a TCP server socket, listen, and accept as expected. If I disconnect the TCP client gracefully, everything resets as expected. However, if I physically remove the network interface -- such as yanking the USB adapter out of its socket -- the TCP session still reports as ESTABLISHED via netstat and the application receives no read error. I'm at a bit of a loss how this is possible when the underlying OS shows that the interface no longer exists yet the socket is blocking away happily. Has anyone else experienced similar symptoms? Any advice? Thanks in advance, Chris

On 9/26/2013 9:36 AM, Quoth Chris Verges:
> However, if I physically remove the network interface -- such as yanking the USB adapter out of its socket -- the TCP session still reports as ESTABLISHED via netstat and the application receives no read error.
> I'm at a bit of a loss how this is possible when the underlying OS shows that the interface no longer exists yet the socket is blocking away happily. Has anyone else experienced similar symptoms? Any advice?
This is normal for TCP connections. If you only have outstanding read operations you cannot detect non-graceful shutdowns. However the next time you attempt to write to the socket you will get a disconnection error. One strategy you can use is to have a timer that periodically writes a "null message" (whatever that happens to be according to the application protocol in use) if nothing else has been written to the port recently (i.e. reset the timer whenever you do a write). You'll want to set the timer long enough to not waste bandwidth or get in the way of normal communication but short enough that you can detect disconnection within a reasonable time. Exactly how long that will be depends on your use case. :)
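The timer strategy described above can be sketched roughly as follows. This is only the bookkeeping half: `HeartbeatPolicy` is a hypothetical helper name, not part of Boost.Asio, and the sketch assumes the application calls `on_write()` after every successful write. With Asio, the check would typically be driven from a `boost::asio::deadline_timer` handler that writes the null message when `heartbeat_due()` returns true and then re-arms the timer for `time_until_due()`.

```cpp
#include <chrono>

// Hypothetical helper (not an Asio type): tracks the most recent write and
// reports when a "null message" is due because the connection has been idle.
class HeartbeatPolicy {
public:
    using Clock = std::chrono::steady_clock;

    explicit HeartbeatPolicy(std::chrono::seconds idle_limit,
                             Clock::time_point now = Clock::now())
        : idle_limit_(idle_limit), last_write_(now) {}

    // Call after every successful write, whether real data or a heartbeat.
    // This is the "reset the timer whenever you do a write" step.
    void on_write(Clock::time_point now = Clock::now()) { last_write_ = now; }

    // True when nothing has been written for at least idle_limit.
    bool heartbeat_due(Clock::time_point now = Clock::now()) const {
        return now - last_write_ >= idle_limit_;
    }

    // How long a timer should wait before the next heartbeat could be due.
    Clock::duration time_until_due(Clock::time_point now = Clock::now()) const {
        const Clock::duration elapsed = now - last_write_;
        return elapsed >= idle_limit_ ? Clock::duration::zero()
                                      : Clock::duration(idle_limit_) - elapsed;
    }

private:
    std::chrono::seconds idle_limit_;
    Clock::time_point last_write_;
};
```

Passing the current time as a parameter keeps the policy independent of any particular timer and makes it easy to exercise with fixed time points.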

On Thu, Sep 26, 2013 at 11:40:56AM +1200, Gavin Lambert wrote:
> On 9/26/2013 9:36 AM, Quoth Chris Verges:
> > However, if I physically remove the network interface -- such as yanking the USB adapter out of its socket -- the TCP session still reports as ESTABLISHED via netstat and the application receives no read error.
> > I'm at a bit of a loss how this is possible when the underlying OS shows that the interface no longer exists yet the socket is blocking away happily. Has anyone else experienced similar symptoms? Any advice?
> This is normal for TCP connections. If you only have outstanding read operations you cannot detect non-graceful shutdowns.
Thanks for the quick reply and insight. I'm floored that removing the physical interface and/or IP address from the system doesn't cause the related socket to close. I now understand that this type of functionality is more of a kernel mod than a Boost one.
> However the next time you attempt to write to the socket you will get a disconnection error.
Having a "null message" (nop, heartbeat, whatever) should do what is desired, agreed. However, the link is a pay-per-byte thing. I'd prefer to not pay for null. :-) Do you happen to know if writing a zero-length message would be enough to trigger the socket layers to reset without forcing data across the physical layer? (I'm in the middle of coding it up as a test, but figured I'd ask just in case you happened to know.) Much appreciation for your reply, Chris

On 25 Sep 2013 at 17:21, Chris Verges wrote:
> Do you happen to know if writing a zero-length message would be enough to trigger the socket layers to reset without forcing data across the physical layer? (I'm in the middle of coding it up as a test, but figured I'd ask just in case you happened to know.)
Much better to turn on TCP keepalive (it's one of the socket options). Then you'll learn when the transport has gone away. Niall -- Currently unemployed and looking for work. Work Portfolio: http://careers.stackoverflow.com/nialldouglas/

On Wed, Sep 25, 2013 at 8:00 PM, Niall Douglas wrote:
> On 25 Sep 2013 at 17:21, Chris Verges wrote:
> > Do you happen to know if writing a zero-length message would be enough to trigger the socket layers to reset without forcing data across the physical layer? (I'm in the middle of coding it up as a test, but figured I'd ask just in case you happened to know.)
> Much better to turn on TCP keepalive (it's one of the socket options). Then you'll learn when the transport has gone away.
Only after the keepalive timer has fired. I believe that on current Linux you won't have to endure the FIN_WAIT_1 and FIN_WAIT_2 if the interface itself is gone. But if you're hoping for a quick notification that the interface is gone, keepalive alone won't do it for you. -- Chris Cleeland

On 9/26/2013 1:00 PM, Quoth Niall Douglas:
> Much better to turn on TCP keepalive (it's one of the socket options). Then you'll learn when the transport has gone away.
Yes and no. It's cleaner (invisible at the application layer, so won't confuse your protocol) but you have less control over it (and the defaults usually still take quite a long time to detect a disconnection). So if you're paying for the link itself it might be more expensive. If you're only paying for the application-layer data then it'll be cheaper.

On Thu, Sep 26, 2013 at 02:20:08PM +1200, Gavin Lambert wrote:
> So if you're paying for the link itself it might be more expensive. If you're only paying for the application-layer data then it'll be cheaper.
As it sounds like Boost Asio just follows whatever the OS provides, I've kicked this over to the netdev mailing list for further discussion on the underlying mechanisms. For anyone interested, you can follow here: http://marc.info/?l=linux-netdev&m=138017549432302&w=2 Thanks to everyone who has responded. Chris

Hi
> Yes and no. It's cleaner (invisible at the application layer, so won't confuse your protocol) but you have less control over it (and the defaults usually still take quite a long time to detect a disconnection).
You don't have less control over that. But you need to write platform-specific code to configure it. Under Windows as well as under Linux you can set the idle time and the probe interval per connection. Linux also lets you specify the retry counter per connection.
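For the Linux side, a minimal sketch of that platform-specific configuration might look like the following. It works on a raw socket descriptor; with Boost.Asio the basic on/off switch is the `boost::asio::socket_base::keep_alive` option, while the timings would have to be set on the socket's native descriptor. `KeepaliveSettings`, `enable_keepalive` and `worst_case_detection_seconds` are hypothetical names chosen for illustration, and the detection-time formula is an approximation.

```cpp
#include <netinet/in.h>   // IPPROTO_TCP
#include <netinet/tcp.h>  // TCP_KEEPIDLE, TCP_KEEPINTVL, TCP_KEEPCNT (Linux-specific)
#include <sys/socket.h>   // setsockopt, SOL_SOCKET, SO_KEEPALIVE

// Hypothetical settings bundle; the field names are illustrative only.
struct KeepaliveSettings {
    int idle_seconds;      // idle time before the first probe is sent
    int interval_seconds;  // gap between unanswered probes
    int probe_count;       // unanswered probes before the connection is dropped
};

// Enables keepalive on a TCP socket and applies per-socket timings.
// Returns true only if every setsockopt call succeeded; errno holds the cause otherwise.
inline bool enable_keepalive(int fd, const KeepaliveSettings& s) {
    const int on = 1;
    return setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on) == 0
        && setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE,
                      &s.idle_seconds, sizeof s.idle_seconds) == 0
        && setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL,
                      &s.interval_seconds, sizeof s.interval_seconds) == 0
        && setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT,
                      &s.probe_count, sizeof s.probe_count) == 0;
}

// Approximate time from the last traffic until a dead peer is reported:
// the idle period, then probe_count probes spaced interval_seconds apart.
inline int worst_case_detection_seconds(const KeepaliveSettings& s) {
    return s.idle_seconds + s.interval_seconds * s.probe_count;
}
```

With the usual Linux system-wide defaults (7200 s idle, 75 s interval, 9 probes) this estimate comes to 7875 seconds, a bit over two hours, which is why the defaults take so long to notice a dead link. Something like `{60, 10, 3}` would bring it down to about 90 seconds at the cost of more probes on an idle link.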
> So if you're paying for the link itself it might be more expensive. If you're only paying for the application-layer data then it'll be cheaper.
Definitely not. Every byte you write will result in a TCP packet containing at least that byte and all header information. And it will be acknowledged by the peer. TCP keepalive sends packets without payload, so at least one byte less. So unless you send a zero-byte message, you will create more traffic. Zero-byte messages however will be ignored by the network stack. Basically TCP keepalive is the method of choice for minimal traffic - as long as you stick to TCP... Regards, Steffen
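To put rough numbers on that comparison, here is a small arithmetic sketch. It assumes IPv4 and TCP headers without options (20 bytes each) and ignores link-layer framing; real stacks often add TCP options such as timestamps, so the absolute figures are illustrative only. The function names are hypothetical.

```cpp
#include <cstddef>

// Assumed header sizes: IPv4 and TCP without options; link-layer framing ignored.
constexpr std::size_t kIpv4HeaderBytes = 20;
constexpr std::size_t kTcpHeaderBytes  = 20;

// Bytes at the IP layer for one segment carrying `payload` bytes of data.
constexpr std::size_t segment_bytes(std::size_t payload) {
    return kIpv4HeaderBytes + kTcpHeaderBytes + payload;
}

// One exchange: the segment we send plus the peer's payload-free acknowledgement.
constexpr std::size_t exchange_bytes(std::size_t payload) {
    return segment_bytes(payload) + segment_bytes(0);
}

static_assert(exchange_bytes(1) == 81, "one-byte application heartbeat");
static_assert(exchange_bytes(0) == 80, "payload-free keepalive probe");
```

Under these assumptions an application-level heartbeat of N bytes costs N bytes more per exchange than a keepalive probe, and the headers dominate either way.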
participants (5): Chris Cleeland, Chris Verges, Gavin Lambert, Niall Douglas, Steffen Heil (Mailinglisten)