SO_KEEPALIVE 保持连接检测对方主机是否崩溃


SO_KEEPALIVE

在《UNIX网络编程第1卷》中也有详细的阐述:

SO_KEEPALIVE 保持连接检测对方主机是否崩溃,避免(服务器)永远阻塞于TCP连接的输入。设置该选项后,如果2小时内在此套接口的任一方向都没有数据交换,TCP就自 动给对方 发一个保持存活探测分节(keepalive probe)。这是一个对方必须响应的TCP分节.它会导致以下三种情况:对方接收一切正常:以期望的ACK响应。2小时后,TCP将发出另一个探测分 节。对方已崩溃且已重新启动:以RST响应。套接口的待处理错误被置为ECONNRESET,套接 口本身则被关闭。对方无任何响应:源自berkeley的TCP发送另外8个探测分节,相隔75秒一个,试图得到一个响应。在发出第一个探测分节11分钟 15秒后若仍无响应就放弃。套接口的待处理错误被置为ETIMEOUT,套接口本身则被关闭。如ICMP错误是“host unreachable(主机不可达)”,说明对方主机并没有崩溃,但是不可达,这种情况下待处理错误被置为 EHOSTUNREACH。

根据上面的介绍我们可以知道对端以一种非优雅的方式断开连接的时候,我们可以设置SO_KEEPALIVE属性使得我们在2小时以后发现对方的TCP连接是否依然存在。   

keepAlive = 1;
Setsockopt(listenfd, SOL_SOCKET, SO_KEEPALIVE, (void*)&keepAlive, sizeof(keepAlive));

如果我们不能接受如此之长的等待时间,从TCP-Keepalive-HOWTO上可以知道一共有两种方式可以设置,一种是修改内核关于网络方面的 配置参数,另外一种就是SOL_TCP字段的TCP_KEEPIDLE, TCP_KEEPINTVL, TCP_KEEPCNT三个选项。

1) The tcp_keepidle parameter specifies the interval of inactivity that causes TCP to generate a KEEPALIVE transmission for an application that requests them. tcp_keepidle defaults to 14400 (two hours).
/*开始首次KeepAlive探测前的TCP空闭时间 */
 
2) The tcp_keepintvl parameter specifies the interval between the nine retries that are attempted if a KEEPALIVE transmission is not acknowledged. tcp_keepintvl defaults to 150 (75 seconds).
    /* 两次KeepAlive探测间的时间间隔  */    3) The tcp_keepcnt option specifies the maximum number of keepalive probes to be sent. The value of TCP_KEEPCNT is an integer value between 1 and n, where n is the value of the systemwide tcp_keepcnt parameter.
/* 判定断开前的KeepAlive探测次数
    
int                 keepIdle = 1000;
int                 keepInterval = 10;
int                 keepCount = 10;
Setsockopt(listenfd, SOL_TCP, TCP_KEEPIDLE, (void *)&keepIdle, sizeof(keepIdle));
Setsockopt(listenfd, SOL_TCP,TCP_KEEPINTVL, (void *)&keepInterval, sizeof(keepInterval));
Setsockopt(listenfd,SOL_TCP, TCP_KEEPCNT, (void *)&keepCount, sizeof(keepCount));

Remember that keepalive is not program?related, but socket?related, so if you have multiple sockets, you can handle keepalive for each of them separately.