Some background knowledge is needed to understand the answer below.
Am I performing the tests in a wrong way?
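Yes, your test is somewhat incorrect. The problem is that it sends 10 requests over a PERSISTENT connection. You can easily check this by running the following test, which will not produce any connection resets (because it sends only one request per connection):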
httperf --server=127.0.0.1 --port=80 --uri=/ --num-conns=10 --num-calls=1
Why am I getting this connection resets?
Old worker processes, receiving a command to shut down, stop accepting new connections and continue to service current requests until all such requests are serviced. After that, the old worker processes exit.
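This is true, but the documentation does not mention what happens to persistent connections. I found the answer in an old mailing list. After the currently running request has been served, nginx initiates the close of the persistent connection by sending [FIN, ACK] to the client.
To check this, I used WireShark and configured a simple worker server that sleeps for 5 seconds on each request and then replies. I used the following command to send the requests: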
httperf --server=127.0.0.1 --port=80 --uri=/ --num-conns=1 --num-calls=2
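For anyone who wants to reproduce this, a slow backend like the one described above could be sketched as follows. This is only a minimal stand-in, not the exact server used in the test: the names `SlowHandler` and `make_server` and port 8080 are assumptions, and nginx is assumed to be configured separately to proxy to it.

```python
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class SlowHandler(BaseHTTPRequestHandler):
    # HTTP/1.1 lets the connection stay open (keep-alive) between requests.
    protocol_version = "HTTP/1.1"
    # Assumed delay, matching the 5-second sleep described in the answer.
    delay_seconds = 5

    def do_GET(self):
        # Simulate a long-running request so a reload can happen mid-request.
        time.sleep(self.delay_seconds)
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        # Content-Length is required for keep-alive to work without chunking.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(host="127.0.0.1", port=8080):
    # Hypothetical helper; call make_server().serve_forever() to run it.
    return ThreadingHTTPServer((host, port), SlowHandler)
```

With this running behind nginx, the reload has to be issued within the 5-second window of the first request.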
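After issuing the command above, I reloaded nginx (while it was handling the first request). Here are the packets captured by WireShark:
> 3892-3894 – usual TCP connection establishment.
> 3895 – client sends the first request.
> 3896 – server acknowledges 3895.
> nginx reload is executed here.
> 4089 – server sends the response.
> 4090 – server sends the close-connection signal.
> 4091 – client acknowledges 4089.
> 4092 – client acknowledges 4090.
> 4093 – client sends the second request (WTF?)
> 4094 – client sends the close-connection signal.
> 4095 – server acknowledges 4093.
> 4096 – server acknowledges 4094.
It is fine that the server did not send any response to the second request. According to TCP connection termination: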
The side that has terminated can no longer send any data into the connection, but the other side can. The terminating side should continue reading the data until the other side terminates as well.
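The half-close behaviour in that quote can be seen on a loopback connection. The sketch below only illustrates the TCP semantics and is not a model of what nginx does internally: the function names are hypothetical, and `shutdown(SHUT_WR)` stands in for the side that sends FIN first.

```python
import socket


def read_until_eof(sock):
    # Collect data until recv() returns b"", i.e. the peer's FIN was seen.
    data = b""
    while True:
        chunk = sock.recv(1024)
        if not chunk:
            return data
        data += chunk


def half_close_demo():
    listener = socket.create_server(("127.0.0.1", 0))
    client = socket.create_connection(listener.getsockname())
    server, _ = listener.accept()
    try:
        server.sendall(b"response")
        # Server terminates its sending direction (FIN); it may still read.
        server.shutdown(socket.SHUT_WR)
        first = read_until_eof(client)
        # The client has already seen the FIN, yet it can still send data.
        client.sendall(b"second request")
        client.shutdown(socket.SHUT_WR)
        received = read_until_eof(server)
        return first, received
    finally:
        for s in (client, server, listener):
            s.close()
```

Here the terminating side keeps reading, so the late data arrives without error. A server that has fully closed its socket would not read it.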
The next question is why 4093 happened after the client had received the close-connection signal from the server.
I would say that the POST happens at the same time as the FIN, i.e. the client sent the POST because its TCP stack did not process the FIN from the server yet. Note that packet capturing is done before the data are processed by the system.
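I can't comment on this, since I am not a networking expert. Maybe someone else can give a more insightful answer as to why the second request was sent.
UPD: The previously linked question was not relevant. I asked a separate question about this problem.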
Is there a solution to this problem?
HTTP/1.1 clients are required to handle keepalive connection close, so this shouldn’t be a problem.
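I think it should be handled on the client side. If the connection is closed by the server, the client should open a new connection and retry the request.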
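A minimal sketch of that client-side handling, using Python's standard `http.client`, might look like the following. The function name `get_with_retry` and its parameters are hypothetical, and the retry is only assumed to be safe because GET is idempotent; it should not be applied blindly to requests such as POST.

```python
import http.client


def get_with_retry(host, port, path, conn=None, retries=1):
    """GET over a keep-alive connection, reconnecting if the server closed it.

    Returns (conn, status, body) so the caller can reuse the connection.
    """
    for attempt in range(retries + 1):
        if conn is None:
            conn = http.client.HTTPConnection(host, port, timeout=10)
        try:
            conn.request("GET", path)
            resp = conn.getresponse()
            return conn, resp.status, resp.read()
        except (http.client.RemoteDisconnected,
                ConnectionResetError, BrokenPipeError):
            # The server closed the persistent connection (for example
            # during a reload); drop it and try again on a fresh one.
            conn.close()
            conn = None
            if attempt == retries:
                raise
```

Passing the returned `conn` back into the next call reuses the persistent connection for as long as the server keeps it open.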
I actually need a load balancer which I can dynamically add and remove servers from it, any better solutions which fits my problem?
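I am not familiar with other servers, so I can't advise here.
Once your clients handle connection close correctly, there shouldn't be any reason preventing you from using nginx.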