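Problem description

Colleagues from another business line found that their application in the test environment could never obtain a Redis connection, so I helped take a look. First, the application error log: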

Caused by: org.springframework.data.redis.RedisConnectionFailureException: Cannot get Jedis connection; nested exception is redis.clients.jedis.exceptions.JedisConnectionException: Could not get a resource from the pool
    at org.springframework.data.redis.connection.jedis.JedisConnectionFactory.fetchJedisConnector(JedisConnectionFactory.java:97)
    at org.springframework.data.redis.connection.jedis.JedisConnectionFactory.getConnection(JedisConnectionFactory.java:143)
    at org.springframework.data.redis.connection.jedis.JedisConnectionFactory.getConnection(JedisConnectionFactory.java:41)
    at org.springframework.data.redis.core.RedisConnectionUtils.doGetConnection(RedisConnectionUtils.java:85)
    at org.springframework.data.redis.core.RedisConnectionUtils.getConnection(RedisConnectionUtils.java:55)
    at org.springframework.data.redis.core.RedisTemplate.execute(RedisTemplate.java:169)
    at org.springframework.data.redis.core.RedisTemplate.execute(RedisTemplate.java:149)
    ... 76 more
Caused by: redis.clients.jedis.exceptions.JedisConnectionException: Could not get a resource from the pool
    at redis.clients.util.Pool.getResource(Pool.java:22)
    at org.springframework.data.redis.connection.jedis.JedisConnectionFactory.fetchJedisConnector(JedisConnectionFactory.java:90)
    ... 83 more
Caused by: java.util.NoSuchElementException: Could not create a validated object, cause: ValidateObject failed
    at org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:871)
    at redis.clients.util.Pool.getResource(Pool.java:20)
    ... 84 more

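Investigation

Identifying the environment

The application connects to the Redis server through spring-data-redis, with Jedis underneath. Nobody had touched this system's code for a long time and the details had been forgotten, so I started by checking which jar versions were in use. To list the relevant jars loaded by the application process: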

lsof -p 19377 | grep -E "jedis|pool|redis"

The output included commons-pool-1.3.jar, spring-data-redis-1.1.1.RELEASE.jar and jedis-2.1.0.jar. I then looked through the relevant commons-pool code:

try {
    _factory.activateObject(latch.getPair().value);
    if(_testOnBorrow &&
            !_factory.validateObject(latch.getPair().value)) {
        throw new Exception("ValidateObject failed");
    }
    synchronized(this) {
        _numInternalProcessing--;
        _numActive++;
    }
    return latch.getPair().value;
}
catch (Throwable e) {
    PoolUtils.checkRethrow(e);
    // object cannot be activated or is invalid
    try {
        _factory.destroyObject(latch.getPair().value);
    } catch (Throwable e2) {
        PoolUtils.checkRethrow(e2);
        // cannot destroy broken object
    }
    synchronized (this) {
        _numInternalProcessing--;
        if (!newlyCreated) {
            latch.reset();
            _allocationQueue.add(0, latch);
        }
        allocate();
    }
    if(newlyCreated) {
        throw new NoSuchElementException("Could not create a validated object, cause: " + e.getMessage());
    }
    else {
        continue; // keep looping
    }
}

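So the client must have testOnBorrow enabled, and the failure happened while validating the connection.

There are several Java clients for Redis. This project uses spring-data-redis, which itself supports different client implementations such as Jedis and Lettuce. The error log shows that the implementation in use is Jedis. Following the source from JedisConnectionFactory, the code that validates pooled objects is defined in JedisPool: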
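
To make the borrow path easier to follow, here is a minimal sketch of validate-on-borrow. It is not the commons-pool implementation: the class ValidatingPool and its methods are hypothetical names made up for illustration, and only the control flow (validate, discard on failure, throw if the object was newly created, otherwise keep looping) is taken from the excerpt above.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.NoSuchElementException;
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Hypothetical, heavily simplified pool; illustrates testOnBorrow only. */
final class ValidatingPool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final Supplier<T> factory;
    private final Predicate<T> validator;
    private final boolean testOnBorrow;

    ValidatingPool(Supplier<T> factory, Predicate<T> validator, boolean testOnBorrow) {
        this.factory = factory;
        this.validator = validator;
        this.testOnBorrow = testOnBorrow;
    }

    /** Puts an object back into the idle queue. */
    void giveBack(T obj) {
        idle.push(obj);
    }

    T borrow() {
        while (true) {
            boolean newlyCreated = idle.isEmpty();
            T candidate = newlyCreated ? factory.get() : idle.pop();
            if (!testOnBorrow || validator.test(candidate)) {
                return candidate;
            }
            // Validation failed, so the candidate is dropped here.
            if (newlyCreated) {
                // Even a brand-new object is unusable: give up, as in the excerpt.
                throw new NoSuchElementException(
                        "Could not create a validated object, cause: ValidateObject failed");
            }
            // A stale idle object was discarded; loop and try the next one.
        }
    }

    public static void main(String[] args) {
        // A validator that always fails stands in for a PING that never returns PONG.
        ValidatingPool<String> pool =
                new ValidatingPool<String>(() -> "conn", c -> false, true);
        try {
            pool.borrow();
        } catch (NoSuchElementException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a validator that always fails, every borrow ends in the same NoSuchElementException seen at the bottom of the stack trace.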

public boolean validateObject(final Object obj) {
    if (obj instanceof Jedis) {
        final Jedis jedis = (Jedis) obj;
        try {
            return jedis.isConnected() && jedis.ping().equals("PONG");
        } catch (final Exception e) {
            return false;
        }
    } else {
        return false;
    }
}
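
At the protocol level this check comes down to whether the reply line is the simple string +PONG. The sketch below works on a raw reply line rather than on a Jedis object; PingReplyCheck and isHealthy are hypothetical names, and the only facts assumed are the RESP conventions that a simple string starts with '+' and an error reply starts with '-'. As far as I can tell, in the real client an error reply surfaces as an exception from ping(), which the catch block above turns into false.

```java
/** Hypothetical helper: decides whether a raw RESP reply line is a healthy PING reply. */
final class PingReplyCheck {

    static boolean isHealthy(String rawReplyLine) {
        if (rawReplyLine == null) {
            return false;
        }
        // Strip the trailing \r\n that terminates every RESP line.
        String line = rawReplyLine.trim();
        // Only the simple string "+PONG" counts; any "-..." error reply fails the check.
        return line.equals("+PONG");
    }

    public static void main(String[] args) {
        System.out.println(isHealthy("+PONG\r\n"));
        System.out.println(isHealthy("-ERR example error reply\r\n"));
    }
}
```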

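Inspecting the TCP packets with Wireshark and pinning down the cause

Anyone familiar with Redis knows that when a client sends "PING" the server answers with "PONG", and that this exchange is commonly used to check that a connection is alive. Since validation was failing, the next step was to capture packets and look at the request and response.

First list the network interfaces with ip a, then use tcpdump to capture traffic on port 6379 of the eth1 interface: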

tcpdump -i eth1 port 6379 -w target.cap

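Finally, analyze target.cap with Wireshark; the Redis plugin for Wireshark helps here. Searching around the timestamp printed in the application error log, I found that the client (the application server) had sent an RST packet to the server (the Redis server).

That looked suspicious, so I checked the packets just before it.

As the capture shows, the client sent a PING command just above the arrow, and the packet at the arrow should have been a PONG in response. Instead, the server returned the following: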

MISCONF Redis is configured to save RDB snapshots, but is currently not able to persist on disk. Commands that may modify the data set are disabled. Please check Redis logs for details about the error.

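In other words, the Redis server is configured to persist RDB snapshots but currently cannot write them to disk, so commands that may modify the data set are disabled. (Reading the source, redis-3.2.9 server.c, shows that PING is rejected along with the write commands, while read commands should not be affected.) The following code from processCommand(client *c) in redis-3.2.9 server.c handles the case where persistence has failed: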

/* Don't accept write commands if there are problems persisting on disk
     * and if this is a master instance. */
    if (((server.stop_writes_on_bgsave_err &&
          server.saveparamslen > 0 &&
          server.lastbgsave_status == C_ERR) ||
          server.aof_last_write_status == C_ERR) &&
        server.masterhost == NULL &&
        (c->cmd->flags & CMD_WRITE ||
         c->cmd->proc == pingCommand))
    {
        flagTransaction(c);
        if (server.aof_last_write_status == C_OK)
            addReply(c, shared.bgsaveerr);
        else
            addReplySds(c,
                sdscatprintf(sdsempty(),
                "-MISCONF Errors writing to the AOF file: %s\r\n",
                strerror(server.aof_last_write_errno)));
        return C_OK;
    }
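
The condition is dense, so here it is restated as a small Java predicate. This is only a sketch: WriteGate and its parameter names are hypothetical, each boolean stands in for the corresponding server field or command flag in the C excerpt, and nothing beyond that excerpt is modeled.

```java
/** Hypothetical restatement of the rejection condition from processCommand. */
final class WriteGate {

    static boolean rejects(boolean stopWritesOnBgsaveErr, int saveParamsLen,
                           boolean lastBgsaveFailed, boolean aofLastWriteFailed,
                           boolean isMaster, boolean isWriteCommand, boolean isPing) {
        // Persistence is considered broken if the last RDB save failed (and RDB
        // saving is configured and enforced) or the last AOF write failed.
        boolean persistenceBroken =
                (stopWritesOnBgsaveErr && saveParamsLen > 0 && lastBgsaveFailed)
                        || aofLastWriteFailed;
        // Only a master rejects, and only write commands or PING.
        return persistenceBroken && isMaster && (isWriteCommand || isPing);
    }

    public static void main(String[] args) {
        // Master with a failed RDB save: PING is rejected, a read such as GET is not.
        System.out.println(rejects(true, 1, true, false, true, false, true));
        System.out.println(rejects(true, 1, true, false, true, false, false));
    }
}
```

This makes the consequence for the connection pool explicit: a read-only command would still succeed, but the PING used for validation does not.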

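After that, the client sent a QUIT command and the server replied OK, so the quit succeeded. The MISCONF error says that something went wrong while persisting the RDB snapshot, so I checked the disk usage and the Redis log on the Redis server. Sure enough, the disk was out of space.

At this point the problem was essentially solved: the disk on the Redis server was full. Because it is a test server, no disk monitoring had been set up. Freeing up some space restored service.

Understanding the RST packet

One question remained, though: why was there an RST packet at all? Without it the problem would have been harder to spot. I could still have found it by walking through the Redis packets one by one around the timestamp in the error log, but the RST caught my attention right away and let me locate the problem much faster.

A first look at RST

So why was there an RST packet? First, some background on RST (see TCP/IP Illustrated, Volume 1, Section 18.7, "Reset Segments"). In summary, an RST is generated in any of the following situations: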
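
Short of full disk monitoring, the same condition can be spotted from the server's own status. The sketch below parses the text returned by the Redis INFO command and looks at the rdb_last_bgsave_status and aof_last_write_status fields of its persistence section. PersistenceStatus is a hypothetical helper, and the sample text in main is hand-written for illustration, not output captured from the server in this incident.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper that inspects the text returned by the Redis INFO command. */
final class PersistenceStatus {

    /** Splits "key:value" lines into a map, skipping blank lines and "# Section" headers. */
    static Map<String, String> parseInfo(String infoText) {
        Map<String, String> fields = new HashMap<>();
        for (String line : infoText.split("\r?\n")) {
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            int colon = line.indexOf(':');
            if (colon > 0) {
                fields.put(line.substring(0, colon), line.substring(colon + 1).trim());
            }
        }
        return fields;
    }

    /** True if the last RDB save or the last AOF write is reported as failed. */
    static boolean persistenceFailing(String infoText) {
        Map<String, String> fields = parseInfo(infoText);
        return "err".equals(fields.get("rdb_last_bgsave_status"))
                || "err".equals(fields.get("aof_last_write_status"));
    }

    public static void main(String[] args) {
        // Hand-written sample in the shape of the INFO persistence section.
        String sample = "# Persistence\r\n"
                + "rdb_last_bgsave_status:err\r\n"
                + "aof_last_write_status:ok\r\n";
        System.out.println(persistenceFailing(sample));
    }
}
```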

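  • A connection request to a nonexistent port
  • An aborted connection (abortive release)
  • Detection of a half-open connection

How Jedis and Redis close a connection

Looking at the packets just before the RST

Using Wireshark's Expert Information to look at several RST packets, I found that every RST was preceded by a QUIT/OK exchange, which suggested the cause was at the framework level. Going back to the GenericObjectPool code above: when an exception occurs in borrowObject, it calls destroyObject(). That method is left for the concrete factory to implement, which here is the JedisPool mentioned earlier.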
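
The first case is easy to reproduce from Java. The sketch below finds a loopback port that has just been released and then tries to connect to it; Java does not expose the RST itself, so the assumption here is that the refused connection (which is how the RST surfaces to the application) shows up as a ConnectException. ClosedPortProbe and its method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ConnectException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

/** Hypothetical probe: connecting to a port nobody listens on is answered with an RST. */
final class ClosedPortProbe {

    /** Binds an ephemeral loopback port and releases it, so nothing should listen there. */
    static int findClosedPort() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            return server.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** True if the connection attempt is refused. */
    static boolean isRefused(int port) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(InetAddress.getLoopbackAddress(), port), 1000);
            return false; // something accepted the connection
        } catch (ConnectException e) {
            return true;  // the peer answered the SYN with a reset
        } catch (IOException e) {
            return false; // timeout or another failure, not a refusal
        }
    }

    public static void main(String[] args) {
        System.out.println(isRefused(findClosedPort()));
    }
}
```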

public void destroyObject(final Object obj) throws Exception {
    if (obj instanceof Jedis) {
        final Jedis jedis = (Jedis) obj;
        if (jedis.isConnected()) {
            try {
                try {
                    jedis.quit();
                } catch (Exception e) {
                }
                jedis.disconnect();
            } catch (Exception e) {

            }
        }
    }
}

This ends up calling disconnect in redis.clients.jedis.Connection, which closes the input and output streams:

public void disconnect() {
    if (isConnected()) {
        try {
            inputStream.close();
            outputStream.close();
            if (!socket.isClosed()) {
                socket.close();
            }
        } catch (IOException ex) {
            throw new JedisConnectionException(ex);
        }
    }
}

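This seemed to explain the RST packet: the client sends QUIT and the server replies OK. (Once the client has received the reply to QUIT it calls disconnect, which tears down the connection.) Immediately afterwards the server starts the TCP close and sends a FIN to port 51311 on the client, the port used in the earlier exchange, but the client has already called disconnect and dropped its side of the connection. The client can only answer with an RST, in effect telling the server "you sent a close request to a port that no longer exists".

Newer versions of Jedis have moved the code that used to live in JedisPool into JedisFactory, but the logic is largely unchanged: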

// 2.10 JedisFactory
@Override
  public void destroyObject(PooledObject<Jedis> pooledJedis) throws Exception {
    final BinaryJedis jedis = pooledJedis.getObject();
    if (jedis.isConnected()) {
      try {
        try {
          jedis.quit();
        } catch (Exception e) {
        }
        jedis.disconnect();
      } catch (Exception e) {

      }
    }
  }

@Override
public boolean validateObject(PooledObject<Jedis> pooledJedis) {
  final BinaryJedis jedis = pooledJedis.getObject();
  try {
    HostAndPort hostAndPort = this.hostAndPort.get();

    String connectionHost = jedis.getClient().getHost();
    int connectionPort = jedis.getClient().getPort();

    return hostAndPort.getHost().equals(connectionHost)
        && hostAndPort.getPort() == connectionPort && jedis.isConnected()
        && jedis.ping().equals("PONG");
  } catch (final Exception e) {
    return false;
  }
}

The Connection.disconnect that is ultimately called has changed, however:

public void disconnect() {
  if (isConnected()) {
    try {
      outputStream.flush();
      socket.close();
    } catch (IOException ex) {
      broken = true;
      throw new JedisConnectionException(ex);
    } finally {
      IOUtils.closeQuietly(socket);
    }
  }
}

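The earlier inputStream.close() and outputStream.close() calls were replaced with outputStream.flush(). The reason is that Jedis uses its own buffered RedisOutputStream, so the buffered content has to be written to the stream before socket.close(). Calling disconnect on the client does release resources quickly: it closes the client port and returns the file descriptor. Consider the alternative: if the server has already released its file descriptor and closed the socket after QUIT, but the client never calls disconnect, the client keeps holding its resources until the process exits. The figure below confirms this. In the first run I commented out the code in disconnect that closes the socket and let the program sleep for 10 seconds before exiting; the client connection was closed only when the process exited. In the second run I restored the commented-out code, and the client closed the connection and released its resources right after QUIT.

System calls when a Redis connection is opened and closed

This question bothered me for a whole day: where did the RST packet actually come from? Whether it is the client or the server that calls close, shouldn't the normal four-way close take place? I reread the Redis server code that closes a client connection (redis 3.2.9, networking.c#unlinkClient) several times, and all it does is call close(fd). To rule out interference I even started a fresh Redis instance and traced the system calls around the close with strace -f -p $pid -tt -T: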
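
The need for flush() before close is easy to see with any buffered stream. In the sketch below a java.io.BufferedOutputStream stands in for Jedis's RedisOutputStream and a ByteArrayOutputStream stands in for the socket; both substitutions are assumptions made so that the example runs without a Redis server, and FlushBeforeClose is a hypothetical name.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical demo: bytes still sitting in a buffer never reach the wire. */
final class FlushBeforeClose {

    /** Returns how many bytes reached the underlying stream when it is torn down. */
    static int bytesDelivered(boolean flushFirst) {
        ByteArrayOutputStream wire = new ByteArrayOutputStream(); // stands in for the socket
        BufferedOutputStream buffered = new BufferedOutputStream(wire, 8192);
        try {
            // The 14-byte QUIT command as it appears on the wire.
            buffered.write("*1\r\n$4\r\nQUIT\r\n".getBytes(StandardCharsets.US_ASCII));
            if (flushFirst) {
                buffered.flush();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The "socket" is inspected here without closing the buffered wrapper,
        // which is what closing the socket directly amounts to.
        return wire.size();
    }

    public static void main(String[] args) {
        System.out.println("without flush: " + bytesDelivered(false));
        System.out.println("with flush:    " + bytesDelivered(true));
    }
}
```

Without the flush nothing reaches the underlying stream; with it, all 14 bytes of the command do.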

[pid 25442] 10:29:42.299132 epoll_wait(3, {{EPOLLIN, {u32=4, u64=4}}}, 11024, 100) = 1 <0.004041>
[pid 25442] 10:29:42.303248 accept(4, {sa_family=AF_INET, sin_port=htons(52294), sin_addr=inet_addr("192.168.3.45")}, [16]) = 5 <0.000025>
[pid 25442] 10:29:42.303356 fcntl(5, F_GETFL) = 0x2 (flags O_RDWR) <0.000014>
[pid 25442] 10:29:42.303417 fcntl(5, F_SETFL, O_RDWR|O_NONBLOCK) = 0 <0.000010>
[pid 25442] 10:29:42.303456 setsockopt(5, SOL_TCP, TCP_NODELAY, [1], 4) = 0 <0.000012>
[pid 25442] 10:29:42.303499 epoll_ctl(3, EPOLL_CTL_ADD, 5, {EPOLLIN, {u32=5, u64=5}}) = 0 <0.000011>
[pid 25442] 10:29:42.303544 epoll_wait(3, {{EPOLLIN, {u32=5, u64=5}}}, 11024, 96) = 1 <0.073370>
[pid 25442] 10:29:42.376968 read(5, "*3\r\n$3\r\nSET\r\n$3\r\nfoo\r\n$3\r\nbar\r\n", 16384) = 31 <0.000014>
[pid 25442] 10:29:42.377071 epoll_ctl(3, EPOLL_CTL_MOD, 5, {EPOLLIN|EPOLLOUT, {u32=5, u64=5}}) = 0 <0.000013>
[pid 25442] 10:29:42.377144 epoll_wait(3, {{EPOLLOUT, {u32=5, u64=5}}}, 11024, 22) = 1 <0.000017>
[pid 25442] 10:29:42.377210 write(5, "+OK\r\n", 5) = 5 <0.000034>
[pid 25442] 10:29:42.377304 epoll_ctl(3, EPOLL_CTL_MOD, 5, {EPOLLIN, {u32=5, u64=5}}) = 0 <0.000025>
[pid 25442] 10:29:42.377377 epoll_wait(3, {{EPOLLIN, {u32=5, u64=5}}}, 11024, 22) = 1 <0.007943>
[pid 25442] 10:29:42.385376 read(5, "*2\r\n$3\r\nGET\r\n$3\r\nfoo\r\n", 16384) = 22 <0.000013>
[pid 25442] 10:29:42.385432 epoll_ctl(3, EPOLL_CTL_MOD, 5, {EPOLLIN|EPOLLOUT, {u32=5, u64=5}}) = 0 <0.000011>
[pid 25442] 10:29:42.385477 epoll_wait(3, {{EPOLLOUT, {u32=5, u64=5}}}, 11024, 14) = 1 <0.000010>
[pid 25442] 10:29:42.385518 write(5, "$3\r\nbar\r\n", 9) = 9 <0.000019>
[pid 25442] 10:29:42.385567 epoll_ctl(3, EPOLL_CTL_MOD, 5, {EPOLLIN, {u32=5, u64=5}}) = 0 <0.000011>
[pid 25442] 10:29:42.385617 epoll_wait(3, {}, 11024, 14) = 0 <0.014075>
[pid 25442] 10:29:42.399742 epoll_wait(3, {}, 11024, 100) = 0 <0.100126>
[pid 25442] 10:29:42.499930 epoll_wait(3, {}, 11024, 100) = 0 <0.100126>
[pid 25442] 10:29:42.600115 epoll_wait(3, {}, 11024, 100) = 0 <0.100071>
[pid 25442] 10:29:42.700276 epoll_wait(3, {}, 11024, 100) = 0 <0.100131>
[pid 25442] 10:29:42.800482 epoll_wait(3, {}, 11024, 100) = 0 <0.100129>
[pid 25442] 10:29:42.900687 epoll_wait(3, {}, 11024, 100) = 0 <0.100141>
[pid 25442] 10:29:43.000895 epoll_wait(3, {}, 11024, 100) = 0 <0.100132>
[pid 25442] 10:29:43.101095 epoll_wait(3, {}, 11024, 100) = 0 <0.100131>
[pid 25442] 10:29:43.201305 epoll_wait(3, {}, 11024, 100) = 0 <0.100134>
[pid 25442] 10:29:43.301521 epoll_wait(3, {}, 11024, 100) = 0 <0.100136>
[pid 25442] 10:29:43.401725 epoll_wait(3, {{EPOLLIN, {u32=5, u64=5}}}, 11024, 100) = 1 <0.003552>
[pid 25442] 10:29:43.405350 read(5, "*2\r\n$3\r\nGET\r\n$3\r\nfoo\r\n", 16384) = 22 <0.000016>
[pid 25442] 10:29:43.405425 epoll_ctl(3, EPOLL_CTL_MOD, 5, {EPOLLIN|EPOLLOUT, {u32=5, u64=5}}) = 0 <0.000011>
[pid 25442] 10:29:43.405477 epoll_wait(3, {{EPOLLOUT, {u32=5, u64=5}}}, 11024, 96) = 1 <0.000014>
[pid 25442] 10:29:43.405531 write(5, "$3\r\nbar\r\n", 9) = 9 <0.000022>
[pid 25442] 10:29:43.405601 epoll_ctl(3, EPOLL_CTL_MOD, 5, {EPOLLIN, {u32=5, u64=5}}) = 0 <0.000011>
[pid 25442] 10:29:43.405660 epoll_wait(3, {}, 11024, 96) = 0 <0.096129>
[pid 25442] 10:29:43.501877 epoll_wait(3, {{EPOLLIN, {u32=5, u64=5}}}, 11024, 100) = 1 <0.003474>
[pid 25442] 10:29:43.505429 read(5, "*1\r\n$4\r\nQUIT\r\n", 16384) = 14 <0.000018>
[pid 25442] 10:29:43.505514 epoll_ctl(3, EPOLL_CTL_MOD, 5, {EPOLLIN|EPOLLOUT, {u32=5, u64=5}}) = 0 <0.000015>
[pid 25442] 10:29:43.505578 epoll_wait(3, {{EPOLLOUT, {u32=5, u64=5}}}, 11024, 96) = 1 <0.000012>
[pid 25442] 10:29:43.505623 write(5, "+OK\r\n", 5) = 5 <0.000028>
[pid 25442] 10:29:43.505693 epoll_ctl(3, EPOLL_CTL_MOD, 5, {EPOLLIN, {u32=5, u64=5}}) = 0 <0.000016>
[pid 25442] 10:29:43.505764 epoll_ctl(3, EPOLL_CTL_DEL, 5, {0, {u32=5, u64=5}}) = 0 <0.000016>
[pid 25442] 10:29:43.505830 close(5)    = 0 <0.000111>
[pid 25442] 10:29:43.505992 epoll_wait(3, {}, 11024, 96) = 0 <0.096134>

The Java client JUnit test code (adapted from the Jedis test case JedisPoolTest#checkConnections):

JedisPool pool = new JedisPool(new JedisPoolConfig(), hnp.getHost(), hnp.getPort(), 2000);
Jedis jedis = pool.getResource();
jedis.set("foo", "bar");
assertEquals("bar", jedis.get("foo"));
pool.returnResource(jedis);

try {
    Thread.sleep(1 * 1000);
} catch (InterruptedException e) {
    e.printStackTrace();
}
System.out.println("hello");
jedis.get("foo");
pool.destroy();
assertTrue(pool.isClosed());

Looking at the system calls on the server side:

setsockopt(5, SOL_TCP, TCP_NODELAY, [1], 4) = 0 ... close(5) = 0

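The server only sets TCP_NODELAY on the socket when the connection is established, which disables Nagle's algorithm.

Socket settings in the Jedis client

Just when I was stuck, it occurred to me that the Redis client might be setting some socket options. Sure enough, in redis.clients.jedis.Connection, the class that manages the connection in Jedis, I found the socket settings applied at connect time: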

public void connect() {
    if (!isConnected()) {
      try {
        socket = new Socket();
        // ->@wjw_add
        socket.setReuseAddress(true);
        socket.setKeepAlive(true); // Will monitor the TCP connection is
        // valid
        socket.setTcpNoDelay(true); // Socket buffer Whetherclosed, to
        // ensure timely delivery of data
        socket.setSoLinger(true, 0); // Control calls close () method,
        // the underlying socket is closed
        // immediately
        // <-@wjw_add

        socket.connect(new InetSocketAddress(host, port), connectionTimeout);
        socket.setSoTimeout(soTimeout);

        if (ssl) {
          if (null == sslSocketFactory) {
            sslSocketFactory = (SSLSocketFactory)SSLSocketFactory.getDefault();
          }
          socket = (SSLSocket) sslSocketFactory.createSocket(socket, host, port, true);
          if (null != sslParameters) {
            ((SSLSocket) socket).setSSLParameters(sslParameters);
          }
          if ((null != hostnameVerifier) &&
              (!hostnameVerifier.verify(host, ((SSLSocket) socket).getSession()))) {
            String message = String.format(
                "The connection to '%s' failed ssl/tls hostname verification.", host);
            throw new JedisConnectionException(message);
          }
        }

        outputStream = new RedisOutputStream(socket.getOutputStream());
        inputStream = new RedisInputStream(socket.getInputStream());
      } catch (IOException ex) {
        broken = true;
        throw new JedisConnectionException("Failed connecting to host " 
            + host + ":" + port, ex);
      }
    }
  }

The call socket.setSoLinger(true, 0); caught my eye. The SCTP RFC explains SO_LINGER as follows:

If the l_linger value is set to 0, calling close() is the same as the ABORT primitive.

Reading on to SCTP_ABORT:

SCTP_ABORT: Setting this flag causes the specified association to abort by sending an ABORT message to the peer. The ABORT chunk will contain an error cause of 'User Initiated Abort' with cause code 12. The cause-specific information of this error cause is provided in msg_iov.

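That was still not very clear to me, so I looked at how Abort is described for TCP. The TCP RFC says:

This command causes all pending SENDs and RECEIVES to be aborted, the TCB to be removed, and a special RESET message to be sent to the TCP on the other side of the connection. Depending on the implementation, users may receive abort indications for each outstanding SEND or RECEIVE, or may simply receive an ABORT-acknowledgment.

Note: the TCB (Transmission Control Block) is an abstract control block.

The SO_LINGER socket option used to force an abortive close

Now it finally made sense. Because the Jedis client calls socket.setSoLinger(true, 0); when connecting, closing the connection is equivalent to a TCP Abort: all data still being sent or received is discarded and a RESET is sent straight to the peer. This is also why Jedis flushes its buffer before socket.close(), so that data in flight is not lost. When I removed the SO_LINGER setting from the client, I finally saw the normal TCP close handshake again.

Readers who want to dig deeper can read net/ipv4/tcp.c in the Linux source. I skimmed it and the logic is clear (details vary between kernel versions): if SO_LINGER is enabled with a linger time of 0, close calls tcp_disconnect directly and sends an RST instead of going through the usual four-way close. I find this somewhat inelegant; a more graceful approach might be socket.setSoLinger(true, timeout) with a timeout threshold. The Jedis GitHub issue "Improving socket performance" describes adding the following four settings to improve performance: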
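
The effect can be reproduced without Redis or Jedis. The sketch below opens a loopback connection, closes the client side with or without setSoLinger(true, 0), and reports what the accepting side observes. LingerCloseDemo and closeAndObserve are hypothetical names, and the mapping of an IOException on read to "reset" is an assumption about how the RST surfaces in Java; on the systems I know it appears as a SocketException.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

/** Hypothetical demo: SO_LINGER with timeout 0 turns close() into an abortive close. */
final class LingerCloseDemo {

    /** Returns "reset" if the peer sees the connection aborted, "eof" on a clean close. */
    static String closeAndObserve(boolean lingerZero) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, server.getLocalPort());
             Socket accepted = server.accept()) {
            accepted.setSoTimeout(2000); // never block forever in the demo
            if (lingerZero) {
                client.setSoLinger(true, 0); // the setting found in Jedis's connect()
            }
            client.close();
            try {
                int b = accepted.getInputStream().read();
                return b == -1 ? "eof" : "data"; // -1 means an orderly FIN was received
            } catch (SocketTimeoutException e) {
                return "timeout";
            } catch (IOException e) {
                return "reset"; // the read failed because the peer sent an RST
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("linger 0: " + closeAndObserve(true));
        System.out.println("default:  " + closeAndObserve(false));
    }
}
```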

socket.setReuseAddress(true);
socket.setKeepAlive(true);
socket.setTcpNoDelay(true);
socket.setSoLinger(true,0);

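I left a comment on the issue asking about this and will update this post when there is news.

Summary

In this incident the application's Jedis connection pool could not obtain Redis connections because the disk on the Redis server was full, so the RDB snapshot could not be saved. With testOnBorrow set to true, the application tests each connection with the Redis PING/PONG exchange, and it received MISCONF Redis is configured to save RDB snapshots instead of the normal PONG. Jedis therefore judged the connection invalid and forcibly closed it. After that I took a brief look at the TCP RST flag: once socket.setSoLinger(true, 0) is set, closing the socket discards pending data and sends an RST to the peer. There is plenty more to explore, and my own knowledge of network programming needs strengthening. I plan to fill in the gaps and then study some excellent open source projects such as Redis and nginx in more depth.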
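
For reference, the three SO_LINGER configurations discussed here can be told apart through the standard java.net.Socket API. The sketch below only sets and reads the option on an unconnected socket; LingerOptions is a hypothetical name, and the behaviour notes in the comments follow the usual socket semantics rather than anything measured in this post.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.Socket;

/** Hypothetical helper: shows what Socket.getSoLinger() reports for each configuration. */
final class LingerOptions {

    static int lingerAfter(boolean on, int seconds) {
        try (Socket socket = new Socket()) {
            socket.setSoLinger(on, seconds);
            return socket.getSoLinger(); // -1 means the option is disabled
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Disabled (the default): close() returns at once and the close is orderly.
        System.out.println(lingerAfter(false, 0));
        // Enabled with 0: close() discards unsent data and resets the connection.
        System.out.println(lingerAfter(true, 0));
        // Enabled with a timeout: close() may block up to 5 seconds while data drains.
        System.out.println(lingerAfter(true, 5));
    }
}
```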


References

  1. Jedis source https://github.com/xetorthio/jedis
  2. Commons-pool source https://github.com/apache/commons-pool
  3. Spring-data-redis source https://github.com/spring-projects/spring-data-redis
  4. redis-wireshark source https://github.com/jzwinck/redis-wireshark
  5. Redis source https://github.com/antirez/redis
  6. TCP/IP Illustrated, online e-book http://www.52im.net/topic-tcpipvol1.html
  7. SCTP RFC - https://tools.ietf.org/html/rfc6458
  8. TCP RFC - https://tools.ietf.org/html/rfc793
  9. Several situations in which RST appears on a TCP connection
  10. setsockopt()--Set Socket Options
  11. StackOverflow: What is AF_INET, and why do I need it?
  12. Socket options series: SO_LINGER (by the author of 深入剖析Nginx) - http://www.lenky.info/archives/2013/02/2220