
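    Some notes and reflections, revisiting two questions: (1) the heartbeat mechanism, and (2) implementing a distributed fair lock with Redis.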




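    With long-lived connections, the client and the server are not exchanging data all the time. If the two sides have not communicated for a long while, neither knows the other's current state. Each side therefore sends a very small packet that tells the peer "I am alive", which also shows the peer that the service is still able to process the liveness-check logic.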







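    1 The client opens a long-lived connection to the server. When Mina creates the connection it wraps it in an IoSession and adds every session that needs idle detection to the sessions collection of IdleStatusChecker (the idle-state checker).

    2 The client periodically sends the heartbeat message MSG_HEART_BEATING to the server.

    3 On receiving MSG_HEART_BEATING, the server replies to the client with an ACK.

    4 Every 1 second, IdleStatusChecker iterates over all connections in sessions and uses the current system time to compute each connection's idle time.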

public void run() {
    thread = Thread.currentThread();
    try {
        while (!cancelled) {
            // Check idleness with fixed delay (1 second).
            long currentTime = System.currentTimeMillis();
            notifySessions(currentTime);
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                // will exit the loop if interrupted from interrupt()
            }
        }
    } finally {
        thread = null;
    }
}

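    5 IdleStatusChecker checks whether the elapsed time has reached the configured timeout and, if it has, the connection is closed from the server side. The excerpt below is the write-timeout check, which closes the session when a pending write has not completed within the write timeout.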

private static void notifyWriteTimeout(IoSession session, long currentTime) {
    long writeTimeout = session.getConfig().getWriteTimeoutInMillis();
    if ((writeTimeout > 0)
            && (currentTime - session.getLastWriteTime() >= writeTimeout)
            && !session.getWriteRequestQueue().isEmpty(session)) {
        WriteRequest request = session.getCurrentWriteRequest();
        if (request != null) {
            session.setCurrentWriteRequest(null);
            WriteTimeoutException cause = new WriteTimeoutException(request);
            request.getFuture().setException(cause);
            session.getFilterChain().fireExceptionCaught(cause);
            // WriteException is an IOException, so we close the session.
            session.close(true);
        }
    }
}

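    6 When the session is disconnected, the application's IoHandlerAdapter implementation receives the sessionClosed event, and the business side runs its client-offline logic.

    7 Once the client has recovered, it reconnects.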
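
    The idle check in steps 4 and 5 can be sketched without Mina. The class below is a hypothetical in-memory model, not Mina's IdleStatusChecker: the name IdleTracker, the string session ids and the 3000 ms timeout are assumptions made for illustration. It records the last activity time per session and, on each sweep, returns the sessions whose idle time has reached the timeout.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory idle tracker; it only mirrors the idea of the steps above.
class IdleTracker {
    private final long idleTimeoutMillis;
    // session id -> timestamp (ms) of the last message seen on that session
    private final Map<String, Long> lastActivity = new ConcurrentHashMap<>();

    IdleTracker(long idleTimeoutMillis) {
        this.idleTimeoutMillis = idleTimeoutMillis;
    }

    // Called when a session opens or any message (including a heartbeat) arrives.
    void touch(String sessionId, long nowMillis) {
        lastActivity.put(sessionId, nowMillis);
    }

    // One sweep of the checker: remove and return every session whose idle time
    // has reached the timeout. A real server would close these connections.
    List<String> sweep(long nowMillis) {
        List<String> expired = new ArrayList<>();
        for (Map.Entry<String, Long> e : lastActivity.entrySet()) {
            if (nowMillis - e.getValue() >= idleTimeoutMillis) {
                expired.add(e.getKey());
            }
        }
        for (String id : expired) {
            lastActivity.remove(id);
        }
        return expired;
    }

    public static void main(String[] args) {
        IdleTracker tracker = new IdleTracker(3000);
        tracker.touch("client-a", 0);
        tracker.touch("client-b", 0);
        tracker.touch("client-a", 2000);          // heartbeat from client-a at t = 2 s
        System.out.println(tracker.sweep(3000));  // prints [client-b]
    }
}
```

    Passing the clock value in as a parameter keeps the sweep deterministic; a production checker would read the system clock from a background thread, as the Mina loop above does.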




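    Recall ReentrantLock from the JUC package. It implements an exclusive lock and supports reentrant acquisition, fair and non-fair modes, and a way to interrupt threads that are waiting for the lock (including a timeout while waiting to acquire). Let us start from these features.

    1 Reentrant acquisition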



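    ReentrantLock builds on AQS. When the lock is free (state is 0), the state is updated with CAS. When the thread requesting the lock is the thread that already holds it, the AQS state is simply incremented with setState; no CAS is needed on that path because only the owner can reach it.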

protected final boolean tryAcquire(int acquires) {
    final Thread current = Thread.currentThread();
    int c = getState();
    if (c == 0) {
        if (!hasQueuedPredecessors() &&
            compareAndSetState(0, acquires)) {
            setExclusiveOwnerThread(current);
            return true;
        }
    }
    else if (current == getExclusiveOwnerThread()) {
        int nextc = c + acquires;
        if (nextc < 0)
            throw new Error("Maximum lock count exceeded");
        setState(nextc);
        return true;
    }
    return false;
}
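
    The effect of the re-entry branch can be observed from the outside through getHoldCount(). The following is a small sketch using only java.util.concurrent; the class name ReentrancyDemo is made up for the example.

```java
import java.util.Arrays;
import java.util.concurrent.locks.ReentrantLock;

class ReentrancyDemo {
    // Acquires the same lock twice on one thread and records the hold count
    // seen after each lock()/unlock() call.
    static int[] holdCounts() {
        ReentrantLock lock = new ReentrantLock(true); // true = fair mode
        int[] counts = new int[4];
        lock.lock();
        counts[0] = lock.getHoldCount(); // first acquisition
        lock.lock();
        counts[1] = lock.getHoldCount(); // re-entry by the owning thread
        lock.unlock();
        counts[2] = lock.getHoldCount(); // one hold released, lock still owned
        lock.unlock();
        counts[3] = lock.getHoldCount(); // fully released
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(holdCounts())); // prints [1, 2, 1, 0]
    }
}
```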

    2 Fair lock | non-fair lock



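    ReentrantLock relies on the FIFO queue that AQS builds as a doubly linked list of Node objects. A thread that fails to grab the lock first enters this queue and waits. The difference between FairSync and NonfairSync is that when FairSync's tryAcquire finds the AQS state equal to 0, it first checks whether any thread is already waiting in the queue; if one is, it gives up trying to grab the lock.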

if (c == 0) {
    if (!hasQueuedPredecessors() &&
        compareAndSetState(0, acquires)) {
        setExclusiveOwnerThread(current);
        return true;
    }
}
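
    To see the queueing behaviour from the outside, the sketch below holds a fair lock on the calling thread, queues three waiter threads behind it one at a time, and records the order in which they acquire the lock after it is released. The class name FairOrderDemo and the use of getQueueLength() to wait until each thread has been enqueued are choices made for this example.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.locks.ReentrantLock;

class FairOrderDemo {
    // Returns the ids of the waiter threads in the order they acquired the lock.
    static List<Integer> acquisitionOrder() {
        ReentrantLock lock = new ReentrantLock(true); // true selects the fair mode
        List<Integer> order = new CopyOnWriteArrayList<>();
        Thread[] waiters = new Thread[3];
        try {
            lock.lock(); // hold the lock so that every waiter has to queue
            try {
                for (int i = 0; i < waiters.length; i++) {
                    final int id = i + 1;
                    waiters[i] = new Thread(() -> {
                        lock.lock();
                        try {
                            order.add(id);
                        } finally {
                            lock.unlock();
                        }
                    });
                    waiters[i].start();
                    // Wait until this waiter is in the queue before starting the next one.
                    while (lock.getQueueLength() < id) {
                        Thread.sleep(1);
                    }
                }
            } finally {
                lock.unlock();
            }
            for (Thread t : waiters) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(acquisitionOrder()); // prints [1, 2, 3]
    }
}
```

    Queued threads are always served in FIFO order; what the fair mode adds is that a thread arriving while others are queued may not jump ahead of them.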

    3 Interrupting a client that is waiting for the lock




public boolean tryLock(long timeout, TimeUnit unit)
        throws InterruptedException {
    return sync.tryAcquireNanos(1, unit.toNanos(timeout));
}

private boolean doAcquireNanos(int arg, long nanosTimeout)
        throws InterruptedException {
    if (nanosTimeout <= 0L)
        return false;
    final long deadline = System.nanoTime() + nanosTimeout;
    final Node node = addWaiter(Node.EXCLUSIVE);
    boolean failed = true;
    try {
        // spin, retrying the acquire
        for (;;) {
            final Node p = node.predecessor();
            if (p == head && tryAcquire(arg)) {
                setHead(node);
                p.next = null; // help GC
                failed = false;
                return true;
            }
            nanosTimeout = deadline - System.nanoTime();
            if (nanosTimeout <= 0L)
                return false;
            // avoid a busy loop: if the deadline has not passed, park the thread for a while
            if (shouldParkAfterFailedAcquire(p, node) &&
                nanosTimeout > spinForTimeoutThreshold)
                LockSupport.parkNanos(this, nanosTimeout);
            if (Thread.interrupted())
                throw new InterruptedException();
        }
    } finally {
        if (failed)
            cancelAcquire(node);
    }
}
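
    From the caller's point of view, the timed path behaves as in the sketch below: while another thread holds the lock, tryLock with a timeout returns false once the deadline passes, and it succeeds after the holder releases. The class name TimedTryLockDemo and the 100 ms timeout are arbitrary choices for the example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class TimedTryLockDemo {
    // Returns { result while another thread holds the lock, result after release }.
    static boolean[] run() {
        ReentrantLock lock = new ReentrantLock(true);
        CountDownLatch held = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread holder = new Thread(() -> {
            lock.lock();
            try {
                held.countDown();
                release.await(); // keep the lock until told to let go
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.unlock();
            }
        });
        try {
            holder.start();
            held.await(); // make sure the holder owns the lock first
            boolean whileHeld = lock.tryLock(100, TimeUnit.MILLISECONDS);    // times out
            release.countDown();
            holder.join();
            boolean afterRelease = lock.tryLock(100, TimeUnit.MILLISECONDS); // lock is free
            if (afterRelease) {
                lock.unlock();
            }
            return new boolean[] { whileHeld, afterRelease };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println(r[0] + " " + r[1]); // prints false true
    }
}
```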



    Data structures: the lock resource stored under a Hash key, a wait queue waitList, and a timeout set timeoutZSet.

    (1) Scenario 1: the decision flow that determines whether a client can acquire the lock

    [Figure: flowchart for scenario 1]

    (2) Scenario 2: what a newly arrived client does after the scenario 1 check shows that it cannot acquire the lock

    [Figure: flowchart for scenario 2]

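    We commonly use the Redisson component (https://github.com/redisson/redisson ) to implement distributed locks, but older versions of Redisson only provided a non-fair lock. Today I found that version 3.13.3 provides a basic fair lock implementation, and the latest version at the time of writing, 3.13.6, adds handling for timeouts while waiting for the lock.

    The fair lock acquisition is implemented in the org.redisson.RedissonFairLock class. The Lua script it uses to acquire the fair lock is shown below.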

<T> RFuture<T> tryLockInnerAsync(long waitTime, long leaseTime, TimeUnit unit, long threadId, RedisStrictCommand<T> command) {
    internalLockLeaseTime = unit.toMillis(leaseTime);
    long wait = threadWaitTime;
    if (waitTime != -1) {
        wait = unit.toMillis(waitTime);
    }
    long currentTime = System.currentTimeMillis();
    if (command == RedisCommands.EVAL_LONG) {
        return evalWriteAsync(getName(), LongCodec.INSTANCE, command,
            // remove stale threads
            "while true do " +
                "local firstThreadId2 = redis.call('lindex', KEYS[2], 0);" +
                "if firstThreadId2 == false then " +
                    "break;" +
                "end;" +
                "local timeout = tonumber(redis.call('zscore', KEYS[3], firstThreadId2));" +
                "if timeout <= tonumber(ARGV[4]) then " +
                    // remove the item from the queue and timeout set
                    // NOTE we do not alter any other timeout
                    "redis.call('zrem', KEYS[3], firstThreadId2);" +
                    "redis.call('lpop', KEYS[2]);" +
                "else " +
                    "break;" +
                "end;" +
            "end;" +

            // check if the lock can be acquired now
            "if (redis.call('exists', KEYS[1]) == 0) " +
                "and ((redis.call('exists', KEYS[2]) == 0) " +
                    "or (redis.call('lindex', KEYS[2], 0) == ARGV[2])) then " +
                // remove this thread from the queue and timeout set
                "redis.call('lpop', KEYS[2]);" +
                "redis.call('zrem', KEYS[3], ARGV[2]);" +
                // decrease timeouts for all waiting in the queue
                "local keys = redis.call('zrange', KEYS[3], 0, -1);" +
                "for i = 1, #keys, 1 do " +
                    "redis.call('zincrby', KEYS[3], -tonumber(ARGV[3]), keys[i]);" +
                "end;" +
                // acquire the lock and set the TTL for the lease
                "redis.call('hset', KEYS[1], ARGV[2], 1);" +
                "redis.call('pexpire', KEYS[1], ARGV[1]);" +
                "return nil;" +
            "end;" +

            // check if the lock is already held, and this is a re-entry
            "if redis.call('hexists', KEYS[1], ARGV[2]) == 1 then " +
                "redis.call('hincrby', KEYS[1], ARGV[2],1);" +
                "redis.call('pexpire', KEYS[1], ARGV[1]);" +
                "return nil;" +
            "end;" +

            // the lock cannot be acquired
            // check if the thread is already in the queue
            "local timeout = redis.call('zscore', KEYS[3], ARGV[2]);" +
            "if timeout ~= false then " +
                // the real timeout is the timeout of the prior thread
                // in the queue, but this is approximately correct, and
                // avoids having to traverse the queue
                "return timeout - tonumber(ARGV[3]) - tonumber(ARGV[4]);" +
            "end;" +

            // add the thread to the queue at the end, and set its timeout in the timeout set to the timeout of
            // the prior thread in the queue (or the timeout of the lock if the queue is empty) plus the
            // threadWaitTime
            "local lastThreadId = redis.call('lindex', KEYS[2], -1);" +
            "local ttl;" +
            "if lastThreadId ~= false and lastThreadId ~= ARGV[2] then " +
                "ttl = tonumber(redis.call('zscore', KEYS[3], lastThreadId)) - tonumber(ARGV[4]);" +
            "else " +
                "ttl = redis.call('pttl', KEYS[1]);" +
            "end;" +
            "local timeout = ttl + tonumber(ARGV[3]) + tonumber(ARGV[4]);" +
            "if redis.call('zadd', KEYS[3], timeout, ARGV[2]) == 1 then " +
                "redis.call('rpush', KEYS[2], ARGV[2]);" +
            "end;" +
            "return ttl;",
            Arrays.asList(getName(), threadsQueueName, timeoutSetName),
            internalLockLeaseTime, getLockName(threadId), wait, currentTime);
    }
    throw new IllegalArgumentException();
}

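
    The control flow of the script is easier to follow in a single-process model. The sketch below is not Redisson code: FairLockModel, its fields and the unlock method are hypothetical, and plain Java collections stand in for the three Redis keys (the lock hash, the wait list and the timeout sorted set). It mirrors the script's branches in order: drop stale waiters at the head, acquire if the lock is free and the caller is first in line, handle re-entry, report the remaining wait for a caller that is already queued, and otherwise append the caller to the queue.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of the three keys used by the script.
class FairLockModel {
    private String holder;       // KEYS[1]: field of the lock hash (null = key absent)
    private int holdCount;       // KEYS[1]: re-entry counter stored in that field
    private long lockExpiresAt;  // KEYS[1]: absolute expiry implied by PEXPIRE
    private final Deque<String> queue = new ArrayDeque<>();     // KEYS[2]: wait list
    private final Map<String, Long> timeouts = new HashMap<>(); // KEYS[3]: member -> score

    // Returns null when the lock was acquired, otherwise a wait hint in milliseconds.
    Long tryAcquire(String id, long leaseMs, long waitMs, long now) {
        if (holder != null && now >= lockExpiresAt) { // Redis would expire the key itself
            holder = null;
            holdCount = 0;
        }
        // Drop waiters at the head of the queue whose timeout has passed.
        while (!queue.isEmpty() && timeouts.get(queue.peekFirst()) <= now) {
            timeouts.remove(queue.pollFirst());
        }
        // The lock is free and the caller is first in line (or nobody is waiting).
        if (holder == null && (queue.isEmpty() || queue.peekFirst().equals(id))) {
            queue.pollFirst();
            timeouts.remove(id);
            timeouts.replaceAll((member, score) -> score - waitMs);
            holder = id;
            holdCount = 1;
            lockExpiresAt = now + leaseMs;
            return null;
        }
        // Re-entry by the current holder.
        if (id.equals(holder)) {
            holdCount++;
            lockExpiresAt = now + leaseMs;
            return null;
        }
        // Already queued: report an approximate remaining wait.
        Long existing = timeouts.get(id);
        if (existing != null) {
            return existing - waitMs - now;
        }
        // Not queued yet: append behind the last waiter, or behind the lock's own TTL.
        String last = queue.peekLast();
        long ttl = (last != null && !last.equals(id))
                ? timeouts.get(last) - now
                : lockExpiresAt - now;
        timeouts.put(id, ttl + waitMs + now);
        queue.addLast(id);
        return ttl;
    }

    // Simplified release written for this sketch; it is not taken from the library.
    void unlock(String id) {
        if (id.equals(holder) && --holdCount == 0) {
            holder = null;
        }
    }
}
```

    For example, with a 30000 ms lease and a 5000 ms wait time: A acquires at t = 0; B asks at t = 1000 and is queued with a hint of 29000; C asks at t = 2000 and is queued behind B with a hint of 33000. After A unlocks, a retry by C at t = 3000 still fails because B is at the head of the queue, while B's retry at the same instant acquires the lock.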