Bringing a Repaired Master Back Online in a Redis Sentinel Cluster

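  • 1. Repair the failed master
  • 2. Check the recovered node's configuration file
  • 3. Check the recovered node's replication status
  • 4. Get the repaired node elected as master
  • 5. Check replication across the nodes
  • 6. Reset the priority to the default value
  • 7. Notes on Redis Sentinel clusters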



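In the previous article we saw that when the master fails, the Sentinel cluster elects a new master and reconfigures the other replicas to replicate from it. So what happens once the old master has been repaired and comes back online?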

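When the repaired master comes back online, it learns from Sentinel which node is the current master, starts replicating from that node as a replica, and has its configuration file updated automatically. Once replication has completed, how do we make the returning node the master again? We make it the only eligible candidate by setting the replica priority of the other nodes to 0, then trigger a new election; the original master is promoted again. After that, every priority is set back to the default.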

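Approach:

1. Bring the failed master back up
2. Check the current master-replica state and verify that data written while the old master was down has been synchronized to it
3. Adjust the priority values
4. Trigger a new election so that the original master becomes the master again
5. Once the recovered node is master again, reset every adjusted priority to the default value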

1. Repair the failed master

# 1. Start the repaired master and its Sentinel
[root@vm66-51 ~]# systemctl start redis-6379.service
[root@vm66-51 ~]# systemctl start redis-sentinel.service

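# 2. Watch the Sentinel log on either of the other two nodes; both print these lines, which show that 192.168.66.51 has rejoined the cluster
[root@vm66-52 ~]# tailf /opt/redis/26379/logs/26379.log
... (output omitted) ...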
20327:X 14 Mar 2022 21:45:37.557 # -sdown slave 192.168.66.51:6379 192.168.66.51 6379 @ myredis 192.168.66.52 6379
20327:X 14 Mar 2022 21:45:58.580 # -sdown sentinel 8478b39abc578f168e70765fbf106779772d0b6a 192.168.66.51 26379 @ myredis 192.168.66.52 6379
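
Instead of reading the log by eye, the `-sdown` events (subjectively-down state cleared) for the repaired node can be filtered out. This is only a minimal sketch: the function name `sdown_cleared` is hypothetical, and it assumes the log lines have exactly the field layout shown above.

```shell
# sdown_cleared IP: read Sentinel log lines on stdin and print
# "<instance type> <ip>:<port>" for every -sdown event whose instance
# address is IP. Assumes the log layout shown above, where field 7 is the
# event name and fields 10 and 11 are the instance IP and port.
sdown_cleared() {
    awk -v ip="$1" '$7 == "-sdown" && $10 == ip { print $8, $10 ":" $11 }'
}
```

Possible usage: `tail -n 200 /opt/redis/26379/logs/26379.log | sdown_cleared 192.168.66.51`.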

2. Check the recovered node's configuration file

[root@vm66-51 ~]# cat /opt/redis/26379/etc/26379.conf 
bind 192.168.66.51
port 26379
daemonize yes
logfile "/opt/redis/26379/logs/26379.log"
dir "/data/redis/26379"
sentinel myid 8478b39abc578f168e70765fbf106779772d0b6a
sentinel deny-scripts-reconfig yes
sentinel monitor myredis 192.168.66.52 6379 2  # <--- automatically rewritten to the address of the current master
sentinel down-after-milliseconds myredis 3000
# Generated by CONFIG REWRITE
maxclients 4064
protected-mode no
supervised systemd
sentinel failover-timeout myredis 18000
sentinel config-epoch myredis 1
sentinel leader-epoch myredis 1
sentinel known-replica myredis 192.168.66.51 6379
sentinel known-replica myredis 192.168.66.53 6379
sentinel known-sentinel myredis 192.168.66.53 26379 f73b697ac1a89f99fe35d12256875caecfc19d31
sentinel known-sentinel myredis 192.168.66.52 26379 f39473098df4bdfbc98b5a4a05a3b02bcf8c7381
sentinel current-epoch 1
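
To check which master a Sentinel has recorded without scanning the whole file, the `sentinel monitor` line can be pulled out directly. A small sketch; the helper name `monitored_master` is hypothetical.

```shell
# monitored_master NAME: read a Sentinel config on stdin and print the
# host:port recorded in its "sentinel monitor NAME host port quorum" line.
monitored_master() {
    awk -v name="$1" \
        '$1 == "sentinel" && $2 == "monitor" && $3 == name { print $4 ":" $5 }'
}
```

Possible usage: `monitored_master myredis < /opt/redis/26379/etc/26379.conf`.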

3. Check the recovered node's replication status

# 1. The node is already replicating from the current master
[root@vm66-51 ~]# redis-cli config get slaveof
1) "slaveof"
2) "192.168.66.52 6379"
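
`config get slaveof` only shows which master the node is configured to follow; whether the link is actually up is reported by `info replication`. The sketch below checks the `role`, `master_link_status` and `master_sync_in_progress` fields of that output. The function name `replica_link_up` is hypothetical.

```shell
# replica_link_up: read `redis-cli info replication` output on stdin and
# succeed only if the node is a replica, its link to the master is up, and
# no initial sync is still in progress. INFO lines end in CRLF, so the
# carriage returns are stripped first.
replica_link_up() {
    tr -d '\r' | awk -F: '
        $1 == "role"                    { role = $2 }
        $1 == "master_link_status"      { link = $2 }
        $1 == "master_sync_in_progress" { sync = $2 }
        END { exit !(role == "slave" && link == "up" && sync == "0") }'
}
```

Possible usage: `redis-cli -h 192.168.66.51 -p 6379 info replication | replica_link_up && echo "replication link is up"`.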

# 2. Check that data written to the promoted master while this node was down is present
[root@vm66-51 ~]# redis-cli get k1
"v1"

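4. Get the repaired node elected as master

Adjust the election priorities so that the repaired node becomes the master again.

Sentinel's choice starts with the replica priority (`slave-priority`, named `replica-priority` since Redis 5, with the old name still accepted): a replica whose priority is 0 is never promoted, and among the remaining replicas the one with the lower value is preferred. Setting the other two nodes to 0 therefore leaves the repaired node as the only candidate.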
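
As a rough model of how the candidate is picked: replicas with priority 0 are skipped, the lowest remaining priority wins, a larger replication offset breaks ties, and then the lexicographically smaller run ID. The sketch below is a simplification, not Sentinel's actual code: it ignores the health checks Sentinel applies first, and it uses the address in place of the run ID as the final tiebreaker. The name `pick_candidate` and the input format are made up for illustration.

```shell
# pick_candidate: read "addr priority repl_offset" lines (single spaces) on
# stdin and print the address that wins under the simplified ordering:
#   1. drop priority 0, 2. lowest priority, 3. highest offset,
#   4. smallest address (stand-in for the run ID).
pick_candidate() {
    awk '$2 != 0' \
        | sort -t ' ' -k2,2n -k3,3nr -k1,1 \
        | awk 'NR == 1 { print $1 }'
}
```

With the priorities used in this article (the repaired node at 100, the other two at 0), only 192.168.66.51:6379 survives the first filter, whatever the offsets are.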

# 1. Check the priority of the other two nodes
[root@vm66-51 ~]# redis-cli -h 192.168.66.52 -p 6379 config get slave-priority 
1) "slave-priority"
2) "100"
[root@vm66-51 ~]# redis-cli -h 192.168.66.53 -p 6379 config get slave-priority 
1) "slave-priority"
2) "100"

# 2. Set the priority of the other two nodes to 0 so that they cannot be promoted
[root@vm66-51 ~]# redis-cli -h 192.168.66.52 -p 6379 config set slave-priority 0
OK
[root@vm66-51 ~]# redis-cli -h 192.168.66.53 -p 6379 config set slave-priority 0
OK

# 3. Manually trigger a failover
[root@vm66-51 ~]# redis-cli -h 192.168.66.51 -p 26379 sentinel failover myredis
OK


# 4. Check the Sentinel log on each node
[root@vm66-51 ~]# tailf /opt/redis/26379/logs/26379.log 
26354:X 15 Mar 2022 04:43:21.285 # Executing user requested FAILOVER of 'myredis'
26354:X 15 Mar 2022 04:43:21.285 # +new-epoch 2
26354:X 15 Mar 2022 04:43:21.285 # +try-failover master myredis 192.168.66.52 6379
26354:X 15 Mar 2022 04:43:21.352 # +vote-for-leader 8478b39abc578f168e70765fbf106779772d0b6a 2
26354:X 15 Mar 2022 04:43:21.352 # +elected-leader master myredis 192.168.66.52 6379
26354:X 15 Mar 2022 04:43:21.352 # +failover-state-select-slave master myredis 192.168.66.52 6379
26354:X 15 Mar 2022 04:43:21.429 # +selected-slave slave 192.168.66.51:6379 192.168.66.51 6379 @ myredis 192.168.66.52 6379
26354:X 15 Mar 2022 04:43:21.429 * +failover-state-send-slaveof-noone slave 192.168.66.51:6379 192.168.66.51 6379 @ myredis 192.168.66.52 6379
26354:X 15 Mar 2022 04:43:21.481 * +failover-state-wait-promotion slave 192.168.66.51:6379 192.168.66.51 6379 @ myredis 192.168.66.52 6379
26354:X 15 Mar 2022 04:43:22.431 # +promoted-slave slave 192.168.66.51:6379 192.168.66.51 6379 @ myredis 192.168.66.52 6379
26354:X 15 Mar 2022 04:43:22.432 # +failover-state-reconf-slaves master myredis 192.168.66.52 6379
26354:X 15 Mar 2022 04:43:22.486 * +slave-reconf-sent slave 192.168.66.53:6379 192.168.66.53 6379 @ myredis 192.168.66.52 6379
26354:X 15 Mar 2022 04:43:23.531 * +slave-reconf-inprog slave 192.168.66.53:6379 192.168.66.53 6379 @ myredis 192.168.66.52 6379
26354:X 15 Mar 2022 04:43:23.531 * +slave-reconf-done slave 192.168.66.53:6379 192.168.66.53 6379 @ myredis 192.168.66.52 6379
26354:X 15 Mar 2022 04:43:23.616 # +failover-end master myredis 192.168.66.52 6379
26354:X 15 Mar 2022 04:43:23.616 # +switch-master myredis 192.168.66.52 6379 192.168.66.51 6379
26354:X 15 Mar 2022 04:43:23.617 * +slave slave 192.168.66.53:6379 192.168.66.53 6379 @ myredis 192.168.66.51 6379
26354:X 15 Mar 2022 04:43:23.617 * +slave slave 192.168.66.52:6379 192.168.66.52 6379 @ myredis 192.168.66.51 6379

[root@vm66-52 ~]# tailf /opt/redis/26379/logs/26379.log
20327:X 15 Mar 2022 04:30:05.609 # +tilt #tilt mode entered
20327:X 15 Mar 2022 04:30:35.669 # -tilt #tilt mode exited
20327:X 15 Mar 2022 04:43:22.002 # +new-epoch 2
20327:X 15 Mar 2022 04:43:22.489 # +config-update-from sentinel 8478b39abc578f168e70765fbf106779772d0b6a 192.168.66.51 26379 @ myredis 192.168.66.52 6379
20327:X 15 Mar 2022 04:43:22.489 # +switch-master myredis 192.168.66.52 6379 192.168.66.51 6379
20327:X 15 Mar 2022 04:43:22.490 * +slave slave 192.168.66.53:6379 192.168.66.53 6379 @ myredis 192.168.66.51 6379
20327:X 15 Mar 2022 04:43:22.490 * +slave slave 192.168.66.52:6379 192.168.66.52 6379 @ myredis 192.168.66.51 6379
20327:X 15 Mar 2022 04:43:32.611 * +convert-to-slave slave 192.168.66.52:6379 192.168.66.52 6379 @ myredis 192.168.66.51 6379

[root@vm66-53 ~]# tailf /opt/redis/26379/logs/26379.log 
5655:X 15 Mar 2022 00:00:43.437 # +tilt #tilt mode entered
5655:X 15 Mar 2022 04:29:25.399 # +tilt #tilt mode entered
5655:X 15 Mar 2022 04:29:55.460 # -tilt #tilt mode exited
5655:X 15 Mar 2022 04:30:05.594 # +tilt #tilt mode entered
5655:X 15 Mar 2022 04:30:35.658 # -tilt #tilt mode exited
5655:X 15 Mar 2022 04:43:22.002 # +new-epoch 2
5655:X 15 Mar 2022 04:43:22.492 # +config-update-from sentinel 8478b39abc578f168e70765fbf106779772d0b6a 192.168.66.51 26379 @ myredis 192.168.66.52 6379
5655:X 15 Mar 2022 04:43:22.492 # +switch-master myredis 192.168.66.52 6379 192.168.66.51 6379
5655:X 15 Mar 2022 04:43:22.493 * +slave slave 192.168.66.53:6379 192.168.66.53 6379 @ myredis 192.168.66.51 6379
5655:X 15 Mar 2022 04:43:22.493 * +slave slave 192.168.66.52:6379 192.168.66.52 6379 @ myredis 192.168.66.51 6379
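
The line that matters in all three logs is `+switch-master`, which carries the master name followed by the old and the new address. A small sketch for extracting it; the helper name `last_switch` is hypothetical, and the field positions assume the log layout shown above.

```shell
# last_switch NAME: read Sentinel log lines on stdin and print
# "old_host:old_port -> new_host:new_port" for the last +switch-master
# event of master NAME. Exits non-zero if no such event was seen.
last_switch() {
    awk -v name="$1" '
        $7 == "+switch-master" && $8 == name {
            s = $9 ":" $10 " -> " $11 ":" $12
        }
        END { if (s != "") print s; else exit 1 }'
}
```

Possible usage: `last_switch myredis < /opt/redis/26379/logs/26379.log`.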

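5. Check replication across the nodes

The master replicates from no one (its slaveof is empty); the other two nodes both replicate from the master at 192.168.66.51.

# 1. Check which node is the current master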
[root@vm66-51 ~]# redis-cli -h 192.168.66.51 -p 26379 sentinel get-master-addr-by-name myredis
1) "192.168.66.51"
2) "6379"

# 2. Check the slaveof setting on each node
[root@vm66-51 ~]# redis-cli -h 192.168.66.51 -p 6379 config get slaveof
1) "slaveof"
2) ""
[root@vm66-51 ~]# redis-cli -h 192.168.66.52 -p 6379 config get slaveof
1) "slaveof"
2) "192.168.66.51 6379"
[root@vm66-51 ~]# redis-cli -h 192.168.66.53 -p 6379 config get slaveof
1) "slaveof"
2) "192.168.66.51 6379"
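
It can also be worth confirming that every Sentinel reports the same master, not only the one on 192.168.66.51. The sketch below asks each Sentinel in turn and fails on the first disagreement. The function name `sentinels_agree` and the `REDIS_CLI` override variable are assumptions made for illustration; port 26379 is the Sentinel port used in this setup.

```shell
# sentinels_agree NAME HOST...: ask the Sentinel on each HOST (port 26379)
# for the master of NAME. Prints "host:port" and succeeds only if every
# Sentinel gives the same non-empty answer.
# REDIS_CLI (hypothetical override) defaults to redis-cli.
sentinels_agree() {
    name=$1; shift
    first=
    for host in "$@"; do
        # Non-interactive redis-cli prints host and port on separate lines;
        # paste joins them with a colon.
        addr=$("${REDIS_CLI:-redis-cli}" -h "$host" -p 26379 \
            sentinel get-master-addr-by-name "$name" | paste -sd: -)
        [ -n "$addr" ] || return 1
        [ -z "$first" ] && first=$addr
        [ "$addr" = "$first" ] || return 1
    done
    [ -n "$first" ] && printf '%s\n' "$first"
}
```

Possible usage: `sentinels_agree myredis 192.168.66.51 192.168.66.52 192.168.66.53`.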

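6. Reset the priority to the default value

Set the priority back to the default (100) on every node so that all of them are eligible candidates in the next election; a replica left at 0 would never be promoted.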

[root@vm66-51 ~]# redis-cli -h 192.168.66.51 -p 6379 config set slave-priority 100
OK
[root@vm66-51 ~]# redis-cli -h 192.168.66.52 -p 6379 config set slave-priority 100
OK
[root@vm66-51 ~]# redis-cli -h 192.168.66.53 -p 6379 config set slave-priority 100
OK
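
`CONFIG SET` changes only the running server, so a restarted node falls back to whatever its configuration file says. If the value should survive restarts, `CONFIG REWRITE` can write it to the file. The loop below is a sketch under the assumption that each server was started with a configuration file (which `CONFIG REWRITE` requires); the function name `reset_priority` and the `REDIS_CLI` override variable are hypothetical.

```shell
# reset_priority PRIORITY HOST...: set slave-priority on the data node of
# each HOST (port 6379) and persist it with CONFIG REWRITE. Stops at the
# first command that exits non-zero.
# REDIS_CLI (hypothetical override) defaults to redis-cli.
reset_priority() {
    prio=$1; shift
    for host in "$@"; do
        "${REDIS_CLI:-redis-cli}" -h "$host" -p 6379 config set slave-priority "$prio" &&
        "${REDIS_CLI:-redis-cli}" -h "$host" -p 6379 config rewrite || return 1
    done
}
```

Possible usage: `reset_priority 100 192.168.66.51 192.168.66.52 192.168.66.53`.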

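7. Notes on Redis Sentinel clusters

1. Sentinel starts a failover only when the master is judged unreachable (by at least the configured quorum, 2 in this setup); a replica going down does not trigger a failover
2. The master-replica relationship does not need to be written into the configuration files by hand
3. Once replication has been set up, avoid casually editing the configuration files
4. Sentinel maintains its own configuration file; no manual edits are needed
5. When the master-replica topology changes, the Sentinels propagate the latest state among themselves and update their configuration files automatically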