Preface

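I have been using Redis for a long time, and the 3-master/3-slave Redis cluster I built has been stable and trouble-free. Still, servers can fail unexpectedly, so I took a brief look at how to scale a Redis cluster out and in.

I run redis-5.0.5, so I looked up material on 5.0.5 online and combined it with my own steps and understanding. This post records the result.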

1. Environment preparation

OS          hostname  IP          role           port
CentOS 7.6  redis01   10.4.7.100  master, slave  6379, 16379, 6380, 16380
CentOS 7.6  redis02   10.4.7.101  master, slave  6379, 16379, 6380, 16380
CentOS 7.6  redis03   10.4.7.102  master, slave  6379, 16379, 6380, 16380

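Note: the cluster uses three hosts, each running two Redis instances, so six instances in total form the cluster, in a three-master, three-slave layout.

By default, each node uses its Redis listening port + 10000 as the cluster bus port for node-to-node communication.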
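As a small illustration of the port rule above, the sketch below derives the cluster bus port from each client port. The function name bus_port is my own, not something Redis provides.

```shell
#!/usr/bin/env bash
# Cluster bus port = client port + 10000 (the default rule described above).
# bus_port is a hypothetical helper name used only for this illustration.
bus_port() {
  local port=$1
  echo $(( port + 10000 ))
}

for p in 6379 6380; do
  echo "client port $p -> cluster bus port $(bus_port "$p")"
done
```

Both ports of every instance must be reachable between the hosts, which matches the four ports listed per host in the table.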

2. Installing Redis

Unless stated otherwise, run these steps on redis01.

#########################Build Redis from source#########################
$ yum install gcc*
$ wget http://download.redis.io/releases/redis-5.0.5.tar.gz
$ tar zxf redis-5.0.5.tar.gz;cd redis-5.0.5/
$ make          # compiling is enough; no need to run make install
####################Prepare the working directories and Redis config file####################
$ mkdir /usr/local/redis/{bin,conf,data,logs} -p
$ mkdir /usr/local/redis/data/{6379,6380}
$ cp /root/redis-5.0.5/src/redis* /usr/local/redis/bin/
$ cp /root/redis-5.0.5/redis.conf /usr/local/redis/conf/
$ cd /usr/local/redis/bin/
$ rm -f *.{h,c,o}

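3. Editing the config files

#########################Generate the config file for instance 6379#########################
$ cd /usr/local/redis/conf/
# You can start from the redis.conf shipped with the source; I wrote mine by hand
$ vim redis_6379.conf          # config file for instance 6379, contents below
# For what each directive means, search online or read the comments in the bundled redis.conf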
port 6379
bind 0.0.0.0
daemonize yes
pidfile /var/run/redis_6379.pid
cluster-enabled yes
cluster-config-file nodes-6379.conf
cluster-node-timeout 25000
# RDB persistence settings
save 900 1
save 300 10
save 60 10000
dbfilename dump_6379.rdb
dir /usr/local/redis/data/6379
stop-writes-on-bgsave-error no
rdbcompression yes
rdbchecksum yes
rdb-save-incremental-fsync yes
 
# AOF persistence settings
appendonly yes
appendfilename "appendonly_6379.aof"
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
aof-rewrite-incremental-fsync yes
 
 
loglevel notice
logfile /usr/local/redis/logs/redis-6379.logs
maxclients 15000
maxmemory 20gb
maxmemory-policy volatile-lru
protected-mode no
# Replication settings
replica-serve-stale-data yes
replica-read-only yes
repl-diskless-sync no
repl-diskless-sync-delay 5
repl-disable-tcp-nodelay no
replica-priority 100
repl-backlog-size 10mb
repl-timeout 120
# Cluster password authentication (optional; if enabled, the password must be identical on every node)
masterauth "123456"
requirepass "123456"
#########################Generate the config file for instance 6380#########################
$ cp redis_6379.conf redis_6380.conf
$ sed -i 's/6379/6380/g' redis_6380.conf
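The cp + sed step generalizes to any number of ports. The sketch below wraps it in a helper called gen_conf (a name I made up) and runs it on a throwaway template, so nothing under /usr/local/redis is touched. Like the sed command above, it replaces every occurrence of the base port string, so it assumes that string appears in the template only as a port number.

```shell
#!/usr/bin/env bash
# Derive a per-port config file from a template by replacing the base port.
# gen_conf is a hypothetical helper mirroring the cp + sed step above.
gen_conf() {
  local template=$1 base_port=$2 new_port=$3
  local out
  out="$(dirname "$template")/redis_${new_port}.conf"
  # Blanket replacement: every "6379" becomes the new port, as with sed -i above.
  sed "s/${base_port}/${new_port}/g" "$template" > "$out"
  echo "$out"
}

# Demo on a temporary template (assumed contents, two lines only).
demo_dir=$(mktemp -d)
printf 'port 6379\npidfile /var/run/redis_6379.pid\n' > "$demo_dir/redis_6379.conf"
for port in 6380 6381; do
  gen_conf "$demo_dir/redis_6379.conf" 6379 "$port"
done
```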
Distribute the Redis files to the other two machines:

$ for i in 10.4.7.10{1..2};do scp -r /usr/local/redis ${i}:/usr/local/;done
Add the Redis binaries to PATH (do this on every host in the cluster):

$ echo 'export PATH=$PATH:/usr/local/redis/bin/' >> /etc/profile
$ source /etc/profile
$ redis-server -v     # check the Redis version
Redis server v=5.0.5 sha=00000000:0 malloc=jemalloc-5.1.0 bits=64 build=526c61b9b59aa5bc
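At this point the config files for all six instances on the three hosts are ready and the instances could be started (but first, fix the warnings that would otherwise show up in the startup log).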

4. Fixing the warnings logged at startup

Run the following on all three hosts.

4.1 Raising the open-files limit

$ ulimit -n    # show the current value
1024
$ echo '*     -     nofile      65535' >> /etc/security/limits.conf
# Takes effect after you log in again; check the value once more after re-login
$ ulimit -n
65535

4.2 Fixing the low TCP backlog value

$ echo "net.core.somaxconn = 65535" > /etc/sysctl.d/redis.conf
$ sysctl -p /etc/sysctl.d/redis.conf   # reload so it takes effect
net.core.somaxconn = 65535

4.3 Allowing all physical memory to be allocated (overcommit)

$ echo "vm.overcommit_memory = 1" >> /etc/sysctl.d/redis.conf
$ sysctl -p /etc/sysctl.d/redis.conf   # reload so it takes effect
net.core.somaxconn = 65535
vm.overcommit_memory = 1

4.4 Fixing the transparent huge pages warning

$ echo never > /sys/kernel/mm/transparent_hugepage/enabled
# The command above only lasts until reboot; make it permanent as follows
$ echo 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' >> /etc/rc.local
$ chmod +x /etc/rc.d/rc.local
From now on, neither a server reboot nor a Redis restart should produce these warnings.

5. Starting Redis

$ redis-server /usr/local/redis/conf/redis_6379.conf
$ redis-server /usr/local/redis/conf/redis_6380.conf
$ ss -antp | grep 63

LISTEN     0      511          *:6379                     *:*                   users:(("redis-server",pid=13273,fd=6))
LISTEN     0      511          *:6380                     *:*                   users:(("redis-server",pid=13292,fd=6))
LISTEN     0      511          *:16379                    *:*                   users:(("redis-server",pid=13273,fd=9))
LISTEN     0      511          *:16380                    *:*                   users:(("redis-server",pid=13292,fd=9))

$ pkill -9 redis-server          # stops the Redis instances (SIGKILL, so no final save is performed)

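6. Creating the cluster

6.1 Removing the cluster password

If your cluster has a password, do this step; if not, skip it. The script below is used again later to restore the password and when scaling the cluster out and in.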

$ cat redis_setpass.sh

#!/usr/bin/env bash
# IPs of the hosts taking part in the cluster
IPS=(
10.4.7.100
10.4.7.101
10.4.7.102
)
# Cluster password
PASS='123456'
# Listening ports on each host
PORTS=(
6379
6380
)
 
# delete password
del_pass() {
for ip in ${IPS[@]}
do
   for port in ${PORTS[@]}
   do
        redis-cli -c -h $ip -p $port -a $PASS config set masterauth ""
        redis-cli -c -h $ip -p $port -a $PASS config set requirepass ""
   done
   echo "$ip delete password"
done
}
 
add_pass() { 
for ip in ${IPS[@]}
do
   for port in ${PORTS[@]}
   do
        redis-cli -c -h $ip -p $port config set masterauth "$PASS"
        redis-cli -c -h $ip -p $port config set requirepass "$PASS"
   done
   echo "$ip add password"
done
}
 
env=$1
 
if [[ ${env} == "del" ]];then
    echo "del redis password"
    del_pass
elif [[ ${env} == "add" ]];then
    echo "add redis password"
    add_pass
else
    # unknown or missing argument: print usage and fail
    echo "usage: $0 add|del" >&2
    exit 1
fi

$ chmod +x redis_setpass.sh
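$ ./redis_setpass.sh del            # remove the cluster password

Note: the script is meant to remove the cluster password (to make command-line work easier), but clients that still connect with a password seem to run into problems while it is removed. My personal advice: when working with Redis from the command line, keep the password and pass it explicitly.

For example, connecting to port 6380 locally: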

./redis-cli -p 6380 -a "rd.pwd"

6.2 Creating the cluster

Run the following on any one host.

$ redis-cli  --cluster create 10.4.7.100:6379 10.4.7.100:6380 10.4.7.101:6379 10.4.7.101:6380 10.4.7.102:6379 10.4.7.102:6380 --cluster-replicas 1

# Create the cluster: list the participating instances and set the number of replicas per master to 1
 


6.3 Testing the cluster

$ redis-cli -c -h 10.4.7.100 -p 6379    # the -c flag is needed to connect in cluster mode
 
10.4.7.100:6379> cluster nodes          # show which slave belongs to which master
c81b92bf2a1ea24c0881c296176f6a48cdce4460 10.4.7.102:6379@16379 master - 0 1612245916000 5 connected 10923-16383
3e10873c57baf06acad788b815344edecc6b3428 10.4.7.100:6379@16379 myself,master - 0 1612245915000 1 connected 0-5460
82ff3b96e4f9bbe5e723fd066fe593294602bb02 10.4.7.102:6380@16380 slave 22311cc5373dff550d373f30b05dd41f2be50838 0 1612245916000 6 connected
a48d503ce893f4e4b3f58a81fd2380dde358db48 10.4.7.101:6380@16380 slave 3e10873c57baf06acad788b815344edecc6b3428 0 1612245915000 4 connected
95f8bd253511c7c177e5d5f4caa5367ab85ca54a 10.4.7.100:6380@16380 slave c81b92bf2a1ea24c0881c296176f6a48cdce4460 0 1612245916740 5 connected
22311cc5373dff550d373f30b05dd41f2be50838 10.4.7.101:6379@16379 master - 0 1612245913000 3 connected 5461-10922
 
10.4.7.100:6379> cluster info      # show the cluster status
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:6
cluster_my_epoch:1
cluster_stats_messages_ping_sent:6647
cluster_stats_messages_pong_sent:6778
cluster_stats_messages_sent:13425
cluster_stats_messages_ping_received:6773
cluster_stats_messages_pong_received:6647
cluster_stats_messages_meet_received:5
cluster_stats_messages_received:13425
# Set and read a key
10.4.7.100:6379> set name zz      # set a key
-> Redirected to slot [5798] located at 10.4.7.101:6379   # the key was stored on instance 10.4.7.101:6379
OK
# Note that the prompt has switched to instance 10.4.7.101:6379
10.4.7.101:6379> get name 
"zz"
# Log in to another node and check that the key can be read from there
$ redis-cli -c -p 6380 -h 10.4.7.102
10.4.7.102:6380> get name
-> Redirected to slot [5798] located at 10.4.7.101:6379
"zz"
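The redirect to slot 5798 in the session above follows from how Redis Cluster places keys: slot = CRC16(key) mod 16384, using the XMODEM variant of CRC16. The sketch below recomputes that in pure bash. The function names crc16_xmodem and key_slot are my own, and the sketch assumes plain ASCII keys and ignores hash tags (the {...} syntax).

```shell
#!/usr/bin/env bash
# CRC16-XMODEM (polynomial 0x1021, initial value 0), bit by bit.
# Assumption: the key consists of ASCII characters only.
crc16_xmodem() {
  local s=$1 crc=0 i j byte
  for (( i = 0; i < ${#s}; i++ )); do
    printf -v byte '%d' "'${s:i:1}"      # numeric code of the character
    crc=$(( crc ^ (byte << 8) ))
    for (( j = 0; j < 8; j++ )); do
      if (( crc & 0x8000 )); then
        crc=$(( ((crc << 1) ^ 0x1021) & 0xFFFF ))
      else
        crc=$(( (crc << 1) & 0xFFFF ))
      fi
    done
  done
  echo "$crc"
}

# Hash slot of a key; hash tags are deliberately not handled in this sketch.
key_slot() {
  echo $(( $(crc16_xmodem "$1") % 16384 ))
}

key_slot name   # prints 5798, the slot shown in the redirect above
```

On a live cluster, the built-in command CLUSTER KEYSLOT name returns the same information without any local computation.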

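High availability and automatic master/slave failover are left for you to test. Normally, when a master goes down its slave is promoted to master automatically; when the old master comes back, it becomes a slave of the new master.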

6.4 Restoring the cluster password with the script

$ ./redis_setpass.sh add

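6.5 Rebuilding the Redis cluster

To rebuild the cluster, delete the RDB and AOF persistence files (there is no AOF file if AOF is disabled) and the cluster config file. In this post, the files to delete are:

$ tree /usr/local/redis/data/     # everything to delete lives under the data directory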
/usr/local/redis/data/
├── 6379
│   ├── appendonly_6379.aof
│   ├── dump_6379.rdb
│   └── nodes-6379.conf
└── 6380
    ├── appendonly_6380.aof
    ├── dump_6380.rdb
    └── nodes-6380.conf
 
2 directories, 6 files
# After deleting the data files, run the cluster creation command again
$ redis-cli  --cluster create 10.4.7.100:6379 10.4.7.100:6380 10.4.7.101:6379 10.4.7.101:6380 10.4.7.102:6379 10.4.7.102:6380 --cluster-replicas 1
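Before deleting anything, it may help to list exactly which files would go. The sketch below does only that, and runs against a temporary directory that mimics the layout above. The helper name cluster_state_files and the file-name patterns are assumptions based on the file names used in this post.

```shell
#!/usr/bin/env bash
# List RDB, AOF and cluster node files under a data directory (no deletion).
# cluster_state_files is a hypothetical helper; patterns follow this post's naming.
cluster_state_files() {
  local data_dir=$1
  find "$data_dir" -type f \
    \( -name '*.aof' -o -name '*.rdb' -o -name 'nodes-*.conf' \) | sort
}

# Demo on a throwaway directory that mirrors the tree shown above.
demo_data=$(mktemp -d)
mkdir -p "$demo_data/6379"
touch "$demo_data/6379/appendonly_6379.aof" \
      "$demo_data/6379/dump_6379.rdb" \
      "$demo_data/6379/nodes-6379.conf" \
      "$demo_data/6379/unrelated.txt"
cluster_state_files "$demo_data"
```

Once the list looks right on the real data directory, the instances can be stopped and the listed files removed by hand.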

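7. Scaling the cluster out

When the existing Redis cluster can no longer meet business needs and has to grow, configure it as follows. (Newer and older Redis versions scale differently: redis-cli gained its --cluster subcommands in Redis 5, so if your cluster runs 3.x.x, or your redis-cli does not support the cluster scaling operations, the commands below do not apply as written.)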

7.1 Deploying the new Redis node

First prepare the Redis instances that will join the cluster. Here the new host is 10.4.7.103, running two instances on ports 6379 and 6380.

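$ scp -r /usr/local/redis 10.4.7.103:/usr/local/    # on an existing node (10.4.7.100), copy the redis directory to the new machine
$ rm -rf /usr/local/redis/data/*/*    # on the new machine, delete the copied data files
$ rm -rf /usr/local/redis/logs/*      # on the new machine, delete the copied log files
###################Fixing the startup-log warnings is mandatory###################
$ echo '*     -     nofile      65535' >> /etc/security/limits.conf
# Takes effect after you log in again; check the value once more after re-login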
$ ulimit -n
65535
$ cat >> /etc/sysctl.d/redis.conf << EOF
net.core.somaxconn = 65535
vm.overcommit_memory = 1
EOF
$ sysctl -p /etc/sysctl.d/redis.conf   # reload so it takes effect
 
$ echo never > /sys/kernel/mm/transparent_hugepage/enabled
$ echo 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' >> /etc/rc.local
$ chmod +x /etc/rc.d/rc.local
# Set up the PATH environment variable
$ echo 'export PATH=$PATH:/usr/local/redis/bin/' >> /etc/profile
$ source /etc/profile
$ redis-server -v       # check the Redis version
Redis server v=5.0.5 sha=00000000:0 malloc=jemalloc-5.1.0 bits=64 build=37a632ba3989f893

7.2 Starting the Redis instances on the new node

$ redis-server /usr/local/redis/conf/redis_6379.conf 
$ redis-server /usr/local/redis/conf/redis_6380.conf  
$ ss -anpt | grep 63
LISTEN     0      511          *:6379                     *:*                   users:(("redis-server",pid=8302,fd=6))
LISTEN     0      511          *:6380                     *:*                   users:(("redis-server",pid=8321,fd=6))
LISTEN     0      511          *:16379                    *:*                   users:(("redis-server",pid=8302,fd=9))
LISTEN     0      511          *:16380                    *:*                   users:(("redis-server",pid=8321,fd=9))

7.3 Removing the cluster password

# First add the new node's IP and ports to the script (edit it yourself)
$ ./redis_setpass.sh del           # redis_setpass.sh is the script from section 6.1

7.4 Adding the new nodes to the cluster

# Run the following on any host that can reach the cluster (10.4.7.100:6379 is a node already in the cluster)
$ redis-cli --cluster add-node 10.4.7.103:6379 10.4.7.100:6379      # add instance 6379
$ redis-cli --cluster add-node 10.4.7.103:6380 10.4.7.100:6379      # add instance 6380

7.5 Checking the current nodes

$ redis-cli -c -h 10.4.7.100 -p 6379 cluster nodes


You can see that both instances on 10.4.7.103 have joined (both as masters), but they have no slots yet.
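Masters without slots can also be picked out of the cluster nodes output mechanically. In that output, the first eight fields describe the node and any slot ranges start at field nine, so a master line with fewer than nine fields owns nothing. The sketch below relies on that layout; the helper name empty_masters is my own, and the two sample lines are copied from the node listings in this post.

```shell
#!/usr/bin/env bash
# Read `cluster nodes` output on stdin and print the address of each master
# that owns no slots. Field layout assumed:
#   id addr flags master-id ping-sent pong-recv epoch link-state [slots...]
empty_masters() {
  awk '$3 ~ /master/ && NF < 9 { print $2 }'
}

# Demo with two lines taken from this post's output.
empty_masters <<'EOF'
8a0605ab116aa6ce911468feeadafbe68440fb46 10.4.7.103:6380@16380 myself,master - 0 1612248723000 0 connected
210174dd1a76e9d77b3348854dcfb97223374496 10.4.7.103:6379@16379 master - 0 1612248724000 7 connected 0-1364 5461-6826 10923-12287
EOF
# prints 10.4.7.103:6380@16380
```

Against a live cluster this would be used as: redis-cli -c -h 10.4.7.100 -p 6379 cluster nodes | empty_masters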

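7.6 Assigning slots to the new master

$ redis-cli --cluster reshard 10.4.7.100:6379
After running the command above, proceed as follows:

If you choose all as the source below, note that when you plan to add several masters, the number of slots to assign is different each time.

Suppose the cluster currently has 3 masters. When adding the fourth, assign 16384 / 4 = 4096 slots. If you then add a fifth, assign 16384 / 5 = 3276 slots (rounded down).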
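The arithmetic above is simple integer division, sketched here for the two cases mentioned. The function name slots_per_master is hypothetical; 16384 is the fixed number of hash slots in a Redis cluster.

```shell
#!/usr/bin/env bash
# Slots a newly added master should receive so that all masters end up
# with an (almost) equal share. Integer division rounds down.
TOTAL_SLOTS=16384

slots_per_master() {
  local masters_after_add=$1
  echo $(( TOTAL_SLOTS / masters_after_add ))
}

for n in 4 5; do
  echo "with $n masters: $(slots_per_master "$n") slots for the new one"
done
# with 4 masters: 4096 slots for the new one
# with 5 masters: 3276 slots for the new one
```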


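Redis 5 does not move slots onto a newly added master by itself, so you have to work out how many slots to move and run the commands manually (redis-cli also has a --cluster rebalance subcommand, but this post moves the slots explicitly).

How many slots should move? 16384 / 4 = 4096 in total. Each command below takes slots from a single source master, so the 4096 are split across the three existing masters as 1365 + 1365 + 1366. Moving 4096 from each of them would leave the new master with 12288 slots.

Directly from the command line:

./redis-cli -p 6379 -a "rd.pwd" --cluster reshard 10.4.7.100:6379 --cluster-from c81b92bf2a1ea24c0881c296176f6a48cdce4460 --cluster-to 210174dd1a76e9d77b3348854dcfb97223374496 --cluster-slots 1365 --cluster-yes
./redis-cli -p 6379 -a "rd.pwd" --cluster reshard 10.4.7.100:6379 --cluster-from 3e10873c57baf06acad788b815344edecc6b3428 --cluster-to 210174dd1a76e9d77b3348854dcfb97223374496 --cluster-slots 1365 --cluster-yes
./redis-cli -p 6379 -a "rd.pwd" --cluster reshard 10.4.7.100:6379 --cluster-from 22311cc5373dff550d373f30b05dd41f2be50838 --cluster-to 210174dd1a76e9d77b3348854dcfb97223374496 --cluster-slots 1366 --cluster-yes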

7.7 Assigning a slave to the new master

$ redis-cli -h 10.4.7.103 -p 6380 -c      # connect to the instance that will become the slave
10.4.7.103:6380> cluster nodes            # list the nodes (find the ID of the master that has no slave)
8a0605ab116aa6ce911468feeadafbe68440fb46 10.4.7.103:6380@16380 myself,master - 0 1612248723000 0 connected
210174dd1a76e9d77b3348854dcfb97223374496 10.4.7.103:6379@16379 master - 0 1612248724000 7 connected 0-1364 5461-6826 10923-12287
c81b92bf2a1ea24c0881c296176f6a48cdce4460 10.4.7.102:6379@16379 master - 0 1612248724004 5 connected 12288-16383
3e10873c57baf06acad788b815344edecc6b3428 10.4.7.100:6379@16379 master - 0 1612248726019 1 connected 1365-5460
a48d503ce893f4e4b3f58a81fd2380dde358db48 10.4.7.101:6380@16380 slave 3e10873c57baf06acad788b815344edecc6b3428 0 1612248725012 1 connected
95f8bd253511c7c177e5d5f4caa5367ab85ca54a 10.4.7.100:6380@16380 slave c81b92bf2a1ea24c0881c296176f6a48cdce4460 0 1612248724000 5 connected
22311cc5373dff550d373f30b05dd41f2be50838 10.4.7.101:6379@16379 master - 0 1612248722997 3 connected 6827-10922
82ff3b96e4f9bbe5e723fd066fe593294602bb02 10.4.7.102:6380@16380 slave 22311cc5373dff550d373f30b05dd41f2be50838 0 1612248724000 3 connected
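# Note: I only added one machine, so master and slave end up on the same host; there is no other option here
# If you add several machines, put each slave on a different host from its master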
 
10.4.7.103:6380> CLUSTER REPLICATE 210174dd1a76e9d77b3348854dcfb97223374496
# Make this instance replicate the new master
 
10.4.7.103:6380> cluster nodes     # confirm the masters and slaves in the cluster look right
8a0605ab116aa6ce911468feeadafbe68440fb46 10.4.7.103:6380@16380 myself,slave 210174dd1a76e9d77b3348854dcfb97223374496 0 1612248895000 0 connected
210174dd1a76e9d77b3348854dcfb97223374496 10.4.7.103:6379@16379 master - 0 1612248895000 7 connected 0-1364 5461-6826 10923-12287
c81b92bf2a1ea24c0881c296176f6a48cdce4460 10.4.7.102:6379@16379 master - 0 1612248898000 5 connected 12288-16383
3e10873c57baf06acad788b815344edecc6b3428 10.4.7.100:6379@16379 master - 0 1612248898347 1 connected 1365-5460
a48d503ce893f4e4b3f58a81fd2380dde358db48 10.4.7.101:6380@16380 slave 3e10873c57baf06acad788b815344edecc6b3428 0 1612248895000 1 connected
95f8bd253511c7c177e5d5f4caa5367ab85ca54a 10.4.7.100:6380@16380 slave c81b92bf2a1ea24c0881c296176f6a48cdce4460 0 1612248897000 5 connected
22311cc5373dff550d373f30b05dd41f2be50838 10.4.7.101:6379@16379 master - 0 1612248897338 3 connected 6827-10922
82ff3b96e4f9bbe5e723fd066fe593294602bb02 10.4.7.102:6380@16380 slave 22311cc5373dff550d373f30b05dd41f2be50838 0 1612248895326 3 connected
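The visual check above can be backed by a small filter that reports any slot-owning master that no slave points at. It is a sketch built on the cluster nodes layout shown in this post (flags in field 3, the master ID of a slave in field 4, slot ranges from field 9 on); the name masters_without_slave is my own.

```shell
#!/usr/bin/env bash
# Read `cluster nodes` output on stdin and print the address of every master
# that owns slots but has no slave replicating it.
masters_without_slave() {
  awk '
    $3 ~ /master/ && NF >= 9 { addr[$1] = $2 }        # masters that own slots
    $3 ~ /slave/             { replicated[$4] = 1 }   # field 4 = master node ID
    END {
      for (id in addr)
        if (!(id in replicated)) print addr[id]
    }' | sort
}

# Demo: state before CLUSTER REPLICATE, reduced to three lines from this post.
masters_without_slave <<'EOF'
8a0605ab116aa6ce911468feeadafbe68440fb46 10.4.7.103:6380@16380 myself,master - 0 1612248723000 0 connected
210174dd1a76e9d77b3348854dcfb97223374496 10.4.7.103:6379@16379 master - 0 1612248724000 7 connected 0-1364 5461-6826 10923-12287
a48d503ce893f4e4b3f58a81fd2380dde358db48 10.4.7.101:6380@16380 slave 3e10873c57baf06acad788b815344edecc6b3428 0 1612248725012 1 connected
EOF
# prints 10.4.7.103:6379@16379
```

An empty result on the full output would suggest that every slot-owning master has at least one slave.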

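Of course, a single command can do all of this. Run it against any existing node (for example 10.4.7.100:6379) while 10.4.7.103:6380 has not yet joined the cluster, since add-node expects an empty node. The cluster automatically attaches the new slave to the master with the fewest slaves, so the master does not need to be specified: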

./redis-cli -p 6379 -a "rd.pwd"  --cluster add-node 10.4.7.103:6380 10.4.7.100:6379 --cluster-slave

Or specify the master explicitly:

./redis-cli -p 6379 -a "rd.pwd"  --cluster add-node 10.4.7.103:6380 10.4.7.100:6379 --cluster-slave --cluster-master-id 210174dd1a76e9d77b3348854dcfb97223374496

7.8 Restoring the cluster password

$ ./redis_setpass.sh add

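8. Scaling the cluster in

Suppose we want to remove the two instances added above, 6379 and 6380 on 10.4.7.103, from the cluster. How is that done?

To make things easier, remove the cluster password first.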

$ ./redis_setpass.sh del

8.1 Removing the slave

$ redis-cli -c -h 10.4.7.100 -p 6379 cluster nodes     # find the node ID of the slave to remove
# The slave to remove here has ID 8a0605ab116aa6ce911468feeadafbe68440fb46

210174dd1a76e9d77b3348854dcfb97223374496 10.4.7.103:6379@16379 master - 0 1612249324000 7 connected 0-1364 5461-6826 10923-12287
c81b92bf2a1ea24c0881c296176f6a48cdce4460 10.4.7.102:6379@16379 master - 0 1612249324000 5 connected 12288-16383
8a0605ab116aa6ce911468feeadafbe68440fb46 10.4.7.103:6380@16380 slave 210174dd1a76e9d77b3348854dcfb97223374496 0 1612249327362 7 connected
3e10873c57baf06acad788b815344edecc6b3428 10.4.7.100:6379@16379 myself,master - 0 1612249324000 1 connected 1365-5460
82ff3b96e4f9bbe5e723fd066fe593294602bb02 10.4.7.102:6380@16380 slave 22311cc5373dff550d373f30b05dd41f2be50838 0 1612249325346 6 connected
a48d503ce893f4e4b3f58a81fd2380dde358db48 10.4.7.101:6380@16380 slave 3e10873c57baf06acad788b815344edecc6b3428 0 1612249326351 4 connected
95f8bd253511c7c177e5d5f4caa5367ab85ca54a 10.4.7.100:6380@16380 slave c81b92bf2a1ea24c0881c296176f6a48cdce4460 0 1612249324339 5 connected
22311cc5373dff550d373f30b05dd41f2be50838 10.4.7.101:6379@16379 master - 0 1612249325000 3 connected 6827-10922

# Remove it
$ redis-cli --cluster del-node 10.4.7.100:6379 8a0605ab116aa6ce911468feeadafbe68440fb46

>>> Removing node 8a0605ab116aa6ce911468feeadafbe68440fb46 from cluster 10.4.7.100:6379
>>> Sending CLUSTER FORGET messages to the cluster...
>>> SHUTDOWN the node.
# The output shows the node was removed and the Redis instance being retired was shut down.

$ redis-cli -c -h 10.4.7.100 -p 6379 cluster nodes  # list the nodes again to confirm

210174dd1a76e9d77b3348854dcfb97223374496 10.4.7.103:6379@16379 master - 0 1612249527000 7 connected 0-1364 5461-6826 10923-12287
c81b92bf2a1ea24c0881c296176f6a48cdce4460 10.4.7.102:6379@16379 master - 0 1612249529000 5 connected 12288-16383
3e10873c57baf06acad788b815344edecc6b3428 10.4.7.100:6379@16379 myself,master - 0 1612249530000 1 connected 1365-5460
82ff3b96e4f9bbe5e723fd066fe593294602bb02 10.4.7.102:6380@16380 slave 22311cc5373dff550d373f30b05dd41f2be50838 0 1612249527000 6 connected
a48d503ce893f4e4b3f58a81fd2380dde358db48 10.4.7.101:6380@16380 slave 3e10873c57baf06acad788b815344edecc6b3428 0 1612249529931 4 connected
95f8bd253511c7c177e5d5f4caa5367ab85ca54a 10.4.7.100:6380@16380 slave c81b92bf2a1ea24c0881c296176f6a48cdce4460 0 1612249528924 5 connected
22311cc5373dff550d373f30b05dd41f2be50838 10.4.7.101:6379@16379 master - 0 1612249530940 3 connected 6827-10922

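8.2 Removing the master

Next we remove the master on 6379 that was added earlier. This takes more work, because the master owns hash slots: those slots must first be moved to the other available masters, and only then can the node be removed; otherwise data would be lost. (Ideally the slots of the retiring master are spread evenly over the remaining masters, so each reshard moves only part of them, and you reshard once per remaining master.)

Node 210174dd1a76e9d77b3348854dcfb97223374496 owns 4096 slots, so spreading them over 3 masters gives 4096 / 3 ≈ 1365.3. Fractions are not possible, so we assign 1365, 1365 and 1366 to the three remaining masters.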
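The 1365 / 1365 / 1366 split can be computed for any slot count and number of remaining masters. The sketch below uses integer division and hands the remainder out one slot at a time to the last parts; split_slots is a name I chose for illustration.

```shell
#!/usr/bin/env bash
# Split a slot count into near-equal integer parts that sum to the total.
# The last (total % parts) parts each receive one extra slot.
split_slots() {
  local total=$1 parts=$2
  local base=$(( total / parts )) rem=$(( total % parts )) i
  local -a out=()
  for (( i = 0; i < parts; i++ )); do
    if (( i >= parts - rem )); then
      out+=( "$(( base + 1 ))" )
    else
      out+=( "$base" )
    fi
  done
  echo "${out[*]}"
}

split_slots 4096 3   # prints: 1365 1365 1366
```

Each number then becomes the --cluster-slots value of one reshard toward one remaining master.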

$ redis-cli --cluster reshard 10.4.7.100:6379


Repeat this as many times as needed, until all slots of the retiring master have been handed off.
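To see how many slots a node still owns between reshard rounds, the slot ranges on its cluster nodes line can be summed. This is a sketch under the same field-layout assumption as the listings in this post (slot ranges from field 9 on); node_slot_count is a hypothetical helper, and entries in square brackets (slots in migration) are skipped.

```shell
#!/usr/bin/env bash
# Read `cluster nodes` output on stdin and print how many slots the node
# with the given ID owns. A range "a-b" counts b-a+1, a single slot counts 1.
node_slot_count() {
  local node_id=$1
  awk -v id="$node_id" '
    $1 == id {
      n = 0
      for (i = 9; i <= NF; i++) {
        if ($i ~ /^\[/) continue            # skip migrating/importing markers
        split($i, r, "-")
        n += (r[2] == "" ? 1 : r[2] - r[1] + 1)
      }
      print n
    }'
}

# Demo with the retiring master as it looked before draining.
node_slot_count 210174dd1a76e9d77b3348854dcfb97223374496 <<'EOF'
210174dd1a76e9d77b3348854dcfb97223374496 10.4.7.103:6379@16379 master - 0 1612249324000 7 connected 0-1364 5461-6826 10923-12287
EOF
# prints 4096
```

When the count reaches 0, the node no longer holds any slots and is ready to be removed.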

Or directly from the command line:

./redis-cli -p 6379 -a "rd.pwd" --cluster reshard 10.4.7.100:6379 --cluster-from 210174dd1a76e9d77b3348854dcfb97223374496 --cluster-to c81b92bf2a1ea24c0881c296176f6a48cdce4460  --cluster-slots 1365 --cluster-yes
./redis-cli -p 6379 -a "rd.pwd" --cluster reshard 10.4.7.100:6379 --cluster-from 210174dd1a76e9d77b3348854dcfb97223374496 --cluster-to 3e10873c57baf06acad788b815344edecc6b3428  --cluster-slots 1365 --cluster-yes
./redis-cli -p 6379 -a "rd.pwd" --cluster reshard 10.4.7.100:6379 --cluster-from 210174dd1a76e9d77b3348854dcfb97223374496 --cluster-to 22311cc5373dff550d373f30b05dd41f2be50838 --cluster-slots 1366 --cluster-yes

$ redis-cli -c -h 10.4.7.100 -p 6379 cluster nodes    

# Check the cluster again and confirm the retiring master owns no slots

210174dd1a76e9d77b3348854dcfb97223374496 10.4.7.103:6379@16379 master - 0 1612250066245 7 connected
c81b92bf2a1ea24c0881c296176f6a48cdce4460 10.4.7.102:6379@16379 master - 0 1612250064000 9 connected 5461-6825 12288-16383
3e10873c57baf06acad788b815344edecc6b3428 10.4.7.100:6379@16379 myself,master - 0 1612250065000 8 connected 0-5460
82ff3b96e4f9bbe5e723fd066fe593294602bb02 10.4.7.102:6380@16380 slave 22311cc5373dff550d373f30b05dd41f2be50838 0 1612250068261 10 connected
a48d503ce893f4e4b3f58a81fd2380dde358db48 10.4.7.101:6380@16380 slave 3e10873c57baf06acad788b815344edecc6b3428 0 1612250066000 8 connected
95f8bd253511c7c177e5d5f4caa5367ab85ca54a 10.4.7.100:6380@16380 slave c81b92bf2a1ea24c0881c296176f6a48cdce4460 0 1612250067253 9 connected
22311cc5373dff550d373f30b05dd41f2be50838 10.4.7.101:6379@16379 master - 0 1612250065238 10 connected 6826-12287

# Remove the retiring master node; its redis process shuts down automatically
$ redis-cli --cluster del-node 10.4.7.100:6379 210174dd1a76e9d77b3348854dcfb97223374496

>>> Removing node 210174dd1a76e9d77b3348854dcfb97223374496 from cluster 10.4.7.100:6379
>>> Sending CLUSTER FORGET messages to the cluster...
>>> SHUTDOWN the node.

With that, the 10.4.7.103 machine can retire with honor!

8.3 Checking the cluster status

$ redis-cli --cluster check 10.4.7.100:6379

10.4.7.100:6379 (3e10873c...) -> 0 keys | 5461 slots | 1 slaves.
10.4.7.102:6379 (c81b92bf...) -> 1 keys | 5461 slots | 1 slaves.
10.4.7.101:6379 (22311cc5...) -> 0 keys | 5462 slots | 1 slaves.
[OK] 1 keys in 3 masters.
0.00 keys per slot on average.
>>> Performing Cluster Check (using node 10.4.7.100:6379)
M: 3e10873c57baf06acad788b815344edecc6b3428 10.4.7.100:6379
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
M: c81b92bf2a1ea24c0881c296176f6a48cdce4460 10.4.7.102:6379
   slots:[5461-6825],[12288-16383] (5461 slots) master
   1 additional replica(s)
S: 82ff3b96e4f9bbe5e723fd066fe593294602bb02 10.4.7.102:6380
   slots: (0 slots) slave
   replicates 22311cc5373dff550d373f30b05dd41f2be50838
S: a48d503ce893f4e4b3f58a81fd2380dde358db48 10.4.7.101:6380
   slots: (0 slots) slave
   replicates 3e10873c57baf06acad788b815344edecc6b3428
S: 95f8bd253511c7c177e5d5f4caa5367ab85ca54a 10.4.7.100:6380
   slots: (0 slots) slave
   replicates c81b92bf2a1ea24c0881c296176f6a48cdce4460
M: 22311cc5373dff550d373f30b05dd41f2be50838 10.4.7.101:6379
   slots:[6826-12287] (5462 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...

[OK] All 16384 slots covered.

8.4 Restoring the cluster password

# Remember to remove the retired node's IP from the script first
$ ./redis_setpass.sh add     # same script as before