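Sharded cluster structure
Master-replica replication and Sentinel solve high availability and high-concurrency reads, but two problems remain:
1. Storing massive amounts of data
2. High-concurrency writes
A sharded cluster (Redis Cluster) addresses both. Its characteristics are:
1. The cluster has multiple masters, and each master stores a different part of the data
2. Each master can have multiple replica (slave) nodes
3. Masters monitor each other's health with ping messages
4. A client can send a request to any node; the request is ultimately redirected to the node that owns the key.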
 master
M: 2bba2a3ab435001137769f81233137c4cd67f59c 192.168.25.129:7002
slots:[5461-10922] (5462 slots) master
M: b2aa644274742bf38849ab52f9b27f6883cad7dc 192.168.25.129:7003
slots:[10923-16383] (5461 slots) master
S: 8344d2abd5585b130bab7473aea806f2c160a4aa 192.168.25.129:7004
replicates b2aa644274742bf38849ab52f9b27f6883cad7dc
S: f3e56d116c1b2ad41d326ea3cd7dfb9ff4231ecd 192.168.25.129:7005
replicates 131fc7b7608aac3853a62abde00925d46db42abc
S: 466f5acac9d64763b4076df58aca29ab545d0137 192.168.25.129:7006
replicates 2bba2a3ab435001137769f81233137c4cd67f59c
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
.
# The 3-master, 3-replica configuration succeeded; M marks a master node and S marks a slave (replica) node
>>> Performing Cluster Check (using node 192.168.25.129:7001)
M: 131fc7b7608aac3853a62abde00925d46db42abc 192.168.25.129:7001
slots:[0-5460] (5461 slots) master
1 additional replica(s)
S: 466f5acac9d64763b4076df58aca29ab545d0137 192.168.25.129:7006
slots: (0 slots) slave
replicates 2bba2a3ab435001137769f81233137c4cd67f59c
S: f3e56d116c1b2ad41d326ea3cd7dfb9ff4231ecd 192.168.25.129:7005
slots: (0 slots) slave
replicates 131fc7b7608aac3853a62abde00925d46db42abc
M: 2bba2a3ab435001137769f81233137c4cd67f59c 192.168.25.129:7002
slots:[5461-10922] (5462 slots) master
1 additional replica(s)
S: 8344d2abd5585b130bab7473aea806f2c160a4aa 192.168.25.129:7004
slots: (0 slots) slave
replicates b2aa644274742bf38849ab52f9b27f6883cad7dc
M: b2aa644274742bf38849ab52f9b27f6883cad7dc 192.168.25.129:7003
slots:[10923-16383] (5461 slots) master
1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
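The three slot ranges in the output above (5461, 5462 and 5461 slots) can be reproduced with a short calculation. The sketch below is only an illustration: the class name SlotSplit and the rounding strategy are assumptions chosen to match the ranges printed here, not a statement of how redis-cli is implemented.

```java
import java.util.ArrayList;
import java.util.List;

final class SlotSplit {
    static final int TOTAL_SLOTS = 16384;

    /** Returns one {first, last} pair per master, covering slots 0..16383 without gaps. */
    static List<int[]> split(int masters) {
        List<int[]> ranges = new ArrayList<>();
        double slotsPerNode = TOTAL_SLOTS / (double) masters;
        double cursor = 0.0;
        int first = 0;
        for (int i = 0; i < masters; i++) {
            int last = (int) Math.round(cursor + slotsPerNode - 1);
            // The last master always absorbs whatever remains.
            if (i == masters - 1 || last > TOTAL_SLOTS - 1) {
                last = TOTAL_SLOTS - 1;
            }
            if (last < first) {
                last = first;
            }
            ranges.add(new int[] {first, last});
            first = last + 1;
            cursor += slotsPerNode;
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Print the ranges for the 3-master layout used in this article.
        for (int[] r : split(3)) {
            System.out.printf("slots:[%d-%d] (%d slots)%n", r[0], r[1], r[1] - r[0] + 1);
        }
    }
}
```

For three masters this yields 0-5460, 5461-10922 and 10923-16383, the same ranges reported by the cluster check.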
Connect to any Redis node to check the cluster state
root@f7cb69c25237:/data# redis-cli -h 192.168.25.129 -p 7001 cluster nodes
466f5acac9d64763b4076df58aca29ab545d0137 192.168.25.129:7006@17006 slave 2bba2a3ab435001137769f81233137c4cd67f59c 0 1636888476793 2 connected
f3e56d116c1b2ad41d326ea3cd7dfb9ff4231ecd 192.168.25.129:7005@17005 slave 131fc7b7608aac3853a62abde00925d46db42abc 0 1636888475776 1 connected
2bba2a3ab435001137769f81233137c4cd67f59c 192.168.25.129:7002@17002 master - 0 1636888476590 2 connected 5461-10922
8344d2abd5585b130bab7473aea806f2c160a4aa 192.168.25.129:7004@17004 slave b2aa644274742bf38849ab52f9b27f6883cad7dc 0 1636888477504 3 connected
131fc7b7608aac3853a62abde00925d46db42abc 192.168.25.129:7001@17001 myself,master - 0 1636888477000 1 connected 0-5460
b2aa644274742bf38849ab52f9b27f6883cad7dc 192.168.25.129:7003@17003 master - 0 1636888477807 3 connected 10923-16383
Test that the cluster works
# -h is the host address, -p is the port of the Redis node
root@ce67f012b23e:/data# redis-cli -h 192.168.25.129 -p 7005
# Setting a value fails with a MOVED error; redis-cli needs the -c option (cluster mode), as shown below
192.168.25.129:7005> set name zs
(error) MOVED 5798 192.168.25.129:7002
192.168.25.129:7005> exit
# Reconnect with -c
root@ce67f012b23e:/data# redis-cli -c -h 192.168.25.129 -p 7005
# Write a value
192.168.25.129:7005> set name zs
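# The key name hashes to slot 5798, so the command is redirected to node 7002, which owns that slot
# The slot range of each node appears in the cluster state output above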
-> Redirected to slot [5798] located at 192.168.25.129:7002
OK
192.168.25.129:7002> get name
"zs"
192.168.25.129:7002>
The cluster is now set up successfully!
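The -c option makes redis-cli follow MOVED replies automatically. To make that behaviour concrete, here is a minimal sketch of how a client might parse such a reply before reconnecting to the indicated node. The class name MovedRedirect and its fields are hypothetical, and a real client library handles many more cases (for example ASK redirects).

```java
final class MovedRedirect {
    final int slot;
    final String host;
    final int port;

    MovedRedirect(int slot, String host, int port) {
        this.slot = slot;
        this.host = host;
        this.port = port;
    }

    /** Parses "MOVED <slot> <host>:<port>"; returns null for any other reply. */
    static MovedRedirect parse(String error) {
        String[] parts = error.trim().split("\\s+");
        if (parts.length != 3 || !parts[0].equals("MOVED")) {
            return null;
        }
        int colon = parts[2].lastIndexOf(':');
        if (colon < 0) {
            return null;
        }
        int slot = Integer.parseInt(parts[1]);
        String host = parts[2].substring(0, colon);
        int port = Integer.parseInt(parts[2].substring(colon + 1));
        return new MovedRedirect(slot, host, port);
    }

    public static void main(String[] args) {
        // The error text is the one returned by node 7005 in the transcript above.
        MovedRedirect m = parse("MOVED 5798 192.168.25.129:7002");
        System.out.println("retry on " + m.host + ":" + m.port + " for slot " + m.slot);
    }
}
```

A client that receives this reply would resend the same command to 192.168.25.129:7002, which is what redis-cli -c does in the transcript.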
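Hash slots
Redis divides the key space into 16384 hash slots, numbered 0 to 16383, and assigns each master a range of them; the ranges are visible in the cluster information.
Keys are bound to slots, not to nodes. Redis computes the slot from the effective part of the key, in one of two ways:
- If the key contains "{}" with at least one character between the braces, the part inside "{}" is the effective part.
- If the key does not contain "{}", the whole key is the effective part.
The slot is computed by applying the CRC16 algorithm to the effective part and taking the result modulo 16384.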
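The slot calculation described above can be sketched in a few lines. This example assumes the CRC16 variant is the XMODEM one (polynomial 0x1021, initial value 0), which is the variant the Redis Cluster specification names; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

final class HashSlot {
    static final int SLOT_COUNT = 16384;

    /** Returns the text inside the first non-empty "{...}", or the whole key otherwise. */
    static String effectivePart(String key) {
        int open = key.indexOf('{');
        if (open >= 0) {
            int close = key.indexOf('}', open + 1);
            // At least one character must sit between the braces.
            if (close > open + 1) {
                return key.substring(open + 1, close);
            }
        }
        return key;
    }

    /** Bitwise CRC16/XMODEM: polynomial 0x1021, initial value 0. */
    static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = (crc & 0x8000) != 0 ? (crc << 1) ^ 0x1021 : crc << 1;
                crc &= 0xFFFF;
            }
        }
        return crc;
    }

    static int slot(String key) {
        return crc16(effectivePart(key).getBytes(StandardCharsets.UTF_8)) % SLOT_COUNT;
    }

    public static void main(String[] args) {
        // Keys taken from the redis-cli transcripts in this article.
        System.out.println("name -> " + slot("name"));
        System.out.println("num  -> " + slot("num"));
        // Keys sharing a "{...}" tag land in the same slot.
        System.out.println(slot("{num}:a") == slot("{num}:b"));
    }
}
```

The results for name and num should agree with the slots redis-cli reports in this article (5798 and 2765). Wrapping a shared prefix in "{}" is the usual way to force related keys into the same slot.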
Scaling the cluster (adding and removing Redis nodes)
Adding a node to the cluster
Start a new container
[root@192 redis]# docker run -d --name redis7007 -p 7007:7007 -p 17007:17007 -v /home/docker/redis/7007/conf/redis.conf:/etc/local/redis/redis.conf -v /home/docker/redis/7007/data/:/data redis /etc/local/redis/redis.conf
825dbdda5ad9823e43d3754741051e91800ed15e38f25ef0eb25febbd415ad45
Add the new container to the Redis cluster
# add-node: adds a new node to the cluster
# 192.168.25.129:7007: address of the new node
# 192.168.25.129:7001: any node that already exists in the cluster
root@825dbdda5ad9:/data# redis-cli --cluster add-node 192.168.25.129:7007 192.168.25.129:7001
>>> Adding node 192.168.25.129:7007 to cluster 192.168.25.129:7001
>>> Performing Cluster Check (using node 192.168.25.129:7001)
M: 131fc7b7608aac3853a62abde00925d46db42abc 192.168.25.129:7001
slots:[0-5460] (5461 slots) master
1 additional replica(s)
S: 466f5acac9d64763b4076df58aca29ab545d0137 192.168.25.129:7006
slots: (0 slots) slave
replicates 2bba2a3ab435001137769f81233137c4cd67f59c
S: f3e56d116c1b2ad41d326ea3cd7dfb9ff4231ecd 192.168.25.129:7005
slots: (0 slots) slave
replicates 131fc7b7608aac3853a62abde00925d46db42abc
M: 2bba2a3ab435001137769f81233137c4cd67f59c 192.168.25.129:7002
slots:[5461-10922] (5462 slots) master
1 additional replica(s)
S: 8344d2abd5585b130bab7473aea806f2c160a4aa 192.168.25.129:7004
slots: (0 slots) slave
replicates b2aa644274742bf38849ab52f9b27f6883cad7dc
M: b2aa644274742bf38849ab52f9b27f6883cad7dc 192.168.25.129:7003
slots:[10923-16383] (5461 slots) master
1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
# The lines below show that node 7007 has joined the cluster successfully
>>> Send CLUSTER MEET to node 192.168.25.129:7007 to make it join the cluster.
[OK] New node added correctly.
View the Redis node information
# Any node in the cluster can be queried
root@f7cb69c25237:/data# redis-cli -p 7001 cluster nodes
f3e56d116c1b2ad41d326ea3cd7dfb9ff4231ecd 192.168.25.129:7005@17005 slave 131fc7b7608aac3853a62abde00925d46db42abc 0 1636941655000 1 connected
2bba2a3ab435001137769f81233137c4cd67f59c 192.168.25.129:7002@17002 master - 0 1636941656532 2 connected 5461-10922
8344d2abd5585b130bab7473aea806f2c160a4aa 192.168.25.129:7004@17004 slave b2aa644274742bf38849ab52f9b27f6883cad7dc 0 1636941656532 3 connected
b2aa644274742bf38849ab52f9b27f6883cad7dc 192.168.25.129:7003@17003 master - 0 1636941656000 3 connected 10923-16383
37b58e7256c033182c4e47cc6d102e2c987f7ee6 192.168.25.129:7007@17007 master - 0 1636941656000 0 connected
466f5acac9d64763b4076df58aca29ab545d0137 192.168.25.129:7006@17006 slave 2bba2a3ab435001137769f81233137c4cd67f59c 0 1636941656000 2 connected
131fc7b7608aac3853a62abde00925d46db42abc 192.168.25.129:7001@17001 myself,master - 0 1636941656000 1 connected 0-5460
The output above includes node 7007, but no slots have been assigned to it yet.
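Reading the cluster nodes output by eye gets tedious as the cluster grows. The following sketch parses one line of that output into its id, address, flags and slot ranges, so a program can spot a master that owns no slots. The class name ClusterNodeLine is hypothetical, and the field order is assumed to be the one visible in the output above (id, address, flags, master id, two timestamps, config epoch, link state, then slots).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ClusterNodeLine {
    final String id;
    final String address;
    final List<String> flags;
    final List<String> slots;

    ClusterNodeLine(String id, String address, List<String> flags, List<String> slots) {
        this.id = id;
        this.address = address;
        this.flags = flags;
        this.slots = slots;
    }

    /** Parses one line of "cluster nodes" output; slot ranges start at the ninth field. */
    static ClusterNodeLine parse(String line) {
        String[] f = line.trim().split("\\s+");
        if (f.length < 8) {
            throw new IllegalArgumentException("unexpected cluster nodes line: " + line);
        }
        List<String> slots = new ArrayList<>(Arrays.asList(f).subList(8, f.length));
        return new ClusterNodeLine(f[0], f[1], Arrays.asList(f[2].split(",")), slots);
    }

    boolean isMaster() {
        return flags.contains("master");
    }

    public static void main(String[] args) {
        // The line for node 7007, copied from the output above.
        ClusterNodeLine n = parse("37b58e7256c033182c4e47cc6d102e2c987f7ee6 "
                + "192.168.25.129:7007@17007 master - 0 1636941656000 0 connected");
        System.out.println(n.address + " master=" + n.isMaster() + " slots=" + n.slots);
    }
}
```

For node 7007 the slot list is empty, which is exactly the situation described above: the node has joined as a master but serves no keys yet.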
Assigning slots
If you are unsure of the command, run redis-cli --cluster help to see the usage.
# The key num is in slot 2765, which belongs to node 7001
root@825dbdda5ad9:/data# redis-cli -c -h 192.168.25.129 -p 7004
192.168.25.129:7004> get num
-> Redirected to slot [2765] located at 192.168.25.129:7001
"123"
# Assign slots 0-2999 (3000 slots) to node 7007
root@f7cb69c25237:/data# redis-cli --cluster reshard 192.168.25.129:7001
>>> Performing Cluster Check (using node 192.168.25.129:7001)
M: 131fc7b7608aac3853a62abde00925d46db42abc 192.168.25.129:7001
slots:[0-5460] (5461 slots) master
1 additional replica(s)
S: f3e56d116c1b2ad41d326ea3cd7dfb9ff4231ecd 192.168.25.129:7005
slots: (0 slots) slave
replicates 131fc7b7608aac3853a62abde00925d46db42abc
M: 2bba2a3ab435001137769f81233137c4cd67f59c 192.168.25.129:7002
slots:[5461-10922] (5462 slots) master
1 additional replica(s)
S: 8344d2abd5585b130bab7473aea806f2c160a4aa 192.168.25.129:7004
slots: (0 slots) slave
replicates b2aa644274742bf38849ab52f9b27f6883cad7dc
M: b2aa644274742bf38849ab52f9b27f6883cad7dc 192.168.25.129:7003
slots:[10923-16383] (5461 slots) master
1 additional replica(s)
M: 37b58e7256c033182c4e47cc6d102e2c987f7ee6 192.168.25.129:7007
slots: (0 slots) master
S: 466f5acac9d64763b4076df58aca29ab545d0137 192.168.25.129:7006
slots: (0 slots) slave
replicates 2bba2a3ab435001137769f81233137c4cd67f59c
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
# Number of slots to move; the valid range is 1 to 16384
How many slots do you want to move (from 1 to 16384)? 3000
# ID of the receiving node; here it is 7007, so paste 7007's node ID
What is the receiving node ID? 37b58e7256c033182c4e47cc6d102e2c987f7ee6
Please enter all the source node IDs.
Type 'all' to use all the nodes as source nodes for the hash slots.
Type 'done' once you entered all the source nodes IDs.
# Source node to take the slots from; here it is 7001, so enter 7001's node ID
Source node #1: 131fc7b7608aac3853a62abde00925d46db42abc
# Finish entering source nodes
Source node #2: done
# Slot migration to 7007 starts here
Ready to move 3000 slots.
Source nodes:
M: 131fc7b7608aac3853a62abde00925d46db42abc 192.168.25.129:7001
slots:[0-5460] (5461 slots) master
1 additional replica(s)
Destination node:
M: 37b58e7256c033182c4e47cc6d102e2c987f7ee6 192.168.25.129:7007
slots: (0 slots) master
Resharding plan:
Moving slot 0 from 131fc7b7608aac3853a62abde00925d46db42abc
Moving slot 1 from 131fc7b7608aac3853a62abde00925d46db42abc
Moving slot 2 from 131fc7b7608aac3853a62abde00925d46db42abc
.......................
Moving slot 2996 from 192.168.25.129:7001 to 192.168.25.129:7007:
Moving slot 2997 from 192.168.25.129:7001 to 192.168.25.129:7007:
Moving slot 2998 from 192.168.25.129:7001 to 192.168.25.129:7007:
Moving slot 2999 from 192.168.25.129:7001 to 192.168.25.129:7007:
Slot assignment is complete!
Verify the assignment
# View the node information; node 7007 now owns slots 0-2999
root@f7cb69c25237:/data# redis-cli -p 7001 cluster nodes
f3e56d116c1b2ad41d326ea3cd7dfb9ff4231ecd 192.168.25.129:7005@17005 slave 131fc7b7608aac3853a62abde00925d46db42abc 0 1636942415275 1 connected
2bba2a3ab435001137769f81233137c4cd67f59c 192.168.25.129:7002@17002 master - 0 1636942415580 2 connected 5461-10922
8344d2abd5585b130bab7473aea806f2c160a4aa 192.168.25.129:7004@17004 slave b2aa644274742bf38849ab52f9b27f6883cad7dc 0 1636942415073 3 connected
b2aa644274742bf38849ab52f9b27f6883cad7dc 192.168.25.129:7003@17003 master - 0 1636942416294 3 connected 10923-16383
37b58e7256c033182c4e47cc6d102e2c987f7ee6 192.168.25.129:7007@17007 master - 0 1636942416801 7 connected 0-2999
466f5acac9d64763b4076df58aca29ab545d0137 192.168.25.129:7006@17006 slave 2bba2a3ab435001137769f81233137c4cd67f59c 0 1636942416000 2 connected
131fc7b7608aac3853a62abde00925d46db42abc 192.168.25.129:7001@17001 myself,master - 0 1636942414000 1 connected 3000-5460
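With the ranges from this output, finding the owner of a slot is a simple range lookup. The sketch below keys a sorted map by the first slot of each range and looks up the greatest start that is not above the requested slot. The class name SlotTable is hypothetical, and the approach assumes the ranges are contiguous and cover all 16384 slots, as they do here.

```java
import java.util.Map;
import java.util.TreeMap;

final class SlotTable {
    // Maps the first slot of each range to the node that owns the range.
    private final TreeMap<Integer, String> startToNode = new TreeMap<>();

    void assign(int firstSlot, String node) {
        startToNode.put(firstSlot, node);
    }

    /** Returns the owning node, or null if no range starts at or below the slot. */
    String ownerOf(int slot) {
        Map.Entry<Integer, String> e = startToNode.floorEntry(slot);
        return e == null ? null : e.getValue();
    }

    /** Builds the layout shown above, after 3000 slots moved from 7001 to 7007. */
    static SlotTable afterReshard() {
        SlotTable t = new SlotTable();
        t.assign(0, "192.168.25.129:7007");
        t.assign(3000, "192.168.25.129:7001");
        t.assign(5461, "192.168.25.129:7002");
        t.assign(10923, "192.168.25.129:7003");
        return t;
    }

    public static void main(String[] args) {
        SlotTable t = afterReshard();
        // Slot 2765 holds the key num; slot 5798 holds the key name.
        System.out.println("slot 2765 -> " + t.ownerOf(2765));
        System.out.println("slot 5798 -> " + t.ownerOf(5798));
    }
}
```

Slot 2765 now resolves to 192.168.25.129:7007, so a request for the key num is served by the new node rather than by 7001.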
# Test with data: get num. Before 7007 was added, num was stored in slot 2765 on node 7001
root@f7cb69c25237:/data# redis-cli -c -h 192.168.25.129 -p 7007
192.168.25.129:7007> get num
"123"
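Failover
When a master in the cluster goes down
Automatic failover
- First, the instance loses its connection to the other instances in the cluster
- Then it is marked as possibly failed (PFAIL)
- Finally it is confirmed as failed (FAIL), and one of its replicas is automatically promoted to be the new master
- When the failed node recovers, it rejoins as a replica
After recovery the former master automatically becomes a replica, as shown in the figure below: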
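The step from "possibly failed" to "confirmed failed" depends on how many masters agree that the node is unreachable. The sketch below models only that decision: a node is treated as FAIL once a majority of masters report it. It is a deliberate simplification with hypothetical names; real Redis Cluster also applies a time window to the reports and then runs a replica election, neither of which is modelled here.

```java
final class FailureVote {
    enum State { ONLINE, PFAIL, FAIL }

    /**
     * Decides the state of a node from the number of masters that
     * currently report it as unreachable.
     */
    static State evaluate(int totalMasters, int mastersReportingFailure) {
        if (mastersReportingFailure <= 0) {
            return State.ONLINE;
        }
        int majority = totalMasters / 2 + 1;
        return mastersReportingFailure >= majority ? State.FAIL : State.PFAIL;
    }

    public static void main(String[] args) {
        // The cluster in this article has four masters after 7007 was added.
        int masters = 4;
        for (int reports = 0; reports <= 3; reports++) {
            System.out.println(reports + " report(s) -> " + evaluate(masters, reports));
        }
    }
}
```

With four masters, one or two reports leave the node in PFAIL, and the third report reaches the majority needed to mark it FAIL and allow a replica to take over.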
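Data migration
The cluster failover command manually takes a master out of service and switches over to a replica: run it on the replica that should take over, and that replica becomes the new master. This allows data migration that clients do not notice.
Manual failover supports three modes:
- default: the master and replica synchronize their data first, as shown in the figure below
- force: skips the consistency check on the replication offset
- takeover: goes straight to step 5 of the figure above, ignoring data consistency, the master's state, and the votes of the other masters.
# Promote 7003 to master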
root@f7cb69c25237:/data# redis-cli -h 192.168.25.129 -p 7003
192.168.25.129:7003> cluster failover
OK
192.168.25.129:7003> exit
root@f7cb69c25237:/data# redis-cli -p 7001 cluster nodes
f3e56d116c1b2ad41d326ea3cd7dfb9ff4231ecd 192.168.25.129:7005@17005 slave 131fc7b7608aac3853a62abde00925d46db42abc 0 1636944453000 1 connected
2bba2a3ab435001137769f81233137c4cd67f59c 192.168.25.129:7002@17002 master - 0 1636944453000 2 connected 5461-10922
8344d2abd5585b130bab7473aea806f2c160a4aa 192.168.25.129:7004@17004 slave b2aa644274742bf38849ab52f9b27f6883cad7dc 0 1636944454592 9 connected
b2aa644274742bf38849ab52f9b27f6883cad7dc 192.168.25.129:7003@17003 master - 0 1636944453574 9 connected 10923-16383
37b58e7256c033182c4e47cc6d102e2c987f7ee6 192.168.25.129:7007@17007 master - 0 1636944453065 7 connected 0-2999
466f5acac9d64763b4076df58aca29ab545d0137 192.168.25.129:7006@17006 slave 2bba2a3ab435001137769f81233137c4cd67f59c 0 1636944454082 2 connected
131fc7b7608aac3853a62abde00925d46db42abc 192.168.25.129:7001@17001 myself,master - 0 1636944452000 1 connected 3000-5460
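Accessing the sharded cluster with RedisTemplate
RedisTemplate also supports sharded clusters through Lettuce under the hood, and the steps are largely the same as for Sentinel mode.
Add the dependency
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-redis</artifactId>
</dependency>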
Configure the cluster node addresses
spring:
  redis:
    # sharded cluster mode
    cluster:
      nodes:
        - 192.168.25.129:7001
        - 192.168.25.129:7002
        - 192.168.25.129:7003
        - 192.168.25.129:7004
        - 192.168.25.129:7005
        - 192.168.25.129:7006
        - 192.168.25.129:7007
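Configure read/write splitting
import io.lettuce.core.ReadFrom;
import org.springframework.boot.autoconfigure.data.redis.LettuceClientConfigurationBuilderCustomizer;
import org.springframework.context.annotation.Bean;
@Bean
public LettuceClientConfigurationBuilderCustomizer configurationBuilderCustomizer() {
    // REPLICA_PREFERRED: read from replicas, fall back to the master if none is available
    return clientConfigurationBuilder -> clientConfigurationBuilder.readFrom(ReadFrom.REPLICA_PREFERRED);
}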
Check the log
11-15 10:58:23:481 DEBUG 22664 --- [nio-8080-exec-2] i.l.core.cluster.RedisClusterClient : connectCluster([RedisURI [host='192.168.25.129', port=7001], RedisURI [host='192.168.25.129', port=7002], RedisURI [host='192.168.25.129', port=7003], RedisURI [host='192.168.25.129', port=7004], RedisURI [host='192.168.25.129', port=7005], RedisURI [host='192.168.25.129', port=7006], RedisURI [host='192.168.25.129', port=7007]])