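Table of Contents
- 1. What Is Sharding
- 2. Why Shard
- 3. How Sharding Works
- 4. Building a Sharded Cluster
- 4.1 Configure and start the config server replica set
- 4.2 Configure the shard replica sets
- 4.3 Configure and start the router node
- 4.4 Add the shards through mongos (the router)
- 4.5 Enable sharding for the database and collection (specify the shard key)
- 4.6 Insert test data into the collection
- 4.7 Verify the sharding result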
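1. What Is Sharding
Sharding is the method MongoDB uses to split a large collection horizontally across multiple servers (or replica sets). It lets a cluster store more data and handle a heavier load without requiring a single powerful machine.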
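2. Why Shard
- The required storage capacity exceeds the disk capacity of a single machine.
- The active data set exceeds the memory of a single machine, so many requests have to read from disk, which hurts performance.
- The IOPS demand exceeds what a single MongoDB node can serve; as the data grows, the single-instance bottleneck becomes more and more obvious.
- A replica set has a limit on the number of members.
Vertical scaling: add more CPU and storage resources to a single server to increase capacity.
Horizontal scaling: distribute the data set across multiple servers. Sharding is horizontal scaling.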
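3. How Sharding Works
A sharded cluster consists of the following three services:
Shard Server: each shard consists of one or more mongod processes and stores the data.
Router Server: the entry point for requests to the cluster. All requests are coordinated by the router (mongos), so the application does not need its own routing layer. The router is a request dispatch center that forwards each application request to the appropriate shard server.
Config Server: the configuration server. It stores the metadata for all databases (routing and shard information).
Shard key
To distribute the documents of a collection, MongoDB partitions the collection by the shard key.
Chunk
Within a shard server, MongoDB further divides the data into chunks; each chunk represents part of the data on that shard server. Each chunk covers a half-open range of shard key values: the lower bound is inclusive and the upper bound is exclusive.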
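The half-open chunk ranges can be illustrated with a small sketch in plain JavaScript (runnable with Node.js). This is only a toy model of the lookup a router performs, not how mongos is implemented: the chunk table, its bounds, and the helper names findChunk and shardsForRange are made up for illustration, and -Infinity/Infinity stand in for the minimum and maximum key values.

```javascript
// Hypothetical chunk table: each chunk covers the half-open range [min, max)
// of shard key values and lives on one shard. Bounds are made up.
const chunks = [
  { min: -Infinity, max: 20, shard: "shard1" },
  { min: 20, max: 30, shard: "shard2" },
  { min: 30, max: Infinity, shard: "shard1" },
];

// Find the single chunk whose [min, max) range contains the key.
function findChunk(key) {
  return chunks.find((c) => key >= c.min && key < c.max);
}

// List the shards that own at least one chunk overlapping [lo, hi).
function shardsForRange(lo, hi) {
  const hit = chunks.filter((c) => c.min < hi && c.max > lo);
  return [...new Set(hit.map((c) => c.shard))];
}

console.log(findChunk(20).shard);    // shard2: the lower bound is inclusive
console.log(findChunk(30).shard);    // shard1: the upper bound is exclusive
console.log(shardsForRange(20, 30)); // [ 'shard2' ]
```

Because every key value falls into exactly one range, a lookup by shard key never matches two chunks.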
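Sharding strategies
- Range-based sharding
Range-based sharding partitions the data by the value of the shard key; each chunk is assigned a range of values.
Range-based sharding suits range lookups. For example, to find documents where X is in [20, 30), the mongos router uses the metadata stored on the config servers to go directly to the chunks on the shards that hold that range.
Drawback: if the shard key increases (or decreases) monotonically, newly inserted documents mostly land in the same chunk, so write capacity does not scale.
- Hash-based sharding
Hash-based sharding computes a hash of the shard key; each chunk is assigned a range of hash values.
Hash-based sharding complements range-based sharding: it spreads documents randomly across chunks, which scales write capacity well and makes up for the weakness of range-based sharding. The drawback is that it cannot serve range queries efficiently: every range query has to be sent to all shards to find the matching documents.
- Compound shard key A + B (the same spreading idea, but not a direct hash)
When the collection has no suitable single field to use as a shard key, or the intended shard key has too low a cardinality (that is, few distinct values, such as day of the week with only 7), you can pick a second field and use a compound shard key, or even add a redundant field for the purpose. The usual pattern is to combine a coarse-grained field with a fine-grained one.
- Choosing a good shard key
It comes down to two considerations: reads and writes. Ideally a query touches as few shards as possible, while writes are spread evenly across all shards; the key is how to balance query performance against load distribution.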
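The write-scaling difference between the two strategies can be seen in a toy simulation (plain JavaScript, runnable with Node.js). Everything here is an assumption made for illustration: the three pre-existing range chunks, the key values 2001 to 3000, and above all toyHash, which is a simple multiplicative hash standing in for MongoDB's real hash function.

```javascript
// Toy model with two shards. Assumption: the range-sharded collection already
// has three chunks, and new shard key values keep increasing (2001..3000).
const rangeChunks = [
  { min: -Infinity, max: 1000, shard: "shard1" },
  { min: 1000, max: 2000, shard: "shard2" },
  { min: 2000, max: Infinity, shard: "shard1" },
];

function shardByRange(key) {
  return rangeChunks.find((c) => key >= c.min && key < c.max).shard;
}

// Stand-in hash (multiplicative hashing), NOT MongoDB's real hash function.
function toyHash(key) {
  return Math.imul(key, 2654435761) >>> 0; // unsigned 32-bit result
}

// Hashed sharding: each shard owns one half of the 32-bit hash space.
function shardByHash(key) {
  return toyHash(key) < 0x80000000 ? "shard1" : "shard2";
}

const rangeCounts = { shard1: 0, shard2: 0 };
const hashCounts = { shard1: 0, shard2: 0 };
for (let key = 2001; key <= 3000; key++) {
  rangeCounts[shardByRange(key)]++;
  hashCounts[shardByHash(key)]++;
}

console.log("range: ", rangeCounts); // every insert lands in the last chunk
console.log("hashed:", hashCounts);  // inserts are split between both shards
```

With increasing keys, the range-sharded inserts all hit the chunk that ends at the maximum key, while the hashed inserts are divided between the two shards.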
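4. Building a Sharded Cluster
4.1 Configure and start the config server replica set
Node 1: config-17017.conf
# Data file location
dbpath=/usr/local/mongodb/config/config1
# Log file location
logpath=/usr/local/mongodb/config/logs/config1.log
# Append to the log file instead of overwriting it
logappend=true
# Run as a daemon process
fork = true
bind_ip=0.0.0.0
port = 17017
# This node is a config server
configsvr=true
# Replica set name of the config servers
replSet=configsvr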
Node 2: config-17018.conf
# Data file location
dbpath=/usr/local/mongodb/config/config2
# Log file location
logpath=/usr/local/mongodb/config/logs/config2.log
# Append to the log file instead of overwriting it
logappend=true
# Run as a daemon process
fork = true
bind_ip=0.0.0.0
port = 17018
# This node is a config server
configsvr=true
# Replica set name of the config servers
replSet=configsvr
Node 3: config-17019.conf
# Data file location
dbpath=/usr/local/mongodb/config/config3
# Log file location
logpath=/usr/local/mongodb/config/logs/config3.log
# Append to the log file instead of overwriting it
logappend=true
# Run as a daemon process
fork = true
bind_ip=0.0.0.0
port = 17019
# This node is a config server
configsvr=true
# Replica set name of the config servers
replSet=configsvr
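The three files differ only in the node index and the port, so they could be generated from one template. A rough sketch in plain JavaScript (runnable with Node.js) follows; the function name renderConfigServerConf is hypothetical, and the comment lines are left out for brevity.

```javascript
// Render the key=value config for one config server node.
// Only the node index and the port differ between the three files.
function renderConfigServerConf(index, port) {
  return [
    `dbpath=/usr/local/mongodb/config/config${index}`,
    `logpath=/usr/local/mongodb/config/logs/config${index}.log`,
    "logappend=true",
    "fork = true",
    "bind_ip=0.0.0.0",
    `port = ${port}`,
    "configsvr=true",
    "replSet=configsvr",
  ].join("\n");
}

const configPorts = [17017, 17018, 17019];
configPorts.forEach((port, i) => {
  console.log(`--- config-${port}.conf ---`);
  console.log(renderConfigServerConf(i + 1, port));
});
```

Note that mongod does not create a missing dbpath directory, so the data and log directories have to exist before the nodes are started.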
Start the config nodes
./bin/mongod -f config/config-17017.conf
./bin/mongod -f config/config-17018.conf
./bin/mongod -f config/config-17019.conf
Open a mongo shell on any of the nodes and initiate the config server replica set. Remember to run use admin first.
./bin/mongo --port 17017
use admin
var cfg ={"_id":"configsvr",
"members":[
{"_id":1,"host":"127.0.0.1:17017"},
{"_id":2,"host":"127.0.0.1:17018"},
{"_id":3,"host":"127.0.0.1:17019"}
]};
rs.initiate(cfg)
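The same configuration shape is needed again for each shard, so it may help to build it with a small helper. This is just a sketch in plain JavaScript (runnable with Node.js); buildReplSetConfig is a hypothetical name, and it assumes all members run on one host, as in this article.

```javascript
// Build the document that is passed to rs.initiate() for a replica set whose
// members all run on one host. Member _id values start at 1, as above.
function buildReplSetConfig(name, host, ports) {
  return {
    _id: name,
    members: ports.map((port, i) => ({ _id: i + 1, host: `${host}:${port}` })),
  };
}

const generatedCfg = buildReplSetConfig("configsvr", "127.0.0.1", [17017, 17018, 17019]);
console.log(JSON.stringify(generatedCfg, null, 2));
// In the mongo shell, the resulting object would be passed to rs.initiate().
```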
4.2 Configure the shard replica sets
shard1 replica set, ports 37017 to 37019
shard1-37017.conf
dbpath=/usr/local/mongodb/shard/shard1/shard1-37017
bind_ip=0.0.0.0
port=37017
fork=true
logpath=/usr/local/mongodb/shard/shard1/shard1-37017.log
replSet=shard1
shardsvr=true
shard1-37018.conf
dbpath=/usr/local/mongodb/shard/shard1/shard1-37018
bind_ip=0.0.0.0
port=37018
fork=true
logpath=/usr/local/mongodb/shard/shard1/shard1-37018.log
replSet=shard1
shardsvr=true
shard1-37019.conf
dbpath=/usr/local/mongodb/shard/shard1/shard1-37019
bind_ip=0.0.0.0
port=37019
fork=true
logpath=/usr/local/mongodb/shard/shard1/shard1-37019.log
replSet=shard1
shardsvr=true
Start each mongod, then connect to one of them and configure the replica set
./bin/mongod -f shard/shard1/shard1-37017.conf
./bin/mongod -f shard/shard1/shard1-37018.conf
./bin/mongod -f shard/shard1/shard1-37019.conf
./bin/mongo --port 37017
var cfg ={"_id":"shard1",
"protocolVersion" : 1,
"members":[
{"_id":1,"host":"127.0.0.1:37017"},
{"_id":2,"host":"127.0.0.1:37018"},
{"_id":3,"host":"127.0.0.1:37019"}
]};
rs.initiate(cfg)
rs.status()
shard2 replica set, ports 47017 to 47019
shard2-47017.conf
dbpath=/usr/local/mongodb/shard/shard2/shard2-47017
bind_ip=0.0.0.0
port=47017
fork=true
logpath=/usr/local/mongodb/shard/shard2/shard2-47017.log
replSet=shard2
shardsvr=true
shard2-47018.conf
dbpath=/usr/local/mongodb/shard/shard2/shard2-47018
bind_ip=0.0.0.0
port=47018
fork=true
logpath=/usr/local/mongodb/shard/shard2/shard2-47018.log
replSet=shard2
shardsvr=true
shard2-47019.conf
dbpath=/usr/local/mongodb/shard/shard2/shard2-47019
bind_ip=0.0.0.0
port=47019
fork=true
logpath=/usr/local/mongodb/shard/shard2/shard2-47019.log
replSet=shard2
shardsvr=true
Start each mongod, then connect to one of them and configure the replica set
./bin/mongod -f shard/shard2/shard2-47017.conf
./bin/mongod -f shard/shard2/shard2-47018.conf
./bin/mongod -f shard/shard2/shard2-47019.conf
./bin/mongo --port 47017
var cfg ={"_id":"shard2",
"protocolVersion" : 1,
"members":[
{"_id":1,"host":"127.0.0.1:47017"},
{"_id":2,"host":"127.0.0.1:47018"},
{"_id":3,"host":"127.0.0.1:47019"}
]};
rs.initiate(cfg)
rs.status()
4.3 Configure and start the router node
route-27017.conf
port=27017
bind_ip=0.0.0.0
fork=true
logpath=/usr/local/mongodb/route/route.log
configdb=configsvr/127.0.0.1:17017,127.0.0.1:17018,127.0.0.1:17019
Start the router node with mongos (note: not mongod)
./bin/mongos -f route/route-27017.conf
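The configdb value has the form replica set name, a slash, then a comma-separated list of host:port members. The shard strings passed to sh.addShard in the next step have the same form. A minimal sketch in plain JavaScript (runnable with Node.js) that assembles such a string; replSetSeedList is a hypothetical helper name.

```javascript
// Build a "<replSetName>/<host:port>,<host:port>,..." string.
function replSetSeedList(name, host, ports) {
  return `${name}/${ports.map((p) => `${host}:${p}`).join(",")}`;
}

console.log(replSetSeedList("configsvr", "127.0.0.1", [17017, 17018, 17019]));
// prints: configsvr/127.0.0.1:17017,127.0.0.1:17018,127.0.0.1:17019
```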
4.4 Add the shards through mongos (the router)
Connect to the mongos router
./bin/mongo --port 27017
sh.status()
sh.addShard("shard1/127.0.0.1:37017,127.0.0.1:37018,127.0.0.1:37019");
sh.addShard("shard2/127.0.0.1:47017,127.0.0.1:47018,127.0.0.1:47019");
sh.status()
[root@rpp mongodb]# ./bin/mongo -port=27017
MongoDB shell version v4.2.9
connecting to: mongodb://127.0.0.1:27017/?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("6639aef1-6bc0-434d-a081-63e55f2578a4") }
MongoDB server version: 4.2.9
Server has startup warnings:
2020-10-10T18:22:15.444+0800 I CONTROL [main]
2020-10-10T18:22:15.444+0800 I CONTROL [main] ** WARNING: Access control is not enabled for the database.
2020-10-10T18:22:15.444+0800 I CONTROL [main] ** Read and write access to data and configuration is unrestricted.
2020-10-10T18:22:15.444+0800 I CONTROL [main] ** WARNING: You are running this process as the root user, which is not recommended.
2020-10-10T18:22:15.444+0800 I CONTROL [main]
mongos> sh.status()
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("5f8183ddb1aac6cff1404a62")
}
shards:
active mongoses:
autosplit:
Currently enabled: yes
balancer:
Currently enabled: yes
Currently running: no
Failed balancer rounds in last 5 attempts: 0
Migration Results for the last 24 hours:
No recent migrations
databases:
{ "_id" : "config", "primary" : "config", "partitioned" : true }
mongos> sh.addShard("shard1/127.0.0.1:37017,127.0.0.1:37018,127.0.0.1:37019");
{
"shardAdded" : "shard1",
"ok" : 1,
"operationTime" : Timestamp(1602325711, 7),
"$clusterTime" : {
"clusterTime" : Timestamp(1602325711, 7),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
}
}
mongos> sh.addShard("shard2/127.0.0.1:47017,127.0.0.1:47018,127.0.0.1:47019");
{
"shardAdded" : "shard2",
"ok" : 1,
"operationTime" : Timestamp(1602325717, 7),
"$clusterTime" : {
"clusterTime" : Timestamp(1602325717, 7),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
}
}
mongos> sh.status()
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("5f8183ddb1aac6cff1404a62")
}
shards:
{ "_id" : "shard1", "host" : "shard1/127.0.0.1:37017,127.0.0.1:37018,127.0.0.1:37019", "state" : 1 }
{ "_id" : "shard2", "host" : "shard2/127.0.0.1:47017,127.0.0.1:47018,127.0.0.1:47019", "state" : 1 }
active mongoses:
"4.2.9" : 1
autosplit:
Currently enabled: yes
balancer:
Currently enabled: yes
Currently running: no
Failed balancer rounds in last 5 attempts: 0
Migration Results for the last 24 hours:
No recent migrations
databases:
{ "_id" : "config", "primary" : "config", "partitioned" : true }
mongos>
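4.5 Enable sharding for the database and collection (specify the shard key)
Still in the mongos shell, enable sharding for the database and then for the collection
Enable sharding for the database
sh.enableSharding("rpp_resume")
Enable sharding for the collection, using a hashed shard key on the name field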
sh.shardCollection("rpp_resume.rpp_resume_datas",{"name":"hashed"})
mongos> sh.enableSharding("rpp_resume")
{
"ok" : 1,
"operationTime" : Timestamp(1602326356, 5),
"$clusterTime" : {
"clusterTime" : Timestamp(1602326356, 5),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
}
}
mongos> sh.shardCollection("rpp_resume.rpp_resume_datas",{"name":"hashed"})
{
"collectionsharded" : "rpp_resume.rpp_resume_datas",
"collectionUUID" : UUID("c4e62fd0-499e-420e-895a-4dbe8a66646f"),
"ok" : 1,
"operationTime" : Timestamp(1602326368, 30),
"$clusterTime" : {
"clusterTime" : Timestamp(1602326368, 30),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
}
}
mongos>
4.6 Insert test data into the collection
Insert documents into the collection in a loop through the router
use rpp_resume;
for(var i=1;i<= 1000;i++){
db.rpp_resume_datas.insert({"name":"test"+i, salary:(Math.random()*20000).toFixed(2)});
}
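As an alternative to 1000 separate insert calls, the documents could be built in memory and sent in one batch. The sketch below is an assumption-laden variant, not the approach used in this article: buildTestDocs is a hypothetical helper, and the insertMany call only runs where the shell's global db exists, so the rest of the snippet also runs under Node.js.

```javascript
// Build the same test documents as the loop above, but in memory first.
function buildTestDocs(count) {
  const docs = [];
  for (let i = 1; i <= count; i++) {
    docs.push({ name: "test" + i, salary: (Math.random() * 20000).toFixed(2) });
  }
  return docs;
}

const testDocs = buildTestDocs(1000);

// Use the shell's print() when available, console.log otherwise.
const log = typeof print === "function" ? print : console.log;
log(testDocs.length + " documents, from " + testDocs[0].name +
    " to " + testDocs[testDocs.length - 1].name);

// Only runs inside the mongo shell, where the global `db` exists.
if (typeof db !== "undefined") {
  db.rpp_resume_datas.insertMany(testDocs);
}
```

As in the original loop, salary is stored as a string, because toFixed returns a string.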
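4.7 Verify the sharding result
Connect to the primary and a secondary of both shard1 and shard2 and check the data. In each shell, switch to the database with use rpp_resume first; on a secondary, also run rs.slaveOk() to allow reads.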
shard1 37017
shard1:PRIMARY> db.rpp_resume_datas.find()
{ "_id" : ObjectId("5f818fa9d5fb279dbd7028f4"), "name" : "test3", "salary" : "415.41" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd7028f5"), "name" : "test4", "salary" : "567.00" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd7028f6"), "name" : "test5", "salary" : "5979.64" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd7028fa"), "name" : "test9", "salary" : "19307.87" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd70290a"), "name" : "test25", "salary" : "14712.60" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd70290b"), "name" : "test26", "salary" : "15679.53" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd70290c"), "name" : "test27", "salary" : "17134.48" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd70290d"), "name" : "test28", "salary" : "10127.22" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd70290e"), "name" : "test29", "salary" : "19221.05" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd702910"), "name" : "test31", "salary" : "12848.60" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd702911"), "name" : "test32", "salary" : "9617.72" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd702914"), "name" : "test35", "salary" : "1220.32" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd702915"), "name" : "test36", "salary" : "10541.12" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd702916"), "name" : "test37", "salary" : "891.18" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd702917"), "name" : "test38", "salary" : "16313.15" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd70291a"), "name" : "test41", "salary" : "14766.15" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd70291c"), "name" : "test43", "salary" : "15438.16" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd702921"), "name" : "test48", "salary" : "1059.71" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd702922"), "name" : "test49", "salary" : "648.91" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd702924"), "name" : "test51", "salary" : "6677.53" }
Type "it" for more
shard1:PRIMARY>
shard1 37018
shard1:SECONDARY> db.rpp_resume_datas.find()
{ "_id" : ObjectId("5f818fa9d5fb279dbd7028f4"), "name" : "test3", "salary" : "415.41" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd7028f5"), "name" : "test4", "salary" : "567.00" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd7028f6"), "name" : "test5", "salary" : "5979.64" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd7028fa"), "name" : "test9", "salary" : "19307.87" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd70290a"), "name" : "test25", "salary" : "14712.60" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd70290b"), "name" : "test26", "salary" : "15679.53" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd70290c"), "name" : "test27", "salary" : "17134.48" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd70290d"), "name" : "test28", "salary" : "10127.22" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd702910"), "name" : "test31", "salary" : "12848.60" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd70290e"), "name" : "test29", "salary" : "19221.05" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd702911"), "name" : "test32", "salary" : "9617.72" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd702914"), "name" : "test35", "salary" : "1220.32" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd702916"), "name" : "test37", "salary" : "891.18" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd702915"), "name" : "test36", "salary" : "10541.12" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd702917"), "name" : "test38", "salary" : "16313.15" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd70291a"), "name" : "test41", "salary" : "14766.15" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd70291c"), "name" : "test43", "salary" : "15438.16" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd702921"), "name" : "test48", "salary" : "1059.71" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd702922"), "name" : "test49", "salary" : "648.91" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd702924"), "name" : "test51", "salary" : "6677.53" }
Type "it" for more
shard1:SECONDARY>
shard2 47017
shard2:PRIMARY> db.rpp_resume_datas.find()
{ "_id" : ObjectId("5f818fa9d5fb279dbd7028f2"), "name" : "test1", "salary" : "3222.95" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd7028f3"), "name" : "test2", "salary" : "12116.39" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd7028f7"), "name" : "test6", "salary" : "8448.45" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd7028f8"), "name" : "test7", "salary" : "11397.26" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd7028f9"), "name" : "test8", "salary" : "2285.39" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd7028fb"), "name" : "test10", "salary" : "12548.81" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd7028fc"), "name" : "test11", "salary" : "15763.79" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd7028fd"), "name" : "test12", "salary" : "5284.59" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd7028fe"), "name" : "test13", "salary" : "12727.03" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd7028ff"), "name" : "test14", "salary" : "16147.36" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd702900"), "name" : "test15", "salary" : "6856.62" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd702901"), "name" : "test16", "salary" : "8630.26" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd702902"), "name" : "test17", "salary" : "18266.78" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd702903"), "name" : "test18", "salary" : "2521.55" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd702904"), "name" : "test19", "salary" : "15168.43" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd702905"), "name" : "test20", "salary" : "17051.43" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd702906"), "name" : "test21", "salary" : "9859.60" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd702907"), "name" : "test22", "salary" : "5442.15" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd702908"), "name" : "test23", "salary" : "801.52" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd702909"), "name" : "test24", "salary" : "13727.49" }
Type "it" for more
shard2:PRIMARY>
shard2 47018
shard2:SECONDARY> db.rpp_resume_datas.find()
{ "_id" : ObjectId("5f818fa9d5fb279dbd7028f2"), "name" : "test1", "salary" : "3222.95" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd7028f3"), "name" : "test2", "salary" : "12116.39" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd7028f7"), "name" : "test6", "salary" : "8448.45" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd7028f8"), "name" : "test7", "salary" : "11397.26" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd7028f9"), "name" : "test8", "salary" : "2285.39" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd7028fc"), "name" : "test11", "salary" : "15763.79" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd7028fb"), "name" : "test10", "salary" : "12548.81" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd7028fd"), "name" : "test12", "salary" : "5284.59" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd702900"), "name" : "test15", "salary" : "6856.62" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd7028fe"), "name" : "test13", "salary" : "12727.03" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd702902"), "name" : "test17", "salary" : "18266.78" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd7028ff"), "name" : "test14", "salary" : "16147.36" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd702901"), "name" : "test16", "salary" : "8630.26" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd702903"), "name" : "test18", "salary" : "2521.55" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd702904"), "name" : "test19", "salary" : "15168.43" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd702908"), "name" : "test23", "salary" : "801.52" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd702906"), "name" : "test21", "salary" : "9859.60" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd702909"), "name" : "test24", "salary" : "13727.49" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd702905"), "name" : "test20", "salary" : "17051.43" }
{ "_id" : ObjectId("5f818fa9d5fb279dbd702907"), "name" : "test22", "salary" : "5442.15" }
Type "it" for more
shard2:SECONDARY>
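In the output above, each document shows up on exactly one shard, and a secondary holds the same documents as its primary. The "no document on both shards" part can be checked mechanically. The sketch below (plain JavaScript, runnable with Node.js) uses only the first few names from the two primary outputs; the overlap helper is hypothetical. For a summary across the whole collection, running db.rpp_resume_datas.getShardDistribution() from the mongos shell should also work.

```javascript
// First few names seen on each shard primary in the output above.
const onShard1 = ["test3", "test4", "test5", "test9", "test25"];
const onShard2 = ["test1", "test2", "test6", "test7", "test8"];

// Return the names that appear in both lists; an empty result means no overlap.
function overlap(a, b) {
  const seen = new Set(a);
  return b.filter((name) => seen.has(name));
}

console.log(overlap(onShard1, onShard2)); // []
```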