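In MongoDB, data replication comes in two forms:
master/slave: deprecated and superseded by replica sets.
replica set: the current replication mechanism.

A replica set has exactly one primary at a time and may have several secondaries. The primary accepts both reads and writes; secondaries serve reads only.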

https://docs.mongodb.com/v2.6/core/replica-set-members/


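The primary records every data-modifying operation in its oplog.

An odd number of members is recommended, with at least three.

Heartbeats are exchanged every two seconds.

Automatic failover is implemented through elections: when the primary goes down, the remaining members elect a new primary automatically.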
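
The recommendation for an odd member count follows from majority arithmetic: a primary can only be elected while a strict majority of voting members is reachable. The following is a minimal sketch in plain JavaScript (runnable with Node.js). The helper name `electionMath` is made up for illustration, and the sketch assumes every member has exactly one vote.

```javascript
// Hypothetical helper, not part of MongoDB: majority arithmetic for a
// replica set in which every member has exactly one vote (an assumption).
function electionMath(votingMembers) {
  // votes needed to elect a primary
  var majority = Math.floor(votingMembers / 2) + 1;
  return {
    members: votingMembers,
    majority: majority,
    // members that may be down while a primary can still be elected
    faultTolerance: votingMembers - majority
  };
}

[3, 4, 5].forEach(function (n) {
  var m = electionMath(n);
  console.log(n + " members: majority " + m.majority +
              ", tolerates " + m.faultTolerance + " failure(s)");
});
```

Going from three to four members raises the required majority but not the number of failures the set can survive, which is why an even count buys nothing.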


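Special types of secondary members in a replica set:

https://docs.mongodb.com/v2.6/core/replica-set-secondary/
1. Priority 0 member: a cold standby. It votes in elections but can never be elected primary.
2. Hidden member: a priority 0 member that is also invisible to client applications.
3. Delayed member: a priority 0 member whose replication lags behind the primary by a fixed interval, so the data it holds is always stale by that amount.
4. Arbiter: takes part in elections but holds no data.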
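
To make the four member types concrete, here is a sketch of how they might appear in a replica set configuration document. The field names (`priority`, `hidden`, `slaveDelay`, `arbiterOnly`) follow the v2.6 configuration reference; the hosts node4 to node7 and the one-hour delay are invented for illustration. The snippet is plain JavaScript and only inspects the document; applying it would be done with `rs.reconfig()` in the mongo shell.

```javascript
// Illustrative configuration only: node4..node7 are hypothetical hosts.
var specialCfg = {
  _id: "testSet",
  members: [
    { _id: 0, host: "node1:27017" },                               // ordinary member
    { _id: 1, host: "node4:27017", priority: 0 },                  // priority 0 member
    { _id: 2, host: "node5:27017", priority: 0, hidden: true },    // hidden member
    { _id: 3, host: "node6:27017", priority: 0, hidden: true,
      slaveDelay: 3600 },                                          // delayed by 3600 seconds
    { _id: 4, host: "node7:27017", arbiterOnly: true }             // arbiter, holds no data
  ]
};

// List the members that can never become primary.
var neverPrimary = specialCfg.members.filter(function (m) {
  return m.priority === 0 || m.arbiterOnly === true;
}).map(function (m) {
  return m.host;
});

console.log(neverPrimary.join(", "));
```

The delayed member is also marked hidden here, on the assumption that clients should not read from a member that is deliberately behind.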


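oplog: every member keeps its own oplog. Only the primary records client writes in its oplog; the secondaries copy those entries and apply them locally. The oplog is a fixed-size (capped) collection stored in the local database.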

Situations in which a member synchronizes its data by replaying oplog entries:
1. Initial sync
2. Post-rollback catch-up
3. Sharding chunk migrations

local database: holds the replication metadata and the oplog. The collection that stores the oplog is named oplog.rs.

--oplogSize arg:size to use (in MB) for replication oplog. default is 5% of disk space. This option sets the size of the oplog.

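An initial sync takes place when:
1. The member holds no data at all.
2. The member has lost its replication history.

Steps of an initial sync:
1. Clone all databases.
2. Apply all changes to the data set, that is, copy the oplog entries and apply them locally.
3. Build indexes on all collections.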
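
The three steps can be sketched with a toy model in plain JavaScript (runnable with Node.js). Everything here is a simplification for illustration: the data set is a plain object keyed by `_id`, the oplog entries carry a whole document in a made-up `doc` field rather than MongoDB's real entry format, and the function names are hypothetical. What the sketch does show is the order of the steps and why replaying an entry more than once is harmless.

```javascript
// Toy model of initial sync; not MongoDB's real oplog format.
function applyOp(data, entry) {
  if (entry.op === "d") {
    delete data[entry.doc._id];
  } else {
    // Inserts ("i") and updates ("u") both store the full document here,
    // so replaying the same entry twice gives the same result (idempotent).
    data[entry.doc._id] = entry.doc;
  }
}

function initialSync(sourceData, oplog, indexedField) {
  // Step 1: clone all data from the sync source.
  var local = JSON.parse(JSON.stringify(sourceData));
  // Step 2: apply the oplog entries to the local copy.
  oplog.forEach(function (entry) {
    applyOp(local, entry);
  });
  // Step 3: build an index (value of indexedField -> list of _id).
  var index = {};
  Object.keys(local).forEach(function (id) {
    var key = local[id][indexedField];
    (index[key] = index[key] || []).push(id);
  });
  return { data: local, index: index };
}

var source = { a: { _id: "a", city: "Paris" }, b: { _id: "b", city: "Rome" } };
var toyOplog = [
  { op: "i", doc: { _id: "c", city: "Paris" } },
  { op: "u", doc: { _id: "b", city: "Oslo" } },
  { op: "d", doc: { _id: "a" } }
];

var synced = initialSync(source, toyOplog, "city");
console.log(JSON.stringify(synced.index));
```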
---

Replica Set Installation

Prepare three machines running CentOS 7.3.

1. Edit /etc/hosts on each machine

192.168.135.170         node1
192.168.135.171         node2
192.168.135.169         node3

2. Install MongoDB on each node

#yum install -y mongodb mongodb-server

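3. Edit the configuration file on each node

#vim /etc/mongod.conf
#bind_ip = 127.0.0.1        On a newly created node, remember to change the listen address; by default only 127.0.0.1 is bound.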
replSet = testSet
replIndexPrefetch = all

4. Start the mongod service on each node

#systemctl start mongod
> rs.help()     list the commonly used commands
rs.status():{ replSetGetStatus : 1 } checks repl set status.
rs.initiate():{ replSetInitiate : null } initiates set with default settings.
rs.conf():get the current configuration object from local.system.replset.
rs.add(hostportstr):add a new member to the set with default attributes (disconnects).
rs.slaveOk():shorthand for db.getMongo().setSlaveOk(). Run on a secondary to allow reads on the current connection.
db.isMaster():check who is primary.
rs.stepDown([secs]):step down as primary (momentarily) (disconnects). The primary becomes a secondary, which triggers a new election.
rs.reconfig(cfg):updates the configuration of a running replica set with cfg (disconnects).
rs.addArb(hostportstr):add a new member which is arbiterOnly:true (disconnects).
rs.remove(hostportstr):remove a host from the replica set (disconnects).
rs.printReplicationInfo():check oplog size and time range.
rs.printSlaveReplicationInfo():check replica set members and replication lag.

5. Initialize the replica set

> rs.status()
{
    "startupStatus" : 3,
    "info" : "run rs.initiate(...) if not yet done for the set",
    "ok" : 0,
    "errmsg" : "can't get local.system.replset config from self or any seed (EMPTYCONFIG)"
}

> rs.initiate()
{
    "info2" : "no configuration explicitly specified -- making one",
    "me" : "node1:27017",
    "info" : "Config now saved locally.  Should come online in about a minute.",
    "ok" : 1
}

> rs.status()
{
    "set" : "testSet",
    "date" : ISODate("2017-03-11T05:49:18Z"),
    "myState" : 1,
    "members" : [
        {
            "_id" : 0,
            "name" : "node1:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 137,
            "optime" : Timestamp(1489211281, 1),
            "optimeDate" : ISODate("2017-03-11T05:48:01Z"),
            "electionTime" : Timestamp(1489211281, 2),
            "electionDate" : ISODate("2017-03-11T05:48:01Z"),
            "self" : true
        }
    ],
    "ok" : 1
}

6. Add new members to the replica set

> rs.add("node2:27017")     add node2
{ "ok" : 1 }
> rs.add("node3:27017")     add node3
{ "ok" : 1 }
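
Steps 5 and 6 can also be combined by handing `rs.initiate()` an explicit configuration document instead of relying on the default one. The sketch below builds such a document in plain JavaScript; `buildReplSetConfig` is a hypothetical helper, not a shell built-in, and the final `rs.initiate()` call is left as a comment because it only works inside the mongo shell.

```javascript
// Hypothetical helper: builds a configuration document for rs.initiate().
function buildReplSetConfig(setName, hosts) {
  return {
    _id: setName, // must match the replSet value in mongod.conf
    members: hosts.map(function (host, i) {
      return { _id: i, host: host };
    })
  };
}

var initialCfg = buildReplSetConfig("testSet",
  ["node1:27017", "node2:27017", "node3:27017"]);

console.log(JSON.stringify(initialCfg, null, 2));

// In the mongo shell on node1, the document could then be passed as:
//   rs.initiate(initialCfg)
```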

7. Check the replica set status again; all three members are now listed

> rs.status()
{
    "set" : "testSet",
    "date" : ISODate("2017-03-11T06:00:29Z"),
    "myState" : 1,
    "members" : [
        {
            "_id" : 0,
            "name" : "node1:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 808,
            "optime" : Timestamp(1489211964, 1),
            "optimeDate" : ISODate("2017-03-11T05:59:24Z"),
            "electionTime" : Timestamp(1489211281, 2),
            "electionDate" : ISODate("2017-03-11T05:48:01Z"),
            "self" : true
        },
        {
            "_id" : 1,
            "name" : "node2:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 322,
            "optime" : Timestamp(1489211964, 1),
            "optimeDate" : ISODate("2017-03-11T05:59:24Z"),
            "lastHeartbeat" : ISODate("2017-03-11T06:00:28Z"),
            "lastHeartbeatRecv" : ISODate("2017-03-11T06:00:29Z"),
            "pingMs" : 3,
            "syncingTo" : "node1:27017"
        },
        {
            "_id" : 2,
            "name" : "node3:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 65,
            "optime" : Timestamp(1489211964, 1),
            "optimeDate" : ISODate("2017-03-11T05:59:24Z"),
            "lastHeartbeat" : ISODate("2017-03-11T06:00:28Z"),
            "lastHeartbeatRecv" : ISODate("2017-03-11T06:00:28Z"),
            "pingMs" : 5,
            "syncingTo" : "node1:27017"
        }
    ],
    "ok" : 1
}
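
The status document is long, and usually only a few fields matter: `health`, `stateStr` and `name`. As a rough sketch, a small function can reduce it to a summary. `summarizeStatus` is a made-up name, and the sample object below keeps only the fields the function reads; in the mongo shell one would pass `rs.status()` instead.

```javascript
// Hypothetical helper: reduce an rs.status()-style document to a summary.
function summarizeStatus(status) {
  var summary = { primary: null, secondaries: [], unhealthy: [] };
  status.members.forEach(function (m) {
    if (m.health !== 1) {
      summary.unhealthy.push(m.name);
    } else if (m.stateStr === "PRIMARY") {
      summary.primary = m.name;
    } else if (m.stateStr === "SECONDARY") {
      summary.secondaries.push(m.name);
    }
  });
  return summary;
}

// Trimmed-down sample mirroring the output shown above.
var sampleStatus = {
  set: "testSet",
  members: [
    { name: "node1:27017", health: 1, stateStr: "PRIMARY" },
    { name: "node2:27017", health: 1, stateStr: "SECONDARY" },
    { name: "node3:27017", health: 1, stateStr: "SECONDARY" }
  ]
};

var statusSummary = summarizeStatus(sampleStatus);
console.log("primary: " + statusSummary.primary);
console.log("secondaries: " + statusSummary.secondaries.join(", "));
```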

8. On each secondary, allow reads on the current connection

> rs.slaveOk()

9. View the replica set configuration
https://docs.mongodb.com/v2.6/reference/replica-configuration/

> rs.conf()
{
    "_id" : "testSet",
    "version" : 3,
    "members" : [
        {
            "_id" : 0,
            "host" : "node1:27017"
        },
        {
            "_id" : 1,
            "host" : "node2:27017"
        },
        {
            "_id" : 2,
            "host" : "node3:27017"
        }
    ]
}

10. Step the primary down to a secondary

> rs.stepDown()
> rs.status()       after the re-election, node3 has become the primary.
{
    "set" : "testSet",
    "date" : ISODate("2017-03-11T06:40:03Z"),
    "myState" : 2,
    "syncingTo" : "node3:27017",
    "members" : [
        {
            "_id" : 0,
            "name" : "node1:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 3182,
            "optime" : Timestamp(1489211964, 1),
            "optimeDate" : ISODate("2017-03-11T05:59:24Z"),
            "infoMessage" : "syncing to: node3:27017",
            "self" : true
        },
        {
            "_id" : 1,
            "name" : "node2:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 2696,
            "optime" : Timestamp(1489211964, 1),
            "optimeDate" : ISODate("2017-03-11T05:59:24Z"),
            "lastHeartbeat" : ISODate("2017-03-11T06:40:01Z"),
            "lastHeartbeatRecv" : ISODate("2017-03-11T06:40:02Z"),
            "pingMs" : 6,
            "lastHeartbeatMessage" : "syncing to: node1:27017",
            "syncingTo" : "node1:27017"
        },
        {
            "_id" : 2,
            "name" : "node3:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 2439,
            "optime" : Timestamp(1489211964, 1),
            "optimeDate" : ISODate("2017-03-11T05:59:24Z"),
            "lastHeartbeat" : ISODate("2017-03-11T06:40:02Z"),
            "lastHeartbeatRecv" : ISODate("2017-03-11T06:40:03Z"),
            "pingMs" : 2,
            "electionTime" : Timestamp(1489214375, 1),
            "electionDate" : ISODate("2017-03-11T06:39:35Z")
        }
    ],
    "ok" : 1
}

11. View replication information, including the oplog size and the time range it covers.

> db.printReplicationInfo()
configured oplog size:   2464.306884765625MB
log length start to end: 0secs (0hrs)
oplog first event time:  Sat Mar 11 2017 00:59:24 GMT-0500 (EST)
oplog last event time:   Sat Mar 11 2017 00:59:24 GMT-0500 (EST)
now:                     Sat Mar 11 2017 01:46:07 GMT-0500 (EST)

> rs.printReplicationInfo()
configured oplog size:   2464.36328125MB
log length start to end: 6205secs (1.72hrs)
oplog first event time:  Sat Mar 11 2017 13:55:07 GMT+0800 (CST)
oplog last event time:   Sat Mar 11 2017 15:38:32 GMT+0800 (CST)
now:                     Sat Mar 11 2017 15:42:26 GMT+0800 (CST)
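
The "log length start to end" line is simply the distance between the first and last oplog event. A minimal sketch of that arithmetic in plain JavaScript, using the two timestamps from the rs.printReplicationInfo() output above (`oplogWindow` is a hypothetical name, not a shell function):

```javascript
// Hypothetical helper: time span covered by the oplog.
function oplogWindow(firstEvent, lastEvent) {
  var secs = Math.round((lastEvent.getTime() - firstEvent.getTime()) / 1000);
  var hrs = Math.round(secs / 36) / 100; // hours, rounded to two decimals
  return { secs: secs, hrs: hrs };
}

var oplogSpan = oplogWindow(
  new Date("2017-03-11T13:55:07+08:00"),  // oplog first event time
  new Date("2017-03-11T15:38:32+08:00")   // oplog last event time
);

console.log(oplogSpan.secs + "secs (" + oplogSpan.hrs + "hrs)");
```

This window is how long a secondary can be offline and still catch up from the oplog alone; a member that falls further behind needs a fresh initial sync.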

> rs.printSlaveReplicationInfo()
source: node1:27017
    syncedTo: Sat Mar 11 2017 15:38:32 GMT+0800 (CST)
    0 secs (0 hrs) behind the primary 
source: node3:27017
    syncedTo: Sat Mar 11 2017 15:38:32 GMT+0800 (CST)
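    0 secs (0 hrs) behind the primary

Factors that influence a replica set election:
1. Heartbeats
2. Member priority
3. optime
4. Network partitions

Situations that trigger an election:
1. A new replica set is initialized
2. A secondary loses contact with the primary
3. The primary steps down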

12. Change a member's priority; this can only be done on the primary
https://docs.mongodb.com/v2.6/tutorial/adjust-replica-set-member-priority/ The default priority is 1, and the allowed range is 0-1000.

> cfg=rs.conf()
> cfg.members[1].priority = 2
2
> rs.reconfig(cfg)      node2 (members[1]) is now elected primary
2017-03-11T02:38:32.726-0500 DBClientCursor::init call() failed
2017-03-11T02:38:32.796-0500 trying reconnect to 127.0.0.1:27017 (127.0.0.1) failed
2017-03-11T02:38:32.825-0500 reconnect 127.0.0.1:27017 (127.0.0.1) ok
reconnected to server after rs command (which is normal)
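
Indexing `cfg.members[1]` by position is easy to get wrong once members have been added and removed. One possible alternative, sketched in plain JavaScript, looks the member up by host and checks the 0-1000 range before anything is sent to the server. `withPriority` is a hypothetical helper; in the mongo shell its result would be passed to `rs.reconfig()`.

```javascript
// Hypothetical helper: return a copy of cfg with one member's priority changed.
function withPriority(cfg, host, priority) {
  if (typeof priority !== "number" || priority < 0 || priority > 1000) {
    throw new RangeError("priority must be between 0 and 1000");
  }
  var copy = JSON.parse(JSON.stringify(cfg)); // leave the original untouched
  var found = false;
  copy.members.forEach(function (m) {
    if (m.host === host) {
      m.priority = priority;
      found = true;
    }
  });
  if (!found) {
    throw new Error("no member with host " + host);
  }
  return copy;
}

// Sample document shaped like the rs.conf() output shown earlier.
var currentCfg = {
  _id: "testSet",
  version: 3,
  members: [
    { _id: 0, host: "node1:27017" },
    { _id: 1, host: "node2:27017" },
    { _id: 2, host: "node3:27017" }
  ]
};

var newCfg = withPriority(currentCfg, "node2:27017", 2);
console.log(JSON.stringify(newCfg.members[1]));

// In the mongo shell on the primary: rs.reconfig(newCfg)
```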

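13. Convert a secondary into an arbiter
https://docs.mongodb.com/v2.6/tutorial/convert-secondary-into-arbiter/

a. On the primary, remove the secondary from the replica set
b. Stop the secondary
c. Empty its data directory
d. Start the service again
e. Add the node back to the replica set as an arbiter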
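
Only steps a and e are shell commands (`rs.remove()` and `rs.addArb()`, both listed in the rs.help() output above); steps b to d happen on the member's own host. The sketch below captures that ordering in plain JavaScript. `convertToArbiter` and the `resetMember` callback are hypothetical names, and a recording stand-in replaces the shell's `rs` object so the snippet runs outside the shell.

```javascript
// Hypothetical wrapper showing the order of the five steps.
function convertToArbiter(rs, host, resetMember) {
  rs.remove(host);    // step a: run on the primary
  resetMember(host);  // steps b-d: stop mongod, empty the data directory, start mongod
  rs.addArb(host);    // step e: add the node back as an arbiter
}

// Stand-in for the shell's rs object that only records what was called.
var steps = [];
var fakeRs = {
  remove: function (h) { steps.push("remove " + h); },
  addArb: function (h) { steps.push("addArb " + h); }
};

convertToArbiter(fakeRs, "node3:27017", function (h) {
  steps.push("reset " + h); // placeholder for the manual work on the host
});

console.log(steps.join("; "));
```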