The previous post covered setting up a replica set in detail. This post covers automatic failover and how to add, remove, and modify members.

http://1413570.blog.51cto.com/1403570/1337619 (setting up a MongoDB replica set)

Example 1: simulated failover

res1:PRIMARY> rs.conf();

{

        "_id" : "res1",

        "version" : 1,

        "members" : [

                {

                        "_id" : 0,

                        "host" : "192.168.1.248:27017",

                        "priority" : 2

                },

                {

                        "_id" : 1,

                        "host" : "192.168.1.247:27018",

                        "priority" : 0

                },

                {

                        "_id" : 2,

                        "host" : "192.168.1.250:27019"

                }

        ]

}

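From this you can see that the primary is host 192.168.1.248, because it has the highest priority. Next in line is host 192.168.1.250 (no explicit priority, so the default of 1), while 192.168.1.247 has priority 0 and can never become primary. So when host 192.168.1.248 goes down, host 192.168.1.250 takes over as primary.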
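
The priority rule described above can be sketched in plain JavaScript. The helper name `expectedPrimary` and its `downHosts` argument are made up for illustration; a real election also considers votes and how up to date each member's oplog is, so this is only a simplified model of the priority rule, not MongoDB's election algorithm.

```javascript
// Simplified model: among reachable members, the one with the highest
// non-zero priority is the expected primary. A missing priority counts as 1.
function expectedPrimary(config, downHosts) {
    var best = null;
    config.members.forEach(function (m) {
        var priority = (m.priority === undefined) ? 1 : m.priority;
        if (downHosts.indexOf(m.host) !== -1) { return; } // unreachable member
        if (priority === 0) { return; }                   // priority 0 never becomes primary
        if (best === null || priority > best.priority) {
            best = { host: m.host, priority: priority };
        }
    });
    return best === null ? null : best.host;
}

// Same members as in the rs.conf() output above.
var conf = {
    _id: "res1",
    members: [
        { _id: 0, host: "192.168.1.248:27017", priority: 2 },
        { _id: 1, host: "192.168.1.247:27018", priority: 0 },
        { _id: 2, host: "192.168.1.250:27019" }
    ]
};

console.log(expectedPrimary(conf, []));                      // 192.168.1.248:27017
console.log(expectedPrimary(conf, ["192.168.1.248:27017"])); // 192.168.1.250:27019
```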

Suppose the main mongod process on host 192.168.1.248 is stopped:

ps -ef | grep mongodb

 kill 8665

Avoid kill -9 where possible; it can corrupt the MongoDB data files.

OK, the logs on the other two servers now report:

Fri Dec  6 16:36:10.522 [rsHealthPoll] couldn't connect to 192.168.1.248:27017: couldn't connect to server 192.168.1.248:27017

After that, host 192.168.1.250 takes over as primary:

Fri Dec  6 16:36:40.707 [conn248] end connection 192.168.1.250:46500 (1 connection now open)

Fri Dec  6 16:36:40.708 [initandlisten] connection accepted from 192.168.1.250:46592 #249 (2 connections now open)

Fri Dec  6 16:36:40.710 [conn249]  authenticate db: local { authenticate: 1, nonce: "f70f5a8aea558178", user: "__system", key: "19fb73382ae940816c685b2561b0a76e" }

Now log in through the mongo shell:

[root@anenjoy ~]# /usr/local/mongodb/bin/mongo --port 27019

MongoDB shell version: 2.4.8

connecting to: 127.0.0.1:27019/test

res1:PRIMARY> 

The prompt now shows PRIMARY.

Next, check with rs.status():

res1:PRIMARY> rs.status();

{

        "set" : "res1",

        "date" : ISODate("2013-12-06T08:44:01Z"),

        "myState" : 1,

        "members" : [

                {

                        "_id" : 0,

                        "name" : "192.168.1.248:27017",

                        "health" : 0,

                        "state" : 8,

                        "stateStr" : "(not reachable/healthy)",

                        "uptime" : 0,

                        "optime" : Timestamp(1386118280, 1),

                        "optimeDate" : ISODate("2013-12-04T00:51:20Z"),

                        "lastHeartbeat" : ISODate("2013-12-06T08:44:00Z"),

                        "lastHeartbeatRecv" : ISODate("2013-12-06T08:41:32Z"),

                        "pingMs" : 0

                },

                {

                        "_id" : 1,

                        "name" : "192.168.1.247:27018",

                        "health" : 1,

                        "state" : 2,

                        "stateStr" : "SECONDARY",

                        "uptime" : 3790,

                        "optime" : Timestamp(1386118280, 1),

                        "optimeDate" : ISODate("2013-12-04T00:51:20Z"),

                        "lastHeartbeat" : ISODate("2013-12-06T08:44:00Z"),

                        "lastHeartbeatRecv" : ISODate("2013-12-06T08:44:01Z"),

                        "pingMs" : 0,

                        "syncingTo" : "192.168.1.250:27019"

                },

                {

                        "_id" : 2,

                        "name" : "192.168.1.250:27019",

                        "health" : 1,

                        "state" : 1,

                        "stateStr" : "PRIMARY",

                        "uptime" : 4958,

                        "optime" : Timestamp(1386118280, 1),

                        "optimeDate" : ISODate("2013-12-04T00:51:20Z"),

                        "self" : true

                }

        ],

        "ok" : 1

}

res1:PRIMARY> 

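You can see that member 192.168.1.248 is unhealthy, and the logs on the other two servers keep reporting that they cannot connect to 192.168.1.248 on port 27017.

Once the mongod process on host 192.168.1.248 is running again, it automatically becomes primary again: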
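
Reading a long rs.status() document by eye gets tedious, so here is a minimal sketch that pulls out the two facts of interest: who is primary and which members are unhealthy. The function name `summarize` is an assumption for illustration, and the sample object keeps only the fields the function reads (`name`, `health`, `stateStr`).

```javascript
// Reduce an rs.status()-style document to the current primary and the
// list of members whose health is not 1.
function summarize(status) {
    var primary = null;
    var unhealthy = [];
    status.members.forEach(function (m) {
        if (m.stateStr === "PRIMARY") { primary = m.name; }
        if (m.health !== 1) { unhealthy.push(m.name); }
    });
    return { primary: primary, unhealthy: unhealthy };
}

// Trimmed copy of the status output shown above.
var status = {
    set: "res1",
    members: [
        { _id: 0, name: "192.168.1.248:27017", health: 0, stateStr: "(not reachable/healthy)" },
        { _id: 1, name: "192.168.1.247:27018", health: 1, stateStr: "SECONDARY" },
        { _id: 2, name: "192.168.1.250:27019", health: 1, stateStr: "PRIMARY" }
    ]
};

console.log(JSON.stringify(summarize(status)));
```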

Fri Dec  6 16:48:35.325 [conn246] SocketException handling request, closing client connection: 9001 socket exception [SEND_ERROR] server [192.168.1.247:27047] 

Fri Dec  6 16:48:35.388 [rsHealthPoll] replSet member 192.168.1.248:27017 is now in state PRIMARY

[root@test02 bin]# /usr/local/mongodb/bin/mongo --port 27017

MongoDB shell version: 2.4.8

connecting to: 127.0.0.1:27017/test

res1:PRIMARY> 

While host 192.168.1.248 is down, host 192.168.1.250 acts as primary and accepts writes:

db.appstore.save({'e_name':'xiaowang','e_id':1103,'class_id':2});

res1:PRIMARY> db.appstore.find();

{ "_id" : ObjectId("529e7c88d4d317e4bd3eece9"), "e_name" : "frank", "e_id" : 1101, "class_id" : 1 }

{ "_id" : ObjectId("52a18f3bd36b29b9c78be267"), "e_name" : "xiaowang", "e_id" : 1103, "class_id" : 2 }

Later, when host 192.168.1.248 becomes primary again, the newly written data is synchronized to it as well, similar to MySQL master-slave replication.




Example 2: adding, removing, and modifying replica set members


Now suppose my primary, host 192.168.1.248, has gone down and I want to remove this member.

First run ps aux | grep mongodb, then kill the process.

Host 192.168.1.250 has now been made primary:

[root@anenjoy ~]# /usr/local/mongodb/bin/mongo --port 27019

MongoDB shell version: 2.4.8

connecting to: 127.0.0.1:27019/test

res1:PRIMARY> 

Check the member configuration with rs.conf():

res1:PRIMARY> rs.conf();

{

        "_id" : "res1",

        "version" : 1,

        "members" : [

                {

                        "_id" : 0,

                        "host" : "192.168.1.248:27017",

                        "priority" : 2

                },

                {

                        "_id" : 1,

                        "host" : "192.168.1.247:27018",

                        "priority" : 0

                },

                {

                        "_id" : 2,

                        "host" : "192.168.1.250:27019"

                }

        ]

}

res1:PRIMARY> rs.remove('192.168.1.248:27017');

Fri Dec  6 16:59:01.480 DBClientCursor::init call() failed

Fri Dec  6 16:59:01.482 Error: error doing query: failed at src/mongo/shell/query.js:78

Fri Dec  6 16:59:01.482 trying reconnect to 127.0.0.1:27019

Fri Dec  6 16:59:01.482 reconnect 127.0.0.1:27019 ok

Check again: OK, the member has been removed.

res1:PRIMARY> rs.conf();

{

        "_id" : "res1",

        "version" : 2,

        "members" : [

                {

                        "_id" : 1,

                        "host" : "192.168.1.247:27018",

                        "priority" : 0

                },

                {

                        "_id" : 2,

                        "host" : "192.168.1.250:27019"

                }

        ]

}

The logs also no longer show: [rsHealthPoll] couldn't connect to 192.168.1.248:27017: couldn't connect to server 192.168.1.248:27017
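
To make the effect of the removal concrete, here is a rough model of what happens to the configuration document: the member is dropped and the version goes up by one, matching the change from version 1 to version 2 seen above. The helper names `removeMember` and `hasMajority` are invented for this sketch; it only manipulates a plain object and does not talk to a server.

```javascript
// Return a new config without the given host and with the version bumped.
function removeMember(config, host) {
    return {
        _id: config._id,
        version: config.version + 1,
        members: config.members.filter(function (m) { return m.host !== host; })
    };
}

// A primary needs a strict majority of the set's members to be reachable
// (assuming every member has one vote).
function hasMajority(totalMembers, reachableMembers) {
    return reachableMembers > Math.floor(totalMembers / 2);
}

var before = {
    _id: "res1",
    version: 1,
    members: [
        { _id: 0, host: "192.168.1.248:27017", priority: 2 },
        { _id: 1, host: "192.168.1.247:27018", priority: 0 },
        { _id: 2, host: "192.168.1.250:27019" }
    ]
};

var after = removeMember(before, "192.168.1.248:27017");
console.log(after.version, after.members.length); // 2 2
console.log(hasMajority(3, 2));                   // true: 2 of 3 can still elect a primary
console.log(hasMajority(2, 1));                   // false: 1 of 2 cannot
```

One consequence worth keeping in mind: after the removal only two members remain, and 192.168.1.247 has priority 0, so 192.168.1.250 is the only member that can be primary.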


Adding a member:


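Adding a member purely through the oplog is simple and needs little manual intervention. However, the oplog is a capped collection that is reused in a circular fashion, so relying on the oplog alone can leave the new member with inconsistent data, because the entries it needs may already have been overwritten.


Instead, you can combine a database snapshot (--fastsync) with the oplog. The usual procedure is:


Take the physical data files of an existing replica set member as the initial data, then apply the remaining changes from the oplog, so that the data eventually becomes consistent.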
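
The condition behind this approach can be written down as a small check: the copied files are only useful if the oplog on the sync source still reaches back to the moment the copy was taken. This is a sketch under that assumption; the function name and both parameters are hypothetical, and the copy time is used as an approximation of the last operation contained in the copied files. In the shell, db.printReplicationInfo() reports the time of the oldest oplog entry.

```javascript
// snapshotTakenAt:    when the data files were copied (Date)
// oldestOplogEntryAt: time of the oldest entry still in the source's oplog (Date)
// The new member can catch up only if no entries newer than the snapshot
// have already been overwritten in the capped oplog.
function canCatchUpFromOplog(snapshotTakenAt, oldestOplogEntryAt) {
    return oldestOplogEntryAt.getTime() <= snapshotTakenAt.getTime();
}

var snapshot = new Date("2013-12-06T09:00:00Z");

// Oplog still covers the snapshot time: catching up is possible.
console.log(canCatchUpFromOplog(snapshot, new Date("2013-12-04T00:00:00Z"))); // true

// Oplog has already wrapped past the snapshot time: a fresh copy is needed.
console.log(canCatchUpFromOplog(snapshot, new Date("2013-12-06T10:00:00Z"))); // false
```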


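The preparatory steps are the same in either case:


Create the DB storage directory and the key file, and set the key file's permissions to 600.


Step 1: configure the storage path (the --dbpath parameter)


Everything goes under /data/mon_db, and the directory is owned by the mongodb user: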


mkdir -p /data/mon_db

chown -R mongodb:mongodb /data/mon_db/

Create the log file (the --logpath parameter); the location is up to you.

Here it goes in /usr/local/mongodb/log:

mkdir -p /usr/local/mongodb/log

 touch /usr/local/mongodb/log/mongodb.log

chown -R mongodb:mongodb /usr/local/mongodb/

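Step 2: create the key file shared by the replica set members. --keyFile takes the full path to the private key that identifies the cluster; if the key file contents differ between instances, the set will not work properly.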

[root@test02 ~]# mkdir -p /data/mon_db/key

[root@test02 ~]# echo "this is res key" > /data/mon_db/key/res1

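chmod 600 /data/mon_db/key/res1

Set the permissions to 600; otherwise mongod reports this error:

Wed Dec  4 06:22:36.413 permissions on /data/mon_db/key/res1 are too open 

The key file can have a different name on each instance (the start command below uses /data/mon_db/key/res4); only the contents have to match.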
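
As far as I understand it, the "too open" error is raised whenever the key file grants any permission to group or others. A minimal sketch of that rule, with `isTooOpen` being a name chosen here for illustration and the mode passed in as a number:

```javascript
// True if the mode grants any read/write/execute bit to group or others.
// parseInt("077", 8) is the octal mask for those six... nine-minus-three bits: rwx for group and others.
function isTooOpen(mode) {
    var groupAndOthers = parseInt("077", 8);
    return (mode & groupAndOthers) !== 0;
}

console.log(isTooOpen(parseInt("600", 8))); // false: owner read/write only
console.log(isTooOpen(parseInt("644", 8))); // true: group and others can read
console.log(isTooOpen(parseInt("640", 8))); // true: group can read
```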

Suppose we copy the physical data files from host 192.168.1.247:

scp -r /data/mongodb/res2/  root@ip:/data/mon_db/res4

After that, you can insert new data on the primary (to verify the sync later).

Start mongod:

/usr/local/mongodb/bin/mongod --port 27020  --replSet res1  --keyFile /data/mon_db/key/res4 --oplogSize 100  --dbpath=/data/mon_db/res4/ --logpath=/usr/local/mongodb/log/mongodb.log --logappend   --fastsync --fork

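Then, on the primary, add the member:


rs.add('192.168.1.x:27020')


After that, log in to mongod on the newly added member, enable reads on it (for example with rs.slaveOk() in the 2.4 shell), and check whether the data has been synchronized.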
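
Besides querying the data directly, one way to judge whether the new member has caught up is to compare each member's optimeDate with the primary's in rs.status(). The sketch below does that on a plain object; the function name `lagBehindPrimary` and the host label `new-member:27020` are hypothetical, and the sample timestamps are invented for the example.

```javascript
// For every non-primary member, return how many seconds its last applied
// operation is behind the primary's. Returns null if there is no primary.
function lagBehindPrimary(status) {
    var primary = null;
    status.members.forEach(function (m) {
        if (m.stateStr === "PRIMARY") { primary = m; }
    });
    if (primary === null) { return null; }

    var lag = {};
    status.members.forEach(function (m) {
        if (m !== primary) {
            lag[m.name] = (primary.optimeDate.getTime() - m.optimeDate.getTime()) / 1000;
        }
    });
    return lag;
}

// Invented sample: the new member is 30 seconds behind the primary.
var sample = {
    members: [
        { name: "192.168.1.250:27019", stateStr: "PRIMARY",   optimeDate: new Date("2013-12-06T11:56:20Z") },
        { name: "192.168.1.247:27018", stateStr: "SECONDARY", optimeDate: new Date("2013-12-06T11:56:20Z") },
        { name: "new-member:27020",    stateStr: "SECONDARY", optimeDate: new Date("2013-12-06T11:55:50Z") }
    ]
};

console.log(JSON.stringify(lagBehindPrimary(sample)));
```

A lag of 0 for the new member means it has applied everything the primary has written so far.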


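Modifying a member

What does modifying a member mean? Essentially it is changing the member's host, port, or priority. This section briefly describes how.

My replica set currently has three members: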

/usr/local/mongodb/bin/mongo --port 27019
rs.status();
{
        "set" : "res1",
        "date" : ISODate("2013-12-06T11:56:42Z"),
        "myState" : 1,
        "members" : [
                {
                        "_id" : 1,
                        "name" : "192.168.1.247:27018",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 10661,
                        "optime" : Timestamp(1386330980, 1),
                        "optimeDate" : ISODate("2013-12-06T11:56:20Z"),
                        "lastHeartbeat" : ISODate("2013-12-06T11:56:42Z"),
                        "lastHeartbeatRecv" : ISODate("2013-12-06T11:56:40Z"),
                        "pingMs" : 0,
                        "syncingTo" : "192.168.1.250:27019"
                },
                {
                        "_id" : 2,
                        "name" : "192.168.1.250:27019",
                        "health" : 1,
                        "state" : 1,
                        "stateStr" : "PRIMARY",
                        "uptime" : 16519,
                        "optime" : Timestamp(1386330980, 1),
                        "optimeDate" : ISODate("2013-12-06T11:56:20Z"),
                        "self" : true
                },
                {
                        "_id" : 3,
                        "name" : "192.168.1.248:27017",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 22,
                        "optime" : Timestamp(1386330980, 1),
                        "optimeDate" : ISODate("2013-12-06T11:56:20Z"),
                        "lastHeartbeat" : ISODate("2013-12-06T11:56:42Z"),
                        "lastHeartbeatRecv" : ISODate("2013-12-06T11:56:41Z"),
                        "pingMs" : 0,
                        "lastHeartbeatMessage" : "syncing to: 192.168.1.250:27019",
                        "syncingTo" : "192.168.1.250:27019"
                }
        ],
        "ok" : 1
}
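I want to change the priorities between members. Host 192.168.1.250 is currently the primary, and I want host 192.168.1.248 to take over, so its priority just has to be higher than that of the other members. A priority of 3 is more than enough, since 192.168.1.250 has no explicit priority in rs.conf() and therefore uses the default of 1.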
res1:PRIMARY> cfg=rs.conf();
{
        "_id" : "res1",
        "version" : 3,
        "members" : [
                {
                        "_id" : 1,
                        "host" : "192.168.1.247:27018",
                        "priority" : 0
                },
                {
                        "_id" : 2,
                        "host" : "192.168.1.250:27019"
                },
                {
                        "_id" : 3,
                        "host" : "192.168.1.248:27017"
                }
        ]
}
res1:PRIMARY> cfg.members[2].priority=3;
res1:PRIMARY> rs.reconfig(cfg);  // rs.reconfig() works much like re-initializing the set with the new configuration
Fri Dec  6 20:00:29.788 DBClientCursor::init call() failed
Fri Dec  6 20:00:29.792 trying reconnect to 127.0.0.1:27019
Fri Dec  6 20:00:29.793 reconnect 127.0.0.1:27019 ok
reconnected to server after rs command (which is normal)

Press Enter a couple of times and you will see that the former primary has become a secondary, while host 192.168.1.248 has become primary.
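
Picking the member by array index (cfg.members[2]) works here, but the index changes whenever members are added or removed. A slightly more defensive variant looks the member up by host. This is only a sketch on a plain object; `withPriority` is a name invented for the example, and it leaves the version field untouched on the assumption that rs.reconfig() takes care of it.

```javascript
// Return a copy of the config in which the member with the given host
// has the given priority. Throws if no such member exists.
function withPriority(config, host, priority) {
    var found = false;
    var members = config.members.map(function (m) {
        var copy = {};
        for (var k in m) {
            if (Object.prototype.hasOwnProperty.call(m, k)) { copy[k] = m[k]; }
        }
        if (m.host === host) {
            copy.priority = priority;
            found = true;
        }
        return copy;
    });
    if (!found) { throw new Error("no member with host " + host); }
    return { _id: config._id, version: config.version, members: members };
}

// Same configuration as in the cfg=rs.conf() output above.
var cfg = {
    _id: "res1",
    version: 3,
    members: [
        { _id: 1, host: "192.168.1.247:27018", priority: 0 },
        { _id: 2, host: "192.168.1.250:27019" },
        { _id: 3, host: "192.168.1.248:27017" }
    ]
};

var updated = withPriority(cfg, "192.168.1.248:27017", 3);
console.log(updated.members[2].priority); // 3
```

In the shell, the resulting object would then be passed to rs.reconfig() just like cfg above.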






Reposted from: https://blog.51cto.com/caibird/1337622