1. Lab environment:
Node1: 192.168.1.17 (RHEL5.8_32bit, MySQL server)
Node2: 192.168.1.18 (RHEL5.8_32bit, MySQL server)
SteppingStone: 192.168.1.19 (RHEL5.8_32bit)
VIP: 192.168.1.20
2. Preparations
<1> Configure hostnames
Node names are resolved via /etc/hosts, and each node's name must match the output of the uname -n command.
Node1:
# hostname node1.ikki.com
# vim /etc/sysconfig/network
HOSTNAME=node1.ikki.com
Node2:
# hostname node2.ikki.com
# vim /etc/sysconfig/network
HOSTNAME=node2.ikki.com
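A quick sanity check on each node: the name the cluster stack will use is whatever uname -n reports, so it should match the value just set.
# uname -n    # should print node1.ikki.com on Node1 and node2.ikki.com on Node2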
<2> Configure key-based ssh trust between the nodes
Node1:
# ssh-keygen -t rsa
# ssh-copy-id -i ~/.ssh/id_rsa.pub root@node2
Node2:
# ssh-keygen -t rsa
# ssh-copy-id -i ~/.ssh/id_rsa.pub root@node1
<3> Configure hostname-based communication between the nodes
Node1&Node2:
# vim /etc/hosts
192.168.1.17   node1.ikki.com node1
192.168.1.18   node2.ikki.com node2
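To confirm name resolution and the ssh trust work both ways, a quick check (run from node1; swap the names when checking from node2):
# ping -c 1 node2.ikki.com
# ssh node2 'uname -n'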
<4> Configure time synchronization on all nodes
Node1&Node2:
# crontab -e
*/5 * * * * /sbin/ntpdate 202.120.2.101 &> /dev/null
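Since cron only corrects drift every five minutes, it may help to force one immediate sync first and write the result to the hardware clock; a minimal sketch, assuming the same NTP server is reachable:
# /sbin/ntpdate 202.120.2.101 && /sbin/hwclock -w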
<5> Configure the stepping stone (SteppingStone)
Establish ssh trust with Node1 and Node2 and communicate with them by hostname:
# ssh-keygen -t rsa
# ssh-copy-id -i ~/.ssh/id_rsa.pub root@node1
# ssh-copy-id -i ~/.ssh/id_rsa.pub root@node2
# vim /etc/hosts
192.168.1.17   node1.ikki.com node1
192.168.1.18   node2.ikki.com node2
Create a step script tool for running the same command on all nodes:
# vim step
#!/bin/bash
# Run the given command on every node over ssh
if [ $# -eq 1 ]; then
    for I in {1..2}; do
        ssh node$I "$1"
    done
else
    echo "Usage: step 'COMMANDs'"
fi
# chmod +x step
# mv step /usr/sbin
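Note that "$1" is quoted inside the script, so a multi-word command is passed to ssh as a single argument. Usage example:
# step 'date'    # prints the current time on node1 and node2 in turn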
<6> Provide a partition of identical size on Node1 and Node2 to back the drbd device
Create a 1G logical partition on each node (it will appear as /dev/sda5, which is referenced below):
# fdisk /dev/sda
n --> e --> n --> +1G --> w
# partprobe /dev/sda
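To verify the kernel sees the new logical partition after partprobe:
# fdisk -l /dev/sda | grep sda5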
3. Install the kernel module and management tool
Install the latest 8.3 release:
drbd83-8.3.15-2.el5.centos.i386.rpm
kmod-drbd83-8.3.15-3.el5.centos.i686.rpm
Run the remote installation from SteppingStone:
# step 'yum -y --nogpgcheck localinstall drbd83-8.3.15-2.el5.centos.i386.rpm kmod-drbd83-8.3.15-3.el5.centos.i686.rpm'
4. Configure drbd (Node1)
<1> Copy the sample file as the configuration file:
# cp /usr/share/doc/drbd83-8.3.15/drbd.conf /etc
<2> Configure /etc/drbd.d/global_common.conf
global {
        usage-count no;         # disable usage statistics
        # minor-count dialog-refresh disable-ip-verification
}
common {
        protocol C;             # use the fully synchronous protocol by default
        handlers {
                # These are EXAMPLE handlers only.
                # They may have severe implications,
                # like hard resetting the node under certain circumstances.
                # Be careful when choosing your poison.
                pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
                pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
                local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
                # fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
                # split-brain "/usr/lib/drbd/notify-split-brain.sh root";
                # out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
                # before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k";
                # after-resync-target /usr/lib/drbd/unsnapshot-resync-target-lvm.sh;
        }
        startup {
                # wfc-timeout degr-wfc-timeout outdated-wfc-timeout wait-after-sb
        }
        disk {
                on-io-error detach;     # detach the disk on I/O errors
                # on-io-error fencing use-bmbv no-disk-barrier no-disk-flushes
                # no-disk-drain no-md-flushes max-bio-bvecs
        }
        net {
                # sndbuf-size rcvbuf-size timeout connect-int ping-int ping-timeout max-buffers
                # max-epoch-size ko-count allow-two-primaries cram-hmac-alg shared-secret
                # after-sb-0pri after-sb-1pri after-sb-2pri data-integrity-alg no-tcp-cork
                cram-hmac-alg "sha1";           # algorithm used to authenticate the peer
                shared-secret "mydrbd7788";     # shared secret
        }
        syncer {
                rate 200M;      # synchronization rate
                # rate after al-extents use-rle cpu-mask verify-alg csums-alg
        }
}
<3> Define a resource in /etc/drbd.d/mydrbd.res with the following content:
resource mydrbd {
        device /dev/drbd0;
        disk /dev/sda5;
        meta-disk internal;
        on node1.ikki.com {
                address 192.168.1.17:7789;
        }
        on node2.ikki.com {
                address 192.168.1.18:7789;
        }
}
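Before going further it is worth letting drbdadm parse the files; drbdadm dump prints the parsed resource and fails loudly on syntax errors:
# drbdadm dump mydrbd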
Sync all of the files configured above to the other node:
# scp -r /etc/drbd.* node2:/etc
5. Initialize the defined resource on both nodes and start the service:
<1> Initialize the resource (Node1 and Node2):
# drbdadm create-md mydrbd
<2> Start the service (Node1 and Node2):
# /etc/init.d/drbd start
<3> Check the startup status (Node1):
# cat /proc/drbd
version: 8.3.15 (api:88/proto:86-97)
GIT-hash: 0ce4d235fc02b5c53c1c52c53433d11a694eab8c build by mockbuild@builder17.centos.org, 2013-03-27 16:04:08
 0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:987896
<4> Promote the current node to primary (Node1)
# drbdadm -- --overwrite-data-of-peer primary mydrbd
Note: this forced promotion is only needed on the initial setup, while both sides are still Inconsistent.
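The initial full sync takes a while; you can watch it run until both sides show UpToDate, e.g.:
# watch -n 1 'cat /proc/drbd'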
Check the status again:
# drbd-overview
  0:mydrbd  Connected Primary/Secondary UpToDate/UpToDate C r-----
Note: in the Primary/Secondary pair, the first role is the local node and the second is the peer.
6. Create a filesystem and mount it (primary node Node1)
The filesystem can only be mounted on the Primary node, so format the drbd device on the primary:
# mke2fs -j /dev/drbd0
# mkdir /mydata
# mount /dev/drbd0 /mydata
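A quick check that the device is mounted and has roughly the expected 1G capacity:
# df -h /mydata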
7. Test by switching the primary and secondary roles
Node1:
# cp /etc/inittab /mydata
# umount /mydata
# drbdadm secondary mydrbd
# drbd-overview
  0:mydrbd  Connected Secondary/Secondary UpToDate/UpToDate C r-----
Node2:
# drbdadm primary mydrbd
# drbd-overview
  0:mydrbd  Connected Primary/Secondary UpToDate/UpToDate C r-----
# mkdir /mydata
# mount /dev/drbd0 /mydata
# ls /mydata
8. Configure openais/corosync + pacemaker
<1> Install corosync and pacemaker (SteppingStone)
# cd /root/corosync/
# ls
cluster-glue-1.0.6-1.6.el5.i386.rpm
cluster-glue-libs-1.0.6-1.6.el5.i386.rpm
corosync-1.2.7-1.1.el5.i386.rpm
corosynclib-1.2.7-1.1.el5.i386.rpm
heartbeat-3.0.3-2.3.el5.i386.rpm
heartbeat-libs-3.0.3-2.3.el5.i386.rpm
libesmtp-1.0.4-5.el5.i386.rpm
pacemaker-1.1.5-1.1.el5.i386.rpm
pacemaker-libs-1.1.5-1.1.el5.i386.rpm
resource-agents-1.0.4-1.1.el5.i386.rpm
# step 'mkdir /root/corosync'
# for I in {1..2}; do scp *.rpm node$I:/root/corosync; done
# step 'yum -y --nogpgcheck localinstall /root/corosync/*.rpm'
# step 'mkdir /var/log/cluster'
<2> Modify the corosync configuration and set up key authentication (Node1)
# cd /etc/corosync/
# cp corosync.conf.example corosync.conf
# vim corosync.conf        # change the following:
secauth: on
threads: 2
bindnetaddr: 192.168.1.0
to_syslog: no
# vim corosync.conf        # add the following:
service {
        ver: 0
        name: pacemaker
}
aisexec {
        user: root
        group: root
}
# corosync-keygen
# scp -p authkey corosync.conf node2:/etc/corosync/
<3> Start the service and verify (Node1)
# service corosync start
# ssh node2 'service corosync start'
# grep -e "Corosync Cluster Engine" -e "configuration file" /var/log/cluster/corosync.log
# grep TOTEM /var/log/cluster/corosync.log
# grep pcmk_startup /var/log/cluster/corosync.log
<4> Configure cluster properties
Disable stonith, ignore loss of quorum, and set the default resource stickiness:
# crm configure property stonith-enabled=false
# crm configure property no-quorum-policy=ignore
# crm configure rsc_defaults resource-stickiness=100
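crm_verify can validate the live CIB; on this two-node setup it complains until stonith is disabled, and should now exit cleanly:
# crm_verify -L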
Check the cluster configuration:
# crm configure show
node node1.ikki.com
node node2.ikki.com
property $id="cib-bootstrap-options" \
        dc-version="1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        stonith-enabled="false" \
        no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
        resource-stickiness="100"
9. Define the configured drbd device /dev/drbd0 as a cluster service
<1> Stop the drbd service and disable it at boot (Node1 and Node2)
# service drbd stop
# chkconfig drbd off
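Since drbd is about to be put under the cluster's control, it must not start on its own; a quick check from SteppingStone:
# step 'chkconfig --list drbd'    # every runlevel should show off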
<2> Configure drbd as a cluster resource (Node1)
Add a primitive for the mydrbd resource and make it a master/slave resource:
# crm configure primitive mysqldrbd ocf:linbit:drbd params drbd_resource=mydrbd op start timeout=240 op stop timeout=100 op monitor role=Master interval=20 timeout=30 op monitor role=Slave interval=30 timeout=30
# crm configure ms ms_mysqldrbd mysqldrbd meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
Note: the cluster resource must not share a name with the drbd resource; if crm status reports an error, check the configuration and restart the corosync service.
Check the current cluster status:
# crm status
============
Last updated: Sat Sep 21 23:27:01 2013
Stack: openais
Current DC: node1.ikki.com - partition with quorum
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
1 Resources configured.
============
Online: [ node2.ikki.com node1.ikki.com ]
 Master/Slave Set: ms_mysqldrbd [mysqldrbd]
     Masters: [ node1.ikki.com ]
     Slaves: [ node2.ikki.com ]
<3> Create a cluster service that automatically mounts the mydrbd resource on the master node (Node1)
# crm configure primitive mystore ocf:heartbeat:Filesystem params device=/dev/drbd0 directory=/mydata fstype=ext3 op start timeout=60 op stop timeout=60
# crm configure colocation mystore_with_ms_mysqldrbd inf: mystore ms_mysqldrbd:Master
# crm configure order mystore_after_ms_mysqldrbd mandatory: ms_mysqldrbd:promote mystore:start
Check the resource status:
# crm status
============
Last updated: Sat Sep 21 23:55:01 2013
Stack: openais
Current DC: node1.ikki.com - partition with quorum
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
2 Resources configured.
============
Online: [ node2.ikki.com node1.ikki.com ]
 Master/Slave Set: ms_mysqldrbd [mysqldrbd]
     Masters: [ node1.ikki.com ]
     Slaves: [ node2.ikki.com ]
 mystore        (ocf::heartbeat:Filesystem):    Started node1.ikki.com
<4> Simulate a failure for testing
Put node1 into standby; the resources move to node2:
# crm node standby
# crm status
============
Last updated: Sat Sep 21 23:59:38 2013
Stack: openais
Current DC: node1.ikki.com - partition with quorum
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
2 Resources configured.
============
Node node1.ikki.com: standby
Online: [ node2.ikki.com ]
 Master/Slave Set: ms_mysqldrbd [mysqldrbd]
     Masters: [ node2.ikki.com ]
     Stopped: [ mysqldrbd:0 ]
 mystore        (ocf::heartbeat:Filesystem):    Started node2.ikki.com
# ls /mydata/
inittab  lost+found
Bring node1 back online; node2 stays the master:
# crm node online
# crm status
============
Last updated: Sat Sep 21 23:59:59 2013
Stack: openais
Current DC: node1.ikki.com - partition with quorum
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
2 Resources configured.
============
Online: [ node2.ikki.com node1.ikki.com ]
 Master/Slave Set: ms_mysqldrbd [mysqldrbd]
     Masters: [ node2.ikki.com ]
     Slaves: [ node1.ikki.com ]
 mystore        (ocf::heartbeat:Filesystem):    Started node2.ikki.com
10. Configure the highly available MySQL cluster service
<1> Install MySQL on each node (SteppingStone)
The generic binary tarball of mysql-5.5.28 is used here.
# for I in {1..2}; do scp mysql-5.5.28-linux2.6-i686.tar.gz node$I:/usr/src/; done
# step 'tar -xf /usr/src/mysql-5.5.28-linux2.6-i686.tar.gz -C /usr/local'
# step 'ln -sv /usr/local/mysql-5.5.28-linux2.6-i686 /usr/local/mysql'
# step 'groupadd -g 3306 mysql'
# step 'useradd -u 3306 -g mysql -s /sbin/nologin -M mysql'
# step 'mkdir /mydata/data'
# step 'chown -R mysql.mysql /mydata/data'
# step 'chown -R root.mysql /usr/local/mysql/*'
# step 'cp /usr/local/mysql/support-files/my-large.cnf /etc/my.cnf'
# step 'cp /usr/local/mysql/support-files/mysql.server /etc/init.d/mysqld'
# step 'chkconfig --add mysqld'
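A quick check from SteppingStone that the init script is registered on both nodes (it is disabled at boot in the next step):
# step 'chkconfig --list mysqld'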
<2> Initialize MySQL on the current master node and test startup (Node2)
# cd /usr/local/mysql
# scripts/mysql_install_db --user=mysql --datadir=/mydata/data
# vim /etc/my.cnf        # add the following under [mysqld]:
datadir = /mydata/data
# service mysqld start
# service mysqld stop
# chkconfig mysqld off
<3> Make Node1 the master node and configure MySQL (no need to initialize again)
Put node2 into standby so the resources move to node1, then bring it back online (Node2):
# crm node standby
# crm node online
Configure the MySQL service on node1 and test startup:
# vim /etc/my.cnf        # add the following under [mysqld]:
datadir = /mydata/data
# service mysqld start
# service mysqld stop
# chkconfig mysqld off
<4> Configure the primitive resources mysqld and vip (Node1)
# crm configure primitive mysqld lsb:mysqld
# crm configure colocation mysqld_with_mystore inf: mysqld mystore
# crm configure order mysqld_after_mystore mandatory: mystore mysqld
# crm configure primitive vip ocf:heartbeat:IPaddr params ip=192.168.1.20 nic=eth0 cidr_netmask=24
# crm configure colocation vip_with_ms_mysqldrbd inf: ms_mysqldrbd:Master vip
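To confirm the VIP is actually up on the master (ocf:heartbeat:IPaddr configures it as an alias on eth0), e.g. on node1:
# ip addr show eth0 | grep 192.168.1.20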
Check the resource status:
# crm status
============
Last updated: Sun Sep 22 13:03:27 2013
Stack: openais
Current DC: node1.ikki.com - partition with quorum
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
4 Resources configured.
============
Online: [ node2.ikki.com node1.ikki.com ]
 Master/Slave Set: ms_mysqldrbd [mysqldrbd]
     Masters: [ node1.ikki.com ]
     Slaves: [ node2.ikki.com ]
 mystore        (ocf::heartbeat:Filesystem):    Started node1.ikki.com
 mysqld (lsb:mysqld):   Started node1.ikki.com
 vip    (ocf::heartbeat:IPaddr):        Started node1.ikki.com
Check the cluster configuration:
# crm configure show
node node1.ikki.com \
        attributes standby="off"
node node2.ikki.com \
        attributes standby="off"
primitive mysqld lsb:mysqld
primitive mysqldrbd ocf:linbit:drbd \
        params drbd_resource="mydrbd" \
        op start interval="0" timeout="240" \
        op stop interval="0" timeout="100" \
        op monitor interval="20" role="Master" timeout="30" \
        op monitor interval="30" role="Slave" timeout="30"
primitive mystore ocf:heartbeat:Filesystem \
        params device="/dev/drbd0" directory="/mydata" fstype="ext3" \
        op start interval="0" timeout="60" \
        op stop interval="0" timeout="60"
primitive vip ocf:heartbeat:IPaddr \
        params ip="192.168.1.20" nic="eth0" cidr_netmask="24"
ms ms_mysqldrbd mysqldrbd \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
colocation mysqld_with_mystore inf: mysqld mystore
colocation mystore_with_ms_mysqldrbd inf: mystore ms_mysqldrbd:Master
colocation vip_with_ms_mysqldrbd inf: ms_mysqldrbd:Master vip
order mysqld_after_mystore inf: mystore mysqld
order mystore_after_ms_mysqldrbd inf: ms_mysqldrbd:promote mystore:start
property $id="cib-bootstrap-options" \
        dc-version="1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        stonith-enabled="false" \
        no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
        resource-stickiness="100"
11. Simulated failure testing
Configure a MySQL remote access account on the master node (Node1):
# /usr/local/mysql/bin/mysql
mysql> grant all on *.* to root@'%' identified by 'ikki';
mysql> flush privileges;
Test remote access from the stepping stone (SteppingStone):
# mysql -uroot -h192.168.1.20 -p
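A slightly stronger, non-interactive test, assuming the grant above, runs a query through the VIP:
# mysql -uroot -h192.168.1.20 -p -e 'SELECT VERSION();'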
Put node1 into standby and check the cluster status (Node1):
# crm node standby
# crm status
============
Last updated: Sun Sep 22 13:47:00 2013
Stack: openais
Current DC: node1.ikki.com - partition with quorum
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
4 Resources configured.
============
Node node1.ikki.com: standby
Online: [ node2.ikki.com ]
 Master/Slave Set: ms_mysqldrbd [mysqldrbd]
     Masters: [ node2.ikki.com ]
     Stopped: [ mysqldrbd:0 ]
 mystore        (ocf::heartbeat:Filesystem):    Started node2.ikki.com
 mysqld (lsb:mysqld):   Started node2.ikki.com
 vip    (ocf::heartbeat:IPaddr):        Started node2.ikki.com
Test remote access from the stepping stone again (SteppingStone):
# mysql -uroot -h192.168.1.20 -p
Bring node1 back online and check the cluster status (Node1):
# crm node online
# crm status
============
Last updated: Sun Sep 22 13:52:09 2013
Stack: openais
Current DC: node1.ikki.com - partition with quorum
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
4 Resources configured.
============
Online: [ node2.ikki.com node1.ikki.com ]
 Master/Slave Set: ms_mysqldrbd [mysqldrbd]
     Masters: [ node2.ikki.com ]
     Slaves: [ node1.ikki.com ]
 mystore        (ocf::heartbeat:Filesystem):    Started node2.ikki.com
 mysqld (lsb:mysqld):   Started node2.ikki.com
 vip    (ocf::heartbeat:IPaddr):        Started node2.ikki.com