一、MMM概述
MMM (Master-Master Replication Manager for MySQL) is a set of flexible scripts to perform monitoring/failover and management of MySQL Master-Master replication configurations (with only one node writable at any time). The toolset also has the ability to read balance standard master/slave configurations with any number of slaves, so you can use it to move virtual IP addresses around a group of servers depending on whether they are behind in replication. In addition to that, it also has scripts for data backups, resynchronization between nodes etc.
二、MMM的模式
分两种典型模式
两个节点的设置
two masters + one/many slaves
下面以第二种模式介绍:
大致规划:
服务器列表:
服务器 | IP | 主机名 | server id |
monitoring host | 192.168.0.100 | monitor | - |
master 1 | 192.168.0.111 | db1 | 111 |
master 2 | 192.168.0.112 | db2 | 112 |
slave 1 | 192.168.0.113 | db3 | 113 |
slave 2 | 192.168.0.114 | db4 | 114 |
虚拟IP列表:
IP | role | description |
192.168.0.5 | writer | 应用程序连接的VIP进行写操作 |
192.168.0.6 | reader | 应用程序连接的VIP进行读操作 |
192.168.0.7 | reader |
三、安装mysql
首先在master1上安装
[root@db1 ~]# tar xf mysql-5.5.33-linux2.6-x86_64.tar.gz -C /usr/local/ [root@db1 ~]# ln -sv /usr/local/mysql-5.5.33-linux2.6-x86_64 /usr/local/mysql [root@db1 ~]# groupadd -r mysql && useradd -r -g mysql -s /sbin/nologin mysql [root@db1 ~]# cd /usr/local/mysql [root@db1 mysql]# chown -R root.mysql ./* [root@db1 mysql]# scripts/mysql_install_db --user=mysql --datadir=/mydata/data/ [root@db1 mysql]# cp support-files/my-large.cnf /etc/my.cnf [root@db1 mysql]# cp support-files/mysql.server /etc/rc.d/init.d/mysqld [root@db1 mysql]# chmod +x /etc/rc.d/init.d/mysqld # #修改配置文件 [root@db1 mysql]# vim /etc/my.cnf #需要添加和修改的内容;其他内容不变 datadir = /mydata/data #指定数据位置 log-bin=/mydata/logs/master-bin #指定二进制日志位置 binlog_format=mixed # server-id = 111 #server ID relay_log = /var/log/mysql-relay-bin #中继日志 expire_logs_days = 10 max_binlog_size = 100M log_slave_updates = 1 auto_increment_increment = 2 #自动增长步长 auto_increment_offset=1 #起始值 # #配置完成后对应的拷贝配置文件到其他的三台主机上并相应的对其修改即可 #注意的是server ID和自动增长值(只需更改master2);slave的不需要添加 [root@db1 ~]# service mysqld start Starting MySQL............. [ OK ]
创建用户
用户 | description | privileges |
monitor user | 用于mmm检查mysql服务器健康状况的用户 | replication client |
agent user | 用于mmm代理为只读模式和复制等的用户 | super,replication client,process |
replication user | 复制的用户 | replication slave |
mysql> grant replication client on *.* to '3m_mon'@'192.168.0.%' identified by '123456'; mysql> grant super,replication client,process on *.* to '3m_agent'@'192.168.0.%' identified by '123456'; mysql> grant replication slave on *.* to 'rp_user'@'192.168.0.%' identified by '123456';
四、同步数据
复制到各数据库
[root@db1 ~]# mysqldump -uroot --lock-all-tables \ > --all-databases --flush-logs \ > --master-data=2 \ > --events > /tmp/all-databases.sql #备份master1的数据库 [root@db1 ~]# scp /tmp/all-databases.sql db2:/tmp [root@db1 ~]# scp /tmp/all-databases.sql db3:/tmp [root@db1 ~]# scp /tmp/all-databases.sql db4:/tmp #将备份传送到其他三台主机 # #在其他三台主机导入数据库 [root@db2 ~]# mysql < /tmp/all-databases.sql [root@db3 ~]# mysql < /tmp/all-databases.sql [root@db4 ~]# mysql < /tmp/all-databases.sql
设置master1/master2为双主;slave1/slave2为master1的从
#(db2) mysql> change master to master_host='192.168.0.111',\ -> master_user='rp_user',\ -> master_password='123456',\ -> master_log_file='master-bin.000002',\ -> master_log_pos=107; Query OK, 0 rows affected (0.03 sec) #(db3) mysql> change master to master_host='192.168.0.111',\ -> master_user='rp_user',\ -> master_password='123456',\ -> master_log_file='master-bin.000002',\ -> master_log_pos=107; Query OK, 0 rows affected (0.03 sec) #(db4) mysql> change master to master_host='192.168.0.111',\ -> master_user='rp_user',\ -> master_password='123456',\ -> master_log_file='master-bin.000002',\ -> master_log_pos=107; Query OK, 0 rows affected (0.03 sec) # # #三台均启动并查看;注意是每台都需要查看是否正常 mysql> start slave; mysql> show slave status\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 192.168.0.111 Master_User: rp_user Master_Port: 3306 Connect_Retry: 60 Master_Log_File: master-bin.000002 Read_Master_Log_Pos: 703 Relay_Log_File: mysql-relay-bin.000003 Relay_Log_Pos: 850 Relay_Master_Log_File: master-bin.000002 Slave_IO_Running: Yes #这里为yes就说明正常 Slave_SQL_Running: Yes #这里为yes就说明正常
配置db1的主为db2
#查看master2的master_log_file and master_log_pos: (db2) mysql> show master status; +-------------------+----------+--------------+------------------+ | File | Position | Binlog_Do_DB | Binlog_Ignore_DB | +-------------------+----------+--------------+------------------+ | master-bin.000001 | 191 | | | +-------------------+----------+--------------+------------------+ 1 row in set (0.00 sec) #这里的数字跟上面的可能有出入;是因为这步出了点故障;后来重新装的;步骤是没有问题的 #在db1上配置 (db1) mysql> change master to master_host='192.168.0.112',\ -> master_user='rp_uer', -> master_password='123456',\ -> master_log_file='master-bin.000001',\ -> master_log_pos=191; Query OK, 0 rows affected (0.09 sec) mysql> show slave status\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 192.168.0.112 Master_User: rp_user Master_Port: 3306 Connect_Retry: 60 Master_Log_File: master-bin.000001 Read_Master_Log_Pos: 191 Relay_Log_File: mysql-relay-bin.000002 Relay_Log_Pos: 254 Relay_Master_Log_File: master-bin.000001 Slave_IO_Running: Yes Slave_SQL_Running: Yes
五、安装MMM
#在monitor主机上安装 [root@montior ~]# yum -y install mysql-mmm-* [root@montior ~]# rpm -ql mysql-mmm-* /etc/logrotate.d/mysql-mmm /etc/mysql-mmm /etc/mysql-mmm/mmm_common.conf /usr/share/doc/mysql-mmm-2.2.1 #各db主机上安装: [root@db1 ~]# yum -y install mysql-mmm-agent
配置MMM
[root@montior ~]# cat /etc/mysql-mmm/mmm_common.conf active_master_role writer <host default> cluster_interface eth0 pid_path /var/run/mysql-mmm/mmm_agentd.pid bin_path /usr/libexec/mysql-mmm/ replication_user rp_user replication_password 123456 agent_user 3m_agent agent_password 123456 </host> <host db1> ip 192.168.0.111 mode master peer db2 #peer表示db1,db2是同等级别的 </host> <host db2> ip 192.168.0.112 mode master peer db1 </host> <host db3> ip 192.168.0.113 mode slave </host> <host db4> ip 192.168.0.114 mode slave </host> <role writer> hosts db1, db2 ips 192.168.0.5 mode exclusive #exclusive表示排它 </role> <role reader> hosts db3,db4 ips 192.168.0.6, 192.168.0.7 mode balanced #balanced表示可以共用 </role> #配置完成后拷贝到其他主机上各一份
monitor还需配置mmm_mon.conf
[root@montior ~]# vim /etc/mysql-mmm/mmm_mon.conf include mmm_common.conf <monitor> ip 127.0.0.1 pid_path /var/run/mysql-mmm/mmm_mond.pid bin_path /usr/libexec/mysql-mmm status_path /var/lib/mysql-mmm/mmm_mond.status ping_ips 192.168.0.111,192.168.0.112,192.168.0.113,192.168.0.114 auto_set_online 60 # The kill_host_bin does not exist by default, though the monitor will # throw a warning about it missing. See the section 5.10 "Kill Host # Functionality" in the PDF documentation. # # kill_host_bin /usr/libexec/mysql-mmm/monitor/kill_host # </monitor> <host default> monitor_user 3m_mon monitor_password 123456 </host> debug 0
对应的各个db上需要更改/etc/mysql-mmm/mmm_agent.conf
#(db1) include mmm_common.conf # The 'this' variable refers to this server. Proper operation requires # that 'this' server (db1 by default), as well as all other servers, have the # proper IP addresses set in mmm_common.conf. this db1 #db2就更改为this db2;以此类推
六、启动测试
[root@db1 ~]# chkconfig mysql-mmm-agent on #每个db都执行 [root@db1 ~]# service mysql-mmm-agent start Starting MMM Agent Daemon: [ OK ] [root@db1 ~]# [root@db2 ~]# service mysql-mmm-agent start Starting MMM Agent Daemon: [ OK ] [root@db2 ~]# #依次启动 #启动monitor [root@montior ~]# service mysql-mmm-monitor start Starting MMM Monitor Daemon: [ OK ]
查看状态加测试下
[root@montior ~]# mmm_control show db1(192.168.0.111) master/ONLINE. Roles: writer(192.168.0.5) db2(192.168.0.112) master/ONLINE. Roles: db3(192.168.0.113) slave/ONLINE. Roles: reader(192.168.0.6) db4(192.168.0.114) slave/ONLINE. Roles: reader(192.168.0.7) #启动查看是正常的。 #现在把master1离线;测试看下 [root@montior ~]# mmm_control set_offline db1 #为设置离线的 OK: State of 'db1' changed to ADMIN_OFFLINE. Now you can wait some time and check all roles! [root@montior ~]# mmm_control show db1(192.168.0.111) master/ADMIN_OFFLINE. Roles: db2(192.168.0.112) master/ONLINE. Roles: writer(192.168.0.5) db3(192.168.0.113) slave/ONLINE. Roles: reader(192.168.0.6) db4(192.168.0.114) slave/ONLINE. Roles: reader(192.168.0.7) #查看转移正常。
七、常用选项
默认check包含:
ping | 服务器是否可以ping通 |
rep_backlog | 检查复制延迟(replication behind) |
rep_threads | 检查复制线程是否运行正常 |
Mysql | 检测mysql是否可以连接 |
[root@montior ~]# mmm_ mmm_agentd mmm_backup mmm_clone mmm_control mmm_mond mmm_restore #有很多命令可以使用 [root@montior ~]# mmm_control --help 查看每个命令帮助信息和选项 Invalid command '--help' Valid commands are: help - show this message ping - ping monitor show - show status checks [<host>|all [<check>|all]] - show checks status set_online <host> - set host <host> online set_offline <host> - set host <host> offline mode - print current mode. set_active - switch into active mode. set_manual - switch into manual mode. set_passive - switch into passive mode. move_role [--force] <role> <host> - move exclusive role <role> to host <host> (Only use --force if you know what you are doing!) set_ip <ip> <host> - set role with ip <ip> to host <host> #以检测为例 [root@montior ~]# mmm_control checks db4 ping [last change: 2014/04/21 19:20:38] OK db4 mysql [last change: 2014/04/21 19:20:38] OK db4 rep_threads [last change: 2014/04/21 19:20:38] OK db4 rep_backlog [last change: 2014/04/21 19:20:38] OK: Backlog is null db2 ping [last change: 2014/04/21 19:20:38] OK db2 mysql [last change: 2014/04/21 19:20:38] OK db2 rep_threads [last change: 2014/04/21 19:20:38] OK db2 rep_backlog [last change: 2014/04/21 19:20:38] OK: Backlog is null db3 ping [last change: 2014/04/21 19:20:38] OK db3 mysql [last change: 2014/04/21 19:20:38] OK db3 rep_threads [last change: 2014/04/21 19:20:38] OK db3 rep_backlog [last change: 2014/04/21 19:20:38] OK: Backlog is null db1 ping [last change: 2014/04/21 19:20:38] OK db1 mysql [last change: 2014/04/21 19:20:38] OK db1 rep_threads [last change: 2014/04/21 19:28:49] OK db1 rep_backlog [last change: 2014/04/21 19:20:38] OK: Backlog is null #可以看出检测的选项和结果
Agent states
ONLINE
-All is peachy, only state in which a node can have a role assigned.
REPLICATION_DELAY
-Replication backlog is too big(Check rep_backlog failed)
REPLICATION_FAIL
-Replication threads are not running(Check rep_threads and rep_backlog failed)
AWAITING_RECOVERY
-Host is awaiting recovery Entered after HARD_OFFLINE -> all is ok
HARD_OFFLINE
-Host is offline (Check ping and/or mysql failed)
ADMIN_OFFLINE
-Host was set to offline manually
UNKNOWN
-Host is in unknown state
如有错误;恳请纠正。