Introduction to MariaDB Galera Cluster
Galera Cluster is a free, open-source high-availability solution developed by Codership; a Galera Cluster is essentially a MariaDB cluster with Galera installed. It is multi-master by design and supports writes on multiple nodes: the cluster's three (or more) nodes are peers, every node accepts writes, and the cluster internally guarantees the consistency and integrity of the written data.
Website: http://galeracluster.com
Documentation: https://galeracluster.com/library/documentation/index.html
The officially listed features:
- A true multi-master, active-active cluster: all nodes can read and write the database at the same time;
- Synchronous replication with no replication lag;
- Multi-threaded replication;
- No master/slave failover operations and no need for a virtual IP;
- Hot standby: the failure of a single node does not interrupt database service;
- Automatic node provisioning: no manual data copying, automatic membership control, automatic eviction of failed nodes, and automatic data replication to newly joined nodes;
- Supports the InnoDB storage engine;
- Transparent to applications, with a native MySQL interface;
- No read/write splitting required;
- Simple to deploy and use.
Drawbacks of a Galera cluster:
- Adding a new node is expensive, because a full copy of the data must be transferred;
- It does not really solve write scaling: every write is applied on every node;
- There are as many copies of the data as there are nodes;
- Because a commit requires cross-node communication (effectively a distributed transaction), writes are much slower than with master/slave replication; the more nodes, the slower the writes, and deadlocks and rollbacks become more frequent;
- It is demanding on the network: if the network is unstable, nodes may lose contact with each other, the cluster can split-brain, and the service becomes unavailable;
- Only the InnoDB/XtraDB storage engine is supported. Writes to tables using any other engine, including the mysql.* tables, are not replicated. DDL statements are replicated, but row changes such as INSERT INTO mysql.user (a MyISAM table) are not;
- DELETE is not supported on tables without a primary key, because row order on such tables differs between nodes, so SELECT ... LIMIT ... can return different result sets (a query to find such tables is sketched after this list);
- The write throughput of the whole cluster is limited by the weakest node, so all nodes should use the same configuration.
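As a pre-migration check for the primary-key limitation above, the following query lists user tables that have no primary key. This is a minimal sketch against the standard information_schema views; adjust the excluded schemas to your environment.
#List user tables without a primary key (problematic under Galera)
mysql -uroot -p <<'EOF'
SELECT t.table_schema, t.table_name
FROM information_schema.tables t
LEFT JOIN information_schema.table_constraints c
  ON c.table_schema = t.table_schema
 AND c.table_name = t.table_name
 AND c.constraint_type = 'PRIMARY KEY'
WHERE t.table_type = 'BASE TABLE'
  AND c.constraint_type IS NULL
  AND t.table_schema NOT IN ('mysql','information_schema','performance_schema');
EOF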
MariaDB Galera multi-master replication cluster
Certification-based replication
https://galeracluster.com/library/documentation/tech-desc-introduction.html
Galera replication is implemented in the Galera library; the wsrep API was developed specifically so that MySQL can communicate with it. The Galera plugin keeps the cluster's data synchronized and consistent by means of certification-based replication, whose flow is as follows:
- When a client issues COMMIT, all changes made by the transaction are collected into a write-set before the transaction is committed, and the write-set is sent to the other nodes.
- The write-set then goes through a certification test on every node, and the result decides whether the node applies the write-set.
- If the certification test fails, the node discards the write-set and the transaction is rolled back; if it succeeds, the transaction is committed (a check of the related conflict counters is sketched below).
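To see how often certification has rejected write-sets on a running node, you can read the conflict counters; this is only an observation aid, and the variable names come from the wsrep status output shown in the monitoring section further below.
#Cumulative certification failures and brute-force aborts since the node started
mysql -uroot -p -e "SHOW GLOBAL STATUS WHERE Variable_name IN ('wsrep_local_cert_failures','wsrep_local_bf_aborts')"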
Deploying a MariaDB Galera cluster
Reference: https://mariadb.com/kb/en/getting-started-with-mariadb-galera-cluster/
Prepare three CentOS 7 nodes for the Galera cluster.
Disable the firewall and SELinux:
systemctl disable --now firewalld
sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' /etc/selinux/config && setenforce 0
Set the hostname on each node:
hostnamectl set-hostname xxx
Configure hostname resolution:
cat > /etc/hosts <<EOF
192.168.93.11 galera1
192.168.93.12 galera2
192.168.93.13 galera3
EOF
Install MariaDB
https://mariadb.com/kb/en/yum/#installing-mariadb-galera-cluster-with-yum
#Configure the yum repository
cat > /etc/yum.repos.d/mariadb.repo <<EOF
[mariadb]
name = MariaDB
baseurl = http://yum.mariadb.org/10.4/centos7-amd64
gpgkey=https://yum.mariadb.org/RPM-GPG-KEY-MariaDB
gpgcheck=1
EOF
#Replace the upstream with the USTC mirror
sed -i 's#yum\.mariadb\.org#mirrors.ustc.edu.cn/mariadb/yum#' /etc/yum.repos.d/mariadb.repo
#Install the MariaDB server and client plus rsync; the Galera package is installed automatically as a dependency
yum install -y MariaDB-server MariaDB-client rsync
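A quick sanity check that the expected packages were installed (exact package names and versions may vary with the repository release):
#Confirm the MariaDB and Galera packages and the client version
rpm -qa | grep -Ei 'mariadb|galera'
mysql --version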
Configure the MariaDB Galera cluster
https://mariadb.com/kb/en/configuring-mariadb-galera-cluster/
https://galeracluster.com/library/training/tutorials/galera-on-aws.html
galera1 node:
cat > /etc/my.cnf.d/server.cnf <<EOF
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
bind-address=0.0.0.0
user=mysql
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
innodb_flush_log_at_trx_commit=0
innodb_buffer_pool_size=128M
binlog_format=ROW
log-error=/var/log/mysqld.log
[galera]
wsrep_on=ON
wsrep_provider=/usr/lib64/galera-4/libgalera_smm.so
wsrep_node_name='galera1'
wsrep_node_address="192.168.93.11"
wsrep_cluster_name='galera-cluster'
wsrep_cluster_address="gcomm://192.168.93.11,192.168.93.12,192.168.93.13"
wsrep_provider_options="gcache.size=300M; gcache.page_size=300M"
wsrep_slave_threads=4
wsrep_sst_method=rsync
EOF
galera2 node:
cat > /etc/my.cnf.d/server.cnf <<EOF
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
bind-address=0.0.0.0
user=mysql
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
innodb_flush_log_at_trx_commit=0
innodb_buffer_pool_size=128M
binlog_format=ROW
log-error=/var/log/mysqld.log
[galera]
wsrep_on=ON
wsrep_provider=/usr/lib64/galera-4/libgalera_smm.so
wsrep_node_name='galera2'
wsrep_node_address="192.168.93.12"
wsrep_cluster_name='galera-cluster'
wsrep_cluster_address="gcomm://192.168.93.11,192.168.93.12,192.168.93.13"
wsrep_provider_options="gcache.size=300M; gcache.page_size=300M"
wsrep_slave_threads=4
wsrep_sst_method=rsync
EOF
galera3 node:
cat > /etc/my.cnf.d/server.cnf <<EOF
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
bind-address=0.0.0.0
user=mysql
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
innodb_flush_log_at_trx_commit=0
innodb_buffer_pool_size=128M
binlog_format=ROW
log-error=/var/log/mysqld.log
[galera]
wsrep_on=ON
wsrep_provider=/usr/lib64/galera-4/libgalera_smm.so
wsrep_node_name='galera3'
wsrep_node_address="192.168.93.13"
wsrep_cluster_name='galera-cluster'
wsrep_cluster_address="gcomm://192.168.93.11,192.168.93.12,192.168.93.13"
wsrep_provider_options="gcache.size=300M; gcache.page_size=300M"
wsrep_slave_threads=4
wsrep_sst_method=rsync
EOF
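The three server.cnf files differ only in wsrep_node_name and wsrep_node_address, so instead of editing the file by hand on each node you can patch just those two lines. A minimal sketch, assuming each node's short hostname (galera1/2/3) resolves through the /etc/hosts entries created earlier:
#Derive the per-node values from the local hostname and patch only the node-specific lines
NODE_NAME=$(hostname -s)
NODE_ADDR=$(getent hosts "$NODE_NAME" | awk '{print $1}')
sed -i "s/^wsrep_node_name=.*/wsrep_node_name='$NODE_NAME'/" /etc/my.cnf.d/server.cnf
sed -i "s|^wsrep_node_address=.*|wsrep_node_address=\"$NODE_ADDR\"|" /etc/my.cnf.d/server.cnf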
If you use the xtrabackup-based SST method instead, wsrep_sst_method has to be changed and a dedicated replication account created; the rsync method used here requires neither (a sketch follows below).
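A rough sketch of what the mariabackup-based SST (MariaDB's build of xtrabackup) could look like; the account name, password, and exact privilege list are illustrative assumptions, so check the MariaDB documentation for your version before using it.
#Sketch only: switch SST from rsync to mariabackup
yum install -y MariaDB-backup
#In the [galera] section of /etc/my.cnf.d/server.cnf on every node:
#  wsrep_sst_method=mariabackup
#  wsrep_sst_auth=sst_user:sst_password
#Create the SST account (hypothetical name/password) with the privileges mariabackup needs
mysql -uroot -p -e "CREATE USER 'sst_user'@'localhost' IDENTIFIED BY 'sst_password';
GRANT RELOAD, PROCESS, LOCK TABLES, REPLICATION CLIENT ON *.* TO 'sst_user'@'localhost';"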
Start the cluster by bootstrapping it on any one node:
galera_new_cluster
systemctl enable mariadb
On the other two nodes, run:
systemctl enable --now mariadb
Verify the cluster status. No password is set by default, so just press Enter at the prompt:
[root@galera1 ~]# mysql -uroot -p -e "SHOW STATUS LIKE 'wsrep_cluster_size'"
Enter password:
+--------------------+-------+
| Variable_name | Value |
+--------------------+-------+
| wsrep_cluster_size | 3 |
+--------------------+-------+
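Optionally, also confirm on every node that it is synced and ready; the expected values are Synced, Primary and ON (see the full status output in the monitoring section below):
mysql -uroot -p -e "SHOW STATUS WHERE Variable_name IN ('wsrep_local_state_comment','wsrep_cluster_status','wsrep_ready')"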
Verify data synchronization
Create a database named galera_test on the galera1 node, then query the database list on galera2 and galera3. If galera_test shows up on both, replication is working and the cluster is healthy.
[root@galera1 ~]# mysql -e "create database galera_test"
[root@galera2 ~]# mysql -e "show databases;"
+--------------------+
| Database |
+--------------------+
| galera_test |
| information_schema |
| mysql |
| performance_schema |
| test |
+--------------------+
[root@galera3 ~]# mysql -e "show databases;"
+--------------------+
| Database |
+--------------------+
| galera_test |
| information_schema |
| mysql |
| performance_schema |
| test |
+--------------------+
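To exercise writes on more than one node, you can also create a table on galera2 and read it back on galera3; the table name and row below are arbitrary examples:
[root@galera2 ~]# mysql -e "CREATE TABLE galera_test.t1 (id INT PRIMARY KEY, note VARCHAR(32)); INSERT INTO galera_test.t1 VALUES (1, 'written on galera2');"
[root@galera3 ~]# mysql -e "SELECT * FROM galera_test.t1;"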
Configure the load balancer
References:
https://galeracluster.com/library/documentation/ha-proxy.html
https://www.shuzhiduo.com/A/QW5YywmY5m/
Install HAProxy and Keepalived on all three nodes:
yum install -y haproxy keepalived
Edit the Keepalived configuration file and use non-preemptive mode. The configuration is the same on every node; make sure the interface parameter and the virtual_ipaddress match your environment:
cat > /etc/keepalived/keepalived.conf <<EOF
! Configuration File for keepalived
global_defs {
router_id galera
}
vrrp_script check_haproxy {
script "pidof haproxy"
interval 2
fall 2
rise 2
}
vrrp_instance VI_1 {
state BACKUP
interface ens33
virtual_router_id 51
priority 100
advert_int 1
nopreempt
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.93.20
}
track_script {
check_haproxy
}
}
EOF
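Before relying on the vrrp_script, it is worth checking by hand that the health-check command behaves as expected on each node once haproxy has been started (exit code 0 while haproxy is running, non-zero after it is stopped):
#Should print the haproxy PID and exit with 0 while haproxy is running
pidof haproxy; echo $?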
Edit the HAProxy configuration file; the configuration is the same on every node:
cat > /etc/haproxy/haproxy.cfg <<EOF
global
log 127.0.0.1 local2
chroot /var/lib/haproxy
pidfile /var/run/haproxy.pid
maxconn 4000
user haproxy
group haproxy
daemon
stats socket /var/lib/haproxy/stats
defaults
mode http
log global
option httplog
option dontlognull
option http-server-close
option forwardfor except 127.0.0.0/8
option redispatch
retries 3
timeout http-request 10s
timeout queue 1m
timeout connect 10s
timeout client 1m
timeout server 1m
timeout http-keep-alive 10s
timeout check 10s
maxconn 3000
listen haproxy_stats
bind *:1080
mode http
balance roundrobin
stats uri /haproxy-stats
stats auth admin:admin
listen galera
bind *:3307
balance roundrobin
mode tcp
option tcpka
option mysql-check user haproxy
server galera1 192.168.93.11:3306 check weight 1
server galera2 192.168.93.12:3306 check weight 1
server galera3 192.168.93.13:3306 check weight 1
EOF
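HAProxy can validate the configuration file before the service is started or reloaded; it should report the file as valid:
haproxy -c -f /etc/haproxy/haproxy.cfg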
On any node, create the haproxy user that HAProxy's MySQL health check logs in as:
CREATE USER 'haproxy'@'%';
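The mysql-check option only opens and closes a connection as this user, so the account needs neither a password nor any privileges. You can confirm that it can log in from another node:
#Should connect successfully and print the current user
mysql -uhaproxy -h 192.168.93.11 -e "SELECT CURRENT_USER();"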
Start the haproxy and keepalived services on all three nodes:
systemctl enable --now haproxy keepalived
Check which node currently holds the VIP. You can stop a node, or the haproxy service on the node holding the VIP, to verify that the VIP fails over:
[root@galera1 ~]# ip a | grep 93
inet 192.168.93.11/24 brd 192.168.93.255 scope global ens33
inet 192.168.93.20/32 scope global ens33
Check the HAProxy statistics page at http://192.168.93.20:1080/haproxy-stats (the stats listener is bound to port 1080 in the configuration above); the username and password are admin/admin.
Allow the MariaDB root user to connect remotely; running this on any one node is enough:
#No password is set by default; just press Enter at the prompt
mysql -uroot -p
#Grant remote access
GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY '123456' WITH GRANT OPTION;
Clients can now reach the Galera cluster through port 3307 on the VIP:
mysql -uroot -p123456 -h192.168.93.20 -P 3307
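Because the galera listener balances with roundrobin, repeated connections through the VIP should be answered by different backends; which node handled the connection can be read from wsrep_node_name:
#Run this a few times; the reported node name should rotate across galera1/2/3
mysql -uroot -p123456 -h192.168.93.20 -P3307 -e "SHOW VARIABLES LIKE 'wsrep_node_name'"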
Restarting the cluster
Once all nodes of a MariaDB Galera cluster have been stopped, simply starting them again will fail; the cluster has to be brought back up as follows.
[root@galera1 ~]# cat /var/lib/mysql/grastate.dat
# GALERA saved state
version: 2.1
uuid: 8460aa9c-9c82-11ea-b458-0a6235722174
seqno: -1
safe_to_bootstrap: 1
Set safe_to_bootstrap to 1 on the node you bootstrap from (on the node shown above it is already 1), then run the following command on that node to bring up the first node; after the command runs, the flag is reset to 0:
galera_new_cluster
Then start the other two nodes:
systemctl start mariadb
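To find out which node is safe to bootstrap from (normally the last one that was shut down), check grastate.dat on every node before running galera_new_cluster:
#Run on each node; bootstrap from the node that reports safe_to_bootstrap: 1
grep safe_to_bootstrap /var/lib/mysql/grastate.dat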
Cluster status monitoring
MariaDB [(none)]> SHOW GLOBAL STATUS LIKE 'wsrep_%';
+-------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------+
| Variable_name | Value |
+-------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------+
| wsrep_local_state_uuid | 8460aa9c-9c82-11ea-b458-0a6235722174 |
| wsrep_protocol_version | 10 |
| wsrep_last_committed | 10 |
| wsrep_replicated | 0 |
| wsrep_replicated_bytes | 0 |
| wsrep_repl_keys | 0 |
| wsrep_repl_keys_bytes | 0 |
| wsrep_repl_data_bytes | 0 |
| wsrep_repl_other_bytes | 0 |
| wsrep_received | 4 |
| wsrep_received_bytes | 568 |
| wsrep_local_commits | 0 |
| wsrep_local_cert_failures | 0 |
| wsrep_local_replays | 0 |
| wsrep_local_send_queue | 0 |
| wsrep_local_send_queue_max | 1 |
| wsrep_local_send_queue_min | 0 |
| wsrep_local_send_queue_avg | 0 |
| wsrep_local_recv_queue | 0 |
| wsrep_local_recv_queue_max | 2 |
| wsrep_local_recv_queue_min | 0 |
| wsrep_local_recv_queue_avg | 0.25 |
| wsrep_local_cached_downto | 4 |
| wsrep_flow_control_paused_ns | 0 |
| wsrep_flow_control_paused | 0 |
| wsrep_flow_control_sent | 0 |
| wsrep_flow_control_recv | 0 |
| wsrep_cert_deps_distance | 0 |
| wsrep_apply_oooe | 0 |
| wsrep_apply_oool | 0 |
| wsrep_apply_window | 0 |
| wsrep_commit_oooe | 0 |
| wsrep_commit_oool | 0 |
| wsrep_commit_window | 0 |
| wsrep_local_state | 4 |
| wsrep_local_state_comment | Synced |
| wsrep_cert_index_size | 0 |
| wsrep_causal_reads | 0 |
| wsrep_cert_interval | 0 |
| wsrep_open_transactions | 0 |
| wsrep_open_connections | 0 |
| wsrep_incoming_addresses | AUTO,AUTO,AUTO |
| wsrep_cluster_weight | 3 |
| wsrep_desync_count | 0 |
| wsrep_evs_delayed | |
| wsrep_evs_evict_list | |
| wsrep_evs_repl_latency | 0/0/0/0/0 |
| wsrep_evs_state | OPERATIONAL |
| wsrep_gcomm_uuid | 67444d22-9c91-11ea-9774-cbf0d4795cf0 |
| wsrep_applier_thread_count | 4 |
| wsrep_cluster_capabilities | |
| wsrep_cluster_conf_id | 3 |
| wsrep_cluster_size | 3 |
| wsrep_cluster_state_uuid | 8460aa9c-9c82-11ea-b458-0a6235722174 |
| wsrep_cluster_status | Primary |
| wsrep_connected | ON |
| wsrep_local_bf_aborts | 0 |
| wsrep_local_index | 0 |
| wsrep_provider_capabilities | :MULTI_MASTER:CERTIFICATION:PARALLEL_APPLYING:TRX_REPLAY:ISOLATION:PAUSE:CAUSAL_READS:INCREMENTAL_WRITESET:UNORDERED:PREORDERED:STREAMING:NBO: |
| wsrep_provider_name | Galera |
| wsrep_provider_vendor | Codership Oy <info@codership.com> |
| wsrep_provider_version | 26.4.4(r4599) |
| wsrep_ready | ON |
| wsrep_rollbacker_thread_count | 1 |
| wsrep_thread_count | 5 |
+-------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------+
65 rows in set (0.001 sec)
Check wsrep_cluster_status on its own:
MariaDB [(none)]> SHOW STATUS LIKE 'wsrep_cluster_status';
+----------------------+---------+
| Variable_name | Value |
+----------------------+---------+
| wsrep_cluster_status | Primary |
+----------------------+---------+
1 row in set (0.001 sec)
Parameter notes (a simple periodic check is sketched after this list):
- wsrep_local_index: the node's index within the cluster (0 in the output above).
- wsrep_cluster_status: Primary means the node is part of the primary component and serves reads and writes normally.
- wsrep_ready: ON means the node is operating normally.
- wsrep_connected: if this is Off and wsrep_ready is also Off, the node is not connected to the cluster.
- wsrep_cluster_size: 3 means the cluster has three nodes.
- wsrep_cluster_state_uuid: should be identical on all nodes; a node with a different value is not connected to the cluster.
- wsrep_cluster_conf_id: should normally be the same on every node; a different value means the node has been temporarily "partitioned", and it should converge again once network connectivity between the nodes is restored.
- wsrep_flow_control_paused: the fraction of time replication has been paused, i.e. how much the cluster is slowed down by lagging nodes. It ranges from 0 to 1; the closer to 0 the better, and 1 means replication has stopped completely. Tuning wsrep_slave_threads can improve it.
- wsrep_flow_control_sent: how many times this node has paused replication by sending flow-control messages.
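For day-to-day monitoring, as mentioned at the top of this list, a simple periodic check of the key indicators can be run on any node; the interval and variable selection here are just an example:
#Poll the key health indicators every 2 seconds
watch -n2 "mysql -e \"SHOW STATUS WHERE Variable_name IN ('wsrep_cluster_size','wsrep_cluster_status','wsrep_local_state_comment','wsrep_ready','wsrep_flow_control_paused')\""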