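18.1 Introduction to Linux Clusters
By function, Linux clusters fall into two categories: high availability (HA) and load balancing.
High availability: two machines are used, one active and one as a redundant standby. When the active machine goes down, the standby takes over and continues to provide the service.
Open-source HA software includes heartbeat and keepalived; keepalived is what is mostly used today.
Load balancing: one machine acts as a dispatcher that distributes user requests to the backend servers that actually process them. Every machine other than the dispatcher is a backend server serving users, and there must be at least two backends.
Open-source load-balancing software includes LVS, keepalived, haproxy, and nginx.
Commercial load balancers include F5 and Netscaler. They are stable and handle high concurrency, but they are expensive.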
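18.2 Introduction to keepalived
For HA we focus on keepalived. heartbeat has some problems on CentOS 6 and occasionally fails over too slowly, so it is now rarely used.
keepalived implements HA with VRRP (Virtual Router Redundancy Protocol).
In VRRP, several routers with the same function form a group. The group has one master and N >= 1 backups.
The master multicasts VRRP packets to the backups. When the backups stop receiving these packets, they consider the master down and one of them takes over as the new master (the backups' priorities decide which one becomes the new master).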
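The election rule described above can be illustrated with a toy shell function. This is only a sketch of the "highest priority wins" idea, not keepalived's implementation: the function name `elect_master` and the `name:priority` input format are made up for the example, and ties simply go to the first entry listed.

```shell
#!/bin/bash
# Toy model of the VRRP election (NOT keepalived's real code).
# Arguments are hypothetical "name:priority" pairs for the surviving backups;
# the function prints the name with the highest priority.
elect_master() {
    local best_name="" best_prio=-1 entry name prio
    for entry in "$@"; do
        name=${entry%%:*}   # text before the first colon
        prio=${entry##*:}   # text after the last colon
        if [ "$prio" -gt "$best_prio" ]; then
            best_prio=$prio
            best_name=$name
        fi
    done
    echo "$best_name"
}
```

For example, `elect_master B:90 C:80` picks `B`, the backup with the higher priority.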
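keepalived consists of three modules:
1. core: the core module, which starts and maintains the main process and loads and parses the global configuration file.
2. check: the check module, which performs health checks.
3. vrrp: the module that implements the VRRP protocol.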
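18.3/18.4/18.5 Configuring an HA Cluster with keepalived
Prepare two machines, A and B. A is the master (192.168.87.128) and B is the backup (192.168.87.130).
1. Install keepalived, the tool that provides high availability, on both machines: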
[root@nginx ~]# yum install -y keepalived
已加载插件:fastestmirror
Could not retrieve mirrorlist http://mirrorlist.centos.org/?release=7&arch=x86_64&repo=os&infra=stock error was
12: Timeout on http://mirrorlist.centos.org/?release=7&arch=x86_64&repo=os&infra=stock: (28, 'Operation too slow. Less than 1000 bytes/sec transferred the last 30 seconds')
base | 3.6 kB 00:00:00
epel/x86_64/metalink | 7.9 kB 00:00:00
epel | 3.2 kB 00:00:00
extras | 3.4 kB 00:00:00
updates | 3.4 kB 00:00:00
(1/3): epel/x86_64/updateinfo | 925 kB 00:00:01
(2/3): updates/7/x86_64/primary_db | 2.7 MB 00:00:10
(3/3): epel/x86_64/primary | 3.5 MB 00:00:11
Determining fastest mirrors
* base: mirror.lzu.edu.cn
* epel: mirrors.ustc.edu.cn
* extras: mirrors.cqu.edu.cn
* updates: centos.ustc.edu.cn
epel 12607/12607
正在解决依赖关系
--> 正在检查事务
---> 软件包 keepalived.x86_64.0.1.3.5-6.el7 将被 安装
--> 正在处理依赖关系 libnetsnmpmibs.so.31()(64bit),它被软件包 keepalived-1.3.5-6.el7.x86_64 需要
--> 正在处理依赖关系 libnetsnmpagent.so.31()(64bit),它被软件包 keepalived-1.3.5-6.el7.x86_64 需要
--> 正在处理依赖关系 libnetsnmp.so.31()(64bit),它被软件包 keepalived-1.3.5-6.el7.x86_64 需要
--> 正在检查事务
---> 软件包 net-snmp-agent-libs.x86_64.1.5.7.2-33.el7_5.2 将被 安装
---> 软件包 net-snmp-libs.x86_64.1.5.7.2-33.el7_5.2 将被 安装
--> 解决依赖关系完成
依赖关系解决
=======================================================================================================================================================
Package 架构 版本 源 大小
=======================================================================================================================================================
正在安装:
keepalived x86_64 1.3.5-6.el7 base 329 k
为依赖而安装:
net-snmp-agent-libs x86_64 1:5.7.2-33.el7_5.2 updates 705 k
net-snmp-libs x86_64 1:5.7.2-33.el7_5.2 updates 749 k
事务概要
=======================================================================================================================================================
安装 1 软件包 (+2 依赖软件包)
总下载量:1.7 M
安装大小:6.0 M
Downloading packages:
(1/3): keepalived-1.3.5-6.el7.x86_64.rpm | 329 kB 00:00:01
(2/3): net-snmp-agent-libs-5.7.2-33.el7_5.2.x86_64.rpm | 705 kB 00:00:02
(3/3): net-snmp-libs-5.7.2-33.el7_5.2.x86_64.rpm | 749 kB 00:00:02
-------------------------------------------------------------------------------------------------------------------------------------------------------
总计 659 kB/s | 1.7 MB 00:00:02
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
正在安装 : 1:net-snmp-libs-5.7.2-33.el7_5.2.x86_64 1/3
正在安装 : 1:net-snmp-agent-libs-5.7.2-33.el7_5.2.x86_64 2/3
正在安装 : keepalived-1.3.5-6.el7.x86_64 3/3
验证中 : 1:net-snmp-agent-libs-5.7.2-33.el7_5.2.x86_64 1/3
验证中 : keepalived-1.3.5-6.el7.x86_64 2/3
验证中 : 1:net-snmp-libs-5.7.2-33.el7_5.2.x86_64 3/3
已安装:
keepalived.x86_64 0:1.3.5-6.el7
作为依赖被安装:
net-snmp-agent-libs.x86_64 1:5.7.2-33.el7_5.2 net-snmp-libs.x86_64 1:5.7.2-33.el7_5.2
完毕!
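2. Install nginx on both machines. nginx is the service being made highly available; for this experiment it can be installed directly with yum.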
yum install -y nginx
3. On the master, edit the keepalived configuration file:
[root@nginx keepalived]# vim /etc/keepalived/keepalived.conf
global_defs {
    notification_email {
        scause@163.com    # mailbox that receives alerts
    }
    notification_email_from scause@163.com    # sender address for alert mail
    smtp_server 127.0.0.1    # SMTP server address; 127.0.0.1 sends through the local mail server
    smtp_connect_timeout 30
    router_id LVS_DEVEL
}
vrrp_script chk_nginx {    # name of the custom check
    script "/usr/local/sbin/check_ng.sh"    # custom script that monitors the nginx service
    interval 3    # run the script every 3 seconds
}
vrrp_instance VI_1 {
    state MASTER    # this node's role is master
    interface ens33    # NIC that sends VRRP packets and carries the VIP
    virtual_router_id 51
    priority 100    # priority; the master's must be higher than the backup's
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 123456    # authentication password
    }
    virtual_ipaddress {
        192.168.87.100    # the virtual IP (VIP)
    }
    track_script {
        chk_nginx    # the check defined above
    }
}
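How quickly a backup notices a dead master follows from `advert_int` and the backup's own priority. The VRRP specification (RFC 3768) defines the timeout as 3 * advertisement interval plus a skew time of (256 - priority) / 256 seconds. The helper below is a hypothetical calculator for that formula, offered as a rough sketch; it is not part of keepalived, and the name `master_down_interval` is an assumption made for this example.

```shell
#!/bin/bash
# master_down_interval ADVERT_INT PRIORITY
# Prints, in seconds, how long a backup with the given priority waits
# without advertisements before it declares the master down
# (formula from RFC 3768; hypothetical helper, not a keepalived command).
master_down_interval() {
    awk -v a="$1" -v p="$2" 'BEGIN { printf "%.3f\n", 3 * a + (256 - p) / 256 }'
}
```

With `advert_int 1`, a node with priority 90 would wait about 3.648 seconds: `master_down_interval 1 90`.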
[root@nginx keepalived]# ip addr    # the VIP is only visible with ip addr; ifconfig does not show it
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:7c:0b:e2 brd ff:ff:ff:ff:ff:ff
inet 192.168.87.128/24 brd 192.168.87.255 scope global ens33
valid_lft forever preferred_lft forever
inet 192.168.87.100/32 scope global ens33    # the VIP is now bound to ens33
valid_lft forever preferred_lft forever
inet6 fe80::2350:6934:56c7:a6c0/64 scope link
valid_lft forever preferred_lft forever
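Reading the `ip addr` output by eye works for a demo, but the same check can be scripted. The sketch below assumes the output format shown above; the function name `vip_in_output` is made up for this example.

```shell
#!/bin/bash
# vip_in_output VIP
# Reads `ip addr` output on stdin and succeeds if VIP appears as an
# IPv4 address (a line containing "inet VIP/<prefix>").
vip_in_output() {
    grep -qF "inet $1/"
}
```

A possible use on the master: `ip addr show ens33 | vip_in_output 192.168.87.100 && echo "VIP is bound"`.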
4. Write the check_ng.sh monitoring script and set its permissions:
[root@nginx keepalived]# vim /usr/local/sbin/check_ng.sh
#!/bin/bash
# Timestamp used in log entries
d=$(date --date today +%Y%m%d_%H:%M:%S)
# Count the nginx processes
n=$(ps -C nginx --no-heading | wc -l)
# If there are no nginx processes, start nginx and count again.
# If the count is still 0, nginx cannot start, so stop keepalived.
if [ "$n" -eq 0 ]; then
    /etc/init.d/nginx start
    n2=$(ps -C nginx --no-heading | wc -l)
    if [ "$n2" -eq 0 ]; then
        echo "$d nginx down,keepalived will stop" >> /var/log/check_ng.log
        # Stop keepalived so it no longer sends VRRP packets to the backup
        systemctl stop keepalived
    fi
fi
[root@nginx keepalived]# chmod 755 /usr/local/sbin/check_ng.sh
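The script's logic (check, try one restart, give up and stop keepalived) can be tried out safely by passing the three actions in as commands. The following is a sketch only: `check_and_recover` and its parameters are hypothetical names, not something keepalived provides.

```shell
#!/bin/bash
# check_and_recover COUNT_CMD START_CMD STOP_CMD
#   COUNT_CMD prints the number of running service processes,
#   START_CMD tries to start the service,
#   STOP_CMD  is run when the service still cannot be started.
# Returns 0 if the service is (or is again) running, 1 otherwise.
check_and_recover() {
    local count_cmd=$1 start_cmd=$2 stop_cmd=$3
    if [ "$($count_cmd)" -eq 0 ]; then
        $start_cmd
        if [ "$($count_cmd)" -eq 0 ]; then
            $stop_cmd
            return 1
        fi
    fi
    return 0
}
```

Passing stub functions that only `echo` what they would do lets you dry-run the flow before wiring in the real `ps`, nginx start, and `systemctl stop keepalived` commands.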
5. Start the keepalived service and make sure nginx, the monitored service, is running:
[root@nginx keepalived]# systemctl start keepalived
[root@nginx keepalived]# ps aux|grep keepalived    # check that keepalived is running
root 1388 0.0 0.0 118652 1396 ? Ss 20:53 0:00 /usr/sbin/keepalived -D
root 1389 0.0 0.1 122852 2392 ? S 20:53 0:00 /usr/sbin/keepalived -D
root 1390 0.0 0.1 122852 2448 ? S 20:53 0:00 /usr/sbin/keepalived -D
root 1396 0.0 0.0 112720 972 pts/0 S+ 20:53 0:00 grep --color=auto keepalived
[root@nginx keepalived]# ps aux |grep nginx    # check that nginx is running
root 863 0.0 0.0 46040 1276 ? Ss 20:08 0:00 nginx: master process /usr/local/nginx/sbin/nginx -c /usr/local/nginx/conf/nginx.conf
nobody 868 0.0 0.2 48528 3912 ? S 20:08 0:00 nginx: worker process
nobody 869 0.0 0.2 48528 3912 ? S 20:08 0:00 nginx: worker process
root 896 0.0 0.0 115432 1712 ? S 20:08 0:00 /bin/sh /usr/local/mysql//bin/mysqld_safe --datadir=/data/mysql/ --pid-file=/data/mysql//nginx.pid
mysql 1119 0.3 24.5 1300896 458628 ? Sl 20:08 0:09 /usr/local/mysql/bin/mysqld --basedir=/usr/local/mysql --datadir=/data/mysql --plugin-dir=/usr/local/mysql/lib/plugin --user=mysql --log-error=/data/mysql/nginx.err --pid-file=/data/mysql//nginx.pid
root 1418 0.0 0.0 112724 972 pts/0 S+ 20:53 0:00 grep --color=auto nginx
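6. Stop nginx and check that it is started again automatically. keepalived runs the check script every 3 seconds, and the script starts nginx whenever it finds no nginx process.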
[root@nginx keepalived]# /etc/init.d/nginx stop
Stopping nginx (via systemctl): [ 确定 ]
[root@nginx keepalived]# ps aux |grep nginx    # run ps again after 3 seconds: nginx is running again, which shows the keepalived check works
root 896 0.0 0.0 115432 1712 ? S 20:08 0:00 /bin/sh /usr/local/mysql//bin/mysqld_safe --datadir=/data/mysql/ --pid-file=/data/mysql//nginx.pid
mysql 1119 0.3 24.5 1300896 458628 ? Sl 20:08 0:09 /usr/local/mysql/bin/mysqld --basedir=/usr/local/mysql --datadir=/data/mysql --plugin-dir=/usr/local/mysql/lib/plugin --user=mysql --log-error=/data/mysql/nginx.err --pid-file=/data/mysql//nginx.pid
root 1567 0.0 0.0 46040 1276 ? Ss 20:54 0:00 nginx: master process /usr/local/nginx/sbin/nginx -c /usr/local/nginx/conf/nginx.conf
nobody 1569 0.0 0.2 48528 3912 ? S 20:54 0:00 nginx: worker process
nobody 1570 0.0 0.2 48528 3912 ? S 20:54 0:00 nginx: worker process
root 1598 0.0 0.0 112720 972 pts/0 R+ 20:55 0:00 grep --color=auto nginx
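Instead of waiting a few seconds and running `ps` by hand, the recovery can be polled. The helper below is a minimal sketch; `wait_until` is a made-up name, and the one-second polling step is an arbitrary choice.

```shell
#!/bin/bash
# wait_until TIMEOUT_SECONDS COMMAND [ARGS...]
# Runs COMMAND once per second until it succeeds or TIMEOUT_SECONDS
# attempts have been made. Returns 0 on success, 1 on timeout.
wait_until() {
    local timeout=$1 i=0
    shift
    while [ "$i" -lt "$timeout" ]; do
        if "$@"; then
            return 0
        fi
        sleep 1
        i=$((i + 1))
    done
    return 1
}
```

After stopping nginx, something like `wait_until 10 pgrep -x nginx >/dev/null && echo "nginx is back"` would report when the check script has restarted it.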
7. Edit the keepalived configuration file on the backup. SELinux and iptables must be disabled on both the master and the backup.
[root@lgs keepalived]# vim keepalived.conf
global_defs {
    notification_email {
        scause@163.com
    }
    notification_email_from scause@163.com
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    router_id LVS_DEVEL
}
vrrp_script chk_nginx {
    script "/usr/local/sbin/check_ng.sh"    # script that monitors the nginx service
    interval 3
}
vrrp_instance VI_1 {
    state BACKUP    # this node's role is backup
    interface ens33    # NIC that sends VRRP packets and carries the VIP
    virtual_router_id 51    # virtual router ID; must match the master's
    priority 90    # priority; lower than the master's
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 123456
    }
    virtual_ipaddress {
        192.168.87.100    # the virtual IP (VIP)
    }
    track_script {
        chk_nginx    # the check defined above
    }
}
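The two configuration files must agree on `virtual_router_id`, and the master's `priority` must be higher than the backup's. A small script can confirm this before the services are started. This is a sketch under the assumption that each file contains a single `vrrp_instance`; `vrrp_value` and `check_pair` are hypothetical helper names.

```shell
#!/bin/bash
# vrrp_value FILE KEY
# Prints the first value configured for KEY in FILE (the second field of
# the first line whose first field is KEY).
vrrp_value() {
    awk -v key="$2" '$1 == key { print $2; exit }' "$1"
}

# check_pair MASTER_CONF BACKUP_CONF
# Prints "ok" and returns 0 if the router IDs match and the master's
# priority is higher; otherwise prints the problem and returns 1.
check_pair() {
    local m=$1 b=$2
    if [ "$(vrrp_value "$m" virtual_router_id)" != "$(vrrp_value "$b" virtual_router_id)" ]; then
        echo "virtual_router_id differs"
        return 1
    fi
    if [ "$(vrrp_value "$m" priority)" -le "$(vrrp_value "$b" priority)" ]; then
        echo "master priority is not higher than backup priority"
        return 1
    fi
    echo "ok"
}
```

With a copy of each node's file in one place, `check_pair master.conf backup.conf` would print `ok` for the configurations shown here (the two file names are placeholders).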
8. Create the monitoring script on the backup and set its permissions:
[root@lgs keepalived]# vim /usr/local/sbin/check_ng.sh
#!/bin/bash
# Timestamp used in log entries
d=$(date --date today +%Y%m%d_%H:%M:%S)
# Count the nginx processes
n=$(ps -C nginx --no-heading | wc -l)
# If there are no nginx processes, start nginx and count again.
# If the count is still 0, nginx cannot start, so stop keepalived.
if [ "$n" -eq 0 ]; then
    systemctl start nginx
    n2=$(ps -C nginx --no-heading | wc -l)
    if [ "$n2" -eq 0 ]; then
        echo "$d nginx down,keepalived will stop" >> /var/log/check_ng.log
        systemctl stop keepalived
    fi
fi
[root@lgs keepalived]# chmod 755 /usr/local/sbin/check_ng.sh
9. Start the keepalived service on the backup:
[root@lgs keepalived]# systemctl start keepalived
[root@lgs keepalived]# ps aux|grep keepalived
root 1522 0.0 0.0 118608 1384 ? Ss 21:12 0:00 /usr/sbin/keepalived -D
root 1523 0.4 0.1 127468 3296 ? S 21:12 0:00 /usr/sbin/keepalived -D
root 1524 0.1 0.1 127408 2820 ? S 21:12 0:00 /usr/sbin/keepalived -D
root 1556 0.0 0.0 112676 988 pts/0 R+ 21:12 0:00 grep --color=auto keepalived
[root@lgs keepalived]# ps aux|grep nginx
root 906 0.0 0.0 20504 628 ? Ss 20:08 0:00 nginx: master process /usr/local/nginx/sbin/nginx -c /usr/local/nginx/conf/nginx.conf
nobody 910 0.0 0.1 22948 3468 ? S 20:08 0:00 nginx: worker process
nobody 911 0.0 0.1 22948 3220 ? S 20:08 0:00 nginx: worker process
root 1598 0.0 0.0 112676 988 pts/0 R+ 21:12 0:00 grep --color=auto nginx
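10. Verify that high availability works:
1. Access nginx on the master directly at 192.168.87.128; the page shows: This is HA master !
2. Access nginx on the backup directly at 192.168.87.130; the page shows: This is HA backup !
3. Access the VIP 192.168.87.100; by default the request reaches the master, so the page shows: This is HA master !
4. Stop keepalived on the master to simulate a master failure: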
[root@nginx conf]# systemctl stop keepalived
5. With the master down, access the VIP 192.168.87.100 again. The backup has taken over and become the new master, so the page shows: This is HA backup !
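The failover can also be observed from a third machine by fetching the VIP repeatedly and reporting which node answered. The sketch below assumes the two test pages contain the strings shown above ("HA master" and "HA backup"); `which_node` is a made-up helper name.

```shell
#!/bin/bash
# which_node PAGE_TEXT
# Classifies a fetched page by the marker text of the two test pages.
which_node() {
    case "$1" in
        *"HA master"*) echo master ;;
        *"HA backup"*) echo backup ;;
        *) echo unknown ;;
    esac
}
```

One way to use it: `while sleep 1; do which_node "$(curl -s --max-time 2 http://192.168.87.100/)"; done`. The output should switch from `master` to `backup` shortly after keepalived is stopped on the master.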