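In the early days of the internet, computers were not yet widespread, business logic was simple, and concurrency was relatively low, so a monolithic application was often enough to carry the load. As internet adoption surged, growing concurrency put real pressure on monolithic services. The common responses are to **upgrade server hardware** (disk, memory, CPU) or to **deploy a cluster**. A single machine cannot be scaled up indefinitely, though, and its resource utilization drops sharply as it grows; a cluster, in turn, needs a gateway in front of it to route requests, and that gateway layer still has to cope with high concurrency and single points of failure. Using the `Nginx` reverse proxy as the example, this article covers the **load balancing algorithms** of the gateway layer (the traffic gateway, not an application gateway such as `springcloud gateway`) and a high-availability setup based on **Keepalived+VIP**.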
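## 1. What Is Load Balancing
Load balancing (Load Balance) means spreading load (work tasks) evenly across multiple operating units. Put simply, the load balancer acts as the single entry point for traffic and dispatches requests to applications deployed on multiple machines behind it.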
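## 2. Types of Load Balancing
By hardware or software:
- **Hardware load balancing**: implemented on `ASIC`s; high performance but expensive, for example the widely used F5.
- **Software load balancing**: for example the reverse proxy `Nginx`; inexpensive and a good fit for small and medium-sized businesses.
By network layer:
- **Layer 4 load balancing**: forwards at the transport layer and keeps the client's TCP connection intact; high performance, for example `LVS`.
- **Layer 7 load balancing**: works on application-layer protocols such as HTTP; richer in features but slower than layer 4, for example `Nginx`.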
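## 3. Load Balancing Algorithms
Commonly used load balancing algorithms include:
- **Round robin** (Round Robin): client requests are handed to the backend servers in turn, in a continuous cycle. Suitable when the servers have roughly the same hardware and software configuration.
- **Weighted round robin** (Weighted Round Robin): like round robin, but each server is assigned a weight that reflects its capacity, so requests reach the backend servers in proportion to those weights. For example, if servers A, B, and C have weights 1, 2, and 2, they receive 20%, 40%, and 40% of the requests respectively. Suitable when server configurations are uneven.
- **Random** (Random): client requests are assigned to backend servers at random; with a large enough number of requests the distribution becomes roughly even.
- **Consistent hashing** (Consistent Hash): build a hash ring and use some attribute of the request (a MAC or IP address, or something from a higher layer such as a parameter in the HTTP message) as the key that decides which node the request lands on. Several virtual nodes are created per server so that requests are spread evenly and still reach a live server after a node goes down.
- ...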
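
To make two of the algorithms above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not how any particular load balancer is implemented: the weighted part uses the "smooth" weighted round robin variant, and the hash ring uses MD5 with 100 virtual nodes per server. The function and class names, the server names `A`, `B`, `C`, and the `10.0.0.x` client keys are all hypothetical.

```python
import bisect
import hashlib
from collections import Counter


def smooth_weighted_round_robin(weights, n):
    """Return n picks; each server is chosen in proportion to its weight."""
    current = {name: 0 for name in weights}
    total = sum(weights.values())
    picks = []
    for _ in range(n):
        # Every server gains its weight, the leader is picked and pays the total.
        for name, weight in weights.items():
            current[name] += weight
        chosen = max(current, key=current.get)
        current[chosen] -= total
        picks.append(chosen)
    return picks


class ConsistentHashRing:
    """Hash ring with virtual nodes (hypothetical helper for illustration)."""

    def __init__(self, nodes, vnodes=100):
        ring = [(self._hash(f"{node}#{i}"), node)
                for node in nodes for i in range(vnodes)]
        ring.sort()
        self._keys = [h for h, _ in ring]
        self._nodes = [node for _, node in ring]

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def lookup(self, key):
        # First virtual node clockwise from the key's hash, wrapping around.
        idx = bisect.bisect(self._keys, self._hash(key)) % len(self._keys)
        return self._nodes[idx]


# Weights 1, 2, 2 over 10 requests give a 20% / 40% / 40% split.
picks = smooth_weighted_round_robin({"A": 1, "B": 2, "C": 2}, 10)
print(Counter(picks))

# Removing node C only remaps keys that lived on C; the others stay put.
full = ConsistentHashRing(["A", "B", "C"])
reduced = ConsistentHashRing(["A", "B"])
clients = [f"10.0.0.{i}" for i in range(200)]
moved = [c for c in clients
         if full.lookup(c) != "C" and reduced.lookup(c) != full.lookup(c)]
print(len(moved))
```

The second check is the property that makes consistent hashing attractive for a gateway: when one backend disappears, clients that were mapped to the surviving backends keep their mapping.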
## 4. Keepalived + VIP + DNS Round-Robin Solution
The deployment environment is as follows:
| VIP | Internal IP | Hostname | Nginx port |
| --- | --- | --- | --- |
| 192.168.16.11 | 192.168.16.16 | keepalive-nginx-1 | 8031 |
| 192.168.16.11 | 192.168.16.17 | keepalive-nginx-2 | 8031 |
Reference architecture diagram: (figure not included)
**1. Install Nginx**
- Add the `nginx` repository to `yum`
```
rpm -Uvh http://nginx.org/packages/centos/7/noarch/RPMS/nginx-release-centos-7-0.el7.ngx.noarch.rpm
```
- Install `nginx`
`yum -y install nginx`
- Verify
```
[root@localhost ~]# nginx -v
nginx version: nginx/1.20.2
```
- Configure the Nginx port
```
vi /etc/nginx/conf.d/default.conf
# 192.168.16.16
server {
listen 8031; # change the default port to 8031
server_name localhost;
...
}
# 192.168.16.17
server {
listen 8031; # change the default port to 8031
server_name localhost;
...
}
```
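- Start Nginx and enable it at boot
`systemctl start nginx && systemctl enable nginx`
If a permission error is reported, disable SELinux:
`vi /etc/selinux/config`, change `SELINUX=enforcing` to `SELINUX=disabled`, then reboot for the change to take effect.
- Check the Nginx status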
```
[root@localhost ~]# systemctl status nginx
● nginx.service - nginx - high performance web server
Loaded: loaded (/usr/lib/systemd/system/nginx.service; enabled; vendor preset: disabled)
Active: active (running) since Wed 2022-04-27 15:54:19 CST; 1min 40s ago
Docs: http://nginx.org/en/docs/
Main PID: 1050 (nginx)
CGroup: /system.slice/nginx.service
├─1050 nginx: master process /usr/sbin/nginx -c /etc/nginx/nginx.conf
├─1051 nginx: worker process
├─1052 nginx: worker process
├─1053 nginx: worker process
└─1054 nginx: worker process
Apr 27 15:54:19 localhost.localdomain systemd[1]: Starting nginx - high performance web server...
Apr 27 15:54:19 localhost.localdomain systemd[1]: Started nginx - high performance web server.
```
- Verify the page
```
C:\Users\86189>curl http://192.168.16.11:8031/
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
body {
width: 35em;
margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif;
}
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>
<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
```
The `nginx` welcome page is returned successfully.
**2. Install Keepalived**
- Download Keepalived
`wget https://www.keepalived.org/software/keepalived-2.1.5.tar.gz`
- Install Keepalived
```
# Install build dependencies
$ yum -y install gcc-c++
$ yum -y install openssl-devel
# Build and install Keepalived
$ tar -xvzf keepalived-2.1.5.tar.gz
$ cd keepalived-2.1.5
$ ./configure --prefix=/usr/local/keepalived
$ make && make install
```
- Set up the Keepalived configuration files
```
# Create the /etc/keepalived directory
$ mkdir /etc/keepalived
$ cp /usr/local/keepalived/etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf
$ cp /usr/local/keepalived/etc/sysconfig/keepalived /etc/sysconfig/keepalived
```
- Edit the EnvironmentFile setting
`vi /lib/systemd/system/keepalived.service`
```
[Unit]
Description=LVS and VRRP High Availability Monitor
After=network-online.target syslog.target
Wants=network-online.target
[Service]
Type=forking
PIDFile=/run/keepalived.pid
KillMode=process
# Point EnvironmentFile at /etc/sysconfig/keepalived
EnvironmentFile=-/etc/sysconfig/keepalived
ExecStart=/usr/local/keepalived/sbin/keepalived $KEEPALIVED_OPTIONS
ExecReload=/bin/kill -HUP $MAINPID
[Install]
WantedBy=multi-user.target
```
- Configure Keepalived
On both machines, run `vi /etc/keepalived/keepalived.conf`.
First use `ip addr` to look up the network interface name:
```
# 192.168.16.16
[root@localhost app]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 52:54:00:32:f8:bd brd ff:ff:ff:ff:ff:ff
inet 192.168.16.16/24 brd 192.168.16.255 scope global noprefixroute ens3
valid_lft forever preferred_lft forever
inet 192.168.16.11/32 scope global ens3
valid_lft forever preferred_lft forever
inet6 fe80::f219:afff:106:5f5f/64 scope link noprefixroute
valid_lft forever preferred_lft forever
# 192.168.16.17
[root@localhost app]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 52:54:00:d0:71:85 brd ff:ff:ff:ff:ff:ff
inet 192.168.16.17/24 brd 192.168.16.255 scope global noprefixroute ens3
valid_lft forever preferred_lft forever
inet6 fe80::d296:dcd5:28ce:e88a/64 scope link noprefixroute
valid_lft forever preferred_lft forever
```
Configuration for the master, 192.168.16.16:
```
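# Host 192.168.16.16
# Global definitions: global configuration options
global_defs {
# Recipients of the email keepalived sends when a state transition occurs
# Sending mail from keepalived_notify.sh instead is recommended
notification_email {
acassen@firewall.loc
}
notification_email_from Alexandre.Cassen@firewall.loc # sender address for notification email
smtp_server 192.168.200.1 # SMTP server used to send email
smtp_connect_timeout 30 # SMTP connection timeout
router_id nginx-16-1 # machine identifier, usually the hostname
vrrp_skip_check_adv_addr # skip the address check when an advert comes from the same router as the previous one (by default the check is not skipped)
vrrp_garp_interval 0 # delay in seconds between gratuitous ARP messages sent on an interface, default 0
vrrp_gna_interval 0 # delay in seconds between unsolicited NA messages sent on an interface, default 0
}
# Health check script
vrrp_script checkhaproxy
{
script "/etc/keepalived/check_nginx.sh" # path of the check script
interval 5 # check interval in seconds
weight 0 # adjusts priority by this weight; 0 leaves the instance priority unchanged
}
# VRRP instance
vrrp_instance VI_1 {
state BACKUP # initial state: backup
interface ens3 # interface the VIP is bound to, e.g. ens3
virtual_router_id 51 # VRID of the cluster; master and backup must use the same value
nopreempt # non-preemptive mode; only valid on nodes whose state is BACKUP
priority 100 # priority, range 0~254; higher wins, and the highest becomes master
advert_int 1 # VRRP advertisement interval; must match on both nodes, default 1 second
# Authentication; must be identical on both nodes
authentication {
auth_type PASS # authentication type, PASS or AH
auth_pass 1111 # authentication password
}
unicast_src_ip 192.168.16.16 # internal IP of this machine
unicast_peer {
192.168.16.17 # IP of the peer
}
# VIP: added when the node becomes master, removed when it becomes backup
virtual_ipaddress {
192.168.16.11 # the high-availability VIP; on a Tencent Cloud CVM, use the HAVIP address requested in the console
}
# Check scripts to run
track_script {
checkhaproxy
}
notify_master "/etc/keepalived/keepalived_notify.sh MASTER" # script run on transition to master
notify_backup "/etc/keepalived/keepalived_notify.sh BACKUP" # script run on transition to backup
notify_fault "/etc/keepalived/keepalived_notify.sh FAULT" # script run on transition to fault
notify_stop "/etc/keepalived/keepalived_notify.sh STOP" # script run on transition to stop
garp_master_delay 1 # delay before refreshing ARP caches after becoming master
garp_master_refresh 5 # interval at which the master sends gratuitous ARP
# Tracked interfaces: a failure on any of them puts the instance into the FAULT state
track_interface {
ens3
}
}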
```
Configuration for the backup, 192.168.16.17:
```
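# Global definitions: global configuration options
global_defs {
# Recipients of the email keepalived sends when a state transition occurs
# Sending mail from keepalived_notify.sh instead is recommended
notification_email {
acassen@firewall.loc
}
notification_email_from Alexandre.Cassen@firewall.loc # sender address for notification email
smtp_server 192.168.200.1 # SMTP server used to send email
smtp_connect_timeout 30 # SMTP connection timeout
router_id nginx-17-2 # machine identifier, usually the hostname
vrrp_skip_check_adv_addr # skip the address check when an advert comes from the same router as the previous one (by default the check is not skipped)
vrrp_garp_interval 0 # delay in seconds between gratuitous ARP messages sent on an interface, default 0
vrrp_gna_interval 0 # delay in seconds between unsolicited NA messages sent on an interface, default 0
}
# Health check script
vrrp_script checkhaproxy
{
script "/etc/keepalived/check_nginx.sh" # path of the check script
interval 5 # check interval in seconds
weight 0 # adjusts priority by this weight; 0 leaves the instance priority unchanged
}
# VRRP instance
vrrp_instance VI_1 {
state BACKUP # initial state: backup
interface ens3 # interface the VIP is bound to, e.g. ens3
virtual_router_id 51 # VRID of the cluster; master and backup must use the same value
nopreempt # non-preemptive mode; only valid on nodes whose state is BACKUP
priority 50 # priority, range 0~254; higher wins, and the highest becomes master
advert_int 1 # VRRP advertisement interval; must match on both nodes, default 1 second
# Authentication; must be identical on both nodes
authentication {
auth_type PASS # authentication type, PASS or AH
auth_pass 1111 # authentication password
}
unicast_src_ip 192.168.16.17 # internal IP of this machine
unicast_peer {
192.168.16.16 # IP of the peer
}
# VIP: added when the node becomes master, removed when it becomes backup
virtual_ipaddress {
192.168.16.11 # the high-availability VIP; on a Tencent Cloud CVM, use the HAVIP address requested in the console
}
# Check scripts to run
track_script {
checkhaproxy
}
notify_master "/etc/keepalived/keepalived_notify.sh MASTER" # script run on transition to master
notify_backup "/etc/keepalived/keepalived_notify.sh BACKUP" # script run on transition to backup
notify_fault "/etc/keepalived/keepalived_notify.sh FAULT" # script run on transition to fault
notify_stop "/etc/keepalived/keepalived_notify.sh STOP" # script run on transition to stop
garp_master_delay 1 # delay before refreshing ARP caches after becoming master
garp_master_refresh 5 # interval at which the master sends gratuitous ARP
# Tracked interfaces: a failure on any of them puts the instance into the FAULT state
track_interface {
ens3
}
}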
```
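
The effect of `priority` and `nopreempt` in the two configurations can be pictured with a small Python model. This is a deliberately simplified sketch of the election outcome, not of the VRRP protocol itself (no timers, adverts, or network); the `elect_master` function and its arguments are hypothetical names chosen for illustration.

```python
def elect_master(nodes, current_master=None, nopreempt=True):
    """Return the node that should hold the VIP.

    nodes maps router_id -> (priority, healthy).
    """
    healthy = {name: prio for name, (prio, ok) in nodes.items() if ok}
    if not healthy:
        return None
    # Non-preemptive mode: a healthy current master keeps the VIP
    # even if a higher-priority node has come back.
    if nopreempt and current_master in healthy:
        return current_master
    return max(healthy, key=healthy.get)


nodes = {"nginx-16-1": (100, True), "nginx-17-2": (50, True)}
m1 = elect_master(nodes)                      # both up: highest priority wins

nodes["nginx-16-1"] = (100, False)            # master fails its health check
m2 = elect_master(nodes, current_master=m1)   # VIP moves to the backup

nodes["nginx-16-1"] = (100, True)             # old master recovers
m3 = elect_master(nodes, current_master=m2)   # nopreempt: VIP stays put
m3_preempt = elect_master(nodes, current_master=m2, nopreempt=False)

print(m1, m2, m3, m3_preempt)
```

In this model, `nopreempt` avoids a second VIP move when the original master recovers, which is the reason both nodes above are declared with `state BACKUP`.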
Define the health check script:
`vi /etc/keepalived/check_nginx.sh`
```
#!/usr/bin/env bash
NGINXPID="/run/nginx.pid"
# If the Nginx pid file is gone, stop keepalived so the VIP moves to the peer
if [ ! -f "$NGINXPID" ]; then
killall keepalived
fi
```
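
The script above only tests whether the pid file exists, so a stale pid file left behind by a crashed Nginx would go unnoticed. As a hedged sketch of a stricter check, the Python function below also confirms that the recorded process is alive. The function name `nginx_is_running` is hypothetical, and the demo uses a temporary file in place of `/run/nginx.pid`.

```python
import os
import tempfile


def nginx_is_running(pid_file="/run/nginx.pid"):
    """True if the pid file exists and the process it names is alive."""
    try:
        with open(pid_file) as f:
            pid = int(f.read().strip())
    except (FileNotFoundError, ValueError):
        return False
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 checks for existence without sending a signal
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


# Demo: a temporary pid file that holds this process's own pid.
with tempfile.TemporaryDirectory() as tmp:
    demo_pid_file = os.path.join(tmp, "nginx.pid")
    missing = nginx_is_running(demo_pid_file)   # file not created yet
    with open(demo_pid_file, "w") as f:
        f.write(str(os.getpid()))
    present = nginx_is_running(demo_pid_file)   # file names a live process

print(missing, present)
```

If a check like this were wired into `vrrp_script`, it would need to report its result through the process exit code (zero for healthy, non-zero for failed).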
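Define the notification script (make both scripts executable with `chmod +x /etc/keepalived/check_nginx.sh /etc/keepalived/keepalived_notify.sh`):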
```
#!/usr/bin/env bash
# Use of this source code is governed by a MIT style
# license that can be found in the LICENSE file.
# /etc/keepalived/keepalived_notify.sh
log_file=/var/log/keepalived.log
iam::keepalived::mail() {
# Email logic can be added here to alert promptly when the keepalived state changes
:
}
iam::keepalived::log() {
echo "[`date '+%Y-%m-%d %T'`] $1" >> ${log_file}
}
[ ! -d /var/keepalived/ ] && mkdir -p /var/keepalived/
case "$1" in
"MASTER" )
iam::keepalived::log "notify_master"
;;
"BACKUP" )
iam::keepalived::log "notify_backup"
;;
"FAULT" )
iam::keepalived::log "notify_fault"
;;
"STOP" )
iam::keepalived::log "notify_stop"
;;
*)
iam::keepalived::log "keepalived_notify.sh: state error!"
;;
esac
```
- Start Keepalived and enable it at boot
```
$ systemctl start keepalived
$ systemctl enable keepalived
```
- Check the Keepalived status
```
systemctl status keepalived
* keepalived.service - LVS and VRRP High Availability Monitor
Loaded: loaded (/usr/lib/systemd/system/keepalived.service; enabled; vendor preset: disabled)
Active: active (running) since Thu 2022-04-28 11:11:52 CST; 3s ago
Process: 236527 ExecStart=/usr/local/keepalived/sbin/keepalived $KEEPALIVED_OPTIONS (code=exited, status=0/SUCCESS)
Main PID: 236528 (keepalived)
Tasks: 3
CGroup: /system.slice/keepalived.service
|-236528 /usr/local/keepalived/sbin/keepalived -D
|-236529 /usr/local/keepalived/sbin/keepalived -D
`-236530 /usr/local/keepalived/sbin/keepalived -D
Apr 28 11:11:52 localhost.localdomain Keepalived_vrrp[236530]: (VI_1) Entering ...
Apr 28 11:11:52 localhost.localdomain Keepalived_vrrp[236530]: VRRP sockpool: [...
Apr 28 11:11:52 localhost.localdomain Keepalived_healthcheckers[236529]: Gained...
Apr 28 11:11:52 localhost.localdomain Keepalived_healthcheckers[236529]: Gained...
Apr 28 11:11:52 localhost.localdomain Keepalived_healthcheckers[236529]: Gained...
Apr 28 11:11:52 localhost.localdomain Keepalived_healthcheckers[236529]: Activa...
Apr 28 11:11:52 localhost.localdomain Keepalived_healthcheckers[236529]: Activa...
Apr 28 11:11:52 localhost.localdomain Keepalived_healthcheckers[236529]: Activa...
Apr 28 11:11:52 localhost.localdomain Keepalived_healthcheckers[236529]: Activa...
Apr 28 11:11:52 localhost.localdomain Keepalived_healthcheckers[236529]: Activa...
Hint: Some lines were ellipsized, use -l to show in full.
```
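The output should show `Active: active (running)`.
- Configuration file breakdown
The configuration file has roughly the following 4 parts.
1. global_defs: global definitions, holding global configuration options.
2. vrrp_script checkhaproxy: the health check script.
3. vrrp_instance VI_1: the VRRP instance.
4. virtual_server: LVS configuration. Not needed unless LVS+Keepalived is used.
- Verify the virtual IP
Restart the Keepalived service with `systemctl restart keepalived`, then run `ip addr` on both machines. You should see: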
```
# 192.168.16.16
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 52:54:00:82:f8:bd brd ff:ff:ff:ff:ff:ff
inet 192.168.16.16/24 brd 192.168.16.255 scope global noprefixroute ens3
valid_lft forever preferred_lft forever
inet 192.168.16.11/32 scope global ens3
valid_lft forever preferred_lft forever
```
The master now has an additional virtual IP, `192.168.16.11`. If something goes wrong, check the log with `tail -f /var/log/messages`.
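
Checking for the VIP by eye works for two machines; for scripted checks, the same `ip addr` output can be parsed. The Python sketch below is one possible way to do that, shown on a sample copied from the output above. The helper name `addresses_on` is hypothetical, and the parsing assumes the `ip addr` text layout shown in this article.

```python
import re


def addresses_on(ip_addr_output, interface):
    """Return the IPv4 addresses listed for one interface in `ip addr` output."""
    addresses = []
    current = None
    for line in ip_addr_output.splitlines():
        header = re.match(r"^\d+:\s+([^:@\s]+)", line)
        if header:
            current = header.group(1)  # e.g. "lo" or "ens3"
            continue
        inet = re.match(r"^\s*inet\s+(\d+\.\d+\.\d+\.\d+)/\d+", line)
        if inet and current == interface:
            addresses.append(inet.group(1))
    return addresses


sample = """\
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:54:00:82:f8:bd brd ff:ff:ff:ff:ff:ff
    inet 192.168.16.16/24 brd 192.168.16.255 scope global noprefixroute ens3
       valid_lft forever preferred_lft forever
    inet 192.168.16.11/32 scope global ens3
       valid_lft forever preferred_lft forever
"""

found = addresses_on(sample, "ens3")
holds_vip = "192.168.16.11" in found
print(found, holds_vip)
```

On a real node the input text could come from `subprocess.run(["ip", "addr"], capture_output=True, text=True).stdout`.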
**3. Deployment in Practice**
- Deploy the test application on both servers
`nohup java -Dserver.port=8083 -Xms1024m -Xmx1024m -jar springboot-web-demo-1.0-SNAPSHOT.jar &`
- Create the test service
`vi /etc/nginx/conf.d/cqbdri.conf`
```
# 192.168.16.16
server {
listen 8033;
server_name 192.168.16.16;
root /usr/share/nginx/html;
location / {
proxy_set_header X-Forwarded-Host $http_host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_pass http://test;
client_max_body_size 5m;
}
error_page 404 /404.html;
location = /404.html {
}
error_page 500 502 503 504 /50x.html;
location = /50x.html {
}
}
```
```
# 192.168.16.17
server {
listen 8033;
server_name 192.168.16.17;
root /usr/share/nginx/html;
location / {
proxy_set_header X-Forwarded-Host $http_host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_pass http://test;
client_max_body_size 5m;
}
error_page 404 /404.html;
location = /404.html {
}
error_page 500 502 503 504 /50x.html;
location = /50x.html {
}
}
```
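- Test the service
Set the hostname: run `hostnamectl set-hostname nginx1` on the first server and `hostnamectl set-hostname nginx2` on the second.
From a Windows command prompt, run: `curl http://192.168.16.11:8033/hello?name=winson`
Response: `Hello winson! I'm Edge controller!`
Nginx access log: `/var/log/nginx/access.log`
- Configure Nginx load balancing
`vi /etc/nginx/nginx.conf`
Add the following inside the `http` block: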
```
# 192.168.16.16
upstream test {
server 127.0.0.1:8083 weight=2;
server 192.168.16.17:8083 weight=1;
}
# 192.168.16.17
upstream test {
server 127.0.0.1:8083 weight=1;
server 192.168.16.16:8083 weight=2;
}
```
- Test load balancing
From a Windows command prompt, run: `curl http://192.168.16.11:8033/hello?name=winson`
Responses cycle according to the weights:
2 times `Hello winson! I'm Edge controller! nginx-1`
1 time `Hello winson! I'm Edge controller! nginx-2`
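
Counting responses by hand gets tedious beyond a few requests. A small Python sketch like the following could tally which backend answered, assuming (as in the responses above) that the backend name is the last word of the body. The function name `tally_backends` is hypothetical, and the sample list stands in for real responses.

```python
from collections import Counter


def tally_backends(responses):
    """Count responses per backend, taking the last word as the backend name."""
    return Counter(r.strip().split()[-1] for r in responses if r.strip())


# Two full cycles of the 2:1 weighting shown above.
sample = [
    "Hello winson! I'm Edge controller! nginx-1",
    "Hello winson! I'm Edge controller! nginx-1",
    "Hello winson! I'm Edge controller! nginx-2",
] * 2

counts = tally_backends(sample)
print(counts)
```

Against the live service, the list could be filled by calling `urllib.request.urlopen` on the test URL in a loop and decoding each body.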
- Test Keepalived
1. Stop the Nginx process on the master
`systemctl stop nginx`
2. Check the Nginx pid
Run `cat /run/nginx.pid`, which returns:
```
# master
[root@localhost keepalived]# cat /run/nginx.pid
cat: /run/nginx.pid: No such file or directory
# backup
[root@localhost keepalived]# cat /run/nginx.pid
8873
```
3. Check the IP failover
Run `ip addr`:
```
# 192.168.16.16
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 52:54:00:82:f8:bd brd ff:ff:ff:ff:ff:ff
inet 192.168.16.16/24 brd 192.168.16.255 scope global noprefixroute ens3
valid_lft forever preferred_lft forever
inet6 fe80::f269:aeff:106:5f5f/64 scope link noprefixroute
valid_lft forever preferred_lft forever
# 192.168.16.17
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 52:54:00:d0:71:85 brd ff:ff:ff:ff:ff:ff
inet 192.168.16.17/24 brd 192.168.16.255 scope global noprefixroute ens3
valid_lft forever preferred_lft forever
inet 192.168.16.11/32 scope global ens3
valid_lft forever preferred_lft forever
inet6 fe80::da96:dcd5:28ca:e88a/64 scope link noprefixroute
valid_lft forever preferred_lft forever
```
The virtual IP `192.168.16.11` has moved to machine `192.168.16.17`.
4. Test Keepalived failover
From a Windows command prompt, run: `curl http://192.168.16.11:8033/hello?name=winson`
Responses cycle according to the weights:
2 times `Hello winson! I'm Edge controller! nginx-1`
1 time `Hello winson! I'm Edge controller! nginx-2`
Nginx access log: `/var/log/nginx/access.log`
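## 5. Summary
This completes a high-availability load balancing setup based on Keepalived+VIP. Some issues remain, though:
1. With only one VIP, the backup always sits idle. How can utilization be improved?
Configure two virtual IPs and have the upstream layer rotate between them using **smart DNS** or **HTTPDNS**, which raises resource utilization.
Reference architecture diagram: (figure not included)
2. How can high availability be maintained if the switch that the Keepalived master and backup are attached to fails?
Use switches in **stacking mode** with the servers connected to two different switches, or keep a cold standby in a different rack and switch over manually.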