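Overview
SmartX's distributed block storage, ZBS, can be deployed hyperconverged with VMware virtualization on the same nodes.
How it works:
- In this architecture, SMTX OS runs in a virtual machine on ESXi, called the SCVM
- Compute virtualization is provided by VMware ESXi
- ESXi passes the RAID controller (in HBA/JBOD mode) through to the SCVM, which takes over all disks in the server and provides the distributed block storage service
Lab environment
Software used: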
SMTXOS-5.0.2-el7-2201071258-x86_64 — SCVM
VMware-VMvisor-Installer-7.0U3c-19193900.x86_64 — ESXi
SMX-ZBSNasPlugin_2.1-2 — ESXi NFS driver
Hardware:
Component | Model | Firmware version |
CPU | Intel® Xeon® Gold 6234 CPU @ 3.30GHz (×1), Intel® Xeon® Silver 4210R CPU @ 2.40GHz (×2) | N/A |
Memory | 128 GB × 3 | N/A |
RAID controller | Dell HBA330 Mini | 16.17.00.05 |
SSD | INTEL:SSDSC2KG960G8R | N/A |
HDD | TOSHIBA:AL15SEB24EQY | N/A |
Node | IPMI | Kernel-ESXi management (vmk0) | Kernel-Storage/vMotion (vmk2) | Kernel-NFS (vmk1) |
esx-01 | 192.168.87.201/20 | 192.168.87.211/20 | 10.10.87.211/24 | 192.168.33.1/24 |
esx-02 | 192.168.87.202/20 | 192.168.87.212/20 | 10.10.87.212/24 | 192.168.33.1/24 |
esx-03 | 192.168.87.203/20 | 192.168.87.213/20 | 10.10.87.213/24 | 192.168.33.1/24 |
VM | SMTX management | Storage | SMTX NFS |
SCVM-01 | 192.168.87.216/20 | 10.10.87.216/24 | 192.168.33.2/24 |
SCVM-02 | 192.168.87.217/20 | 10.10.87.217/24 | 192.168.33.2/24 |
SCVM-03 | 192.168.87.218/20 | 10.10.87.218/24 | 192.168.33.2/24 |
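Lab diagrams
Logical architecture
ESXi topology
Procedure
Install ESXi
- Install ESXi according to the design. Configure the DNS server as 192.168.95.206 and the internal domain as smartx.lab
- Enable SSH and the ESXi Shell on each host
Install vCenter Server
Install vCenter on one of the ESXi hosts (procedure omitted).
Configure ESXi passthrough mode
- Run the following commands on every ESXi host: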
esxcfg-advcfg -s 30 /NFS/HeartbeatTimeout
esxcfg-advcfg -s 64 /NFS/MaxVolumes
esxcfg-advcfg -s 0 /Misc/APDHandlingEnable
esxcfg-advcfg -s 32 /Net/TcpipHeapSize
esxcfg-advcfg -s 1536 /Net/TcpipHeapMax
esxcfg-advcfg -s 1 /UserVars/SuppressShellWarning
- Find the disk controller's PCI vendor and device IDs by running the following on every ESXi host
[root@esx-01:~] lspci -v |grep "HBA" -A 1 -B 1
0000:3b:00.0 Mass storage controller Serial Attached SCSI controller: Avago (LSI Logic) Dell HBA330 Mini [vmhba2]
Class 0107: 1000:0097
- Add the IDs found above to /etc/vmware/passthru.map
#Avago (LSI Logic) Dell HBA330 Mini [vmhba2]
1000 0097 d3d0 false
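- Connect to the host with the ESXi web client. Go to Configure -> Hardware -> PCI Devices -> Edit, filter for HBA, select the HBA device, and choose Toggle passthrough. The system reports a failure and asks for a reboot; reboot the host.
- After the reboot, passthrough is enabled for the HBA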
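The passthru.map entry is built from the vendor and device IDs on the `Class` line of the lspci output. The sketch below shows one way to derive it. `passthru_entry` is a hypothetical helper name, and the `d3d0` reset method and trailing `false` are simply copied from the entry used in this article, so other controllers may need different values.

```shell
#!/bin/sh
# Sketch: build a passthru.map entry from `lspci -v` text.
# passthru_entry is an illustrative helper, not an ESXi tool.
# "d3d0" and "false" are copied from this article's entry (assumption
# for any controller other than the Dell HBA330 Mini used here).
passthru_entry() {
    # Reads lspci text on stdin and looks for a line such as
    # "Class 0107: 1000:0097"; prints "<vendor> <device> d3d0 false".
    sed -n 's/^[[:space:]]*Class [0-9a-fA-F]*: \([0-9a-fA-F]\{4\}\):\([0-9a-fA-F]\{4\}\).*$/\1 \2 d3d0 false/p' | head -n 1
}

# Offline example using the two lspci lines from this article:
printf '%s\n' \
  '0000:3b:00.0 Mass storage controller Serial Attached SCSI controller: Avago (LSI Logic) Dell HBA330 Mini [vmhba2]' \
  '         Class 0107: 1000:0097' | passthru_entry
```

On a host, the same function could be fed from `lspci -v | grep "HBA" -A 1`, and the printed line reviewed before it is appended to /etc/vmware/passthru.map.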
Install the NFS plugin
- On VMware ESXi 7.0 U2 and later, install or upgrade directly to plugin version 2.1-2
- Check whether the VAAI-NAS plugin is already installed on the ESXi host
esxcli software vib list | grep -i zbs
- Install the plugin
[root@esx-01:~] /etc/init.d/vaai-nasd stop
vaai-nasd is not running
[root@esx-01:~] ls /
SMX-ZBSNasPlugin_2.1-2.zip etc opt tardisks.noauto
altbootbank include proc tmp
bin lib productLocker usr
bootbank lib64 sbin var
bootpart.gz local.tgz scratch vmfs
bootpart4kn.gz local.tgz.ve store vmimages
dev locker tardisks
[root@esx-01:~] esxcli software component apply -d /SMX-ZBSNasPlugin_2.1-2.zip
Installation Result
Components Installed: SMX-ZBSNasPlugin_1.0-0.0.0006
Components Removed:
Components Skipped:
Message: Operation finished successfully.
Reboot Required: false
- Start the plugin
[root@esx-01:~] /etc/init.d/vaai-nasd start
vaai-nasd started
- Confirm that the plugin is installed and check its version
[root@esx-01:~] esxcli software vib list | grep -i zbs
SMX-ESX-ZBSNasPlugin 2.1-2 SMX VMwareAccepted 2022-04-27
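With three hosts to check, the version comparison can be scripted. The following is a minimal sketch that reads the `esxcli software vib list` text and compares the version column with the expected 2.1-2. `zbs_plugin_version`, `check_plugin`, and `EXPECTED` are illustrative names, and the sketch assumes the version is the second whitespace-separated column, as in the output above.

```shell
#!/bin/sh
# Sketch: check the ZBS NAS plugin version reported by
# `esxcli software vib list`. Helper names are illustrative.
EXPECTED="2.1-2"

zbs_plugin_version() {
    # Reads vib list text on stdin; prints the version column
    # of the first row whose name contains "zbs".
    awk 'tolower($1) ~ /zbs/ { print $2; exit }'
}

check_plugin() {
    ver=$(zbs_plugin_version)
    if [ "$ver" = "$EXPECTED" ]; then
        echo "OK: plugin version $ver"
    else
        echo "MISMATCH: found '${ver:-none}', expected $EXPECTED"
        return 1
    fi
}

# On a host: esxcli software vib list | check_plugin
# Offline example using the row shown in this article:
echo 'SMX-ESX-ZBSNasPlugin 2.1-2 SMX VMwareAccepted 2022-04-27' | check_plugin
```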
Configure ESXi networking
As the ESXi topology above shows, each ESXi host needs three standard switches to provide network services; the management switch reuses the existing vSwitch0.
- Configure the switches from the command line and check the result in the web UI
- In production, use redundant uplinks wherever possible
- Create the standard switches
#Create the standard switches
esxcli network vswitch standard add --vswitch-name=vSS-NFS
esxcli network vswitch standard add --vswitch-name=vSS-ZBS
#Create the VMkernel ports and set their IP addresses
esxcli network vswitch standard portgroup add --portgroup-name=NFS_VMkernel --vswitch-name=vSS-NFS
esxcli network ip interface add --interface-name=vmk1 --portgroup-name=NFS_VMkernel
esxcli network ip interface ipv4 set --interface-name=vmk1 --ipv4=192.168.33.1 --netmask=255.255.255.0 --type=static
esxcli network vswitch standard portgroup add --portgroup-name=ZBS_VMkernel --vswitch-name=vSS-ZBS
esxcli network ip interface add --interface-name=vmk2 --portgroup-name=ZBS_VMkernel
esxcli network ip interface ipv4 set --interface-name=vmk2 --ipv4=10.10.87.211 --netmask=255.255.255.0 --type=static #change the address for each ESXi host
#Add the uplink (10 GbE vmnic)
esxcli network vswitch standard uplink add --uplink-name=vmnic2 --vswitch-name=vSS-ZBS
#Create the SCVM port groups
esxcli network vswitch standard portgroup add --portgroup-name=NFS --vswitch-name=vSS-NFS
esxcli network vswitch standard portgroup add --portgroup-name=ZBS --vswitch-name=vSS-ZBS
#Enable vMotion on vmk2
vim-cmd hostsvc/vmotion/vnic_set vmk2
- View the result
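Only the storage address on vmk2 differs between hosts, so the command list can be generated per host rather than edited by hand. The sketch below prints the commands without running them. `gen_net_cmds` is a hypothetical helper, and the uplink name vmnic2 is taken from this lab and may differ on other hardware.

```shell
#!/bin/sh
# Sketch: print (do not run) the per-host network commands from this
# section. gen_net_cmds is an illustrative helper; vmnic2 is this
# lab's uplink and is an assumption for any other hardware.
gen_net_cmds() {
    storage_ip=$1   # for example 10.10.87.211 for esx-01
    [ -n "$storage_ip" ] || { echo "usage: gen_net_cmds <storage-ip>" >&2; return 1; }
    cat <<EOF
esxcli network vswitch standard add --vswitch-name=vSS-NFS
esxcli network vswitch standard add --vswitch-name=vSS-ZBS
esxcli network vswitch standard portgroup add --portgroup-name=NFS_VMkernel --vswitch-name=vSS-NFS
esxcli network ip interface add --interface-name=vmk1 --portgroup-name=NFS_VMkernel
esxcli network ip interface ipv4 set --interface-name=vmk1 --ipv4=192.168.33.1 --netmask=255.255.255.0 --type=static
esxcli network vswitch standard portgroup add --portgroup-name=ZBS_VMkernel --vswitch-name=vSS-ZBS
esxcli network ip interface add --interface-name=vmk2 --portgroup-name=ZBS_VMkernel
esxcli network ip interface ipv4 set --interface-name=vmk2 --ipv4=$storage_ip --netmask=255.255.255.0 --type=static
esxcli network vswitch standard uplink add --uplink-name=vmnic2 --vswitch-name=vSS-ZBS
esxcli network vswitch standard portgroup add --portgroup-name=NFS --vswitch-name=vSS-NFS
esxcli network vswitch standard portgroup add --portgroup-name=ZBS --vswitch-name=vSS-ZBS
vim-cmd hostsvc/vmotion/vnic_set vmk2
EOF
}

# Show only the line that changes per host (esx-02 here):
gen_net_cmds 10.10.87.212 | grep -- '--interface-name=vmk2 --ipv4'
```

After reviewing the printed commands, they could be saved to a file and run with `sh` on the matching host.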
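Create the SCVM virtual machine on each ESXi node
Once ESXi is installed and configured on every node in the cluster, create one SCVM virtual machine per node to host SMTX OS.
Resource | Requirement | Notes |
vCPU | 6 | If RDMA is to be enabled, 7 vCPUs are recommended |
Memory | 18 GB | If RDMA is enabled, at least 18 GB of memory plus 3 GB of huge-page memory is required |
Hard disk | At least 1 GB | Install location: SATA DOM |
Network | Management network, storage network, NFS network | |
Disks | Controller passthrough | |
- Guest OS: select Linux, CentOS 7
- CPU reservation: core frequency × 6. The CPUs used here are:
Intel® Xeon® Gold 6234 CPU @ 3.30GHz(*1)
Intel® Xeon® Silver 4210R CPU @ 2.40GHz(*2)
The CPU reservations are therefore 19752/14364 MHz (the system prompts the value)
- RAM: reserve all 18 GB. When the HBA is selected as a PCI passthrough device, the system reserves the memory automatically (vCenter web client only)
- Attach the three NICs to different subnets
- ISO: select SMTXOS-5.0.2-el7-2201071258-x86_64 (it must be uploaded first)
- SCSI controller type: LSI Logic Parallel
- After the SCVM boots, select Automatic Installation; SMTX OS is then installed automatically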
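The reservation rule (core frequency multiplied by the vCPU count) can be checked with a line of shell arithmetic. In this sketch the per-core values 3292 MHz and 2394 MHz are not stated in the article; they are inferred by dividing the prompted reservations 19752 and 14364 by 6. `cpu_reservation_mhz` is an illustrative name.

```shell
#!/bin/sh
# Sketch: CPU reservation (MHz) = per-core frequency (MHz) x vCPUs.
# 3292 and 2394 are inferred from the prompted values 19752 and
# 14364 divided by 6; they do not appear in the article itself.
cpu_reservation_mhz() {
    core_mhz=$1
    vcpus=$2
    echo $(( core_mhz * vcpus ))
}

cpu_reservation_mhz 3292 6   # Gold 6234 host     -> 19752
cpu_reservation_mhz 2394 6   # Silver 4210R hosts -> 14364
```

The nominal 3.30 GHz and 2.40 GHz figures would give 19800 and 14400, which is why the value prompted by the system is the one to enter.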
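Summary of the SCVM configuration
- Settings when creating the VM:
- Settings changed after creation:
Notes:
- Do not set high latency sensitivity or the CPU reservation at first; complete the basic VM settings and create the VM
- Then set latency sensitivity to High; the system prompts the CPU reservation value
- Set the CPU reservation to that value
- The hard disk must be thin provisioned
- When PCI passthrough is selected, the system reserves all memory automatically (vCenter web client only)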
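Configure the ZBS cluster
- Open https://IP in a browser, using the IP address of any SCVM
- If there is no DHCP server, configure the IP address with the usual CentOS commands
- If the page does not open, log in to the SCVM and check whether the nginx and zbs-deploy-server services are running; if not, restart them with systemctl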
systemctl status nginx
systemctl status zbs-deploy-server
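- Enter the cluster name and select vSphere
- Scan to discover the hosts; change the host names in this step
- Configure storage. In hybrid mode, the tiered layout is always used
- Set the IP address of each NIC on every node
- The cluster installation then runs automatically. If the page has not refreshed after a while, press F5
- Set the administrator password and other items as prompted
- In the final step, which configures the CloudTower management platform, choose not to configure it.
Configure NFS storage for vSphere
- Log in at the IP address of any SCVM. This management interface is called Fisheye.
The home page shows the maximum storage capacity currently available
- Click the settings icon
- Associate vCenter
- If the association fails, refresh the page and try again
- Click NFS Datastore and create an NFS datastore; choose to mount it on all hosts
- vCenter now shows a mounted datastore whose size equals the whole storage pool. This can be checked against the capacity shown on the Fisheye home page
- The datastore can also be checked from the ESXi command line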
[root@esx-01:~] esxcli storage nfs list
Volume Name Host Share Accessible Mounted Read-Only isPE Hardware Acceleration
------------- ------------ -------- ---------- ------- --------- ----- ---------------------
zbs-domain-c8 192.168.33.2 /nfs/zbs true true false false Supported
The NFS datastore on vSphere is now created and ready for use.
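To repeat the command-line check on every host, the `esxcli storage nfs list` output can be filtered for datastores that are not both accessible and mounted. This is a minimal sketch: `nfs_unhealthy` is an illustrative name, and it assumes volume names contain no spaces, so that Accessible and Mounted are the fourth and fifth columns as in the output above.

```shell
#!/bin/sh
# Sketch: list NFS datastores that are not accessible or not mounted.
# Assumes volume names without spaces, so Accessible is column 4 and
# Mounted is column 5 of `esxcli storage nfs list`.
nfs_unhealthy() {
    # Skips the two header lines; prints the name of each bad
    # datastore and exits non-zero if any was found.
    awk 'NR > 2 && NF >= 5 && ($4 != "true" || $5 != "true") { print $1; bad = 1 }
         END { exit bad + 0 }'
}

# On a host: esxcli storage nfs list | nfs_unhealthy
# Offline example using the output shown in this article:
if nfs_unhealthy <<'EOF'
Volume Name    Host          Share     Accessible  Mounted  Read-Only  isPE   Hardware Acceleration
-------------  ------------  --------  ----------  -------  ---------  -----  ---------------------
zbs-domain-c8  192.168.33.2  /nfs/zbs  true        true     false      false  Supported
EOF
then
    echo "all NFS datastores accessible and mounted"
fi
```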
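Configure high availability for SMTX OS
Start the SCVM automatically when ESXi boots
In vCenter, select each ESXi host and configure automatic startup and shutdown for its SCVM.
Enable vCenter cluster HA
- Enabling HA directly produces an error
This is caused by the APD setting; change it before enabling HA
- Edit the advanced system settings on every ESXi host
Set Misc.APDHandlingEnable to 1
- Enable HA. Note: disable the datastore-related HA options in vCenter
- After vSphere HA is enabled, set Misc.APDHandlingEnable back to 0 on every ESXi host
Configure IO Reroute
Once the SMTX OS cluster is deployed and running normally, the NFS storage provided by ZBS is reachable at 192.168.33.2.
Normally this address sits on a directly connected subnet on the vSS. If the NFS switch, a VM NIC, or the network fails, the NFS storage becomes unreachable.
To keep SMTX OS highly available, and because ZBS keeps another replica elsewhere on the network, IO reroute can be configured to provide a second route, so that a replica remains usable when local access fails.
IO reroute means that when the local SCVM fails, the IO traffic of business VMs is rerouted to a remote SCVM for recovery, which keeps the workloads highly available.
To use IO reroute, some additional ESXi host settings must be completed in vCenter, and the IO reroute scripts must be deployed from the SCVMs.
- Configure bidirectional SSH between the ESXi hosts and the SCVMs
The SCVMs do not block SSH, so only the ESXi side needs to be configured here.
SSH was already enabled when ESXi was installed; the ESXi firewall must also be configured in vCenter to allow SSH traffic
Edit:
Note: do not use the Simplified Chinese UI here (vCenter 7.0 U3c), because the options are not displayed. The Traditional Chinese UI does not have this problem.
- Log in to any SCVM and run the following command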
[root@scvm-211 18:13:55 ~]$zbs-deploy-manage deploy-hypervisor --gen_ssh_key
2022-04-27 18:15:53,266 INFO deploy_hypervisor: Collect node info
2022-04-27 18:15:54,838 INFO get_si: create new service instance for host: 192.168.87.211, no exist session
2022-04-27 18:15:54,838 INFO _create_new_si: create new service instance for host: 192.168.87.211
2022-04-27 18:15:57,136 INFO deploy_hypervisor: Start deploy hypervisor
2022-04-27 18:15:57,151 INFO deploy_hypervisor: hypervisor platform: vmware
2022-04-27 18:15:57,152 INFO deploy_vmware: create key pairs for cluster
2022-04-27 18:15:57,840 INFO remote_ssh: command: ssh root@192.168.87.211 "esxcli vm process list | grep -A 4 '56 4d 40 c9 31 5a ac 51-1e 98 5a 84 e5 48 a9 8c' | grep 'Config File' | awk -F"volumes" '{print $2}' | awk -F '/' '{print $2}'"
2022-04-27 18:15:58,780 INFO get_current_esxi_account: datastore path in esxi 192.168.87.211 : /vmfs/volumes/6268a4b5-ed2ade8a-b645-54bf64f90872
2022-04-27 18:15:58,781 INFO deploy_vmware_configuration: add reroute public key to authorized_keys of current scvm
2022-04-27 18:15:58,787 INFO deploy_vmware_configuration: generate scripts esxi need
2022-04-27 18:15:58,814 INFO deploy_vmware_configuration: scp scripts to esxi 192.168.87.211
2022-04-27 18:15:58,814 INFO remote_scp: scp command: scp -r /usr/share/tuna/script/vmware_scvm_failure root@192.168.87.211:/vmfs/volumes/6268a4b5-ed2ade8a-b645-54bf64f90872/vmware_scvm_failure
2022-04-27 18:15:59,197 INFO remote_scp:
local.sh 0% 0 0.0KB/s --:local.sh 100% 217 216.9KB/s 00:00
timeout.sh 0% 0 0.0KB/s --:timeout.sh 100% 885 834.1KB/s 00:00
change_route.sh 0% 0 0.0KB/s --:change_route.sh 100% 8851 4.4MB/s 00:00
scvm_failure_loop.sh 0% 0 0.0KB/s --:scvm_failure_loop.sh 100% 21KB 9.7MB/s 00:00
2022-04-27 18:15:59,298 INFO deploy_vmware_configuration: collect cluster ips
2022-04-27 18:15:59,318 INFO deploy_vmware_configuration: write cluster ips to esxi 192.168.87.211
2022-04-27 18:15:59,318 INFO remote_ssh: command: ssh root@192.168.87.211 "printf 'local_scvm_data_ip=10.10.87.216
active_scvm_data_ip=10.10.87.216,10.10.87.218,10.10.87.217
local_scvm_manage_ip=192.168.87.216
active_scvm_manage_ip=192.168.87.216,192.168.87.218,192.168.87.217
local_scvm_nfs_ip=192.168.33.2
' > /vmfs/volumes/6268a4b5-ed2ade8a-b645-54bf64f90872/vmware_scvm_failure/scvm_ip"
2022-04-27 18:15:59,749 INFO deploy_vmware_configuration: config crontab job of esxi 192.168.87.211
2022-04-27 18:15:59,749 INFO remote_ssh: command: ssh root@192.168.87.211 "cat /var/run/crond.pid | xargs /bin/kill; sed -i '/scvm_failure_loop.sh/d' /var/spool/cron/crontabs/root; echo '* * * * * /bin/sh /vmfs/volumes/6268a4b5-ed2ade8a-b645-54bf64f90872/vmware_scvm_failure/scvm_failure_loop.sh &' >> /var/spool/cron/crontabs/root; crond"
2022-04-27 18:16:00,194 INFO deploy_vmware_configuration: clear exist running scvm_failure_loop script
2022-04-27 18:16:00,195 INFO remote_ssh: command: ssh root@192.168.87.211 "ps -c| grep scvm_failure_loop.sh | grep -v grep | grep -v vi | awk '{print $1}' | xargs /bin/kill"
2022-04-27 18:16:00,673 INFO deploy_vmware_configuration: config rc.local of esxi 192.168.87.211
2022-04-27 18:16:00,673 INFO remote_scp: scp command: scp -r /tmp/boot_reroute.sh root@192.168.87.211:/etc/rc.local.d/
2022-04-27 18:16:01,035 INFO remote_scp:
boot_reroute.sh 100% 271 1.3MB/s 00:00
2022-04-27 18:16:01,136 INFO deploy_vmware_configuration: dispatch reroute_key pair to esxi 192.168.87.211
2022-04-27 18:16:01,136 INFO remote_ssh: command: ssh root@192.168.87.211 "([ -d /vmfs/volumes/6268a4b5-ed2ade8a-b645-54bf64f90872/vmware_scvm_failure/.smartx_key ] || ([ ! -d /vmfs/volumes/6268a4b5-ed2ade8a-b645-54bf64f90872/vmware_scvm_failure/.smartx_key ] && mkdir -p /vmfs/volumes/6268a4b5-ed2ade8a-b645-54bf64f90872/vmware_scvm_failure/.smartx_key))"
2022-04-27 18:16:01,590 INFO remote_scp: scp command: scp -r /root/.ssh/smartx_reroute_id_rsa /root/.ssh/smartx_reroute_id_rsa.pub root@192.168.87.211:/vmfs/volumes/6268a4b5-ed2ade8a-b645-54bf64f90872/vmware_scvm_failure/.smartx_key
2022-04-27 18:16:01,948 INFO remote_scp:
smartx_reroute_id_rsa 100% 2602 2.9MB/s 00:00
smartx_reroute_id_rsa.pub 100% 567 680.8KB/s 00:00
2022-04-27 18:16:02,049 INFO deploy_hypervisor: Finish deploy hypervisor
- On each of the other SCVMs, run the command without --gen_ssh_key
[root@scvm-212 18:18:17 ~]$zbs-deploy-manage deploy-hypervisor
2022-04-27 18:18:23,157 INFO deploy_hypervisor: Collect node info
2022-04-27 18:18:24,808 INFO get_si: create new service instance for host: 192.168.87.212, no exist session
2022-04-27 18:18:24,808 INFO _create_new_si: create new service instance for host: 192.168.87.212
2022-04-27 18:18:31,388 INFO deploy_hypervisor: Start deploy hypervisor
2022-04-27 18:18:31,402 INFO deploy_hypervisor: hypervisor platform: vmware
2022-04-27 18:18:31,596 INFO remote_ssh: command: ssh root@192.168.87.212 "esxcli vm process list | grep -A 4 '42 08 8e 3b 49 aa 1a 6d-e7 42 ff 28 de 5b 6b 5c' | grep 'Config File' | awk -F"volumes" '{print $2}' | awk -F '/' '{print $2}'"
2022-04-27 18:18:32,537 INFO remote_ssh: command: ssh root@192.168.87.212 "[ -f /vmfs/volumes/6268a5c2-9fd4be4e-20b0-54bf64f90934/vmware_scvm_failure/.smartx_key/smartx_reroute_id_rsa.pub ] && [ -f /vmfs/volumes/6268a5c2-9fd4be4e-20b0-54bf64f90934/vmware_scvm_failure/.smartx_key/smartx_reroute_id_rsa ] && echo -n 1"
2022-04-27 18:18:32,983 INFO remote_ssh: command: ssh root@192.168.87.211 "esxcli vm process list | grep -A 4 '56 4d 40 c9 31 5a ac 51-1e 98 5a 84 e5 48 a9 8c' | grep 'Config File' | awk -F"volumes" '{print $2}' | awk -F '/' '{print $2}'"
2022-04-27 18:18:33,927 INFO remote_ssh: command: ssh root@192.168.87.211 "[ -f /vmfs/volumes/6268a4b5-ed2ade8a-b645-54bf64f90872/vmware_scvm_failure/.smartx_key/smartx_reroute_id_rsa.pub ] && [ -f /vmfs/volumes/6268a4b5-ed2ade8a-b645-54bf64f90872/vmware_scvm_failure/.smartx_key/smartx_reroute_id_rsa ] && echo -n 1"
2022-04-27 18:18:34,361 INFO deploy_vmware: scp reroute key pair from esxi 192.168.87.211 to local
2022-04-27 18:18:34,362 INFO remote_scp: scp command: scp -r root@192.168.87.211:/vmfs/volumes/6268a4b5-ed2ade8a-b645-54bf64f90872/vmware_scvm_failure/.smartx_key/smartx_reroute_id_rsa* /root/.ssh/
2022-04-27 18:18:34,729 INFO remote_scp:
smartx_reroute_id_rsa 100% 2602 6.5MB/s 00:00
smartx_reroute_id_rsa.pub 100% 567 1.3MB/s 00:00
2022-04-27 18:18:34,919 INFO remote_ssh: command: ssh root@192.168.87.212 "esxcli vm process list | grep -A 4 '42 08 8e 3b 49 aa 1a 6d-e7 42 ff 28 de 5b 6b 5c' | grep 'Config File' | awk -F"volumes" '{print $2}' | awk -F '/' '{print $2}'"
2022-04-27 18:18:35,794 INFO get_current_esxi_account: datastore path in esxi 192.168.87.212 : /vmfs/volumes/6268a5c2-9fd4be4e-20b0-54bf64f90934
2022-04-27 18:18:35,795 INFO deploy_vmware_configuration: add reroute public key to authorized_keys of current scvm
2022-04-27 18:18:35,800 INFO deploy_vmware_configuration: generate scripts esxi need
2022-04-27 18:18:35,826 INFO deploy_vmware_configuration: scp scripts to esxi 192.168.87.212
2022-04-27 18:18:35,826 INFO remote_scp: scp command: scp -r /usr/share/tuna/script/vmware_scvm_failure root@192.168.87.212:/vmfs/volumes/6268a5c2-9fd4be4e-20b0-54bf64f90934/vmware_scvm_failure
2022-04-27 18:18:36,214 INFO remote_scp:
timeout.sh 100% 885 953.4KB/s 00:00
scvm_failure_loop.sh 100% 21KB 7.9MB/s 00:00
change_route.sh 100% 8851 2.1MB/s 00:00
local.sh 100% 217 234.6KB/s 00:00
2022-04-27 18:18:36,314 INFO deploy_vmware_configuration: collect cluster ips
2022-04-27 18:18:36,331 INFO deploy_vmware_configuration: write cluster ips to esxi 192.168.87.212
2022-04-27 18:18:36,332 INFO remote_ssh: command: ssh root@192.168.87.212 "printf 'local_scvm_data_ip=10.10.87.217
active_scvm_data_ip=10.10.87.216,10.10.87.218,10.10.87.217
local_scvm_manage_ip=192.168.87.217
active_scvm_manage_ip=192.168.87.216,192.168.87.218,192.168.87.217
local_scvm_nfs_ip=192.168.33.2
' > /vmfs/volumes/6268a5c2-9fd4be4e-20b0-54bf64f90934/vmware_scvm_failure/scvm_ip"
2022-04-27 18:18:36,770 INFO deploy_vmware_configuration: config crontab job of esxi 192.168.87.212
2022-04-27 18:18:36,771 INFO remote_ssh: command: ssh root@192.168.87.212 "cat /var/run/crond.pid | xargs /bin/kill; sed -i '/scvm_failure_loop.sh/d' /var/spool/cron/crontabs/root; echo '* * * * * /bin/sh /vmfs/volumes/6268a5c2-9fd4be4e-20b0-54bf64f90934/vmware_scvm_failure/scvm_failure_loop.sh &' >> /var/spool/cron/crontabs/root; crond"
2022-04-27 18:18:37,221 INFO deploy_vmware_configuration: clear exist running scvm_failure_loop script
2022-04-27 18:18:37,222 INFO remote_ssh: command: ssh root@192.168.87.212 "ps -c| grep scvm_failure_loop.sh | grep -v grep | grep -v vi | awk '{print $1}' | xargs /bin/kill"
2022-04-27 18:18:37,690 INFO deploy_vmware_configuration: config rc.local of esxi 192.168.87.212
2022-04-27 18:18:37,690 INFO remote_scp: scp command: scp -r /tmp/boot_reroute.sh root@192.168.87.212:/etc/rc.local.d/
2022-04-27 18:18:38,040 INFO remote_scp:
boot_reroute.sh 100% 271 1.3MB/s 00:00
2022-04-27 18:18:38,140 INFO deploy_vmware_configuration: dispatch reroute_key pair to esxi 192.168.87.212
2022-04-27 18:18:38,140 INFO remote_ssh: command: ssh root@192.168.87.212 "([ -d /vmfs/volumes/6268a5c2-9fd4be4e-20b0-54bf64f90934/vmware_scvm_failure/.smartx_key ] || ([ ! -d /vmfs/volumes/6268a5c2-9fd4be4e-20b0-54bf64f90934/vmware_scvm_failure/.smartx_key ] && mkdir -p /vmfs/volumes/6268a5c2-9fd4be4e-20b0-54bf64f90934/vmware_scvm_failure/.smartx_key))"
2022-04-27 18:18:38,584 INFO remote_scp: scp command: scp -r /root/.ssh/smartx_reroute_id_rsa /root/.ssh/smartx_reroute_id_rsa.pub root@192.168.87.212:/vmfs/volumes/6268a5c2-9fd4be4e-20b0-54bf64f90934/vmware_scvm_failure/.smartx_key
2022-04-27 18:18:38,938 INFO remote_scp:
smartx_reroute_id_rsa 100% 2602 2.8MB/s 00:00
smartx_reroute_id_rsa.pub 100% 567 766.2KB/s 00:00
2022-04-27 18:18:39,039 INFO deploy_hypervisor: Finish deploy hypervisor
- Verify the configuration
- On every ESXi node, run ps -c | grep scvm_failure | grep -v grep. The output should show the script running in the background, as follows
[root@esx-01:~] ps -c | grep scvm_failure | grep -v grep
2106418 2106418 sh /bin/sh /vmfs/volumes/6268a4b5-ed2ade8a-b645-54bf64f90872/vmware_scvm_failure/scvm_failure_loop.sh
- On every ESXi node, run tail -f /var/log/scvm_failure.log and check the scvm_failure log. If the log keeps producing output, this check passes
[root@esx-01:~] tail -f /var/log/scvm_failure.log
Wed Apr 27 10:24:22 UTC 2022 [+] start a round
Wed Apr 27 10:24:22 UTC 2022 [+] update session list: new_local_zone_data_ips: 10.10.87.216,10.10.87.218,10.10.87.217
new_remote_zone_data_ips: empty new_local_zone_manage_ips = 192.168.87.216,192.168.87.218,192.168.87.217
new_remote_zone_manage_ips: empty
Wed Apr 27 10:24:22 UTC 2022 [+] ------------------------------------------------
Wed Apr 27 10:24:25 UTC 2022 [+] start a round
Wed Apr 27 10:24:25 UTC 2022 [+] update session list: new_local_zone_data_ips: 10.10.87.216,10.10.87.218,10.10.87.217
new_remote_zone_data_ips: empty new_local_zone_manage_ips = 192.168.87.216,192.168.87.218,192.168.87.217
new_remote_zone_manage_ips: empty
Wed Apr 27 10:24:25 UTC 2022 [+] ------------------------------------------------
Wed Apr 27 10:24:28 UTC 2022 [+] start a round
Wed Apr 27 10:24:28 UTC 2022 [+] update session list: new_local_zone_data_ips: 10.10.87.216,10.10.87.218,10.10.87.217
new_remote_zone_data_ips: empty new_local_zone_manage_ips = 192.168.87.216,192.168.87.218,192.168.87.217
new_remote_zone_manage_ips: empty
Wed Apr 27 10:24:28 UTC 2022 [+] ------------------------------------------------
- On each ESXi node, run esxcfg-route -l. If 192.168.33.2 is routed to the storage network address of the local SCVM, this check passes
[root@esx-01:~] esxcfg-route -l
VMkernel Routes:
Network Netmask Gateway Interface
192.168.33.2 255.255.255.255 10.10.87.216 vmk2
10.10.87.0 255.255.255.0 Local Subnet vmk2
192.168.33.0 255.255.255.0 Local Subnet vmk1
192.168.80.0 255.255.240.0 Local Subnet vmk0
default 0.0.0.0 192.168.80.1 vmk0
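- An extra route is present: 192.168.33.2/32 via vmk2, the SCVM storage network port
- 192.168.33.0/24 is a directly connected route via vmk1 and has higher priority
- When vmk1 is unavailable, traffic for 192.168.33.2 is redirected through vmk2 by the /32 route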
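The route check can be scripted so that it is easy to repeat on all three hosts. The sketch below looks for the host route to the NFS address in `esxcfg-route -l` output and compares its gateway with the storage IP of the local SCVM. `reroute_gateway` and `check_reroute` are illustrative names, and the column layout is assumed to match the output shown in this article.

```shell
#!/bin/sh
# Sketch: confirm that the NFS address has a /32 route through the
# local SCVM's storage IP. Helper names are illustrative; the column
# layout is assumed to match the `esxcfg-route -l` output above.
reroute_gateway() {
    # stdin: esxcfg-route -l text; $1: NFS access IP.
    # Prints the gateway of the host route (netmask 255.255.255.255).
    awk -v ip="$1" '$1 == ip && $2 == "255.255.255.255" { print $3; exit }'
}

check_reroute() {
    nfs_ip=$1
    expected_gw=$2
    gw=$(reroute_gateway "$nfs_ip")
    if [ "$gw" = "$expected_gw" ]; then
        echo "PASS: $nfs_ip routed via $gw"
    else
        echo "FAIL: no host route for $nfs_ip via $expected_gw (found: ${gw:-none})"
        return 1
    fi
}

# On esx-01: esxcfg-route -l | check_reroute 192.168.33.2 10.10.87.216
# Offline example using the routing table shown in this article:
check_reroute 192.168.33.2 10.10.87.216 <<'EOF'
VMkernel Routes:
Network          Netmask          Gateway          Interface
192.168.33.2     255.255.255.255  10.10.87.216     vmk2
10.10.87.0       255.255.255.0    Local Subnet     vmk2
192.168.33.0     255.255.255.0    Local Subnet     vmk1
192.168.80.0     255.255.240.0    Local Subnet     vmk0
default          0.0.0.0          192.168.80.1     vmk0
EOF
```

For esx-02 and esx-03 the expected gateways in this lab would be 10.10.87.217 and 10.10.87.218, following the SCVM address table.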
Configuration complete.