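Service | IP |
Prometheus, Grafana, Alertmanager | 192.168.209.133 |
node_exporter, blackbox_exporter | 192.168.209.132 |
ipmi_exporter (skip if you have no physical server; virtual machines will not work) | 10.254.254.109 (physical server), 10.254.254.108 (BMC address) |
# If any package below fails to download, get it from this repository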
git clone https://gitee.com/deqxr/prometheus.git
1. Deploy Prometheus (run on 192.168.209.133)
wget https://github.com/prometheus/prometheus/releases/download/v2.37.2/prometheus-2.37.2.linux-amd64.tar.gz
echo "If the download is slow, try the mirror below"
wget https://githubfast.com/prometheus/prometheus/releases/download/v2.37.2/prometheus-2.37.2.linux-amd64.tar.gz
tar -xzvf prometheus-2.37.2.linux-amd64.tar.gz -C /usr/local
cd /usr/local
mv prometheus-2.37.2.linux-amd64 prometheus
echo "Check the Prometheus version"
cd /usr/local/prometheus
./prometheus --version
echo "Check the Prometheus configuration file for errors"
./promtool check config prometheus.yml
echo "Create the local TSDB data directory for Prometheus"
mkdir -p /var/lib/prometheus
echo "Manage Prometheus with systemctl"
vi /usr/lib/systemd/system/prometheus.service
[Unit]
Description=Prometheus
Documentation=https://prometheus.io/
After=network.target
[Service]
Type=simple
User=root
ExecStart=/usr/local/prometheus/prometheus --config.file=/usr/local/prometheus/prometheus.yml --storage.tsdb.path=/var/lib/prometheus --web.enable-lifecycle
ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
Restart=on-failure
[Install]
WantedBy=multi-user.target
systemctl daemon-reload
systemctl enable prometheus
systemctl start prometheus
systemctl status prometheus
Open http://192.168.209.133:9090 in a browser
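The same check can be done from the shell. This is a minimal sketch that assumes Prometheus's built-in `/-/healthy` and `/-/ready` management endpoints; the function name `prometheus_ok` is made up for illustration.

```shell
# Hypothetical helper: succeeds only if both management endpoints answer
# with an HTTP success status. $1 = base URL (defaults to this guide's server).
prometheus_ok() {
  local base="${1:-http://192.168.209.133:9090}"
  local path
  for path in /-/healthy /-/ready; do
    # -f: non-zero exit on HTTP errors; -m 5: give up after 5 seconds.
    curl -fsS -m 5 -o /dev/null "${base}${path}" || return 1
  done
}

# Usage (commented out so that sourcing this snippet has no side effects):
# prometheus_ok && echo "Prometheus is up"
```

A non-zero exit status means at least one endpoint did not answer, which usually points to the service not running or port 9090 being blocked.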
2. Deploy node_exporter (run on 192.168.209.132)
yum install wget -y
wget https://github.com/prometheus/node_exporter/releases/download/v1.4.0/node_exporter-1.4.0.linux-amd64.tar.gz
echo "If the download is slow, try the mirror below"
wget https://githubfast.com/prometheus/node_exporter/releases/download/v1.4.0/node_exporter-1.4.0.linux-amd64.tar.gz
tar -zvxf node_exporter-1.4.0.linux-amd64.tar.gz -C /usr/local/
cd /usr/local
mv node_exporter-1.4.0.linux-amd64 node_exporter
echo "Manage node_exporter with systemctl"
vim /usr/lib/systemd/system/node_exporter.service
[Unit]
Description=node_exporter
Documentation=https://prometheus.io/
After=network.target
[Service]
Type=simple
User=root
ExecStart=/usr/local/node_exporter/node_exporter
ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
Restart=on-failure
[Install]
WantedBy=multi-user.target
systemctl daemon-reload
systemctl enable node_exporter
systemctl start node_exporter
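To confirm that node_exporter is actually serving data, its metrics page can be fetched and the `node_` samples counted. A small sketch, assuming the default listen port 9100; `count_node_samples` is a hypothetical name.

```shell
# Hypothetical helper: reads Prometheus text-format metrics on stdin and
# prints how many sample lines start with "node_".
# HELP/TYPE comment lines start with "#", so they are not counted.
count_node_samples() {
  # grep -c prints 0 but exits non-zero when nothing matches; normalise that.
  grep -c '^node_' || true
}

# Usage:
# curl -s http://192.168.209.132:9100/metrics | count_node_samples
```

A result of 0 suggests the exporter is not running or the request reached something else on that port.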
3. Configure Prometheus to monitor the node (run on 192.168.209.133)
echo "Add the following under scrape_configs"
vi /usr/local/prometheus/prometheus.yml
  - job_name: "node1"
    static_configs:
      - targets: ['192.168.209.132:9100']
echo "Check the syntax"
cd /usr/local/prometheus
/usr/local/prometheus/promtool check config prometheus.yml
echo "Hot-reload the Prometheus configuration"
curl -X POST http://127.0.0.1:9090/-/reload
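# Check the metrics that node_exporter exposes over HTTP: http://192.168.209.132:9100/metrics
# Verify in Prometheus: Status > Targets should list the node1 job as UP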
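The target state can also be read from the Prometheus HTTP API instead of the web UI. The sketch below assumes the `/api/v1/targets` endpoint and its compact JSON output, and avoids a `jq` dependency; `summarize_target_health` is a hypothetical name.

```shell
# Hypothetical helper: reads the JSON returned by /api/v1/targets on stdin
# and prints one line per health state with its count, for example "2 up".
summarize_target_health() {
  grep -o '"health":"[a-z]*"' \
    | cut -d'"' -f4 \
    | sort \
    | uniq -c \
    | awk '{ print $1, $2 }'
}

# Usage:
# curl -s http://127.0.0.1:9090/api/v1/targets | summarize_target_health
```

Any line other than `up` is worth a look on the Targets page, which shows the last scrape error for each target.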
4. Deploy blackbox_exporter (on 192.168.209.132)
wget https://github.com/prometheus/blackbox_exporter/releases/download/v0.22.0/blackbox_exporter-0.22.0.linux-amd64.tar.gz
echo "If the download is slow, try the mirror below"
wget https://githubfast.com/prometheus/blackbox_exporter/releases/download/v0.22.0/blackbox_exporter-0.22.0.linux-amd64.tar.gz
tar -zvxf blackbox_exporter-0.22.0.linux-amd64.tar.gz -C /usr/local/
cd /usr/local/
mv blackbox_exporter-0.22.0.linux-amd64 blackbox_exporter
cd /usr/local/blackbox_exporter
./blackbox_exporter --version
vi /usr/lib/systemd/system/blackbox_exporter.service
[Unit]
Description=blackbox_exporter
After=network.target
[Service]
User=root
Type=simple
ExecStart=/usr/local/blackbox_exporter/blackbox_exporter --config.file=/usr/local/blackbox_exporter/blackbox.yml
ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
Restart=on-failure
[Install]
WantedBy=multi-user.target
systemctl daemon-reload
systemctl start blackbox_exporter && systemctl enable blackbox_exporter
ps -ef | grep blackbox_exporter
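Beyond checking the process, a single probe can be sent by hand. This sketch assumes the `blackbox.yml` shipped in the tarball defines an `icmp` module and that the exporter listens on its default port 9115; `probe_succeeded` is a hypothetical name.

```shell
# Hypothetical helper: reads blackbox_exporter probe output on stdin and
# succeeds only if it contains the line "probe_success 1".
probe_succeeded() {
  grep -q '^probe_success 1$'
}

# Usage:
# curl -s 'http://192.168.209.132:9115/probe?module=icmp&target=223.5.5.5' \
#   | probe_succeeded && echo "probe OK"
```

Appending `&debug=true` to the probe URL makes the exporter print its probe log, which helps when a probe fails.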
5. Configure blackbox_exporter monitoring (on 192.168.209.133)
ICMP host liveness monitoring
vi /usr/local/prometheus/prometheus.yml
  # ICMP ping monitoring
  - job_name: icmp_ping
    metrics_path: /probe
    params:
      module: [icmp]
    file_sd_configs:
      - refresh_interval: 10s
        files:
          - "/usr/local/prometheus/conf.d/ping_status.yml"  # file that lists the targets
    # static_configs:
    #   - targets: ['223.5.5.5','114.114.114.114']
    #     labels:
    #       instance: node_status
    #       group: 'icmp-node'
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 192.168.209.132:9115
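The three relabel rules above move the listed target into the `target` URL parameter, copy it into the `instance` label, and point the scrape at blackbox_exporter itself. As a rough illustration of the request Prometheus ends up sending, here is a sketch with a hypothetical helper `blackbox_scrape_url`:

```shell
# Hypothetical helper mirroring the relabel rules:
#   $1 = blackbox_exporter address, $2 = module, $3 = target from file_sd
# Prints the URL that Prometheus scrapes for that target.
# The target is not URL-encoded here, which is fine for bare IPs and host:port pairs.
blackbox_scrape_url() {
  printf 'http://%s/probe?module=%s&target=%s\n' "$1" "$2" "$3"
}

blackbox_scrape_url 192.168.209.132:9115 icmp 223.5.5.5
```

The resulting series carry `instance="223.5.5.5"` rather than the exporter's address, so dashboards group by the probed host.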
mkdir -p /usr/local/prometheus/conf.d
vi /usr/local/prometheus/conf.d/ping_status.yml
- targets: ['220.181.38.150','14.215.177.39','180.101.49.12','180.101.49.11','14.215.177.38']
  labels:
    group: 'Tier-1 cities - China Telecom network monitoring'
- targets: ['112.80.248.75','163.177.151.109','61.135.169.125','163.177.151.110','180.101.49.11','61.135.169.121']
  labels:
    group: 'Tier-1 cities - China Unicom network monitoring'
- targets: ['183.232.231.172','36.152.44.95','182.61.200.6','36.152.44.96','220.181.38.149']
  labels:
    group: 'Tier-1 cities - China Mobile network monitoring'
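Hand-edited target lists tend to pick up repeated addresses. One possible way to generate a group entry without duplicates is sketched below; `file_sd_group` is a hypothetical helper and the group name is only an example.

```shell
# Hypothetical helper: $1 = group label, remaining arguments = target addresses.
# Prints one file_sd entry as YAML, dropping repeated addresses but keeping order.
file_sd_group() {
  local group="$1" targets
  shift
  # awk keeps the first occurrence of each address and wraps it in single quotes
  # (\047 is the octal escape for a single quote).
  targets=$(printf '%s\n' "$@" \
    | awk '!seen[$0]++ { printf "%s\047%s\047", sep, $0; sep = "," }')
  printf '%s\n' "- targets: [${targets}]" "  labels:" "    group: '${group}'"
}

file_sd_group 'icmp-demo' 223.5.5.5 114.114.114.114 223.5.5.5
```

Redirecting the output of several calls into `/usr/local/prometheus/conf.d/ping_status.yml` would rebuild the file; Prometheus picks up the change without a restart because the file is read through `file_sd_configs`.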
TCP port monitoring
vi /usr/local/prometheus/prometheus.yml
  # TCP port monitoring
  - job_name: tcp_port
    metrics_path: /probe
    params:
      module: [tcp_connect]
    file_sd_configs:
      - files: ['/usr/local/prometheus/conf.d/tcp_port/*.yml']
        refresh_interval: 10s
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 192.168.209.132:9115
mkdir -p /usr/local/prometheus/conf.d/tcp_port/
vi /usr/local/prometheus/conf.d/tcp_port/tcp_port.yml
- targets: ['192.168.209.132:80','192.168.209.132:22']
  labels:
    group: 'tcp port'
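Before adding a port to the target file it can help to check by hand that the port is reachable at all. The following is only a sketch: it relies on bash's `/dev/tcp` pseudo-device and the coreutils `timeout` command, and `tcp_port_open` is a hypothetical name. Note that it tests from the machine you run it on, not from the blackbox_exporter host.

```shell
# Hypothetical helper: succeeds if a TCP connection to $1:$2 opens within 2 seconds.
tcp_port_open() {
  # Inside the inner bash, $0 is the host and $1 is the port.
  timeout 2 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Usage:
# tcp_port_open 192.168.209.132 22 && echo "port 22 is open"
```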
HTTP GET monitoring
vi /usr/local/prometheus/prometheus.yml
  # HTTP GET monitoring
  - job_name: http_get
    metrics_path: /probe
    params:
      module: [http_2xx]
    file_sd_configs:
      - files: ['/usr/local/prometheus/conf.d/http_get/*.yml']
        refresh_interval: 10s
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 192.168.209.132:9115  # address of the host running blackbox_exporter
mkdir -p /usr/local/prometheus/conf.d/http_get
vi /usr/local/prometheus/conf.d/http_get/http_get.yml
- targets:
    - 192.168.209.132:80
    - 192.168.209.131:80
    - 192.168.209.133:80
    - https://www.jd.com
    - https://www.baidu.com
    - https://www.zhihu.com
    - https://www.pinduoduo.com
  labels:
    name: 'http_get'
systemctl restart prometheus
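Since the service is started with `--web.enable-lifecycle`, a reload is usually enough after editing `prometheus.yml`, and validating first avoids loading a broken file. A sketch of that sequence follows; the function name and the `PROMTOOL`, `PROM_CONFIG`, and `PROM_URL` variables are assumptions made for this example, with defaults taken from the paths used in this guide.

```shell
# Hypothetical helper: validate the configuration, then reload only on success.
reload_prometheus() {
  local promtool="${PROMTOOL:-/usr/local/prometheus/promtool}"
  local config="${PROM_CONFIG:-/usr/local/prometheus/prometheus.yml}"
  local url="${PROM_URL:-http://127.0.0.1:9090}"
  if ! "$promtool" check config "$config"; then
    echo "config check failed, not reloading" >&2
    return 1
  fi
  # The reload endpoint only exists when --web.enable-lifecycle is set.
  curl -fsS -X POST "${url}/-/reload"
}

# Usage:
# reload_prometheus
```

Edits to the target files under `conf.d/` need neither a reload nor a restart, because `file_sd_configs` re-reads them on its own.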
6. Deploy ipmi_exporter (on 10.254.254.109). Skip this step if you have no physical server; virtual machines have no BMC
wget https://github.com/prometheus-community/ipmi_exporter/releases/download/v1.8.0/ipmi_exporter-1.8.0.linux-amd64.tar.gz
tar xf ipmi_exporter-1.8.0.linux-amd64.tar.gz -C /opt
mv /opt/ipmi_exporter-1.8.0.linux-amd64 /opt/ipmi_exporter
ln -sf /opt/ipmi_exporter/ipmi_exporter /usr/local/bin/ipmi_exporter
ipmi_exporter -h
vim /opt/ipmi_exporter/ipmi_remote.yml
modules:
  default:
    # These settings are used if no module is specified, the
    # specified module doesn't exist, or of course if
    # module=default is specified.
    user: "IPMI/BMC username"  # unless you have special requirements, filling in these two credential lines is enough
    pass: "IPMI/BMC password"  # unless you have special requirements, filling in these two credential lines is enough
    # The below settings correspond to driver-type, privilege-level, and
    # session-timeout respectively, see `man 5 freeipmi.conf` (and e.g.
    # `man 8 ipmi-sensors` for a list of driver types).
    driver: "LAN_2_0"
    privilege: "user"
    # The session timeout is in milliseconds. Note that a scrape can take up
    # to (session-timeout * #-of-collectors) milliseconds, so set the scrape
    # timeout in Prometheus accordingly.
    # Must be larger than the retransmission timeout, which defaults to 1000.
    timeout: 10000
    # Available collectors are bmc, bmc-watchdog, ipmi, chassis, dcmi, sel,
    # and sm-lan-mode
    # If _not_ specified, bmc, ipmi, chassis, and dcmi are used
    collectors:
      - bmc
      - ipmi
      - chassis
      - dcmi
    # Got any sensors you don't care about? Add them here.
    exclude_sensor_ids:
      - 2
      - 29
      - 32
      - 50
      - 52
      - 55
  dcmi:
    # Use these settings when scraped with module=dcmi.
    user: "IPMI/BMC username"
    pass: "IPMI/BMC password"
    privilege: "admin"
    driver: "LAN_2_0"
    collectors:
      - dcmi
  thatspecialhost:
    # Use these settings when scraped with module=thatspecialhost.
    user: "IPMI/BMC username"
    pass: "IPMI/BMC password"
    privilege: "admin"
    driver: "LAN"
    collectors:
      - ipmi
      - sel
    # Need any special workaround flags set? Add them here.
    # Workaround flags might be needed to address issues with specific vendor implementations
    # e.g. https://www.gnu.org/software/freeipmi/freeipmi-faq.html#Why-is-the-output-from-FreeIPMI-different-than-another-software_003f
    # For a full list of flags, refer to:
    # https://www.gnu.org/software/freeipmi/manpages/man8/ipmi-sensors.8.html#lbAL
    workaround_flags:
      - discretereading
    # If you require additional command line arguments (e.g. --bridge-sensors for ipmimonitoring),
    # you can specify them per collector - BE CAREFUL, you can easily break the exporter with this!
    custom_args:
      ipmi:
        - "--bridge-sensors"
  advanced:
    # Use these settings when scraped with module=advanced.
    user: "IPMI/BMC username"
    pass: "IPMI/BMC password"
    privilege: "admin"
    driver: "LAN"
    collectors:
      - ipmi
      - sel
    # USING ANY OF THE BELOW VOIDS YOUR WARRANTY! YOU MAY GET BITTEN BY SHARKS!
    # You can override the command to be executed for a collector. Paired with
    # custom_args, this can be used to e.g. execute the IPMI tools with sudo:
    collector_cmd:
      ipmi: sudo
      sel: sudo
    custom_args:
      ipmi:
        - "ipmimonitoring"
      sel:
        - "ipmi-sel"
useradd prometheus -M -s /sbin/nologin
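Because `ipmi_remote.yml` holds BMC credentials, it may be worth limiting who can read it. This is a suggestion rather than part of the original procedure; `secure_config` is a hypothetical helper, and the owner argument assumes the `prometheus` user created above runs the exporter.

```shell
# Hypothetical helper: $1 = config file, $2 = optional owner (user or user:group).
# Hands the file to the given owner and makes it readable by that owner only.
secure_config() {
  local file="$1" owner="${2:-}"
  if [ -n "$owner" ]; then
    chown "$owner" "$file" || return 1
  fi
  chmod 600 "$file"
}

# Usage:
# secure_config /opt/ipmi_exporter/ipmi_remote.yml prometheus
```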
vim /etc/systemd/system/ipmi_exporter.service
[Unit]
Description=IPMI Exporter
Wants=network-online.target
After=network-online.target
[Service]
User=prometheus
Type=simple
ExecStart=/usr/local/bin/ipmi_exporter --config.file=/opt/ipmi_exporter/ipmi_remote.yml
[Install]
WantedBy=multi-user.target
systemctl daemon-reload
systemctl enable ipmi_exporter --now
journalctl -u ipmi_exporter.service -f
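Install FreeIPMI (ipmi_exporter calls its command-line tools)
Distribution | Install command |
Arch Linux | pacman -Sy extra/freeipmi |
CentOS/Red Hat | yum install freeipmi -y |
Debian/Ubuntu | apt install freeipmi -y |
Gentoo | emerge --ask freeipmi |
After installation, tab completion shows that the dependencies are all standalone commands, for example ipmimonitoring, ipmi-sensors, ipmi-sel, ipmi-dcmi, ipmi-chassis, and bmc-info.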
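A quick check that the tools are on the PATH can be scripted. The command list in the usage line is an assumption about which FreeIPMI tools the configured collectors call; `missing_commands` is a hypothetical helper.

```shell
# Hypothetical helper: prints every argument that is not an available command.
# Prints nothing when all of them are found.
missing_commands() {
  local cmd
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}

# Usage:
# missing_commands ipmimonitoring ipmi-dcmi ipmi-chassis ipmi-sel bmc-info
```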
7. Configure Prometheus to monitor ipmi_exporter (on 192.168.209.133)
vim /usr/local/prometheus/ipmi_targets.yml
- targets:
    - 10.254.254.108  # IP of the monitored IPMI host (its BMC address)
  labels:
    job: ipmi_exporter
vim /usr/local/prometheus/prometheus.yml
  # Server hardware information
  - job_name: ipmi_exporter
    params:
      module: ['default']
    scrape_interval: 1m
    scrape_timeout: 30s
    metrics_path: /ipmi
    scheme: http
    file_sd_configs:
      - files:
          - /usr/local/prometheus/ipmi_targets.yml
        refresh_interval: 5m
    relabel_configs:
      - source_labels: [__address__]
        separator: ;
        regex: (.*)
        target_label: __param_target
        replacement: ${1}
        action: replace
      - source_labels: [__param_target]
        separator: ;
        regex: (.*)
        target_label: instance
        replacement: ${1}
        action: replace
      - separator: ;
        regex: .*
        target_label: __address__
        replacement: 10.254.254.109:9290  # host running ipmi_exporter; use the local address if it runs on the same machine
systemctl restart prometheus
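If the job shows up but panels stay empty, the exporter's own `ipmi_up` metric tells which collector failed. The sketch below assumes that metric carries a `collector` label and is 0 on failure; `failed_ipmi_collectors` is a hypothetical name.

```shell
# Hypothetical helper: reads ipmi_exporter output on stdin and prints the
# name of every collector whose ipmi_up value is 0.
failed_ipmi_collectors() {
  sed -n 's/^ipmi_up{collector="\([^"]*\)"} 0$/\1/p'
}

# Usage:
# curl -s 'http://10.254.254.109:9290/ipmi?module=default&target=10.254.254.108' \
#   | failed_ipmi_collectors
```

An empty result means every collector in the module returned data; a listed collector is a hint to check credentials, privilege level, or the matching FreeIPMI tool.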
8. Deploy Grafana (on 192.168.209.133)
wget https://dl.grafana.com/enterprise/release/grafana-enterprise-9.2.2-1.x86_64.rpm
yum install grafana-enterprise-9.2.2-1.x86_64.rpm
systemctl enable grafana-server
systemctl start grafana-server
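Open http://192.168.209.133:3000
Default username/password: admin/admin
You are asked to change the password after the first login
Configure the data source (type Prometheus, URL http://192.168.209.133:9090)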
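As an alternative to clicking through the UI, Grafana can load the data source from a provisioning file. This sketch assumes the provisioning directory used by the RPM package, `/etc/grafana/provisioning/datasources/`; the function name and the data source name `Prometheus` are made up for illustration.

```shell
# Hypothetical helper: prints a data source provisioning file for the
# Prometheus URL given in $1 (defaults to this guide's server).
grafana_datasource_yaml() {
  local url="${1:-http://192.168.209.133:9090}"
  cat <<EOF
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: ${url}
    isDefault: true
EOF
}

# Usage:
# grafana_datasource_yaml > /etc/grafana/provisioning/datasources/prometheus.yaml
# systemctl restart grafana-server
```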
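Import the node_exporter dashboard
Dashboard ID 8919 displays node_exporter data (some panels stay empty; the cause has not been identified)
Import the blackbox_exporter dashboard
Dashboard ID 9965
Import the ipmi_exporter dashboard
Dashboard ID 15765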
Appendix:
  # DNS resolution monitoring (targets are discovered from A records)
  - job_name: blackbox_all
    metrics_path: /probe
    params:
      module: [ http_2xx ]  # Look for a HTTP 200 response.
    dns_sd_configs:
      - names:
          - www.bilibili.com
          - prometheus.io
        type: A
        port: 443
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
        replacement: https://$1/  # Make probe URL be like https://1.2.3.4:443/
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 192.168.209.132:9115  # The blackbox exporter's real hostname:port.
      - source_labels: [__meta_dns_name]
        target_label: __param_hostname  # Make domain name become 'Host' header for probe requests
      - source_labels: [__meta_dns_name]
        target_label: vhost  # and store it in 'vhost' label
Grafana dashboard ID 13230