Reposted

数据大侠客 2023-08-23 16:04:56

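A Detailed Guide to Using Prometheus

How Prometheus monitoring works

1. Prometheus: although usually described as a monitoring platform, at its core it is a time-series database plus the scheduler that pulls (scrapes) data into it.
2. mysqld_exporter: a small program that runs on the server being monitored and exposes MySQL metrics.
3. node_exporter: collects host-level performance data such as CPU, memory, disk, and network usage and exposes it over HTTP; Prometheus scrapes that endpoint and stores the samples, much like writing rows into a database.
4. Prometheus focuses on collecting and storing data; its built-in UI is only a basic expression browser, so dashboards are built with Grafana.
5. Grafana: displays the data and re-queries it on a schedule so that dashboards stay current.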
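The pull model in the list above can be illustrated with a minimal Python sketch using only the standard library. It is not how Prometheus or any real exporter is implemented: the `MetricsHandler` class, the `demo_requests_total` metric, and the `run_demo` helper are hypothetical names made up for this illustration. One side plays the exporter by serving text at /metrics, the other plays the Prometheus server by fetching it.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical metric, used only for this illustration.
METRICS_TEXT = (
    "# TYPE demo_requests_total counter\n"
    "demo_requests_total 42\n"
)


class MetricsHandler(BaseHTTPRequestHandler):
    """Plays the exporter: serves the current metric values at /metrics."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = METRICS_TEXT.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def scrape_once(url):
    """Plays the Prometheus server: pulls the text an exporter exposes."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8")


def run_demo():
    """Start the toy exporter on a free local port, scrape it once, stop it."""
    server = HTTPServer(("127.0.0.1", 0), MetricsHandler)  # port 0 = any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        return scrape_once(f"http://127.0.0.1:{port}/metrics")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(run_demo())
```

The point of the sketch is the direction of the data flow: the exporter never pushes anything; it only answers when the scraper asks.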

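Prometheus server

Download, install, and run Prometheus. You will also download and install an exporter, a tool that exposes time series data on hosts and services. The first exporter is Prometheus itself, which provides a variety of host-level metrics about memory usage, garbage collection, and more.

Download

Official guide: https://prometheus.io/docs/introduction/first_steps/

Configuration

The example configuration file has three configuration blocks: global, rule_files, and scrape_configs.

The rule_files block specifies the location of any rules we want the Prometheus server to load. For now there are no rules. The last block, scrape_configs, controls which resources Prometheus monitors. Because Prometheus also exposes data about itself as an HTTP endpoint, it can scrape and monitor its own health. The default configuration contains a single job, called prometheus, which scrapes the time series data exposed by the Prometheus server. The job contains a single, statically configured target: localhost on port 9090. Prometheus expects metrics to be available on targets at the path /metrics, so this default job scrapes the URL http://localhost:9090/metrics.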
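As a small illustration of how a scrape URL follows from a job's settings, the Python sketch below joins a scheme, a target, and a metrics path. The function name `scrape_url` is hypothetical; the defaults mirror the ones mentioned in the configuration comments (scheme `http`, path `/metrics`).

```python
def scrape_url(target, scheme="http", metrics_path="/metrics"):
    """Build the URL a scrape would hit for one static target.

    target is a host:port string such as "localhost:9090".
    """
    if not metrics_path.startswith("/"):
        metrics_path = "/" + metrics_path
    return f"{scheme}://{target}{metrics_path}"


if __name__ == "__main__":
    print(scrape_url("localhost:9090"))  # http://localhost:9090/metrics
```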

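Configuration file

# my global config
# The global block controls the Prometheus server's global configuration.
# Two options are set here; by default targets are scraped every 15 seconds.
global:
  scrape_interval: 15s # How often Prometheus scrapes targets. You can override this for individual targets.
  evaluation_interval: 15s # How often Prometheus evaluates rules. Rules are used to create new time series and to generate alerts.
  # scrape_timeout is set to the global default (10s).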

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["localhost:9090"]

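  # Node Exporter
  - job_name: "node"
    static_configs:
      - targets: ['localhost:9091']

  # GPU exporter
  - job_name: "gpu"
    static_configs:
      - targets: ['localhost:9445']

# Not part of prometheus.yml: data retention is set with the command-line flag
# --storage.tsdb.retention.time (default 15d; older releases spelled it --storage.tsdb.retention).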
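Values such as `15s` for the scrape interval and `15d` for retention are duration strings. The Python sketch below converts such a string to seconds so the two settings can be compared. It is a simplified reading of the format, not Prometheus's own parser, and the name `duration_to_seconds` is hypothetical; it assumes the units ms, s, m, h, d, w, and y, with a year counted as 365 days.

```python
import re

# Seconds per unit; a year is treated as 365 days in this sketch.
_UNIT_SECONDS = {
    "ms": 0.001,
    "s": 1,
    "m": 60,
    "h": 3600,
    "d": 86400,
    "w": 604800,
    "y": 31536000,
}
# "ms" is listed first so that it is not read as "m" followed by "s".
_PART = re.compile(r"(\d+)(ms|s|m|h|d|w|y)")
_WHOLE = re.compile(r"(?:\d+(?:ms|s|m|h|d|w|y))+")


def duration_to_seconds(text):
    """Convert a duration string such as "15s" or "1h30m" to seconds."""
    if not _WHOLE.fullmatch(text):
        raise ValueError(f"not a duration string: {text!r}")
    return sum(int(number) * _UNIT_SECONDS[unit] for number, unit in _PART.findall(text))


if __name__ == "__main__":
    interval = duration_to_seconds("15s")
    retention = duration_to_seconds("15d")
    # Upper bound on samples kept per series at one sample per scrape.
    print(retention // interval)  # 86400
```

With a 15-second interval and 15 days of retention, each series holds at most 86,400 samples, which gives a rough feel for how the two settings interact.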

Run

./prometheus --config.file=prometheus.yml

Prometheus ships with a built-in web UI; dashboards move to Grafana later in this guide.

Querying metrics

promhttp_metric_handler_requests_total
promhttp_metric_handler_requests_total{code="200"}
count(promhttp_metric_handler_requests_total)
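To make the three queries above concrete, the Python sketch below applies the same ideas to a few lines of metrics text: select all series of a metric, narrow them with a label matcher, and count them. The sample values are invented for illustration, the parser handles only simple lines without escapes or timestamps, and `parse_samples` and `select` are hypothetical helper names, not part of any Prometheus library.

```python
import re

# Invented sample values in the text format served at /metrics.
SAMPLE_TEXT = """\
# TYPE promhttp_metric_handler_requests_total counter
promhttp_metric_handler_requests_total{code="200"} 128
promhttp_metric_handler_requests_total{code="500"} 0
promhttp_metric_handler_requests_total{code="503"} 0
"""

_LINE = re.compile(r"^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$")
_LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_samples(text):
    """Return a list of (metric_name, labels_dict, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _LINE.match(line)
        if match is None:
            continue  # anything this simple parser does not understand
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def select(samples, name, **matchers):
    """Keep series with this name whose labels equal every matcher."""
    return [
        sample
        for sample in samples
        if sample[0] == name
        and all(sample[1].get(key) == value for key, value in matchers.items())
    ]


if __name__ == "__main__":
    samples = parse_samples(SAMPLE_TEXT)
    metric = "promhttp_metric_handler_requests_total"
    print(select(samples, metric))              # all three series
    print(select(samples, metric, code="200"))  # only the code="200" series
    print(len(select(samples, metric)))         # like count(...): 3
```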

Query language reference:

https://prometheus.io/docs/prometheus/latest/querying/basics/

Graph mode

rate(promhttp_metric_handler_requests_total{code="200"}[1m])
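The idea behind `rate(...[1m])` is a per-second average increase of a counter over the window. The Python sketch below is a deliberately simplified version: it divides the total increase by the time between the first and last sample and treats any drop as a counter reset. Prometheus's real `rate()` also extrapolates to the window boundaries, so its results can differ slightly. `simple_rate` is a hypothetical name and the sample values are invented.

```python
def simple_rate(samples):
    """Per-second average increase of a counter.

    samples is a list of (timestamp_seconds, counter_value), oldest first.
    Returns None when there are not enough samples to compute a rate.
    """
    if len(samples) < 2:
        return None
    increase = 0.0
    for (_, previous), (_, current) in zip(samples, samples[1:]):
        if current >= previous:
            increase += current - previous
        else:
            # The counter dropped, so assume it restarted from zero.
            increase += current
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else None


if __name__ == "__main__":
    # One sample every 15 seconds; the counter grows by 1 between samples.
    window = [(0, 100), (15, 101), (30, 102), (45, 103), (60, 104)]
    print(simple_rate(window))  # 4 / 60, roughly 0.067 per second
```

The example mirrors what one would expect for the `code="200"` series when a single Prometheus server scrapes its own /metrics endpoint every 15 seconds: about one request per 15 seconds.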

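Monitoring other targets

node_exporter: monitoring a node

Purpose

Collects host-level metrics from the server and exposes them for Prometheus to scrape.

Installation

nohup ./node_exporter --web.listen-address=":9091" & # the default port is 9100

Test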

curl http://localhost:9091/metrics

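Connecting to Prometheus

Edit the Prometheus configuration file:

vim prometheus.yml
# add the node job under scrape_configs
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: "node"
    static_configs:
      - targets: ['localhost:9091']

Restart (or reload) Prometheus so that it picks up the change, then open the UI again to confirm that the node target is being scraped.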

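Exploring Node Exporter metrics through the Prometheus expression browser

Now that Prometheus is scraping metrics from a running Node Exporter instance, you can explore those metrics using the Prometheus UI (also known as the expression browser). Navigate to localhost:9090/graph in your browser and use the main expression bar at the top of the page to enter expressions.

Metrics specific to the Node Exporter are prefixed with node_ and include metrics such as node_cpu_seconds_total and node_exporter_build_info.

Some example metrics:

Metric	Meaning
rate(node_cpu_seconds_total{mode="system"}[1m])	The average amount of CPU time spent in system mode, per second, over the last minute (in seconds)
node_filesystem_avail_bytes	The filesystem space available to non-root users (in bytes)
rate(node_network_receive_bytes_total[1m])	The average network traffic received, per second, over the last minute (in bytes)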
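Expressions like the ones in the table can also be sent to the Prometheus HTTP API instead of the expression bar. The Python sketch below builds the URL for an instant query against `/api/v1/query` and unpacks a vector result. The helper names are hypothetical, the base URL assumes the local server from this guide, and `run_query` only works when that server is actually running.

```python
import json
import urllib.parse
import urllib.request


def instant_query_url(base_url, expr):
    """URL for an instant query; the expression goes in the `query` parameter."""
    query_string = urllib.parse.urlencode({"query": expr})
    return f"{base_url.rstrip('/')}/api/v1/query?{query_string}"


def vector_values(payload):
    """Turn a decoded instant-vector response into (labels, value) pairs."""
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector, got {data['resultType']}")
    # Each item's value is [timestamp, "value as a string"].
    return [(item["metric"], float(item["value"][1])) for item in data["result"]]


def run_query(base_url, expr):
    """Send the query to a running server (assumed reachable at base_url)."""
    with urllib.request.urlopen(instant_query_url(base_url, expr), timeout=10) as response:
        return vector_values(json.load(response))


if __name__ == "__main__":
    expr = 'rate(node_cpu_seconds_total{mode="system"}[1m])'
    print(instant_query_url("http://localhost:9090", expr))
```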

GPU monitoring node

nvidia_gpu_prometheus_exporter

Build and install (requires a Go toolchain)

go get github.com/mindprince/nvidia_gpu_prometheus_exporter

If Go is not installed, install it first using one of the following:

sudo snap install go         # version 1.17.8, or
sudo apt  install golang-go
sudo apt  install gccgo-go

The built binary usually ends up in ~/go/bin/.

Run

nohup ./nvidia_gpu_prometheus_exporter &

Configuration

Add the following to scrape_configs in the Prometheus configuration file:

  # GPU exporter
  - job_name: "gpu"
    static_configs:
      - targets: ['localhost:9445']

Visualization dashboards: Grafana

Installation

https://grafana.com/grafana/download

sudo apt-get install -y adduser libfontconfig1
wget https://dl.grafana.com/enterprise/release/grafana-enterprise_8.4.4_amd64.deb
sudo dpkg -i grafana-enterprise_8.4.4_amd64.deb

When the installation finishes, the output looks like this:

Adding new user `grafana' (UID 111) with group `grafana' ...
Not creating home directory `/usr/share/grafana'.
### NOT starting on installation, please execute the following statements to configure grafana to start automatically using systemd
 sudo /bin/systemctl daemon-reload
 sudo /bin/systemctl enable grafana-server
### You can start grafana-server by executing
 sudo /bin/systemctl start grafana-server
Processing triggers for systemd (237-3ubuntu10.49) ...
Processing triggers for ureadahead (0.100.0-21) ...

Configuration

Configuration file: /etc/grafana/grafana.ini

vim /etc/grafana/grafana.ini

Grafana listens on port 3000 by default; the port can be changed in this file.
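To check which port a given grafana.ini would produce, the Python sketch below reads the `http_port` key from the `[server]` section with the standard `configparser` module. It is a rough approach that assumes the file contains only comments and `key = value` lines; `grafana_http_port` is a hypothetical helper, and the artificial `[__top__]` section exists only so that keys appearing before the first section header do not trip the parser.

```python
import configparser


def grafana_http_port(ini_text, default=3000):
    """Return the configured HTTP port, or the default when it is not set.

    Lines starting with ';' or '#' are comments, so a commented-out
    ';http_port = 3000' counts as "not set".
    """
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    # Prepend a dummy section so settings above the first header still parse.
    parser.read_string("[__top__]\n" + ini_text)
    return parser.getint("server", "http_port", fallback=default)


if __name__ == "__main__":
    sample = """\
;app_mode = production

[server]
;protocol = http
http_port = 3001
"""
    print(grafana_http_port(sample))  # 3001
```

To read the real file, pass the contents of /etc/grafana/grafana.ini to the function; reading it may require elevated permissions.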

Run

For the first run, start Grafana with systemctl.

Enter the commands:

sudo /bin/systemctl daemon-reload
sudo /bin/systemctl enable grafana-server
sudo /bin/systemctl start grafana-server

Output

Synchronizing state of grafana-server.service with SysV service script with /lib/systemd/systemd-sysv-install.
Executing: /lib/systemd/systemd-sysv-install enable grafana-server
Created symlink /etc/systemd/system/multi-user.target.wants/grafana-server.service → /usr/lib/systemd/system/grafana-server.service.

Check the port
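One way to confirm that a service is listening is to try a TCP connection to its port. The Python sketch below does that with the standard `socket` module; `port_is_open` is a hypothetical helper, and port 3000 is simply Grafana's default from the configuration step.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat all as "not listening".
        return False


if __name__ == "__main__":
    print("grafana listening:", port_is_open("127.0.0.1", 3000))
```

A successful connection only shows that something accepts connections on that port; it does not prove the listener is Grafana.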

Connecting to Prometheus

Reference: https://prometheus.io/docs/visualization/grafana/#installing

Import a ready-made dashboard

Link: https://pan.baidu.com/s/1-QRttUDDly7XfZNc9dPfOQ
Extraction code: 0e2u
The end result is a Grafana dashboard backed by the Prometheus data.


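Alerting

Alerting with Prometheus is split into two parts. Alerting rules in the Prometheus server send alerts to an Alertmanager. The Alertmanager then manages those alerts, including silencing, inhibition, aggregation, and sending out notifications via methods such as email, on-call notification systems, and chat platforms.

The main steps to set up alerting and notifications are:

Set up and configure the Alertmanager
Configure Prometheus to talk to the Alertmanager
Create alerting rules in Prometheus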
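As a rough picture of what an alerting rule does on the Prometheus side, the Python sketch below walks a series of values through the inactive, pending, and firing states. It is a toy model rather than Prometheus's rule engine: the rule is reduced to "value above a threshold", the hold time is expressed as a number of evaluation cycles instead of a duration, and `alert_states` is a hypothetical name.

```python
def alert_states(values, threshold, for_evaluations):
    """Return the alert state after each rule evaluation.

    values: the expression's value at each evaluation, oldest first.
    threshold: the alert condition is `value > threshold`.
    for_evaluations: how many evaluation intervals the condition must
        hold before the alert moves from pending to firing.
    """
    states = []
    consecutive_active = 0
    for value in values:
        if value > threshold:
            consecutive_active += 1
            if consecutive_active > for_evaluations:
                states.append("firing")
            else:
                states.append("pending")
        else:
            consecutive_active = 0  # the condition cleared, start over
            states.append("inactive")
    return states


if __name__ == "__main__":
    # Invented CPU usage ratios, one per evaluation interval.
    usage = [0.5, 0.9, 0.95, 0.97, 0.4]
    print(alert_states(usage, threshold=0.8, for_evaluations=2))
    # ['inactive', 'pending', 'pending', 'firing', 'inactive']
```

Only alerts that reach the firing state are handed to the Alertmanager, which then decides whether and how to notify.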

Alertmanager

https://prometheus.io/docs/alerting/latest/alertmanager/
