Using VictoriaMetrics vmalert

This post walks through using vmalert, mainly to test how the various components integrate.

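Environment setup

Note that the environment also includes vmauth, vmagent, and several other VictoriaMetrics components, so it is essentially a fairly complete Prometheus-compatible stack.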

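  • docker-compose file

    Note: vmalert currently fails with an error when it queries through vmauth, apparently an encoding problem, so the compose file below points vmalert directly at vmselect and leaves the vmauth datasource flags commented out.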

version:  "3"
services: 
  vmstorage:
    image: victoriametrics/vmstorage
    ports:
      - 8482:8482
      - 8400:8400
      - 8401:8401
    volumes:
      - ./strgdata:/storage
    command:
      - '--storageDataPath=/storage'
  vmagent:
    image: victoriametrics/vmagent
    volumes: 
    - ./prometheus.yml:/etc/prometheus/prometheus.yml
    ports:
    - 8429:8429
    command:  
    - -promscrape.config=/etc/prometheus/prometheus.yml 
    - -remoteWrite.basicAuth.username=dalong-insert-account-1
    - -remoteWrite.basicAuth.password=dalong
    - -remoteWrite.url=http://vmauth:8427
  alertmanager:
    image: prom/alertmanager:latest
    volumes: 
    - "./alertmanager.yaml:/etc/alertmanager.yaml"
    command: 
    - --config.file=/etc/alertmanager.yaml
    - --storage.path=/tmp/alertmanager1
    ports:
    - 9093:9093
  vmalert:
    image: victoriametrics/vmalert
    volumes: 
    - "./alert.rules:/etc/victoriametrics/alert.rules"
    ports:
    - 8880:8880
    command: 
    - -rule=/etc/victoriametrics/alert.rules
    - -datasource.url=http://vmselect:8481/select/1/prometheus
    # - -datasource.url=http://vmauth:8427
    # - -datasource.basicAuth.password=dalong
    # - -datasource.basicAuth.username=dalong-select-account-1
    - -notifier.url=http://alertmanager:9093
  vmauth:
    image: victoriametrics/vmauth
    volumes: 
    - "./config.yaml:/etc/victoriametrics/config.yaml"
    command:
      - '-auth.config=/etc/victoriametrics/config.yaml'
    ports:
      - 8427:8427
  vminsert:
    image: victoriametrics/vminsert
    command:
      - '--storageNode=vmstorage:8400'
    ports:
      - 8480:8480
  vmselect:
    image: victoriametrics/vmselect
    command:
      - '--storageNode=vmstorage:8401'
    ports:
      - 8481:8481
  grafana:
    image: grafana/grafana
    ports:
      - 3000:3000
  • Configuration
    vmauth configuration:
 
users:
- username: "dalong-select-account-1"
  password: "dalong"
  url_prefix: "http://vmselect:8481/select/1/prometheus"
- username: "dalong-insert-account-1"
  password: "dalong"
  url_prefix: "http://vminsert:8480/insert/1/prometheus"
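
To make the vmauth config above more concrete, here is a minimal Python sketch of the routing idea: the Basic auth credentials select a user entry, and the request path is appended to that user's url_prefix. The names `USERS`, `basic_auth_header`, and `resolve_backend` are hypothetical helpers for illustration; this is a simplified model, not vmauth's actual code.

```python
import base64

# Mirrors the two user entries of the vmauth config above.
USERS = [
    {"username": "dalong-select-account-1", "password": "dalong",
     "url_prefix": "http://vmselect:8481/select/1/prometheus"},
    {"username": "dalong-insert-account-1", "password": "dalong",
     "url_prefix": "http://vminsert:8480/insert/1/prometheus"},
]


def basic_auth_header(username, password):
    """Build the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return f"Basic {token}"


def resolve_backend(auth_header, request_path, users=USERS):
    """Return the backend URL for a request, or None for unknown credentials.

    Hypothetical helper: a simplified model of the proxying described above.
    """
    for user in users:
        if auth_header == basic_auth_header(user["username"], user["password"]):
            # The incoming request path is appended to the user's url_prefix.
            return user["url_prefix"] + request_path
    return None
```

Under this model, a query sent to vmauth at /api/v1/query with the select account ends up at vmselect under /select/1/prometheus/api/v1/query, which is the same prefix vmalert uses directly in the compose file.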

vmagent configuration (this is simply a Prometheus configuration)

global:
  scrape_interval:     1s
  evaluation_interval: 1s
scrape_configs:
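  # No prometheus service is defined in the docker-compose file above,
  # so this target is expected to be reported as down.
  - job_name: 'prometheus'
    static_configs:
      - targets: ['prometheus:9090']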
  - job_name: 'vminsert'
    static_configs:
      - targets: ['vminsert:8480']
  - job_name: 'vmselect'
    static_configs:
      - targets: ['vmselect:8481']
  - job_name: 'vmstorage'
    static_configs:
      - targets: ['vmstorage:8482']

vmalert configuration (the alert.rules file, mainly for testing)

groups:
  - name: groupGorSingleAlert
    rules:
      - alert: VMRows
        for: 10s
        expr: vm_rows > 0
        labels:
          label: bar
          host: "{{ $labels.instance }}"
        annotations:
          summary: "{{ $value|humanize }}"
          description: "{{$labels}}"
  - name: TestGroup
    rules:
      - alert: Conns
        expr: sum(vm_tcplistener_conns) by(instance) > 1
        annotations:
          summary: "Too high connection number for {{$labels.instance}}"
          description: "It is {{ $value }} connections for {{$labels.instance}}"
      - alert: ExampleAlertAlwaysFiring
        expr: sum by(job)
          (up == 1)
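
The `for: 10s` field on the VMRows rule can be illustrated with a small Python sketch. It assumes Prometheus-style semantics (an alert is pending once its expression becomes true and fires once it has stayed true for at least the `for` duration); `alert_states` is a hypothetical helper, not part of vmalert.

```python
def alert_states(condition_by_time, hold_seconds):
    """Return a (timestamp, state) pair for each rule evaluation.

    condition_by_time: (timestamp_seconds, bool) pairs in time order, one per
    evaluation, where the bool says whether the rule expression returned data.
    hold_seconds: the rule's `for` duration in seconds (0 if `for` is absent).
    """
    states = []
    active_since = None  # timestamp at which the expression first became true
    for ts, is_true in condition_by_time:
        if not is_true:
            active_since = None
            states.append((ts, "inactive"))
            continue
        if active_since is None:
            active_since = ts
        state = "firing" if ts - active_since >= hold_seconds else "pending"
        states.append((ts, state))
    return states
```

With evaluations every 5 seconds and the expression true from t=5 to t=15, a 10-second hold gives pending at t=5 and t=10 and firing at t=15. A rule without `for`, such as Conns, fires on the first evaluation where its expression is true.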

alertmanager configuration

global:
  resolve_timeout: 30s
route:
  group_by: ["alertname"]
  group_wait: 5s
  group_interval: 10s
  repeat_interval: 999h
  receiver: "default"
  routes:
    - receiver: "default"
      group_by: []
      match_re:
        alertname: .*
      continue: true
    - receiver: "pagination"
      group_by: ["alertname", "instance"]
      match_re:
        alertname: Pagination Test
      continue: false
    - receiver: "by-cluster-service"
      group_by: ["alertname", "cluster", "service"]
      match_re:
        alertname: .*
      continue: true
    - receiver: "by-name"
      group_by: [alertname]
      match_re:
        alertname: .*
      continue: true
    - receiver: "by-cluster"
      group_by: [cluster]
      match_re:
        alertname: .*
      continue: true
inhibit_rules:
  - source_match:
      severity: "critical"
    target_match:
      severity: "warning"
    # Apply inhibition if the alertname and cluster is the same in both
    equal: ["alertname", "cluster"]
receivers:
  - name: "default"
  - name: "pagination"
  - name: "by-cluster-service"
  - name: "by-name"
  - name: "by-cluster"
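
The interaction of `match_re` and `continue` in the route tree above is easier to follow with a small simulation. The following Python sketch is a simplified model of how child routes are walked in order; `CHILD_ROUTES`, `route_matches`, and `receivers_for` are hypothetical names, and the anchored-regex and empty-label handling are assumptions about Alertmanager's matching rules.

```python
import re

# Child routes of the root route, in the order they appear in the config above.
CHILD_ROUTES = [
    {"receiver": "default", "match_re": {"alertname": ".*"}, "continue": True},
    {"receiver": "pagination", "match_re": {"alertname": "Pagination Test"}, "continue": False},
    {"receiver": "by-cluster-service", "match_re": {"alertname": ".*"}, "continue": True},
    {"receiver": "by-name", "match_re": {"alertname": ".*"}, "continue": True},
    {"receiver": "by-cluster", "match_re": {"alertname": ".*"}, "continue": True},
]
ROOT_RECEIVER = "default"


def route_matches(route, labels):
    """Check every match_re entry; patterns are anchored, missing labels count as ''."""
    return all(
        re.fullmatch(pattern, labels.get(name, "")) is not None
        for name, pattern in route["match_re"].items()
    )


def receivers_for(labels, routes=CHILD_ROUTES, root_receiver=ROOT_RECEIVER):
    """Return the receivers an alert with the given labels would be routed to."""
    matched = []
    for route in routes:
        if route_matches(route, labels):
            matched.append(route["receiver"])
            if not route["continue"]:
                break  # continue: false stops the walk at the first match
    # If no child route matched, the root route's receiver is used.
    return matched or [root_receiver]
```

Under this model, an alert named VMRows reaches default, by-cluster-service, by-name, and by-cluster, while an alert named "Pagination Test" stops at pagination because that route sets `continue: false`. None of the receivers in the config defines a notification target, so the setup is only useful for watching the routing in the Alertmanager UI.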
  • Supported command-line flags
vmalert-20200521-152717-tags-v1.35.6-cluster-0-gdcbdc009f
Usage of /vmalert-prod:
  -datasource.basicAuth.password string
      Optional basic auth password for -datasource.url
  -datasource.basicAuth.username string
      Optional basic auth username for -datasource.url
  -datasource.url string
      Victoria Metrics or VMSelect url. Required parameter. E.g. http://127.0.0.1:8428
  -enableTCP6
      Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP is used
  -envflag.enable
      Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set
  -envflag.prefix string
      Prefix for environment variables if -envflag.enable is set
  -evaluationInterval duration
      How often to evaluate the rules. Default 1m (default 1m0s)
  -external.url string
      External URL is used as alert's source for sent alerts to the notifier
  -http.disableResponseCompression
      Disable compression of HTTP responses for saving CPU resources. By default compression is enabled to save network bandwidth
  -http.maxGracefulShutdownDuration duration
      The maximum duration for graceful shutdown of HTTP server. Highly loaded server may require increased value for graceful shutdown (default 7s)
  -http.pathPrefix string
      An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus
  -http.shutdownDelay duration
      Optional delay before http server shutdown. During this dealy the servier returns non-OK responses from /health page, so load balancers can route new requests to other servers
  -httpListenAddr string
      Address to listen for http connections (default ":8880")
  -loggerFormat string
      Format for logs. Possible values: default, json (default "default")
  -loggerLevel string
      Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO")
  -loggerOutput string
      Output for the logs. Supported values: stderr, stdout (default "stderr")
  -memory.allowedPercent float
      Allowed percent of system memory VictoriaMetrics caches may occupy. Too low value may increase cache miss rate, which usually results in higher CPU and disk IO usage. Too high value may evict too much data from OS page cache, which will result in higher disk IO usage (default 60)
  -notifier.url string
      Prometheus alertmanager URL. Required parameter. e.g. http://127.0.0.1:9093
  -remoteRead.basicAuth.password string
      Optional basic auth password for -remoteRead.url
  -remoteRead.basicAuth.username string
      Optional basic auth username for -remoteRead.url
  -remoteRead.lookback duration
      Lookback defines how far to look into past for alerts timeseries. For example, if lookback=1h then range from now() to now()-1h will be scanned. (default 1h0m0s)
  -remoteRead.url vmalert
      Optional URL to Victoria Metrics or VMSelect that will be used to restore alerts state. This configuration makes sense only if vmalert was configured with `remoteWrite.url` before and has been successfully persisted its state. E.g. http://127.0.0.1:8428
  -remoteWrite.basicAuth.password string
      Optional basic auth password for -remoteWrite.url
  -remoteWrite.basicAuth.username string
      Optional basic auth username for -remoteWrite.url
  -remoteWrite.maxQueueSize int
      Defines the max number of pending datapoints to remote write endpoint (default 10000)
  -remoteWrite.url string
      Optional URL to Victoria Metrics or VMInsert where to persist alerts state in form of timeseries. E.g. http://127.0.0.1:8428
  -rule value
      Path to the file with alert rules. 
      Supports patterns. Flag can be specified multiple times. 
      Examples:
       -rule /path/to/file. Path to a single file with alerting rules
       -rule dir/*.yaml -rule /*.yaml. Relative path to all .yaml files in "dir" folder, 
      absolute path to all .yaml files in root.
  -rule.validateTemplates
      Indicates to validate annotation and label templates (default true)
  -version
      Show VictoriaMetrics version
  • Startup
docker-compose up -d
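
Once the containers are up, a quick way to see whether every component answers is to probe its HTTP port from the host. The Python sketch below uses the host ports published in the compose file; the /health path for the VictoriaMetrics components and /-/healthy for Alertmanager are assumptions about the images in use, and `check` and `check_all` are hypothetical helpers.

```python
from urllib.error import URLError
from urllib.request import urlopen

# Host ports as published in the docker-compose file above.
# The health paths are assumptions, adjust them if your images differ.
HEALTH_URLS = {
    "vmstorage": "http://localhost:8482/health",
    "vminsert": "http://localhost:8480/health",
    "vmselect": "http://localhost:8481/health",
    "vmagent": "http://localhost:8429/health",
    "vmauth": "http://localhost:8427/health",
    "vmalert": "http://localhost:8880/health",
    "alertmanager": "http://localhost:9093/-/healthy",
}


def check(url, opener=urlopen, timeout=2):
    """Return True if the URL answers with HTTP 200, False on any error."""
    try:
        with opener(url, timeout=timeout) as resp:
            return resp.status == 200
    except (URLError, OSError):
        return False


def check_all(urls=HEALTH_URLS, opener=urlopen):
    """Probe every component and return a name -> bool mapping."""
    return {name: check(url, opener) for name, url in urls.items()}
```

Calling `check_all()` from a Python shell on the Docker host returns a dictionary such as `{"vmstorage": True, ...}`; any False entry is a good place to start looking at `docker-compose logs`.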

Result

[Image: VictoriaMetrics vmalert usage]

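Notes

Error message when vmalert queries through vmauth (it looks like an encoding problem: 0x1f is the first byte of the gzip magic number, which suggests that a gzip-compressed response body was parsed as JSON without being decompressed)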

error   VictoriaMetrics/app/vmalert/group.go:148        failed to execute rule "TestGroup"."ExampleAlertAlwaysFiring": failed to execute query "sum by(job) (up == 1)": error parsing metrics for http://vmauth:8427/api/v1/query?query=sum+by%28job%29+%28up+%3D%3D+1%29:invalid character '\x1f' looking for beginning of value
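
The '\x1f' in that message is consistent with a gzip-compressed body reaching the JSON parser. The Python sketch below only illustrates that effect with a made-up response body; it does not confirm where in the vmalert/vmauth chain the compression is left undecoded, and `parse_response` is a hypothetical helper.

```python
import gzip
import json

# A made-up query response, compressed the way an HTTP server might send it.
body = json.dumps({"status": "success", "data": {"resultType": "vector", "result": []}}).encode()
compressed = gzip.compress(body)


def parse_response(raw):
    """Parse a JSON body, decompressing it first if it starts with the gzip magic bytes."""
    if raw[:2] == b"\x1f\x8b":
        raw = gzip.decompress(raw)
    return json.loads(raw)


# The first byte of any gzip stream is 0x1f, the character named in the error.
first_byte = compressed[0]

try:
    json.loads(compressed)  # parsing the compressed bytes directly fails
    parsed_directly = True
except ValueError:
    parsed_directly = False
```

Here `first_byte` equals 0x1f and `parsed_directly` ends up False, while `parse_response(compressed)` returns the original document. If the same thing is happening in this setup, a workaround to try would be making sure the response reaches vmalert uncompressed, or using a vmalert version that handles compressed responses.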

References

https://github.com/VictoriaMetrics/VictoriaMetrics/wiki/vmalert
https://github.com/prometheus/alertmanager