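prometheus-operator simplifies deploying monitoring on Kubernetes, but it has drawbacks. It bakes part of the Prometheus configuration into CRDs, which makes it harder for beginners: some configuration parameters cannot be changed through a ConfigMap and can only be changed through the CRD objects, and because those are fixed by the deployment manifests they are effectively impossible to edit.
At present the best way to change the configuration is through Helm, which exposes the most complete set of configurable parameters, although it has pitfalls of its own. values.yaml lets you set the Prometheus installation parameters flexibly. The installation procedure is as follows:
Note: from inside China, downloads from the various Helm sources often hang. Setting DNS to Alibaba's public DNS servers first makes access to overseas hosts faster.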
vim /etc/resolv.conf
nameserver 223.5.5.5
nameserver 223.6.6.6
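The same change can be made without opening an editor. The following is a minimal sketch, assuming /etc/resolv.conf is not regenerated by NetworkManager or systemd-resolved (which would overwrite it); set_dns is a hypothetical helper name, not part of any tool.

```shell
#!/bin/sh
# set_dns FILE: back up FILE, then put the two Alibaba DNS servers first.
# set_dns is a hypothetical helper name used only for this sketch.
set_dns() {
    file="$1"
    # Keep a backup so the change can be reverted.
    if [ -f "$file" ]; then cp "$file" "$file.bak"; fi
    {
        echo "nameserver 223.5.5.5"
        echo "nameserver 223.6.6.6"
        # Preserve the other settings (search, options, ...), drop old nameservers.
        if [ -f "$file.bak" ]; then grep -v '^nameserver' "$file.bak" || true; fi
    } > "$file.tmp"
    # Write through the existing path so a symlinked resolv.conf stays a symlink.
    cat "$file.tmp" > "$file" && rm -f "$file.tmp"
}

# Usage (as root): set_dns /etc/resolv.conf
```

The backup is left next to the file as resolv.conf.bak, so the original resolvers can be restored once the downloads are finished.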
Install Helm
wget https://storage.googleapis.com/kubernetes-helm/helm-v2.14.0-linux-amd64.tar.gz
Unpack it (the file name must match the version downloaded above): tar -zxvf helm-v2.14.0-linux-amd64.tar.gz
mv linux-amd64/helm /usr/local/bin/helm
helm version
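Deriving the tarball name from a single version variable keeps the download and unpack steps consistent. This is a sketch only: helm_url and install_helm are hypothetical helper names, and the bucket URL is the one used in the wget command above.

```shell
#!/bin/sh
# helm_url VERSION: print the download URL of the Helm v2 linux-amd64 tarball,
# using the same bucket as the wget command above.
helm_url() {
    echo "https://storage.googleapis.com/kubernetes-helm/helm-$1-linux-amd64.tar.gz"
}

# install_helm VERSION: download, unpack and install the Helm client.
# Hypothetical helper; needs network access and root privileges.
install_helm() {
    url=$(helm_url "$1")
    tarball=${url##*/}    # file name taken from the URL, so the two always match
    wget "$url" &&
    tar -zxvf "$tarball" &&
    mv linux-amd64/helm /usr/local/bin/helm &&
    helm version --client
}

# Usage: install_helm v2.14.0
```

helm version --client prints only the client version, which is useful at this point because Tiller is not installed yet.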
Switch to the Alibaba Cloud chart mirror
Install socat on all nodes: yum install socat
rm -rf /root/.helm/*
helm init --client-only --stable-repo-url https://aliacs-app-catalog.oss-cn-hangzhou.aliyuncs.com/charts/
helm init --client-only --stable-repo-url https://kubernetes-charts-incubator.storage.googleapis.com/charts/
helm repo add incubator https://aliacs-app-catalog.oss-cn-hangzhou.aliyuncs.com/charts-incubator/
helm repo update
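Install the Tiller server (if the image cannot be pulled, use a mirror inside China)
1. docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/tiller:v2.10.0
2. Configure permissions, otherwise Helm reports errors such as: no release found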
kubectl create serviceaccount --namespace kube-system tiller
kubectl create clusterrolebinding tiller-cluster-rule --clusterrole=cluster-admin --serviceaccount=kube-system:tiller
kubectl patch deploy --namespace kube-system tiller-deploy -p '{"spec":{"template":{"spec":{"serviceAccount":"tiller"}}}}' (this patch may have no effect on newer versions)
3. helm init -i registry.cn-hangzhou.aliyuncs.com/google_containers/tiller:v2.10.0
or
helm init --service-account tiller --upgrade -i registry.cn-hangzhou.aliyuncs.com/google_containers/tiller:v2.10.0 --skip-refresh
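4. Pin the Tiller port on every node
Change the Tiller Service to type NodePort. Note that 44134 is outside the default NodePort range (30000-32767), so the API server's --service-node-port-range must allow it:
ports:
- nodePort: 44134
  port: 44134
  protocol: TCP
type: NodePort
Point the Helm client at Tiller's NodePort address: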
export HELM_HOST=172.20.0.101:44134
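Commonly used helm init options:
--canary-image: install the canary build of Tiller
--tiller-image: install a specific Tiller image
--kube-context: install into the specified Kubernetes cluster (kubeconfig context)
--tiller-namespace: install into the specified namespace
--upgrade: if Tiller is already installed, upgrade its image
--service-account: the ServiceAccount that Tiller runs as; it must be created in the cluster beforehand and granted the appropriate RBAC permissions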
helm install . --dry-run --debug   # render and print the YAML without installing
# To customize the Tiller deployment, print its manifests as YAML and edit them
helm init --output yaml
Download the charts source and find the chart versions
wget https://github.com/helm/charts/archive/master.zip
unzip master.zip
cd charts-master/stable/prometheus-operator
Running the helm install below fails because the grafana, kube-state-metrics and prometheus-node-exporter charts cannot be found:
[root@k8s-test-master-1 prometheus-operator]# helm install . -n test
Error: found in requirements.yaml, but missing in charts/ directory: kube-state-metrics, prometheus-node-exporter, grafana
First open the dependency file and find the dependency versions.
vim requirements.yaml   # look up the versions of grafana, kube-state-metrics and prometheus-node-exporter
dependencies:
  - name: kube-state-metrics
    version: 2.0.*
    repository: https://kubernetes-charts.storage.googleapis.com/
    condition: kubeStateMetrics.enabled
  - name: prometheus-node-exporter
    version: 1.5.*
    repository: https://kubernetes-charts.storage.googleapis.com/
    condition: nodeExporter.enabled
  - name: grafana
    version: 3.7.*
    repository: https://kubernetes-charts.storage.googleapis.com/
    condition: grafana.enabled
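The version of each dependency shown here determines the file name of its chart tgz package.
Download the index file of the chart repository and look up the tgz URLs in it: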
wget https://kubernetes-charts.storage.googleapis.com/index.yaml
[root@k8s-test-master-1 tmp]# grep grafana-3.7*.tgz index.yaml.1
- https://kubernetes-charts.storage.googleapis.com/grafana-3.7.2.tgz
This finds the tgz for grafana 3.7.*; look up the other packages the same way.
Download the tgz packages:
wget https://kubernetes-charts.storage.googleapis.com/grafana-3.7.2.tgz
wget https://kubernetes-charts.storage.googleapis.com/prometheus-node-exporter-1.5.2.tgz
wget https://kubernetes-charts.storage.googleapis.com/kube-state-metrics-2.0.0.tgz
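The lookup in the index file can be scripted so that the version constraint from requirements.yaml is matched literally (an unquoted pattern such as grafana-3.7*.tgz is open to shell globbing and treats the dots as wildcards). A minimal sketch, assuming the index lists one URL per line as in the output above; chart_urls is a hypothetical helper name.

```shell
#!/bin/sh
# chart_urls INDEX NAME PREFIX: list the tgz URLs in INDEX for chart NAME whose
# version starts with PREFIX (for example "3.7." for the constraint 3.7.*).
# chart_urls is a hypothetical helper name used only for this sketch.
chart_urls() {
    index="$1"; name="$2"; prefix="$3"
    # Escape the dots so that "3.7." is matched literally by grep.
    escaped=$(printf '%s' "$prefix" | sed 's/\./\\./g')
    # The leading "/" before NAME keeps other charts whose names end in NAME out.
    grep -o "https\{0,1\}://[^ ]*/${name}-${escaped}[^/ ]*\.tgz" "$index" | sort -u
}

# Usage:
#   chart_urls index.yaml grafana 3.7.
#   chart_urls index.yaml kube-state-metrics 2.0.
#   chart_urls index.yaml prometheus-node-exporter 1.5.
```

If several patch releases match, all of them are printed; pick the one you want and pass it to wget.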
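The charts inside the tgz packages can be edited to use local images or to change other parameters.
Harbor now supports chart repositories, so the tgz packages can be uploaded directly to Harbor.
When installing Harbor, remember to enable chart support with --with-chartmuseum: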
./install.sh --with-chartmuseum
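Create a repository for Helm charts in Harbor (named helm in this example)
Upload the tgz packages through the web UI
Add the Harbor repository to the Helm repo list: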
helm repo add --username=admin --password=Admin_dsfdsd harbor-repo http://harbor-test.ai.com/chartrepo/helm
Check that the repository was added:
[root@k8s-test-master-1 tmp]# helm repo list
NAME         URL
stable       https://kubernetes-charts.storage.googleapis.com
local        http://127.0.0.1:8879/charts
apphub       https://apphub.aliyuncs.com
harbor-repo  http://harbor-test.ai.com/chartrepo/helm
In requirements.yaml under prometheus-operator, change the repository of each dependency to the local repository address:
dependencies:
  - name: kube-state-metrics
    version: 2.0.*
    #repository: https://kubernetes-charts.storage.googleapis.com/
    repository: http://harbor-test.ai.com/chartrepo/helm/
    condition: kubeStateMetrics.enabled
  - name: prometheus-node-exporter
    version: 1.5.*
    repository: http://harbor-test.ai.com/chartrepo/helm/
    condition: nodeExporter.enabled
  - name: grafana
    version: 3.7.*
    repository: http://harbor-test.ai.com/chartrepo/helm/
    condition: grafana.enabled
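Editing the three repository lines by hand is easy to get wrong, so the substitution can be sketched with sed. This assumes every repository: key in the file should point at the same local repository; use_local_repo is a hypothetical helper name.

```shell
#!/bin/sh
# use_local_repo FILE URL: point every dependency in FILE at URL.
# use_local_repo is a hypothetical helper name used only for this sketch.
use_local_repo() {
    file="$1"; url="$2"
    # Keep the upstream version next to the file for reference.
    cp "$file" "$file.orig"
    # Replace the value of each "repository:" key, preserving its indentation.
    sed "s|^\([[:space:]]*repository:\).*|\1 $url|" "$file.orig" > "$file"
}

# Usage:
#   use_local_repo requirements.yaml http://harbor-test.ai.com/chartrepo/helm/
#   helm dependency update    # fetches the dependency tgz files into charts/
```

After helm dependency update succeeds, the charts/ directory contains the three packages and the "missing in charts/ directory" error no longer appears.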
Customize values.yaml to parameterize the Prometheus deployment
vim values.yaml
# Custom Prometheus image
image:
  #repository: quay.io/prometheus/prometheus
  repository: harbor-test.ai.com/public/prometheus
  tag: v2.10.0
# Custom remote storage
#remoteWrite: []
remoteWrite:
  - url: http://172.20.0.104:9201/write
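Instead of editing the chart's full values.yaml, the two settings can live in a small override file passed with -f. A sketch under one stated assumption: in this chart version both keys are nested under prometheus.prometheusSpec, which should be checked against the chart's own values.yaml. write_overrides and custom-values.yaml are hypothetical names.

```shell
#!/bin/sh
# write_overrides FILE: write only the overridden settings to FILE.
# write_overrides is a hypothetical helper name used only for this sketch.
write_overrides() {
    cat > "$1" <<'EOF'
# Assumption: in this chart version both keys sit under prometheus.prometheusSpec.
prometheus:
  prometheusSpec:
    image:
      repository: harbor-test.ai.com/public/prometheus
      tag: v2.10.0
    remoteWrite:
      - url: http://172.20.0.104:9201/write
EOF
}

# Usage:
#   write_overrides custom-values.yaml
#   helm install --name pro . --namespace monitoring -f custom-values.yaml
```

Helm merges the file given with -f over the chart defaults, so the override file stays short and survives chart upgrades more easily than an edited copy of values.yaml.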
Deploy
kubectl create namespace monitoring
helm install --name pro . --namespace monitoring -f values.yaml
Verify
[root@k8s-test-master-1 prometheus-operator]# helm list --all
NAME  REVISION  UPDATED                  STATUS    CHART                      APP VERSION  NAMESPACE
pro   1         Mon Aug 5 14:19:36 2019  DEPLOYED  prometheus-operator-6.4.0  0.31.1       monitoring
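Check that the deployment parameters took effect: logging in to the container through the dashboard shows that the configured remote storage is in effect.
Access Grafana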
[root@k8s-test-master-1 prometheus-operator]# kubectl get pod -n monitoring -o wide |grep grafana
pro-grafana-548cbf4699-5w6bh 2/2 Running 8 6h6m 10.12.38.9
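Note: the network panels on the Pods dashboard show no data. Changing the rate window from 1m to 5m makes them display. For each of the two queries below, the first line is the original and the second is the modified version: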
sort_desc(sum by (pod_name) (rate(container_network_receive_bytes_total{job="kubelet", cluster="$cluster", namespace="$namespace", pod_name="$pod"}[1m])))
sort_desc(sum by (pod_name) (rate(container_network_receive_bytes_total{job="kubelet", cluster="$cluster", namespace="$namespace", pod_name="$pod"}[5m])))
sort_desc(sum by (pod_name) (rate(container_network_transmit_bytes_total{job="kubelet", cluster="$cluster", namespace="$namespace", pod_name="$pod"}[1m])))
sort_desc(sum by (pod_name) (rate(container_network_transmit_bytes_total{job="kubelet", cluster="$cluster", namespace="$namespace", pod_name="$pod"}[5m])))
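If the dashboard is exported as JSON, the window can be changed in one pass rather than panel by panel. A minimal sketch, assuming each query sits on its own line in the exported file; widen_rate_window and the file names in the usage comment are hypothetical.

```shell
#!/bin/sh
# widen_rate_window FILE: print FILE with [1m] changed to [5m], but only on
# lines that query the container network counters; other panels are untouched.
# widen_rate_window is a hypothetical helper name used only for this sketch.
widen_rate_window() {
    sed -e '/container_network_receive_bytes_total/ s/\[1m\]/[5m]/g' \
        -e '/container_network_transmit_bytes_total/ s/\[1m\]/[5m]/g' "$1"
}

# Usage: widen_rate_window pods.json > pods-5m.json
```

A likely reason the wider window helps is that rate() needs at least two samples inside the window, and with a long scrape interval a 1m window may not contain them; this explanation is an assumption, not something verified on this cluster.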
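Resources deployed by the chart are watched through the operator's CRDs and are recreated continuously, so deleting individual objects is not enough. To clean up a Prometheus deployed with Helm, do the following: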
[root@k8s-test-master-1 tmp]# helm list
NAME  REVISION  UPDATED                  STATUS    CHART                      APP VERSION  NAMESPACE
pro   1         Mon Aug 5 14:19:36 2019  DEPLOYED  prometheus-operator-6.4.0  0.31.1       monitoring
pro is the name of the deployed release.
Delete the Helm release
helm delete pro --purge
Delete the CRDs
kubectl delete crd prometheuses.monitoring.coreos.com
kubectl delete crd prometheusrules.monitoring.coreos.com
kubectl delete crd servicemonitors.monitoring.coreos.com
kubectl delete crd podmonitors.monitoring.coreos.com
kubectl delete crd alertmanagers.monitoring.coreos.com
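The five deletions can be generated from one list, which makes it easy to review the commands before running them. A sketch only: delete_crds_cmds is a hypothetical helper name, and the list contains exactly the CRDs named above.

```shell
#!/bin/sh
# The five CRDs listed above, without their common suffix.
CRDS="prometheuses prometheusrules servicemonitors podmonitors alertmanagers"

# delete_crds_cmds: print one kubectl command per CRD without executing it.
# delete_crds_cmds is a hypothetical helper name used only for this sketch.
delete_crds_cmds() {
    for crd in $CRDS; do
        echo "kubectl delete crd $crd.monitoring.coreos.com"
    done
}

# Usage:
#   delete_crds_cmds         # review the commands
#   delete_crds_cmds | sh    # execute them
```

Deleting a CRD also deletes every custom resource of that kind in the cluster, so run this only after helm delete pro --purge and only if no other release depends on these CRDs.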