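Application monitoring here mostly means monitoring at the Java JVM level, because most of the middleware our company runs is based on Tomcat, Spring Boot, and Spring Cloud, and WebLogic must of course be supported too. In the existing Kubernetes tooling, monitoring is mainly handled by components such as cAdvisor and Heapster, which collect the memory, CPU, and network usage of each Pod. For a deeper look at the application running inside a Pod, however, there is essentially no default solution.
So which metrics does application monitoring actually cover? The list I put together roughly includes: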
- JVM Heap
- JVM Non Heap Memory
- GC
- Thread monitoring
- DataSource
- Transactions Per Second
- Throughput, and so on
This post is an experiment based on JMX + InfluxDB + Grafana.
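- 1. Overall architecture
The main technical points are:
- Enable the JMX monitoring options in Tomcat, which exposes an additional port, 35135
- Use jmxtrans, driven by JSON files, to fetch the required JMX data and write it to InfluxDB
- Use Grafana with InfluxDB as the data source to display the data.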
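To make the jmxtrans-to-InfluxDB step more concrete, here is a minimal Python sketch of the InfluxDB 1.x line protocol, the text format a point takes when it is written to InfluxDB. The tag name `hostname` and the field name `HeapMemoryUsage_used` are illustrative assumptions, not a statement of exactly what jmxtrans emits.

```python
def _escape(text, chars):
    """Backslash-escape each character in `chars` inside `text`."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def format_field(value):
    """Render a field value using line protocol typing rules."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # integers carry an 'i' suffix
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'  # strings are double-quoted


def to_line(measurement, tags, fields, timestamp_ns=None):
    """Build one line: measurement,tag=value field=value [timestamp]."""
    tag_part = "".join(
        f",{_escape(k, ',= ')}={_escape(v, ',= ')}" for k, v in sorted(tags.items())
    )
    field_part = ",".join(
        f"{_escape(k, ',= ')}={format_field(v)}" for k, v in fields.items()
    )
    line = f"{_escape(measurement, ', ')}{tag_part} {field_part}"
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"
    return line


if __name__ == "__main__":
    # Hypothetical point for the jvmMemory alias; the names are assumptions.
    print(to_line("jvmMemory", {"hostname": "tomcat-pod-1"},
                  {"HeapMemoryUsage_used": 123456789}, 1514764800000000000))
```

In Grafana, the measurement name is what you pick in the query editor, and tags are what let you tell one Pod from another.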
- 2. Implementation steps
- 2.1 Image preparation
InfluxDB and Grafana are both deployed as Pods. First, pull the images:
docker pull docker.io/influxdb:1.4.2
docker pull docker.io/grafana/grafana:4.6.3
[root@k8s-master tomcatjmx]# cat grafana.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: grafana
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: "grafana"
    spec:
      containers:
      - name: grafana
        image: docker.io/grafana/grafana:4.6.3
        ports:
        - containerPort: 3000
---
apiVersion: v1
kind: Service
metadata:
  name: grafanasvc
  labels:
    app: grafana
spec:
  ports:
  - port: 3000
    protocol: TCP
    targetPort: 3000
    name: http
  type: NodePort
  selector:
    app: grafana
[root@k8s-master tomcatjmx]# cat influxdb.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: influxdb
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: "influxdb"
    spec:
      containers:
      - name: influxdb
        image: docker.io/influxdb:1.4.2
        ports:
        - containerPort: 8086
          name: influxdbport
        - containerPort: 8083
          name: influxadmin
---
apiVersion: v1
kind: Service
metadata:
  name: influxdbsvc
  labels:
    app: influxdb
spec:
  ports:
  - port: 8086
    protocol: TCP
    targetPort: 8086
    name: http
  - port: 8083
    protocol: TCP
    targetPort: 8083
    name: admin
  type: NodePort
  selector:
    app: influxdb
- 2.2 Building the application image
The overall structure is as follows:
Enabling JMX
First, modify catalina.sh to turn on the JMX options:
JAVA_OPTS="$JAVA_OPTS $JSSE_OPTS -Xms512m -Xmx1024m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.port=35135"
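Once Tomcat is started with these options, a quick sanity check is to see whether anything is listening on the JMX port. The sketch below is a minimal, assumed helper (the function name `jmx_port_open` is hypothetical). It only confirms that a TCP connection is accepted on port 35135; it does not perform a JMX/RMI handshake.

```python
import socket


def jmx_port_open(host, port=35135, timeout=2.0):
    """Return True if a TCP connection to host:port is accepted.

    This only proves that something is listening on the port; it does not
    verify that the listener actually speaks JMX.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Run inside the Tomcat container or Pod; "localhost" is an assumption.
    print("JMX port reachable:", jmx_port_open("localhost"))
```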
Collecting the JMX data
Download the jmxtrans-all.jar package.
The command to run jmxtrans is as follows:
java -Djmxtrans.log.dir='/usr/local/log' -jar /usr/local/jmxtrans-268-all.jar -j /usr/local/
Here -j points to the directory holding the configuration files that define which metrics to collect, for example tomcat.json.
[root@node1 caas]# cat tomcat.json
{
"servers" : [ {
"port" : "35135",
"host" : "HOSTNAME",
"queries" : [
{
"obj" : "java.lang:type=Memory",
"attr" : [ "HeapMemoryUsage", "NonHeapMemoryUsage" ],
"resultAlias":"jvmMemory",
"outputWriters" : [ {
"@class" : "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
"url" : "http://influxdbsvc:8086/",
"username" : "root",
"password" : "root",
"database" : "caasdb"
}]
},
{
"obj" : "java.lang:type=Threading",
"attr" : [ "ThreadCount", "PeakThreadCount", "TotalStartedThreadCount", "DaemonThreadCount"],
"resultAlias":"jvmThreading",
"outputWriters" : [ {
"@class" : "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
"url" : "http://influxdbsvc:8086/",
"username" : "root",
"password" : "root",
"database" : "caasdb"
}]
},
{
"obj" : "java.lang:type=Memory",
"attr" : [ "HeapMemoryUsage"],
"resultAlias":"HeapMemoryUsage",
"outputWriters" : [ {
"@class" : "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
"url" : "http://influxdbsvc:8086/",
"username" : "root",
"password" : "root",
"database" : "caasdb"
}]
},
{
"obj" : "Catalina:type=GlobalRequestProcessor,name=\"http-apr-8080\"",
"attr" : [ "bytesSent" ],
"resultAlias":"tomcatByteSent",
"outputWriters" : [ {
"@class" : "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
"url" : "http://influxdbsvc:8086/",
"username" : "root",
"password" : "root",
"database" : "caasdb"
}]
},
{
"obj" : "java.lang:name=CMS Old Gen,type=MemoryPool",
"resultAlias": "cmsoldgen",
"attr" : [ "Usage" ],
"outputWriters" : [ {
"@class" : "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
"url" : "http://influxdbsvc:8086/",
"username" : "root",
"password" : "root",
"database" : "caasdb"
}]
},
{
"obj" : "java.lang:type=GarbageCollector,name=*",
"resultAlias": "gc",
"attr" : [ "CollectionCount", "CollectionTime" ],
"outputWriters" : [ {
"@class" : "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
"url" : "http://influxdbsvc:8086/",
"username" : "root",
"password" : "root",
"database" : "caasdb"
}]
},
{
"obj" : "Catalina:type=Engine",
"attr" : [ "backgroundProcessorDelay"],
"resultAlias":"EngineDelay",
"outputWriters" : [ {
"@class" : "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
"url" : "http://influxdbsvc:8086/",
"username" : "root",
"password" : "root",
"database" : "caasdb"
}]
}
]
}]
}
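The seven queries above all repeat the same outputWriters block, so the file is easy to get wrong when edited by hand. As one possible way to keep it consistent, the following Python sketch generates an equivalent tomcat.json from a short list of queries. The script and its function name `build_config` are hypothetical and not part of jmxtrans; the object names, attributes, aliases, and writer settings are copied from the file above.

```python
import json

# Writer settings copied from the tomcat.json shown above.
INFLUX_WRITER = {
    "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
    "url": "http://influxdbsvc:8086/",
    "username": "root",
    "password": "root",
    "database": "caasdb",
}

# (resultAlias, MBean object name, attributes), one entry per query above.
QUERIES = [
    ("jvmMemory", "java.lang:type=Memory", ["HeapMemoryUsage", "NonHeapMemoryUsage"]),
    ("jvmThreading", "java.lang:type=Threading",
     ["ThreadCount", "PeakThreadCount", "TotalStartedThreadCount", "DaemonThreadCount"]),
    ("HeapMemoryUsage", "java.lang:type=Memory", ["HeapMemoryUsage"]),
    ("tomcatByteSent", 'Catalina:type=GlobalRequestProcessor,name="http-apr-8080"', ["bytesSent"]),
    ("cmsoldgen", "java.lang:name=CMS Old Gen,type=MemoryPool", ["Usage"]),
    ("gc", "java.lang:type=GarbageCollector,name=*", ["CollectionCount", "CollectionTime"]),
    ("EngineDelay", "Catalina:type=Engine", ["backgroundProcessorDelay"]),
]


def build_config(host="HOSTNAME", port="35135"):
    """Assemble the jmxtrans configuration as a plain dict."""
    queries = [
        {
            "obj": obj,
            "attr": attrs,
            "resultAlias": alias,
            "outputWriters": [dict(INFLUX_WRITER)],  # copy so queries do not share state
        }
        for alias, obj, attrs in QUERIES
    ]
    return {"servers": [{"port": port, "host": host, "queries": queries}]}


if __name__ == "__main__":
    print(json.dumps(build_config(), indent=2))
```

The default host stays as the literal placeholder HOSTNAME so that the output still works with the substitution step described later.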
Details
"outputWriters" : [ {
"@class" : "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
"url" : "http://influxdbsvc:8086/",
"username" : "root",
"password" : "root",
"database" : "caasdb"
This block points jmxtrans at the InfluxDB service and also creates the caasdb database; the username and password are both root.
"servers" : [ {
"port" : "35135",
"host" : "HOSTNAME",
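port is the JMX port exposed by Tomcat.
host is the address or hostname of the machine where Tomcat runs.
Running multiple processes in one container
jmxtrans runs as a separate process, which means two processes have to start in this Pod at the same time, so we use supervisor to launch them.
supervisor configuration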
[root@node1 caas]# cat supervisord.conf
[supervisord]
nodaemon=true
[program:tomcat]
command=/usr/local/apache-tomcat-8.5.6/bin/catalina.sh run
[program:jmxtrans]
command=/usr/local/updatejson.sh
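Why call a script named updatejson.sh instead of running java -jar directly?
Because the script has to replace the host field in the JSON configuration file with the Pod's hostname or IP first.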
[root@node1 caas]# cat updatejson.sh
#!/bin/bash
sed -i "s/HOSTNAME/${HOSTNAME}/" /usr/local/tomcat.json
java -Djmxtrans.log.dir='/usr/local/log' -jar /usr/local/jmxtrans-268-all.jar -j /usr/local/
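The sed command replaces the text HOSTNAME wherever it appears in the file. As an alternative sketch, the same step can be done in Python by parsing the JSON and setting only the host field of each server entry. The function names `set_jmx_host` and `rewrite_config` are hypothetical; reading the name from `socket.gethostname()` assumes that the container hostname is the value you want, as with `${HOSTNAME}` in the script above.

```python
import json
import socket


def set_jmx_host(config, hostname):
    """Return a copy of a jmxtrans config with every server's host replaced."""
    updated = json.loads(json.dumps(config))  # deep copy via a JSON round trip
    for server in updated.get("servers", []):
        server["host"] = hostname
    return updated


def rewrite_config(path, hostname=None):
    """Rewrite the config file at `path` in place with the given hostname."""
    hostname = hostname or socket.gethostname()
    with open(path) as f:
        config = json.load(f)
    with open(path, "w") as f:
        json.dump(set_jmx_host(config, hostname), f, indent=2)


if __name__ == "__main__":
    sample = {"servers": [{"port": "35135", "host": "HOSTNAME", "queries": []}]}
    print(set_jmx_host(sample, socket.gethostname()))
```

Because it sets the field rather than searching for a placeholder, this version gives the same result however many times it runs, and it fails loudly if the file is not valid JSON.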
Supervisord does not show the Tomcat logs
Add the loglevel and redirect_stderr options to supervisord.conf:
[root@node1 caas]# cat supervisord.conf
[supervisord]
nodaemon=true
loglevel=debug
[program:tomcat]
command=/usr/local/apache-tomcat-8.5.6/bin/catalina.sh run
stdout_logfile=/usr/local/apache-tomcat-8.5.6/logs/catalina.log
redirect_stderr=true
[program:jmxtrans]
command=/usr/local/updatejson.sh
That is basically everything. The complete Dockerfile is as follows:
FROM linux7-jre:8u151
RUN mkdir -p "/usr/local"
COPY caas/*.* /usr/local/
WORKDIR /usr/local
RUN yum install -y python-setuptools && \
easy_install supervisor && \
yum clean all && \
tar -xvf apache-tomcat-8.5.6.tar.gz && \
cp /usr/local/catalina.sh /usr/local/apache-tomcat-8.5.6/bin/catalina.sh && \
mv /usr/local/supervisord.conf /etc/ && \
chmod +x /usr/local/updatejson.sh
ENV CATALINA_HOME /usr/local/apache-tomcat-8.5.6
ENV PATH $CATALINA_HOME/bin:$PATH
ENV JAVA_HOME /usr/java/default
WORKDIR $CATALINA_HOME
EXPOSE 8080 35135
CMD ["supervisord"]
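After startup, you can see the pods running.
The services are as follows:
The key check is to run kubectl logs influxdb-4115439627-rm9zb and confirm that data is being written.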
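Besides reading the InfluxDB logs, you can ask InfluxDB directly which measurements exist. The sketch below uses the InfluxDB 1.x HTTP /query endpoint with the service name, database, and credentials from this post. The helper names are hypothetical, and it assumes the script runs somewhere that can resolve influxdbsvc, for example inside the cluster.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(base_url, database, query, username="root", password="root"):
    """Build a URL for the InfluxDB 1.x /query endpoint."""
    params = urlencode({"db": database, "q": query, "u": username, "p": password})
    return f"{base_url.rstrip('/')}/query?{params}"


def parse_measurements(body):
    """Extract measurement names from a SHOW MEASUREMENTS response body."""
    names = []
    for result in body.get("results", []):
        for series in result.get("series", []):
            names.extend(row[0] for row in series.get("values", []))
    return names


def list_measurements(base_url="http://influxdbsvc:8086/", database="caasdb"):
    """Query InfluxDB over HTTP and return the measurement names it holds."""
    url = build_query_url(base_url, database, "SHOW MEASUREMENTS")
    with urlopen(url, timeout=5) as resp:
        return parse_measurements(json.load(resp))


if __name__ == "__main__":
    # Needs network access to the InfluxDB service.
    print(list_measurements())
```

If jmxtrans is writing successfully, the returned list should contain the resultAlias values from tomcat.json, such as jvmMemory and gc.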
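- 2.3 Grafana configuration
Data source configuration
Create a new Dashboard
Click Panel Title, then choose Edit
Select the relevant metrics to build the report.
See the figure below for the Grafana configuration details
- 3. Other issues
- Why not run jmxtrans on the host and have it collect from multiple Tomcat instances?
Each jmxtrans instance needs a configuration file that names the host to monitor. If every Tomcat shared one instance, all the metrics would be lumped together; to tell them apart, jmxtrans has to run inside each Pod.
- Whenever the pods change, the Grafana charts do not update automatically and have to be reconfigured by hand, which is hard to work around.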
- Q. How do I use the second y axis, secondYAxis function does not work
A. You can switch any series to the second y axis by clicking on the colored line to left of the series name in the legend below the graph. Alternately, use the "Display Styles" > "Series Specific overrides" to define an alias or regex + "Y-axis: 2" to move metrics to the right Axis
- 4. Sample programs
A small CPU-consuming program
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<%@page import="java.util.*"%>
<%@ page contentType="text/html;charset=windows-1252"%>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=windows-1252"/>
<title>Random</title>
</head>
<html>
<body>
<h3>
<%
java.util.Random generator = new java.util.Random();
generator.ints(1000000, 0, 100).sorted().toArray(); // streams are lazy: the terminal toArray() is what makes the sort actually run
%>
random sort finished .........
</h3>
</body>
</html>
A small memory-consuming program
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<%@page import="java.util.*"%>
<%@ page contentType="text/html;charset=windows-1252"%>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=windows-1252"/>
<title>Outofmemory</title>
</head>
<html>
<body>
<h3>
<%
List<String> heapList = new ArrayList<String>();
for (int i=0;i<10000;i++) {
heapList.add(new String("Shobhna"));
}
%>
put 10000 heapList success.........
</h3>
</body>
</html>