Kibana, a data visualization platform
Features: a flexible analysis and visualization platform
Charts that summarize traffic and data in real time
An intuitive interface for different users
Instantly shareable and embeddable dashboards
Deploying Kibana
[root@kibana ~]# yum -y install java-1.8.0
[root@kibana ~]# java -version
[root@kibana ~]# yum -y install kibana
[root@kibana ~]# vim /opt/kibana/config/kibana.yml
...
2 server.port: 5601
...
5 server.host: "0.0.0.0"
...
15 elasticsearch.url: "http://es5:9200"
...
23 kibana.index: ".kibana"
...
26 kibana.defaultAppId: "discover"
...
53 elasticsearch.pingTimeout: 1500
...
57 elasticsearch.requestTimeout: 30000
...
64 elasticsearch.startupTimeout: 5000
...
[root@kibana ~]# systemctl restart kibana
[root@kibana ~]# ss -pntul | grep :5601
tcp LISTEN 0 128 *:5601 *:* users:(("node",pid=1479,fd=10))
Check the status (the Kibana web UI listens on port 5601)
Bulk-importing data
Use the _bulk API to import data in batches. Bulk imports use POST, the data format is JSON, and the request body is sent with curl's --data-binary so that newlines are preserved.
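For reference, each document in a _bulk request body takes two lines: an action/metadata line followed by the document source, and the body must end with a newline. A minimal sketch of such a body (the index, type and field names here are illustrative, not taken from the sample files):
{"index":{"_index":"accounts","_type":"account","_id":1}}
{"account_number":1,"balance":1000,"firstname":"Amber"}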
[root@kibana ~]# gzip -d accounts.json.gz
[root@kibana ~]# gzip -d shakespeare.json.gz
[root@kibana ~]# gzip -d logs.jsonl.gz
[root@kibana ~]# ls
accounts.json bin logs.jsonl shakespeare.json
Import the JSON files that already contain index metadata
[root@kibana ~]# curl -XPOST "http://192.168.1.15:9200/_bulk" --data-binary @logs.jsonl
[root@kibana ~]# curl -XPOST "http://192.168.1.15:9200/_bulk" --data-binary @shakespeare.json
Import a JSON file that carries no index or type metadata (specify them in the URL instead)
[root@kibana ~]# curl -XPOST "http://192.168.1.15:9200/a/b/_bulk" --data-binary @accounts.json
Check whether the data was imported successfully
Query the results with GET
[root@kibana ~]# curl -XGET 'http://192.168.1.61:9200/_mget?pretty' -d '{
"docs":[
{
"_index":"shakespeare",
"_type:":"act",
"_id":0
},
{
"_index":"shakespeare",
"_type:":"line",
"_id":0
},
{
"_index":"a",
"_type:":"b",
"_id":25
}
]
}'
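Besides _mget, a quick way to confirm the imports is the _cat API, which lists every index together with its document count (assuming the same node as in the query above):
[root@kibana ~]# curl -XGET 'http://192.168.1.61:9200/_cat/indices?v'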
Logstash, a tool for collecting, processing and transporting data
Syntax:
| section | A block is defined with { }; multiple plugins can be defined inside it, and a plugin block can hold key-value settings |
| data types | Booleans, strings, numbers, arrays and hashes are supported |
| field references | A field is like a key-value pair and is referenced as [fieldname] |
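A field reference can also be interpolated into a string with %{fieldname}. A minimal sketch, assuming the events already carry a host field (the added field name is illustrative):
filter {
  mutate {
    add_field => { "source_host" => "read from %{host}" }   # %{host} expands to the value of the host field
  }
}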
Features: centralized processing of all types of data
Normalization of data in different patterns and formats
Rapid extension to custom log formats
Easy to add plugins for custom data sources
Logstash working structure:
{data source} ==> input { } ==> filter { } ==> output { } ==> {ES}    Note: input is like a restaurant's purchasing, filter is like the kitchen, and output is like the front of house.
Logstash data types and conditional operators
| Type (example) | Comparison | Logic / membership |
| Boolean: ssl_enable => true | equal == | contains: in |
| Bytes: bytes => "1MiB" | not equal != | does not contain: not in |
| String: name => "xkops" | less than < | and |
| Number: port => 22 | greater than > | or |
| Array: match => ["datetime","UNIX"] | less than or equal <= | nand |
| Hash: options => {k => "v",k2 => "v2"} | greater than or equal >= | xor |
| Codec: codec => "json" | regex match =~ | grouping () |
| Path: file_path => "/tmp/filename" | regex mismatch !~ | negated group !() |
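These operators are used in if/else conditions, for example to route events to different outputs. A minimal sketch (the type values are illustrative):
output {
  if [type] == "tcplog" or [type] in ["udplog", "syslog"] {
    stdout { codec => "rubydebug" }
  } else {
    file { path => "/tmp/other.log" }
  }
}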
Logstash command-line parameters --- a shell script named logstash is provided for quick runs
-e means execute; bin/logstash -e can be run without any configuration, and the default value is
input {
stdin { }
}
output {
stdout { }
}
--config or -f: read the configuration from a file
--configtest or -t: check the configuration syntax without running the pipeline
--log or -l: write the log to a file, e.g. bin/logstash -l logs/logstash.log to collect the logs in one place
--filterworkers or -w: number of filter worker threads
--verbose: more detailed log output
--debug: debug-level log output
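For example, a quick interactive test with -e, and a syntax check with -t of the configuration file used later in this document:
[root@logstash ~]# /opt/logstash/bin/logstash -e 'input { stdin {} } output { stdout {} }'
[root@logstash ~]# /opt/logstash/bin/logstash -f /etc/logstash/logstash.conf -t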
The Logstash configuration file has to be written by hand
The default path is /etc/logstash/logstash.conf
[root@logstash ~]# vim /etc/logstash/logstash.conf
input{
stdin{}
}
filter{ }
output{
stdout{}
}
[root@logstash ~]# alias logstash="/opt/logstash/bin/logstash" //define an alias for starting it
[root@logstash ~]# logstash -f /etc/logstash/logstash.conf //start logstash
Settings: Default pipeline workers: 2
Pipeline main started
ninhao //type ninhao
2019-01-17T06:27:11.918Z logstash ninhao //ninhao is printed back
The configuration file above uses two plugins, logstash-input-stdin and logstash-output-stdout; there are also filter-class and codec-class plugins.
List the installed plugins: /opt/logstash/bin/logstash-plugin list
Official Logstash documentation (English only): https://www.elastic.co/guide/en/logstash/current/index.html
For example, open the file plugin page in the Input plugins section of the documentation and find its configuration options: each setting listed there is an option that can be used inside the file { } block, and the Input type column shows what kind of value the setting expects.
Common options (a sketch using add_field and the interval options follows this list)
- path is required; every file configuration has at least one path
- exclude lists files not to watch; logstash ignores them. The rules are like those for path and the paths must be absolute.
- start_position is where reading starts. The default is end, i.e. if no read position has been recorded for a file, reading starts from its end and only newly appended content is picked up; beginning reads the file from the start.
- sincedb_path sets which file the read-position records are kept in. By default it is generated automatically from the file's inode and related information, e.g. /root/.sincedb_e9a1772295a869da80134b5c4e75816e, which records the inode, major number, minor number and pos of every watched file.
- add_field adds a field to each event
- discover_interval is how often the watched path is checked for new files; the default is 15s
- sincedb_write_interval is how often the sincedb file is written; the default is 15s
- stat_interval is how often the watched files are checked for updates; the default is 1s
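A minimal sketch combining add_field with the interval options (the path and field values are illustrative):
input {
  file {
    path => ["/var/log/httpd/*.log"]        # globs are allowed in path
    add_field => { "project" => "web" }      # extra field added to every event
    discover_interval => 15                  # seconds between checks for new files
    sincedb_write_interval => 15             # seconds between sincedb flushes
  }
}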
file plugin demo
[root@logstash ~]# /opt/logstash/bin/logstash-plugin list
Ignoring ffi-1.9.13 because its extensions are not built. Try: gem pristine ffi --version 1.9.13
...
logstash-output-email //output to email
...
logstash-output-file //output to a file
...
codec-class plugins: the commonly used ones are plain, json, json_lines, rubydebug, multiline, etc.
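Of these, multiline is typically used to merge continuation lines such as Java stack traces into a single event. A minimal sketch, assuming a hypothetical log path; the json and rubydebug codecs are demonstrated in the example that follows:
input {
  file {
    path => ["/tmp/java.log"]          # hypothetical path
    codec => multiline {
      pattern => "^\s"                 # lines starting with whitespace...
      what => "previous"               # ...are appended to the previous event
    }
  }
}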
[root@logstash ~]# vim /etc/logstash/logstash.conf
input{
stdin{ codec => "json"}
}
filter{ }
output{
stdout{ codec => "rubydebug"}
}
[root@logstash ~]# logstash -f /etc/logstash/logstash.conf
Settings: Default pipeline workers: 2
Pipeline main started
{"a": 1, "b": 2, "c": 3}
{
"a" => 1,
"b" => 2,
"c" => 3,
"@version" => "1",
"@timestamp" => "2019-01-17T06:49:28.154Z",
"host" => "logstash"
}
Using the file plugin to read log files
[root@logstash ~]# vim /etc/logstash/logstash.conf
input {
file {
path => ["/tmp/a.log", "/var/tmp/b.log"]
}
}
filter{ }
output{
stdout{ codec => "rubydebug"}
}
[root@logstash ~]# logstash -f /etc/logstash/logstash.conf
Settings: Default pipeline workers: 2
Pipeline main started
Open another terminal and write data into the files above
[root@logstash ~]# echo 123 > /tmp/a.log
[root@logstash ~]# echo abc > /var/tmp/b.log
Check the output
{
"message" => "123",
"@version" => "1",
"@timestamp" => "2019-01-17T07:30:13.334Z",
"path" => "/tmp/a.log",
"host" => "logstash"
}
{
"message" => "abc",
"@version" => "1",
"@timestamp" => "2019-01-17T07:30:43.359Z",
"path" => "/var/tmp/b.log",
"host" => "logstash"
}
Customizing the start position, read behaviour and type
[root@logstash ~]# vim /etc/logstash/logstash.conf
input {
file {
path => ["/tmp/a.log", "/var/tmp/b.log"]
sincedb_path => "/var/lib/logstash/since.db"
start_position => "beginning"
type => "apache log"
}
}
filter{ }
output{
stdout{ codec => "rubydebug"}
}
[root@logstash ~]# logstash -f /etc/logstash/logstash.conf //it automatically reads the existing log entries
Settings: Default pipeline workers: 2
Pipeline main started
{
"message" => "123",
"@version" => "1",
"@timestamp" => "2019-01-17T07:49:26.507Z",
"path" => "/tmp/a.log",
"host" => "logstash",
"type" => "apache log"
}
{
"message" => "abc",
"@version" => "1",
"@timestamp" => "2019-01-17T07:49:26.524Z",
"path" => "/var/tmp/b.log",
"host" => "logstash",
"type" => "apache log"
}
Note: Logstash adds some extra information to each event. The most important field is @timestamp, which marks when the event occurred. This field is used in Logstash's internal pipeline, so it must be a Joda timestamp object; assigning your own value to it directly will raise an error. The logstash-filter-date plugin is normally used to manage it, as in the sketch below.
Other added fields: host marks where the event happened, type marks the event's type, and tags marks attributes of the event (an array).
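A minimal sketch of that date filter, parsing an Apache-style timestamp field (like the one grok extracts later in this document) into @timestamp:
filter {
  date {
    match => ["timestamp", "dd/MMM/yyyy:HH:mm:ss Z"]
  }
}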
tcp plugin
Format:
input {
tcp {
id => "my_plugin_id"
}
}
Common parameters
port: the port number; in server mode it is the port to listen on, in client mode it is the target port to connect to
mode: the common modes are server and client; the default is server
server treats logstash as the log server, receiving log messages generated by the hosts.
client treats logstash as the TCP initiator, requesting log messages from a host.
host: defaults to 0.0.0.0; in server mode it is the address to listen on, in client mode it is the target address to connect to
udp plugin
Parameter: port, the port number
[root@logstash ~]# vim /etc/logstash/logstash.conf
input {
tcp {
mode => "server"
host => "0.0.0.0"
port => 8888
type => "tcplog"
}
udp {
port => 8888
type => "udplog"
}
}
filter{ }
output{
stdout{ codec => "rubydebug"}
}
[root@logstash ~]# logstash -f /etc/logstash/logstash.conf
Settings: Default pipeline workers: 2
Pipeline main started
Open another terminal, check the listening ports and send a test message:
[root@logstash ~]# netstat -pntul | grep 888
tcp6 0 0 :::8888 :::* LISTEN 10881/java
udp6 0 0 :::8888 :::* 10881/java
[root@logstash ~]# echo "xxhh" > /dev/tcp/192.168.1.25/8888
The logstash terminal prints the event:
{
"message" => "xxhh",
"@version" => "1",
"@timestamp" => "2019-01-17T08:55:42.238Z",
"host" => "192.168.1.25",
"port" => 34536,
"type" => "tcplog"
}
syslog plugin
Configure the logstash side
[root@logstash ~]# vim /etc/logstash/logstash.conf //extend the example above by adding
...
syslog {
type => "syslog"
}
...
[root@logstash ~]# logstash -f /etc/logstash/logstash.conf
Settings: Default pipeline workers: 2
Pipeline main started
Configure syslog logging on the web side
[root@web ~]# vim /etc/rsyslog.conf
74 local0.info @@192.168.1.25:514
[root@web ~]# systemctl restart rsyslog
[root@web ~]# logger -p local0.info -t weblog "test web log" //simulate sending a log entry
Check the logstash side
{
"message" => "test web log\n",
"@version" => "1",
"@timestamp" => "2019-01-17T09:33:39.000Z",
"type" => "syslog",
"host" => "192.168.1.18",
"priority" => 134,
"timestamp" => "Jan 17 17:33:39",
"logsource" => "web",
"program" => "weblog",
"severity" => 6,
"facility" => 16,
"facility_label" => "local0",
"severity_label" => "Informational"
}
filter module
grok plugin
Parses all kinds of unstructured log data
grok uses regular expressions to turn unstructured data into structured data
It relies on grouped matching; the regular expression has to be written for the specific data structure. Writing the pattern is hard, but it is widely applicable and can be used on all kinds of data.
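Grok patterns are written as %{SYNTAX:SEMANTIC}, where SYNTAX is the name of a predefined pattern and SEMANTIC is the field the match is stored in. A minimal sketch that would parse a hypothetical line such as "192.168.1.18 GET 200" (the line and field names are illustrative; the real Apache pattern is used in the case below):
filter {
  grok {
    match => { "message" => "%{IP:clientip} %{WORD:verb} %{NUMBER:response}" }
  }
}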
Case: write the httpd log from 192.168.1.18 into /tmp/a.log on 192.168.1.25
[root@web ~]# cat /var/log/httpd/access_log
192.168.1.18 - - [17/Jan/2019:11:37:53 +0800] "GET / HTTP/1.1" 200 10 "-" "curl/7.29.0"
[root@logstash ~]# vim /tmp/a.log //paste the access_log line shown above into this file
[root@logstash ~]# vim /etc/logstash/logstash.conf
input {
file {
path => ["/tmp/a.log"]
sincedb_path => "/var/lib/logstash/since.db"
start_position => "beginning"
type => "apache log"
}
...
}
...
filter{
grok {
match => { "message" => "%{COMBINEDAPACHELOG}"} //the rest of the file stays as above
}
}
...
Parse it on the logstash machine, 192.168.1.25
[root@logstash ~]# logstash -f /etc/logstash/logstash.conf
Settings: Default pipeline workers: 2
Pipeline main started
{
"message" => "168.1.18 - - [17/Jan/2019:11:37:53 +0800] \"GET / HTTP/1.1\" 200 10 \"-\" \"curl/7.29.0\"",
"@version" => "1",
"@timestamp" => "2019-01-17T11:31:11.179Z",
"path" => "/tmp/a.log",
"host" => "logstash",
"type" => "apache log",
"clientip" => "168.1.18",
"ident" => "-",
"auth" => "-",
"timestamp" => "17/Jan/2019:11:37:53 +0800",
"verb" => "GET",
"request" => "/",
"httpversion" => "1.1",
"response" => "200",
"bytes" => "10",
"referrer" => "\"-\"",
"agent" => "\"curl/7.29.0\""
beats plugin (receiving data from filebeat)
This plugin receives data sent by Beats-family shippers such as filebeat
Install filebeat on the web server, 192.168.1.18
[root@web ~]# yum -y install filebeat
Edit the configuration file
[root@web ~]# vim /etc/filebeat/filebeat.yml //there are two sections, input and output
paths:
- /var/log/httpd/access_log //log path (line 15); multiple directories can be monitored
document_type: apachelog //uncomment and change the type (line 72)
# elasticsearch: //comment out (line 183)
# hosts: ["localhost:9200"] //comment out (line 188)
logstash: //uncomment (line 278)
hosts: ["192.168.1.25:5044"] //uncomment; the address is the logstash host (line 280)
[root@web ~]# grep -Pv "^\s*(#|$)" /etc/filebeat/filebeat.yml //show the lines that are neither comments nor blank
Start the service
[root@web ~]# systemctl start filebeat
Modify the configuration file on the logstash side
[root@logstash ~]# vim /etc/logstash/logstash.conf
input {
stdin {codec => "json"}
beats {
port => 5044
}
...
output{
stdout{ codec => "rubydebug"}
elasticsearch {
hosts => ["192.168.1.12:9200", "192.168.1.12:9200"]
index => "weblog"
flush_size => 2000
idle_flush_time => 10
}
}
[root@logstash ~]# logstash -f /etc/logstash/logstash.conf
Settings: Default pipeline workers: 2
Open another terminal and check the state
[root@logstash ~]# netstat -pntul | grep 5044
tcp6 0 0 :::5044 :::* LISTEN 11378/java
Request a page on the web server to generate log entries
Browse to elasticsearch; the weblog index now contains the log data
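The same check can be done from the command line against the node configured in the output block above:
[root@logstash ~]# curl -XGET 'http://192.168.1.12:9200/weblog/_search?pretty'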