(1) Download the software package (this guide uses the binary tarball rather than building from source):

Logstash: https://artifacts.elastic.co/downloads/logstash/logstash-7.5.2.tar.gz
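
The download and extraction can also be scripted. The following is a minimal sketch, assuming curl and tar are available; logstash_url and fetch_logstash are hypothetical helper names, and the /tmp staging path is an assumption.

```shell
# Build the download URL for a given Logstash version.
logstash_url() {
    printf 'https://artifacts.elastic.co/downloads/logstash/logstash-%s.tar.gz\n' "$1"
}

# Download the tarball and unpack it into a target directory.
# Hypothetical helper: stages the archive under /tmp (an assumption).
fetch_logstash() {
    version=$1
    target=$2
    url=$(logstash_url "$version")
    curl -fL -o "/tmp/logstash-$version.tar.gz" "$url" || return 1
    mkdir -p "$target" || return 1
    # --strip-components=1 drops the top-level logstash-<version>/ directory
    tar -xzf "/tmp/logstash-$version.tar.gz" -C "$target" --strip-components=1
}
```

For the layout used in the rest of this guide, the call would be something like `fetch_logstash 7.5.2 /data/ELK/logstash`.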

(2) Extract the Logstash archive, then edit config/logstash.yml in the extracted directory and add or adjust the following settings:

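# Maximum number of events a worker thread collects before running filters and outputs
pipeline.batch.size: 10000

# How long to wait, in milliseconds, for more events before dispatching an undersized batch
pipeline.batch.delay: 10

# Number of pipeline worker threads; the official recommendation is to match the number of CPU cores
pipeline.workers: 6

# Watch the configuration files and reload them automatically when they change
config.reload.automatic: true

# How often to check the configuration files for changes (default: 3s)
config.reload.interval: 60s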

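# Buffer events in a persistent (on-disk) queue
queue.type: persisted

# Directory for the queue data files; only takes effect when queue.type is persisted
path.queue: /data/ELK/logstash/data

# Size of each page file that makes up the persistent queue
queue.page_capacity: 1024mb

# Maximum number of unread events in the queue when the persistent queue is enabled; 0 means unlimited
queue.max_events: 0

# Total capacity of the queue (100 GB)
queue.max_bytes: 102400mb

# Maximum number of acknowledged events before a checkpoint is forced; 0 means unlimited
queue.checkpoint.acks: 1024

# Maximum number of written events before a checkpoint is forced; 0 means unlimited
queue.checkpoint.writes: 1024

# Interval, in milliseconds, at which a checkpoint is forced on the head page
queue.checkpoint.interval: 1000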

# IP address the HTTP API binds to
http.host: "192.168.193.157"

# Port the HTTP API binds to
http.port: 9600

# Enable X-Pack monitoring for Logstash (disabled by default)
xpack.monitoring.enabled: true

# Account used to authenticate against Elasticsearch when shipping Logstash monitoring data
xpack.monitoring.elasticsearch.username: logstash_system

# Password for that account; the value is read from the LOGSTASH_PWD variable stored in the keystore
xpack.monitoring.elasticsearch.password: "${LOGSTASH_PWD}"

# Elasticsearch cluster (or single node) that receives the Logstash monitoring data
xpack.monitoring.elasticsearch.hosts: ["http://192.168.193.154:9200","http://192.168.193.155:9200","http://192.168.193.156:9200"]

# Logstash data directory
path.data: /data/ELK/logstash/data

# Pipeline configuration files to load
path.config: /data/ELK/logstash/config/conf.d/*.conf

# Logstash log directory
path.logs: /data/ELK/logstash/logs
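
Two of the values above are easier to judge with a little arithmetic. The number of events Logstash may hold in memory at once is roughly pipeline.workers multiplied by pipeline.batch.size, so large batches need a correspondingly larger JVM heap. The sketch below only does that arithmetic; inflight_events and mb_to_gb are hypothetical helper names.

```shell
# Events that can be in flight at once: one full batch per worker thread.
inflight_events() {
    workers=$1
    batch_size=$2
    echo $(( workers * batch_size ))
}

# Convert a size written in mb (as in logstash.yml) to whole gb.
mb_to_gb() {
    echo $(( $1 / 1024 ))
}
```

With the settings in this step, `inflight_events 6 10000` gives 60000 events and `mb_to_gb 102400` gives 100. Running `nproc` on the host is a quick way to check whether pipeline.workers matches the CPU core count.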

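(3) In the config/conf.d directory, create the file logstash_kafka_to_es.conf (any file name works as long as it ends in .conf). This is the pipeline configuration file, which controls the log data flow and the data cleansing. Add or adjust the following settings (shown here for a single project; further projects can be added in parallel, see the configuration file template for details):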

input {
    kafka {
        bootstrap_servers => "192.168.145.109:9092,192.168.145.110:9092,192.168.145.111:9092"
        client_id => "ELKlogstash01"
        group_id => "HDPRO_ELK_JAVA_idr"
        codec => "plain"
        fetch_max_bytes => "10485760"
        auto_offset_reset => "latest"

        # Round-robin partition assignment, for load balancing across multiple Logstash nodes.
        # Use it together with consumer_threads, and plan the topic's total partition count accordingly.
        partition_assignment_strategy => "org.apache.kafka.clients.consumer.RoundRobinAssignor"

        # Set consumer_threads to the topic's total partition count divided by the number of Logstash nodes;
        # otherwise threads sit idle or the load is uneven. (This note does not need to go into the config file.)
        consumer_threads => 4
        decorate_events => true
        topics => ["idr-attendance"]
        type => "idr-attendance"
    }
}

filter {
    if [type] == "idr-attendance" {
        grok {
            patterns_dir => ["/data/ELK/logstash/config/pattern/JavaProdlog"]
            match => {
                "message" => ["%{JAVAAPPLOG}","%{ANY:message}"]
            }
            overwrite => ["message"]
        }
        # Parse the timestamp captured by grok into @timestamp.
        # This must run before mutate removes the timestamp field.
        date {
            match => ["timestamp","yyyy-MM-dd HH:mm:ss.SSS"]
        }
        mutate {
            # Turn escaped "\n" and "\t" sequences back into a real line break and spaces,
            # and strip ANSI colour codes. The first replacement is a literal newline.
            gsub => [
                "message", "\\n", "
",
                "message", "\\t", "    ",
                "message", "\\u001b\[m", "",
                "message", "\\u001b\[1\;31m", ""
            ]
            remove_field => ["timestamp", "@version", "text_format"]
        }
    }
}

output {
    if [type] == "idr-attendance" {
        elasticsearch {
            hosts => ["192.168.193.154:9200","192.168.193.155:9200","192.168.193.156:9200"]
            index => "idr-attendance_%{+YYYY.MM.dd}"
            user => "elastic"
            password => "${ES_PWD}"
        }
    }
}
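
The sizing rule noted in the input section (consumer_threads is the topic's partition count divided by the number of Logstash nodes) can be checked with a small calculation. This is a sketch only: consumer_threads here is a hypothetical shell helper, and rounding up when the division is not exact is an assumption, not something the rule itself specifies.

```shell
# Threads per Logstash node = partitions / nodes, rounded up so that
# no partition is left without a consumer thread (rounding is an assumption).
consumer_threads() {
    partitions=$1
    nodes=$2
    echo $(( (partitions + nodes - 1) / nodes ))
}
```

For example, a topic with 12 partitions (a hypothetical figure) consumed by 3 Logstash nodes gives `consumer_threads 12 3`, which is 4, the value used in the configuration above.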

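The following is the content of the pattern file referenced by the filter section. Create the file JavaProdlog under config/pattern, the path set in patterns_dir in the filter section, and write the content below into it. The file defines the custom regular-expression patterns used to split the incoming log lines into fields. (The patterns bundled with Logstash live under vendor/bundle/jruby/x.x.x/gems/logstash-patterns-core-x.x.x/patterns/; they do not need to be redefined and can be used directly where they fit.)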

ANY [\s\S]*
APPTIMESTAMP 20%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{HOUR}:?%{MINUTE}(?::?%{SECOND})
JAVAAPPLOG %{ANY:text_format}%{APPTIMESTAMP:timestamp}\^\^%{ANY:threadname}\^\^%{ANY:loglevel}\^\^%{ANY:javaclass}\^\^%{ANY:information}
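
The JAVAAPPLOG pattern expects log lines whose fields are separated by "^^": timestamp, thread name, log level, Java class, and message. To check quickly whether an application's log lines have that shape before wiring them into grok, a line can be split by hand. This is a rough sketch, not a grok replacement; log_field is a hypothetical helper and the sample line in the usage note is invented for illustration.

```shell
# Print the Nth (1-based) "^^"-delimited field of a log line.
# Returns 1 if the line has fewer than N fields.
log_field() {
    line=$1
    n=$2
    delim='^^'
    while [ "$n" -gt 1 ]; do
        case $line in
            # Drop everything up to and including the first delimiter
            *"$delim"*) line=${line#*"$delim"} ;;
            *) return 1 ;;
        esac
        n=$((n - 1))
    done
    # Keep only the text before the next delimiter, if any
    first=${line%%"$delim"*}
    printf '%s\n' "$first"
}
```

With a sample line such as `2021-01-05 12:00:01.123^^main^^INFO^^com.example.Demo^^service started`, field 3 is the log level and field 5 is the message text.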

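(4) Configuration files should not contain plaintext passwords, so the passwords need to be encrypted and hidden. Create a keystore to hold them, add a variable for each password, and reference those variables in the configuration files: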

bin/logstash-keystore create

A keystore password is optional: enter Y at the prompt to continue without one. The keystore file is then created in the config directory.

Next, run

bin/logstash-keystore add LOGSTASH_PWD

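to create the encrypted password variable LOGSTASH_PWD, which is the variable referenced by the "xpack.monitoring.elasticsearch.password" setting in the configuration file from step (2).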

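Then run

bin/logstash-keystore add ES_PWD

to create the encrypted password variable ES_PWD, which is the variable referenced by the "password =>" setting in the output section of the configuration file from step (3).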
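
After adding both variables, `bin/logstash-keystore list` shows what the keystore contains. The sketch below checks such a listing for the expected keys. It assumes the listing prints one key name per line and compares case-insensitively in case the listing does not preserve the original case; missing_keys is a hypothetical helper.

```shell
# Report which expected keys are absent from a keystore listing.
# $1 is the text printed by `bin/logstash-keystore list`; the remaining
# arguments are the key names to look for. Returns 1 if any are missing.
missing_keys() {
    listing=$1
    shift
    status=0
    for key in "$@"; do
        # -i ignore case, -x whole-line match, -F fixed string, -q quiet
        if ! printf '%s\n' "$listing" | grep -qixF -- "$key"; then
            echo "missing: $key"
            status=1
        fi
    done
    return "$status"
}
```

A typical call would be `missing_keys "$(bin/logstash-keystore list)" LOGSTASH_PWD ES_PWD`, run from the Logstash directory.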

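(5) Change the file ownership so that the service does not hit read/write permission problems when it runs as the elk user. Adjust the path to wherever Logstash is installed: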

chown -R elk:elk /data/ELK/logstash

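(6) Set Logstash to start at boot by registering it as a systemd service. Logstash does not ship with a start/stop script, so first write one named service.sh and place it in the Logstash bin directory, with the following content: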

#!/bin/bash
#chkconfig: 2345 80 05
#description: logstash
source /etc/profile

# Uncomment and adjust the lines below if /etc/profile does not already provide a Java environment
#export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.272.b10-1.el7_9.x86_64/jre
#export JAVA_BIN=${JAVA_HOME}/bin
#export PATH=$PATH:$JAVA_HOME/bin
#export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
#export JAVA_HOME JAVA_BIN PATH CLASSPATH
logstash=/data/ELK/logstash
PID=""
if [ "$1" = "" ]; then
    echo -e "\033[0;31m No action given \033[0m  \033[0;34m {start|stop|restart|status} \033[0m"
    exit 1
fi

# Store the PID of the Logstash process owned by elk in $PID (empty if it is not running)
function query()
{
    PID=`ps aux | grep elk | grep logstash | grep -v "$0" | grep -v grep | awk '{print $2}'`
}

function start()
{
    query
    if [ x"$PID" != x"" ]; then
        echo "logstash is running..."
    else
        # Runs as the invoking user (elk when started through systemd)
        nohup $logstash/bin/logstash -f $logstash/config/conf.d/logstash_kafka_to_es.conf > /dev/null 2>&1 &
        echo "Start logstash success..."
    fi
}

function stop()
{
    echo "Stop logstash"
    query
    if [ x"$PID" != x"" ]; then
        kill -TERM $PID
        echo "logstash (pid:$PID) exiting..."
        # Poll once per second until the process has gone
        while [ x"$PID" != x"" ]
        do
            sleep 1
            query
        done
        echo "logstash exited."
    else
        echo "logstash already stopped."
    fi
}

function restart()
{
    stop
    sleep 2
    start
}

function status()
{
    query
    if [ x"$PID" != x"" ]; then
        echo "logstash is running (pid:$PID)"
    else
        echo "logstash is stopped."
    fi
}

case $1 in
start)
    start;;
stop)
    stop;;
restart)
    restart;;
status)
    status;;
*)
    echo "Usage: $0 {start|stop|restart|status}"
    exit 1;;
esac
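
The stop function in the script waits for as long as the process takes to exit, which can hang if Logstash is stuck draining its queue. One possible variation is to bound the wait. The sketch below is an assumption about how that could look, not part of the original script; wait_for_exit is a hypothetical helper, and it relies on the caller being allowed to signal the process (the same user, as is the case for elk here).

```shell
# Wait until process $1 has gone, checking once per second for at most
# $2 seconds. Returns 0 if the process exited, 1 on timeout.
wait_for_exit() {
    pid=$1
    remaining=$2
    # kill -0 sends no signal; it only tests whether the PID can be signalled
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$remaining" -le 0 ]; then
            return 1
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}
```

Inside stop, the polling loop could then be replaced with something like `wait_for_exit "$PID" 60 || kill -KILL "$PID"`, where the 60-second limit is an arbitrary example value.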

Then make the script executable:

chmod u+x service.sh

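Next, create the systemd unit file /usr/lib/systemd/system/logstash.service (for example with vim) with the following content, adjusting the paths to match where the service is actually deployed: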

[Unit]
Description=logstash.service
After=network.target
[Service]
Type=forking
# Run the service as this account
User=elk
Group=elk
LimitCORE=infinity
LimitMEMLOCK=infinity
LimitNOFILE=65536
LimitNPROC=65536
ExecStart=/data/ELK/logstash/bin/service.sh start
ExecReload=/data/ELK/logstash/bin/service.sh restart
ExecStop=/data/ELK/logstash/bin/service.sh stop
KillMode=process
Restart=always
[Install]
WantedBy=multi-user.target

After saving, run

systemctl enable logstash

to register the logstash service and enable it at boot, then run

systemctl start logstash

to start the service, and check the startup log for errors.
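
Besides reading the log, startup can be confirmed by polling the Logstash HTTP API on the http.host and http.port set in step (2). The following is a minimal sketch, assuming curl is installed; wait_for_api is a hypothetical helper and the 5-second pause between attempts is an arbitrary choice.

```shell
# Poll a URL until it answers successfully or the attempts run out.
# Returns 0 on the first successful response, 1 otherwise.
wait_for_api() {
    url=$1
    attempts=$2
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -f: treat HTTP errors as failure, -s: silent, -o /dev/null: discard the body
        if curl -fs -o /dev/null "$url"; then
            return 0
        fi
        i=$((i + 1))
        sleep 5
    done
    return 1
}
```

For this setup the call would be along the lines of `wait_for_api "http://192.168.193.157:9600/?pretty" 24 && echo "logstash is up"`. Because service.sh discards the process output, the startup messages should be in the path.logs directory (/data/ELK/logstash/logs) rather than in the systemd journal. If the unit file is edited later, `systemctl daemon-reload` makes systemd pick up the change.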