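Elasticsearch Configuration
Elasticsearch is more than Lucene and full-text search. It can also be described as:
- A distributed real-time document store in which every field is indexed and searchable
- A distributed real-time analytics search engine
- Able to scale out to hundreds of servers and handle petabytes of structured or unstructured data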
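Elasticsearch concepts
- Index: the basic unit stored in an index is the document, and documents are grouped by document type.
- Shard: Elasticsearch splits an index into shards to scale horizontally, and shards can have replicas as backups.
- Node: one running Elasticsearch instance is one node (nodes are combined into a cluster).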
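Management scripts: git clone https://github.com/elasticsearch/elasticsearch-servicewrapper.git
Elasticsearch ⇒ index ⇒ type ⇒ document ⇒ field
This article was written against a then-current Elasticsearch release; the commands below use 2.3.5.
Development environment: CentOS 7, with the following hosts: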
192.168.20.153 zookeeper1
192.168.20.154 zookeeper2
192.168.20.155 zookeeper3
192.168.20.206 kafka1
192.168.20.207 kafka2
192.168.20.208 kafka3
192.168.20.204 logstashserver
192.168.20.205 kibana
192.168.20.201 es1
192.168.20.202 es2
192.168.20.203 es3
Install paths:
/data/elasticsearch
/data/java/
Setting up an Elasticsearch cluster is fairly simple. Unlike SolrCloud, it does not require you to install ZooKeeper yourself.
On es1:
Download Elasticsearch:
wget https://download.elastic.co/elasticsearch/release/org/elasticsearch/distribution/tar/elasticsearch/2.3.5/elasticsearch-2.3.5.tar.gz
Extract:
tar -zxvf elasticsearch-2.3.5.tar.gz
mv elasticsearch-2.3.5 elasticsearch
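Next, prepare the system and create the el user, because Elasticsearch does not allow running as root by default (it can be done, but requires extra configuration).
- Set the JVM heap size (-Xms / -Xmx)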
vim /data/elasticsearch/bin/elasticsearch.in.sh
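- Raise the virtual memory map limit
echo "vm.max_map_count=262144" >> /etc/sysctl.conf
sysctl -p    # apply the setting without rebooting
- Disable swap
swapoff -a
Comment out the swap entry in /etc/fstab so that swap stays off after a reboot:
#/dev/mapper/centos-swap swap swap defaults 0 0
- Enable memory locking (mlockall)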
vim /data/elasticsearch/config/elasticsearch.yml
bootstrap.mlockall: true    # the 2.3.x name of this setting; later releases call it bootstrap.memory_lock
- Create the Elasticsearch user and set its password
useradd el
passwd el    # enter the password when prompted (123456 in this example)
chown -R el:el elasticsearch
su el
cd elasticsearch/config/
Edit the configuration file:
vim elasticsearch.yml
cluster.name: feng
node.name: es1
network.host: 192.168.20.201
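##### The following settings help prevent split brain #####
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping_timeout: 120s
client.transport.ping_timeout: 60s
discovery.zen.ping.unicast.hosts: ["es1", "es2", "es3"]
A detailed explanation of these settings is left for later; for now, use the configuration above as is.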
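The settings above control how the nodes find each other. Elasticsearch 2.x also has a discovery.zen.minimum_master_nodes setting (not part of the configuration above) that is commonly set to a majority of the master-eligible nodes as a guard against split brain. A minimal sketch of that arithmetic, assuming all three nodes are master-eligible; the quorum helper is hypothetical:

```shell
# quorum N: print the smallest majority of N master-eligible nodes,
# that is floor(N / 2) + 1. The helper name is illustrative only.
quorum() {
    echo $(( $1 / 2 + 1 ))
}

# For the three-node cluster in this article (es1, es2, es3):
echo "discovery.zen.minimum_master_nodes: $(quorum 3)"
```

With three nodes the result is 2, so a single isolated node cannot elect itself master.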
Start Elasticsearch:
su - el
/data/elasticsearch/bin/elasticsearch -d
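If the node fails to start, it can be worth confirming that the kernel setting configured earlier is actually in effect. The sketch below is one way to check it; check_max_map_count is a made-up helper name, and 262144 is the value used in this article.

```shell
# check_max_map_count VALUE: succeed if VALUE is at least 262144,
# the vm.max_map_count value configured earlier in this article.
check_max_map_count() {
    [ "$1" -ge 262144 ]
}

# Typical use on a live host (sysctl -n prints only the value):
#   check_max_map_count "$(sysctl -n vm.max_map_count)" || echo "vm.max_map_count is too low"
if check_max_map_count 65530; then echo "ok"; else echo "too low"; fi
```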
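Install Marvel
Marvel is a management and monitoring tool for Elasticsearch and is free to use in development environments. It includes an interactive console called Sense, which lets you work with Elasticsearch directly from a browser.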
bin/plugin install license
bin/plugin install marvel-agent
Next, install the head plugin:
cd elasticsearch/bin/
Install the head plugin with the following command:
./plugin install mobz/elasticsearch-head
If the download fails, download the plugin archive manually, upload it to the Elasticsearch host, and install it as follows:
./plugin install file:/java/elasticsearch-head-master.zip
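Install the Chinese word-segmentation plugin, ik. ik is hosted on GitHub; you need to download it and build it with Maven yourself:
git clone https://github.com/medcl/elasticsearch-analysis-ik
cd elasticsearch-analysis-ik
mvn clean
mvn compile
mvn package
Check which ik version matches your Elasticsearch version.
No matching ik release was found for Elasticsearch 2.4, and the head plugin does not support Elasticsearch 5.0, so this setup uses 2.3.5.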
curl -L -O https://download.elastic.co/elasticsearch/release/org/elasticsearch/distribution/tar/elasticsearch/2.3.5/elasticsearch-2.3.5.tar.gz
cd elasticsearch-analysis-ik
git tag -l    # list the tags
git checkout -b dev1.9.5 v1.9.5    # use version 1.9.5
mvn clean
mvn compile
mvn package
mkdir /data/elasticsearch/plugins/ik -p
unzip /data/tools/elasticsearch-analysis-ik/target/releases/elasticsearch-analysis-ik-1.9.5.zip -d /data/elasticsearch/plugins/ik
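An Elasticsearch 2.x plugin declares the Elasticsearch version it was built for in its plugin-descriptor.properties file, so the version match discussed above can be checked after unzipping. A small sketch, assuming the descriptor sits directly in the plugin directory; plugin_es_version is a hypothetical helper:

```shell
# plugin_es_version FILE: print the elasticsearch.version value from a
# plugin-descriptor.properties file.
plugin_es_version() {
    sed -n 's/^elasticsearch\.version=//p' "$1"
}

# Typical use (path assumed from the unzip step above):
#   plugin_es_version /data/elasticsearch/plugins/ik/plugin-descriptor.properties
```

The printed value should equal the version of the Elasticsearch installation, 2.3.5 in this setup.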
su root
Copy the installation to the other two machines:
scp -r elasticsearch es2:/data/
scp -r elasticsearch es3:/data/
The other two machines also need the el user created and given the same permissions; the steps are the same as above.
Log in to es2:
vim elasticsearch/config/elasticsearch.yml
node.name: es2
network.host: 192.168.20.202
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping_timeout: 120s
client.transport.ping_timeout: 60s
discovery.zen.ping.unicast.hosts: ["es1", "es2", "es3"]
Log in to es3:
vim elasticsearch/config/elasticsearch.yml
node.name: es3
network.host: 192.168.20.203
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping_timeout: 120s
client.transport.ping_timeout: 60s
discovery.zen.ping.unicast.hosts: ["es1", "es2", "es3"]
Next, start the Elasticsearch cluster.
On each of es1, es2, and es3, run:
su el
./elasticsearch/bin/elasticsearch -d
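At this point the cluster should be up.
You can verify that it started by opening the following address:
http://es1:9200/_plugin/head/
If the head page loads and shows the cluster, the cluster started successfully.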
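Besides the head page, the cluster health API (GET /_cluster/health) reports a status of green, yellow, or red. The sketch below pulls that field out of the JSON response with sed; health_status is a hypothetical helper, and the sample JSON is abbreviated for illustration rather than captured from this cluster.

```shell
# health_status: read a _cluster/health JSON response on stdin and
# print the value of its "status" field (green, yellow, or red).
health_status() {
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p'
}

# Typical use against a live node:
#   curl -s http://es1:9200/_cluster/health | health_status
echo '{"cluster_name":"feng","status":"green","number_of_nodes":3}' | health_status
```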
Next, create an index and index some documents as follows:
1. Create an index
curl -XPUT http://localhost:9200/index
2. Create a mapping (Elasticsearch 2.x uses the string field type; text was only introduced in 5.0)
curl -XPOST http://localhost:9200/index/fulltext/_mapping -d'
{
    "fulltext": {
        "_all": {
            "analyzer": "ik_max_word",
            "search_analyzer": "ik_max_word",
            "term_vector": "no",
            "store": "false"
        },
        "properties": {
            "content": {
                "type": "string",
                "analyzer": "ik_max_word",
                "search_analyzer": "ik_max_word",
                "include_in_all": "true",
                "boost": 8
            }
        }
    }
}'
3. Index some documents
curl -XPOST http://localhost:9200/index/fulltext/1 -d'
{"content":"美国留给伊拉克的是个烂摊子吗"}
'
curl -XPOST http://localhost:9200/index/fulltext/2 -d'
{"content":"公安部:各地校车将享最高路权"}
'
curl -XPOST http://localhost:9200/index/fulltext/3 -d'
{"content":"中韩渔警冲突调查:韩警平均每天扣1艘中国渔船"}
'
curl -XPOST http://localhost:9200/index/fulltext/4 -d'
{"content":"中国驻洛杉矶领事馆遭亚裔男子枪击 嫌犯已自首"}
'
4. Query with highlighting
curl -XPOST http://localhost:9200/index/fulltext/_search -d'
{
    "query" : { "match" : { "content" : "中国" }},
    "highlight" : {
        "pre_tags" : ["<tag1>", "<tag2>"],
        "post_tags" : ["</tag1>", "</tag2>"],
        "fields" : {
            "content" : {}
        }
    }
}
'
Result:
{
    "took": 14,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
    },
    "hits": {
        "total": 2,
        "max_score": 2,
        "hits": [
            {
                "_index": "index",
                "_type": "fulltext",
                "_id": "4",
                "_score": 2,
                "_source": {
                    "content": "中国驻洛杉矶领事馆遭亚裔男子枪击 嫌犯已自首"
                },
                "highlight": {
                    "content": [
                        "<tag1>中国</tag1>驻洛杉矶领事馆遭亚裔男子枪击 嫌犯已自首 "
                    ]
                }
            },
            {
                "_index": "index",
                "_type": "fulltext",
                "_id": "3",
                "_score": 2,
                "_source": {
                    "content": "中韩渔警冲突调查:韩警平均每天扣1艘中国渔船"
                },
                "highlight": {
                    "content": [
                        "均每天扣1艘<tag1>中国</tag1>渔船 "
                    ]
                }
            }
        ]
    }
}
A response like this shows that the cluster is set up and working.
Delete an index:
curl -XDELETE 'http://localhost:9200/feng'
Get a document:
curl -XGET 'http://localhost:9200/index/fulltext/2'
The path /index/fulltext/2 has the form /index/type/id.
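The index/type/id addressing can be captured in a tiny helper. This is only a sketch; doc_url is a hypothetical function, and localhost:9200 is the address used in the examples above.

```shell
# doc_url BASE INDEX TYPE ID: print the REST URL of a single document.
doc_url() {
    printf '%s/%s/%s/%s\n' "$1" "$2" "$3" "$4"
}

doc_url http://localhost:9200 index fulltext 2
# Typical use:
#   curl -XGET "$(doc_url http://localhost:9200 index fulltext 2)"
```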
Delete a document:
curl -XDELETE 'http://localhost:9200/index/fulltext/2'
For example, deleting indices:
[root@es1 nodes]# curl -XDELETE 'http://192.168.20.201:9200/test-system-messages-2016-09'
{"acknowledged":true}
[root@es1 nodes]# curl -XDELETE 'http://192.168.20.201:9200/test-system-messages-2016.09.270'
{"acknowledged":true}
[root@es1 nodes]# curl -XDELETE 'http://192.168.20.201:9200/test-system-messages-2016.09.271'
{"acknowledged":true}
Logstash client installation (on the application servers)
1. Download
wget https://download.elastic.co/logstash/logstash/logstash-2.4.0.tar.gz
tar -zxvf logstash-2.4.0.tar.gz
mv logstash-2.4.0 /data/
ln -s /data/logstash-2.4.0 /data/logstash
cd /data/logstash
mkdir logs etc
2. Provide a Logstash management script (installed as /etc/init.d/logstash); adjust the paths in it to your environment
#!/bin/bash
FILE='/data/logstash/etc/*.conf'    # Logstash configuration files
LOGBIN='/data/logstash/bin/logstash agent --verbose --config'    # command that runs Logstash with a given config
LOCK='/data/logstash/locks'    # lock file used to track start and stop
LOGLOG='--log /data/logstash/logs/stdout.log'    # log file
START() {
    if [ -f $LOCK ];then
        echo -e "Logstash is already \033[32mrunning\033[0m, do nothing."
    else
        echo -e "Start logstash service.\033[32mdone\033[m"
        nohup ${LOGBIN} ${FILE} ${LOGLOG} &
        touch $LOCK
    fi
}
STOP() {
    if [ ! -f $LOCK ];then
        echo -e "Logstash is already stop, do nothing."
    else
        echo -e "Stop logstash service \033[32mdone\033[m"
        rm -rf $LOCK
        ps -ef | grep logstash | grep -v "grep" | awk '{print $2}' | xargs kill -s 9 >/dev/null
    fi
}
STATUS() {
    # Running means the lock file exists and a logstash process is found
    if [ -f $LOCK ] && ps aux | grep logstash | grep -v "grep" >/dev/null; then
        echo -e "Logstash is: \033[32mrunning\033[0m..."
    else
        echo -e "Logstash is: \033[31mstopped\033[0m..."
    fi
}
TEST(){
    ${LOGBIN} ${FILE} --configtest
}
case "$1" in
    start)
        START
        ;;
    stop)
        STOP
        ;;
    status)
        STATUS
        ;;
    restart)
        STOP
        sleep 2
        START
        ;;
    test)
        TEST
        ;;
    *)
        echo "Usage: /etc/init.d/logstash (test|start|stop|status|restart)"
        ;;
esac
3. Have Logstash write logs to the Kafka cluster
cat /data/logstash/etc/logstash.conf
input {
    # Input is still read from a log file
    file {
        type => "system-message"
        path => "/var/log/messages"
        start_position => "beginning"
    }
}
output {
    # stdout { codec => rubydebug }    # print to the terminal, useful for debugging; several outputs can be defined at once
    kafka {    # output to Kafka
        bootstrap_servers => "192.168.2.22:9092,192.168.2.23:9092,192.168.2.24:9092"    # the brokers; Logstash acts as the producer
        topic_id => "system-messages"    # used as the topic name; the topic is created automatically
        compression_type => "snappy"    # compression type
    }
}
4. Check the configuration file for syntax errors
/data/logstash/bin/logstash -f logstash.conf --configtest --verbose
5. Start Logstash
/data/logstash/bin/logstash -f logstash.conf
6. Test
[root@haproxy1 etc]# cat /etc/security/limits.conf >>/var/log/messages
7. Log in to a Kafka message queue server
[root@kafka1 ~]# /data/kafka/bin/kafka-topics.sh --list --zookeeper zookeeper1:2181
system-messages    # the listed topic is system-messages
8. View the details of the system-messages topic
[root@kafka1 ~]# /data/kafka/bin/kafka-topics.sh --describe --zookeeper zookeeper1:2181 --topic system-messages
Topic:system-messages PartitionCount:16 ReplicationFactor:1 Configs:
Topic: system-messages Partition: 0 Leader: 3 Replicas: 3 Isr: 3
Topic: system-messages Partition: 1 Leader: 1 Replicas: 1 Isr: 1
Topic: system-messages Partition: 2 Leader: 2 Replicas: 2 Isr: 2
Topic: system-messages Partition: 3 Leader: 3 Replicas: 3 Isr: 3
Topic: system-messages Partition: 4 Leader: 1 Replicas: 1 Isr: 1
Topic: system-messages Partition: 5 Leader: 2 Replicas: 2 Isr: 2
Topic: system-messages Partition: 6 Leader: 3 Replicas: 3 Isr: 3
Topic: system-messages Partition: 7 Leader: 1 Replicas: 1 Isr: 1
Topic: system-messages Partition: 8 Leader: 2 Replicas: 2 Isr: 2
Topic: system-messages Partition: 9 Leader: 3 Replicas: 3 Isr: 3
Topic: system-messages Partition: 10 Leader: 1 Replicas: 1 Isr: 1
Topic: system-messages Partition: 11 Leader: 2 Replicas: 2 Isr: 2
Topic: system-messages Partition: 12 Leader: 3 Replicas: 3 Isr: 3
Topic: system-messages Partition: 13 Leader: 1 Replicas: 1 Isr: 1
Topic: system-messages Partition: 14 Leader: 2 Replicas: 2 Isr: 2
Topic: system-messages Partition: 15 Leader: 3 Replicas: 3 Isr: 3
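As shown, the topic was created with 16 partitions, each with its own leader. What if you want 10 partitions and 3 replicas instead? Create the topic from the command line. For Logstash output, you can define the topic in advance and then start Logstash so that it writes straight into the predefined topic. The command is: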
[root@kafka1 ~]# /usr/local/kafka/bin/kafka-topics.sh --create --zookeeper 192.168.2.22:2181 --replication-factor 3 --partitions 10 --topic system-messages
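To see how partition leadership is spread across the brokers, the --describe output can be summarized. A sketch with awk, assuming the output format shown above; leaders_per_broker is a hypothetical helper.

```shell
# leaders_per_broker: read `kafka-topics.sh --describe` output on stdin
# and print "broker_id partition_count" for each leader, sorted by id.
leaders_per_broker() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "Leader:") count[$(i + 1)]++
    }
    END { for (b in count) print b, count[b] }' | sort -n
}

# Typical use:
#   /data/kafka/bin/kafka-topics.sh --describe --zookeeper zookeeper1:2181 \
#       --topic system-messages | leaders_per_broker
```

For the 16-partition listing above, this would report 5 partitions led by broker 1, 5 by broker 2, and 6 by broker 3.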
Logstash server installation (on a dedicated server)
1. Download
wget https://download.elastic.co/logstash/logstash/logstash-2.4.0.tar.gz
tar -zxvf logstash-2.4.0.tar.gz
mv logstash-2.4.0 /data/
ln -s /data/logstash-2.4.0 /data/logstash
cd /data/logstash
mkdir logs etc
[root@kafka1 etc]# more logstash.conf
input {
    kafka {
        zk_connect => "zookeeper1:2181,zookeeper2:2181,zookeeper3:2181"    # the consumers connect through ZooKeeper
        topic_id => "system-messages"
        codec => plain
        reset_beginning => false
        consumer_threads => 5
        decorate_events => true
    }
}
output {
    elasticsearch {
        hosts => ["es1:9200","es2:9200","es3:9200"]
        # A new index name, to tell it apart from the earlier experiment.
        # Joda-Time pattern: dd is the day of the month (DD would be the day of the year).
        index => "test-system-messages-%{+YYYY.MM.dd}"
    }
}
2. Start the Logstash process on logstashserver
[root@logstashserver etc]# /etc/init.d/logstash start
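The date suffix of the index name comes from a Joda-Time pattern, in which dd is the day of the month and DD is the day of the year. The sketch below reproduces both forms with GNU date (an assumption: date -d is a GNU coreutils option); index_name and index_name_doy are hypothetical helpers.

```shell
# index_name PREFIX DATE: daily index name using the day of month (like dd).
index_name() {
    echo "$1-$(date -u -d "$2" +%Y.%m.%d)"
}

# index_name_doy PREFIX DATE: same, but using the day of year (like DD).
index_name_doy() {
    echo "$1-$(date -u -d "$2" +%Y.%m.%j)"
}

index_name     test-system-messages 2016-09-26   # test-system-messages-2016.09.26
index_name_doy test-system-messages 2016-09-26   # test-system-messages-2016.09.270
```

The second form accounts for index names such as test-system-messages-2016.09.270 in the delete examples earlier.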
##########################################################
Using Redis instead of Kafka
Logstash client configuration file
input {
    # Input is still read from log files
    file {
        type => "system-messages"
        path => "/var/log/messages"
        start_position => "beginning"
    }
    file {
        type => "wx-cinyi-com"
        path => "/root/wxcinyi.access.log"
        start_position => "beginning"
    }
}
output {
    if [type] == "system-messages" {
        redis {
            host => "192.168.20.166"
            port => "6379"
            db => "1"
            data_type => "list"
            key => "system-messages"
        }
    }
    if [type] == "wx-cinyi-com" {
        redis {
            host => "192.168.20.166"
            port => "6379"
            db => "2"
            data_type => "list"
            key => "wx-cinyi-com"
        }
    }
}
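Logstash server configuration file
input {
    # Conditionals are only valid in filter and output blocks, so each Redis
    # list is read unconditionally; the type field set by the client travels
    # with each event and is used for routing in the output block.
    redis {
        host => "192.168.20.166"
        port => "6379"
        db => "1"
        data_type => "list"
        key => "system-messages"
    }
    redis {
        host => "192.168.20.166"
        port => "6379"
        db => "2"
        data_type => "list"
        key => "wx-cinyi-com"
    }
}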
output {
    # Joda-Time pattern: dd is the day of the month (DD would be the day of the year)
    if [type] == "system-messages" {
        elasticsearch {
            hosts => ["es1:9200","es2:9200","es3:9200"]
            index => "test-system-messages-%{+YYYY-MM-dd}"
        }
    }
    if [type] == "wx-cinyi-com" {
        elasticsearch {
            hosts => ["es1:9200","es2:9200","es3:9200"]
            index => "wx-cinyi-com-%{+YYYY-MM-dd}"
        }
    }
}
###########################################################
Kibana installation and configuration
wget https://download.elastic.co/kibana/kibana/kibana-4.6.1-linux-x86_64.tar.gz
[root@kibanai data]# tar -zxvf kibana-4.6.1-linux-x86_64.tar.gz
[root@kibanai data]# mv kibana-4.6.1-linux-x86_64 /data
[root@kibanai data]# ln -s kibana-4.6.1-linux-x86_64 kibana
[root@kibanai data]# mkdir -p /data/kibana/run/
[root@kibanai data]# cd /data/kibana/config/
[root@kibanai config]# cat kibana.yml | grep -v "#" | sed '/^$/d'
server.port: 5601
server.host: "0.0.0.0"
elasticsearch.url: "http://es1:9200"
elasticsearch.username: "user"
elasticsearch.password: "pass"
elasticsearch.startupTimeout: 5000
pid.file: /data/kibana/run/kibana.pid
Install the Kibana Marvel plugin
bin/kibana plugin --install elasticsearch/marvel/latest
2. Kibana startup script (installed as /etc/init.d/kibana)
#!/bin/bash
KIBBIN='/data/kibana/bin/kibana'
LOCK='/data/kibana/locks'
START() {
    if [ -f $LOCK ];then
        echo -e "kibana is already \033[32mrunning\033[0m, do nothing."
    else
        echo -e "Start kibana service.\033[32mdone\033[m"
        cd /data/kibana/bin
        nohup ./kibana >/dev/null 2>&1 &
        touch $LOCK
    fi
}
STOP() {
    if [ ! -f $LOCK ];then
        echo -e "kibana is already stop, do nothing."
    else
        echo -e "Stop kibana service \033[32mdone\033[m"
        rm -rf $LOCK
        ps -ef | grep kibana | grep -v "grep" | awk '{print $2}' | xargs kill -s 9 >/dev/null
    fi
}
STATUS() {
    # 5601 is the server.port configured in kibana.yml
    Port=$(netstat -tunl | grep ":5601")
    if [ "$Port" != "" ] && [ -f $LOCK ];then
        echo -e "kibana is: \033[32mrunning\033[0m..."
    else
        echo -e "kibana is: \033[31mstopped\033[0m..."
    fi
}
case "$1" in
    start)
        START
        ;;
    stop)
        STOP
        ;;
    status)
        STATUS
        ;;
    restart)
        STOP
        sleep 2
        START
        ;;
    *)
        echo "Usage: /etc/init.d/kibana (start|stop|status|restart)"
        ;;
esac
3. Add execute permission
[root@kibanai config]# chmod +x /etc/init.d/kibana
4. Start
[root@kibanai config]# /etc/init.d/kibana start
5. Open http://192.168.20.205:5601 in a browser
Open http://localhost:5601/app/marvel to view the Marvel plugin
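Since kibana.yml above sets pid.file, a status check could also test whether the recorded process is still alive instead of relying on a lock file. A minimal sketch of that alternative, not the script used in this article; is_running is a hypothetical helper.

```shell
# is_running PIDFILE: succeed if PIDFILE exists and names a live process.
is_running() {
    [ -f "$1" ] || return 1
    pid=$(cat "$1")
    # kill -0 sends no signal; it only checks that the process exists
    # and that the caller is allowed to signal it.
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Typical use:
#   is_running /data/kibana/run/kibana.pid && echo "kibana is running"
```

Unlike a lock file, this check does not report a process as running after it has crashed, as long as the PID has not been reused.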