01
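Introduction to Elasticsearch
Elasticsearch is a real-time, distributed search engine built for fast retrieval over large volumes of data. It is a document-oriented store, and for search-heavy workloads it is typically faster than a traditional relational database. It is mainly used for full-text search, structured search, analytics, and combinations of the three. It serves many large sites, including GitHub and Stack Overflow.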
02
Getting started with Elasticsearch
We begin with a key term. If you have worked with relational databases, comparing the two makes it easier to remember.
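Inverted index: a relational database speeds up data retrieval by adding an index, such as a B-tree index, to specific columns. Elasticsearch and Lucene use a structure called an inverted index for the same purpose.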
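To make the idea concrete, here is a minimal sketch of an inverted index in Python. It is only an illustration: the sample documents and the whitespace tokenizer are assumptions, and Lucene's real structure is far more elaborate.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase term to the sorted list of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Naive analysis: lowercase, then split on whitespace.
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


# Hypothetical sample documents, keyed by document id.
docs = {1: "hello world", 2: "hello elasticsearch", 3: "goodbye world"}
index = build_inverted_index(docs)

print(index["hello"])  # [1, 2]
print(index["world"])  # [1, 3]
```

Looking up a term is a single dictionary access, instead of a scan over every document.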
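Elasticsearch listens on port 9200 by default. We mainly interact with it through its RESTful API over HTTP, using a web client or the curl command on Linux.
The general form of a curl request: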
curl -X<VERB> '<PROTOCOL>://<HOST>:<PORT>/<PATH>?<QUERY_STRING>' -d '<BODY>'
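The same parts can be assembled programmatically. The sketch below uses Python's standard urllib to build, but not send, a request from the template's pieces. The helper name build_request is hypothetical, and the Content-Type header is an assumption that is harmless on older releases.

```python
import json
from urllib.request import Request


def build_request(verb, protocol, host, port, path, query_string="", body=None):
    """Assemble an HTTP request from the same parts as the curl template."""
    url = f"{protocol}://{host}:{port}/{path}"
    if query_string:
        url += f"?{query_string}"
    # Serialize the body as JSON only when one is given.
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return Request(url, data=data, method=verb, headers=headers)


req = build_request(
    "PUT", "http", "localhost", 9200, "school/student/1", "pretty",
    {"name": "hello world", "age": 18},
)
print(req.get_method(), req.full_url)
# PUT http://localhost:9200/school/student/1?pretty
```

Passing req to urllib.request.urlopen would send it, which requires a running Elasticsearch node.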
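Create an index named school with a type named student. The mapping below uses mapping types and the string field type, so it targets an older Elasticsearch release that still supports both: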
curl -X PUT "localhost:9200/school?pretty" -d '
{
  "settings": {
    "number_of_shards": 2,
    "number_of_replicas": 1
  },
  "mappings": {
    "student": {
      "properties": {
        "name": {
          "type": "string"
        },
        "age": {
          "type": "long"
        }
      }
    }
  }
}'
Response after the index is created:
{
"acknowledged" : true
}
number_of_shards is the number of primary shards in the index.
number_of_replicas is the number of replica copies kept for each primary shard.
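As a rough illustration of what these two settings imply, the sketch below counts the shard copies for the settings used here and routes a document id to a primary shard by hashing. The crc32 hash is a stand-in chosen for the example, not the function Elasticsearch uses internally, and both helper names are hypothetical.

```python
import zlib


def total_shard_copies(number_of_shards, number_of_replicas):
    """Primaries plus all of their replica copies."""
    return number_of_shards * (1 + number_of_replicas)


def pick_shard(doc_id, number_of_shards):
    """Route a document id to a primary shard with hash-modulo routing.

    crc32 is a stand-in hash for illustration only.
    """
    return zlib.crc32(doc_id.encode("utf-8")) % number_of_shards


print(total_shard_copies(2, 1))  # 4
print(pick_shard("1", 2))        # always the same shard number for the same id
```

Because the shard is derived from the id and the primary count, the same id always lands on the same shard.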
1. Insert a document into the type
curl -X PUT '10.1.130.27:9200/school/student/1?pretty' -d '{"name":"hello world","age":18}'
Response after the document is written:
{
"_index" : "school",
"_type" : "student",
"_id" : "1",
"_version" : 1,
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"created" : true
}
2. Update a document
curl -X POST '10.1.130.27:9200/school/student/1?pretty' -d '{"doc":{"age":17}}'
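Response after the update. Note that posting to the document URL reindexes the whole document, so the stored source becomes the request body itself, as the query in the next step shows. A partial update that merges the fields under "doc" into the existing document goes to the _update endpoint (school/student/1/_update) instead.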
{
"_index" : "school",
"_type" : "student",
"_id" : "1",
"_version" : 2,
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"created" : false
}
3. Retrieve a document
curl -X GET '10.1.130.27:9200/school/student/1?pretty'
Response:
{
"_index" : "school",
"_type" : "student",
"_id" : "1",
"_version" : 2,
"found" : true,
"_source" : {
"doc" : {
"age" : 17
}
}
}
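The _source in this response contains only the request body from step 2, because that request replaced the document rather than merging into it. A minimal sketch of the two behaviours, using plain dictionaries (the function names are hypothetical and the merge here is shallow):

```python
def index_document(stored, body):
    """Indexing to an existing id replaces the stored source with the request body."""
    return dict(body)


def partial_update(stored, body):
    """A partial update merges the fields under "doc" into the stored source."""
    merged = dict(stored)
    merged.update(body["doc"])
    return merged


stored = {"name": "hello world", "age": 18}
body = {"doc": {"age": 17}}

print(index_document(stored, body))  # {'doc': {'age': 17}}
print(partial_update(stored, body))  # {'name': 'hello world', 'age': 17}
```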
4. Delete a document
curl -X DELETE '10.1.130.27:9200/school/student/1?pretty'
Response after the deletion:
{
"found" : true,
"_index" : "school",
"_type" : "student",
"_id" : "1",
"_version" : 3,
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
}
}
Lightweight search
A simple lookup only needs the request method changed to GET.
For example, to fetch the first student's record:
curl -X GET "10.1.130.27:9200/school/student/1?pretty"
Response:
{
"_index" : "school",
"_type" : "student",
"_id" : "1",
"_version" : 1,
"found" : true,
"_source" : {
"name" : "hello world",
"age" : 18
}
}
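A GET request can also search by a condition instead of by _id. URI search passes the condition through the q parameter in field:value form:
curl -X GET "10.1.130.27:9200/school/student/_search?q=age:18"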
Response:
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 2,
"successful": 2,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1.0,
"hits": [{
"_index": "school",
"_type": "student",
"_id": "1",
"_score": 1.0,
"_source": {
"name": "hello world",
"age": 18
}
}]
}
}
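When a URI search is built in code, the condition has to be URL-encoded. The sketch below builds such a path with Python's standard urlencode; the helper name search_path is hypothetical, and the field:value form of the q parameter is the only query syntax it covers.

```python
from urllib.parse import urlencode


def search_path(index, doc_type, field, value, pretty=True):
    """Build a URI-search path with the condition in the q parameter."""
    params = {"q": f"{field}:{value}"}
    if pretty:
        params["pretty"] = "true"
    # urlencode percent-encodes the colon in field:value.
    return f"/{index}/{doc_type}/_search?{urlencode(params)}"


print(search_path("school", "student", "age", 18))
# /school/student/_search?q=age%3A18&pretty=true
```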
Structured queries with the query DSL (domain-specific language)
curl -X GET "10.1.130.27:9200/school/student/_search?pretty" -d '{"query":{"match":{"name":"hello"}}}'
Response:
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 2,
"successful" : 2,
"failed" : 0
},
"hits" : {
"total" : 1,
"max_score" : 0.19178301,
"hits" : [ {
"_index" : "school",
"_type" : "student",
"_id" : "1",
"_score" : 0.19178301,
"_source" : {
"name" : "hello world",
"age" : 18
}
} ]
}
}
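The request body starts with query. A match clause finds documents whose field contains the given value. It is loosely comparable to LIKE '%value%' in SQL, but it matches whole terms rather than arbitrary substrings. Elasticsearch analyzes each field value into separate terms and adds them to an inverted index, so a match query becomes an index lookup rather than a scan over every row, which is why it is usually much faster than the SQL equivalent.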
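The difference between term matching and substring matching can be sketched in a few lines of Python. The analyze function is a crude stand-in for a real analyzer (lowercase plus whitespace split), and all three function names are hypothetical.

```python
def analyze(text):
    """Very rough stand-in for an analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def match(field_value, query):
    """True if any analyzed query term equals an analyzed term of the field."""
    return bool(set(analyze(query)) & set(analyze(field_value)))


def sql_like(field_value, pattern):
    """Substring containment, as in LIKE '%pattern%' (case-sensitive here)."""
    return pattern in field_value


print(match("hello world", "hello"))    # True
print(match("hello world", "hell"))     # False: "hell" is not a whole term
print(sql_like("hello world", "hell"))  # True: substring match
```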
A more complex query adds a filter. This one keeps only students older than 30 whose name matches hello:
curl -X GET "10.1.130.27:9200/school/student/_search?pretty" -d '
{
  "query": {
    "bool": {
      "must": {
        "match": {
          "name": "hello"
        }
      },
      "filter": {
        "range": {
          "age": { "gt": 30 }
        }
      }
    }
  }
}'
Response:
{
"took" : 6,
"timed_out" : false,
"_shards" : {
"total" : 2,
"successful" : 2,
"failed" : 0
},
"hits" : {
"total" : 2,
"max_score" : 0.37158427,
"hits" : [ {
"_index" : "school",
"_type" : "student",
"_id" : "2",
"_score" : 0.37158427,
"_source" : {
"name" : "hello world",
"age" : 31
}
}, {
"_index" : "school",
"_type" : "student",
"_id" : "3",
"_score" : 0.19178301,
"_source" : {
"name" : "hello world",
"age" : 35
}
} ]
}
}
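The logic of this bool query can be sketched over an in-memory list. The must clause decides which documents match on name, and the filter clause only includes or excludes documents by age. Scoring is left out, the function name bool_search is hypothetical, and the sample records are taken from the responses shown in this article.

```python
# Student records as they appear in the responses in this article.
students = [
    {"_id": "1", "name": "hello world", "age": 18},
    {"_id": "2", "name": "hello world", "age": 31},
    {"_id": "3", "name": "hello world", "age": 35},
]


def bool_search(docs, must_term, age_gt):
    """Return ids of docs whose name contains must_term and whose age > age_gt."""
    hits = []
    for doc in docs:
        # must: the term has to appear among the analyzed name terms.
        if must_term not in doc["name"].lower().split():
            continue
        # filter: a yes/no check that does not influence relevance.
        if not doc["age"] > age_gt:
            continue
        hits.append(doc["_id"])
    return hits


print(bool_search(students, "hello", 30))  # ['2', '3']
```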