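Overview
This article collects commonly used Elasticsearch query syntax and compares how match and term behave when filtering on a field, with examples for both keyword and text fields. More usages will be added later.
Environment setup
- Elasticsearch 7.6.1 (see the guide on installing Elasticsearch with Docker)
- Kibana 7.6.1 (see the guide on installing Kibana with Docker)
Create the index
- Create an index named userdb
- name: keyword type, not analyzed
- name2: text type, analyzed
- age: integer type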
PUT userdb
{
  "mappings": {
    "properties": {
      "name": {
        "type": "keyword"
      },
      "name2": {
        "type": "text"
      },
      "age": {
        "type": "integer"
      }
    }
  }
}
Index the documents
PUT /userdb/_doc/1
{
  "name": "张小明",
  "name2": "张小明",
  "age": 25
}
PUT /userdb/_doc/2
{
  "name": "张小红",
  "name2": "张小红",
  "age": 23
}
Note: every analyzed field below uses the standard analyzer, which splits Chinese text into single characters.
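To check how a field's analyzer actually splits a string, the _analyze API can be run against the index. This is a minimal sketch, assuming the userdb index created above; it is an addition for illustration and not one of the original tests. For name2 the standard analyzer should return the three tokens 张, 小 and 明.

```
GET /userdb/_analyze
{
  "field": "name2",
  "text": "张小明"
}
```

Running the same request with "field": "name" should return the whole string as a single token, because a keyword field is not split.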
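match query
When match filters a keyword field, the search text is not analyzed, so it is not split into tokens.
At query time the whole search text is compared against the whole field value, so only an exact match counts.
Test condition: "name": "张", 0 hits
Similar to SQL: name = '张'
GET /userdb/_search
{
  "query": {
    "match": {
      "name": "张"
    }
  }
}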
Result:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}
Test condition: "name": "张小明", 1 hit
Similar to SQL: name = '张小明'
GET /userdb/_search
{
  "query": {
    "match": {
      "name": "张小明"
    }
  }
}
Result:
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.6931471,
    "hits" : [
      {
        "_index" : "userdb",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.6931471,
        "_source" : {
          "name" : "张小明",
          "name2" : "张小明",
          "age" : 25
        }
      }
    ]
  }
}
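When match filters a text field, the search text is analyzed into tokens before the query runs, and a document is returned as soon as any one of those tokens matches.
Both the search text and the field value are tokenized, so matching happens at the finest granularity.
Test condition: "name2": "张小明", 2 hits
Equivalent to name2: 张 OR name2: 小 OR name2: 明
GET /userdb/_search
{
  "query": {
    "match": {
      "name2": "张小明"
    }
  }
}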
Result:
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0577903,
    "hits" : [
      {
        "_index" : "userdb",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0577903,
        "_source" : {
          "name" : "张小明",
          "name2" : "张小明",
          "age" : 25
        }
      },
      {
        "_index" : "userdb",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.36464313,
        "_source" : {
          "name" : "张小红",
          "name2" : "张小红",
          "age" : 23
        }
      }
    ]
  }
}
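The OR behaviour seen in this result is the default value of the match query's operator parameter. As a minimal sketch against the same userdb data (an addition here, not one of the original tests), setting operator to and requires every token (张, 小, 明) to be present in name2, which should leave only document 1.

```
GET /userdb/_search
{
  "query": {
    "match": {
      "name2": {
        "query": "张小明",
        "operator": "and"
      }
    }
  }
}
```

Document 2 contains 张 and 小 but not 明, so it should no longer match.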
term exact query
term is an exact match: the search text is not analyzed at query time.
Querying a keyword field, test condition: "name": "张小明", 1 hit
Similar to SQL: name = '张小明'
GET /userdb/_search
{
  "query": {
    "term": {
      "name": "张小明"
    }
  }
}
Result:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.6931471,
    "hits" : [
      {
        "_index" : "userdb",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.6931471,
        "_source" : {
          "name" : "张小明",
          "name2" : "张小明",
          "age" : 25
        }
      }
    ]
  }
}
Querying a text field, test condition: "name2": "张小明", 0 hits
GET /userdb/_search
{
  "query": {
    "term": {
      "name2": "张小明"
    }
  }
}
Result:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}
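Why this happens:
term does not analyze the search text, whereas Elasticsearch has already indexed the text field as the separate tokens [张, 小, 明] and [张, 小, 红]. The full name therefore equals none of the indexed tokens, and the result is empty.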
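The explanation above can be checked from the other direction. As a minimal sketch on the same userdb data (an addition, not one of the original tests), a term query for a single indexed token such as 张 should match both documents, since both name2 values contain that token.

```
GET /userdb/_search
{
  "query": {
    "term": {
      "name2": "张"
    }
  }
}
```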
filter query
GET /userdb/_search
{
  "query": {
    "bool": {
      "filter": {
        "term": {
          "name": "张小明"
        }
      }
    }
  }
}
Result:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.0,
    "hits" : [
      {
        "_index" : "userdb",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.0,
        "_source" : {
          "name" : "张小明",
          "name2" : "张小明",
          "age" : 25
        }
      }
    ]
  }
}
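Note that _score is 0 here, because a filter query does not compute a relevance score. This makes filter well suited to exact matching and range filtering, it performs better than a scoring query, and it is the recommended choice for such conditions.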
range filter
Query the documents with age >= 24 and age <= 25
GET /userdb/_search
{
  "query": {
    "range": {
      "age": {
        "gte": 24,
        "lte": 25
      }
    }
  }
}
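A range condition rarely needs a relevance score, so it can also be placed inside a bool filter. The request below is a minimal sketch on the same userdb data, added for illustration; it should return only document 1 (age 25), with a _score of 0.0 as in the filter example earlier.

```
GET /userdb/_search
{
  "query": {
    "bool": {
      "filter": {
        "range": {
          "age": {
            "gte": 24,
            "lte": 25
          }
        }
      }
    }
  }
}
```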
sort
Sort the query results by age in ascending order
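The original request for this step is not included, so the following is a minimal sketch of what it could look like on the userdb index, assuming a match_all query as the base. It should return document 2 (age 23) before document 1 (age 25).

```
GET /userdb/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "age": {
        "order": "asc"
      }
    }
  ]
}
```

Changing "order" to "desc" reverses the order.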
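Pagination
from: the offset of the first record to return, starting at 0 by default
size: the number of records to return, equivalent to the page size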
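The original request for this step is not included. As a minimal sketch on the userdb index, assuming a page size of 1 and an ascending sort on age so that the page order is stable, the first page can be requested like this:

```
GET /userdb/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "age": {
        "order": "asc"
      }
    }
  ],
  "from": 0,
  "size": 1
}
```

With these values the request should return document 2; setting "from" to 1 should return document 1. When size is omitted, Elasticsearch returns 10 records by default.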
must
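must is similar to AND in SQL: every condition inside it has to match.
Placing the conditions name = '张小明' and age = 25 inside must is equivalent to name = '张小明' AND age = 25.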
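The original request for this step is not included. A minimal sketch of such a query on the userdb index, assuming term queries are used for both conditions, could look like this; it should return only document 1.

```
GET /userdb/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "name": "张小明"
          }
        },
        {
          "term": {
            "age": 25
          }
        }
      ]
    }
  }
}
```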
should
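should is similar to OR in SQL: a document matches if at least one condition inside it matches.
Placing the conditions age = 23 and age = 25 inside should is equivalent to age = 23 OR age = 25.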
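The original request for this step is not included. A minimal sketch on the userdb index, assuming term queries on age, could look like this; it should return both documents. The OR reading holds when the bool query has no must or filter clause, in which case at least one should clause has to match.

```
GET /userdb/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "term": {
            "age": 23
          }
        },
        {
          "term": {
            "age": 25
          }
        }
      ]
    }
  }
}
```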
must_not
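must_not means that none of the conditions inside it may match.
Placing the conditions age = 23 and age = 25 inside must_not is equivalent to age != 23 AND age != 25.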
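The original request for this step is not included. A minimal sketch on the userdb index, assuming term queries on age, could look like this; with the two sample documents (ages 25 and 23) it should return no hits.

```
GET /userdb/_search
{
  "query": {
    "bool": {
      "must_not": [
        {
          "term": {
            "age": 23
          }
        },
        {
          "term": {
            "age": 25
          }
        }
      ]
    }
  }
}
```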
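must+should
In the following query a should clause is nested inside must, which is equivalent to the SQL:
select * from testdb
where name='张三' and (createdOn>'2020-09-01 00:00:00' or modifiedOn<'2020-09-01 00:00:00')
The ES syntax is as follows (this assumes createdOn and modifiedOn are mapped as date fields whose format accepts yyyy-MM-dd HH:mm:ss):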
GET /testdb/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "name": "张三"
          }
        },
        {
          "bool": {
            "should": [
              {
                "range": {
                  "createdOn": {
                    "gt": "2020-09-01 00:00:00"
                  }
                }
              },
              {
                "range": {
                  "modifiedOn": {
                    "lt": "2020-09-01 00:00:00"
                  }
                }
              }
            ]
          }
        }
      ]
    }
  }
}
_update_by_query
- Partially update fields on the documents that match a condition
POST /knowledge/_update_by_query
{
  "query": {
    "match": {
      "source": "CenterCollect"
    }
  },
  "script": {
    "source": "ctx._source.sourceName='知识库组采集';"
  }
}
Equivalent to:
update knowledge
set sourceName = '知识库组采集'
where source = 'CenterCollect'
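The same update can also pass the new value through script params instead of embedding it in the script source. This is a minimal sketch of that variant, added for illustration; the parameter name sourceName is chosen here and is not from the original. Keeping the value out of the source lets Elasticsearch reuse the compiled script when only the value changes.

```
POST /knowledge/_update_by_query
{
  "query": {
    "match": {
      "source": "CenterCollect"
    }
  },
  "script": {
    "lang": "painless",
    "source": "ctx._source.sourceName = params.sourceName",
    "params": {
      "sourceName": "知识库组采集"
    }
  }
}
```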
_source: filtering the returned fields
Example: return only the following two fields
GET /advertising/_search
{
  "query": {
    "match_all": {}
  },
  "_source": ["applyName", "adAddress"]
}