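Overview

This article collects commonly used elasticsearch query syntax and compares how match and term behave when filtering on a field, covering both keyword and text fields. More usages will be added later.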

Environment setup

  1. elasticsearch 7.6.1 (see "Installing elasticsearch with docker")
  2. kibana 7.6.1 (see "Installing kibana with docker")

Create the index

  • Create an index named userdb
  • name: keyword type, not analyzed
  • name2: text type, analyzed
  • age: integer type

PUT userdb
{
  "mappings": {
    "properties": {
      "name":{
        "type": "keyword"
      },
      "name2":{
        "type": "text"
      },
      "age":{
        "type": "integer"
      }
    }
  }
}

Index the documents

PUT /userdb/_doc/1
{
  "name":"张小明",
  "name2": "张小明",
  "age": 25
}
PUT /userdb/_doc/2
{
  "name":"张小红",
  "name2":"张小红",
  "age": 23
}

Note: every analyzed example below uses the standard analyzer, which splits Chinese text into individual characters.
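
To see the tokenization for yourself, the _analyze API can be run against the name2 field. This is a minimal sketch that assumes the userdb index created above; the response should list one token per character.

```console
# Ask elasticsearch how the analyzer of name2 (standard) tokenizes the text
POST /userdb/_analyze
{
  "field": "name2",
  "text": "张小明"
}
```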

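match query

When match is run against a keyword field, the search term is NOT analyzed.
At query time the whole search term is compared against the stored field value, so only an exact match hits.

Test condition: "name": "张", 0 hits
Similar SQL: name = '张'

GET /userdb/_search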
{
  "query":{
    "match":{
      "name":"张"
    }
  }
}

Result:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}

Test condition: "name": "张小明", 1 hit
Similar SQL: name = '张小明'

GET /userdb/_search
{
  "query":{
    "match":{
      "name":"张小明"
    }
  }
}

Result:

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.6931471,
    "hits" : [
      {
        "_index" : "userdb",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.6931471,
        "_source" : {
          "name" : "张小明",
          "name2" : "张小明",
          "age" : 25
        }
      }
    ]
  }
}

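When match is run against a text field, the search term is analyzed into tokens first, and a document is returned as soon as any one token matches (the default operator is OR).
Both the search term and the field value are tokenized, so matching happens at the finest granularity.

Test condition: name2: 张小明, 2 hits
Equivalent to name2: 张 OR name2: 小 OR name2: 明

GET /userdb/_search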
{
  "query":{
    "match":{
      "name2":"张小明"
    }
  }
}

Result:

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0577903,
    "hits" : [
      {
        "_index" : "userdb",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0577903,
        "_source" : {
          "name" : "张小明",
          "name2" : "张小明",
          "age" : 25
        }
      },
      {
        "_index" : "userdb",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.36464313,
        "_source" : {
          "name" : "张小红",
          "name2" : "张小红",
          "age" : 23
        }
      }
    ]
  }
}
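
The OR behavior shown above is only the default. As a minimal sketch, the match query also accepts an operator parameter; setting it to and requires every token of the search term to be present, so with the two sample documents only 张小明 should be returned.

```console
# Require all tokens (张, 小, 明) to match instead of any one of them
GET /userdb/_search
{
  "query": {
    "match": {
      "name2": {
        "query": "张小明",
        "operator": "and"
      }
    }
  }
}
```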

term exact query

term is an exact match: the search term is not analyzed at query time.

Querying a keyword field. Test condition: name: 张小明, 1 hit
Similar SQL: name = '张小明'

GET /userdb/_search
{
  "query":{
    "term":{
      "name":"张小明"
    }
  }
}

Result:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.6931471,
    "hits" : [
      {
        "_index" : "userdb",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.6931471,
        "_source" : {
          "name" : "张小明",
          "name2" : "张小明",
          "age" : 25
        }
      }
    ]
  }
}

Querying a text field. Test condition: name2: 张小明, 0 hits

GET /userdb/_search
{
  "query":{
    "term":{
      "name2":"张小明"
    }
  }
}

Result:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}

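Why: term does not analyze the search term, while the text field values were indexed as the tokens [张, 小, 明] and [张, 小, 红]. No single indexed token equals the full name, so the result is empty.

filter query

GET /userdb/_search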
{
  "query":{
    "bool":{
      "filter":{
        "term":{
          "name": "张小明"
        }
      }
    }
  }
}

Result:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.0,
    "hits" : [
      {
        "_index" : "userdb",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.0,
        "_source" : {
          "name" : "张小明",
          "name2" : "张小明",
          "age" : 25
        }
      }
    ]
  }
}

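Note that _score is 0 here.
Clauses in filter context do not compute a relevance score and can be cached, so they are usually faster. filter is a good fit for exact matches and range filtering and is recommended for those cases.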
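
filter also accepts an array of clauses, which makes it easy to combine an exact match with a range. A minimal sketch against the sample userdb index (both clauses must match, and no score is computed):

```console
# Exact match on name plus an age range, both in filter context
GET /userdb/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "name": "张小明" } },
        { "range": { "age": { "gte": 24, "lte": 25 } } }
      ]
    }
  }
}
```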

range query

Find documents where age >= 24 and age <= 25

GET /userdb/_search
{
  "query":{
    "range":{
      "age":{
        "gte": 24,
        "lte": 25
      }
    }
  }
}

sort

Use sort to order the query results, for example by age in ascending order.
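
A minimal sketch of such a query, assuming the userdb index from earlier. With the two sample documents, the age 23 record should come before the age 25 record.

```console
# Return every document, ordered by age ascending
GET /userdb/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    { "age": { "order": "asc" } }
  ]
}
```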

Pagination

from: offset of the first hit to return, starting at 0 (the default)
size: number of hits to return, i.e. the page size
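
A minimal sketch of the first page with a page size of 10, assuming the userdb index from earlier. For page n (counting from 0), from would be n * size; by default from + size may not exceed 10000 (the index.max_result_window setting).

```console
# First page: skip 0 hits, return at most 10
GET /userdb/_search
{
  "query": {
    "match_all": {}
  },
  "from": 0,
  "size": 10
}
```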

must

must is similar to AND in SQL
For example, two clauses on name and age inside must are equivalent to name = '张小明' and age = 25
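
A minimal sketch of that condition, assuming the userdb index from earlier and using term clauses because name is a keyword field and age is an integer:

```console
# name = '张小明' AND age = 25
GET /userdb/_search
{
  "query": {
    "bool": {
      "must": [
        { "term": { "name": "张小明" } },
        { "term": { "age": 25 } }
      ]
    }
  }
}
```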

should

should is similar to OR in SQL
For example, two clauses on age inside should are equivalent to age = 23 or age = 25
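
A minimal sketch of that condition, assuming the userdb index from earlier. Because the bool query contains only should clauses, at least one of them has to match.

```console
# age = 23 OR age = 25
GET /userdb/_search
{
  "query": {
    "bool": {
      "should": [
        { "term": { "age": 23 } },
        { "term": { "age": 25 } }
      ]
    }
  }
}
```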

must_not

must_not requires that none of its clauses match
For example, two clauses on age inside must_not are equivalent to age != 23 and age != 25
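
A minimal sketch of that condition, assuming the userdb index from earlier. With only the two sample documents (ages 25 and 23) it should return no hits.

```console
# age != 23 AND age != 25
GET /userdb/_search
{
  "query": {
    "bool": {
      "must_not": [
        { "term": { "age": 23 } },
        { "term": { "age": 25 } }
      ]
    }
  }
}
```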

must + should

Nesting a bool query with should inside must corresponds to the following SQL:

select * from testdb 
where name='张三' and (createdOn>'2020-09-01 00:00:00' or modifiedOn<'2020-09-01 00:00:00')

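The ES query is as follows (the date strings assume createdOn and modifiedOn are date fields mapped with the format yyyy-MM-dd HH:mm:ss):

GET /testdb/_search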
{
  "query":{
    "bool": {
      "must":[
        {
          "match":{
            "name":"张三"
          }
        },
        {
          "bool":{
            "should":[
              {
                "range":{
                  "createdOn":{
                    "gt":"2020-09-01 00:00:00"
                  }
                }
              },
              {
                "range":{
                  "modifiedOn":{
                    "lt":"2020-09-01 00:00:00"
                  }
                }
              }
            ]
          }
        }
      ]
    }
  }
}

_update_by_query

  • Update a field on every document that matches a condition
POST /knowledge/_update_by_query
{
  "query":{
    "match":{
      "source":"CenterCollect"
    }
  },
  "script":{
    "source":"ctx._source.sourceName='知识库组采集';"
  }
}

Equivalent to:

update knowledge 
set sourceName = '知识库组采集' 
where source = 'CenterCollect'

_source field filtering

Example: return only the following 2 fields

GET /advertising/_search
{
  "query":{
    "match_all":{
      
    }
  },
  "_source":["applyName","adAddress"]
}