Contents

  • Bool Query
  • Data preparation
  • must
  • should
  • filter
  • must_not
  • Summary


Bool Query

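A bool query supports four clause types:

| Type | Description |
| --- | --- |
| must | May contain multiple query clauses. A document is returned only if it matches every clause, and each clause contributes to the relevance score. |
| should | May contain multiple query clauses. When the bool query has no must or filter clause, a document must match at least one should clause to be returned; otherwise no should clause is required to match. The more should clauses a document matches, the higher its relevance score. |
| filter | May contain multiple filter clauses. A document is returned only if it matches every clause. Filter clauses do not contribute to the relevance score, and their results are cached under certain conditions. |
| must_not | May contain multiple filter clauses. A document is returned only if it matches none of them. These clauses do not contribute to the relevance score, and their results are cached under certain conditions. |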
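
The matching rules above can be modeled in a few lines of plain Java. This is only a sketch and not Elasticsearch code: `BoolModel` and its fields are hypothetical names, clauses are reduced to `Predicate`s, and scoring is ignored. It illustrates which documents match, including the default rule for should clauses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical in-memory model of bool matching (no scoring, no Elasticsearch). */
final class BoolModel<T> {
    final List<Predicate<T>> must = new ArrayList<>();
    final List<Predicate<T>> should = new ArrayList<>();
    final List<Predicate<T>> filter = new ArrayList<>();
    final List<Predicate<T>> mustNot = new ArrayList<>();

    boolean matches(T doc) {
        // must and filter: every clause has to match
        for (Predicate<T> p : must) {
            if (!p.test(doc)) return false;
        }
        for (Predicate<T> p : filter) {
            if (!p.test(doc)) return false;
        }
        // must_not: no clause may match
        for (Predicate<T> p : mustNot) {
            if (p.test(doc)) return false;
        }
        if (should.isEmpty()) return true;
        // should: one match is required only when there is no must or filter clause
        int required = (must.isEmpty() && filter.isEmpty()) ? 1 : 0;
        int matched = 0;
        for (Predicate<T> p : should) {
            if (p.test(doc)) matched++;
        }
        return matched >= required;
    }
}
```

With only should clauses, a document that matches none of them is rejected. As soon as a must or filter clause is added, the same should clauses become optional and only influence the score.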

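Data preparation

The index mapping is as follows. The description field uses the ik_max_word analyzer, which requires the IK analysis plugin to be installed: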

PUT bool_index
{
  "settings": {
    "number_of_replicas": 1,
    "number_of_shards": 1
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text"
      },
      "age": {
        "type": "long"
      },
      "description": {
        "type": "text",
        "analyzer": "ik_max_word",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      }
    }
  }
}

Index the following documents:

POST /bool_index/_bulk
{"index":{"_id":1}}
{"name":"张三","age":11,"description":"北京故宫圆明园"}
{"index":{"_id":2}}
{"name":"王五","age":15,"description":"南京总统府"}
{"index":{"_id":3}}
{"name":"张三","age":18,"description":"北京市天安门广场"}
{"index":{"_id":4}}
{"name":"富贵","age":22,"description":"南京市中山陵"}
{"index":{"_id":5}}
{"name":"来福","age":8,"description":"山东济南趵突泉"}
{"index":{"_id":6}}
{"name":"憨憨","age":27,"description":"安徽黄山九华山"}
{"index":{"_id":7}}
{"name":"小七","age":31,"description":"上海东方明珠"}
{"index":{"_id":8}}
{"name":"张三","age":11,"description":"南京总统"}

must

DSL: find documents whose name contains “张三” and whose description contains “北京”

GET bool_index/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "name": "张三"
          }
        },
        {
          "match": {
            "description": "北京"
          }
        }
      ]
    }
  }
}

The response is as follows:
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 3.3848772,
    "hits" : [
      {
        "_index" : "bool_index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 3.3848772,
        "_source" : {
          "name" : "张三",
          "age" : 11,
          "description" : "北京故宫圆明园"
        }
      },
      {
        "_index" : "bool_index",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 2.8753755,
        "_source" : {
          "name" : "张三",
          "age" : 18,
          "description" : "北京市天安门广场"
        }
      }
    ]
  }
}

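Spring Boot implementation:

// Imports used by the controller methods in this article
// (javax.annotation assumes Spring Boot 2.x; ApiOperation comes from the Swagger 2 annotations)
import io.swagger.annotations.ApiOperation;
import javax.annotation.Resource;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;

    // The members below belong to a Spring MVC controller class
    private static final String INDEX_NAME = "bool_index";

    @Resource
    private RestHighLevelClient client;

    @RequestMapping(value = "/mustQuery", method = RequestMethod.GET)
    @ApiOperation(value = "DSL - mustQuery")
    public void mustQuery() throws Exception {
        // Build the search request
        SearchRequest searchRequest = new SearchRequest(INDEX_NAME);
        // bool query: both must clauses have to match
        searchRequest.source(new SearchSourceBuilder().query(
                QueryBuilders.boolQuery()
                        .must(QueryBuilders.matchQuery("name", "张三"))
                        .must(QueryBuilders.matchQuery("description", "北京"))
        ));
        // Execute the search and print the response
        printLog(client.search(searchRequest, RequestOptions.DEFAULT));
    }

    private void printLog(SearchResponse searchResponse) {
        SearchHits hits = searchResponse.getHits();
        // Print the number of hits (the label means "length of the returned hits array")
        System.out.println("返回hits数组长度:" + hits.getHits().length);
        // Print the _source of each hit
        for (SearchHit hit : hits.getHits()) {
            System.out.println(hit.getSourceAsMap().toString());
        }
    }

The output is as follows: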
返回hits数组长度:2
{name=张三, description=北京故宫圆明园, age=11}
{name=张三, description=北京市天安门广场, age=18}

should

DSL: find documents whose name contains “张三” or whose description contains “北京”

GET bool_index/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "name": "张三"
          }
        },
        {
          "match": {
            "description": "北京"
          }
        }
      ]
    }
  }
}

The response is as follows:
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 3.3848772,
    "hits" : [
      {
        "_index" : "bool_index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 3.3848772,
        "_source" : {
          "name" : "张三",
          "age" : 11,
          "description" : "北京故宫圆明园"
        }
      },
      {
        "_index" : "bool_index",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 2.8753755,
        "_source" : {
          "name" : "张三",
          "age" : 18,
          "description" : "北京市天安门广场"
        }
      },
      {
        "_index" : "bool_index",
        "_type" : "_doc",
        "_id" : "8",
        "_score" : 1.8889232,
        "_source" : {
          "name" : "张三",
          "age" : 11,
          "description" : "南京总统"
        }
      }
    ]
  }
}

Spring Boot implementation:

    @RequestMapping(value = "/shouldQuery", method = RequestMethod.GET)
    @ApiOperation(value = "DSL - shouldQuery")
    public void shouldQuery() throws Exception {
        // Build the search request
        SearchRequest searchRequest = new SearchRequest(INDEX_NAME);
        // bool query: at least one of the should clauses has to match
        searchRequest.source(new SearchSourceBuilder().query(
                QueryBuilders.boolQuery()
                        .should(QueryBuilders.matchQuery("name", "张三"))
                        .should(QueryBuilders.matchQuery("description", "北京"))
        ));
        // Execute the search and print the response
        printLog(client.search(searchRequest, RequestOptions.DEFAULT));
    }

The output is as follows:
返回hits数组长度:3
{name=张三, description=北京故宫圆明园, age=11}
{name=张三, description=北京市天安门广场, age=18}
{name=张三, description=南京总统, age=11}
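
The ordering of the three hits follows from how a bool query scores documents: the scores of the matching must and should clauses are added together, so a document that matches more should clauses ranks higher. The sketch below redoes that arithmetic with the scores from the should response. It rests on one assumption that is not stated in the response itself: the name clause gives documents 1, 3 and 8 the same score, because their name fields are identical and the index has a single shard. `BoolScoreSketch` and `boolScore` are hypothetical names.

```java
/** Hypothetical helper that redoes the bool score arithmetic from the response. */
final class BoolScoreSketch {

    /** A bool score modeled as the sum of the scores of the matching clauses. */
    static double boolScore(double... clauseScores) {
        double sum = 0.0;
        for (double s : clauseScores) {
            sum += s;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Scores copied from the should response
        double doc8 = 1.8889232; // matches only the name clause
        double doc1 = 3.3848772; // matches both clauses
        double doc3 = 2.8753755; // matches both clauses

        // Assumption: the name clause scores documents 1, 3 and 8 identically
        double nameScore = doc8;
        double description1 = doc1 - nameScore;
        double description3 = doc3 - nameScore;

        System.out.printf("description clause, doc 1: %.7f%n", description1);
        System.out.printf("description clause, doc 3: %.7f%n", description3);
        System.out.printf("rebuilt score, doc 1: %.7f%n", boolScore(nameScore, description1));
    }
}
```

Under that assumption the description clause contributes roughly 1.50 to document 1 and roughly 0.99 to document 3, which would explain why document 1 ranks first even though both documents match the same two clauses.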

filter

DSL: find documents whose name contains “张三” or whose description contains “北京”, and whose age is at least 15 (age >= 15)

GET bool_index/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "bool": {
            "should": [
              {
                "match": {
                  "name": "张三"
                }
              },
              {
                "match": {
                  "description": "北京"
                }
              }
            ]
          }
        },
        {
          "bool": {
            "filter": [
              {
                "range": {
                  "age": {
                    "gte": 15
                  }
                }
              }
            ]
          }
        }
      ]
    }
  }
}

The response is as follows:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 2.8753755,
    "hits" : [
      {
        "_index" : "bool_index",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 2.8753755,
        "_source" : {
          "name" : "张三",
          "age" : 18,
          "description" : "北京市天安门广场"
        }
      }
    ]
  }
}

Spring Boot implementation:

    @RequestMapping(value = "/filterQuery", method = RequestMethod.GET)
    @ApiOperation(value = "DSL - filterQuery")
    public void filterQuery() throws Exception {
        // Build the search request
        SearchRequest searchRequest = new SearchRequest(INDEX_NAME);
        // Outer bool: the nested should group AND the age filter both have to match
        searchRequest.source(new SearchSourceBuilder().query(
                QueryBuilders.boolQuery()
                        .must(QueryBuilders.boolQuery()
                                .should(QueryBuilders.matchQuery("name", "张三"))
                                .should(QueryBuilders.matchQuery("description", "北京")))
                        .must(QueryBuilders.boolQuery()
                                // numeric value, matching the DSL above
                                .filter(QueryBuilders.rangeQuery("age").gte(15)))
        ));
        // Execute the search and print the response
        printLog(client.search(searchRequest, RequestOptions.DEFAULT));
    }

The output is as follows:
返回hits数组长度:1
{name=张三, description=北京市天安门广场, age=18}
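
The query above wraps the should clauses in their own bool query inside must. The reason is the default rule for should: if should and filter sat side by side in one bool query, the should clauses would become optional and every document with age >= 15 would match. The sketch below reproduces this on the sample data in plain Java. It is a simplified model and not Elasticsearch code: `ShouldWithFilterSketch`, `Doc` and `search` are hypothetical names, and `String.contains` stands in for the analyzed match queries, which happens to select the same documents for this data set.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of a flat bool query: two should clauses plus one filter. */
final class ShouldWithFilterSketch {

    static final class Doc {
        final int id;
        final String name;
        final int age;
        final String description;

        Doc(int id, String name, int age, String description) {
            this.id = id;
            this.name = name;
            this.age = age;
            this.description = description;
        }
    }

    // The eight documents indexed in the data preparation step
    static final Doc[] DOCS = {
        new Doc(1, "张三", 11, "北京故宫圆明园"),
        new Doc(2, "王五", 15, "南京总统府"),
        new Doc(3, "张三", 18, "北京市天安门广场"),
        new Doc(4, "富贵", 22, "南京市中山陵"),
        new Doc(5, "来福", 8, "山东济南趵突泉"),
        new Doc(6, "憨憨", 27, "安徽黄山九华山"),
        new Doc(7, "小七", 31, "上海东方明珠"),
        new Doc(8, "张三", 11, "南京总统"),
    };

    /** Returns the ids matched when should and filter share one bool query. */
    static List<Integer> search(int minimumShouldMatch) {
        List<Integer> ids = new ArrayList<>();
        for (Doc d : DOCS) {
            int shouldHits = 0;
            if (d.name.contains("张三")) shouldHits++;        // should clause 1
            if (d.description.contains("北京")) shouldHits++; // should clause 2
            boolean filterOk = d.age >= 15;                    // filter clause
            if (filterOk && shouldHits >= minimumShouldMatch) {
                ids.add(d.id);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(search(0)); // default when a filter is present: [2, 3, 4, 6, 7]
        System.out.println(search(1)); // same result as the nested query: [3]
    }
}
```

In the DSL, the second behavior can also be obtained without nesting by setting minimum_should_match to 1 on the bool query; in the Java client the corresponding call is `minimumShouldMatch(1)` on the `BoolQueryBuilder`.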

must_not

DSL: find documents whose age is not one of [11, 15, 18, 22]

GET bool_index/_search
{
  "query": {
    "bool": {
      "must_not": [
        {
          "terms": {
            "age": [
              "11",
              "15",
              "18",
              "22"
            ]
          }
        }
      ]
    }
  }
}

The response is as follows:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 0.0,
    "hits" : [
      {
        "_index" : "bool_index",
        "_type" : "_doc",
        "_id" : "5",
        "_score" : 0.0,
        "_source" : {
          "name" : "来福",
          "age" : 8,
          "description" : "山东济南趵突泉"
        }
      },
      {
        "_index" : "bool_index",
        "_type" : "_doc",
        "_id" : "6",
        "_score" : 0.0,
        "_source" : {
          "name" : "憨憨",
          "age" : 27,
          "description" : "安徽黄山九华山"
        }
      },
      {
        "_index" : "bool_index",
        "_type" : "_doc",
        "_id" : "7",
        "_score" : 0.0,
        "_source" : {
          "name" : "小七",
          "age" : 31,
          "description" : "上海东方明珠"
        }
      }
    ]
  }
}

Spring Boot implementation:

    @RequestMapping(value = "/mustNotQuery", method = RequestMethod.GET)
    @ApiOperation(value = "DSL - mustNotQuery")
    public void mustNotQuery() throws Exception {
        // Build the search request
        SearchRequest searchRequest = new SearchRequest(INDEX_NAME);
        // bool query: exclude every document whose age is one of the listed values
        searchRequest.source(new SearchSourceBuilder().query(
                QueryBuilders.boolQuery()
                        .mustNot(QueryBuilders.termsQuery("age", new String[]{"11", "15", "18", "22"}))
        ));
        // Execute the search and print the response
        printLog(client.search(searchRequest, RequestOptions.DEFAULT));
    }

The output is as follows:
返回hits数组长度:3
{name=来福, description=山东济南趵突泉, age=8}
{name=憨憨, description=安徽黄山九华山, age=27}
{name=小七, description=上海东方明珠, age=31}

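Summary

The clauses of a bool query run in one of two contexts.

  • Query context: Elasticsearch computes a relevance score for each document against the query and sorts the results by relevance, so the best-matching documents come first. This has some performance cost. Full-text queries that involve text analysis are a good fit for query context. must and should run in query context.
      • Advantage: results can be sorted by relevance score, which suits searches that need relevance ranking.
      • Disadvantage: computing relevance scores costs extra computation, so search performance may suffer on large data sets.
  • Filter context: documents are selected by the given filter conditions without computing a relevance score; only documents that satisfy the conditions are returned. Examples are a term query that checks whether a value equals the search input, or a range query that checks whether a value lies in an interval. Queries in filter context skip the score computation and can use caching to speed up responses, so many term-level queries are a good fit. filter and must_not run in filter context.
      • Advantage: no relevance score is computed, so queries are fast; this suits searches that need efficient filtering.
      • Disadvantage: results cannot be ranked by relevance, so it does not suit searches that need relevance ranking.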
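
The difference between the two contexts can be illustrated with a small plain-Java sketch. A filter only answers yes or no for each document, so its result can be stored as a bit set and reused by later searches, whereas a score depends on the whole query and has to be recomputed. The code below is a simplified illustration under that assumption, not a description of how Elasticsearch implements its cache; `FilterCacheSketch`, `filter` and `evaluations` are hypothetical names.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntPredicate;

/** Hypothetical sketch: filter results cached as bit sets keyed by the filter text. */
final class FilterCacheSketch {
    private final int[] ages;
    private final Map<String, BitSet> cache = new HashMap<>();
    int evaluations = 0; // how many times a filter was actually computed

    FilterCacheSketch(int[] ages) {
        this.ages = ages;
    }

    /** Returns one bit per document: set if the document passes the filter. */
    BitSet filter(String key, IntPredicate agePredicate) {
        BitSet cached = cache.get(key);
        if (cached != null) {
            return cached; // reuse the earlier result, no per-document work
        }
        evaluations++;
        BitSet bits = new BitSet(ages.length);
        for (int i = 0; i < ages.length; i++) {
            if (agePredicate.test(ages[i])) {
                bits.set(i);
            }
        }
        cache.put(key, bits);
        return bits;
    }

    public static void main(String[] args) {
        // Ages of the eight sample documents, in _id order
        FilterCacheSketch index = new FilterCacheSketch(new int[]{11, 15, 18, 22, 8, 27, 31, 11});
        BitSet first = index.filter("age>=15", age -> age >= 15);
        BitSet second = index.filter("age>=15", age -> age >= 15);
        System.out.println(first.cardinality()); // number of documents passing the filter
        System.out.println(first == second);     // the second call is served from the cache
        System.out.println(index.evaluations);   // the filter was computed only once
    }
}
```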