目录
目标
ES版本信息
官方文档
实战
新增测试数据
聚合函数
聚合前过滤数据
去重聚合
Percentiles百分比
Percentile ranks计算百分位数等级
top_hits
目标
掌握Metrics aggregations(度量聚合,也作指标聚合 )语法,通过本文列举的各种案例举一反三。具体会涉及以下内容:
- 常用的聚合函数,如:min()、max()、avg()、sum()等;
- 聚合前过滤数据;
- 去重聚合;
- 详细分析Percentiles(百分比);
- 详细分析Percentile ranks(计算百分位数等级);
- 按照范围聚合。
- top_hits(相当于MySQL8.0版本中的窗口函数。)
ES版本信息
7.17.5
官方文档
实战
新增测试数据
PUT /library_db
{
"settings": {
"index": {
"analysis.analyzer.default.type": "ik_max_word"
}
}
}
PUT /library_db/_bulk
{"index":{"_id":"1"}}
{"id":1,"type":"玄幻","name":"诛仙","words_num":120,"chapter_num":600,"completion_time":"2000-09-01","author":"萧鼎","prices":32.12}
{"index":{"_id":"2"}}
{"id":2,"type":"玄幻","name":"诛仙前传:蛮荒行","words_num":30,"chapter_num":67,"completion_time":"2020-09-01","author":"萧鼎","prices":23.12}
{"index":{"_id":"3"}}
{"id":3,"type":"武侠","name":"天龙八部","words_num":80,"chapter_num":120,"completion_time":"1995-09-01","author":"金庸","prices":52.1}
{"index":{"_id":"4"}}
{"id":4,"type":"武侠","name":"射雕英雄传","words_num":67,"chapter_num":95,"completion_time":"1998-01-01","author":"金庸","prices":4.12}
{"index":{"_id":"5"}}
{"id":5,"type":"武侠","name":"神雕侠侣","words_num":75,"chapter_num":76,"completion_time":"2000-01-01","author":"金庸","prices":32.8}
{"index":{"_id":"6"}}
{"id":5,"type":"武侠","name":"倚天屠龙记","words_num":83,"chapter_num":130,"completion_time":"2003-01-01","author":"金庸","prices":100.12}
{"index":{"_id":"7"}}
{"id":7,"type":"玄幻","name":"凡人修仙传","words_num":600,"chapter_num":3000,"completion_time":"2018-01-01","author":"忘语","prices":120.12}
{"index":{"_id":"8"}}
{"id":8,"type":"玄幻","name":"魔天记","words_num":159,"chapter_num":400,"completion_time":"2019-01-01","author":"忘语","prices":11.12}
{"index":{"_id":"9"}}
{"id":9,"type":"都市异能","name":"黄金瞳","words_num":220,"chapter_num":400,"completion_time":"2019-01-01","author":"打眼","prices":74.5}
{"index":{"_id":"10"}}
{"id":10,"type":"玄幻","name":"将夜","words_num":210,"chapter_num":600,"completion_time":"2014-01-01","author":"血红","prices":32.0}
{"index":{"_id":"11"}}
{"id":11,"type":"军事","name":"亮剑","words_num":120,"chapter_num":100,"completion_time":"2012-01-01","author":"都梁","prices":15.0}
聚合函数
需求:在图书馆索引中,求小说单价最大值、单价最小值、单价平均值、单价总和值、小说总数量值。
方法一:使用min()、max()、avg()、sum()等函数进行查询。
#求平均值
#注意:这里的size表示针对该索引展示0条文档。
GET /library_db/_search
{
"size":0,
"aggs": {
"avg_prices": {
"avg": {
"field": "prices"
}
}
}
}
#求和
GET /library_db/_search
{
"size":0,
"aggs": {
"sum_prices": {
"sum": {
"field": "prices"
}
}
}
}
#求最大值
GET /library_db/_search
{
"size":0,
"aggs": {
"max_prices": {
"max": {
"field": "prices"
}
}
}
}
#求最小值
GET /library_db/_search
{
"size":0,
"aggs": {
"min_prices": {
"min": {
"field": "prices"
}
}
}
}
#求书本数量
GET /library_db/_search
{
"size":0,
"aggs": {
"count_prices": {
"value_count": {
"field": "id"
}
}
}
}
方法二:使用stats方法统计所有聚合值。
#求书本数量
GET /library_db/_search
{
"size":0,
"aggs": {
"stats_prices": {
"stats": {
"field": "prices"
}
}
}
}
聚合前过滤数据
需求:求价格低于50元的武侠小说数量。
GET /library_db/_search
{
"size": 0,
"query": {
"bool": {
"must": [
{
"range": {
"prices": {
"lt": 50
}
}
},
{
"term": {
"type.keyword": {
"value": "武侠"
}
}
}
]
}
},
"aggs": {
"num": {
"value_count": {
"field": "id"
}
}
}
}
去重聚合
需求:求图书馆中的小说类别数量。
GET /library_db/_search
{
"size": 0,
"aggs": {
"num": {
"cardinality": {
"field": "type.keyword"
}
}
}
}
Percentiles百分比
Percentiles功能
计算某百分比对应的阈值。如:80表示80%,假如80对应书本价格为45.5,表示图书馆中书本价格低于45.5元的占了80%的比例。
Percentiles注意事项
- 百分比范围:[100,0];
- 可以不指定百分比区间,百分比区间默认值为:[ 1, 5, 25, 50, 75, 95, 99 ];
- 匹配的字段必须是数字类型。
需求:求图书馆中书本价格对应的各个百分比。
GET /library_db/_search
{
"size": 0,
"aggs": {
"prices_percentiles": {
"percentiles": {
"field": "prices",
"percents": [
10,
20,
30,
40,
50,
60,
70,
80,
90,
100
]
}
}
}
}
Percentile ranks计算百分位数等级
Percentile ranks功能
计算指定阈值对应的百分比等级。如:书本价格为45.5对应80.0,表示图书馆中书本价格低于45.5元的占了80%的比例。
Percentile ranks注意事项
匹配的字段必须是数字类型。
需求:求价格大于或等于50、110、130的书本所占的百分比。
GET /library_db/_search
{
"size": 0,
"aggs": {
"prices_ranks": {
"percentile_ranks": {
"field": "prices",
"values": [ 50,110,130 ]
}
}
}
}
top_hits
需求:求图书馆中每个类型的小说中字数最多的2本小说。
GET /library_db/_search
{
"size": 0,
"aggs": {
"type_count": {
"terms": {
"field": "type.keyword"
},
"aggs": {
"type_count": {
"top_hits": {
"size": 2,
"sort": [
{
"words_num": {
"order": "desc"
}
}
]
}
}
}
}
}
}