0 A _search request can target several indices and types at once

GET /index1,index2/type1,type2/_search

GET /_all/type1/_search  (searches the type1 documents of every index)

GET /_all/type1/_search?from=0&size=5  (from and size are the paging parameters)
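
As a small illustration of how from and size relate to a page number, the Python sketch below derives both parameters. The helper name page_params and the 1-based page convention are assumptions made for this example; they are not part of Elasticsearch.

```python
def page_params(page, size):
    """Return the from/size query parameters for a 1-based page number."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    # from is the number of hits to skip before the requested page starts
    return {"from": (page - 1) * size, "size": size}


print(page_params(1, 5))  # first page, same as ?from=0&size=5
print(page_params(3, 5))  # third page skips the first 10 hits
```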

1 Index a document with a manually specified ID

PUT /index1/type1/1
{
"content1":"abcnt地方士大夫",
"age":"abc你的"
}

2 Index a document and let Elasticsearch generate the ID

POST /index1/type1
{
"content1":"abcnt地方士大夫",
"age":"abc你的"
}

3 Get a single document and restrict which fields are returned

GET /index1/type1/1?_source=age,content1

4 Update with a custom (external) version number: the write is accepted only if the supplied version is greater than the current one

PUT /index1/type1/1?version=5&version_type=external  (the stored _version must be less than 5)
{
"description":"程序员一枚~~~"
}

5 Partial update: change only some fields of a document

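POST /index1/type1/1/_update?version=13  (the update is applied only if version equals the current version; if the merged content is identical to the stored document, nothing changes and the version stays the same; the read-merge-reindex cycle runs inside the shard, which keeps the window for conflicting writes small)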
{
"doc":{
"description":"程序员一枚~~~7778899056"
}
}
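
The two version checks in items 4 and 5 can be summarised in a few lines of Python. This is only a model of the rule as described in these notes, not Elasticsearch code; the function name accepts_write is made up for the illustration.

```python
def accepts_write(current_version, supplied_version, version_type="internal"):
    """Model of the version check described in items 4 and 5."""
    if version_type == "external":
        # external: the supplied version must be strictly greater than the stored one
        return supplied_version > current_version
    # internal (default): the supplied version must equal the stored one
    return supplied_version == current_version


print(accepts_write(4, 5, "external"))   # stored 4, supplied 5
print(accepts_write(5, 5, "external"))   # equal versions are rejected for external
print(accepts_write(13, 13))             # internal check needs an exact match
```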

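6 Fetch several documents in one request with GET /_mget. Each entry needs an index, type and id; any of them can be moved into the URL instead, so the body gets shorter as the URL gets more specific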

GET /_mget
{
"docs":[
{
"_index":"index1",
"_type":"type1",
"_id":"1",

  "_version":16
},
{
"_index":"index1",
"_type":"type1",
"_id":"2"
}
]
}

GET /index1/_mget
{
"docs":[
{
"_type":"type1",
"_id":"1",
"_version":16
},
{
"_type":"type1",
"_id":"2"
}
]
}

GET /index1/type1/_mget
{
"ids":[1,2]
}

7 _search returns the first 10 hits by default (timeout=1ms sets a search timeout)

GET /index1/type1/_search?timeout=1ms

8 _search?q=xxx searches across all fields (through the _all field); _search?q=field:xxx searches one specific field

PUT /index3/type3/1
{
  "date":"2019-01-02",
  "name":"the little",
  "content":"Half the ideas in his talk were plagiarized from an article I wrote last month."
}

PUT /index3/type3/2
{
  "date":"2019-01-01",
  "name":"a dog",
  "content":"is the girl, women's attention and love day. July 7th Qiqiao customs, originated in the Han "
}

PUT /index3/type3/3
{
  "date":"2019-07-01",
  "name":"very tag",
  "content":"Some of our comrades love to write long articles with no substance, very much like the foot bindings of a slattern, long as well as smelly"
}

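//When a specific field is queried and its type is date or numeric (long, etc.), the value is matched as an exact value
GET /index3/type3/_search?q=date:2019-01 //returns only 1 document: the date field is matched by exact value
GET /index3/type3/_search?q=2019-01 //returns all 3: the query goes to the full-text _all field, where values are split into tokens, and any document containing one of the query tokens matches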
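
Why q=2019-01 finds all three documents can be reproduced with a toy model in Python. The sketch assumes the analyzer splits a date string at the hyphens, as described above; tokenize is a crude stand-in for a real analyzer, and full_text_hits is a hypothetical helper, not an Elasticsearch API.

```python
import re


def tokenize(text):
    """Crude stand-in for an analyzer: lowercase, split on non-alphanumerics."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]


# the date values of the three sample documents
dates = {1: "2019-01-02", 2: "2019-01-01", 3: "2019-07-01"}


def full_text_hits(query):
    """Ids of documents that share at least one token with the query."""
    wanted = set(tokenize(query))
    return sorted(doc_id for doc_id, value in dates.items()
                  if wanted & set(tokenize(value)))


print(tokenize("2019-01"))        # the query becomes two tokens
print(full_text_hits("2019-01"))  # every document contains the token 2019
print(full_text_hits("07"))       # only document 3 contains 07
```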

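9 Use a mapping to set each field's type and whether it is analyzed. A mapping can only be defined for a new field or for an index that has not been created yet; the mapping of an existing field cannot be changed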

DELETE /index3

//Create the index and define the field mappings
PUT /index3
{ 
  "mappings": {
    "type3": {
      "properties": {
        "date":{
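          "type": "date"//date fields are the exception to strict exact-value matching: given only part of a date, es resolves it to a concrete date and matches that, e.g. GET /index3/type3/_search?q=date:2019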
        },
        "name":{
          "type": "keyword"
        },
        "no":{
           "type": "long"
        },
        "content":{
           "analyzer": "standard",
           "type": "text"
        }
      }
    }
  }
}

//Index the documents
PUT /index3/type3/1
{
  "date":"2019-01-02",
  "name":"the little",
  "content":"Half the ideas in his talk were plagiarized from an article I wrote last month.",
  "no":"123"
}

PUT /index3/type3/2
{
  "date":"2019-01-01",
  "name":"a dog",
  "content":"is the girl, women's attention and love day. July 7th Qiqiao customs, originated in the Han ",
  "no":"6867858"
}

PUT /index3/type3/3
{
  "date":"2019-07-01",
  "name":"very tag",
  "content":"Some of our comrades love to write long articles with no substance, very much like the foot bindings of a slattern, long as well as smelly",
  "no":"123"
}

GET /index3/type3/_search?q=name:"very tag"  (name is mapped as keyword, so it matches only on the exact, whole value; the quotes keep both words together as one term)

10 Exact matching and analyzed full-text search with _search

GET /index2/type2/_search
{
  "query": {
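    //bool combines its clauses: every must clause is required; with a must present, should clauses are optional and only raise the score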
    "bool": {
      //must: the match on name is required; if name is mapped as keyword, the whole value has to be equal
      "must": [
        {
          "match": {
            "name": "ui the mark"
          }
        }
      ]
      , 
      //should: the match on content is an analyzed full-text query
      "should": [
        {
          "match": {
            "content": "bought"
          }
        }
      ]
    }
  }
}

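11 Use scroll instead of from/size paging for deep result sets. With deep from/size paging, every shard sends a large number of hits to the coordinating node, which has to sort all of them before it can pick out the requested page, so performance is poor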

//scroll=100ms opens a scroll search; the search context is kept alive for 100ms between requests
GET /index3/type3/_search?scroll=100ms
{
  "query": {
    //match every document
    "match_all": {}
  },
  "sort": [
    {
      //sort by date, ascending
      "date": {
        "order": "asc"
      }
    }
  ],
  //return 3 hits per scroll batch
  "size": 3
}
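
The deep-paging cost mentioned in item 11 can be put into numbers. The sketch below is plain arithmetic in Python, not Elasticsearch code; the function name and the shard count of 5 are assumptions chosen for the example.

```python
def hits_merged_by_coordinator(num_shards, from_, size):
    """Number of hits the coordinating node has to merge for one from/size page."""
    # every shard must return its own top (from + size) hits,
    # because any of them could belong to the requested page
    return num_shards * (from_ + size)


print(hits_merged_by_coordinator(5, 0, 10))      # first page
print(hits_merged_by_coordinator(5, 10000, 10))  # page starting at hit 10000
```

A scroll avoids this growth because each batch continues from where the previous one stopped instead of re-sorting everything up to the requested offset.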

12 Range filtering on a field with a DSL filter (a filter takes no part in TF/IDF scoring; it only includes or excludes documents)

PUT /index2/type2/1
{
  "num":10,
  "name":"ui the mark",
  "content":"Mr. Johnson had never been up in an aerophane before and he had read a lot about air accidents, so one day when a"
}

PUT /index2/type2/2
{
  "num":100,
  "title":"他的名字",
  "name":"my tag",
  "content":"He bought a gallon of gas. He put the gas into a gas can. He waited until "
}

PUT /index2/type2/3
{
  "num":1000,
  "title":"这是谁的名字",
  "name":"very lit",
  "content":"happening in the world.But radio isn't lost. It is still with us. That's because a radio is very small,and it's easy to carry. You can put one in your pocket and "
}

GET /index2/type2/_search
{
  "query": {
    //bool combines its clauses: should clauses contribute to the score, the filter only includes or excludes documents
    "bool": {
      //should: the match on content is an analyzed full-text query
      "should": [
        {
          "match": {
            "content": "bought"
          }
        }
      ]
      ,
      "filter": {
        "range": {
          "num": {
            "gte": 10,
            "lte": 1010
          }
        }
      }
    }
  }
}

13 Use the mapping's dynamic setting to restrict which JSON fields a document in the index may contain

PUT /index3
{ 
  "mappings": {
    "type3": {
      "dynamic":"true",
      "properties": {
        "date":{
          "type": "date"
        },
        "name":{
          "type": "keyword"
        },
        "no":{
           "type": "long"
        },
        "content":{
           "type": "keyword"
        },
        "address":{
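          //dynamic defaults to true; once it is set to strict, a document containing a field that is not declared here is rejected. dynamic can also be set on an inner object, as in this case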
          "dynamic":"strict",
          "properties": {
            "city":{
              "type":"keyword"
            },
            "description":{
              "type":"text"
            }
          }
        }
      }
    }
  }
}
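
To make the effect of dynamic: strict concrete, the following Python sketch lists the fields of an object that the address mapping above does not declare. It is a toy model of the check, not Elasticsearch code; unmapped_fields and the two sample objects are invented for the example.

```python
# properties declared for the address object in the mapping above
ADDRESS_PROPERTIES = {"city", "description"}


def unmapped_fields(obj, declared=ADDRESS_PROPERTIES):
    """Fields present in obj but missing from the declared properties."""
    return sorted(set(obj) - set(declared))


ok = {"city": "Beijing", "description": "capital"}
bad = {"city": "Beijing", "zip": "100000"}

print(unmapped_fields(ok))   # nothing undeclared, the document would be accepted
print(unmapped_fields(bad))  # zip is undeclared, strict would reject the document
```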

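14 Add an alias to an index. If the index has to be rebuilt, repoint the alias by removing it from the old index and adding it to the new one; applications that use the alias switch over seamlessly without any code change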

POST /_aliases
{
  "actions": [
    {
      "remove": {
        "index": "index4",
        "alias": "index3_alias"
      }
    }
  ]
}

POST /_aliases
{
  "actions": [
    {
      "add": {
        "index": "index3",
        "alias": "index3_alias"
      }
    }
  ]
}

PUT /index4/type3/1
{
  "a":"a"
}

GET /index3_alias/type3/1
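
The remove and add steps above can also travel in a single _aliases request, because the actions array accepts several entries and they are applied together. The Python sketch below only builds such a request body; swap_alias_body is a hypothetical helper, and actually sending the body to POST /_aliases is left out.

```python
import json


def swap_alias_body(alias, old_index, new_index):
    """Build one _aliases body that moves an alias from old_index to new_index."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


body = swap_alias_body("index3_alias", "index4", "index3")
print(json.dumps(body, indent=2))
```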

15 Once a field's type is fixed in the index mapping, values of another type can no longer be indexed into it (for an index created as in item 13)

//This request fails: the date field is mapped as a date, and "asdsad" cannot be parsed as one
PUT /index3/type3/7
{
  "date":"asdsad",
  "name":"http litty",
  "content":"The happiest of people don’t necessarily have the best of everything;they just make the most of everything that comes along their",
  "no":"9786"
}

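16 Adjust how soon an indexed document becomes searchable, i.e. the refresh interval of the in-memory buffer (by default the buffer is refreshed once per second into a new segment in the filesystem cache, which is what makes the documents searchable; the translog is flushed to disk and cleared when it reaches its size threshold, otherwise every 30 minutes according to these notes)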

PUT /index5
{
  "settings": {
    //refresh the buffer every 30s: a new document can take up to 30s to become searchable
    "refresh_interval": "30s"
  }
}

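17 When the influence of term frequency (TF) on the ranking does not matter, wrap the query or filter in constant_score. A term query does not analyze its input: the text is compared as a whole against the indexed terms. (A filter takes no part in TF/IDF scoring and only filters; constant_score can replace a bool query that contains nothing but a filter. Because no relevance is computed, it is considerably more efficient.)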

GET index2/type2/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "name": "my tag"
        }
      }
    }
  }
}

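18 If name is mapped as type keyword, a single word of it can only be found through "_all":"xxx": a keyword field supports exact value matching on the field itself, while its content still takes part in full-text search through _all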

GET index2/type2/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match":{
            //"name":"very" would return nothing
            "_all":"very"//only the full-text _all field produces a match
          }
        }
      ]
    }
  }
}
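
The difference between the keyword field and _all in item 18 can be modelled in a few lines of Python. The sketch assumes that a keyword field indexes the whole value as a single term while _all holds the analyzed tokens; analyze is a crude stand-in for a real analyzer and both helper names are hypothetical.

```python
import re


def analyze(text):
    """Crude stand-in for an analyzer: lowercase, split on non-alphanumerics."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]


def keyword_terms(value):
    """A keyword field indexes the whole value as one term."""
    return {value}


def all_terms(value):
    """The _all field indexes the analyzed tokens of the value."""
    return set(analyze(value))


name = "very lit"
print("very" in keyword_terms(name))      # a single word does not match the keyword field
print("very lit" in keyword_terms(name))  # the exact value does
print("very" in all_terms(name))          # the single word matches through _all
```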

19 A filter can contain bool queries nested to any depth

PUT /index2/type2/1
{
  "num": 1,
  "title":"你的名字",
  "name":"ui the mark",
  "content":"Mr. Johnson had never been up in an aerophane before and he had read a lot about air accidents, so one day when a"
}

PUT /index2/type2/2
{
  "num": 10,
  "title":"他的名字",
  "name":"my tag",
  "content":"He bought a gallon of gas. He put the gas into a gas can. He waited until "
}

PUT /index2/type2/3
{
  "num": 105,
  "title":"这是谁的名字",
  "name":"very lit",
  "content":"happening in the world.But radio isn't lost. It is still with us. That's because a radio is very small,and it's easy to carry. You can put one in your pocket and "
}


POST /index2/_mapping/type2
{
  "properties": {
    "name":{
      "type": "keyword"
    },
    "content":{
      "type": "text",
      "analyzer": "english"
    }
  }
}

GET /index2/type2/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "must":[
            {
              "term":{
                "name":"very lit"
              }
            },
            {
              "bool":{
                "should":[
                  {
                    "match":{
                      "title": "我"
                    }
                  }
                ]
              }
            }
          ]
        }
      }
    }
  }
}

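20 In a bool query that has should clauses but no must, at least one should clause has to match; if a must is present, the should clauses do not need to match at all ("minimum_should_match": 3 requires at least 3 of the should clauses to match; older releases also accept the alias minimum_number_should_match)

GET /index2/type2/_search
{
  "query": {
    "bool": {
      "minimum_should_match": 3,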
      "should": [
        {
          "match": {
            "title": "尼玛"
          }
        },
        {
          "match": {
            "name": "very lit"
          }
        },
        {
          "match": {
            "content":"happening"
          }
        }
      ]
    }
  }
}
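
The rule in item 20 can be written down as a small decision function. This Python sketch models only must and should (filter and must_not are left out), and bool_matches is a hypothetical name; each list holds one boolean per clause saying whether that clause matched the document.

```python
def bool_matches(must, should, minimum_should_match=None):
    """Decide whether a document matches a bool query with must/should clauses."""
    if minimum_should_match is None:
        # default: one should clause is required only when there is no must clause
        minimum_should_match = 0 if (must or not should) else 1
    return all(must) and sum(should) >= minimum_should_match


print(bool_matches([], [False, True]))           # should only: one match is enough
print(bool_matches([], [False, False]))          # should only: no match, no hit
print(bool_matches([True], [False, False]))      # must present: should is optional
print(bool_matches([], [True, True, False], 3))  # explicit minimum of 3 not reached
```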

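21 When a single field is searched, minimum_should_match controls how many of the query terms must match and thereby how strict the match is: for the 4 terms "你 名 字 d", 75% means at least 3 of them must match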

GET /index2/type2/_search
{
  "query": {
    "match": {
      "title": {
        "query": "你 名 字 d",
        "minimum_should_match": "75%"
      }
    }
  }
}
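
The percentage in item 21 translates into a term count as follows. The sketch covers only positive percentages and relies on the documented behaviour that the computed number is rounded down; required_terms is a hypothetical helper name.

```python
def required_terms(num_terms, percent):
    """Minimum number of query terms that must match for a positive percentage."""
    # the number computed from the percentage is rounded down
    return (num_terms * percent) // 100


print(required_terms(4, 75))  # the 4-term query from item 21
print(required_terms(3, 75))  # 2.25 is rounded down
```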

22 To raise the score contribution of one search term in the results, set boost on its clause

GET /index2/type2/_search
{
  "query": {
    "bool": {
      "minimum_should_match": 2,
      "should": [
        {
          "match": {
            "title": {
              "query": "谁",
              "boost":5
            }
          }
        },
        {
          "match": {
            "name": "very lit"
          }
        },
        {
          "match": {
            "content":"happening"
          }
        }
      ]
    }
  }
}

23 multi_match: one query against several fields, with several matching modes

GET /index2/type2/_search
{
  "query": {
    "multi_match": {
      "query": "happening like",
      //the query terms are matched against both name and content; because the two fields are mapped differently their scores differ, which can change the ranking
      "fields": ["name","content"],
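      //best_fields: a document's score is the score of its best-matching field, so the best match ranks first but the order of the remaining documents may be less accurate; most_fields: combines the scores of every matching field into the document's overall score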
      "type":"best_fields"
    }
  }
}

Result:
{
  "took": 71,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 0.5063205,
    "hits": [
      {
        "_index": "index2",
        "_type": "type2",
        "_id": "3",
        "_score": 0.5063205,
        "_source": {
          "num": 105,
          "title": "这是谁的名字",
          "name": "happening like write",
          "content": ""
        }
      },
      {
        "_index": "index2",
        "_type": "type2",
        "_id": "2",
        "_score": 0.41043553,
        "_source": {
          "num": 10,
          "title": "他的名字",
          "name": "yes happening like write",
          "content": "happening i like"
        }
      },
      {
        "_index": "index2",
        "_type": "type2",
        "_id": "4",
        "_score": 0.34450945,
        "_source": {
          "num": 1000,
          "title": "我的名字",
          "name": "happening like write",
          "content": "happening like yeas and he had read a lot about"
        }
      }
    ]
  }
}
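
How the two multi_match types in item 23 combine per-field scores can be sketched in Python. The field scores below are invented numbers, not taken from the result above. best_fields is modelled on the dis_max rule (best score plus tie_breaker times the others), while most_fields is simplified to a plain sum, which leaves out any normalization Elasticsearch may apply.

```python
def best_fields_score(field_scores, tie_breaker=0.0):
    """Score of the best field, plus tie_breaker times the other field scores."""
    best = max(field_scores)
    others = sum(field_scores) - best
    return best + tie_breaker * others


def most_fields_score(field_scores):
    """Simplified most_fields: add up the scores of all matching fields."""
    return sum(field_scores)


scores = [0.5, 0.25]  # hypothetical scores for the name and content fields
print(best_fields_score(scores))
print(best_fields_score(scores, tie_breaker=0.5))
print(most_fields_score(scores))
```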

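24 match_phrase matches the query as a complete phrase: the query text is still analyzed, but all of its terms must appear in the field in the same order and next to each other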

GET /index2/type2/_search
{
  "query": {
    "bool": {
      "minimum_should_match": 1,
      "should": [
        {
          //match_phrase: content must contain the phrase "treasure because" to match
          "match_phrase": {
            "content": "treasure because"
          }
        }
      ]
    }
  }
}

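25 match_phrase with slop: the terms of the phrase may be moved within the field value, and slop sets how many position moves are allowed in total; if the moved terms line up into the query phrase, the document still counts as a match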

GET /index2/type2/_search
{
  "query": {
    "bool": {
      "minimum_should_match": 1,
      "should": [
        {
          //match_phrase: content must contain the phrase "treasure because", allowing for the slop set below
          "match_phrase": {
            "content": {
              "query": "treasure because",
              "slop":2//matches if at most 2 position moves bring treasure and because together as the phrase "treasure because"
            }
          }
        }
      ]
    }
  }
}

Result:
{
  "took": 15,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.53484553,
    "hits": [
      {
        "_index": "index2",
        "_type": "type2",
        "_id": "3",
        "_score": 0.53484553,
        "_source": {
          "num": 105,
          "title": "这是谁的名字",
          "name": "happening like write",
          "content": " national  treasure because  of its rare number and cute appearance. Many foreign people are so crazy about  pandas and they can’t watching these  lovely creatures all the time. Though some action"
        }
      },
      {
        "_index": "index2",
        "_type": "type2",
        "_id": "4",
        "_score": 0.45520112,
        "_source": {
          "num": 1000,
          "title": "我的名字",
          "name": "happening like write",
          "content": "happening treasure hello like because yeas and he happening like had read a lot about happening hello like"
        }
      }
    ]
  }
}
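
Why both documents in the result above match with slop 2 can be checked with a simplified model in Python. The sketch handles only a two-term phrase whose terms appear in query order, tokenizes on whitespace, and counts the tokens standing between the two terms as the moves needed; phrase_matches is a hypothetical helper, not the Lucene algorithm.

```python
def phrase_matches(text, first, second, slop=0):
    """True if `second` follows `first` with at most `slop` tokens in between."""
    tokens = text.lower().split()
    firsts = [i for i, t in enumerate(tokens) if t == first]
    seconds = [i for i, t in enumerate(tokens) if t == second]
    # moves needed = number of tokens standing between the two terms
    return any(0 <= j - i - 1 <= slop for i in firsts for j in seconds)


doc3 = " national  treasure because  of its rare number and cute appearance."
doc4 = "happening treasure hello like because yeas and he happening like had read"

print(phrase_matches(doc3, "treasure", "because", slop=0))  # adjacent terms
print(phrase_matches(doc4, "treasure", "because", slop=1))  # two tokens in between
print(phrase_matches(doc4, "treasure", "because", slop=2))  # slop 2 closes the gap
```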

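26 Use rescoring to improve precision while keeping the search fast. As a rough rule of thumb from these notes, match is about 10 times faster than match_phrase, and match_phrase is more than 20 times faster than match_phrase with slop. Rescoring takes only the top 300 hits of the cheap query (users rarely page past 10 pages) and scores and sorts them again with match_phrase and slop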

GET index3/type3/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "content":"hello book"//first match on the analyzed terms hello and book
          }
        }
      ]
    }
  },
  "rescore":{
    "window_size":300,//take the top 300 hits of the query (on each shard) and re-score them
    "query":{
      "rescore_query":{
        "match_phrase":{
          "content":{
            "query":"hello book",
            "slop":88//the new ranking depends on the positions of hello and book: the closer together they are, the higher the score
          }
        }
      }
    }
  }
}
Result:
{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 6,
    "max_score": 1.0520453,
    "hits": [
      {
        "_index": "index3",
        "_type": "type3",
        "_id": "1",
        "_score": 1.0520453,
        "_source": {
          "date": "2019-01-02",
          "name": "the little",
          "content": "Half the hello book ideas in his talk were plagiarized from an article I wrote last month.",
          "no": "123"
        }
      },
      {
        "_index": "index3",
        "_type": "type3",
        "_id": "4",
        "_score": 1.0472052,
        "_source": {
          "date": "2019-03-01",
          "name": "http litty",
          "content": "http://localhost:5601/app/kibana#/dev_tools/console?_g=() hello the book you ",
          "no": "123"
        }
      },
      {
        "_index": "index3",
        "_type": "type3",
        "_id": "3",
        "_score": 0.8442862,
        "_source": {
          "date": "2019-07-01",
          "name": "very tag",
          "content": "Some of our hello  comrades love book to write long articles with no substance, very much like the foot bindings of a slattern, long as well as smelly",
          "no": "123"
        }
      },
      {
        "_index": "index3",
        "_type": "type3",
        "_id": "5",
        "_score": 0.6407875,
        "_source": {
          "date": "2019-05-01",
          "name": "http litty",
          "content": "There are hello moments in life when you miss book someone so much that you just want to pick them from your dreams",
          "no": "564",
          "description": "描述"
        }
      },
      {
        "_index": "index3",
        "_type": "type3",
        "_id": "6",
        "_score": 0.52347976,
        "_source": {
          "date": "2019-06-01",
          "name": "http litty",
          "content": "The happiest of hello people don’t necessarily have the best of everything;they just make the you most of everything that comes along their book",
          "no": "9786"
        }
      },
      {
        "_index": "index3",
        "_type": "type3",
        "_id": "2",
        "_score": 0.1252801,
        "_source": {
          "date": "2019-02-01",
          "name": "a dog",
          "content": "is the girl,hello women's attention and love day. July 7th Qiqiao customs, originated in the Han ",
          "no": "6867858"
        }
      }
    ]
  }
}
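
The rescoring step of item 26 can be modelled on a single list of hits. The Python sketch assumes the default combination, where the original score and the rescore score are added with both weights set to 1, and it ignores that window_size is applied per shard; the function name, the sample hits and the phrase scores are invented for the illustration.

```python
def rescore(hits, rescore_fn, window_size, query_weight=1.0, rescore_query_weight=1.0):
    """Re-score the top window_size hits; hits is a list of (doc_id, score), best first."""
    window, rest = hits[:window_size], hits[window_size:]
    rescored = [
        (doc_id, query_weight * score + rescore_query_weight * rescore_fn(doc_id))
        for doc_id, score in window
    ]
    # only the window is re-sorted; hits outside it keep their place behind it
    rescored.sort(key=lambda hit: hit[1], reverse=True)
    return rescored + rest


hits = [("a", 2.0), ("b", 1.5), ("c", 1.0)]       # result of the cheap match query
phrase_scores = {"a": 0.0, "b": 1.0, "c": 5.0}    # hypothetical match_phrase scores

print(rescore(hits, phrase_scores.get, window_size=2))
```

Document c keeps its original score and position even though its phrase score is the highest, because it lies outside the window.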

27 Phrase search plus prefix matching for autocomplete

GET index3/type3/_search
{
  "query": {
    "match_phrase_prefix": {//phrase search in which the last term is treated as a prefix (usable for autocomplete)
      "title": {
        "query": "the yellow",
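        "slop":10,//the terms may be moved up to 10 positions in total and still match
        "max_expansions": 50//expand the prefix to at most 50 terms (this limit matters: without it the prefix has to be checked against every term in the index, which can make performance collapse)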
      }
      
    }
  }
}
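
What max_expansions limits in item 27 is the number of indexed terms the trailing prefix is expanded to. The Python sketch below models that on a small sorted term list; expand_prefix and the sample terms are assumptions for the illustration, not how Lucene stores its term dictionary.

```python
from itertools import islice


def expand_prefix(sorted_terms, prefix, max_expansions=50):
    """Return at most max_expansions indexed terms that start with prefix."""
    candidates = (term for term in sorted_terms if term.startswith(prefix))
    # islice stops pulling candidates as soon as the limit is reached
    return list(islice(candidates, max_expansions))


terms = ["year", "yell", "yellow", "yelp", "yes"]
print(expand_prefix(terms, "yel", max_expansions=2))
print(expand_prefix(terms, "ye"))
```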

28 In a combined bool query: 1. filter by conditions first, 2. then run the analyzed full-text search on the field

GET /index3/type3/_search
{
  "query": {
    "bool": {
      "filter": [//the name and date conditions must both hold (intersection)
        {
          "term":{
            "name":"http litty"
          }
        },
        {
          "terms":{
            "date":["2019-06-01","2019-03-01"]//either date may match (union)
          }
        }
      ],
      "must": [
        {
          "match":{//full-text search on the content field
            "content":"The happiest of"
          }
        }
      ]
    }
  }
}