chen's personal blog

ElasticSearch: Lifting the Default 10,000 Limit on Query Hit Totals

Default query
GET seller_invoice_item_wide_data_test/_search
{
  "size": 0
}
Response
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 10000,
      "relation" : "gte"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}

Lifting the default 10,000 limit
GET seller_invoice_item_wide_data_test/_search
{
  "size": 0,
  "track_total_hits": true
}
Response
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 22072,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}
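
The two requests above differ only in the track_total_hits field. Besides true, the parameter also accepts an integer threshold or false. As a minimal sketch, the request body could be built in Python like this; build_count_body is a hypothetical helper name introduced here for illustration, not part of any Elasticsearch client, and the body still has to be sent to the _search endpoint by whatever HTTP client you use.

```python
import json
from typing import Union


def build_count_body(track_total_hits: Union[bool, int, None] = None) -> str:
    """Return a JSON search body that requests no documents, only the total.

    Hypothetical helper for illustration only.
    """
    body = {"size": 0}
    if track_total_hits is not None:
        # True  -> always count every matching document exactly
        # int   -> count accurately up to that many hits
        # False -> do not track the total at all
        body["track_total_hits"] = track_total_hits
    return json.dumps(body)


print(build_count_body())       # the default query from this post
print(build_count_body(True))   # the exact-count query from this post
print(build_count_body(50000))  # accurate up to 50000 hits
```

Leaving the field out keeps the default behaviour, which is the first query in this post.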

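As the two responses show, the value in hits.total differs: 10000 with relation "gte" by default, and 22072 with relation "eq" once track_total_hits is set to true. Here is the official explanation: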

Generally the total hit count can’t be computed accurately without visiting all matches, which is costly for queries that match lots of documents. The track_total_hits parameter allows you to control how the total number of hits should be tracked. Given that it is often enough to have a lower bound of the number of hits, such as "there are at least 10000 hits", the default is set to 10,000. This means that requests will count the total hit accurately up to 10,000 hits. It is a good trade off to speed up searches if you don’t need the accurate number of hits after a certain threshold.

When set to true the search response will always track the number of hits that match the query accurately (e.g. total.relation will always be equal to "eq" when track_total_hits is set to true). Otherwise the "total.relation" returned in the "total" object in the search response determines how the "total.value" should be interpreted. A value of "gte" means that the "total.value" is a lower bound of the total hits that match the query and a value of "eq" indicates that "total.value" is the accurate count.

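In other words

Counting every match exactly means visiting every matching document, which is expensive for queries that match many documents. The track_total_hits parameter controls how the total is tracked. A lower bound such as "at least 10,000 hits" is often enough, so by default the total is counted accurately only up to 10,000 hits. If you do not need the exact number of hits beyond some threshold, this is a good trade-off that speeds up searches.

When track_total_hits is set to true, the response always tracks the exact number of hits matching the query, and total.relation is always "eq". Otherwise, total.relation tells you how to read total.value: "gte" means total.value is a lower bound on the number of matching hits, and "eq" means total.value is the exact count.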
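
The "eq" versus "gte" rule can be applied mechanically when reading a response. The following is a small sketch, assuming the response has already been parsed into a Python dict; describe_total is a hypothetical name, and the two sample dicts copy only the hits.total part of the responses shown earlier.

```python
def describe_total(response: dict) -> str:
    """Turn hits.total into a human-readable statement."""
    total = response["hits"]["total"]
    value, relation = total["value"], total["relation"]
    if relation == "eq":
        # total.value is the accurate count
        return f"exactly {value} hits"
    if relation == "gte":
        # total.value is only a lower bound
        return f"at least {value} hits"
    raise ValueError(f"unexpected relation: {relation!r}")


default_response = {"hits": {"total": {"value": 10000, "relation": "gte"}}}
exact_response = {"hits": {"total": {"value": 22072, "relation": "eq"}}}

print(describe_total(default_response))  # at least 10000 hits
print(describe_total(exact_response))    # exactly 22072 hits
```

Checking relation before displaying the number avoids presenting a capped 10000 as if it were the true size of the result set.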


Title: ElasticSearch: Lifting the Default 10,000 Limit on Query Hit Totals
Author: zzzzchen
URL: https://www.dczzs.com/articles/2023/01/03/1672729071116.html