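5.2.10. Geo-distance aggregation
A multi-bucket aggregation that works on geo_point fields and is conceptually very similar to the range aggregation. The user defines an origin point and a set of distance range buckets. The aggregation computes the distance of each document value from the origin and determines which bucket the document belongs to based on the ranges (a document belongs to a bucket if the distance between the document and the origin falls within that bucket's distance range).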
PUT /museums
{
  "mappings": {
    "properties": {
      "location": {
        "type": "geo_point"
      }
    }
  }
}
POST /museums/_bulk?refresh
{"index":{"_id":1}}
{"location": "52.374081,4.912350", "name": "NEMO Science Museum"}
{"index":{"_id":2}}
{"location": "52.369219,4.901618", "name": "Museum Het Rembrandthuis"}
{"index":{"_id":3}}
{"location": "52.371667,4.914722", "name": "Nederlands Scheepvaartmuseum"}
{"index":{"_id":4}}
{"location": "51.222900,4.405200", "name": "Letterenhuis"}
{"index":{"_id":5}}
{"location": "48.861111,2.336389", "name": "Musée du Louvre"}
{"index":{"_id":6}}
{"location": "48.860000,2.327000", "name": "Musée d'Orsay"}
POST /museums/_search?size=0
{
  "aggs" : {
    "rings_around_amsterdam" : {
      "geo_distance" : {
        "field" : "location",
        "origin" : "52.3760, 4.894",
        "ranges" : [
          { "to" : 100000 },
          { "from" : 100000, "to" : 300000 },
          { "from" : 300000 }
        ]
      }
    }
  }
}
Response:
{
  ...
  "aggregations": {
    "rings_around_amsterdam" : {
      "buckets": [
        {
          "key": "*-100000.0",
          "from": 0.0,
          "to": 100000.0,
          "doc_count": 3
        },
        {
          "key": "100000.0-300000.0",
          "from": 100000.0,
          "to": 300000.0,
          "doc_count": 1
        },
        {
          "key": "300000.0-*",
          "from": 300000.0,
          "doc_count": 2
        }
      ]
    }
  }
}
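The bucketing logic can be approximated outside Elasticsearch to check the counts above. The following is a minimal sketch, not Elasticsearch's implementation: it assumes a haversine great-circle distance with a mean Earth radius of 6371008.8 m, and it assumes that each range includes its from value and excludes its to value, as in the range aggregation. The names haversine_m and bucket_counts are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371008.8  # assumed mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def bucket_counts(origin, points, ranges):
    """Count points per range; assumes `from` is inclusive and `to` is exclusive."""
    counts = [0] * len(ranges)
    for lat, lon in points:
        d = haversine_m(origin[0], origin[1], lat, lon)
        for i, r in enumerate(ranges):
            if r.get("from", 0.0) <= d < r.get("to", math.inf):
                counts[i] += 1
    return counts


# Museum locations from the bulk request above, as (lat, lon)
MUSEUMS = [
    (52.374081, 4.912350),  # NEMO Science Museum
    (52.369219, 4.901618),  # Museum Het Rembrandthuis
    (52.371667, 4.914722),  # Nederlands Scheepvaartmuseum
    (51.222900, 4.405200),  # Letterenhuis
    (48.861111, 2.336389),  # Musée du Louvre
    (48.860000, 2.327000),  # Musée d'Orsay
]
RANGES = [{"to": 100000}, {"from": 100000, "to": 300000}, {"from": 300000}]

counts = bucket_counts((52.3760, 4.894), MUSEUMS, RANGES)
print(counts)
```

Under these assumptions the sketch puts the three Amsterdam museums in the first ring, Letterenhuis in the second, and the two Paris museums in the third, which agrees with the doc_count values in the response.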
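The specified field must be of type geo_point (which can only be set explicitly in the mappings). A field can also hold an array of geo_point values, in which case all of them are taken into account during aggregation. The origin point accepts all formats supported by the geo_point type: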
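- Object format: { "lat" : 52.3760, "lon" : 4.894 } - this is the safest format, as it is the most explicit about the lat and lon values
- String format: "52.3760, 4.894" - where the first number is the lat and the second is the lon
- Array format: [4.894, 52.3760] - which is based on the GeoJson standard, where the first number is the lon and the second is the lat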
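Because the string and array formats use opposite coordinate orders, it can help to normalize an origin before using it in client code. The sketch below is illustrative only: parse_origin is a hypothetical helper, not part of any Elasticsearch client, and it covers just the three formats listed above.

```python
def parse_origin(origin):
    """Normalize an origin to a (lat, lon) tuple of floats."""
    if isinstance(origin, dict):
        # Object format: explicit keys, so there is no ordering ambiguity
        return float(origin["lat"]), float(origin["lon"])
    if isinstance(origin, str):
        # String format: "lat, lon"
        lat, lon = (float(part) for part in origin.split(","))
        return lat, lon
    if isinstance(origin, (list, tuple)):
        # Array format (GeoJSON order): [lon, lat]
        lon, lat = origin
        return float(lat), float(lon)
    raise TypeError(f"unsupported origin format: {origin!r}")


as_object = parse_origin({"lat": 52.3760, "lon": 4.894})
as_string = parse_origin("52.3760, 4.894")
as_array = parse_origin([4.894, 52.3760])
print(as_object, as_string, as_array)
```

All three calls describe the same point, so each should yield the same (lat, lon) pair.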
By default, distances are expressed in m (meters). Other units are also accepted: mi (miles), in (inches), yd (yards), km (kilometers), cm (centimeters), and mm (millimeters).
POST /museums/_search?size=0
{
  "aggs" : {
    "rings" : {
      "geo_distance" : {
        "field" : "location",
        "origin" : "52.3760, 4.894",
        "unit" : "km", ①
        "ranges" : [
          { "to" : 100 },
          { "from" : 100, "to" : 300 },
          { "from" : 300 }
        ]
      }
    }
  }
}
① The distances will be computed in kilometers
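To see how the unit parameter relates this request to the earlier one expressed in meters, the ranges can be converted by hand. This is a rough sketch: ranges_to_meters is a hypothetical helper, and the conversion factors are the standard international definitions, which are assumed here to match what Elasticsearch uses.

```python
# Meters per unit, using the standard international definitions (assumption)
METERS_PER_UNIT = {
    "m": 1.0,
    "mi": 1609.344,
    "in": 0.0254,
    "yd": 0.9144,
    "km": 1000.0,
    "cm": 0.01,
    "mm": 0.001,
}


def ranges_to_meters(ranges, unit="m"):
    """Return a copy of `ranges` with `from` and `to` converted to meters."""
    factor = METERS_PER_UNIT[unit]
    return [
        {k: (v * factor if k in ("from", "to") else v) for k, v in r.items()}
        for r in ranges
    ]


km_ranges = [{"to": 100}, {"from": 100, "to": 300}, {"from": 300}]
meter_ranges = ranges_to_meters(km_ranges, "km")
print(meter_ranges)
```

The converted ranges have the same boundaries as the 100000 and 300000 meter rings used in the first example.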
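There are two distance calculation modes: arc (the default) and plane. The arc calculation is the most accurate. The plane calculation is the fastest but the least accurate. Consider using plane when the search context is narrow and spans a small geographical area (~5km). For searches that span very large areas (for example, cross-continent searches), plane returns results with a larger margin of error. The distance calculation type can be set with the distance_type parameter: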
POST /museums/_search?size=0
{
  "aggs" : {
    "rings" : {
      "geo_distance" : {
        "field" : "location",
        "origin" : "52.3760, 4.894",
        "unit" : "km",
        "distance_type" : "plane",
        "ranges" : [
          { "to" : 100 },
          { "from" : 100, "to" : 300 },
          { "from" : 300 }
        ]
      }
    }
  }
}
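The accuracy trade-off between the two modes can be illustrated with a small comparison. The sketch below uses the haversine formula as a stand-in for arc and an equirectangular (flat-plane) approximation as a stand-in for plane; both are assumptions chosen for illustration and are not claimed to be the exact formulas Elasticsearch uses. The New York coordinates are an illustrative far-away point that does not appear in the example data.

```python
import math

EARTH_RADIUS_M = 6371008.8  # assumed mean Earth radius in meters


def arc_distance_m(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in meters (stand-in for `arc`)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def plane_distance_m(lat1, lon1, lat2, lon2):
    """Equirectangular approximation in meters (stand-in for `plane`)."""
    x = math.radians(lon2 - lon1) * math.cos(math.radians((lat1 + lat2) / 2))
    y = math.radians(lat2 - lat1)
    return EARTH_RADIUS_M * math.hypot(x, y)


def relative_error(p, q):
    """Relative difference of the planar distance from the arc distance."""
    arc = arc_distance_m(*p, *q)
    return abs(plane_distance_m(*p, *q) - arc) / arc


AMSTERDAM = (52.3760, 4.894)   # origin used in the examples
NEMO = (52.374081, 4.912350)   # nearby museum from the example data
NEW_YORK = (40.7128, -74.0060) # illustrative cross-continent point (assumption)

err_short = relative_error(AMSTERDAM, NEMO)
err_long = relative_error(AMSTERDAM, NEW_YORK)
print(f"short-range relative error: {err_short:.2e}")
print(f"long-range relative error:  {err_long:.2e}")
```

Under these assumptions the two formulas are practically indistinguishable over a kilometer or two, while the planar approximation drifts by a few percent over a transatlantic distance, which is consistent with the guidance to reserve plane for small areas.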
Keyed Response
Setting the keyed flag to true associates a unique string key with each bucket and returns the ranges as a hash rather than an array:
POST /museums/_search?size=0
{
  "aggs" : {
    "rings_around_amsterdam" : {
      "geo_distance" : {
        "field" : "location",
        "origin" : "52.3760, 4.894",
        "ranges" : [
          { "to" : 100000 },
          { "from" : 100000, "to" : 300000 },
          { "from" : 300000 }
        ],
        "keyed": true
      }
    }
  }
}
Response:
{
  ...
  "aggregations": {
    "rings_around_amsterdam" : {
      "buckets": {
        "*-100000.0": {
          "from": 0.0,
          "to": 100000.0,
          "doc_count": 3
        },
        "100000.0-300000.0": {
          "from": 100000.0,
          "to": 300000.0,
          "doc_count": 1
        },
        "300000.0-*": {
          "from": 300000.0,
          "doc_count": 2
        }
      }
    }
  }
}
It is also possible to customize the key for each range:
POST /museums/_search?size=0
{
  "aggs" : {
    "rings_around_amsterdam" : {
      "geo_distance" : {
        "field" : "location",
        "origin" : "52.3760, 4.894",
        "ranges" : [
          { "to" : 100000, "key": "first_ring" },
          { "from" : 100000, "to" : 300000, "key": "second_ring" },
          { "from" : 300000, "key": "third_ring" }
        ],
        "keyed": true
      }
    }
  }
}
Response:
{
  ...
  "aggregations": {
    "rings_around_amsterdam" : {
      "buckets": {
        "first_ring": {
          "from": 0.0,
          "to": 100000.0,
          "doc_count": 3
        },
        "second_ring": {
          "from": 100000.0,
          "to": 300000.0,
          "doc_count": 1
        },
        "third_ring": {
          "from": 300000.0,
          "doc_count": 2
        }
      }
    }
  }
}
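On the client side, the keyed layout changes how buckets are read: a keyed response is looked up by name, while the default layout is iterated as a list. The following sketch works on plain dictionaries that mirror the bucket sections shown above; doc_counts is a hypothetical helper, and no Elasticsearch client call is involved.

```python
def doc_counts(buckets):
    """Return {key: doc_count} for either the array or the keyed (hash) layout."""
    if isinstance(buckets, dict):
        # Keyed layout: the bucket key is the dictionary key
        return {key: bucket["doc_count"] for key, bucket in buckets.items()}
    # Default layout: each bucket carries its own "key" field
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


# "buckets" section of the keyed response with custom keys
keyed_buckets = {
    "first_ring": {"from": 0.0, "to": 100000.0, "doc_count": 3},
    "second_ring": {"from": 100000.0, "to": 300000.0, "doc_count": 1},
    "third_ring": {"from": 300000.0, "doc_count": 2},
}

# "buckets" section of the default (array) response
array_buckets = [
    {"key": "*-100000.0", "from": 0.0, "to": 100000.0, "doc_count": 3},
    {"key": "100000.0-300000.0", "from": 100000.0, "to": 300000.0, "doc_count": 1},
    {"key": "300000.0-*", "from": 300000.0, "doc_count": 2},
]

keyed_counts = doc_counts(keyed_buckets)
array_counts = doc_counts(array_buckets)
print(keyed_counts["second_ring"], array_counts["100000.0-300000.0"])
```

Custom keys such as first_ring make this kind of lookup independent of the numeric range boundaries, so client code does not break when a boundary is adjusted.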