Document special behaviour of ignore_malformed for geo_point mappings (#125692) (#126384)
With `geo_point` fields, there is the special case of values that have a syntactically valid format, but the numerical values for `latitude` and `longitude` are out of range.
If `ignore_malformed` is `false`, an exception will be thrown as usual. But if it is `true`, the document will be indexed correctly, by normalizing the latitude and longitude values into the valid range. The special `_ignored` field will not be set. The original source document will remain as before, but indexed values, doc-values and stored fields will all be normalized.
docs/reference/elasticsearch/mapping-reference/geo-point.md
71 additions & 16 deletions
@@ -9,14 +9,23 @@ mapped_pages:
9
9
10
10
Fields of type `geo_point` accept latitude-longitude pairs, which can be used:
11
11
12
-
* to find geopoints within a [bounding box](/reference/query-languages/query-dsl/query-dsl-geo-bounding-box-query.md), within a certain [distance](/reference/query-languages/query-dsl/query-dsl-geo-distance-query.md) of a central point, or within a [`geo_shape` query](/reference/query-languages/query-dsl/query-dsl-geo-shape-query.md) (for example, points in a polygon).
12
+
* to find geopoints within a [bounding box](/reference/query-languages/query-dsl/query-dsl-geo-bounding-box-query.md),
13
+
within a certain [distance](/reference/query-languages/query-dsl/query-dsl-geo-distance-query.md) of a central point,
14
+
or within a [`geo_shape` query](/reference/query-languages/query-dsl/query-dsl-geo-shape-query.md) (for example, points in a polygon).
13
15
* to aggregate documents by [distance](/reference/aggregations/search-aggregations-bucket-geodistance-aggregation.md) from a central point.
14
-
* to aggregate documents by geographic grids: either [`geo_hash`](/reference/aggregations/search-aggregations-bucket-geohashgrid-aggregation.md), [`geo_tile`](/reference/aggregations/search-aggregations-bucket-geotilegrid-aggregation.md) or [`geo_hex`](/reference/aggregations/search-aggregations-bucket-geohexgrid-aggregation.md).
15
-
* to aggregate geopoints into a track using the metrics aggregation [`geo_line`](/reference/aggregations/search-aggregations-metrics-geo-line.md).
16
+
* to aggregate documents by geographic grids: either
* to integrate distance into a document’s [relevance score](/reference/query-languages/query-dsl/query-dsl-function-score-query.md).
17
23
* to [sort](/reference/elasticsearch/rest-apis/sort-search-results.md#geo-sorting) documents by distance.
18
24
19
-
As with [geo_shape](/reference/elasticsearch/mapping-reference/geo-shape.md) and [point](/reference/elasticsearch/mapping-reference/point.md), `geo_point` can be specified in [GeoJSON](http://geojson.org) and [Well-Known Text](https://docs.opengeospatial.org/is/12-063r5/12-063r5.html) formats. However, there are a number of additional formats that are supported for convenience and historical reasons. In total there are six ways that a geopoint may be specified, as demonstrated below:
25
+
As with [geo_shape](/reference/elasticsearch/mapping-reference/geo-shape.md) and [point](/reference/elasticsearch/mapping-reference/point.md), `geo_point` can be specified in [GeoJSON](http://geojson.org)
26
+
and [Well-Known Text](https://docs.opengeospatial.org/is/12-063r5/12-063r5.html) formats.
27
+
However, there are a number of additional formats that are supported for convenience and historical reasons.
28
+
In total there are six ways that a geopoint may be specified, as demonstrated below:
20
29
21
30
```console
22
31
PUT my-index-000001
@@ -103,15 +112,28 @@ GET my-index-000001/_search
103
112
::::{admonition} Geopoints expressed as an array or string
104
113
:class: important
105
114
106
-
Please note that string geopoints are ordered as `lat,lon`, while array geopoints, GeoJSON and WKT are ordered as the reverse: `lon,lat`.
115
+
Please note that string geopoints are ordered as `lat,lon`, while array
116
+
geopoints, GeoJSON and WKT are ordered as the reverse: `lon,lat`.
107
117
108
-
The reasons for this are historical. Geographers traditionally write `latitude` before `longitude`, while recent formats specified for geographic data like [GeoJSON](https://geojson.org/) and [Well-Known Text](https://docs.opengeospatial.org/is/12-063r5/12-063r5.html) order `longitude` before `latitude` (easting before northing) in order to match the mathematical convention of ordering `x` before `y`.
118
+
The reasons for this are historical. Geographers traditionally write `latitude`
119
+
before `longitude`, while recent formats specified for geographic data like
120
+
[GeoJSON](https://geojson.org/) and [Well-Known Text](https://docs.opengeospatial.org/is/12-063r5/12-063r5.html)
121
+
order `longitude` before `latitude` (easting before northing) in order to match
122
+
the mathematical convention of ordering `x` before `y`.
109
123
110
124
::::
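The ordering difference described above can be sketched in a few lines of Python. This is only an illustration, not Elasticsearch code: `parse_geopoint` is a hypothetical helper that handles plain `"lat,lon"` strings and `[lon, lat]` arrays, and ignores geohash and WKT strings.

```python
def parse_geopoint(value):
    """Return (lat, lon) from a "lat,lon" string or a [lon, lat] array.

    Hypothetical helper for illustration only; geohash and WKT strings
    are not handled here.
    """
    if isinstance(value, str):
        # String geopoints are ordered lat,lon.
        lat, lon = (float(part) for part in value.split(","))
    elif isinstance(value, (list, tuple)):
        # Array geopoints are ordered lon,lat (x before y).
        lon, lat = (float(part) for part in value[:2])
    else:
        raise TypeError("unsupported geopoint representation")
    return lat, lon


# Both inputs describe the same point.
print(parse_geopoint("41.12,-71.34"))
print(parse_geopoint([-71.34, 41.12]))
```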
111
125
112
126
113
127
::::{note}
114
-
A point can be expressed as a [geohash](https://en.wikipedia.org/wiki/Geohash). Geohashes are [base32](https://en.wikipedia.org/wiki/Base32) encoded strings of the bits of the latitude and longitude interleaved. Each character in a geohash adds additional 5 bits to the precision. So the longer the hash, the more precise it is. For the indexing purposed geohashs are translated into latitude-longitude pairs. During this process only first 12 characters are used, so specifying more than 12 characters in a geohash doesn’t increase the precision. The 12 characters provide 60 bits, which should reduce a possible error to less than 2cm.
128
+
A point can be expressed as a [geohash](https://en.wikipedia.org/wiki/Geohash).
129
+
Geohashes are [base32](https://en.wikipedia.org/wiki/Base32) encoded strings of
130
+
the bits of the latitude and longitude interleaved. Each character in a geohash
131
+
adds 5 bits of precision, so the longer the hash, the more
132
+
precise it is. For indexing purposes, geohashes are translated into
133
+
latitude-longitude pairs. During this process only the first 12 characters are
134
+
used, so specifying more than 12 characters in a geohash doesn’t increase the
135
+
precision. The 12 characters provide 60 bits, which should reduce a possible
136
+
error to less than 2cm.
115
137
::::
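The "less than 2cm" figure in the note above can be checked with a rough calculation. The sketch below assumes the usual geohash layout (bits alternate between longitude and latitude, starting with longitude) and an approximate equatorial circumference of 40,075 km. The function name and constant are illustrative, not part of Elasticsearch.

```python
EARTH_CIRCUMFERENCE_M = 40_075_017.0  # approximate equatorial circumference


def geohash_max_error_m(length):
    """Return (lat_error_m, lon_error_m) for a geohash of the given length.

    Assumes only the first 12 characters are used and that bits alternate
    lon, lat, lon, ... The error is half a cell, measured at the equator.
    """
    bits = 5 * min(length, 12)
    lon_bits = (bits + 1) // 2
    lat_bits = bits // 2
    lon_err_deg = 360.0 / 2 ** lon_bits / 2
    lat_err_deg = 180.0 / 2 ** lat_bits / 2
    metres_per_degree = EARTH_CIRCUMFERENCE_M / 360.0
    return lat_err_deg * metres_per_degree, lon_err_deg * metres_per_degree


lat_err, lon_err = geohash_max_error_m(12)
print(f"lat error ~{lat_err * 100:.2f} cm, lon error ~{lon_err * 100:.2f} cm")
```

Under these assumptions both errors come out below 2 cm, and passing a length greater than 12 gives the same result.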
116
138
117
139
@@ -120,27 +142,54 @@ A point can be expressed as a [geohash](https://en.wikipedia.org/wiki/Geohash).
120
142
The following parameters are accepted by `geo_point` fields:
: If `true`, malformed geopoints are ignored. If `false` (default), malformed geopoints throw an exception and reject the whole document. A geopoint is considered malformed if its latitude is outside the range -90 <= latitude <= 90, or if its longitude is outside the range -180 <= longitude <= 180. Note that this cannot be set if the `script` parameter is used.
145
+
: If `true`, malformed geopoints are ignored.
146
+
If `false` (default), malformed geopoints throw an exception and reject the whole document.
147
+
A geopoint is considered malformed if its latitude is outside the range -90 <= latitude <= 90,
148
+
or if its longitude is outside the range -180 <= longitude <= 180.
149
+
When set to `true`, if the format is valid, but the values are out of range,
150
+
the values will be normalized into the valid range, and the document will be indexed.
151
+
This is a special case, and a [different behaviour](/reference/elasticsearch/mapping-reference/ignore-malformed.md#_ignore_malformed_geo_point) from the normal for `ignore_malformed`.
152
+
Note that this cannot be set if the `script` parameter is used.
124
153
125
154
`ignore_z_value`
126
-
: If `true` (default) three dimension points will be accepted (stored in source) but only latitude and longitude values will be indexed; the third dimension is ignored. If `false`, geopoints containing any more than latitude and longitude (two dimensions) values throw an exception and reject the whole document. Note that this cannot be set if the `script` parameter is used.
155
+
: If `true` (default) three dimension points will be accepted (stored in source)
156
+
but only latitude and longitude values will be indexed; the third dimension is
157
+
ignored. If `false`, geopoints containing any more than latitude and longitude
158
+
(two dimensions) values throw an exception and reject the whole document. Note
159
+
that this cannot be set if the `script` parameter is used.
: Should the field be quickly searchable? Accepts `true` (default) and `false`. Fields that only have [`doc_values`](/reference/elasticsearch/mapping-reference/doc-values.md) enabled can still be queried, albeit slower.
162
+
: Should the field be quickly searchable? Accepts `true` (default) and
163
+
`false`. Fields that only have [`doc_values`](/reference/elasticsearch/mapping-reference/doc-values.md)
: Accepts an geopoint value which is substituted for any explicit `null` values. Defaults to `null`, which means the field is treated as missing. Note that this cannot be set if the `script` parameter is used.
167
+
: Accepts a geopoint value which is substituted for any explicit `null` values.
168
+
Defaults to `null`, which means the field is treated as missing. Note that this
169
+
cannot be set if the `script` parameter is used.
133
170
134
171
`on_script_error`
135
-
: Defines what to do if the script defined by the `script` parameter throws an error at indexing time. Accepts `fail` (default), which will cause the entire document to be rejected, and `continue`, which will register the field in the document’s [`_ignored`](/reference/elasticsearch/mapping-reference/mapping-ignored-field.md) metadata field and continue indexing. This parameter can only be set if the `script` field is also set.
172
+
: Defines what to do if the script defined by the `script` parameter
173
+
throws an error at indexing time. Accepts `fail` (default), which
174
+
will cause the entire document to be rejected, and `continue`, which
175
+
will register the field in the document’s [`_ignored`](/reference/elasticsearch/mapping-reference/mapping-ignored-field.md) metadata field and continue
176
+
indexing. This parameter can only be set if the `script` field is
177
+
also set.
136
178
137
179
`script`
138
-
: If this parameter is set, then the field will index values generated by this script, rather than reading the values directly from the source. If a value is set for this field on the input document, then the document will be rejected with an error. Scripts are in the same format as their [runtime equivalent](docs-content://manage-data/data-store/mapping/map-runtime-field.md), and should emit points as a pair of (lat, lon) double values.
180
+
: If this parameter is set, then the field will index values generated
181
+
by this script, rather than reading the values directly from the
182
+
source. If a value is set for this field on the input document, then
183
+
the document will be rejected with an error.
184
+
Scripts are in the same format as their [runtime equivalent](docs-content://manage-data/data-store/mapping/map-runtime-field.md), and should emit points
185
+
as a pair of (lat, lon) double values.
139
186
140
187
141
188
## Using geopoints in scripts [_using_geopoints_in_scripts]
142
189
143
-
When accessing the value of a geopoint in a script, the value is returned as a `GeoPoint` object, which allows access to the `.lat` and `.lon` values respectively:
190
+
When accessing the value of a geopoint in a script, the value is returned as
191
+
a `GeoPoint` object, which allows access to the `.lat` and `.lon` values
Synthetic `_source` is Generally Available only for TSDB indices (indices that have `index.mode` set to `time_series`). For other indices synthetic `_source` is in technical preview. Features in technical preview may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.
211
+
Synthetic `_source` is Generally Available only for TSDB indices
212
+
(indices that have `index.mode` set to `time_series`). For other indices
213
+
synthetic `_source` is in technical preview. Features in technical preview may
214
+
be changed or removed in a future release. Elastic will work to fix
215
+
any issues, but features in technical preview are not subject to the support SLA
216
+
of official GA features.
163
217
::::
164
218
165
219
166
-
Synthetic source may sort `geo_point` fields (first by latitude and then longitude) and reduces them to their stored precision. For example:
220
+
Synthetic source may sort `geo_point` fields (first by latitude and then
221
+
longitude) and reduces them to their stored precision. For example:
## Dealing with malformed fields [_dealing_with_malformed_fields]
105
105
106
-
Malformed fields are silently ignored at indexing time when `ignore_malformed` is turned on. Whenever possible it is recommended to keep the number of documents that have a malformed field contained, or queries on this field will become meaningless. Elasticsearch makes it easy to check how many documents have malformed fields by using `exists`,`term` or `terms` queries on the special [`_ignored`](/reference/elasticsearch/mapping-reference/mapping-ignored-field.md) field.
107
-
106
+
Malformed fields are silently ignored at indexing time when `ignore_malformed` is turned on.
107
+
Whenever possible it is recommended to keep the number of documents that have a malformed field contained,
108
+
or queries on this field will become meaningless.
109
+
Elasticsearch makes it easy to check how many documents have malformed fields by using `exists`,
110
+
`term` or `terms` queries on the special [`_ignored`](/reference/elasticsearch/mapping-reference/mapping-ignored-field.md) field.
111
+
112
+
## The special case of `geo_point` fields [_ignore_malformed_geo_point]
113
+
114
+
With [`geo_point`](/reference/elasticsearch/mapping-reference/geo-point.md) fields,
115
+
there is the special case of values that have a syntactically valid format,
116
+
but the numerical values for `latitude` and `longitude` are out of range.
117
+
If `ignore_malformed` is `false`, an exception will be thrown as usual. But if it is `true`,
118
+
the document will be indexed correctly, by normalizing the latitude and longitude values into the valid range.
119
+
The special [`_ignored`](/reference/elasticsearch/mapping-reference/mapping-ignored-field.md) field will not be set.
120
+
The original source document will remain as before, but indexed values, doc-values and stored fields will all be normalized.
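
As a rough illustration of what "normalizing into the valid range" can mean, the Python sketch below wraps longitude around the antimeridian and reflects latitude over the poles (shifting longitude by 180 degrees when it does so). This is an assumption about the general technique, not a copy of the Elasticsearch implementation, and the function names are hypothetical. Edge cases may be handled differently by Elasticsearch.

```python
def normalize_lon(lon):
    """Wrap an out-of-range longitude into [-180, 180]."""
    if -180.0 <= lon <= 180.0:
        return lon
    return (lon + 180.0) % 360.0 - 180.0


def normalize_point(lat, lon):
    """Return (lat, lon) with both values inside the valid ranges.

    Assumption: a latitude that passes a pole is reflected back and the
    longitude is shifted by 180 degrees to the opposite meridian.
    """
    if not -90.0 <= lat <= 90.0:
        lat = (lat + 180.0) % 360.0 - 180.0  # bring into [-180, 180)
        if lat > 90.0:
            lat, lon = 180.0 - lat, lon + 180.0
        elif lat < -90.0:
            lat, lon = -180.0 - lat, lon + 180.0
    return lat, normalize_lon(lon)


print(normalize_point(45.0, 190.0))   # longitude wraps past the antimeridian
print(normalize_point(100.0, 0.0))    # latitude passes the north pole
```

In-range points are returned unchanged, which mirrors the documented behaviour that only out-of-range values are affected.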