Commit 2582b1a

Document new ip_location processor (elastic#116623)
1 parent 08306e7 commit 2582b1a

3 files changed

+243
-16
lines changed

docs/reference/ingest/processors.asciidoc

Lines changed: 5 additions & 1 deletion
@@ -77,7 +77,10 @@ Computes a hash of the document’s content.
7777
Converts geo-grid definitions of grid tiles or cells to regular bounding boxes or polygons which describe their shape.
7878

7979
<<geoip-processor, `geoip` processor>>::
80-
Adds information about the geographical location of an IPv4 or IPv6 address.
80+
Adds information about the geographical location of an IPv4 or IPv6 address from a MaxMind database.
81+
82+
<<ip-location-processor, `ip_location` processor>>::
83+
Adds information about the geographical location of an IPv4 or IPv6 address from an IP geolocation database.
8184

8285
<<network-direction-processor, `network_direction` processor>>::
8386
Calculates the network direction given a source IP address, destination IP address, and a list of internal networks.
@@ -245,6 +248,7 @@ include::processors/grok.asciidoc[]
245248
include::processors/gsub.asciidoc[]
246249
include::processors/html_strip.asciidoc[]
247250
include::processors/inference.asciidoc[]
251+
include::processors/ip-location.asciidoc[]
248252
include::processors/join.asciidoc[]
249253
include::processors/json.asciidoc[]
250254
include::processors/kv.asciidoc[]

docs/reference/ingest/processors/geoip.asciidoc

Lines changed: 13 additions & 15 deletions
@@ -13,7 +13,7 @@ ASN IP geolocation databases from http://dev.maxmind.com/geoip/geoip2/geolite2/[
1313
CC BY-SA 4.0 license. It automatically downloads these databases if your nodes can connect to the `storage.googleapis.com` domain and either:
1414

1515
* `ingest.geoip.downloader.eager.download` is set to true
16-
* your cluster has at least one pipeline with a `geoip` processor
16+
* your cluster has at least one pipeline with a `geoip` or `ip_location` processor
1717

1818
{es} automatically downloads updates for these databases from the Elastic GeoIP
1919
endpoint:
@@ -25,10 +25,10 @@ If your cluster can't connect to the Elastic GeoIP endpoint or you want to
2525
manage your own updates, see <<manage-geoip-database-updates>>.
2626

2727
If you would like to have {es} download database files directly from MaxMind using your own provided
28-
license key, see <<put-geoip-database-api>>.
28+
license key, see <<put-ip-location-database-api>>.
2929

3030
If {es} can't connect to the endpoint for 30 days, all updated databases will become
31-
invalid. {es} will stop enriching documents with geoip data and will add `tags: ["_geoip_expired_database"]`
31+
invalid. {es} will stop enriching documents with IP geolocation data and will add the `tags: ["_geoip_expired_database"]`
3232
field instead.
3333

3434
[[using-ingest-geoip]]
@@ -40,11 +40,11 @@ field instead.
4040
|======
4141
| Name | Required | Default | Description
4242
| `field` | yes | - | The field to get the IP address from for the geographical lookup.
43-
| `target_field` | no | geoip | The field that will hold the geographical information looked up from the MaxMind database.
44-
| `database_file` | no | GeoLite2-City.mmdb | The database filename referring to one of the automatically downloaded GeoLite2 databases (GeoLite2-City.mmdb, GeoLite2-Country.mmdb, or GeoLite2-ASN.mmdb), or the name of a supported database file in the `ingest-geoip` config directory, or the name of a <<get-geoip-database-api, configured database>> (with the `.mmdb` suffix appended).
45-
| `properties` | no | [`continent_name`, `country_iso_code`, `country_name`, `region_iso_code`, `region_name`, `city_name`, `location`] * | Controls what properties are added to the `target_field` based on the geoip lookup.
43+
| `target_field` | no | geoip | The field that will hold the geographical information looked up from the database.
44+
| `database_file` | no | GeoLite2-City.mmdb | The database filename referring to one of the automatically downloaded GeoLite2 databases (GeoLite2-City.mmdb, GeoLite2-Country.mmdb, or GeoLite2-ASN.mmdb), or the name of a supported database file in the `ingest-geoip` config directory, or the name of a <<get-ip-location-database-api, configured database>> (with the `.mmdb` suffix appended).
45+
| `properties` | no | [`continent_name`, `country_iso_code`, `country_name`, `region_iso_code`, `region_name`, `city_name`, `location`] * | Controls what properties are added to the `target_field` based on the IP geolocation lookup.
4646
| `ignore_missing` | no | `false` | If `true` and `field` does not exist, the processor quietly exits without modifying the document
47-
| `first_only` | no | `true` | If `true` only first found geoip data will be returned, even if `field` contains array
47+
| `first_only` | no | `true` | If `true`, only the first found IP geolocation data is returned, even if `field` contains an array.
4848
| `download_database_on_pipeline_creation` | no | `true` | If `true` (and if `ingest.geoip.downloader.eager.download` is `false`), the missing database is downloaded when the pipeline is created. Otherwise, the download is triggered when the pipeline is used as the `default_pipeline` or `final_pipeline` in an index.
4949
|======
5050

@@ -79,15 +79,13 @@ depend on what has been found and which properties were configured in `propertie
7979
`residential_proxy`, `domain`, `isp`, `isp_organization_name`, `mobile_country_code`, `mobile_network_code`, `user_type`, and
8080
`connection_type`. The fields actually added depend on what has been found and which properties were configured in `properties`.
8181

82-
preview::["Do not use the GeoIP2 Anonymous IP, GeoIP2 Connection Type, GeoIP2 Domain, GeoIP2 ISP, and GeoIP2 Enterprise databases in production environments. This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features."]
83-
8482
Here is an example that uses the default city database and adds the geographical information to the `geoip` field based on the `ip` field:
8583

8684
[source,console]
8785
--------------------------------------------------
8886
PUT _ingest/pipeline/geoip
8987
{
90-
"description" : "Add geoip info",
88+
"description" : "Add ip geolocation info",
9189
"processors" : [
9290
{
9391
"geoip" : {
@@ -138,7 +136,7 @@ this database is downloaded automatically. So this:
138136
--------------------------------------------------
139137
PUT _ingest/pipeline/geoip
140138
{
141-
"description" : "Add geoip info",
139+
"description" : "Add ip geolocation info",
142140
"processors" : [
143141
{
144142
"geoip" : {
@@ -190,7 +188,7 @@ cannot be found:
190188
--------------------------------------------------
191189
PUT _ingest/pipeline/geoip
192190
{
193-
"description" : "Add geoip info",
191+
"description" : "Add ip geolocation info",
194192
"processors" : [
195193
{
196194
"geoip" : {
@@ -256,7 +254,7 @@ PUT my_ip_locations
256254
--------------------------------------------------
257255
PUT _ingest/pipeline/geoip
258256
{
259-
"description" : "Add geoip info",
257+
"description" : "Add ip geolocation info",
260258
"processors" : [
261259
{
262260
"geoip" : {
@@ -429,7 +427,7 @@ The `geoip` processor supports the following setting:
429427

430428
The maximum number of results that should be cached. Defaults to `1000`.
431429

432-
Note that these settings are node settings and apply to all `geoip` processors, i.e. there is one cache for all defined `geoip` processors.
430+
Note that these settings are node settings and apply to all `geoip` and `ip_location` processors, i.e. there is a single cache for all such processors.
433431

434432
[[geoip-cluster-settings]]
435433
===== Cluster settings
@@ -458,7 +456,7 @@ each node's <<es-tmpdir,temporary directory>> at `$ES_TMPDIR/geoip-databases/<no
458456
Note that {es} will make a GET request to `${ingest.geoip.downloader.endpoint}?elastic_geoip_service_tos=agree`,
459457
expecting the list of metadata about databases typically found in `overview.json`.
460458

461-
The GeoIP downloader uses the JDK's builtin cacerts. If you're using a custom endpoint, add the custom https endpoint cacert(s) to the JDK's truststore.
459+
The downloader uses the JDK's built-in cacerts. If you're using a custom endpoint, add the custom HTTPS endpoint cacert(s) to the JDK's truststore.
462460

463461
[[ingest-geoip-downloader-poll-interval]]
464462
`ingest.geoip.downloader.poll.interval`::
Lines changed: 225 additions & 0 deletions
@@ -0,0 +1,225 @@
1+
[[ip-location-processor]]
2+
=== IP location processor
3+
++++
4+
<titleabbrev>IP Location</titleabbrev>
5+
++++
6+
7+
The `ip_location` processor adds information about the geographical location of an
8+
IPv4 or IPv6 address.
9+
10+
[[ip-location-automatic-updates]]
11+
By default, the processor uses the GeoLite2 City, GeoLite2 Country, and GeoLite2
12+
ASN IP geolocation databases from http://dev.maxmind.com/geoip/geoip2/geolite2/[MaxMind], shared under the
13+
CC BY-SA 4.0 license. It automatically downloads these databases if your nodes can connect to the `storage.googleapis.com` domain and either:
14+
15+
* `ingest.geoip.downloader.eager.download` is set to true
16+
* your cluster has at least one pipeline with a `geoip` or `ip_location` processor
17+
18+
{es} automatically downloads updates for these databases from the Elastic GeoIP
19+
endpoint:
20+
https://geoip.elastic.co/v1/database?elastic_geoip_service_tos=agree[https://geoip.elastic.co/v1/database].
21+
To get download statistics for these updates, use the <<geoip-stats-api,GeoIP
22+
stats API>>.
23+
24+
If your cluster can't connect to the Elastic GeoIP endpoint or you want to
25+
manage your own updates, see <<manage-geoip-database-updates>>.
26+
27+
If you would like to have {es} download database files directly from MaxMind using your own provided
28+
license key, see <<put-ip-location-database-api>>.
29+
30+
If {es} can't connect to the endpoint for 30 days, all updated databases will become
31+
invalid. {es} will stop enriching documents with IP geolocation data and will add the `tags: ["_ip_location_expired_database"]`
32+
field instead.
33+
34+
[[using-ingest-ip-location]]
35+
==== Using the `ip_location` Processor in a Pipeline
36+
37+
[[ingest-ip-location-options]]
38+
.`ip_location` options
39+
[options="header"]
40+
|======
41+
| Name | Required | Default | Description
42+
| `field` | yes | - | The field to get the IP address from for the geographical lookup.
43+
| `target_field` | no | ip_location | The field that will hold the geographical information looked up from the database.
44+
| `database_file` | no | GeoLite2-City.mmdb | The database filename referring to one of the automatically downloaded GeoLite2 databases (GeoLite2-City.mmdb, GeoLite2-Country.mmdb, or GeoLite2-ASN.mmdb), or the name of a supported database file in the `ingest-geoip` config directory, or the name of a <<get-ip-location-database-api, configured database>> (with the `.mmdb` suffix appended).
45+
| `properties` | no | [`continent_name`, `country_iso_code`, `country_name`, `region_iso_code`, `region_name`, `city_name`, `location`] * | Controls what properties are added to the `target_field` based on the IP geolocation lookup.
46+
| `ignore_missing` | no | `false` | If `true` and `field` does not exist, the processor quietly exits without modifying the document
47+
| `first_only` | no | `true` | If `true`, only the first found IP geolocation data is returned, even if `field` contains an array.
48+
| `download_database_on_pipeline_creation` | no | `true` | If `true` (and if `ingest.geoip.downloader.eager.download` is `false`), the missing database is downloaded when the pipeline is created. Otherwise, the download is triggered when the pipeline is used as the `default_pipeline` or `final_pipeline` in an index.
49+
|======
50+
51+
*Depends on what is available in `database_file`:
52+
53+
* If a GeoLite2 City or GeoIP2 City database is used, then the following fields may be added under the `target_field`: `ip`,
54+
`country_iso_code`, `country_name`, `country_in_european_union`, `registered_country_iso_code`, `registered_country_name`, `registered_country_in_european_union`,
55+
`continent_code`, `continent_name`, `region_iso_code`, `region_name`, `city_name`, `postal_code`, `timezone`,
56+
`location`, and `accuracy_radius`. The fields actually added depend on what has been found and which properties were configured in `properties`.
57+
* If a GeoLite2 Country or GeoIP2 Country database is used, then the following fields may be added under the `target_field`: `ip`,
58+
`country_iso_code`, `country_name`, `country_in_european_union`, `registered_country_iso_code`, `registered_country_name`, `registered_country_in_european_union`,
59+
`continent_code`, and `continent_name`. The fields actually added depend on what has been found
60+
and which properties were configured in `properties`.
61+
* If the GeoLite2 ASN database is used, then the following fields may be added under the `target_field`: `ip`,
62+
`asn`, `organization_name` and `network`. The fields actually added depend on what has been found and which properties were configured
63+
in `properties`.
64+
* If the GeoIP2 Anonymous IP database is used, then the following fields may be added under the `target_field`: `ip`,
65+
`hosting_provider`, `tor_exit_node`, `anonymous_vpn`, `anonymous`, `public_proxy`, and `residential_proxy`. The fields actually added
66+
depend on what has been found and which properties were configured in `properties`.
67+
* If the GeoIP2 Connection Type database is used, then the following fields may be added under the `target_field`: `ip` and
68+
`connection_type`. The fields actually added depend on what has been found and which properties were configured in `properties`.
69+
* If the GeoIP2 Domain database is used, then the following fields may be added under the `target_field`: `ip` and `domain`.
70+
The fields actually added depend on what has been found and which properties were configured in `properties`.
71+
* If the GeoIP2 ISP database is used, then the following fields may be added under the `target_field`: `ip`, `asn`,
72+
`organization_name`, `network`, `isp`, `isp_organization_name`, `mobile_country_code`, and `mobile_network_code`. The fields actually added
73+
depend on what has been found and which properties were configured in `properties`.
74+
* If the GeoIP2 Enterprise database is used, then the following fields may be added under the `target_field`: `ip`,
75+
`country_iso_code`, `country_name`, `country_in_european_union`, `registered_country_iso_code`, `registered_country_name`, `registered_country_in_european_union`,
76+
`continent_code`, `continent_name`, `region_iso_code`, `region_name`, `city_name`, `postal_code`, `timezone`,
77+
`location`, `accuracy_radius`, `country_confidence`, `city_confidence`, `postal_confidence`, `asn`, `organization_name`, `network`,
78+
`hosting_provider`, `tor_exit_node`, `anonymous_vpn`, `anonymous`, `public_proxy`,
79+
`residential_proxy`, `domain`, `isp`, `isp_organization_name`, `mobile_country_code`, `mobile_network_code`, `user_type`, and
80+
`connection_type`. The fields actually added depend on what has been found and which properties were configured in `properties`.
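
As a sketch of how `properties` narrows the output, the following pipeline requests only the two country fields from the default GeoLite2 City database. The pipeline name `ip_location_country_only` is hypothetical and chosen only for illustration. `ignore_missing` is set so that documents without an `ip` field pass through unmodified.

[source,console]
--------------------------------------------------
PUT _ingest/pipeline/ip_location_country_only
{
  "description" : "Add only country-level IP geolocation info (illustrative sketch)",
  "processors" : [
    {
      "ip_location" : {
        "field" : "ip",
        "properties" : ["country_iso_code", "country_name"],
        "ignore_missing" : true
      }
    }
  ]
}
--------------------------------------------------

With this configuration, only `country_iso_code` and `country_name` can appear under the default `ip_location` target field, and only when the lookup finds them.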
81+
82+
Here is an example that uses the default city database and adds the geographical information to the `ip_location` field based on the `ip` field:
83+
84+
[source,console]
85+
--------------------------------------------------
86+
PUT _ingest/pipeline/ip_location
87+
{
88+
"description" : "Add ip geolocation info",
89+
"processors" : [
90+
{
91+
"ip_location" : {
92+
"field" : "ip"
93+
}
94+
}
95+
]
96+
}
97+
PUT my-index-000001/_doc/my_id?pipeline=ip_location
98+
{
99+
"ip": "89.160.20.128"
100+
}
101+
GET my-index-000001/_doc/my_id
102+
--------------------------------------------------
103+
104+
Which returns:
105+
106+
[source,console-result]
107+
--------------------------------------------------
108+
{
109+
"found": true,
110+
"_index": "my-index-000001",
111+
"_id": "my_id",
112+
"_version": 1,
113+
"_seq_no": 55,
114+
"_primary_term": 1,
115+
"_source": {
116+
"ip": "89.160.20.128",
117+
"ip_location": {
118+
"continent_name": "Europe",
119+
"country_name": "Sweden",
120+
"country_iso_code": "SE",
121+
"city_name" : "Linköping",
122+
"region_iso_code" : "SE-E",
123+
"region_name" : "Östergötland County",
124+
"location": { "lat": 58.4167, "lon": 15.6167 }
125+
}
126+
}
127+
}
128+
--------------------------------------------------
129+
// TESTRESPONSE[s/"_seq_no": \d+/"_seq_no" : $body._seq_no/ s/"_primary_term": 1/"_primary_term" : $body._primary_term/]
130+
131+
Here is an example that uses the default country database and adds the
132+
geographical information to the `geo` field based on the `ip` field. Note that
133+
this database is downloaded automatically. So this:
134+
135+
[source,console]
136+
--------------------------------------------------
137+
PUT _ingest/pipeline/ip_location
138+
{
139+
"description" : "Add ip geolocation info",
140+
"processors" : [
141+
{
142+
"ip_location" : {
143+
"field" : "ip",
144+
"target_field" : "geo",
145+
"database_file" : "GeoLite2-Country.mmdb"
146+
}
147+
}
148+
]
149+
}
150+
PUT my-index-000001/_doc/my_id?pipeline=ip_location
151+
{
152+
"ip": "89.160.20.128"
153+
}
154+
GET my-index-000001/_doc/my_id
155+
--------------------------------------------------
156+
157+
returns this:
158+
159+
[source,console-result]
160+
--------------------------------------------------
161+
{
162+
"found": true,
163+
"_index": "my-index-000001",
164+
"_id": "my_id",
165+
"_version": 1,
166+
"_seq_no": 65,
167+
"_primary_term": 1,
168+
"_source": {
169+
"ip": "89.160.20.128",
170+
"geo": {
171+
"continent_name": "Europe",
172+
"country_name": "Sweden",
173+
"country_iso_code": "SE"
174+
}
175+
}
176+
}
177+
--------------------------------------------------
178+
// TESTRESPONSE[s/"_seq_no": \d+/"_seq_no" : $body._seq_no/ s/"_primary_term": 1/"_primary_term" : $body._primary_term/]
179+
180+
181+
Not all IP addresses have geo information in the database. When this
182+
occurs, no `target_field` is inserted into the document.
183+
184+
Here is an example of how a document is indexed when information for "80.231.5.0"
185+
cannot be found:
186+
187+
[source,console]
188+
--------------------------------------------------
189+
PUT _ingest/pipeline/ip_location
190+
{
191+
"description" : "Add ip geolocation info",
192+
"processors" : [
193+
{
194+
"ip_location" : {
195+
"field" : "ip"
196+
}
197+
}
198+
]
199+
}
200+
201+
PUT my-index-000001/_doc/my_id?pipeline=ip_location
202+
{
203+
"ip": "80.231.5.0"
204+
}
205+
206+
GET my-index-000001/_doc/my_id
207+
--------------------------------------------------
208+
209+
Which returns:
210+
211+
[source,console-result]
212+
--------------------------------------------------
213+
{
214+
"_index" : "my-index-000001",
215+
"_id" : "my_id",
216+
"_version" : 1,
217+
"_seq_no" : 71,
218+
"_primary_term": 1,
219+
"found" : true,
220+
"_source" : {
221+
"ip" : "80.231.5.0"
222+
}
223+
}
224+
--------------------------------------------------
225+
// TESTRESPONSE[s/"_seq_no" : \d+/"_seq_no" : $body._seq_no/ s/"_primary_term": 1/"_primary_term" : $body._primary_term/]
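
To check how particular addresses resolve without indexing any documents, one option is the simulate pipeline API. The request below is a minimal sketch that assumes the default GeoLite2 City database has already been downloaded. It runs an inline `ip_location` processor against the two sample addresses used in the examples above.

[source,console]
--------------------------------------------------
POST _ingest/pipeline/_simulate
{
  "pipeline" : {
    "description" : "Try out ip_location on sample addresses (illustrative sketch)",
    "processors" : [
      {
        "ip_location" : {
          "field" : "ip"
        }
      }
    ]
  },
  "docs" : [
    { "_source" : { "ip" : "89.160.20.128" } },
    { "_source" : { "ip" : "80.231.5.0" } }
  ]
}
--------------------------------------------------

Each entry in the response shows the document as it would look after the processor runs, so an address with no match can be spotted by the absence of the target field.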
