
Commit 4d775cb

kkrik-es and felixbarny authored
Add documentation for passthrough field type (elastic#114720)
* Guard second doc parsing pass with index setting
* add test
* updates
* updates
* merge
* Add documentation for passthrough field type
* Apply suggestions from code review
  Co-authored-by: Felix Barnsteiner <[email protected]>
* updates
* updates
* Update docs/reference/mapping/types/passthrough.asciidoc
  Co-authored-by: Felix Barnsteiner <[email protected]>
* address comment
* address comment
* Update docs/reference/mapping/types/passthrough.asciidoc
  Co-authored-by: Felix Barnsteiner <[email protected]>
* address comment

---------

Co-authored-by: Felix Barnsteiner <[email protected]>
1 parent 551a7d6 commit 4d775cb

6 files changed, +247 -23 lines changed


docs/reference/data-streams/set-up-tsds.asciidoc

Lines changed: 10 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -121,7 +121,8 @@ naming scheme].
121121
* Specify a mapping that defines your dimensions and metrics:
122122

123123
** One or more <<time-series-dimension,dimension fields>> with a `time_series_dimension` value of `true`.
124-
At least one of these dimensions must be a plain `keyword` field.
124+
Alternatively, one or more <<passthrough-dimensions, pass-through>> fields configured as dimension containers,
125+
provided that they will contain at least one sub-field (mapped statically or dynamically).
125126

126127
** One or more <<time-series-metric,metric fields>>, marked using the `time_series_metric` mapping parameter.
127128

@@ -203,10 +204,9 @@ DELETE _ilm/policy/my-weather-sensor-lifecycle-policy
203204
Documents in a TSDS must include:
204205

205206
* A `@timestamp` field
206-
* One or more dimension fields. At least one dimension must be a `keyword` field
207-
that matches the `index.routing_path` index setting, if specified. If not specified
208-
explicitly, `index.routing_path` is set automatically to whichever mappings have
209-
`time_series_dimension` set to `true`.
207+
* One or more dimension fields. At least one dimension must match the `index.routing_path` index setting,
208+
if specified. If not specified explicitly, `index.routing_path` is set automatically to whichever mappings have
209+
`time_series_dimension` set to `true`.
210210

211211
To automatically create your TSDS, submit an indexing request that
212212
targets the TSDS's name. This name must match one of your index template's
@@ -285,13 +285,12 @@ POST metrics-weather_sensors-dev/_rollover
285285

286286
Configuring a TSDS via an index template that uses component templates is a bit more complicated.
287287
Typically with component templates mappings and settings get scattered across multiple component templates.
288-
When configuring the `index.mode` setting in a component template, the `index.routing_path` setting needs to
289-
be defined in the same component template. Additionally the fields mentioned in the `index.routing_path`
290-
also need to be defined in the same component template with the `time_series_dimension` attribute enabled.
288+
If the `index.routing_path` is defined, the fields it references need to be defined in the same component
289+
template with the `time_series_dimension` attribute enabled.
291290

292-
The reasons for this is that each component template needs to be valid on its own and the time series index mode
293-
requires the `index.routing_path` setting. When configuring the `index.mode` setting in an index template, the `index.routing_path` setting is configured automatically. It is derived from
294-
the field mappings with `time_series_dimension` attribute enabled.
291+
The reason for this is that each component template needs to be valid on its own. When configuring the
292+
`index.mode` setting in an index template, the `index.routing_path` setting is configured automatically.
293+
It is derived from the field mappings with `time_series_dimension` attribute enabled.
295294

296295
[discrete]
297296
[[set-up-tsds-whats-next]]

docs/reference/data-streams/tsds.asciidoc

Lines changed: 9 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -109,7 +109,10 @@ parameter:
109109
* <<number,`unsigned_long`>>
110110
* <<boolean,`boolean`>>
111111

112-
For a flattened field, use the `time_series_dimensions` parameter to configure an array of fields as dimensions. For details refer to <<flattened-params,`flattened`>>.
112+
For a flattened field, use the `time_series_dimensions` parameter to configure an array of fields as dimensions.
113+
For details refer to <<flattened-params,`flattened`>>.
114+
115+
Dimension definitions can be simplified through <<passthrough-dimensions, pass-through>> fields.
113116

114117
[discrete]
115118
[[time-series-metric]]
@@ -294,12 +297,15 @@ When you create the matching index template for a TSDS, you must specify one or
294297
more dimensions in the `index.routing_path` setting. Each document in a TSDS
295298
must contain one or more dimensions that match the `index.routing_path` setting.
296299

297-
Dimensions in the `index.routing_path` setting must be plain `keyword` fields.
298300
The `index.routing_path` setting accepts wildcard patterns (for example `dim.*`)
299301
and can dynamically match new fields. However, {es} will reject any mapping
300-
updates that add scripted, runtime, or non-dimension, non-`keyword` fields that
302+
updates that add scripted, runtime, or non-dimension fields that
301303
match the `index.routing_path` value.
302304

305+
<<passthrough-dimensions, pass-through>> fields may be configured
306+
as dimension containers. In this case, their sub-fields are included in the
307+
routing path automatically.
308+
303309
TSDS documents don't support a custom `_routing` value. Similarly, you can't
304310
require a `_routing` value in mappings for a TSDS.
305311

docs/reference/mapping/params/subobjects.asciidoc

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -111,6 +111,7 @@ PUT my-index-000001/_doc/metric_1
111111

112112
The `subobjects` setting for existing fields and the top-level mapping definition cannot be updated.
113113

114+
[[subobjects-auto-flattening]]
114115
==== Auto-flattening object mappings
115116

116117
It is generally recommended to define the properties of an object that is configured with `subobjects: false` with dotted field names

docs/reference/mapping/types.asciidoc

Lines changed: 9 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -35,12 +35,13 @@ Dates:: Date types, including <<date,`date`>> and
3535
[[object-types]]
3636
==== Objects and relational types
3737

38-
<<object,`object`>>:: A JSON object.
39-
<<flattened,`flattened`>>:: An entire JSON object as a single field value.
40-
<<nested,`nested`>>:: A JSON object that preserves the relationship
41-
between its subfields.
42-
<<parent-join,`join`>>:: Defines a parent/child relationship for documents
43-
in the same index.
38+
<<object,`object`>>:: A JSON object.
39+
<<flattened,`flattened`>>:: An entire JSON object as a single field value.
40+
<<nested,`nested`>>:: A JSON object that preserves the relationship
41+
between its subfields.
42+
<<parent-join,`join`>>:: Defines a parent/child relationship for documents
43+
in the same index.
44+
<<passthrough,`passthrough`>>:: Provides aliases for sub-fields at the same level.
4445

4546

4647
[discrete]
@@ -167,6 +168,8 @@ include::types/numeric.asciidoc[]
167168

168169
include::types/object.asciidoc[]
169170

171+
include::types/passthrough.asciidoc[]
172+
170173
include::types/percolator.asciidoc[]
171174

172175
include::types/point.asciidoc[]

docs/reference/mapping/types/passthrough.asciidoc

Lines changed: 218 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,218 @@
1+
[[passthrough]]
2+
=== Pass-through object field type
3+
++++
4+
<titleabbrev>Pass-through object</titleabbrev>
5+
++++
6+
7+
Pass-through objects extend the functionality of <<object, objects>> by allowing their subfields to be
8+
accessed without including the name of the pass-through object as a prefix. For instance:
9+
10+
[source,console]
11+
--------------------------------------------------
12+
PUT my-index-000001
13+
{
14+
"mappings": {
15+
"properties": {
16+
"attributes": {
17+
"type": "passthrough", <1>
18+
"priority": 10,
19+
"properties": {
20+
"id": {
21+
"type": "keyword"
22+
}
23+
}
24+
}
25+
}
26+
}
27+
}
28+
29+
PUT my-index-000001/_doc/1
30+
{
31+
"attributes" : { <2>
32+
"id": "foo",
33+
"zone": 10
34+
}
35+
}
36+
37+
GET my-index-000001/_search
38+
{
39+
"query": {
40+
"bool": {
41+
"must": [
42+
{ "match": { "id": "foo" }}, <3>
43+
{ "match": { "zone": 10 }}
44+
]
45+
}
46+
}
47+
}
48+
49+
GET my-index-000001/_search
50+
{
51+
"query": {
52+
"bool": {
53+
"must": [
54+
{ "match": { "attributes.id": "foo" }}, <4>
55+
{ "match": { "attributes.zone": 10 }}
56+
]
57+
}
58+
}
59+
}
60+
61+
--------------------------------------------------
62+
63+
<1> An object is defined as pass-through. Its priority (required) is used for conflict resolution.
64+
<2> Object contents get indexed as usual, including dynamic mappings.
65+
<3> Sub-fields can be referenced in queries as if they're defined at the root level.
66+
<4> Sub-fields can also be referenced including the object name as prefix.
67+
68+
[[passthrough-conflicts]]
69+
==== Conflict resolution
70+
71+
It's possible for conflicting names to arise, for fields that are defined within different scopes:
72+
73+
1. A pass-through object is defined next to a field that has the same name as one of the pass-through object
74+
sub-fields, e.g.
75+
76+
[source,console]
77+
--------------------------------------------------
78+
PUT my-index-000001/_doc/1
79+
{
80+
"attributes" : {
81+
"id": "foo"
82+
},
83+
"id": "bar"
84+
}
85+
--------------------------------------------------
86+
87+
In this case, references to `id` point to the field at the root level, while field `attributes.id`
88+
can only be accessed using the full path.
89+
90+
2. Two (or more) pass-through objects are defined within the same object and contain fields with the same name, e.g.
91+
92+
[source,console]
93+
--------------------------------------------------
94+
PUT my-index-000002
95+
{
96+
"mappings": {
97+
"properties": {
98+
"attributes": {
99+
"type": "passthrough",
100+
"priority": 10,
101+
"properties": {
102+
"id": {
103+
"type": "keyword"
104+
}
105+
}
106+
},
107+
"resource.attributes": {
108+
"type": "passthrough",
109+
"priority": 20,
110+
"properties": {
111+
"id": {
112+
"type": "keyword"
113+
}
114+
}
115+
}
116+
}
117+
}
118+
}
119+
--------------------------------------------------
120+
121+
In this case, the `priority` parameter is used for conflict resolution, with higher values taking precedence. In the
122+
example above, `resource.attributes` has higher priority than `attributes`, so references to `id` point to the field
123+
within `resource.attributes`. `attributes.id` can still be accessed using its full path.
124+
125+
[[passthrough-dimensions]]
126+
==== Defining sub-fields as time-series dimensions
127+
128+
It is possible to configure a pass-through field as a container for <<time-series-dimension,time-series dimensions>>.
129+
In this case, all sub-fields get annotated with the same parameter under the covers, and they're also
130+
included in <<dimension-based-routing, routing path>> and <<tsid, tsid>> calculations, thus simplifying
131+
the <<tsds,TSDS>> setup:
132+
133+
[source,console]
134+
--------------------------------------------------
135+
PUT _index_template/my-metrics
136+
{
137+
"index_patterns": ["metrics-mymetrics-*"],
138+
"priority": 200,
139+
"data_stream": { },
140+
"template": {
141+
"settings": {
142+
"index.mode": "time_series"
143+
},
144+
"mappings": {
145+
"properties": {
146+
"attributes": {
147+
"type": "passthrough",
148+
"priority": 10,
149+
"time_series_dimension": true,
150+
"properties": {
151+
"host.name": {
152+
"type": "keyword"
153+
}
154+
}
155+
},
156+
"cpu": {
157+
"type": "integer",
158+
"time_series_metric": "counter"
159+
}
160+
}
161+
}
162+
}
163+
}
164+
165+
POST metrics-mymetrics-test/_doc
166+
{
167+
"@timestamp": "2020-01-01T00:00:00.000Z",
168+
"attributes" : {
169+
"host.name": "foo",
170+
"zone": "bar"
171+
},
172+
"cpu": 10
173+
}
174+
--------------------------------------------------
175+
// TEST[skip: The @timestamp value won't match an accepted range in the TSDS]
176+
177+
In the example above, `attributes` is defined as a dimension container. Its sub-fields `host.name` (static) and `zone`
178+
(dynamic) get included in the routing path and tsid, and can be referenced in queries without the `attributes.` prefix.
179+
180+
[[passthrough-flattening]]
181+
==== Sub-field auto-flattening
182+
183+
Pass-through fields apply <<subobjects-auto-flattening, auto-flattening>> to sub-fields by default, to reduce dynamic
184+
mapping conflicts. As a consequence, no sub-object definitions are allowed within pass-through fields.
185+
186+
[[passthrough-params]]
187+
==== Parameters for `passthrough` fields
188+
189+
The following parameters are accepted by `passthrough` fields:
190+
191+
[horizontal]
192+
193+
<<passthrough-conflicts,`priority`>>::
194+
195+
(Required) Used to resolve naming conflicts between pass-through fields. The field with the highest value wins.
196+
Accepts non-negative integer values.
197+
198+
<<passthrough-dimensions,`time_series_dimension`>>::
199+
200+
Whether or not to treat sub-fields as <<time-series-dimension,time-series dimensions>>.
201+
Accepts `false` (default) or `true`.
202+
203+
<<dynamic,`dynamic`>>::
204+
205+
Whether or not new `properties` should be added dynamically to an existing object.
206+
Accepts `true` (default), `runtime`, `false` and `strict`.
207+
208+
<<enabled,`enabled`>>::
209+
210+
Whether the JSON value given for the object field should be parsed and indexed (`true`, default)
211+
or completely ignored (`false`).
212+
213+
<<properties,`properties`>>::
214+
215+
The fields within the object, which can be of any <<mapping-types,data type>>, except that sub-object definitions are not allowed.
216+
New properties may be added to an existing object.
217+
218+
IMPORTANT: If you need to index arrays of objects instead of single objects, read <<nested>> first.
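
Editor's illustration, not part of the commit: the conflict-resolution rules described in the pass-through documentation above (a root-level field shadows a pass-through sub-field of the same name, and between pass-through objects the higher `priority` wins) can be sketched in a few lines of Python. This is only a model of the documented behavior, not the Elasticsearch implementation. The `PassThrough` class and the `resolve_aliases` function are hypothetical names introduced for this sketch, and the handling of equal priorities is an assumption, since the documentation does not address ties.

```python
from dataclasses import dataclass, field


@dataclass
class PassThrough:
    """Hypothetical stand-in for a pass-through object mapping."""
    name: str
    priority: int
    subfields: set[str] = field(default_factory=set)


def resolve_aliases(root_fields: set[str],
                    passthroughs: list[PassThrough]) -> dict[str, str]:
    """Map each unprefixed field name to the full path it refers to."""
    resolved: dict[str, str] = {}
    # Visit containers from lowest to highest priority, so that a
    # higher-priority container overwrites a lower-priority alias.
    # Assumption: with equal priorities, the later list entry wins.
    for pt in sorted(passthroughs, key=lambda p: p.priority):
        for sub in pt.subfields:
            resolved[sub] = f"{pt.name}.{sub}"
    # A field defined at the root level shadows any pass-through alias.
    for name in root_fields:
        resolved[name] = name
    return resolved


mappings = [
    PassThrough("attributes", 10, {"id", "zone"}),
    PassThrough("resource.attributes", 20, {"id"}),
]
aliases = resolve_aliases(set(), mappings)
shadowed = resolve_aliases({"id"}, mappings)
```

With the two mappings from the documentation's second example, `aliases["id"]` refers to `resource.attributes.id`, while `zone` still resolves to `attributes.zone`. Once a root-level `id` exists, as in `shadowed`, the unprefixed name refers to the root field and both sub-fields remain reachable only through their full paths.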

server/src/main/java/org/elasticsearch/index/mapper/PassThroughObjectMapper.java

Lines changed: 0 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -34,9 +34,6 @@
3434
* In case different pass-through objects contain subfields with the same name (excluding the pass-through prefix), their aliases conflict.
3535
* To resolve this, the pass-through spec specifies which object takes precedence through required parameter "priority"; non-negative
3636
* integer values are accepted, with the highest priority value winning in case of conflicting aliases.
37-
*
38-
* Note that this is an experimental, undocumented mapper type, currently intended for prototyping purposes only.
39-
* It has not been vetted for use in production systems.
4037
*/
4138
public class PassThroughObjectMapper extends ObjectMapper {
4239
public static final String CONTENT_TYPE = "passthrough";
