manage-data/ingest/transform-enrich/error-handling.md
@@ -93,7 +93,7 @@ We can restructure the pipeline by moving the `on_failure` handling directly int
:::{note}
While executing two `set` processors within the `dissect` error handler may not always be ideal, it serves as a demonstration.
:::

For the `dissect` processor, consider setting a temporary field like `_tmp.error: dissect_failure`. You can then use `if` conditions in later processors to execute them only if parsing failed, allowing for more controlled and flexible error handling.
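As a sketch of this pattern (the dissect pattern, field names, and the downstream `set` processor are illustrative, not part of the original example):

```json
{
  "processors": [
    {
      "dissect": {
        "field": "message",
        "pattern": "%{clientip} %{verb}",
        "on_failure": [
          { "set": { "field": "_tmp.error", "value": "dissect_failure" } }
        ]
      }
    },
    {
      "set": {
        "if": "ctx._tmp?.error == 'dissect_failure'",
        "field": "event.outcome",
        "value": "failure"
      }
    }
  ]
}
```

The second processor only runs when the first one failed, because its `if` condition checks the temporary error field.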
@@ -14,115 +14,116 @@ There are many ways to achieve similar results when creating ingest pipelines, w
This guide does not provide guidance on optimizing for ingest pipeline performance.
:::
## Access fields

When creating ingest pipelines, there are a few options for accessing fields in conditional statements and scripts. All formats can be used to reference fields, so choose the one that makes your pipeline easier to read and maintain.

| Notation | Example | Notes |
|---|---|---|
| Dot notation | `ctx.event.action` | Supported in conditionals and painless scripts. |
| Square bracket notation | `ctx['event']['action']` | Supported in conditionals and painless scripts. |
| Mixed dot and bracket notation | `ctx.event['action']` | Supported in conditionals and painless scripts. |
| Getter | `$('event.action', null);` | Only supported in painless scripts. |

Below are some general guidelines for choosing the right option in each situation.

### Dot notation [dot-notation]

**Benefits:**
* Clean and easy to read.
* Supports null safety operations `?`.
  For example, ...

**Limitations:**
* Does not support field names that contain a `.` or any special characters such as `@`.
  Use [Bracket notation](#bracket-notation) instead.
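As a quick, illustrative sketch, a processor condition combining dot notation with the null safe operator (the `event.action` field is a hypothetical example) might look like:

```painless
ctx.event?.action == 'logout'
```

If the `event` object is missing, `ctx.event?.action` evaluates to `null` instead of throwing an error.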

### Bracket notation [bracket-notation]

**Benefits:**
* Supports special characters such as `@` in the field name.
  For example, if there's a field named `has@!%&chars`, you would use `ctx['has@!%&chars']`.
* Supports field names that contain `.`.
  For example, if there's a field named `foo.bar`, `ctx.foo.bar` tries to access the field `bar` in the object `foo` in the object `ctx`, while `ctx['foo.bar']` accesses the field directly.

**Limitations:**
* Slightly more verbose than dot notation.
* No support for null safety operations `?`.
  Use [Dot notation](#dot-notation) instead.

### Mixed dot and bracket notation

**Benefits:**
* You can mix dot notation and bracket notation to take advantage of the benefits of both formats.
  For example, you could use `ctx.my.nested.object['has@!%&chars']`. You can then apply the `?` operator to the fields accessed with dot notation while still reaching a field whose name contains special characters: `ctx.my?.nested?.object['has@!%&chars']`.

**Limitations:**
* Slightly more difficult to read.

### Getter

Within a script you can use the same two notations described above, as well as the `getter` shorthand `$()`. The getter only works in painless scripts in an ingest pipeline.
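A minimal sketch of the getter with a fallback value (the `user_name` field name is illustrative):

```painless
// Create the `user` object first; assigning ctx.user.name errors if `user` does not exist.
ctx.user = new HashMap();
// $('field', fallback) returns the fallback when the field is null or missing.
ctx.user.name = $('user_name', null);
```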

% For example, take the following input:
%
% ```json
% {
%   "_source": {
%     "user_name": "philipp"
%   }
% }
% ```
%
% When you want to set the `user.name` field with a script:
%
% - `ctx.user.name = ctx.user_name`
%
% This works as long as `user_name` is populated. If it is null you will get `null` as the value. Additionally, when the `user` object does not exist, it will error because Java needs you to define the `user` object first before adding a key `name` into it. We cover the `new HashMap()` further down.
%
% This is one of the alternatives to get it working when you only want to set it if it is not null:
%
% ```painless
% if (ctx.user_name != null) {
%   ctx.user = new HashMap();
%   ctx.user.name = ctx.user_name;
% }
% ```
%
% This works fine, as you now check for null.
%
% However, there is also an easier to write and maintain alternative available:
%
% - `ctx.user.name = $('user_name', null);`
%
% This `$('field', 'fallback')` allows you to specify the field without the `ctx` for walking. You can even supply `$('this.very.nested.field.is.super.far.away', null)` when you need to. The fallback is used in case the field is null. This comes in very handy when you need to do certain manipulations of data. Let's say you want to lowercase all the field names; you can simply write this now:
% You see that I switched the null value to an empty String, since the String has the `toLowerCase()` function. This of course works with all types. Bit of a silly thing, since you could simply write `object.abc` as the field value. As an example, you can see that we can even create a map, list, array, whatever you want:
%
% - `if ($('object', {}).containsKey('abc')){}`
%
% One common thing I use it for is when dealing with numbers and casting. The field specifies the usage in `%`; however, Elasticsearch doesn't like this, or better to say, Kibana renders `%` as `0-1` for `0%-100%` and not `0-100`. `100` is equal to `10.000%`.
%
% - field: `cpu_usage = 100.00`
% - `ctx.cpu.usage = $('cpu_usage', 0.0)/100`
%
% This allows me to always set the `cpu.usage` field without worrying about it, and to have an always-working division. One other way to leverage this, in a simpler script, is like this, but most scripts are rather complex, so this is not that often applicable.
Use conditionals (`if` statements) to ensure that an ingest pipeline processor is only applied when specific conditions are met.
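For example, a `set` processor that only runs when a hypothetical field has a specific value might look like this sketch:

```json
{
  "set": {
    "if": "ctx.event?.action == 'logout'",
    "field": "event.category",
    "value": "authentication"
  }
}
```

The processor is skipped entirely when the condition evaluates to `false`.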

% In an ingest pipeline, when working with conditionals inside processors, error handling is a bit more complex: most importantly, any errors coming from null values, missing keys, or missing values inside the conditional will lead to an error that is not captured by the `ignore_failure` handler and will exit the pipeline.

### Avoid excessive OR conditions
@@ -146,7 +147,9 @@ When using the [boolean OR operator](elasticsearch://reference/scripting-languag
This example only checks for exact matches. Do not use this approach if you need to check for partial matches.
:::

### Use null safe operators (`?.`) [null-safe-operators]
Anticipate potential problems with the data, and use the [null safe operator](elasticsearch://reference/scripting-languages/painless/painless-operators-reference.md#null-safe-operator) (`?.`) to prevent data from being processed incorrectly.
In the simplest case, the `ignore_missing` parameter, available in most processors, handles fields without values, and the `ignore_failure` parameter lets a processor fail without impacting the pipeline. But sometimes you will need to use the [null safe operator `?.`](elasticsearch://reference/scripting-languages/painless/painless-operators-reference.md#null-safe-operator) to check that a field exists and is not `null`.
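For instance, a condition like the following sketch (reusing the illustrative `user.name` field) stays safe even when intermediate objects are missing:

```painless
// Without ?., this condition throws if `user` is missing; with ?., it simply evaluates to null.
ctx.user?.name == 'philipp'
```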