{{transforms-cap}} enable you to convert existing {{es}} indices into summarized indices, which provide opportunities for new insights and analytics. For example, you can use {{transforms}} to pivot your data into entity-centric indices that summarize the behavior of users, sessions, or other entities in your data. Or you can use {{transforms}} to find the latest document among all the documents that have a certain unique key.
You can create {{transform}} rules under **{{stack-manage-app}} > {{rules-ui}}**.
1. Click **Create rule** and select the {{transform}} health rule type.
2. Give a name to the rule and optionally provide tags.
3. Select the {{transform}} or {{transforms}} to include. You can also use a wildcard (`*`) to apply the rule to all your {{transforms}}. {{transforms-cap}} created after the rule are automatically included.
4. The following health checks are available and enabled by default:
As the last step in the rule creation process, define its actions.
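If you manage rules programmatically, a similar rule can be sketched with the Kibana alerting API. The request below is a minimal sketch: the `transform_health` rule type ID is documented, but the consumer, schedule, and parameter names are assumptions to verify against the alerting API reference for your Kibana version.

```console
# Sketch only: verify the consumer and parameter names against your
# Kibana version's alerting API documentation before use.
POST kbn:/api/alerting/rule
{
  "name": "transform-health",
  "rule_type_id": "transform_health",
  "consumer": "alerts",
  "schedule": { "interval": "1m" },
  "params": {
    "includeTransforms": ["*"]
  }
}
```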
## Defining actions [defining-actions]
You can add one or more actions to your rule to generate notifications when its conditions are met and when they are no longer met. In particular, this rule type supports:
After you select a connector, you must set the action frequency.
::::{note}
If you choose a custom action interval, it cannot be shorter than the rule’s check interval.
::::
Alternatively, you can set the action frequency such that actions run for each alert. Choose how often the action runs (at each check interval, only when the alert status changes, or at a custom action interval). You must also choose an action group, which indicates whether the action runs when the issue is detected or when it is recovered.
You can further refine the conditions under which actions run by specifying that actions only run when they match a KQL query or when an alert occurs within a specific time frame.
After you save the configurations, the rule appears in the **{{rules-ui}}** list.
The name of an alert is always the same as the {{transform}} ID of the associated {{transform}} that triggered it. You can mute the notifications for a particular {{transform}} on the page of the rule that lists the individual alerts. You can open it via **{{rules-ui}}** by selecting the rule name.
## Action variables [transform-action-variables]
The following variables are specific to the {{transform}} health rule type. You can also specify [variables common to all rules](../alerts/kibana/rule-action-variables.md).
For more examples, refer to [Rule action variables](../alerts/kibana/rule-action-variables.md).
Each time a {{transform}} examines the source indices and creates or updates the destination index, it generates a *checkpoint*.
If your {{transform}} runs only once, there is logically only one checkpoint. If your {{transform}} runs continuously, however, it creates checkpoints as it ingests and transforms new source data. The `sync` property of the {{transform}} configures checkpointing by specifying a time field.
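For example, a continuously running {{transform}} might declare its `sync` property as follows; the field name and delay value are illustrative:

```console
# Illustrative fragment of a transform configuration. The `delay` allows
# for documents that arrive late relative to their time field.
"sync": {
  "time": {
    "field": "timestamp",
    "delay": "60s"
  }
}
```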
The {{transform}} applies changes related to either new or changed entities or time buckets to the destination index. The set of changes can be paginated. The {{transform}} performs a composite aggregation similar to the batch {{transform}} operation; however, it also injects query filters based on the previous step to reduce the amount of work. After all changes have been applied, the checkpoint is complete.
This checkpoint process involves both search and indexing activity on the cluster. While developing {{transforms}}, we favored control over performance: it is preferable for a {{transform}} to take longer to complete than to finish quickly and take precedence in resource consumption. That said, the cluster still requires enough resources to support both the composite aggregation search and the indexing of its results.
::::{tip}
If the cluster experiences unsuitable performance degradation due to the {{transform}}, stop the {{transform}} and refer to [Performance considerations](transform-overview.md#transform-performance).
::::
## Using the ingest timestamp for syncing the {{transform}} [sync-field-ingest-timestamp]
In most cases, it is strongly recommended to use the ingest timestamp of the source indices for syncing the {{transform}}. This is the optimal way for {{transforms}} to identify new changes. If your data source follows the [ECS standard](https://www.elastic.co/guide/en/ecs/{{ecs_version}}/ecs-reference.html), you might already have an [`event.ingested`](https://www.elastic.co/guide/en/ecs/{{ecs_version}}/ecs-event.html#field-event-ingested) field. In this case, use `event.ingested` as the `sync`.`time`.`field` property of your {{transform}}.
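If your data does not yet have such a field, one common approach is an ingest pipeline with a `set` processor that copies the ingest timestamp into `event.ingested`. A minimal sketch, with an illustrative pipeline name:

```console
PUT _ingest/pipeline/set-ingested-timestamp
{
  "description": "Copy the ingest timestamp into event.ingested",
  "processors": [
    {
      "set": {
        "field": "event.ingested",
        "value": "{{{_ingest.timestamp}}}"
      }
    }
  ]
}
```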
After you create the ingest pipeline, apply it to the source indices of your {{transform}}.
Refer to [Add a pipeline to an indexing request](../../manage-data/ingest/transform-enrich/ingest-pipelines.md#add-pipeline-to-indexing-request) and [Ingest pipelines](../../manage-data/ingest/transform-enrich/ingest-pipelines.md) to learn more about how to use an ingest pipeline.
When the {{transform}} runs in continuous mode, it updates the documents in the destination index as new data comes in. The {{transform}} uses a set of heuristics called change detection to update the destination index with fewer operations.
For example, when the data is grouped by host names, change detection identifies which host names have changed and updates only those entities in the destination index.
Another heuristic can be applied to time buckets when a `date_histogram` is used to group by time buckets. Change detection detects which time buckets have changed and updates only those.
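For example, grouping into hourly time buckets can be sketched as the following `group_by` fragment (the field name is illustrative); change detection can then limit updates to the buckets that received changes:

```console
# Illustrative `group_by` fragment of a transform configuration.
"group_by": {
  "time_bucket": {
    "date_histogram": {
      "field": "timestamp",
      "calendar_interval": "1h"
    }
  }
}
```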
Failures in {{transforms}} tend to be related to searching or indexing. To increase the resiliency of {{transforms}}, the cursor positions of the aggregated search and the changed entities search are tracked in memory and persisted periodically.
These examples demonstrate how to use {{transforms}} to derive useful insights from your data. All the examples use one of the [{{kib}} sample datasets](https://www.elastic.co/guide/en/kibana/current/add-sample-data.html). For a more detailed, step-by-step example, see [Tutorial: Transforming the eCommerce sample data](ecommerce-transforms.md).
* [Finding your best customers](#example-best-customers)
1. The destination index for the {{transform}}. It is ignored by `_preview`.
2. Two `group_by` fields are selected. This means the {{transform}} contains a unique row per `user` and `customer_id` combination. Within this data set, both these fields are unique. Including both in the {{transform}} gives more context to the final results.
::::{note}
In the example above, condensed JSON formatting is used for easier readability of the pivot object.
::::
The preview {{transforms}} API enables you to see the layout of the {{transform}} in advance, populated with some sample values.
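As a sketch, a preview request for this example could look like the following. The source index and `group_by` fields come from the ecommerce sample data set; the destination index name and the two aggregations are illustrative choices, not necessarily the exact ones used in this example:

```console
POST _transform/_preview
{
  "source": { "index": "kibana_sample_data_ecommerce" },
  "dest": { "index": "ecommerce-customers" },
  "pivot": {
    "group_by": {
      "user": { "terms": { "field": "user" } },
      "customer_id": { "terms": { "field": "customer_id" } }
    },
    "aggregations": {
      "order_count": { "value_count": { "field": "order_id" } },
      "total_spent": { "sum": { "field": "taxful_total_price" } }
    }
  }
}
```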
This {{transform}} makes it easier to answer questions such as:
* Which customers spend the most?
It’s possible to answer these questions using aggregations alone; however, {{transforms}} allow us to persist this data as a customer-centric index. This enables us to analyze data at scale and gives more flexibility to explore and navigate data from a customer-centric perspective. In some cases, it can even make creating visualizations much simpler.
## Finding air carriers with the most delays [example-airline]
This example uses the Flights sample data set to find out which air carrier had the most delays. First, filter the source data such that it excludes all the cancelled flights by using a query filter. Then transform the data to contain the distinct number of flights, the sum of delayed minutes, and the sum of the flight minutes by air carrier. Finally, use a [`bucket_script`](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-pipeline-bucket-script-aggregation.html) to determine what percentage of the flight time was spent in delays.
3. The data is grouped by the `Carrier` field which contains the airline name.
4. This `bucket_script` performs calculations on the results that are returned by the aggregation. In this particular example, it calculates what percentage of travel time was taken up by delays.
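The steps above can be sketched as a preview request. The field names (`Cancelled`, `Carrier`, `FlightDelayMin`, `FlightTimeMin`, `FlightNum`) come from the flights sample data set; the aggregation names are illustrative:

```console
POST _transform/_preview
{
  "source": {
    "index": "kibana_sample_data_flights",
    "query": {
      "bool": {
        "filter": [ { "term": { "Cancelled": false } } ]
      }
    }
  },
  "pivot": {
    "group_by": {
      "carrier": { "terms": { "field": "Carrier" } }
    },
    "aggregations": {
      "flights_count": { "cardinality": { "field": "FlightNum" } },
      "delay_mins_total": { "sum": { "field": "FlightDelayMin" } },
      "flight_mins_total": { "sum": { "field": "FlightTimeMin" } },
      "delay_time_percentage": {
        "bucket_script": {
          "buckets_path": {
            "delay_time": "delay_mins_total.value",
            "flight_time": "flight_mins_total.value"
          },
          "script": "(params.delay_time / params.flight_time) * 100"
        }
      }
    }
  }
}
```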
The preview shows the layout of the new index, with one summary document per carrier.
This data is fictional and does not reflect actual delays or flight stats for any of the featured destination or origin airports.
This example uses the web log sample data set to identify suspicious client IPs. It transforms the data such that the new index contains the sum of bytes and the number of distinct URLs, agents, incoming requests by location, and geographic destinations for each client IP. It also uses filter aggregations to count the specific types of HTTP responses that each client IP receives. Ultimately, the example transforms web log data into an entity-centric index where the entity is `clientip`.
4. Filter aggregation that counts the occurrences of successful (`200`) responses in the `response` field. The following two aggregations (`error404` and `error5xx`) count the error responses by error codes, matching an exact value or a range of response codes.
5. This `bucket_script` calculates the duration of the `clientip` access based on the results of the aggregation.
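Putting these pieces together, the {{transform}} body can be sketched as follows. The field names come from the web log sample data set; the destination index name, aggregation names, and exact filter values are illustrative:

```console
PUT _transform/suspicious_client_ips
{
  "source": { "index": "kibana_sample_data_logs" },
  "dest": { "index": "weblogs-by-clientip" },
  "pivot": {
    "group_by": {
      "clientip": { "terms": { "field": "clientip" } }
    },
    "aggregations": {
      "bytes_sum": { "sum": { "field": "bytes" } },
      "url_dc": { "cardinality": { "field": "url.keyword" } },
      "agent_dc": { "cardinality": { "field": "agent.keyword" } },
      "geo_dest_dc": { "cardinality": { "field": "geo.dest" } },
      "success": { "filter": { "term": { "response": "200" } } },
      "error404": { "filter": { "term": { "response": "404" } } },
      "error5xx": { "filter": { "range": { "response": { "gte": "500", "lt": "600" } } } },
      "timestamp_min": { "min": { "field": "timestamp" } },
      "timestamp_max": { "max": { "field": "timestamp" } },
      "duration_ms": {
        "bucket_script": {
          "buckets_path": {
            "min_time": "timestamp_min.value",
            "max_time": "timestamp_max.value"
          },
          "script": "params.max_time - params.min_time"
        }
      }
    }
  }
}
```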
After you create the {{transform}}, you must start it:

```console
POST _transform/suspicious_client_ips/_start
```
::::{note}
Like other Kibana sample data sets, the web log sample data set contains timestamps relative to when you installed it, including timestamps in the future. The {{ctransform}} will pick up the data points once they are in the past. If you installed the web log sample data set some time ago, you can uninstall and reinstall it and the timestamps will change.
::::
This {{transform}} makes it easier to answer questions such as:
* Which client IPs are transferring the most data?
* Which client IPs are interacting with a high number of different URLs?
* Which client IPs have high error rates?
* Which client IPs are interacting with a high number of destination countries?
## Finding the last log event for each IP address [example-last-log]
This example uses the web log sample data set to find the last log from an IP address. Let’s use the `latest` type of {{transform}} in continuous mode. It copies the most recent document for each unique key from the source index to the destination index and updates the destination index as new data comes into the source index.
4. Contains the time field and delay settings used to synchronize the source and destination indices.
5. Specifies the retention policy for the transform. Documents that are older than the configured value will be removed from the destination index.
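The {{transform}} described above can be sketched as follows; the destination index name and the interval values are illustrative:

```console
PUT _transform/last-log-from-clientip
{
  "source": { "index": "kibana_sample_data_logs" },
  "dest": { "index": "last-log-from-clientip" },
  "latest": {
    "unique_key": [ "clientip" ],
    "sort": "timestamp"
  },
  "sync": {
    "time": { "field": "timestamp", "delay": "60s" }
  },
  "retention_policy": {
    "time": { "field": "timestamp", "max_age": "30d" }
  },
  "frequency": "1m"
}
```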
After you create the {{transform}}, start it:

```console
POST _transform/last-log-from-clientip/_start
```
After the {{transform}} processes the data, search the destination index (the index configured in the transform’s `dest.index`; `last-log-from-clientip` is assumed here):

```console
GET last-log-from-clientip/_search
```
This {{transform}} makes it easier to answer questions such as:
* What was the most recent log event associated with a specific IP address?
## Finding client IPs that sent the most bytes to the server [example-bytes]
This example uses the web log sample data set to find the client IP that sent the most bytes to the server in every hour. The example uses a `pivot` {{transform}} with a [`top_metrics`](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-top-metrics.html) aggregation.
2. Calculates the maximum value of the `bytes` field.
3. Specifies the fields (`clientip` and `geo.src`) of the top document to return and the sorting method (document with the highest `bytes` value).
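A sketch of such a preview request, with hourly buckets and illustrative aggregation names:

```console
POST _transform/_preview
{
  "source": { "index": "kibana_sample_data_logs" },
  "pivot": {
    "group_by": {
      "timestamp": {
        "date_histogram": { "field": "timestamp", "fixed_interval": "1h" }
      }
    },
    "aggregations": {
      "bytes.max": { "max": { "field": "bytes" } },
      "top": {
        "top_metrics": {
          "metrics": [ { "field": "clientip" }, { "field": "geo.src" } ],
          "sort": { "bytes": "desc" }
        }
      }
    }
  }
}
```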
The API call above returns a response that contains, for each hourly bucket, the maximum `bytes` value together with the `clientip` and `geo.src` fields of the document that sent the most bytes.
## Getting customer name and email address by customer ID [example-customer-names]
This example uses the ecommerce sample data set to create an entity-centric index based on customer ID, and to get the customer name and email address by using the `top_metrics` aggregation.
1. The data is grouped by a `terms` aggregation on the `customer_id` field.
2. Specifies the fields to return (the email and name fields) in descending order by the order date.
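A sketch of such a preview request; the aggregation name is illustrative, and `customer_full_name.keyword` assumes the default mapping of the ecommerce sample data set:

```console
POST _transform/_preview
{
  "source": { "index": "kibana_sample_data_ecommerce" },
  "pivot": {
    "group_by": {
      "customer_id": { "terms": { "field": "customer_id" } }
    },
    "aggregations": {
      "last": {
        "top_metrics": {
          "metrics": [
            { "field": "email" },
            { "field": "customer_full_name.keyword" }
          ],
          "sort": { "order_date": "desc" }
        }
      }
    }
  }
}
```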
The API returns a response that contains the customer name and email address for each customer ID.