The **Advanced** tab shows the underlying {{es}} configuration details and advanced configuration options for your stream.

You can use the **Advanced** tab to add [descriptions](#streams-advanced-description) or [features](#streams-advanced-features) that provide useful information to Streams' AI components. You can also [manually configure](#streams-advanced-index-config) the index or component templates or modify other ingest pipelines used by the stream.
## Stream description
Streams analyzes your data and identifies features. Features are a way to classify some of the data you have in your stream.

Each feature has a natural language description and an optional filter that points to a subset of your data.

For example, in a stream of Kubernetes logs, the feature identification process can identify that you have data from "nginx", which can be found by filtering for `WHERE service.name==nginx`. The feature would also include a description defining nginx.

Features provide useful information for AI processes, such as significant events, and serve as the foundation for those processes.
## Index configuration [streams-advanced-index-config]
:::{note}
Processing and schema changes should typically be made through the Streams interface; none of this manual configuration is required. These options mainly exist to help advanced users maintain familiar workflows.
:::
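
For example, if you do manage index settings outside the UI, a minimal sketch of updating dynamic settings such as replicas and refresh interval with the {{es}} update index settings API might look like the following (the data stream name `logs-myapp-default` is a placeholder, and changing the shard count still requires editing the matching index template):

```console
PUT /logs-myapp-default/_settings
{
  "index": {
    "number_of_replicas": 1,
    "refresh_interval": "30s"
  }
}
```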
For classic streams, you can manually configure the stream's:

Changed file: solutions/observability/streams/management/data-quality.md (6 additions, 4 deletions)
@@ -6,13 +6,13 @@ applies_to:
# Manage data quality [streams-data-retention]

After selecting a stream, use the **Data quality** tab to find failed and degraded documents in your stream. Use the following components to monitor the health of your data and identify and fix issues:

- **Degraded documents:** Documents with the `_ignored` property, usually because of malformed fields or because the limit of total fields was exceeded when `ignore_above:false`. This component shows the total number of degraded documents, the percentage, and the status (**Good**, **Degraded**, **Poor**).
- **Failed documents:** Documents that were rejected during ingestion because of mapping conflicts or pipeline failures.
- **Quality score:** Streams calculates the overall quality score (**Good**, **Degraded**, **Poor**) based on the percentage of degraded and failed documents.
- **Trends over time:** A time series chart that lets you track how degraded and failed documents accumulate over time. Use the date picker to zoom in on a specific range and understand when problems are spiking.
- **Issues:** {applies_to}`stack: preview 9.2` Find issues with specific fields, how often they've occurred, and when they've occurred.
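
If you prefer to inspect degraded documents directly in {{es}}, a search like the following sketch finds documents that have the `_ignored` metadata field set (the `logs-*` target is a placeholder for your stream):

```console
GET /logs-*/_search
{
  "query": {
    "exists": {
      "field": "_ignored"
    }
  }
}
```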
## Failure store
@@ -23,7 +23,9 @@ To view and modify failure store in {{stack}}, you need the following data strea

- `read_failure_store`
- `manage_failure_store`
For more information, refer to [Granting privileges for data streams and aliases](../../../../deploy-manage/users-roles/cluster-or-deployment-auth/granting-privileges-for-data-streams-aliases.md).
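
For example, a role that grants these privileges on log data streams might look like the following sketch (the role name and index pattern are placeholders):

```console
POST /_security/role/streams_failure_store_access
{
  "indices": [
    {
      "names": [ "logs-*" ],
      "privileges": [ "read", "read_failure_store", "manage_failure_store" ]
    }
  ]
}
```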

### Turn on failure stores

In Streams, you need to turn on failure stores to see failed documents. To do this, select **Enable failure store** in the **Failed documents** component. From there, you can also set your failure store retention period.
For more information on data quality, refer to the [data set quality](../../data-set-quality-monitoring.md) documentation.

Changed file: solutions/observability/streams/management/extract.md (25 additions, 29 deletions)
@@ -5,16 +5,16 @@ applies_to:
---
# Extract fields [streams-extract-fields]

After selecting a stream, use the **Processing** tab to add [processors](#streams-extract-processors) that extract meaningful fields from your log messages. These fields let you filter and analyze your data more effectively.
For example, in [Discover](../../../../explore-analyze/discover.md), extracted fields might let you filter for log messages with an `ERROR` log level that occurred during a specific time period to help diagnose an issue. Without extracting the log level and timestamp fields from your messages, those filters wouldn't return meaningful results.

The **Processing** tab also:

- Simulates your processors and provides an immediate [preview](#streams-preview-changes) that's tested end to end
- Flags indexing issues, like [mapping conflicts](#streams-processing-mapping-conflicts), so you can address them before applying changes

After creating your processor, all future data ingested into the stream is parsed into structured fields accordingly.
:::{note}
Applied changes aren't retroactive and only affect *future ingested data*.
@@ -24,12 +24,12 @@ Applied changes aren't retroactive and only affect *future ingested data*.
Streams supports the following processors:

- [**Date**](./extract/date.md): Converts date strings into timestamps, with options for timezone, locale, and output formatting.
- [**Dissect**](./extract/dissect.md): Extracts fields from structured log messages using defined delimiters instead of patterns, making it faster than Grok and ideal for consistently formatted logs.
- [**Grok**](./extract/grok.md): Extracts fields from unstructured log messages using predefined or custom patterns, supports multiple match attempts in sequence, and can automatically generate patterns with an LLM connector.
- [**Set**](./extract/set.md): Assigns a specific value to a field, creating the field if it doesn’t exist or overwriting its value if it does.
- [**Rename**](./extract/rename.md): Changes the name of a field, moving its value to a new field name and removing the original.
- [**Append**](./extract/append.md): Adds a value to an existing array field, or creates the field as an array if it doesn’t exist.
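
Under the hood, these correspond to {{es}} ingest pipeline processors. As an illustration, a dissect configuration at that level might look like the following sketch (the field names and pattern are examples only):

```json
{
  "dissect": {
    "field": "message",
    "pattern": "%{client.ip} [%{@timestamp}] %{log.level} %{message}",
    "ignore_missing": true
  }
}
```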
## Add a processor [streams-add-processors]
@@ -41,7 +41,7 @@ To add a processor from the **Processing** tab:
1. Select a processor from the **Processor** menu.
1. Configure the processor and select **Create** to save the processor.

After adding all desired processors and conditions, select **Save changes**.
Refer to individual [supported processors](#streams-extract-processors) for more on configuring specific processors.
@@ -51,7 +51,7 @@ Editing processors with JSON is planned for a future release, and additional pro
### Add conditions to processors [streams-add-processor-conditions]
You can add conditions to processors so they only run on data that meets those conditions. Each condition is a boolean expression that's evaluated for every document.
To add a condition:
@@ -80,16 +80,16 @@ Streams processors support the following comparators:
After you create processors, the **Data preview** tab simulates processor results with additional filtering options depending on the outcome of the simulation.

When you add or edit processors, the **Data preview** tab updates automatically.
:::{note}
To avoid unexpected results, it's best to add processors rather than remove or reorder existing ones.
:::

The **Data preview** tab loads 100 documents from your existing data and runs your changes against them.

For any newly created processors and conditions, the preview results are reliable, and you can freely create and reorder them during the preview.

After making sure everything in the **Data preview** tab is correct, select **Save changes** to apply your changes to the data stream.
If you edit the stream after saving your changes, keep the following in mind:
@@ -99,34 +99,32 @@ If you edit the stream after saving your changes, keep the following in mind:
### Ignore failures [streams-ignore-failures]
Each processor has the **Ignore failures** option. When enabled, document processing continues even if the processor fails.

Dissect, grok, and rename processors include the **Ignore missing fields** option. When enabled, document processing continues even if a source field is missing.
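
In the underlying {{es}} ingest processors, the equivalent options are `ignore_failure` and `ignore_missing`. For example, a rename processor with both enabled might look like this sketch (field names are illustrative):

```json
{
  "rename": {
    "field": "hostname",
    "target_field": "host.name",
    "ignore_missing": true,
    "ignore_failure": true
  }
}
```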
## Detect and resolve failures [streams-detect-failures]
Documents can fail processing for various reasons. Streams helps you identify and resolve these issues before deploying changes.

In the following screenshot, the **Failed** percentage indicates that some messages didn't match the provided grok pattern:


You can filter your documents by selecting **Parsed** or **Failed** on the **Data preview** tab.
Selecting **Failed** shows the documents that weren't parsed correctly:


Failures are displayed at the bottom of the process editor. Some failures may require fixes, while others simply serve as a warning:

As part of processing, Streams simulates your changes end to end to check for mapping conflicts. If it detects a conflict, Streams marks the processor as failed and displays a message like the following:
@@ -140,9 +138,7 @@ Once saved, the processor displays its success rate and the fields it added.
## Advanced: How and where do these changes get applied to the underlying data stream? [streams-applied-changes]

When you save processors, Streams appends processing to the best-matching ingest pipeline for the data stream. It either chooses the best-matching pipeline ending in `@custom` in your data stream, or it adds one for you.
Streams identifies the appropriate `@custom` pipeline (for example, `logs-myintegration@custom` or `logs@custom`) by checking the `default_pipeline` that is set on the data stream. You can view the default pipeline on the **Advanced** tab under **Ingest pipeline**.
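
For example, you can also check the setting directly with the {{es}} API (the data stream name is a placeholder):

```console
GET /logs-myintegration-default/_settings?filter_path=*.settings.index.default_pipeline
```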
@@ -177,5 +173,5 @@ You can still add your own processors manually to the `@custom` pipeline if need
## Known limitations [streams-known-limitations]

- Streams does not support all processors. More processors will be added in future versions.
- The data preview simulation may not accurately reflect changes to existing data when you edit or reorder existing processors. Streams will allow proper simulations using original documents in a future version.

Changed file: solutions/observability/streams/management/extract/grok.md (4 additions, 4 deletions)
@@ -5,11 +5,11 @@ applies_to:
---
# Grok processor [streams-grok-processor]

The grok processor parses unstructured log messages using a set of predefined patterns to match the log messages and extract fields. It's very powerful and can parse a wide variety of log formats.

You can provide multiple patterns to the grok processor. The grok processor tries to match the log message against each pattern in the order they're provided. If a pattern matches, it extracts the fields and the remaining patterns aren't used.

If a pattern doesn't match, the grok processor tries the next pattern. If no patterns match, the grok processor fails and you can troubleshoot the issue. Instead of writing grok patterns yourself, you can have Streams generate patterns for you. Refer to [generate patterns](#streams-grok-patterns) for more information.
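
For illustration, a grok configuration with an ordered list of patterns, ending in a catch-all, might look like the following sketch (field names are examples):

```json
{
  "grok": {
    "field": "message",
    "patterns": [
      "%{TIMESTAMP_ISO8601:@timestamp} %{LOGLEVEL:log.level} %{GREEDYDATA:message}",
      "%{GREEDYDATA:message}"
    ]
  }
}
```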
:::{tip}
To improve pipeline performance, start with the most common patterns, then add more specific patterns. This reduces the number of times the grok processor has to run.
@@ -44,7 +44,7 @@ Requires an LLM Connector to be configured.

Instead of writing grok patterns by hand, you can use the **Generate Patterns** button to generate patterns for you.

Generated patterns work best on semi-structured data. For very custom logs with a lot of text, creating patterns manually generally produces more accurate results.

The **Manual pipeline configuration** lets you create a JSON-encoded array of ingest pipeline processors. This is helpful if you want to add more advanced processing that isn't currently available as part of the UI-based processors.
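
For example, a JSON-encoded array with two processors that aren't available in the UI might look like the following sketch (field names are illustrative):

```json
[
  {
    "lowercase": {
      "field": "log.level",
      "ignore_missing": true
    }
  },
  {
    "user_agent": {
      "field": "user_agent.original",
      "ignore_missing": true
    }
  }
]
```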
Refer to the following documentation for more on manually configuring processors:

- [Create readable and maintainable ingest pipelines](../../../../../manage-data/ingest/transform-enrich/readable-maintainable-ingest-pipelines.md)
- [Error handling in ingest pipelines](../../../../../manage-data/ingest/transform-enrich/error-handling.md)

Changed file: solutions/observability/streams/management/extract/set.md (1 addition, 1 deletion)
@@ -12,7 +12,7 @@ To use a set processor:
1. Select **Create** → **Create processor**.
1. Select **Set** from the **Processor** menu.
1. Set **Source Field** to the field you want to insert, upsert, or update.
1. Set **Value** to the value you want to assign to the source field.
This functionality uses the {{es}} set pipeline processor. Refer to the [set processor](elasticsearch://reference/enrich-processor/set-processor.md) {{es}} documentation for more information.
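
For reference, the equivalent {{es}} set processor configuration looks roughly like the following sketch (field and value are examples):

```json
{
  "set": {
    "field": "service.environment",
    "value": "production",
    "override": true
  }
}
```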