
Commit 32d71cf

committed: update extract section
1 parent ff3e2d9

File tree

  • solutions/observability/logs/streams/management

1 file changed (+44 -29 lines)
solutions/observability/logs/streams/management/extract.md

@@ -3,18 +3,20 @@ applies_to:
serverless: preview
---

# Extract fields [streams-extract-fields]

Unstructured log messages need to be parsed into meaningful fields so you can filter and analyze them quickly. Common fields to extract include the timestamp and log level, but you can also extract information like IP addresses, usernames, or ports.

Use the **Extract field** page under the **Management** tab to process your data. Changes are immediately available as a preview and tested end-to-end.
Because the UI simulates your changes, you see their effect right away.

The UI also shows indexing problems, such as mapping conflicts, so you can address them before applying changes.

:::{note}
Applied changes aren't retroactive and only affect *future ingested data*.
:::

## Add a processor [streams-add-processors]

Streams uses {{es}} ingest pipelines to process your data. Ingest pipelines are made up of processors that transform your data.

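For example, a minimal pipeline with a single grok processor might look like the following sketch (the pattern and field names are illustrative, not what streams generates):

```json
// Illustrative only: a pipeline with one grok processor
{
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:log.level} %{GREEDYDATA:message}"]
      }
    }
  ]
}
```
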
To add a processor:
@@ -36,7 +38,8 @@ Editing processors with JSON is planned for a future release. More processors ma
:::

### Add conditions to processors [streams-add-processor-conditions]

You can provide a condition for each processor under **Optional fields**. Conditions are boolean expressions that are evaluated for each document. Provide a field, a value, and a comparator (see the sketch after the list).
Processors support these comparators:
- equals
- not equals
@@ -51,31 +54,37 @@ Processors support these comparators:
- not exists

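For example, a condition like `service.name` equals `checkout` corresponds conceptually to an `if` clause on the underlying ingest processor, as in this sketch (the field, value, and pattern are hypothetical):

```json
// Illustrative only: run the processor when service.name equals "checkout"
{
  "grok": {
    "field": "message",
    "patterns": ["%{LOGLEVEL:log.level} %{GREEDYDATA:message}"],
    "if": "ctx.service?.name == 'checkout'"
  }
}
```
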
### Ignore failures [streams-ignore-failures]

Turn on **Ignore failure** to continue processing the document even if the processor fails.
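
In ingest pipeline terms, this corresponds to the standard `ignore_failure` processor option, sketched here with a hypothetical pattern:

```json
// Illustrative only: the document continues even if the pattern doesn't match
{
  "grok": {
    "field": "message",
    "patterns": ["%{IP:client.ip} %{GREEDYDATA:message}"],
    "ignore_failure": true
  }
}
```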

### Ignore missing fields [streams-ignore-missing-fields]

Turn on **Ignore missing fields** to continue processing the document even if the field isn't present.
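
This corresponds to the `ignore_missing` option that many ingest processors support, as in this sketch (the pattern is hypothetical):

```json
// Illustrative only: the processor is skipped when "message" is absent
{
  "grok": {
    "field": "message",
    "patterns": ["%{LOGLEVEL:log.level} %{GREEDYDATA:message}"],
    "ignore_missing": true
  }
}
```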

### Preview changes [streams-preview-changes]

Under **Processors for field extraction**, set pipeline processors to modify your documents. **Data preview** shows you a preview of the results, with additional filtering options depending on the outcome of the simulation.

When you add or edit processors, the **Data preview** updates automatically.

:::{note}
To avoid unexpected results, focus on adding processors rather than removing or reordering existing ones.
:::

**Data preview** loads 100 documents from your existing data and runs your changes using them.
For any newly added processors, this simulation is reliable. You can save individual processors during the preview, and even reorder them.
Selecting **Save changes** applies your changes to the data stream.
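
If you want to reproduce a similar end-to-end check outside the UI, the {{es}} ingest simulate API runs sample documents through a pipeline definition without indexing anything. A minimal sketch (the pattern and sample document are hypothetical):

```json
// POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "grok": {
          "field": "message",
          "patterns": ["%{LOGLEVEL:log.level} %{GREEDYDATA:message}"]
        }
      }
    ]
  },
  "docs": [
    { "_source": { "message": "ERROR connection refused" } }
  ]
}
```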

If you edit the stream again, note the following:
- Adding more processors to the end of the list will work as expected.
- Changing existing processors or re-ordering them may cause unexpected results. Because the pipeline may have already processed the documents used for sampling, the UI cannot accurately simulate changes to existing data.
- Adding a new processor and moving it before an existing processor may cause unexpected results. The UI only simulates the new processor, not the existing ones, so the simulation may not accurately reflect changes to existing data.

![Grok processor](<grok.png>)

## Detect and handle failures [streams-detect-failures]

Documents fail processing for different reasons. Streams helps you find and handle failures before deploying changes.

The following example shows that not all messages matched the provided grok pattern:

@@ -85,34 +94,38 @@ You can filter your documents by selecting **Parsed** or **Failed** at the top o

![Failed documents](<failures.png>)

Failures are displayed at the bottom of the process editor:

![Processor failures](<processor-failures.png>)

Some failures need to be addressed, while others act more as warnings.

### Mapping conflicts

As part of processing, streams also checks for mapping conflicts by simulating the change end to end. If a mapping conflict is detected, streams marks the processor as failed and displays a failure message:

![Mapping conflict](<mapping-conflicts.png>)

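As a hypothetical example, if `http.response.status_code` is already mapped as a number, a processor that extracts it as text would fail this check:

```json
// Illustrative only: %{WORD:...} produces a string,
// which conflicts with an existing numeric mapping
{
  "grok": {
    "field": "message",
    "patterns": ["%{WORD:http.response.status_code}"]
  }
}
```
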
## Processor statistics and detected fields [streams-stats-and-detected-fields]

Once saved, the processor also gives you a quick look at how successful the processing was for this step and which fields were added.

![Processor field statistics](<field-stats.png>)

## Advanced: How and where do these changes get applied to the underlying data stream? [streams-applied-changes]

When you save processors, streams modifies the "best matching" ingest pipeline for the data stream. In short, streams either chooses the best matching pipeline ending in `@custom` that is already part of your data stream, or it adds one for you.

Streams identifies the appropriate `@custom` pipeline (for example, `logs-myintegration@custom` or `logs@custom`).
It checks the `default_pipeline` setting on the data stream.
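
You can also inspect this setting on the data stream's backing indices yourself, for example (the data stream, index, and pipeline names here are hypothetical):

```json
// GET logs-myintegration-default/_settings?filter_path=**.default_pipeline
// might return something like:
{
  ".ds-logs-myintegration-default-2025.01.01-000001": {
    "settings": {
      "index": {
        "default_pipeline": "logs-myintegration-1.0.0"
      }
    }
  }
}
```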

You can view the default pipeline at **Management** → **Advanced** under **Ingest pipeline**.
In this default pipeline, streams locates the last processor that calls a pipeline ending in `@custom`. For integrations, this would result in a pipeline name like `logs-myintegration@custom`. Without an integration, the only `@custom` pipeline available may be `logs@custom`.

- If no default pipeline is detected, streams adds a default pipeline to the data stream by updating the index templates.
- If a default pipeline is detected, but it does not contain a custom pipeline, streams adds the pipeline processor directly to the pipeline.

Streams then adds a pipeline processor to the end of that `@custom` pipeline. This processor definition directs matching documents to a dedicated pipeline managed by streams called `<data_stream_name>@stream.processing`:

// Example processor added to the relevant @custom pipeline
{
@@ -126,11 +139,13 @@ Streams then adds a pipeline processor to the end of that @custom pipeline. This

Streams then creates and manages the `<data_stream_name>@stream.processing` pipeline, placing the processors you configured in the UI (Grok, Set, etc.) inside it.
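
For instance, after you configure a grok and a set processor in the UI, the managed pipeline might conceptually look like this sketch (not streams' exact output; the pattern and field values are hypothetical):

```json
// Illustrative only: possible contents of <data_stream_name>@stream.processing
{
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["%{LOGLEVEL:log.level} %{GREEDYDATA:message}"]
      }
    },
    {
      "set": {
        "field": "event.category",
        "value": "web"
      }
    }
  ]
}
```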

### User interaction with pipelines

Do not manually modify the `<data_stream_name>@stream.processing` pipeline created by streams.
You can still add your own processors manually to the `@custom` pipeline if needed. Adding processors before the pipeline processor that streams created may cause unexpected behavior.
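
For example, in a hypothetical `logs@custom` pipeline, your own processors should come after the pipeline processor that streams added:

```json
// Illustrative only: custom processors belong after the streams-managed one
{
  "processors": [
    {
      "pipeline": {
        "name": "logs-myapp-default@stream.processing"
      }
    },
    {
      "lowercase": {
        "field": "log.level"
      }
    }
  ]
}
```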

## Known limitations [streams-known-limitations]

- The UI does not support all processors. We are working on adding more processors in the future.
- The UI does not support all processor options. We are working on adding more options in the future.
- The simulation may not accurately reflect the changes to the existing data when editing existing processors or re-ordering them.
