Commit 0212d63

Merge pull request #1607 from fluent/alexakreizinger/sc-123153/update-local-testing-validating-your-data
2 parents 5a8d299 + 774d72b commit 0212d63

File tree (1 file changed, +23 −40 lines):

local-testing/validating-your-data-and-structure.md

Lines changed: 23 additions & 40 deletions
````diff
@@ -1,43 +1,34 @@
-# Validating your data and structure
+# Validate your data and structure
 
-Fluent Bit is a powerful log processing tool that supports mulitple sources and
-formats. In addition, it provides filters that can be used to perform custom
-modifications. As your pipeline grows, it's important to validate your data and
-structure.
+Fluent Bit supports multiple sources and formats. In addition, it provides filters that you can use to perform custom modifications. As your pipeline grows, it's important to validate your data and structure.
 
-Fluent Bit users are encouraged to integrate data validation in their contininuous
-integration (CI) systems.
+Fluent Bit users are encouraged to integrate data validation in their continuous integration (CI) systems.
 
-In a normal production environment, inputs, filters, and outputs are defined in the
-configuration. Fluent Bit provides the [Expect](../pipeline/filters/expect.md) filter,
-which can be used to validate `keys` and `values` from your records and take action
-when an exception is found.
+In a normal production environment, inputs, filters, and outputs are defined in configuration files. Fluent Bit provides the [Expect](../pipeline/filters/expect.md) filter, which you can use to validate keys and values from your records and take action when an exception is found.
 
 A simplified view of the data processing pipeline is as follows:
 
 ```mermaid
 flowchart LR
 IS[Inputs / Sources]
 Fil[Filters]
-OD[Outputs/ Destination]
+OD[Outputs / Destination]
 IS --> Fil --> OD
 ```
 
 ## Understand structure and configuration
 
-Consider the following pipeline, where your source of data is a file with JSON
-content and two filters:
+Consider the following pipeline, which uses a JSON file as its data source and has two filters:
 
-- [grep](../pipeline/filters/grep.md) to exclude certain records
-- [record_modifier](../pipeline/filters/record-modifier.md) to alter the record
-content by adding and removing specific keys.
+- [Grep](../pipeline/filters/grep.md) to exclude certain records.
+- [Record Modifier](../pipeline/filters/record-modifier.md) to alter records' content by adding and removing specific keys.
 
 ```mermaid
 flowchart LR
-tail["tail (input)"]
-grep["grep (filter)"]
-record["record_modifier (filter)"]
-stdout["stdout (output)"]
+tail["Tail (input)"]
+grep["Grep (filter)"]
+record["Record Modifier (filter)"]
+stdout["Stdout (output)"]
 
 tail --> grep
 grep --> record
````
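The CI-validation idea mentioned in the hunk above can be sketched outside of Fluent Bit as well. The following is a minimal, hypothetical Python helper (not part of Fluent Bit or this commit) performing the same kind of key-existence check on sample records that the Expect filter performs in-pipeline:

```python
import json

def validate_record(line, required_keys=("color", "label")):
    """Return True if the line is valid JSON containing every required key."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return False
    return all(key in record for key in required_keys)

sample = '{"color": "green", "label": {"name": "abc"}, "meta": null}'
print(validate_record(sample))            # → True
print(validate_record('{"color": "b"}'))  # → False (missing "label")
```

A check like this could run as a CI step against sample log data before configuration changes are merged.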
````diff
@@ -46,7 +37,7 @@ record --> stdout
 
 Add data validation between each step to ensure your data structure is correct.
 
-This example uses the `expect` filter.
+This example uses the [Expect](/pipeline/filters/expect) filter.
 
 ```mermaid
 flowchart LR
````
````diff
@@ -61,16 +52,15 @@ tail --> E1 --> grep
 grep --> E2 --> record --> E3 --> stdout
 ```
 
-`Expect` filters set rules aiming to validate criteria like:
+Expect filters set rules aiming to validate criteria like:
 
-- Does the record contain a key `A`?
+- Does the record contain key `A`?
 - Does the record not contain key `A`?
-- Does the record key `A` value equal `NULL`?
-- Is the record key `A` value not `NULL`?
-- Does the record key `A` value equal `B`?
+- Does the key `A` value equal `NULL`?
+- Is the key `A` value not `NULL`?
+- Does the key `A` value equal `B`?
 
-Every `expect` filter configuration exposes rules to validate the content of your
-records using [configuration properties](../pipeline/filters/expect.md#configuration-parameters).
+Every Expect filter configuration exposes rules to validate the content of your records using [configuration parameters](../pipeline/filters/expect.md#configuration-parameters).
 
 ## Test the configuration
 
````
````diff
@@ -82,9 +72,7 @@ Consider a JSON file `data.log` with the following content:
 {"color": "green", "label": {"name": "abc"}, "meta": null}
 ```
 
-The following Fluent Bit configuration file configures a pipeline to consume the
-log, while applying an `expect` filter to validate that the keys `color` and `label`
-exist:
+The following Fluent Bit configuration file configures a pipeline to consume the log, while applying an Expect filter to validate that the keys `color` and `label` exist:
 
 ```python
 [SERVICE]
````
````diff
@@ -111,12 +99,9 @@ exist:
     match *
 ```
 
-If the JSON parser fails or is missing in the `tail` input
-(`parser json`), the `expect` filter triggers the `exit` action.
+If the JSON parser fails or is missing in the [Tail](/pipeline/inputs/tail) input (`parser json`), the Expect filter triggers the `exit` action.
 
-To extend the pipeline, add a grep filter to match records that map `label`
-containing a key called `name` with value the `abc`, and add an `expect` filter
-to re-validate that condition:
+To extend the pipeline, add a Grep filter to match records that map `label` containing a key called `name` with the value `abc`, and add an Expect filter to re-validate that condition:
 
 ```python
 [SERVICE]
````
````diff
@@ -171,6 +156,4 @@ to re-validate that condition:
 
 ## Production deployment
 
-When deploying in production, consider removing the `expect` filters from your
-configuration. These filters are unneccesary unless you need 100% coverage of
-checks at runtime.
+When deploying in production, consider removing any Expect filters from your configuration file. These filters are unnecessary unless you need 100% coverage of checks at runtime.
````
