Commit f879c02
[Streams] Consistent Grok pattern generation (#230076)
Resolves elastic/streams-program#314
## Summary
Updates the Grok pattern suggestions feature to construct Grok patterns
with heuristics, which improves reliability. Once a Grok pattern has been
created, an LLM is used to determine ECS field names.
Introduces a new shared package `@kbn/grok-heuristics` which exposes
Grok pattern extraction and grouping utilities for log/message analysis.
## Acceptance criteria
- The Generate pattern button should create one Grok processor suggestion
showing parse rate and field metrics
- The user can accept the suggestion or dismiss it
- When the suggestion is accepted, any existing patterns and pattern
definitions are replaced with the suggested Grok processor
- When multiple log formats are detected in the dataset, the suggested
processor should contain a list of fallback patterns, one for each log
format. At most 3 patterns are returned.
- For each suggested Grok pattern, the UI should call the selected LLM to
suggest field names and merge any fields that have been split too
finely
- The Grok pattern should extract fields following the ECS convention for
classic streams or the OTel convention for wired streams. Custom timestamps
should never be extracted to `@timestamp` to avoid date format conflicts.
- Fields that span at least 2 Grok components should be extracted into
their own pattern definition
- When the LLM returns multiple conflicting pattern definitions, they
should be resolved by appending a counter (e.g. CUSTOM_TIMESTAMP2 for
the second pattern definition)
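
The counter-based conflict resolution in the last criterion could be sketched as follows. This is an illustration only, not the code from this commit: the `PatternDefinition` shape and the `resolveDefinitionConflicts` helper are hypothetical names introduced here, and the sketch assumes that two definitions conflict when they share a name but expand to different patterns.

```typescript
// Hypothetical shape: a definition name plus the Grok expression it expands to.
interface PatternDefinition {
  name: string;
  pattern: string;
}

// Returns a map of unique definition names to patterns. When a name is reused
// with a different pattern, a counter is appended (NAME2, NAME3, ...).
function resolveDefinitionConflicts(
  definitions: PatternDefinition[]
): Map<string, string> {
  const resolved = new Map<string, string>();

  for (const { name, pattern } of definitions) {
    const existing = resolved.get(name);

    if (existing === undefined) {
      // First time this name is seen: keep it unchanged.
      resolved.set(name, pattern);
      continue;
    }
    if (existing === pattern) {
      // Same name and same pattern: a duplicate, not a conflict.
      continue;
    }

    // Conflict: find the first free suffix, starting at 2.
    let counter = 2;
    while (resolved.has(`${name}${counter}`)) {
      counter++;
    }
    resolved.set(`${name}${counter}`, pattern);
  }

  return resolved;
}

const definitions = resolveDefinitionConflicts([
  { name: 'CUSTOM_TIMESTAMP', pattern: '%{YEAR}-%{MONTHNUM}-%{MONTHDAY}' },
  { name: 'CUSTOM_TIMESTAMP', pattern: '%{MONTHDAY}/%{MONTH}/%{YEAR}' },
]);
// definitions now holds CUSTOM_TIMESTAMP and CUSTOM_TIMESTAMP2.
```

In a real implementation, any Grok pattern that references a renamed definition would also need to be updated to use the new name; the sketch leaves that step out.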
## Screenshot
<img width="1363" height="1086" alt="Screenshot 2025-08-13 at 08 39 48"
src="https://github.com/user-attachments/assets/71791309-a2b0-4769-8ffd-d3fa3ea11c74"
/>
## Evaluations
| Stream | Before (all docs) | After (all docs) |
|--------------------|-------------------------|------------------------|
| `logs.android` | 0% | 100% |
| `logs.apache` | 100% | 100% |
| `logs.bgl` | 0% | 100% |
| `logs.hadoop` | 100% | 100% |
| `logs.hdfs` | 0% | 100% |
| `logs.healthapp` | 100% | 100% |
| `logs.hpc` | 100% | 100% |
| `logs.linux` | 100% | 100% |
| `logs.mac` | 100% | 100% |
| `logs.openssh` | 71.9% | 100% |
| `logs.openstack` | 100% | 100% |
| `logs.proxifier` | 51.0% | 99.8% |
| `logs.spark` | 99.6% | 99.4% |
| `logs.thunderbird` | 58.1% | 95.2% |
| `logs.windows` | 100% | 100% |
| `logs.zookeeper` | 100% | 100% |

| Metric | Before | After |
|-------------------------------|--------|--------|
| Average Parsing Score (samples) | 75.2% | 100% |
| Average Parsing Score (all docs) | 73.8% | 99.7% |
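
As a quick arithmetic check, the "all docs" averages appear to be the unweighted mean of the 16 per-stream values in the first table. The sketch below recomputes them under that assumption; the variable and function names are illustrative and not taken from the evaluation suite.

```typescript
// "All docs" parse rates copied from the per-stream table, in row order
// (logs.android first, logs.zookeeper last).
const before: number[] = [
  0, 100, 0, 100, 0, 100, 100, 100, 100, 71.9, 100, 51.0, 99.6, 58.1, 100, 100,
];
const after: number[] = [
  100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 99.8, 99.4, 95.2, 100, 100,
];

// Unweighted arithmetic mean; every stream counts equally regardless of size.
function mean(values: number[]): number {
  return values.reduce((sum, value) => sum + value, 0) / values.length;
}

const meanBefore = mean(before); // 1180.6 / 16, about 73.79 (reported as 73.8%)
const meanAfter = mean(after); // 1594.4 / 16, about 99.65 (reported as 99.7%)
```

Both results agree with the summary row to one decimal place, which supports reading it as a per-stream average and not one weighted by document count.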
---------
Co-authored-by: Dario Gieselaar <[email protected]>
Co-authored-by: kibanamachine <[email protected]>
Parent commit: 1a31fab
File tree
55 files changed (+3661, -1924 lines)
- .github
- x-pack
- platform
- packages
- private/kbn-evals-suite-streams
- scripts
- solutions/observability/test/serverless/api_integration/test_suites/logs_essentials_only