Commit 1de3ee6
[fileconsumer] Compute fingerprint for compressed files by decompressing its data (open-telemetry#40256)
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description
This PR changes how fingerprint for compressed files is computed. It now
decompresses first 'N' bytes to compute the fingerprint. This ensures
that data will not be re-ingested when compressed and uncompressed files
exist with the same content. One such case is when log files are rotated
and compressed.
<!-- Issue number (e.g. open-telemetry#1234) or full URL to issue, if applicable. -->
#### Link to tracking issue
Fixes
open-telemetry#37772
<!--Describe what testing was performed and which tests were added.-->
#### Testing
Have `test.log` and `test.log.gz` in your path. If the content of the
compressed file is the same as that of the uncompressed file, the logs
will only be ingested once. Otherwise, we have two readers - one for
plain text and the other for gzipped file.
Start otelcollector-contrib with following config file
```
receivers:
filelog:
include: [ "./test.log*" ] # Path to log file
start_at: beginning # Read from the start of the file
include_file_path: true
include_file_name: true
compression: auto
exporters:
debug/1:
verbosity: detailed
service:
telemetry:
logs:
level: debug
pipelines:
logs:
receivers: [filelog]
exporters: [debug/1]
```
you can see only one set of logs are emitted
---------
Co-authored-by: Vihas Makwana <[email protected]>
Co-authored-by: Andrzej Stencel <[email protected]>1 parent ec280e5 commit 1de3ee6
File tree
6 files changed
+118
-10
lines changed- .chloggen
- pkg/stanza/fileconsumer/internal
- fingerprint
- reader
- receiver/filelogreceiver
6 files changed
+118
-10
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
Lines changed: 38 additions & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
5 | 5 | | |
6 | 6 | | |
7 | 7 | | |
| 8 | + | |
8 | 9 | | |
9 | 10 | | |
10 | 11 | | |
11 | 12 | | |
12 | 13 | | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
13 | 17 | | |
14 | 18 | | |
15 | 19 | | |
16 | 20 | | |
17 | 21 | | |
18 | 22 | | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
19 | 31 | | |
20 | 32 | | |
21 | 33 | | |
| |||
26 | 38 | | |
27 | 39 | | |
28 | 40 | | |
29 | | - | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
30 | 44 | | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
31 | 64 | | |
32 | 65 | | |
33 | 66 | | |
34 | 67 | | |
35 | 68 | | |
36 | 69 | | |
37 | 70 | | |
| 71 | + | |
| 72 | + | |
| 73 | + | |
| 74 | + | |
38 | 75 | | |
39 | 76 | | |
40 | 77 | | |
| |||
Lines changed: 35 additions & 4 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
4 | 4 | | |
5 | 5 | | |
6 | 6 | | |
| 7 | + | |
7 | 8 | | |
8 | 9 | | |
| 10 | + | |
9 | 11 | | |
10 | 12 | | |
11 | 13 | | |
12 | 14 | | |
13 | 15 | | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
14 | 19 | | |
15 | 20 | | |
16 | 21 | | |
| |||
37 | 42 | | |
38 | 43 | | |
39 | 44 | | |
40 | | - | |
| 45 | + | |
41 | 46 | | |
42 | 47 | | |
43 | 48 | | |
| |||
131 | 136 | | |
132 | 137 | | |
133 | 138 | | |
134 | | - | |
| 139 | + | |
135 | 140 | | |
136 | 141 | | |
137 | 142 | | |
| |||
264 | 269 | | |
265 | 270 | | |
266 | 271 | | |
267 | | - | |
| 272 | + | |
268 | 273 | | |
269 | 274 | | |
270 | 275 | | |
| |||
282 | 287 | | |
283 | 288 | | |
284 | 289 | | |
285 | | - | |
| 290 | + | |
286 | 291 | | |
287 | 292 | | |
288 | 293 | | |
| |||
308 | 313 | | |
309 | 314 | | |
310 | 315 | | |
| 316 | + | |
| 317 | + | |
| 318 | + | |
| 319 | + | |
| 320 | + | |
| 321 | + | |
| 322 | + | |
| 323 | + | |
| 324 | + | |
| 325 | + | |
| 326 | + | |
| 327 | + | |
| 328 | + | |
| 329 | + | |
| 330 | + | |
| 331 | + | |
| 332 | + | |
| 333 | + | |
| 334 | + | |
| 335 | + | |
| 336 | + | |
| 337 | + | |
| 338 | + | |
| 339 | + | |
| 340 | + | |
| 341 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
52 | 52 | | |
53 | 53 | | |
54 | 54 | | |
55 | | - | |
| 55 | + | |
56 | 56 | | |
57 | 57 | | |
58 | 58 | | |
| |||
98 | 98 | | |
99 | 99 | | |
100 | 100 | | |
101 | | - | |
| 101 | + | |
102 | 102 | | |
103 | 103 | | |
104 | 104 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
320 | 320 | | |
321 | 321 | | |
322 | 322 | | |
323 | | - | |
| 323 | + | |
324 | 324 | | |
325 | 325 | | |
326 | 326 | | |
| |||
343 | 343 | | |
344 | 344 | | |
345 | 345 | | |
346 | | - | |
| 346 | + | |
347 | 347 | | |
348 | 348 | | |
349 | 349 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
64 | 64 | | |
65 | 65 | | |
66 | 66 | | |
67 | | - | |
| 67 | + | |
68 | 68 | | |
69 | 69 | | |
70 | 70 | | |
| |||
232 | 232 | | |
233 | 233 | | |
234 | 234 | | |
| 235 | + | |
| 236 | + | |
| 237 | + | |
| 238 | + | |
| 239 | + | |
| 240 | + | |
| 241 | + | |
| 242 | + | |
| 243 | + | |
| 244 | + | |
| 245 | + | |
| 246 | + | |
| 247 | + | |
0 commit comments