Commit 62bf5e5
[ AGNTLOG-462 ] Fix auditor Flush() race condition during transport restart (#46882)
## What does this PR do?
In response to flaky CI test failures in `TestRestartTestSuite`, this PR fixes a race condition in the auditor's `Flush()` method that caused stale offsets to be written to disk during transport restarts.
Previously, `Flush()` wrote the in-memory registry directly, missing payloads that destinations had already sent to the auditor channel but the `run()` goroutine hadn't consumed yet. This caused stale offsets on disk after `partialStop()`, leading to duplicate log processing after a TCP-to-HTTP transport restart.
`Flush()` now sends a synchronous request through the auditor's `run()` goroutine event loop. The goroutine drains buffered `inputChan` payloads (bounded by a `len()` snapshot), updates the in-memory registry, then writes to disk. When the auditor is stopped, it falls back to a direct `flushRegistry()` call.
## Motivation
Resolves AGNTLOG-462. `TestRestartTestSuite` sub-tests (`TestPartialStop_FlushesRegistryToDisk`, `TestRestart_FlushesAuditor`) were failing intermittently on macOS ARM64 and IoT Linux x64 CI runners. Investigation revealed the test failures exposed a real production race condition: the `LogsSent` metric is incremented by destinations before payloads reach the auditor's `inputChan`, so `partialStop()` calling `Flush()` immediately after stopping destinations could write a registry missing the latest offsets.
## Describe how you validated your changes
Existing automated tests were relied on.
## Additional Notes
- `Flush()` is now blocking (waits for the `run()` goroutine to complete drain + write). All existing callers already treated it as synchronous.
- `Flush()` must not be called concurrently with `Stop()`. This is not a new constraint; it was already the case.
Co-authored-by: ryan.hall <ryan.hall@datadoghq.com>

1 parent 73423a7 commit 62bf5e5
File tree
2 files changed: +56 -5 lines changed
- comp/logs/auditor/impl
- releasenotes/notes

Diff body not captured. The first changed file has hunks around original lines 54-59, 136-154, 164-169, and 293-298 (41 additions, 5 deletions).

Lines changed: 15 additions & 0 deletions