Commit 457048e
Critical fix: Add panic recovery and reduce batch size to prevent OOM
Problem:
- Archive worker silently died after context timeout at 02:39 on 2025-10-11
- Batch size was increased to 100 stories (from original 10)
- Processing 100 stories with large datasets caused OOM
- No panic recovery meant workers could crash silently
- Worker stayed down for days while the database grew unchecked
Root cause:
- Batch size of 100 stories × 10 parallel workers × large JSON = memory exhaustion
- Context deadline exceeded during JSON generation for large datasets
- No panic recovery meant any panic would kill the worker goroutine
- Worker died silently with no indication in logs
Fixes:
1. Reduced batch size from 100 to 5 stories (below the original 10)
   - Processes fewer stories per cycle but much safer
   - Prevents memory exhaustion
   - Each cycle completes faster, reducing timeout risk
2. Added panic recovery to archiveWorker
   - Logs panic details before exiting
   - Prevents silent failures
3. Added panic recovery to purgeWorker
   - Same protection for consistency
4. Added panic recovery to pool tasks
   - Prevents one story's panic from crashing the entire batch
   - Failed stories are logged and skipped; the others continue
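
The recovery pattern in fixes 2 to 4 can be sketched as follows. This is an illustration under assumptions, not the code from this commit: `safeRun` and `processBatch` are hypothetical names, and the real pool may be structured differently. Note that `recover` only intercepts panics. A Go runtime out-of-memory condition is a fatal error that `recover` cannot catch, which is why the batch size reduction is needed alongside the recovery handlers.

```go
package main

import (
	"fmt"
	"log"
	"runtime/debug"
	"sync"
)

// safeRun (hypothetical) calls fn and converts a panic into an error,
// logging the panic value and stack trace first so the failure is visible.
func safeRun(name string, fn func() error) (err error) {
	defer func() {
		if r := recover(); r != nil {
			log.Printf("PANIC in %s: %v\n%s", name, r, debug.Stack())
			err = fmt.Errorf("%s panicked: %v", name, r)
		}
	}()
	return fn()
}

// processBatch (hypothetical) runs one task per story concurrently and
// returns the IDs that failed; one story's panic does not stop the others.
func processBatch(ids []int, archive func(id int) error) []int {
	var (
		mu     sync.Mutex
		failed []int
		wg     sync.WaitGroup
	)
	for _, id := range ids {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			name := fmt.Sprintf("story %d", id)
			if err := safeRun(name, func() error { return archive(id) }); err != nil {
				mu.Lock()
				failed = append(failed, id)
				mu.Unlock()
			}
		}(id)
	}
	wg.Wait()
	return failed
}

func main() {
	failed := processBatch([]int{1, 2, 3, 4, 5}, func(id int) error {
		if id == 3 {
			panic("malformed story payload")
		}
		return nil
	})
	fmt.Println("failed stories:", failed) // prints "failed stories: [3]"
}
```

The same deferred `recover` can sit at the top of a long-running worker goroutine, where it logs the panic before the worker exits, as described for archiveWorker and purgeWorker.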
Expected behavior after fix:
- Archive worker processes 5 stories every 5 minutes (safer)
- If a panic occurs, its details are logged, so worker crashes are visible in the logs
- Memory usage stays under control

1 parent 6a17462, commit 457048e
2 files changed: 29 additions, 3 deletions