fix(worker): Improve Sentry visibility for worker task crashes #636

drazisil-codecov · 2026-01-07T16:18:17Z

Summary

Improves Sentry visibility when worker tasks crash, making it easier to debug production issues.

Changes

Signal Handlers (celery_config.py)

- worker_process_init - Initialize Sentry in each forked worker process
- worker_shutting_down - Flush pending Sentry events before shutdown (SIGTERM)
- task_prerun - Set task context (args/kwargs) for all Sentry events
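A minimal sketch of how the three handlers could be shaped, not the code from this PR. The factory `make_signal_handlers`, the `flush_timeout` default, and the `"task"` context key are hypothetical; `sentry` stands in for the `sentry_sdk` module (only its `flush` and `set_context` calls are used) and `init_sentry` for the repo's `initialize_sentry` helper. Injecting both keeps the sketch runnable without Celery or Sentry installed.

```python
from typing import Any, Callable, Tuple


def make_signal_handlers(
    sentry: Any,
    init_sentry: Callable[[], None],
    flush_timeout: float = 2.0,  # assumed value, not taken from the PR
) -> Tuple[Callable, Callable, Callable]:
    """Build the three handlers around an injected Sentry-like object."""

    def on_worker_process_init(**_signal_kwargs: Any) -> None:
        # Each forked worker process sets up its own SDK client.
        init_sentry()

    def on_worker_shutting_down(**_signal_kwargs: Any) -> None:
        # Give queued events a bounded window to be sent before exit.
        sentry.flush(timeout=flush_timeout)

    def on_task_prerun(task_id=None, task=None, args=None, kwargs=None,
                       **_signal_kwargs: Any) -> None:
        # Attach task details so later events from this task carry them.
        sentry.set_context("task", {
            "task_id": task_id,
            "task_name": getattr(task, "name", None),
            "args": repr(args),
            "kwargs": repr(kwargs),
        })

    return on_worker_process_init, on_worker_shutting_down, on_task_prerun


# Wiring in a real worker would look roughly like this (requires celery):
#   from celery import signals
#   signals.worker_process_init.connect(on_worker_process_init)
#   signals.worker_shutting_down.connect(on_worker_shutting_down)
#   signals.task_prerun.connect(on_task_prerun)
```

Every handler accepts `**_signal_kwargs` because Celery passes extra keyword arguments (such as `sender` and `signal`) to signal receivers.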

Failure/Timeout Capture (tasks/base.py)

- on_failure hook - Capture exceptions to Sentry with failure_type=task_failure tag
- on_timeout hook - Capture hard timeouts to Sentry with immediate flush (before SIGKILL)
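The hard-timeout path can be sketched as below. This is an illustration under assumptions, not the PR's `_capture_hard_timeout_to_sentry`: the function name `report_timeout`, the message text, the `flush_timeout` default, and the decision to skip soft timeouts are all hypothetical. `sentry` is an injected stand-in for `sentry_sdk`; only `set_tag`, `capture_message`, and `flush` are called on it.

```python
from typing import Any


def report_timeout(sentry: Any, task_name: str, task_id: str,
                   soft: bool, timeout: float,
                   flush_timeout: float = 2.0) -> bool:
    """Report a hard timeout and flush at once; return True if reported."""
    if soft:
        # Assumption: a soft timeout raises inside the task and is
        # reported through the normal exception path, so skip it here.
        return False

    sentry.set_tag("failure_type", "hard_timeout")
    sentry.capture_message(
        f"Hard timeout after {timeout}s: {task_name} ({task_id})",
        level="error",
    )
    # The process is about to be killed, so do not rely on the
    # background transport: block briefly until the event is sent.
    sentry.flush(timeout=flush_timeout)
    return True
```

The flush is the important step: without it the event may still be sitting in the SDK's queue when the process receives SIGKILL.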

Cleanup

Remove redundant comments in exception handlers

Why

When worker tasks crash (especially hard timeouts leading to SIGKILL), Sentry trace data wasn't being sent, making it difficult to debug production issues. These changes ensure:

- Sentry is properly initialized in forked workers
- Events are flushed before graceful shutdown
- Hard timeouts get captured with immediate flush before SIGKILL
- Task context (args, kwargs, retries, attempts) is available in Sentry events

Testing

- All 55 existing tests pass
- Code paths are exercised by test_sample_task_hard_timeout and test_sample_task_failure

Note

Improves observability of worker task failures and timeouts via Sentry.

- Add Celery signal handlers in celery_config.py: re-initialize Sentry per forked worker (worker_process_init), flush events on shutdown (worker_shutting_down), and set task context (task_prerun) with safe stringified args/kwargs
- In BaseCodecovRequest.on_timeout, capture hard timeouts to Sentry with failure_type=hard_timeout and immediate flush, plus metrics increments
- Minor cleanup: remove redundant comments in exception handlers; wire Sentry helpers (initialize_sentry, is_sentry_enabled) and sentry_sdk usage

Written by Cursor Bugbot for commit 1adefc4. This will update automatically on new commits.

- Add signal handlers for Sentry initialization in forked workers
- Flush Sentry events on graceful worker shutdown (SIGTERM)
- Set task context (args/kwargs) at task start for debugging
- Capture task failures to Sentry with failure_type tag
- Capture hard timeouts to Sentry with immediate flush before SIGKILL
- Clean up redundant comments in exception handlers

This ensures crash data is sent to Sentry even when workers die unexpectedly, making it easier to debug production issues.

codecov-notifications · 2026-01-07T16:25:23Z

Codecov Report

❌ Patch coverage is 68.96552% with 9 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| apps/worker/celery_config.py | 63.15% | 7 Missing ⚠️ |
| apps/worker/tasks/base.py | 80.00% | 2 Missing ⚠️ |

❌ Your patch check has failed because the patch coverage (68.96%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.


sentry · 2026-01-07T16:25:56Z

Codecov Report

❌ Patch coverage is 68.96552% with 9 lines in your changes missing coverage. Please review.
✅ Project coverage is 93.89%. Comparing base (0360937) to head (1adefc4).
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| apps/worker/celery_config.py | 63.15% | 7 Missing ⚠️ |
| apps/worker/tasks/base.py | 80.00% | 2 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #636      +/-   ##
==========================================
- Coverage   93.90%   93.89%   -0.02%     
==========================================
  Files        1286     1286              
  Lines       46804    46833      +29     
  Branches     1517     1517              
==========================================
+ Hits        43953    43973      +20     
- Misses       2542     2551       +9     
  Partials      309      309
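The figures in the report can be cross-checked with a little arithmetic. The sketch below is a reader's check, not Codecov's implementation: the per-file line totals (19 and 10) are inferred from the reported missing counts and percentages, and the truncation to two decimals is an assumption that happens to reproduce the reported numbers.

```python
def pct_truncated(hits: int, lines: int) -> float:
    """Coverage percentage truncated (not rounded) to two decimals."""
    # Integer arithmetic avoids floating-point surprises before truncation.
    return (hits * 10000 // lines) / 100


# Project coverage from the diff: hits / lines.
base_coverage = pct_truncated(43953, 46804)   # base (0360937)
head_coverage = pct_truncated(43973, 46833)   # head (1adefc4)

# Patch coverage: +20 hits out of +29 changed lines.
patch_coverage = pct_truncated(20, 29)

# Per-file patch coverage (line totals inferred, see lead-in).
celery_config = pct_truncated(19 - 7, 19)     # 7 missing of 19
tasks_base = pct_truncated(10 - 2, 10)        # 2 missing of 10

print(base_coverage, head_coverage, patch_coverage, celery_config, tasks_base)
```

The two files account for 19 + 10 = 29 changed lines and 7 + 2 = 9 misses, matching the `+29` lines and `+9` misses in the diff, and the 68.96% patch coverage is what falls short of the 90.00% target.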

| Flag | Coverage Δ | |
|---|---|---|
| workerintegration | `59.14% <44.82%> (-0.03%)` | ⬇️ |
| workerunit | `91.26% <68.96%> (-0.05%)` | ⬇️ |


apps/worker/tasks/base.py

Address Bugbot review: _capture_failure_to_sentry could throw and block super().on_failure(), metrics, and flow logging.

Changes:
- Move Sentry capture AFTER super().on_failure() (consistent with on_timeout)
- Wrap Sentry capture in try-except for resilience
- Add safe_str() helper to handle objects with broken __str__
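A helper along the lines of `safe_str()` might look like the following. This is a guess at its shape, not the PR's code: the fallback text and the `limit` parameter (which caps the output length) are assumptions added for illustration.

```python
from typing import Any


def safe_str(value: Any, limit: int = 200) -> str:
    """Stringify value without letting a broken __str__ propagate."""
    try:
        text = str(value)
    except Exception:
        # type(...).__name__ does not call user-defined __str__/__repr__.
        text = f"<unprintable {type(value).__name__}>"
    # Cap the length so a huge argument cannot bloat the event (assumed).
    return text if len(text) <= limit else text[:limit] + "..."
```

For example, an object whose `__str__` raises becomes `<unprintable ClassName>` instead of aborting the handler that is building the Sentry context.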

apps/worker/tasks/base.py

apps/worker/celery_config.py

- Wrap _capture_hard_timeout_to_sentry in on_timeout with try-except
- Add _safe_str helper to celery_config.py for signal handler
- Ensures metrics and flow logging are not blocked by Sentry failures
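The guard described above can be illustrated with stand-in classes. Everything here is hypothetical: `BaseRequest` and `GuardedRequest` are not the repo's classes, the `events` list replaces real metrics and flow logging, and the order of steps after `super().on_timeout()` is an assumption.

```python
import logging
from typing import Callable, List

log = logging.getLogger(__name__)


class BaseRequest:
    """Stand-in for the framework's request class."""

    def __init__(self) -> None:
        self.events: List[str] = []

    def on_timeout(self, soft: bool, timeout: float) -> None:
        self.events.append("base_on_timeout")


class GuardedRequest(BaseRequest):
    """Runs the base handler first, then a capture that may fail."""

    def __init__(self, capture: Callable[[float], None]) -> None:
        super().__init__()
        self._capture = capture

    def on_timeout(self, soft: bool, timeout: float) -> None:
        super().on_timeout(soft, timeout)
        if soft:
            return
        try:
            self._capture(timeout)
        except Exception:
            # Observability must not break the timeout handling itself.
            log.exception("Sentry capture failed; continuing")
        # Placeholders for the steps that must still run.
        self.events.append("metrics")
        self.events.append("flow_log")
```

Even when `capture` raises, `events` ends up as `["base_on_timeout", "metrics", "flow_log"]`, which is the property the commit is after.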

apps/worker/tasks/base.py

…y events

CeleryIntegration already captures task failures automatically. We now only enrich the context with task args/kwargs/retries without calling capture_exception, avoiding duplicate events per Sentry issue #433.

apps/worker/tasks/base.py

The _enrich_sentry_context_for_failure method was setting Sentry context in on_failure, but CeleryIntegration captures exceptions when they're raised (before on_failure is called), so the enrichment was ineffective.

Changes:
- Remove _enrich_sentry_context_for_failure method and its call
- Add 'retries' to set_sentry_task_context at task_prerun where it will be captured with exception events
- Update docstring to clarify why context is set at prerun

This addresses the review feedback that Sentry context enrichment occurs after the exception is already captured.
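The timing issue can be shown with a toy model. This is a simulation, not Sentry or Celery code: `ToyScope` only mimics the idea that an event snapshots whatever context exists at capture time, and `run_failing_task` with its `enrich_at` parameter is a hypothetical driver.

```python
from typing import Any, Dict, List


class ToyScope:
    """Mimics a scope whose context is copied into each captured event."""

    def __init__(self) -> None:
        self.context: Dict[str, Any] = {}
        self.events: List[Dict[str, Any]] = []

    def set_context(self, key: str, value: Dict[str, Any]) -> None:
        self.context[key] = value

    def capture_exception(self, exc: BaseException) -> None:
        # Snapshot: later context changes do not alter this event.
        self.events.append({"exception": repr(exc),
                            "context": dict(self.context)})


def run_failing_task(enrich_at: str) -> Dict[str, Any]:
    """Return the context attached to the captured event."""
    scope = ToyScope()
    task_info = {"retries": 2}

    if enrich_at == "prerun":
        scope.set_context("task", task_info)

    try:
        raise ValueError("boom")
    except ValueError as exc:
        # The integration captures here, when the exception is raised.
        scope.capture_exception(exc)

    if enrich_at == "on_failure":
        # Too late: the event above was already built.
        scope.set_context("task", task_info)

    return scope.events[0]["context"]
```

In this model `run_failing_task("prerun")` returns `{"task": {"retries": 2}}`, while `run_failing_task("on_failure")` returns an empty dict, which is the behaviour the commit message describes.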

drazisil-codecov requested review from a team and thomasrockhu-codecov January 7, 2026 16:21

cursor bot reviewed Jan 7, 2026

apps/worker/tasks/base.py (outdated)

sentry bot reviewed Jan 7, 2026

apps/worker/tasks/base.py (outdated)

cursor bot reviewed Jan 7, 2026

apps/worker/tasks/base.py (outdated)

apps/worker/celery_config.py (outdated)

fix: wrap Sentry capture calls in try-except for consistency

dc85df4

cursor bot reviewed Jan 7, 2026

apps/worker/tasks/base.py (outdated)

cursor bot reviewed Jan 7, 2026

apps/worker/tasks/base.py (outdated)

adrianviquez approved these changes Jan 7, 2026
