feat: per-task database sessions to prevent session state leakage #637

drazisil-codecov · 2026-01-07T18:11:34Z

Summary

Addresses "session in prepared state" errors by creating isolated database sessions for each task execution instead of sharing a global scoped session.

Problem

When tasks share a global scoped session (get_db_session()), session state can leak between tasks:

A transaction gets partially prepared (2-phase commit state)
The session is reused by another task before cleanup
The new task tries to use the corrupted session → InvalidRequestError: session in 'prepared' state

Solution

Create isolated per-task sessions using a new create_task_session() function:

Each task gets its own session in run()
Session is properly cleaned up in finally block (rollback + close)
Routing lookups in apply_async() and route_task() also use temporary sessions
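
The cleanup flow described above can be sketched as follows. This is a minimal illustration, not the PR's code: `FakeSession` is a hypothetical stand-in for a SQLAlchemy session, and `run_task` and its `session_factory` parameter are invented for the example. Only the name `create_task_session()` comes from the PR.

```python
class FakeSession:
    """Hypothetical stand-in for a SQLAlchemy session; records calls."""

    def __init__(self):
        self.calls = []

    def commit(self):
        self.calls.append("commit")

    def rollback(self):
        self.calls.append("rollback")

    def close(self):
        self.calls.append("close")


def create_task_session():
    # Assumption: the real function in database/engine.py returns a new,
    # unshared session on every call.
    return FakeSession()


def run_task(task_body, session_factory=create_task_session):
    """Run task_body with its own session; always roll back and close."""
    db_session = session_factory()
    try:
        result = task_body(db_session)
        db_session.commit()  # reached only when the body succeeded
        return result
    finally:
        # After a failure this discards the partial transaction; after a
        # successful commit there is nothing left to roll back.
        db_session.rollback()
        db_session.close()
```

Because the session is created inside `run_task` and closed in `finally`, no state from one task can survive into the next, whether the task succeeds or raises.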

Changes

| File | Change |
| --- | --- |
| `database/engine.py` | Add `TaskSessionManager`, `create_task_session()`, `set_test_session_factory()` |
| `tasks/base.py` | Use per-task sessions in `run()` and `apply_async()` with proper cleanup |
| `celery_task_router.py` | Use temporary session for routing lookups |
| `tasks/tests/utils.py` | Update `hook_session()` to configure test session factory |
| Test files | Update mocks to use `create_task_session` |

Minimal Scope

This PR is intentionally minimal (6 files vs 22 in the original). Services that call obj.get_db_session() will naturally get the correct task session because objects queried via the task session are bound to it.

Explicit db_session=db_session plumbing can be added in a follow-up PR if needed.
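
The fallback that this minimal scope relies on might look like the sketch below. The classes and the `save_measurements` function are hypothetical; the only detail taken from the PR is the pattern `db_session = obj.get_db_session() if db_session is None`.

```python
class FakeSession:
    """Hypothetical stand-in for a database session."""

    def __init__(self, name):
        self.name = name


class Commit:
    """Hypothetical model object that remembers the session it was loaded from."""

    def __init__(self, session):
        self._session = session

    def get_db_session(self):
        return self._session


def save_measurements(commit, db_session=None):
    # Fall back to the session the object is bound to when the caller
    # does not pass one explicitly.
    if db_session is None:
        db_session = commit.get_db_session()
    return db_session
```

An object loaded through the task session resolves back to that same session, so the service needs no extra argument. Passing `db_session` explicitly overrides the fallback.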

Testing

Updated unit tests to mock create_task_session instead of get_db_session
Added set_test_session_factory() for test session injection
hook_session() utility updated to work with new architecture
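
One way the test injection could work is sketched below. The function names `set_test_session_factory()` and `create_task_session()` appear in the PR, but these bodies, the module-level variable, and `ProductionSession` are assumptions made for illustration.

```python
class ProductionSession:
    """Hypothetical stand-in for a session bound to the real engine."""


# Assumption: a module-level hook that tests can set and reset.
_test_session_factory = None


def set_test_session_factory(factory):
    """Install a factory for tests, or pass None to restore the default."""
    global _test_session_factory
    _test_session_factory = factory


def create_task_session():
    # Tests get the injected session; production code gets a fresh one.
    if _test_session_factory is not None:
        return _test_session_factory()
    return ProductionSession()
```

A test helper such as `hook_session()` would install the factory before the test and reset it to `None` afterwards so that the injected session does not leak into other tests.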

Note

Introduces isolated SQLAlchemy sessions per worker task and uses temporary sessions for routing to prevent session state leakage.

Add TaskSessionManager with create_task_session() and set_test_session_factory() in database/engine.py
Update BaseCodecovTask: use per-task session in run() (commit on success, robust rollback/close in _cleanup_task_session); apply_async() now does routing lookups with a temp session; preserve headers/metrics
Update celery_task_router.route_task to use a temporary session for plan/owner lookups with safe cleanup
Allow optional db_session plumbing in services.timeseries.upsert_components_measurements and tasks.save_commit_measurements
Revise tests to hook and mock create_task_session; add test utilities to inject shared test session and prevent test-transaction cleanup

Written by Cursor Bugbot for commit 640146d. This will update automatically on new commits.

Addresses 'session in prepared state' errors by creating isolated database sessions for each task execution instead of sharing a global scoped session.

Changes:
- Add TaskSessionManager and create_task_session() to database/engine.py
- Update BaseCodecovTask.run() to create per-task sessions with proper cleanup
- Update apply_async() and route_task() to use temporary routing sessions
- Add set_test_session_factory() for test session injection
- Update test utilities and mocks to work with new session architecture

This is a minimal change that relies on existing fallback patterns in services (db_session = obj.get_db_session() if db_session is None). Objects queried via the task session will naturally use it.

- Update test_commit_measurement_update_component_parallel to use hook_session
- Update test_delete_repository_data_measurements_only to use hook_session
- Update test_compute_component_comparisons_parallel to use hook_session
- Pass db_session parameter to save_commit_measurements calls

Without these mocks, the task's run() method would detect the test transaction and call rollback(), destroying all test data. This caused e2e tests to fail because their created data was rolled back before the tasks could use it.

drazisil-codecov · 2026-01-09T15:09:57Z

Closing as won't do - the scope of this change (per-task database sessions) is significant and the user impact from the "session in prepared state" errors doesn't justify the risk/effort at this time.

cursor bot reviewed Jan 7, 2026

apps/worker/tasks/tests/utils.py (resolved)

apps/worker/tasks/base.py (outdated, resolved)

drazisil-codecov force-pushed the fix/per-task-db-sessions-minimal branch from 0946d17 to 9017e23 Compare January 7, 2026 18:34

sentry bot reviewed Jan 7, 2026

apps/worker/tasks/base.py (resolved)

drazisil-codecov added 3 commits January 7, 2026 13:43

fix: address review feedback - session cleanup, timeout handling

b5d7f8f

- Always cleanup test session factory (prevent test pollution)
- Add try-except to routing session cleanup (prevent connection leaks)
- Handle SoftTimeLimitExceeded during commit (known edge case)

fix: retry InterfaceError with backoff like other DB errors

0711340

InterfaceError signals transient connection issues and should be retried rather than silently failing (consistent with DataError, IntegrityError, SQLAlchemyError handling).
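
A retry-with-backoff policy of the kind this commit describes can be sketched in plain Python. This is not the worker's actual mechanism, which the source does not show. `InterfaceError` here is a local stand-in for `sqlalchemy.exc.InterfaceError`, and `backoff_delay`, `run_with_retry`, and their parameters are hypothetical.

```python
import time


class InterfaceError(Exception):
    """Local stand-in for sqlalchemy.exc.InterfaceError."""


def backoff_delay(attempt, base=1.0, cap=60.0):
    """Exponential backoff: base, 2*base, 4*base, ... capped at `cap` seconds."""
    return min(cap, base * 2 ** attempt)


def run_with_retry(fn, max_retries=3, sleep=time.sleep):
    """Call fn(); on InterfaceError wait and try again, up to max_retries times."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except InterfaceError:
            if attempt == max_retries:
                raise  # retries exhausted: surface the error instead of hiding it
            sleep(backoff_delay(attempt))
```

The important property is that the error is never swallowed: the call either eventually succeeds or the last `InterfaceError` propagates to the caller.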

fix: improve test session mocking for savepoint-based tests

7a9edea

- Reset test session factory at start of hook_session (defensive cleanup)
- Mock rollback() to prevent interfering with test savepoints
- Remove broken mocker.stopall override approach
- Simplify cleanup logic

cursor bot reviewed Jan 7, 2026

apps/worker/tasks/base.py (resolved)

drazisil-codecov added 8 commits January 7, 2026 14:02

fix: move commit outside core runtime timer

58fbc40

The TASK_CORE_RUNTIME metric is documented as excluding db commits, so the commit() call should be outside the timer block.
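
The ordering this commit fixes can be illustrated with a small sketch. The `timer` context manager, the `metrics` dictionary, and the `run` function are hypothetical; the real code records TASK_CORE_RUNTIME through the project's own metrics helpers, which are not shown in the source.

```python
import time
from contextlib import contextmanager


@contextmanager
def timer(metrics, name):
    """Record the wall-clock duration of the with-block under metrics[name]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        metrics[name] = time.perf_counter() - start


def run(task_body, db_session, metrics):
    with timer(metrics, "core_runtime"):
        result = task_body(db_session)
    # The commit happens after the timer block closes, so its duration is
    # not counted as core runtime.
    db_session.commit()
    return result
```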

fix: simplify test session mocking

286e74e

- Remove problematic in_transaction() and rollback() mocks
- Mock wrap_up_task_session to prevent cleanup interference
- Keep only necessary mocks: close(), commit() -> flush()
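
The two remaining mocks could be set up roughly as follows. This is a sketch under assumptions: `FakeSession` and `patch_session_for_tests` are hypothetical, and the real `hook_session()` helper may apply the mocks differently.

```python
from unittest.mock import MagicMock


class FakeSession:
    """Hypothetical stand-in for the shared test session; records calls."""

    def __init__(self):
        self.calls = []

    def flush(self):
        self.calls.append("flush")

    def commit(self):
        self.calls.append("commit")

    def close(self):
        self.calls.append("close")


def patch_session_for_tests(session):
    """Keep test data visible while never ending the outer test transaction."""
    # commit() -> flush(): pending changes reach the database connection,
    # but the enclosing test transaction stays open.
    session.commit = session.flush
    # close() becomes a no-op so the task cannot close the shared session.
    session.close = MagicMock(name="close")
    return session
```

With these in place, a task that calls `commit()` and `close()` on the injected session leaves the test's own transaction untouched.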

fix: correct mock for _cleanup_task_session method

3186b65

- Fix method name (was wrap_up_task_session, should be _cleanup_task_session)
- Add BaseCodecovTask import
- Apply ruff fixes

refactor: replace lambdas with named functions for readability

98161a0

fix: ruff check auto-fix

4e941fb

fix: add db_session parameter to save_commit_measurements and upsert_…

24e3836

…components_measurements These functions need the db_session parameter to work with per-task sessions, matching the pattern used in PR #618.

drazisil-codecov closed this Jan 9, 2026

drazisil-codecov deleted the fix/per-task-db-sessions-minimal branch January 9, 2026 15:10
