Skip to content

Conversation

@drazisil-codecov
Copy link
Contributor

@drazisil-codecov drazisil-codecov commented Jan 7, 2026

Summary

Addresses "session in prepared state" errors by creating isolated database sessions for each task execution instead of sharing a global scoped session.

Problem

When tasks share a global scoped session (get_db_session()), session state can leak between tasks:

  1. A transaction gets partially prepared (2-phase commit state)
  2. The session is reused by another task before cleanup
  3. The new task tries to use the corrupted session → InvalidRequestError: session in 'prepared' state

Solution

Create isolated per-task sessions using a new create_task_session() function:

  • Each task gets its own session in run()
  • Session is properly cleaned up in finally block (rollback + close)
  • Routing lookups in apply_async() and route_task() also use temporary sessions

Changes

File Change
database/engine.py Add TaskSessionManager, create_task_session(), set_test_session_factory()
tasks/base.py Use per-task sessions in run() and apply_async() with proper cleanup
celery_task_router.py Use temporary session for routing lookups
tasks/tests/utils.py Update hook_session() to configure test session factory
Test files Update mocks to use create_task_session

Minimal Scope

This PR is intentionally minimal (6 files vs 22 in the original). Services that call obj.get_db_session() will naturally get the correct task session because objects queried via the task session are bound to it.

Explicit db_session=db_session plumbing can be added in a follow-up PR if needed.

Testing

  • Updated unit tests to mock create_task_session instead of get_db_session
  • Added set_test_session_factory() for test session injection
  • hook_session() utility updated to work with new architecture

Note

Introduces isolated SQLAlchemy sessions per worker task and uses temporary sessions for routing to prevent session state leakage.

  • Add TaskSessionManager with create_task_session() and set_test_session_factory() in database/engine.py
  • Update BaseCodecovTask: use per-task session in run() (commit on success, robust rollback/close in _cleanup_task_session); apply_async() now does routing lookups with a temp session; preserve headers/metrics
  • Update celery_task_router.route_task to use a temporary session for plan/owner lookups with safe cleanup
  • Allow optional db_session plumbing in services.timeseries.upsert_components_measurements and tasks.save_commit_measurements
  • Revise tests to hook and mock create_task_session; add test utilities to inject shared test session and prevent test-transaction cleanup

Written by Cursor Bugbot for commit 640146d. This will update automatically on new commits. Configure here.

Addresses 'session in prepared state' errors by creating isolated database
sessions for each task execution instead of sharing a global scoped session.

Changes:
- Add TaskSessionManager and create_task_session() to database/engine.py
- Update BaseCodecovTask.run() to create per-task sessions with proper cleanup
- Update apply_async() and route_task() to use temporary routing sessions
- Add set_test_session_factory() for test session injection
- Update test utilities and mocks to work with new session architecture

This is a minimal change that relies on existing fallback patterns in
services (db_session = obj.get_db_session() if db_session is None).
Objects queried via the task session will naturally use it.
@drazisil-codecov drazisil-codecov force-pushed the fix/per-task-db-sessions-minimal branch from 0946d17 to 9017e23 Compare January 7, 2026 18:34
- Always cleanup test session factory (prevent test pollution)
- Add try-except to routing session cleanup (prevent connection leaks)
- Handle SoftTimeLimitExceeded during commit (known edge case)
InterfaceError signals transient connection issues and should be
retried rather than silently failing (consistent with DataError,
IntegrityError, SQLAlchemyError handling).
- Reset test session factory at start of hook_session (defensive cleanup)
- Mock rollback() to prevent interfering with test savepoints
- Remove broken mocker.stopall override approach
- Simplify cleanup logic
The TASK_CORE_RUNTIME metric is documented as excluding db commits,
so the commit() call should be outside the timer block.
- Remove problematic in_transaction() and rollback() mocks
- Mock wrap_up_task_session to prevent cleanup interference
- Keep only necessary mocks: close(), commit() -> flush()
- Fix method name (was wrap_up_task_session, should be _cleanup_task_session)
- Add BaseCodecovTask import
- Apply ruff fixes
- Update test_commit_measurement_update_component_parallel to use hook_session
- Update test_delete_repository_data_measurements_only to use hook_session
- Update test_compute_component_comparisons_parallel to use hook_session
- Pass db_session parameter to save_commit_measurements calls
…components_measurements

These functions need the db_session parameter to work with per-task sessions,
matching the pattern used in PR #618.
Without these mocks, the task's run() method would detect the test
transaction and call rollback(), destroying all test data. This caused
e2e tests to fail because their created data was rolled back before
the tasks could use it.
@drazisil-codecov
Copy link
Contributor Author

Closing as won't do - the scope of this change (per-task database sessions) is significant and the user impact from the "session in prepared state" errors doesn't justify the risk/effort at this time.

@drazisil-codecov drazisil-codecov deleted the fix/per-task-db-sessions-minimal branch January 9, 2026 15:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants