Skip to content

Conversation

@seer-by-sentry
Copy link
Contributor

@seer-by-sentry seer-by-sentry bot commented Jan 6, 2026

Fixes API-B9G. The issue was that: Concurrent requests bypass non-atomic existence check, creating duplicate CommitReports, which are then deleted by a cleanup process before the ReportSession can access its foreign key.

  • Refactored CommitReport creation logic in the upload serializer to use Django transactions.
  • Introduced transaction.atomic() to ensure atomicity during report lookup and creation.
  • Uses select_for_update() to safely lock and update legacy reports (where report_type is None) to the correct type.
  • Utilizes get_or_create() for standard report creation/retrieval, eliminating potential race conditions.

This fix was generated by Seer in Sentry, triggered by Joe Becher. 👁️ Run ID: 8242046

Not quite right? Click here to continue debugging with Seer.

Legal Boilerplate

Look, I get it. The entity doing business as "Sentry" was incorporated in the State of Delaware in 2015 as Functional Software, Inc. In 2022 this entity acquired Codecov and as result Sentry is going to need some rights from me in order to utilize my contributions in this PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Sentry can use, modify, copy, and redistribute my contributions, under Sentry's choice of terms.


Note

Prevents duplicate CommitReport rows during upload by making creation atomic and handling legacy records.

  • Refactors CommitReportSerializer.create to use transaction.atomic, select_for_update to lock and upgrade legacy reports (report_type=None), and get_or_create for normal creation/retrieval
  • Ensures return value is (report, was_created) without creating duplicates
  • Adds focused tests in upload/tests/test_serializers.py for new report creation, legacy upgrade, existing report retrieval, and simulated concurrent creation (no duplicates)

Written by Cursor Bugbot for commit 67b4000. This will update automatically on new commits. Configure here.

@drazisil-codecov
Copy link
Contributor

CCMRG-1626

- test_commit_report_serializer_creates_new_report: verifies new report creation
- test_commit_report_serializer_upgrades_legacy_report: verifies legacy report_type=None upgrade
- test_commit_report_serializer_returns_existing_report: verifies existing report retrieval
- test_commit_report_serializer_handles_concurrent_creation: verifies no duplicates on concurrent calls
@sentry
Copy link

sentry bot commented Jan 6, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 93.90%. Comparing base (367d3f1) to head (67b4000).
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #635   +/-   ##
=======================================
  Coverage   93.90%   93.90%           
=======================================
  Files        1286     1286           
  Lines       46802    46807    +5     
  Branches     1517     1517           
=======================================
+ Hits        43951    43956    +5     
  Misses       2542     2542           
  Partials      309      309           
Flag Coverage Δ
apiunit 96.56% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@codecov-notifications
Copy link

codecov-notifications bot commented Jan 6, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ All tests successful. No failed tests found.

📢 Thoughts on this report? Let us know!

@drazisil-codecov drazisil-codecov marked this pull request as ready for review January 6, 2026 20:27
@drazisil-codecov drazisil-codecov requested a review from a team January 6, 2026 20:28
Comment on lines +232 to +235
report_type=report_type,
)

return report, was_created
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: The use of get_or_create for CommitReport without a database unique constraint can create duplicate reports under concurrent requests, as the existing lock only covers legacy reports.
Severity: HIGH | Confidence: High

🔍 Detailed Analysis

The CommitReport model lacks a unique database constraint on the combination of code, commit_id, and report_type. The code uses get_or_create to find or create reports, which is not safe from race conditions without such a constraint. The added select_for_update lock only applies to legacy reports (where report_type=None), leaving the primary creation path unprotected. Under concurrent load, multiple processes can simultaneously determine that a report does not exist and then both successfully insert a new, duplicate record into the database, as confirmed by the PR description mentioning this issue occurs in production.

💡 Suggested Fix

To guarantee atomicity and prevent duplicate records, either add a unique_together constraint on (code, commit_id, report_type) to the CommitReport model, or replace get_or_create with a manual select_for_update query on the target record before attempting to create it.

🤖 Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.

Location: apps/codecov-api/upload/serializers.py#L232-L235

Potential issue: The `CommitReport` model lacks a unique database constraint on the
combination of `code`, `commit_id`, and `report_type`. The code uses `get_or_create` to
find or create reports, which is not safe from race conditions without such a
constraint. The added `select_for_update` lock only applies to legacy reports (where
`report_type=None`), leaving the primary creation path unprotected. Under concurrent
load, multiple processes can simultaneously determine that a report does not exist and
then both successfully insert a new, duplicate record into the database, as confirmed by
the PR description mentioning this issue occurs in production.

Did we get this right? 👍 / 👎 to inform future reviews.
Reference ID: 8245976

code=code,
commit_id=commit_id,
report_type=report_type,
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing handling for duplicate reports causes crash

Medium Severity

The old code used .first() which gracefully handled duplicate reports by returning one of them. The new code uses get_or_create(), which internally calls get() and raises MultipleObjectsReturned if duplicate reports exist with matching (code, commit_id, report_type). Since the PR description confirms duplicates are created by the race condition being fixed, any pre-existing duplicates (or duplicates created before cleanup runs) will cause this code to crash with an unhandled exception, resulting in 500 errors for users.

Fix in Cursor Fix in Web

@drazisil-codecov
Copy link
Contributor

Review Analysis: Both AI-flagged issues are valid ✅

Sentry[bot] Comment (Race Condition)

Confirmed as a real issue.

The root cause is that:

  1. The existing unique constraint (unique_commit_id_code) only covers (commit_id, code) and was added before report_type was introduced (migration 0006/0007)
  2. NULL handling problem: Since code is always None (see validate_code returning None), and PostgreSQL treats NULL != NULL, the unique constraint allows multiple rows with the same (commit_id, code=NULL) pair
  3. The report_type field was added later (migration 0011) but the unique constraint was never updated to include it

Recommended fix: Add a database migration with a proper unique constraint:

CREATE UNIQUE INDEX CONCURRENTLY unique_commit_id_code_report_type_idx 
ON reports_commitreport ("commit_id", COALESCE("code", ''), COALESCE("report_type", ''));

Cursor[bot] Comment (MultipleObjectsReturned)

Also valid.

The old code used .first() which gracefully returns one record even with duplicates. The new get_or_create() will raise MultipleObjectsReturned if duplicates exist. Given the PR acknowledges duplicates are created in production, this is a breaking change for existing data.

Recommended fix: Add exception handling:

try:
    report, was_created = CommitReport.objects.get_or_create(
        code=code,
        commit_id=commit_id,
        report_type=report_type,
    )
except CommitReport.MultipleObjectsReturned:
    # Handle pre-existing duplicates gracefully
    report = CommitReport.objects.filter(
        code=code,
        commit_id=commit_id,
        report_type=report_type,
    ).first()
    was_created = False

Both issues should be addressed before merging.

@drazisil-codecov drazisil-codecov marked this pull request as draft January 6, 2026 20:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants