-
Notifications
You must be signed in to change notification settings - Fork 11
fix(api): Prevent race condition during CommitReport creation/retrieval #635
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
CCMRG-1626 |
- test_commit_report_serializer_creates_new_report: verifies new report creation - test_commit_report_serializer_upgrades_legacy_report: verifies legacy report_type=None upgrade - test_commit_report_serializer_returns_existing_report: verifies existing report retrieval - test_commit_report_serializer_handles_concurrent_creation: verifies no duplicates on concurrent calls
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #635 +/- ##
=======================================
Coverage 93.90% 93.90%
=======================================
Files 1286 1286
Lines 46802 46807 +5
Branches 1517 1517
=======================================
+ Hits 43951 43956 +5
Misses 2542 2542
Partials 309 309
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
| report_type=report_type, | ||
| ) | ||
|
|
||
| return report, was_created |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: The use of get_or_create for CommitReport without a database unique constraint can create duplicate reports under concurrent requests, as the existing lock only covers legacy reports.
Severity: HIGH | Confidence: High
🔍 Detailed Analysis
The CommitReport model lacks a unique database constraint on the combination of code, commit_id, and report_type. The code uses get_or_create to find or create reports, which is not safe from race conditions without such a constraint. The added select_for_update lock only applies to legacy reports (where report_type=None), leaving the primary creation path unprotected. Under concurrent load, multiple processes can simultaneously determine that a report does not exist and then both successfully insert a new, duplicate record into the database, as confirmed by the PR description mentioning this issue occurs in production.
💡 Suggested Fix
To guarantee atomicity and prevent duplicate records, either add a unique_together constraint on (code, commit_id, report_type) to the CommitReport model, or replace get_or_create with a manual select_for_update query on the target record before attempting to create it.
🤖 Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location: apps/codecov-api/upload/serializers.py#L232-L235
Potential issue: The `CommitReport` model lacks a unique database constraint on the
combination of `code`, `commit_id`, and `report_type`. The code uses `get_or_create` to
find or create reports, which is not safe from race conditions without such a
constraint. The added `select_for_update` lock only applies to legacy reports (where
`report_type=None`), leaving the primary creation path unprotected. Under concurrent
load, multiple processes can simultaneously determine that a report does not exist and
then both successfully insert a new, duplicate record into the database, as confirmed by
the PR description mentioning this issue occurs in production.
Did we get this right? 👍 / 👎 to inform future reviews.
Reference ID: 8245976
| code=code, | ||
| commit_id=commit_id, | ||
| report_type=report_type, | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing handling for duplicate reports causes crash
Medium Severity
The old code used .first() which gracefully handled duplicate reports by returning one of them. The new code uses get_or_create(), which internally calls get() and raises MultipleObjectsReturned if duplicate reports exist with matching (code, commit_id, report_type). Since the PR description confirms duplicates are created by the race condition being fixed, any pre-existing duplicates (or duplicates created before cleanup runs) will cause this code to crash with an unhandled exception, resulting in 500 errors for users.
Review Analysis: Both AI-flagged issues are valid ✅Sentry[bot] Comment (Race Condition)Confirmed as a real issue. The root cause is that:
Recommended fix: Add a database migration with a proper unique constraint: CREATE UNIQUE INDEX CONCURRENTLY unique_commit_id_code_report_type_idx
ON reports_commitreport ("commit_id", COALESCE("code", ''), COALESCE("report_type", ''));Cursor[bot] Comment (MultipleObjectsReturned)Also valid. The old code used Recommended fix: Add exception handling: try:
report, was_created = CommitReport.objects.get_or_create(
code=code,
commit_id=commit_id,
report_type=report_type,
)
except CommitReport.MultipleObjectsReturned:
# Handle pre-existing duplicates gracefully
report = CommitReport.objects.filter(
code=code,
commit_id=commit_id,
report_type=report_type,
).first()
was_created = FalseBoth issues should be addressed before merging. |
Fixes API-B9G. The issue was that: Concurrent requests bypass non-atomic existence check, creating duplicate CommitReports, which are then deleted by a cleanup process before the ReportSession can access its foreign key.
transaction.atomic()to ensure atomicity during report lookup and creation.select_for_update()to safely lock and update legacy reports (wherereport_typeisNone) to the correct type.get_or_create()for standard report creation/retrieval, eliminating potential race conditions.This fix was generated by Seer in Sentry, triggered by Joe Becher. 👁️ Run ID: 8242046
Not quite right? Click here to continue debugging with Seer.
Legal Boilerplate
Look, I get it. The entity doing business as "Sentry" was incorporated in the State of Delaware in 2015 as Functional Software, Inc. In 2022 this entity acquired Codecov and as result Sentry is going to need some rights from me in order to utilize my contributions in this PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Sentry can use, modify, copy, and redistribute my contributions, under Sentry's choice of terms.
Note
Prevents duplicate
CommitReportrows during upload by making creation atomic and handling legacy records.CommitReportSerializer.createto usetransaction.atomic,select_for_updateto lock and upgrade legacy reports (report_type=None), andget_or_createfor normal creation/retrieval(report, was_created)without creating duplicatesupload/tests/test_serializers.pyfor new report creation, legacy upgrade, existing report retrieval, and simulated concurrent creation (no duplicates)Written by Cursor Bugbot for commit 67b4000. This will update automatically on new commits. Configure here.