Fix stale snapshot detection to return 409 instead of 400 #425
Merged
When concurrent modifications occur during a transaction commit, the metadata is refreshed but the SNAPSHOTS_JSON_KEY property still contains snapshots with stale sequence numbers. Previously, this caused Iceberg's internal validation to throw a ValidationException (mapped to 400 Bad Request) when it should return 409 Conflict.
This fix:
1. Uses current() to get the refreshed metadata's sequence number instead of metadataToCommit, which is derived from stale transaction metadata
2. Includes snapshots from current() in existingSnapshotIds to detect duplicates added by concurrent processes
3. Throws CommitFailedException (409) before Iceberg's validation can throw ValidationException (400)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Adds a test that verifies stale snapshot detection during concurrent modifications returns HTTP 409 Conflict (CommitFailedException) instead of HTTP 400 Bad Request (BadRequestException). This is the TDD test for the stale snapshot fix: it verifies that when a client tries to add a snapshot with sequenceNumber <= lastSequenceNumber, the server returns 409 (retry needed), not 400 (invalid request).
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Use current() as the primary source for existing snapshot IDs, since it is refreshed by commit() before doCommit() runs. Fall back to metadataToCommit only for tests that call doCommit() directly, bypassing the normal flow.
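A minimal sketch of the source selection described in this commit, to make the fallback order concrete. It is not the code from OpenHouseInternalTableOperations: the class name, the method signature, and the use of a null list to stand for "no refreshed metadata available" are all assumptions made for illustration.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class ExistingSnapshotIdsSketch {
  /**
   * Hypothetical helper: prefer the snapshot IDs of the refreshed metadata
   * (what current() would expose after commit() refreshes it) and fall back
   * to the IDs from metadataToCommit only when no refreshed view exists,
   * e.g. when a test calls doCommit() directly.
   */
  static Set<Long> existingSnapshotIds(List<Long> currentIds, List<Long> metadataToCommitIds) {
    List<Long> source = (currentIds != null) ? currentIds : metadataToCommitIds;
    return new LinkedHashSet<>(source);
  }

  public static void main(String[] args) {
    List<Long> refreshed = List.of(1L, 2L, 3L); // includes snapshot 3 from a concurrent commit
    List<Long> staleView = List.of(1L, 2L);     // the transaction's older view

    System.out.println(existingSnapshotIds(refreshed, staleView)); // prints [1, 2, 3]
    System.out.println(existingSnapshotIds(null, staleView));      // prints [1, 2]
  }
}
```

With the refreshed view as the source, a snapshot added by a concurrent writer (ID 3 here) is treated as already existing rather than as new.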
Rather than pre-checking for stale snapshots before Iceberg validation, let Iceberg's TableMetadata.Builder.addSnapshot() detect the sequence number conflict and throw ValidationException. Then reclassify that specific error as CommitFailedException (409) instead of BadRequestException (400) to allow client retry.
This approach is simpler and leverages Iceberg's existing validation:
- Remove manual stale snapshot check before adding snapshots
- Catch ValidationException and check if it's a stale snapshot error
- Reclassify stale snapshot errors to CommitFailedException (409)
- Use current() as the authoritative source for existing snapshot IDs
- Simplify test to verify error detection logic
Replace simplified error detection test with full integration test that exercises the actual doCommit path.
Key changes:
- Add parent-snapshot-id to stale_snapshot.json (required for Iceberg 1.5+ validation to trigger; snapshots without parents bypass the sequence check)
- Test verifies both direct Iceberg validation and OpenHouse doCommit path
- Confirms ValidationException (400) is reclassified to CommitFailedException (409)
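The parent requirement noted in this commit can be illustrated with a small stand-alone model of the sequence check. This is a sketch under stated assumptions, not Iceberg's code: the exception class is a local stand-in, the helper names are hypothetical, the message text is assumed to resemble Iceberg's, and the sketch ignores any format-version handling.

```java
final class SequenceCheckSketch {
  /** Local stand-in for Iceberg's ValidationException (assumption for this sketch). */
  static final class ValidationException extends RuntimeException {
    ValidationException(String message) {
      super(message);
    }
  }

  /**
   * Rejects a snapshot only when it has a parent AND its sequence number is
   * not greater than the table's last sequence number. A snapshot without a
   * parent bypasses the check, which is why the test fixture needs a
   * parent-snapshot-id for the validation to trigger.
   */
  static void checkSequence(long sequenceNumber, Long parentId, long lastSequenceNumber) {
    if (parentId != null && sequenceNumber <= lastSequenceNumber) {
      // Message wording is assumed, not copied from a verified source.
      throw new ValidationException(String.format(
          "Cannot add snapshot with sequence number %d older than last sequence number %d",
          sequenceNumber, lastSequenceNumber));
    }
  }

  /** Convenience wrapper that reports whether the check rejects the snapshot. */
  static boolean isRejected(long sequenceNumber, Long parentId, long lastSequenceNumber) {
    try {
      checkSequence(sequenceNumber, parentId, lastSequenceNumber);
      return false;
    } catch (ValidationException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(isRejected(5L, 100L, 5L)); // prints true: stale, has a parent
    System.out.println(isRejected(5L, null, 5L)); // prints false: no parent, check bypassed
    System.out.println(isRejected(6L, 100L, 5L)); // prints false: sequence number is newer
  }
}
```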
sumedhsakdeo
approved these changes
Jan 5, 2026
Collaborator
sumedhsakdeo
left a comment
LGTM. Thanks @cbb330
teamurko
reviewed
Jan 5, 2026
.../src/main/java/com/linkedin/openhouse/internal/catalog/OpenHouseInternalTableOperations.java
teamurko
requested changes
Jan 5, 2026
Collaborator
teamurko
left a comment
Thank you @cbb330! Looks good, asking for a test
teamurko
approved these changes
Jan 5, 2026
Summary
- Reclassify ValidationException with stale snapshot message to CommitFailedException (409) to allow client retry
Problem
When concurrent modifications occur during a transaction commit:
1. […] (lastSequenceNumber = 4)
2. […] SNAPSHOTS_JSON_KEY
3. […] (lastSequenceNumber = 5)
4. […] doRefresh(), which updates current() to version N+1
5. […] SNAPSHOTS_JSON_KEY are now stale (their sequence numbers are based on version N)
6. TableMetadata.addSnapshot() throws ValidationException → mapped to 400 Bad Request
Solution
Let Iceberg's existing validation detect sequence number conflicts, then catch the ValidationException and reclassify it as CommitFailedException for the specific stale snapshot error pattern.
This approach is simpler than pre-checking and leverages Iceberg's existing validation.
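The reclassification described above might look roughly like the following. This is a hedged sketch rather than the merged change: both exception classes are local stand-ins for the Iceberg types, and the matched message fragment, the helper names, and the status mapping in statusFor are assumptions for illustration.

```java
final class ReclassifySketch {
  /** Local stand-in for Iceberg's ValidationException (normally surfaced as 400). */
  static final class ValidationException extends RuntimeException {
    ValidationException(String message) {
      super(message);
    }
  }

  /** Local stand-in for Iceberg's CommitFailedException (normally surfaced as 409). */
  static final class CommitFailedException extends RuntimeException {
    CommitFailedException(String message, Throwable cause) {
      super(message, cause);
    }
  }

  // Assumed fragment of the stale-sequence validation message.
  static final String STALE_FRAGMENT = "older than last sequence number";

  /** Returns true only for the specific stale snapshot error pattern. */
  static boolean isStaleSnapshotError(ValidationException e) {
    String message = e.getMessage();
    return message != null && message.contains(STALE_FRAGMENT);
  }

  /** Runs the metadata build and converts stale-snapshot failures into retryable ones. */
  static void buildWithReclassification(Runnable buildMetadata) {
    try {
      buildMetadata.run();
    } catch (ValidationException e) {
      if (isStaleSnapshotError(e)) {
        throw new CommitFailedException("Stale snapshot detected; refresh and retry", e);
      }
      throw e; // any other validation failure stays a bad request
    }
  }

  /** Hypothetical status mapping used only to show the observable difference. */
  static int statusFor(Runnable buildMetadata) {
    try {
      buildWithReclassification(buildMetadata);
      return 200;
    } catch (CommitFailedException e) {
      return 409;
    } catch (ValidationException e) {
      return 400;
    }
  }

  public static void main(String[] args) {
    System.out.println(statusFor(() -> {
      throw new ValidationException(
          "Cannot add snapshot with sequence number 5 older than last sequence number 5");
    })); // prints 409
    System.out.println(statusFor(() -> {
      throw new ValidationException("Some unrelated validation failure");
    })); // prints 400
    System.out.println(statusFor(() -> { })); // prints 200
  }
}
```

Matching on message text is brittle if the upstream wording changes, so a check like isStaleSnapshotError is worth covering with a test that exercises the real validation path.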
Test Plan
- testStaleSnapshotErrorDetection() verifies error detection logic