Fix purge_deleted() to only delete tombstones on PostgreSQL backend#3644
Open
sambhav wants to merge 1 commit intoKinto:mainfrom
Open
Fix purge_deleted() to only delete tombstones on PostgreSQL backend#3644sambhav wants to merge 1 commit intoKinto:mainfrom
sambhav wants to merge 1 commit intoKinto:mainfrom
Conversation
0c50549 to
3a46d98
Compare
The purge_deleted() method's base class contract and docstring specify that it deletes "tombstones" (soft-deleted objects). The memory backend correctly operates only on tombstone storage (self._cemetery). However, the PostgreSQL backend's DELETE queries lacked a deleted = TRUE filter, meaning purge_deleted with a before timestamp or max_retained would delete both live records and tombstones. This can result in silent data loss when running the purge_deleted maintenance script. What changed: - Added AND deleted = TRUE to the before-path DELETE query - Added WHERE deleted = TRUE to the max_retained-path CTE so the ranking window only considers tombstones - Added three test cases that create live records + tombstones, call purge_deleted, and verify only tombstones are removed Test coverage: - test_purge_deleted_with_before_only_deletes_tombstones: Tests the before parameter path with mixed live/deleted objects - test_purge_deleted_with_max_retained_only_affects_tombstones: Tests the max_retained parameter with 3 live + 5 tombstones - test_purge_deleted_without_before_only_deletes_tombstones: Tests purging all tombstones while preserving live records Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
3a46d98 to
aa52ced
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
The
purge_deleted()method's base class contract and docstring specify that it should delete "all deleted object tombstones" (line 271 inkinto/core/storage/__init__.py). The memory backend correctly operates only on tombstone storage (self._cemetery), never touching live objects inself._store.However, the PostgreSQL backend's DELETE queries lacked a
deleted = TRUEfilter in both execution paths:Before Path
The DELETE query filtered only by
parent_id,resource_name, and timestamp:This deletes ALL objects (live AND tombstones) older than
before.Max Retained Path
The ROW_NUMBER window function ranked ALL objects together:
If
max_retained = 100and you have 80 live + 50 tombstones, it keeps the 100 most recent regardless of type and deletes 30 — potentially deleting live records.This means calling
purge_deleted()on the PostgreSQL backend could silently destroy live data. This is a correctness bug that can result in data loss when running thepurge_deletedmaintenance script.Solution
Added
AND deleted = TRUEfilters to both execution paths to ensure only tombstones are deleted:Before Path
Max Retained Path
This aligns the PostgreSQL backend with:
self._cemetery)Testing
Added three comprehensive test cases to
BaseTestStoragethat verify the fix:test_purge_deleted_with_before_only_deletes_tombstonespurge_deleted()with abeforetimestamptest_purge_deleted_with_max_retained_only_affects_tombstonespurge_deleted()withmax_retained=2test_purge_deleted_without_before_only_deletes_tombstonespurge_deleted()without parameters to purge all tombstonesThese tests exercise both code paths and verify that live records are never affected by
purge_deleted()operations.Impact
Risk: Very low
purge_deleted()on actual tombstones, so they continue to passEffect: Prevents silent data loss in production environments when running maintenance scripts
Files Changed
kinto/core/storage/postgresql/__init__.py- Addeddeleted = TRUEfilters to both code pathskinto/core/storage/testing.py- Added 3 regression test cases🤖 Generated with Claude Code