@armenzg (Member) commented Nov 26, 2025

Groups requested for deletion can have any last_seen value, so last_seen does not tell us when the group's status was updated. We need to use GroupHistory to determine how long ago the status was changed for deletion.

@github-actions bot added the "Scope: Backend" label Nov 26, 2025
Group.objects.filter(status__in=statuses_to_delete).values_list(
"id", "project_id", "last_seen"
)
)
Bug: Missing batch limit causes memory issues

The BATCH_LIMIT slice was removed from the query, causing all groups with deletion statuses to be loaded into memory instead of the first 1000. This transforms a batched query into a full table scan that could load thousands or millions of groups, potentially exhausting server memory and causing severe performance degradation.
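To illustrate the point about bounding the batch, here is a minimal sketch using the standard library's sqlite3 rather than the project's ORM. The table name `grp`, the status strings, and the `fetch_batch` helper are assumptions made for this example; only the limit of 1000 comes from the comment above. In Django, slicing an unevaluated queryset (for example `[:BATCH_LIMIT]`) compiles to the same kind of `LIMIT` clause.

```python
import sqlite3

BATCH_LIMIT = 1000  # batch size mentioned in the review comment

# Hypothetical stand-in for the groups table; status values are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE grp (id INTEGER PRIMARY KEY, project_id INTEGER, status TEXT)"
)
conn.executemany(
    "INSERT INTO grp (id, project_id, status) VALUES (?, ?, ?)",
    [(i, i % 3, "pending_deletion") for i in range(1, 2501)],
)


def fetch_batch(connection, statuses, limit=BATCH_LIMIT):
    """Return at most `limit` (id, project_id) rows that are in a deletion status."""
    placeholders = ", ".join("?" for _ in statuses)
    query = (
        f"SELECT id, project_id FROM grp WHERE status IN ({placeholders}) "
        "ORDER BY id LIMIT ?"
    )
    return connection.execute(query, [*statuses, limit]).fetchall()


# 2500 rows match, but only the first BATCH_LIMIT are loaded into memory.
batch = fetch_batch(conn, ["pending_deletion", "deletion_in_progress"])
```

The explicit `ORDER BY` keeps the batch deterministic, so repeated runs work through the backlog in a stable order instead of returning an arbitrary subset.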


group_id=group_id,
status__in=[GroupHistoryStatus.DELETED],
date_added__lte=status_change_threshold,
).first()

Bug: Filtering by nonexistent GroupHistory records

The code filters for GroupHistoryStatus.DELETED records, but no GroupHistory entry with this status appears to be created when groups are marked as PENDING_DELETION or DELETION_IN_PROGRESS. The issue_deleted signal only records analytics events, not GroupHistory. This means the filter will never match any records, preventing any groups from being processed for deletion.


date_added__lte=status_change_threshold,
).first()
if group_history and group_history.date_added <= status_change_threshold:
groups_by_project[project_id].append(group_id)

Bug: N+1 query problem in deletion loop

The loop executes a separate database query for GroupHistory for each group that passes the last_seen check. If thousands of groups meet the criteria, this creates thousands of individual queries instead of using a bulk query or join, significantly degrading performance and increasing database load.
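One way to avoid the per-group lookup is to resolve the whole batch in a single query. The sketch below again uses sqlite3 as a stand-in for the ORM; the tables `grp` and `group_history`, the `'deleted'` status string, the sample dates, and the `eligible_groups_by_project` helper are all assumptions for illustration, not the PR's actual schema or code. In Django, a comparable effect could probably be reached with one `GroupHistory` query filtered by `group_id__in`.

```python
import sqlite3
from collections import defaultdict

# Hypothetical schema standing in for Group and GroupHistory.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE grp (id INTEGER PRIMARY KEY, project_id INTEGER);
    CREATE TABLE group_history (group_id INTEGER, status TEXT, date_added TEXT);
    """
)
conn.executemany("INSERT INTO grp VALUES (?, ?)", [(1, 10), (2, 10), (3, 20)])
conn.executemany(
    "INSERT INTO group_history VALUES (?, ?, ?)",
    [
        (1, "deleted", "2025-11-01"),
        (2, "deleted", "2025-11-25"),  # changed after the threshold, so skipped
        (3, "deleted", "2025-10-15"),
    ],
)


def eligible_groups_by_project(connection, threshold):
    """Issue one query for the whole batch instead of one query per group."""
    rows = connection.execute(
        """
        SELECT g.project_id, g.id
        FROM grp AS g
        JOIN group_history AS h ON h.group_id = g.id
        WHERE h.status = 'deleted' AND h.date_added <= ?
        GROUP BY g.project_id, g.id
        ORDER BY g.project_id, g.id
        """,
        (threshold,),
    ).fetchall()
    grouped = defaultdict(list)
    for project_id, group_id in rows:
        grouped[project_id].append(group_id)
    return dict(grouped)


# ISO-8601 date strings compare correctly as text.
result = eligible_groups_by_project(conn, "2025-11-19")
```

The database round trips stay constant regardless of batch size, which is the property the comment above is asking for.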


for group_id, project_id, last_seen in groups:
if last_seen >= min_last_seen and last_seen <= max_last_seen:
groups_by_project[project_id].append(group_id)
if last_seen >= min_last_seen:

Bug: Still filtering by last_seen contradicts PR intent

The code still filters groups by last_seen >= min_last_seen, excluding groups with old last_seen values. This contradicts the PR description stating "Groups requested for deletion can have any last_seen value." A group deleted recently but with very old last_seen would incorrectly be excluded from processing.
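The scenario described above can be made concrete with a small, self-contained sketch. `PendingGroup`, `status_changed_at`, `eligible_for_deletion`, and the six-hour `min_age` are hypothetical names and values chosen for this example; the point is only that eligibility depends on when the deletion status was set, not on last_seen.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class PendingGroup:
    group_id: int
    last_seen: datetime
    status_changed_at: datetime  # hypothetical: when deletion was requested


def eligible_for_deletion(groups, now, min_age):
    """Select groups whose status change is at least `min_age` old.

    last_seen is deliberately ignored: it says nothing about when the
    deletion was requested.
    """
    threshold = now - min_age
    return [g.group_id for g in groups if g.status_changed_at <= threshold]


now = datetime(2025, 11, 26, tzinfo=timezone.utc)
groups = [
    # Very old last_seen, but deletion requested two days ago: eligible.
    PendingGroup(
        1,
        last_seen=now - timedelta(days=400),
        status_changed_at=now - timedelta(days=2),
    ),
    # Recent last_seen and deletion requested an hour ago: not yet eligible.
    PendingGroup(
        2,
        last_seen=now - timedelta(hours=1),
        status_changed_at=now - timedelta(hours=1),
    ),
]
selected = eligible_for_deletion(groups, now, min_age=timedelta(hours=6))
```

A `last_seen >= min_last_seen` check would have dropped group 1 here, which is exactly the exclusion the comment warns about.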


codecov bot commented Nov 26, 2025

❌ 3 Tests Failed:

Tests completed Failed Passed Skipped
30041 3 30038 241
View the top 3 failed test(s) by shortest run time
tests.sentry.tasks.test_delete_pending_groups.DeletePendingGroupsTest::test_schedules_only_groups_within_valid_date_range
Stack Traces | 2.39s run time
.../sentry/tasks/test_delete_pending_groups.py:66: in test_schedules_only_groups_within_valid_date_range
    mock_delete_task.assert_called_once()
.../hostedtoolcache/Python/3.13.1.../x64/lib/python3.13/unittest/mock.py:956: in assert_called_once
    raise AssertionError(msg)
E   AssertionError: Expected 'apply_async' to have been called once. Called 0 times.
tests.sentry.tasks.test_delete_pending_groups.DeletePendingGroupsTest::test_groups_by_project
Stack Traces | 2.84s run time
.../sentry/tasks/test_delete_pending_groups.py:100: in test_groups_by_project
    assert mock_delete_task.call_count == 2
E   AssertionError: assert 0 == 2
E    +  where 0 = <MagicMock name='apply_async' id='140628717772416'>.call_count
tests.sentry.tasks.test_delete_pending_groups.DeletePendingGroupsTest::test_chunks_large_batches
Stack Traces | 4.08s run time
.../sentry/tasks/test_delete_pending_groups.py:138: in test_chunks_large_batches
    assert mock_delete_task.call_count == 2
E   AssertionError: assert 0 == 2
E    +  where 0 = <MagicMock name='apply_async' id='140626438705840'>.call_count


GroupHistory.DELETED is not actually recorded when groups transition to pending deletion.
The correct source of truth is AuditLogs.
There are two other possible solutions: add a pending_deletion_at column, or
record a new GroupHistory value such as GroupHistory.PENDING_DELETION. Using AuditLogs
is the least disruptive option, but it does require an RPC call to fetch AuditLogs from CONTROL.

Labels

Scope: Backend Automatically applied to PRs that change backend components
