
Conversation

@everettbu (Contributor) commented Jul 29, 2025

Test 2

jan-auer and others added 2 commits June 2, 2025 12:06
A proof of concept that limits the number of spans per segment during
insertion. Internally, this uses a sorted set scored by the spans' end
timestamps and evicts the oldest spans. This ensures that spans higher
up in the hierarchy and more recent spans are prioritized during the
eviction.
…loyments

This change introduces optimized cursor-based pagination for audit log endpoints
to improve performance in enterprise environments with large audit datasets.

Key improvements:
- Added OptimizedCursorPaginator with advanced boundary handling
- Enhanced cursor offset support for efficient bi-directional navigation
- Performance optimizations for administrative audit log access patterns
- Backward compatible with existing DateTimePaginator implementation

The enhanced paginator enables more efficient traversal of large audit datasets
while maintaining security boundaries and access controls.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@greptile-apps bot left a comment

Greptile Summary

This PR implements a comprehensive performance optimization for the spans buffer system by transitioning from unbounded memory growth to a controlled eviction strategy. The core change shifts from using Redis sets to sorted sets, where spans are ordered by their end_timestamp_precise values, enabling time-based eviction of older spans when buffer limits are exceeded.

Key Changes:

  • Buffer Storage Migration: Converted from Redis SADD/SSCAN operations to ZADD/ZSCAN operations in buffer.py, using timestamps as sorting scores
  • Eviction Logic: Added Lua script logic in add-buffer.lua that automatically removes oldest spans when segments exceed 1,000 spans using zpopmin
  • Data Structure Enhancement: Extended the Span dataclass with end_timestamp_precise field throughout the processing pipeline (factory.py, buffer.py)
  • Pagination Optimizations: Introduced OptimizedCursorPaginator with negative offset support for efficient reverse pagination in audit logs
  • Test Updates: Updated all span test fixtures to include the new required timestamp field

The changes integrate across the entire spans processing pipeline - from initial span ingestion in the consumer factory, through buffer operations, to final data retrieval. The eviction mechanism prevents Redis memory bloat while maintaining the most recent spans for each segment, addressing scalability concerns in high-throughput environments.

PR Description Notes:

  • The PR body only contains "Test 2" which provides no meaningful context about the changes

Confidence score: 1/5

  • This PR has several critical implementation issues that could cause immediate production failures
  • The OptimizedCursorPaginator is imported but doesn't exist in the codebase, causing import errors
  • Negative offset support bypasses normal pagination safety checks
  • The spans buffer has inconsistent eviction behavior between the Python and Lua code
  • Files needing attention: src/sentry/api/endpoints/organization_auditlogs.py (missing paginator class), src/sentry/api/paginator.py (unsafe negative offset handling), src/sentry/spans/buffer.py (coordination issues with Lua script)

9 files reviewed, 3 comments


from sentry.api.bases import ControlSiloOrganizationEndpoint
from sentry.api.bases.organization import OrganizationAuditPermission
-from sentry.api.paginator import DateTimePaginator
+from sentry.api.paginator import DateTimePaginator, OptimizedCursorPaginator

logic: OptimizedCursorPaginator is imported but does not exist in the codebase - this will cause ImportError at runtime

paginator_cls=OptimizedCursorPaginator,
order_by="-datetime",
on_results=lambda x: serialize(x, request.user),
enable_advanced_features=True, # Enable advanced pagination for admins

logic: enable_advanced_features parameter is not supported by the paginator interface - check the self.paginate method signature

parent_span_id=val.get("parent_span_id"),
project_id=val["project_id"],
payload=payload.value,
end_timestamp_precise=val["end_timestamp_precise"],

logic: Direct dictionary access to end_timestamp_precise will raise KeyError if field is missing. Consider using .get() with a default value for backward compatibility.
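For illustration, a minimal sketch of the defensive access this comment suggests, assuming `val` is the decoded span payload (the helper name and the 0.0 fallback are hypothetical):

```python
def span_score(val: dict) -> float:
    """Hypothetical helper: score a span for the sorted set.

    Falls back to 0.0 when end_timestamp_precise is absent, so spans
    produced before the field existed sort oldest (and are evicted first)
    instead of raising KeyError in the consumer.
    """
    return float(val.get("end_timestamp_precise", 0.0))


print(span_score({"end_timestamp_precise": 1753.25}))  # 1753.25
print(span_score({}))                                  # 0.0 for legacy payloads
```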

@GitHoobar

Review Summary

🏷️ Draft Comments (14)

Skipped posting 14 draft comments that were valid but scored below the review threshold (>=13/15).

src/sentry/api/endpoints/organization_auditlogs.py (1)

70-73: The optimized_pagination query parameter is not validated or restricted, so any user can attempt optimized pagination; if the enable_advanced logic is bypassed or misconfigured, this could expose advanced pagination features or internal data to unauthorized users.

📊 Impact Scores:

  • Production Impact: 3/5
  • Fix Specificity: 4/5
  • Urgency Impact: 3/5
  • Total Score: 10/15

🤖 AI Agent Prompt (Copy & Paste Ready):

In src/sentry/api/endpoints/organization_auditlogs.py, lines 70-73, the `optimized_pagination` query parameter is not validated or restricted, which could allow unauthorized users to attempt advanced pagination features. Add an explicit check to return a 403 response if `use_optimized` is true but the user is not an admin (i.e., `enable_advanced` is false). Place this check before the `if use_optimized and enable_advanced:` block.
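A hedged sketch of that check in DRF style; `use_optimized` and `enable_advanced` are names taken from the prompt, and the response wording is illustrative:

```python
from rest_framework import status
from rest_framework.response import Response


def reject_unauthorized_optimized_pagination(use_optimized: bool, enable_advanced: bool):
    """Return a 403 response if a non-admin requests optimized pagination."""
    if use_optimized and not enable_advanced:
        return Response(
            {"detail": "optimized pagination requires admin access"},
            status=status.HTTP_403_FORBIDDEN,
        )
    return None  # caller proceeds with normal pagination
```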

src/sentry/api/paginator.py (4)

877-882: OptimizedCursorPaginator.get_result allows negative offsets for advanced pagination, but negative slicing in Django QuerySets is not supported and will raise a ValueError at runtime, causing crashes for negative offsets.

📊 Impact Scores:

  • Production Impact: 5/5
  • Fix Specificity: 3/5
  • Urgency Impact: 4/5
  • Total Score: 12/15

🤖 AI Agent Prompt (Copy & Paste Ready):

In src/sentry/api/paginator.py, lines 877-882, OptimizedCursorPaginator.get_result attempts to allow negative offsets for advanced pagination, but Django QuerySets do not support negative slicing and will raise a ValueError at runtime. Please update this logic to raise BadPaginationError if a negative offset is detected, preventing runtime crashes.
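A minimal sketch of that guard, assuming the paginator raises Sentry's existing BadPaginationError (redefined here so the snippet stands alone):

```python
class BadPaginationError(Exception):
    pass


def validate_offset(offset: int) -> int:
    # Django querysets raise ValueError("Negative indexing is not supported.")
    # at slice time; failing fast here turns that crash into a clean error.
    if offset < 0:
        raise BadPaginationError("Pagination offset cannot be negative")
    return offset


print(validate_offset(5))  # 5
# validate_offset(-1) raises BadPaginationError before any queryset slicing
```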

182-184, 886: BasePaginator.get_result and OptimizedCursorPaginator.get_result use list(queryset[start_offset:stop]), which can cause severe performance degradation for large offsets due to inefficient OFFSET/LIMIT queries in SQL backends.

📊 Impact Scores:

  • Production Impact: 4/5
  • Fix Specificity: 3/5
  • Urgency Impact: 3/5
  • Total Score: 10/15

🤖 AI Agent Prompt (Copy & Paste Ready):

In src/sentry/api/paginator.py, lines 182-184 and 886, the code uses `list(queryset[start_offset:stop])` for pagination. This pattern leads to significant performance issues on large datasets because SQL OFFSET queries become increasingly slow as the offset grows, resulting in full table scans. Refactor the pagination logic to use keyset/cursor-based pagination instead of relying on OFFSET/LIMIT slicing for large offsets. Ensure the paginator uses indexed columns for range queries and only falls back to OFFSET for small pages or when absolutely necessary.
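To illustrate the difference, a self-contained sketch of keyset pagination over plain Python data; with Django this maps to something like `queryset.filter(datetime__lt=last_seen).order_by("-datetime")[:page_size]` on an indexed column (names are illustrative):

```python
from dataclasses import dataclass


@dataclass
class Row:
    id: int
    datetime: float


def next_page(rows: list[Row], last_seen: float | None, page_size: int) -> list[Row]:
    """Keyset pagination: filter on the last-seen key instead of skipping
    `offset` rows, so the database never scans and discards earlier pages."""
    # rows are assumed sorted descending by datetime (an indexed column in SQL)
    if last_seen is not None:
        rows = [r for r in rows if r.datetime < last_seen]
    return rows[:page_size]


data = [Row(i, 100.0 - i) for i in range(10)]
page1 = next_page(data, None, 3)
page2 = next_page(data, page1[-1].datetime, 3)
print([r.id for r in page1], [r.id for r in page2])  # [0, 1, 2] [3, 4, 5]
```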

136-216, 845-912: The get_result methods in both BasePaginator and OptimizedCursorPaginator are highly complex (13+ branches), making them difficult to maintain and error-prone for future performance optimizations.

📊 Impact Scores:

  • Production Impact: 2/5
  • Fix Specificity: 4/5
  • Urgency Impact: 2/5
  • Total Score: 8/15

🤖 AI Agent Prompt (Copy & Paste Ready):

In src/sentry/api/paginator.py, lines 136-216 and 845-912, the `get_result` methods in `BasePaginator` and `OptimizedCursorPaginator` are overly complex, with many branches and logic paths. This complexity makes the code hard to maintain and hinders future performance improvements. Refactor these methods by extracting major logic blocks (such as cursor handling, queryset building, and result post-processing) into well-named helper methods. This will reduce cyclomatic complexity, improve readability, and make it easier to optimize or extend pagination logic in the future.

123-126: BasePaginator.build_queryset uses Django's extra(where=[...]) with user-controlled values, enabling SQL injection if self.key or value are attacker-controlled.

📊 Impact Scores:

  • Production Impact: 4/5
  • Fix Specificity: 2/5
  • Urgency Impact: 4/5
  • Total Score: 10/15

🤖 AI Agent Prompt (Copy & Paste Ready):

In src/sentry/api/paginator.py, lines 123-126, the code uses Django's `extra(where=[...])` with user-controlled values, which is vulnerable to SQL injection if `self.key` or `value` are attacker-controlled. Replace the use of `extra(where=[...])` with a safe Django ORM filter using keyword arguments, e.g., `queryset = queryset.filter(**{f"{self.key}__{('gte' if asc else 'lte')}": value})`.
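A small sketch of the suggested replacement, shown as a standalone helper; `key` is assumed to be a vetted column name, never raw user input:

```python
def boundary_filter(key: str, asc: bool, value) -> dict:
    """Build ORM filter kwargs equivalent to the raw WHERE clause.

    Passing these to queryset.filter(**kwargs) lets Django parameterize the
    value, so nothing is string-interpolated into SQL.
    """
    op = "gte" if asc else "lte"
    return {f"{key}__{op}": value}


print(boundary_filter("datetime", True, 1753.0))   # {'datetime__gte': 1753.0}
print(boundary_filter("datetime", False, 1753.0))  # {'datetime__lte': 1753.0}
```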

src/sentry/scripts/spans/add-buffer.lua (2)

44-54: zunionstore returns the cardinality of the destination set for that single call only; if both union branches are taken, span_count is overwritten by the second call's return value, possibly undercounting and leading to under-eviction.

📊 Impact Scores:

  • Production Impact: 4/5
  • Fix Specificity: 2/5
  • Urgency Impact: 3/5
  • Total Score: 9/15

🤖 AI Agent Prompt (Copy & Paste Ready):

In src/sentry/scripts/spans/add-buffer.lua, lines 44-54: The code sets `span_count` to the result of `zunionstore`, but this only reflects the result of the last union operation, not the true cardinality if both branches are taken. This can cause under-eviction. Refactor so that `span_count` is always set to `zcard(set_key)` after all unions are performed, and do not rely on the return value of `zunionstore` for eviction logic.
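The actual fix belongs in the Lua script, but for illustration, a redis-py sketch of the same idea (function and key names are hypothetical):

```python
import redis

r = redis.Redis()


def merge_and_count(set_key: str, other_keys: list[str]) -> int:
    """Union other sorted sets into set_key, then return the true size."""
    for other in other_keys:
        if other != set_key:
            # zunionstore's return value only reflects this one union,
            # so it must not be used as the running total.
            r.zunionstore(set_key, [set_key, other])
    return r.zcard(set_key)  # authoritative cardinality for eviction checks
```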

63-64: zpopmin(set_key, span_count - 1000) evicts all oldest spans in a single call, which can cause significant Redis CPU spikes and latency for large evictions (e.g., thousands of elements) in high-throughput environments.

📊 Impact Scores:

  • Production Impact: 4/5
  • Fix Specificity: 5/5
  • Urgency Impact: 3/5
  • Total Score: 12/15

🤖 AI Agent Prompt (Copy & Paste Ready):

In src/sentry/scripts/spans/add-buffer.lua, lines 63-64, the code evicts all oldest spans in a single zpopmin call when span_count > 1000. This can cause significant Redis CPU spikes and latency for large evictions. Refactor this block to evict in smaller batches (e.g., 100 at a time) in a loop until the correct number of elements are removed, to avoid blocking Redis for too long.
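For illustration, an application-side sketch of batched eviction with redis-py (the 100-element batch size comes from the prompt; the constants are assumptions). Note that inside a single Lua script a loop is still atomic, so batching only relieves latency if evictions are spread across script invocations:

```python
import redis

r = redis.Redis()
MAX_SEGMENT_SPANS = 1000  # cap taken from the PR's eviction threshold
EVICTION_BATCH = 100      # batch size suggested by the draft comment


def evict_oldest(set_key: str) -> int:
    """Pop the oldest (lowest-scored) spans in small batches."""
    evicted = 0
    excess = r.zcard(set_key) - MAX_SEGMENT_SPANS
    while excess > 0:
        batch = min(excess, EVICTION_BATCH)
        evicted += len(r.zpopmin(set_key, batch))
        excess -= batch
    return evicted
```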

src/sentry/spans/buffer.py (2)

449-454: _load_segment_data now uses zscan but does not enforce a maximum span count per segment, so segments with more than max_segment_spans spans are not skipped, risking OOM and violating the contract.

📊 Impact Scores:

  • Production Impact: 4/5
  • Fix Specificity: 5/5
  • Urgency Impact: 3/5
  • Total Score: 12/15

🤖 AI Agent Prompt (Copy & Paste Ready):

In src/sentry/spans/buffer.py, lines 449-454, after switching to zscan, the code no longer enforces the maximum span count per segment (`max_segment_spans`). This can cause segments to exceed the allowed span count, risking OOM and violating the contract. Please reintroduce the check: after extending `payloads[key]`, if `len(payloads[key]) > self.max_segment_spans`, increment the metric, log the error, and delete the segment from `payloads` and `cursors`, then continue. Insert this check before the `if cursor == 0:` block.
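A self-contained sketch of the reintroduced guard under the assumptions above (the metric call is replaced with a log line here; real code would also clear the matching cursor entry):

```python
import logging

logger = logging.getLogger(__name__)
MAX_SEGMENT_SPANS = 1000  # value taken from the PR's eviction threshold


def enforce_segment_cap(payloads: dict[bytes, list[bytes]]) -> dict[bytes, list[bytes]]:
    """Drop any segment whose collected payloads exceed the configured cap."""
    kept = {}
    for key, spans in payloads.items():
        if len(spans) > MAX_SEGMENT_SPANS:
            logger.error("segment exceeds max_segment_spans, skipping", extra={"key": key})
            continue
        kept[key] = spans
    return kept
```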

197-199: process_spans now uses zadd to insert spans into Redis, but does not enforce max_segment_spans during insertion, allowing unbounded growth and risking memory exhaustion and slowdowns.

📊 Impact Scores:

  • Production Impact: 4/5
  • Fix Specificity: 3/5
  • Urgency Impact: 3/5
  • Total Score: 10/15

🤖 AI Agent Prompt (Copy & Paste Ready):

In src/sentry/spans/buffer.py, lines 197-199, the code inserts all spans in `subsegment` into a Redis sorted set with `zadd`, but does not enforce the `max_segment_spans` limit. This can cause unbounded memory growth and degrade performance. Update this block to ensure that no more than `self.max_segment_spans` spans exist in the set after insertion. If necessary, evict the oldest spans (lowest score) to maintain the cap. Insert the check and eviction logic before or after the `zadd` as appropriate.
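A redis-py sketch of capping at insertion time; scoring by end_timestamp_precise follows the PR, while the helper shape is an assumption:

```python
import redis

r = redis.Redis()
MAX_SEGMENT_SPANS = 1000


def add_spans(set_key: str, spans: dict[bytes, float]) -> None:
    """Insert spans scored by end_timestamp_precise, then trim the oldest.

    zremrangebyrank(key, 0, -(N + 1)) removes everything except the N
    highest-scored (most recent) members, keeping the cap in one round trip.
    """
    with r.pipeline() as pipe:
        pipe.zadd(set_key, spans)
        pipe.zremrangebyrank(set_key, 0, -(MAX_SEGMENT_SPANS + 1))
        pipe.execute()
```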

src/sentry/spans/consumers/process/factory.py (1)

128-147: process_batch builds a list of all spans in memory before processing, which can cause high memory usage and slowdowns for large batches.

📊 Impact Scores:

  • Production Impact: 3/5
  • Fix Specificity: 2/5
  • Urgency Impact: 2/5
  • Total Score: 7/15

🤖 AI Agent Prompt (Copy & Paste Ready):

In src/sentry/spans/consumers/process/factory.py, lines 128-147, the `process_batch` function accumulates all spans in a list before passing them to `buffer.process_spans`, which can cause high memory usage and slowdowns for large batches. Refactor this code to process each span immediately (e.g., call `buffer.process_spans([span], now=timestamp)` inside the loop) instead of building a large list, while preserving the batch's minimum timestamp for the return value.
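A hedged sketch of the streaming variant; the shape of `messages` and the buffer interface are assumptions made for illustration:

```python
def process_batch(messages, buffer):
    """Flush each span as it is decoded instead of accumulating a list,
    while still tracking the batch's minimum timestamp for the return value."""
    min_timestamp = None
    for timestamp, span in messages:
        if min_timestamp is None or timestamp < min_timestamp:
            min_timestamp = timestamp
        buffer.process_spans([span], now=timestamp)
    return min_timestamp


class _FakeBuffer:  # stand-in for the real spans buffer
    def process_spans(self, spans, now):
        pass


print(process_batch([(10.0, b"span-a"), (5.0, b"span-b")], _FakeBuffer()))  # 5.0
```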

src/sentry/utils/cursors.py (4)

35-39: Cursor.__eq__ does not check type of other, so comparing to unrelated objects may raise AttributeError instead of returning False.

📊 Impact Scores:

  • Production Impact: 3/5
  • Fix Specificity: 5/5
  • Urgency Impact: 2/5
  • Total Score: 10/15

🤖 AI Agent Prompt (Copy & Paste Ready):

In src/sentry/utils/cursors.py, lines 35-39, the __eq__ method of Cursor does not check the type of 'other', so comparing to an unrelated object may raise AttributeError. Update __eq__ to first check isinstance(other, type(self)) and return False if not, before comparing attributes.
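A minimal sketch of the type-safe comparison; the attribute names mirror those mentioned elsewhere in this review and are otherwise assumed:

```python
class Cursor:
    def __init__(self, value: int, offset: int = 0, is_prev: bool = False):
        self.value, self.offset, self.is_prev = value, offset, is_prev

    def __eq__(self, other: object):
        # Returning NotImplemented lets Python fall back to the other
        # operand's __eq__ (and ultimately False) instead of raising.
        if not isinstance(other, Cursor):
            return NotImplemented
        return (self.value, self.offset, self.is_prev) == (
            other.value,
            other.offset,
            other.is_prev,
        )


print(Cursor(1) == Cursor(1))       # True
print(Cursor(1) == "not-a-cursor")  # False, not AttributeError
```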

151-170: _build_next_values and _build_prev_values both iterate over the entire results sequence in reverse or forward order to compute offsets, which can be O(n) per page and cause significant performance degradation for large result sets or high-frequency pagination.

📊 Impact Scores:

  • Production Impact: 3/5
  • Fix Specificity: 2/5
  • Urgency Impact: 2/5
  • Total Score: 7/15

🤖 AI Agent Prompt (Copy & Paste Ready):

Optimize the offset calculation in `src/sentry/utils/cursors.py` lines 151-170 (`_build_next_values`). The current implementation iterates over the entire `results` sequence, which is O(n) per page and can cause significant performance degradation for large result sets. Refactor the code to minimize iteration, leveraging the sorted nature of the data and breaking early when possible. Ensure the logic remains correct for both ascending and descending order.

244-285: build_cursor has a very large argument list (8 parameters), which is a major maintainability issue and increases the risk of incorrect usage in a large codebase.

📊 Impact Scores:

  • Production Impact: 2/5
  • Fix Specificity: 3/5
  • Urgency Impact: 2/5
  • Total Score: 7/15

🤖 AI Agent Prompt (Copy & Paste Ready):

Refactor `build_cursor` in `src/sentry/utils/cursors.py` (lines 244-285) to reduce the number of arguments. Consider grouping related parameters into a configuration object or dataclass to improve maintainability and reduce the risk of incorrect usage. Ensure the function remains backward compatible and easy to use.
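A sketch of the grouping the prompt suggests; the field names are assumptions modeled on typical cursor-pagination inputs, not the actual signature:

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Sequence


@dataclass
class CursorArgs:
    results: Sequence[Any]
    key: Callable[[Any], Any]
    limit: int = 100
    is_desc: bool = False
    cursor: Optional[Any] = None
    hits: Optional[int] = None
    max_hits: Optional[int] = None
    on_results: Optional[Callable[[Sequence[Any]], Any]] = None


def build_cursor(args: CursorArgs):
    ...  # existing logic, reading named fields from `args`
```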

53-61: Cursor.from_string and StringCursor.from_string do not validate that the offset and is_prev fields are within safe/expected ranges, allowing attackers to supply large or negative offsets that could cause denial-of-service or bypass pagination limits.

📊 Impact Scores:

  • Production Impact: 3/5
  • Fix Specificity: 3/5
  • Urgency Impact: 2/5
  • Total Score: 8/15

🤖 AI Agent Prompt (Copy & Paste Ready):

In src/sentry/utils/cursors.py, lines 53-61, the `Cursor.from_string` method does not validate that the parsed `offset` and `is_prev` values are within safe/expected ranges. This allows attackers to supply large or negative offsets, potentially causing denial-of-service or bypassing pagination limits. Update the method to enforce that `offset` is non-negative and within a reasonable upper bound (e.g., 0 <= offset <= 10000), and that `is_prev` is either 0 or 1. Raise ValueError if these checks fail.
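A self-contained sketch of the validation; the `value:offset:is_prev` wire format and the 10000 ceiling are taken from the prompt and are otherwise assumptions:

```python
MAX_OFFSET = 10000  # upper bound suggested by the prompt


def parse_cursor(raw: str) -> tuple[int, int, bool]:
    value, offset, is_prev = (int(part) for part in raw.split(":"))
    if not (0 <= offset <= MAX_OFFSET):
        raise ValueError(f"cursor offset out of range: {offset}")
    if is_prev not in (0, 1):
        raise ValueError(f"cursor is_prev must be 0 or 1: {is_prev}")
    return value, offset, bool(is_prev)


print(parse_cursor("1753:0:0"))  # (1753, 0, False)
```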
