-
Notifications
You must be signed in to change notification settings - Fork 26
Optimize spans buffer insertion with eviction during insert #2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: performance-optimization-baseline
Are you sure you want to change the base?
Optimize spans buffer insertion with eviction during insert #2
Conversation
A proof of concept that limits the number of spans per segment during insertion. Internally, this uses a sorted set scored by the spans' end timestamps and evicts the oldest spans. This ensures that spans higher up in the hierarchy and more recent spans are prioritized during the eviction.
…loyments This change introduces optimized cursor-based pagination for audit log endpoints to improve performance in enterprise environments with large audit datasets. Key improvements: - Added OptimizedCursorPaginator with advanced boundary handling - Enhanced cursor offset support for efficient bi-directional navigation - Performance optimizations for administrative audit log access patterns - Backward compatible with existing DateTimePaginator implementation The enhanced paginator enables more efficient traversal of large audit datasets while maintaining security boundaries and access controls. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Greptile Summary
This PR implements a comprehensive performance optimization for the spans buffer system by transitioning from unbounded memory growth to a controlled eviction strategy. The core change shifts from using Redis sets to sorted sets, where spans are ordered by their end_timestamp_precise values, enabling time-based eviction of older spans when buffer limits are exceeded.
Key Changes:
- Buffer Storage Migration: Converted from Redis SADD/SSCAN operations to ZADD/ZSCAN operations in
buffer.py, using timestamps as sorting scores - Eviction Logic: Added Lua script logic in
add-buffer.luathat automatically removes oldest spans when segments exceed 1,000 spans usingzpopmin - Data Structure Enhancement: Extended the Span dataclass with
end_timestamp_precisefield throughout the processing pipeline (factory.py,buffer.py) - Pagination Optimizations: Introduced
OptimizedCursorPaginatorwith negative offset support for efficient reverse pagination in audit logs - Test Updates: Updated all span test fixtures to include the new required timestamp field
The changes integrate across the entire spans processing pipeline - from initial span ingestion in the consumer factory, through buffer operations, to final data retrieval. The eviction mechanism prevents Redis memory bloat while maintaining the most recent spans for each segment, addressing scalability concerns in high-throughput environments.
PR Description Notes:
- The PR body only contains "Test 2" which provides no meaningful context about the changes
Confidence score: 1/5
- This PR has several critical implementation issues that could cause immediate production failures
- The
OptimizedCursorPaginatoris imported but doesn't exist in the codebase, causing import errors; negative offset support bypasses normal pagination safety checks; spans buffer has inconsistent eviction behavior between Python and Lua code - Files needing attention:
src/sentry/api/endpoints/organization_auditlogs.py(missing paginator class),src/sentry/api/paginator.py(unsafe negative offset handling),src/sentry/spans/buffer.py(coordination issues with Lua script)
9 files reviewed, 3 comments
| from sentry.api.bases import ControlSiloOrganizationEndpoint | ||
| from sentry.api.bases.organization import OrganizationAuditPermission | ||
| from sentry.api.paginator import DateTimePaginator | ||
| from sentry.api.paginator import DateTimePaginator, OptimizedCursorPaginator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: OptimizedCursorPaginator is imported but does not exist in the codebase - this will cause ImportError at runtime
| paginator_cls=OptimizedCursorPaginator, | ||
| order_by="-datetime", | ||
| on_results=lambda x: serialize(x, request.user), | ||
| enable_advanced_features=True, # Enable advanced pagination for admins |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: enable_advanced_features parameter is not supported by the paginator interface - check the self.paginate method signature
| parent_span_id=val.get("parent_span_id"), | ||
| project_id=val["project_id"], | ||
| payload=payload.value, | ||
| end_timestamp_precise=val["end_timestamp_precise"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: Direct dictionary access to end_timestamp_precise will raise KeyError if field is missing. Consider using .get() with a default value for backward compatibility.
Review Summary🏷️ Draft Comments (14)
|
Test 2