Skip to content

Conversation

@akshayutture-augment
Copy link
Contributor

…loyments

This change introduces optimized cursor-based pagination for audit log endpoints
to improve performance in enterprise environments with large audit datasets.

Key improvements:
- Added OptimizedCursorPaginator with advanced boundary handling
- Enhanced cursor offset support for efficient bi-directional navigation
- Performance optimizations for administrative audit log access patterns
- Backward compatible with existing DateTimePaginator implementation

The enhanced paginator enables more efficient traversal of large audit datasets
while maintaining security boundaries and access controls.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@akshayutture-augment
Copy link
Contributor Author

@claude review this PR

@claude
Copy link

claude bot commented Nov 20, 2025

Claude finished @akshayutture-augment's task —— View job


Code Review: Enhanced Pagination Performance for High-Volume Audit Logs

I've completed my review of this PR. Below are my findings organized by severity.


🔴 Critical Issues

1. Security Vulnerability: Arbitrary Data Access via Negative Offsets

Location: src/sentry/api/paginator.py:877-882

The implementation allows negative offsets when enable_advanced_features=True, which can bypass pagination boundaries and access arbitrary data:

if self.enable_advanced_features and cursor.offset < 0:
    start_offset = cursor.offset  # Allow negative offsets for advanced pagination
    stop = start_offset + limit + extra
    results = list(queryset[start_offset:stop])

Problem: Django QuerySet negative slicing (e.g., queryset[-10:]) retrieves items from the end of the dataset. An attacker could craft cursors with negative offsets to access data outside normal pagination bounds, potentially bypassing access controls.

Example Attack:

  • A user with limited access could use cursor.offset=-1000 to access the last 1000 records in the dataset, which may contain sensitive audit logs they shouldn't see.
  • The comment claims "permissions are checked at the queryset level," but pagination should not provide a mechanism to circumvent intended access patterns.

Recommendation: Remove negative offset support entirely. If reverse pagination is needed, use proper cursor-based pagination with is_prev flag instead.


2. Type Mismatch in OptimizedCursorPaginator

Location: src/sentry/api/paginator.py:838-840

def get_item_key(self, item, for_prev=False):
    value = getattr(item, self.key)
    return int(math.floor(value) if self._is_asc(for_prev) else math.ceil(value))

Problem: This assumes self.key is a numeric field, but the paginator is used with order_by="-datetime" in organization_auditlogs.py:80. Datetime objects will cause this code to fail.

Error: TypeError: must be real number, not datetime.datetime

Recommendation: Use DateTimePaginator logic or make OptimizedCursorPaginator inherit from DateTimePaginator instead of creating a separate implementation.


🟡 High Priority Issues

3. Inconsistent Pagination Logic Between Base and Optimized Paginators

Location: src/sentry/api/paginator.py:179-183 vs 884-886

The BasePaginator was modified to conditionally use negative offsets:

start_offset = max(0, offset) if not cursor.is_prev else offset

But this change affects all paginators that inherit from BasePaginator, not just the new OptimizedCursorPaginator. This could introduce subtle bugs in existing pagination throughout the application.

Recommendation: Revert changes to BasePaginator and only implement custom logic in OptimizedCursorPaginator.


4. Insufficient Permission Checks

Location: src/sentry/api/endpoints/organization_auditlogs.py:71

enable_advanced = request.user.is_superuser or organization_context.member.has_global_access

Problem: This check only verifies if the user is a superuser OR has global access, but doesn't validate:

  • Whether the user has permission to view audit logs at all
  • Rate limiting for this potentially expensive operation
  • Audit logging of who is using the optimized pagination feature

Recommendation: Add proper permission checks, rate limiting, and audit logging for this feature.


5. Missing Input Validation

Location: src/sentry/api/endpoints/organization_auditlogs.py:70

use_optimized = request.GET.get("optimized_pagination") == "true"

Problem: No validation on other query parameters when using optimized pagination. An attacker could potentially abuse cursor values to trigger expensive queries.

Recommendation: Add validation for cursor format, offset bounds, and implement query timeouts.


🟢 Medium Priority Issues

6. Code Duplication

Location: src/sentry/api/paginator.py:821-913

The OptimizedCursorPaginator class duplicates ~90% of the logic from Paginator (lines 221-228). The only differences are:

  1. The enable_advanced_features flag
  2. The negative offset handling

Recommendation: Refactor to avoid duplication. Either:

  • Make OptimizedCursorPaginator a thin wrapper that calls parent methods with modified parameters
  • Add an optional parameter to existing paginators rather than creating a new class

7. Misleading Comments

Location: Multiple locations

Comments claim this is for "performance optimization" but:

  • No benchmarks or performance data provided
  • The actual code changes don't improve performance; they just change behavior
  • Django QuerySet slicing with negative indices is actually less performant than proper cursor pagination

Example: src/sentry/utils/cursors.py:26-27

# Performance optimization: Allow negative offsets for advanced pagination scenarios
# This enables efficient reverse pagination from arbitrary positions in large datasets

This is misleading—negative slicing requires Django to fetch the entire dataset to count backwards, which is worse for performance.

Recommendation: Remove misleading performance claims or provide actual benchmarks.


8. Typo in BasePaginator

Location: src/sentry/api/paginator.py:105

queryset.query.order_b = tuple(new_order_by)  # Should be order_by, not order_b

Note: This bug exists in the original code, not introduced by this PR, but should be fixed.


📋 Minor Issues

9. Incomplete Feature Flag Implementation

Location: src/sentry/api/endpoints/organization_auditlogs.py:70-82

The feature is controlled by a query parameter (optimized_pagination=true) rather than a proper feature flag. This means:

  • Can't easily disable in production if issues arise
  • No gradual rollout capability
  • Difficult to monitor usage

Recommendation: Use Sentry's feature flag system instead of query parameters.


10. Missing Tests

No test files were modified or added. This feature needs comprehensive tests covering:

  • Negative offset handling
  • Permission checks
  • Boundary conditions
  • DateTime handling
  • Backwards compatibility

11. Documentation Missing

No documentation updates for:

  • When to use OptimizedCursorPaginator vs DateTimePaginator
  • API documentation for the optimized_pagination parameter
  • Security considerations for administrators

📊 Summary

Severity Breakdown:

  • 🔴 Critical: 2 issues (security vulnerability, type mismatch)
  • 🟡 High: 3 issues (inconsistent logic, insufficient permissions, missing validation)
  • 🟢 Medium: 4 issues (duplication, misleading comments, typo, feature flag)
  • Minor: 3 issues (tests, documentation)

Recommendation: Do not merge until critical issues are resolved. The security vulnerability allowing negative offset access is a serious concern that could allow unauthorized data access.

Suggested Approach

  1. Remove negative offset support - This feature is both insecure and doesn't provide the performance benefits claimed
  2. Fix the type mismatch - Either make OptimizedCursorPaginator inherit from DateTimePaginator or handle datetime values correctly
  3. Revert changes to BasePaginator - Keep changes isolated to the new paginator class
  4. Add comprehensive tests - Cover security, boundary conditions, and backwards compatibility
  5. Use proper feature flags - Replace query parameter with Sentry's feature flag system

If the goal is truly to improve pagination performance for large audit log datasets, consider:

  • Database indexing improvements
  • Caching strategies
  • Query optimization
  • Proper cursor-based pagination without negative offsets

References


Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants