
@akshayutture-augment (Contributor)

yuvmen and others added 2 commits July 25, 2025 09:48
…(#94376)

Part of the Error Upsampling project:
https://www.notion.so/sentry/Tech-Spec-Error-Up-Sampling-1e58b10e4b5d80af855cf3b992f75894?source=copy_link

The events-stats API will now check whether all projects in the query are allowlisted for upsampling and, if so, convert the count query to a sum over `sample_weight` in Snuba. This is done by defining a new SnQL function, `upsampled_count()`.

I noticed there are also eps() and epm() functions in use in this endpoint. I considered (and even worked on) swapping eps() and epm() as well, since for correctness they should probably also use `sample_weight` rather than a naive count. But this caused some complications, and since they are only used by specific dashboard widgets and are not available in Discover, I decided to defer changing them until we know it is needed.
- Add 60-second cache for upsampling eligibility checks to improve performance
- Separate upsampling eligibility check from query transformation for better optimization
- Remove unnecessary null checks in upsampled_count() function per schema requirements
- Add cache invalidation utilities for configuration management

This improves performance during high-traffic periods by avoiding repeated
expensive allowlist lookups while maintaining data consistency.
@akshayutture-augment (Contributor, Author)

@claude review this PR

@claude

claude bot commented Nov 20, 2025

Claude finished @akshayutture-augment's task


Code Review - Error Upsampling with Performance Optimizations

I've completed my review of this PR. Overall, the implementation is solid with good test coverage, but I've identified several important issues that should be addressed.


✅ Strengths

  1. Good test coverage - Comprehensive unit tests and integration tests covering various scenarios
  2. Clear separation of concerns - Helper functions are well-organized
  3. Performance optimization - Caching strategy is well-documented
  4. Safe rollout - Feature is gated behind allowlist configuration

🔴 Critical Issues

1. Race Condition in Caching Logic (src/sentry/api/helpers/error_upsampling.py:27-40)

The caching implementation has a critical race condition that can cause inconsistent query results:

```python
# Lines 27-40
cache_key = f"error_upsampling_eligible:{organization.id}:{hash(tuple(sorted(snuba_params.project_ids)))}"

# Check cache first for performance optimization
cached_result = cache.get(cache_key)
if cached_result is not None:
    return cached_result and _should_apply_sample_weight_transform(dataset, request)

# Cache miss - perform fresh allowlist check
is_eligible = _are_all_projects_error_upsampled(snuba_params.project_ids, organization)

# Cache for 60 seconds to improve performance during traffic spikes
cache.set(cache_key, is_eligible, 60)

return is_eligible and _should_apply_sample_weight_transform(dataset, request)
```

Problem: The function caches is_eligible but returns is_eligible and _should_apply_sample_weight_transform(...). Only the project-eligibility half of the decision is cached, and the cache key includes neither the dataset nor the request query, even though the final result depends on both:

  • Cached: project eligibility, keyed only by organization and project IDs
  • Evaluated per call: the dataset/request check

If the dataset or request query changes between calls with the same cache key, the cached half of the decision no longer matches the inputs the final result depends on, which makes the control flow easy to get wrong and the results hard to reason about.

Solution: Cache the full eligibility decision:

```python
cache_key = f"error_upsampling_eligible:{organization.id}:{hash(tuple(sorted(snuba_params.project_ids)))}:{dataset.__name__}:{request.GET.get('query', '')}"

cached_result = cache.get(cache_key)
if cached_result is not None:
    return cached_result

is_eligible = _are_all_projects_error_upsampled(snuba_params.project_ids, organization)
should_transform = is_eligible and _should_apply_sample_weight_transform(dataset, request)

cache.set(cache_key, should_transform, 60)
return should_transform
```
2. Potential Data Inconsistency (src/sentry/api/helpers/error_upsampling.py:43-64)

The comment on lines 50-53 acknowledges a serious issue but doesn't address it:

```python
# NOTE: This function reads the allowlist configuration fresh each time,
# which means it can return different results between calls if the
# configuration changes during request processing. This is intentional
# to ensure we always have the latest configuration state.
```

Problem: Combined with the cache, this creates a window where:

  1. Request A caches "eligible=True" for projects [1,2]
  2. Config changes to remove project 2
  3. Request B uses cached "eligible=True"
  4. Query runs with upsampling but project 2 may not have sample_weight column populated consistently

Solution: Either:

  • Include allowlist version in cache key, OR
  • Document the acceptable inconsistency window and monitoring strategy
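A versioned cache key could look like the sketch below. Here `allowlist_version` is a hypothetical counter (or config hash) assumed to be bumped on every allowlist update, so entries cached under an old version are simply never read again:

```python
def build_upsampling_cache_key(
    org_id: int, project_ids: list[int], allowlist_version: int
) -> str:
    # Sorting makes the key stable regardless of the order project IDs
    # arrive in; the version suffix invalidates all prior entries whenever
    # the allowlist configuration changes.
    ids = ",".join(map(str, sorted(project_ids)))
    return f"error_upsampling_eligible:{org_id}:{ids}:v{allowlist_version}"
```

This sidesteps explicit invalidation entirely: stale entries just age out of the cache under their old version suffix.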
3. Missing Null Safety (src/sentry/api/helpers/error_upsampling.py:87-100)

```python
def transform_query_columns_for_error_upsampling(
    query_columns: Sequence[str],
) -> list[str]:
    # ...
    # Note: We rely on the database schema to ensure sample_weight exists for all
    # events in allowlisted projects, so no additional null checks are needed here.
```

Problem: The comment claims schema guarantees, but:

  • Events may have been ingested before upsampling was enabled
  • Migration timing issues
  • Partial rollout scenarios

Impact: Queries will break with NULL errors if sample_weight is missing.

Solution: Add defensive null handling in the SQL function definition in discover.py:1046-1050:

```python
snql_aggregate=lambda args, alias: Function(
    "toInt64",
    [Function("sum", [Function("coalesce", [Column("sample_weight"), 1])])],
    alias,
),
```

⚠️ Medium Priority Issues

4. Cache Key Collision Risk (src/sentry/api/helpers/error_upsampling.py:27)

```python
cache_key = f"error_upsampling_eligible:{organization.id}:{hash(tuple(sorted(snuba_params.project_ids)))}"
```

Problem: Python's hash() can have collisions. Different project ID combinations might hash to the same value.

Solution: Use a deterministic hash or just join the IDs:

```python
cache_key = f"error_upsampling_eligible:{organization.id}:{','.join(map(str, sorted(snuba_params.project_ids)))}"
```

5. Inconsistent Upsampling Application (src/sentry/api/endpoints/organization_events_stats.py:218-296)

The upsampling check happens early but transformation happens late in 3 different places (lines 233, 277, 296). This creates maintenance burden and potential for bugs if one path is missed.

Recommendation: Refactor to apply transformation in one place or use a decorator pattern.
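One way the decorator pattern could look is sketched below. All names (`with_error_upsampling`, the `upsampling_eligible` keyword, the stand-in `_transform`) are illustrative, not the PR's actual API:

```python
from functools import wraps
from typing import Callable, Sequence


def _transform(columns: Sequence[str]) -> list[str]:
    # Stand-in for transform_query_columns_for_error_upsampling():
    # rewrite the naive count into the upsampled aggregate.
    return ["upsampled_count() as count" if c == "count()" else c for c in columns]


def with_error_upsampling(func: Callable) -> Callable:
    """Apply the column transformation in exactly one place, so every
    code path through the wrapped stats function picks it up."""

    @wraps(func)
    def wrapper(query_columns, *args, upsampling_eligible=False, **kwargs):
        if upsampling_eligible:
            query_columns = _transform(query_columns)
        return func(query_columns, *args, **kwargs)

    return wrapper


@with_error_upsampling
def get_event_stats(query_columns):
    # Placeholder for the real get_event_stats closure in the endpoint.
    return query_columns
```

With this shape, forgetting one of the three call sites becomes impossible: eligibility is decided once and threaded through a single keyword.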

6. Test Data Quality Issue (tests/snuba/api/endpoints/test_organization_events_stats.py:3626-3627)

```python
assert data[0][1][0]["count"] == 10  # First bucket has 1 event
assert data[1][1][0]["count"] == 10  # Second bucket has 1 event
```

Problem: Comments say "1 event" but assertions check for count == 10. This is confusing and suggests the test might not be validating what you think it is.

Expected: With client_sample_rate: 0.1, 1 event should upsample to ~10. But are you actually testing upsampling, or is this hardcoded?

Action: Verify the test is correctly validating upsampled counts and fix the comments.


💡 Minor Issues & Code Quality

7. Empty Directory Added (sentry-repo/)

An empty sentry-repo/ directory was added. This appears unintentional and should be removed.

8. Type Safety (src/sentry/api/helpers/error_upsampling.py:103)

```python
def _should_apply_sample_weight_transform(dataset: Any, request: Request) -> bool:
```

Issue: dataset: Any loses type safety.

Recommendation: Use a more specific type or a Protocol; if datasets are passed around as modules, ModuleType works:

```python
from types import ModuleType

def _should_apply_sample_weight_transform(dataset: ModuleType, request: Request) -> bool:
```
9. Query Parsing Fragility (src/sentry/api/helpers/error_upsampling.py:135-138)

```python
query = request.GET.get("query", "").lower()
if "event.type:error" in query:
    return True
```

Problem: Simple substring matching can have false positives:

  • event.type:error_count would match
  • !event.type:error (negation) would match incorrectly

Recommendation: Use proper query parsing or more robust pattern matching.
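A lightweight middle ground, short of a full query parser, is to match the filter as a standalone token. This is only a sketch of the idea; the real fix should probably reuse Sentry's existing search-query parser:

```python
import re

# Match `event.type:error` as a whole filter token: reject a leading `!`
# (negation), a leading word/dot character (e.g. `stack.event.type:...`),
# and a longer value (e.g. `event.type:error_count`).
_ERROR_TYPE_FILTER = re.compile(r"(?<![!\w.])event\.type:error(?![\w-])", re.IGNORECASE)


def query_targets_error_events(query: str) -> bool:
    return bool(_ERROR_TYPE_FILTER.search(query))
```

The lookbehind/lookahead assertions reject exactly the two false-positive classes called out above while still matching the filter anywhere in a larger query string.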

10. Magic Numbers (src/sentry/api/helpers/error_upsampling.py:38)

```python
cache.set(cache_key, is_eligible, 60)  # 60 seconds
```

Recommendation: Extract to a named constant:

```python
UPSAMPLING_CACHE_TTL_SECONDS = 60
cache.set(cache_key, is_eligible, UPSAMPLING_CACHE_TTL_SECONDS)
```

📋 Testing Gaps

  1. No tests for cache invalidation - The invalidate_upsampling_cache function is never tested
  2. No tests for concurrent requests - Race condition scenarios are not covered
  3. No tests for mixed dataset queries - What happens when discover dataset has both errors and transactions?
  4. No performance tests - The PR title mentions "performance optimizations" but there are no benchmarks
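A minimal shape for the missing invalidation test is sketched below. The dict stands in for Django's cache backend, and the helper's signature is assumed (the PR's actual `invalidate_upsampling_cache` may differ):

```python
# In-memory stand-in for the cache backend, for illustration only.
cache: dict[str, bool] = {}


def invalidate_upsampling_cache(org_id: int, project_ids: list[int]) -> None:
    # Hypothetical implementation: drop the eligibility entry so the next
    # eligibility check misses the cache and re-reads the allowlist.
    key = f"error_upsampling_eligible:{org_id}:{','.join(map(str, sorted(project_ids)))}"
    cache.pop(key, None)


def test_invalidation_forces_fresh_lookup() -> None:
    key = "error_upsampling_eligible:1:2,3"
    cache[key] = True                      # warm the cache
    invalidate_upsampling_cache(1, [3, 2])
    assert key not in cache                # next check must hit the allowlist
```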

📚 Documentation

Missing:

  • How to monitor for cache inconsistencies
  • Rollback procedure if upsampling causes issues
  • Expected performance improvements (quantified)
  • Migration guide for adding sample_weight column to existing events

🎯 Recommendations for Next Steps

  1. Fix the race condition (Critical) - see issue #1 above
  2. Add null safety (Critical) - see issue #3 above
  3. Fix cache key collisions (High) - see issue #4 above
  4. Verify test assertions (High) - see issue #6 above
  5. Remove empty directory (Quick win)
  6. Add cache invalidation tests
  7. Document monitoring strategy

Summary

This is a solid implementation with good structure and test coverage. The main concerns are around cache consistency and null safety. The race condition in particular could lead to incorrect query results or runtime errors. I recommend addressing the critical issues before merging.

The performance optimization approach is sound, but consider adding observability (metrics, logging) to track:

  • Cache hit rate
  • Upsampling transformation frequency
  • Query performance improvements
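For the cache hit rate in particular, even a tiny in-process counter makes the metric concrete; in Sentry this would be emitted through the project's real metrics backend rather than held in memory (sketch only):

```python
from collections import Counter


class CacheStats:
    """Minimal hit/miss tracker, illustrating the metric to emit on every
    eligibility-cache lookup."""

    def __init__(self) -> None:
        self.counts = Counter()

    def record(self, hit: bool) -> None:
        self.counts["hit" if hit else "miss"] += 1

    def hit_rate(self) -> float:
        total = sum(self.counts.values())
        return self.counts["hit"] / total if total else 0.0
```

Tracking this alongside transformation frequency would also quantify the performance claim in the PR title.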
