
Conversation


@hussam789 hussam789 commented Oct 30, 2025

User description

PR #3


PR Type

Enhancement, Tests


Description

  • Implement error upsampling support in events-stats API with upsampled_count() function

  • Add 60-second caching for upsampling eligibility checks to optimize performance

  • Transform count() aggregations to use sum(sample_weight) for allowlisted projects

  • Add comprehensive test coverage for upsampling logic and edge cases

  • Extract sample_rate from error_sampling context in event factory
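The count() to upsampled_count() conversion can be sketched as a standalone helper. This is a simplified illustration, not the actual Sentry implementation; it only shows the case-insensitive, whitespace-tolerant matching that the unit tests described below exercise:

```python
def transform_query_columns_for_error_upsampling(query_columns):
    """Replace count() aggregations with upsampled_count(), matching
    case-insensitively and ignoring surrounding whitespace; all other
    columns pass through unchanged."""
    transformed = []
    for column in query_columns:
        if column.strip().lower() == "count()":
            transformed.append("upsampled_count()")
        else:
            transformed.append(column)
    return transformed
```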


Diagram Walkthrough

flowchart LR
  A["Events Stats API"] --> B["Check Upsampling Eligibility"]
  B --> C["Cache Result 60s"]
  C --> D["Transform Query Columns"]
  D --> E["Apply upsampled_count()"]
  E --> F["Execute Snuba Query"]
  F --> G["Return Results"]

File Walkthrough

Relevant files
Enhancement
organization_events_stats.py
Integrate upsampling eligibility checks and column transformation

src/sentry/api/endpoints/organization_events_stats.py

  • Import error upsampling helper functions for eligibility checks and
    query transformation
  • Add early upsampling eligibility check with caching for performance
    optimization
  • Apply query column transformation before executing top events and
    standard timeseries queries
  • Fix error message grammar in validation response
  • Pass transformed columns (final_columns) to all query execution paths
+33/-5   
error_upsampling.py
Create error upsampling helper with caching and transformations

src/sentry/api/helpers/error_upsampling.py

  • Create new helper module for error upsampling logic with caching
    support
  • Implement is_errors_query_for_error_upsampled_projects() with
    60-second cache for allowlist eligibility
  • Add _are_all_projects_error_upsampled() to validate all projects are
    in allowlist
  • Implement transform_query_columns_for_error_upsampling() to convert
    count() to upsampled_count()
  • Add _should_apply_sample_weight_transform() to determine dataset
    applicability
  • Provide invalidate_upsampling_cache() utility for configuration
    management
+140/-0 
discover.py
Add upsampled_count SnQL function for error aggregation   

src/sentry/search/events/datasets/discover.py

  • Add new upsampled_count SnQL function to function converter mapping
  • Implement aggregation using toInt64(sum(sample_weight)) for upsampled
    error counting
  • Set default result type to number for the new function
+12/-0   
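Conceptually, toInt64(sum(sample_weight)) works because each stored event carries a weight of 1/sample_rate, i.e. the number of original events it represents. A hypothetical plain-Python model of the aggregation (not the SnQL implementation itself):

```python
def upsampled_count(events):
    """Sum each event's sample_weight (assumed to be 1/sample_rate,
    i.e. how many real events the stored one stands in for) and
    truncate to an integer, mirroring toInt64(sum(sample_weight))."""
    return int(sum(event["sample_weight"] for event in events))

# Two events stored at a 10% client sample rate each carry weight 10,
# so they upsample to an estimated 20 original errors.
events = [{"sample_weight": 10.0}, {"sample_weight": 10.0}]
```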
factories.py
Extract sample_rate from error_sampling context in events

src/sentry/testutils/factories.py

  • Add MutableMapping to imports from collections.abc
  • Create _set_sample_rate_from_error_sampling() helper function to
    extract client_sample_rate from error_sampling context
  • Call helper in store_event() to populate sample_rate field from
    error_sampling context data
+20/-1   
Tests
test_error_upsampling.py
Add unit tests for error upsampling helper functions         

tests/sentry/api/helpers/test_error_upsampling.py

  • Create comprehensive test suite for error upsampling helper functions
  • Test _are_all_projects_error_upsampled() with various allowlist
    scenarios
  • Test transform_query_columns_for_error_upsampling() with case
    insensitivity and whitespace handling
  • Test _is_error_focused_query() for error vs transaction detection
  • Test _should_apply_sample_weight_transform() across different datasets
+101/-0 
test_organization_events_stats.py
Add integration tests for error upsampling in events-stats

tests/snuba/api/endpoints/test_organization_events_stats.py

  • Add OrganizationEventsStatsErrorUpsamplingTest test class with
    integration tests
  • Test upsampling with fully allowlisted projects showing upsampled
    counts
  • Test partial allowlist scenario where upsampling is not applied
  • Test that transaction events use regular count regardless of allowlist
  • Test behavior when no projects are allowlisted
+171/-0 
Configuration changes
pyproject.toml
Update mypy configuration for new modules                               

pyproject.toml

  • Add sentry.api.helpers.error_upsampling to mypy ignore list
  • Add tests.sentry.api.helpers.test_error_upsampling to mypy ignore list
+2/-0     
Miscellaneous
sentry-repo
Update subproject commit reference                                             

sentry-repo

  • Update subproject commit reference to latest version
+1/-0     

yuvmen and others added 2 commits July 25, 2025 09:48
…(#94376)

Part of the Error Upsampling project:
https://www.notion.so/sentry/Tech-Spec-Error-Up-Sampling-1e58b10e4b5d80af855cf3b992f75894?source=copy_link

The events-stats API will now check whether all projects in the query are allowlisted for upsampling and, if so, convert the count query to a sum over `sample_weight` in Snuba. This is done by defining a new SnQL function, `upsampled_count()`.

I noticed that eps() and epm() are also in use in this endpoint. I considered (and even started work on) swapping them too, since for correctness they should probably also use `sample_weight` rather than counting naively. However, that caused some complications, and since they are only used by specific dashboard widgets and are not available in Discover, I decided to defer changing them until we find it is needed.
- Add 60-second cache for upsampling eligibility checks to improve performance
- Separate upsampling eligibility check from query transformation for better optimization
- Remove unnecessary null checks in upsampled_count() function per schema requirements
- Add cache invalidation utilities for configuration management

This improves performance during high-traffic periods by avoiding repeated
expensive allowlist lookups while maintaining data consistency.
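The real code uses Django's cache framework with a 60-second timeout; as a self-contained sketch of the same TTL behavior (class and method names here are illustrative, not from the PR):

```python
import time

class TTLCache:
    """Minimal time-to-live cache: entries expire after `ttl` seconds,
    after which get() behaves as a cache miss."""

    def __init__(self, ttl=60):
        self.ttl = ttl
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires = entry
        if time.monotonic() >= expires:
            # Entry has aged out; drop it and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, time.monotonic() + self.ttl)
```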
@qodo-code-review

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
🟢
No security concerns identified. No security vulnerabilities detected by AI analysis; human verification is advised for critical code.
Ticket Compliance
🎫 No ticket provided
  • Create ticket/issue
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status: Passed

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status: Passed

Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status:
No audit logs: The new upsampling eligibility and transformation logic performs configuration-based
decisions and caching without emitting audit logs for critical actions or changes, and the
diff does not show any added audit trail.

Referred Code
    cache_key = f"error_upsampling_eligible:{organization.id}:{hash(tuple(sorted(snuba_params.project_ids)))}"

    # Check cache first for performance optimization
    cached_result = cache.get(cache_key)
    if cached_result is not None:
        return cached_result and _should_apply_sample_weight_transform(dataset, request)

    # Cache miss - perform fresh allowlist check
    is_eligible = _are_all_projects_error_upsampled(snuba_params.project_ids, organization)

    # Cache for 60 seconds to improve performance during traffic spikes
    cache.set(cache_key, is_eligible, 60)

    return is_eligible and _should_apply_sample_weight_transform(dataset, request)


def _are_all_projects_error_upsampled(
    project_ids: Sequence[int], organization: Organization
) -> bool:
    """
    Check if ALL projects in the query are allowlisted for error upsampling.


 ... (clipped 28 lines)
Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Cache key risks: Caching eligibility with a hash of sorted project IDs lacks collision handling and does
not validate empty/missing IDs, which may cause subtle misapplication of upsampling
without explicit error handling.

Referred Code
cache_key = f"error_upsampling_eligible:{organization.id}:{hash(tuple(sorted(snuba_params.project_ids)))}"

# Check cache first for performance optimization
cached_result = cache.get(cache_key)
if cached_result is not None:
    return cached_result and _should_apply_sample_weight_transform(dataset, request)

# Cache miss - perform fresh allowlist check
is_eligible = _are_all_projects_error_upsampled(snuba_params.project_ids, organization)

# Cache for 60 seconds to improve performance during traffic spikes
cache.set(cache_key, is_eligible, 60)

return is_eligible and _should_apply_sample_weight_transform(dataset, request)
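One way to address both concerns raised here (a hedged sketch, not part of the PR): derive the key from a content digest rather than Python's built-in hash(), and reject empty project lists explicitly.

```python
import hashlib

def upsampling_cache_key(org_id, project_ids):
    """Build a deterministic cache key for upsampling eligibility.
    A sha1 digest over the sorted IDs avoids relying on hash(), and
    empty project lists are rejected up front instead of being cached."""
    if not project_ids:
        raise ValueError("project_ids must not be empty")
    digest = hashlib.sha1(
        ",".join(str(pid) for pid in sorted(project_ids)).encode()
    ).hexdigest()
    return f"error_upsampling_eligible:{org_id}:{digest}"
```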
Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status:
Input assumptions: The transformation assumes presence of sample_weight and trusts request query to detect
error events without sanitization of inputs beyond simple string checks, which may require
validation against unsupported datasets or malformed inputs.

Referred Code
def _should_apply_sample_weight_transform(dataset: Any, request: Request) -> bool:
    """
    Determine if we should apply sample_weight transformations based on the dataset
    and query context. Only apply for error events since sample_weight doesn't exist
    for transactions.
    """
    from sentry.snuba import discover, errors

    # Always apply for the errors dataset
    if dataset == errors:
        return True

    from sentry.snuba import transactions

    # Never apply for the transactions dataset
    if dataset == transactions:
        return False

    # For the discover dataset, check if we're querying errors specifically
    if dataset == discover:
        result = _is_error_focused_query(request)


 ... (clipped 17 lines)
Compliance status legend:
🟢 - Fully Compliant
🟡 - Partially Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

@qodo-code-review

PR Code Suggestions ✨

Explore these optional code suggestions:

Category | Suggestion | Impact
General
Improve exception handling and code clarity

Replace broad try...except Exception blocks with safer dictionary access using
.get() and more specific exception handling for type conversion. This will
improve error handling and code robustness.

src/sentry/testutils/factories.py [344-357]

 def _set_sample_rate_from_error_sampling(normalized_data: MutableMapping[str, Any]) -> None:
     """Set 'sample_rate' on normalized_data if contexts.error_sampling.client_sample_rate is present and valid."""
-    client_sample_rate = None
-    try:
-        client_sample_rate = (
-            normalized_data.get("contexts", {}).get("error_sampling", {}).get("client_sample_rate")
-        )
-    except Exception:
-        pass
-    if client_sample_rate:
+    error_sampling = normalized_data.get("contexts", {}).get("error_sampling")
+    if not isinstance(error_sampling, Mapping):
+        return
+
+    client_sample_rate = error_sampling.get("client_sample_rate")
+    if client_sample_rate is not None:
         try:
             normalized_data["sample_rate"] = float(client_sample_rate)
-        except Exception:
+        except (ValueError, TypeError):
             pass
Suggestion importance[1-10]: 7


Why: The suggestion correctly points out the use of broad except Exception blocks and proposes replacing them with safer dictionary access and more specific exception handling (ValueError, TypeError), which is a significant improvement for code robustness and maintainability.

Medium
Refactor to avoid redundant logic

Refactor the code to eliminate redundant column transformation logic. Apply the
transformation once at the beginning based on the upsampling_enabled flag.

src/sentry/api/endpoints/organization_events_stats.py [226-305]

 upsampling_enabled = should_upsample
-final_columns = query_columns
+
+if upsampling_enabled:
+    final_columns = transform_query_columns_for_error_upsampling(query_columns)
+else:
+    final_columns = query_columns
 
 if top_events > 0:
-    # Apply upsampling transformation just before query execution
-    # This late transformation ensures we use the most current schema assumptions
-    if upsampling_enabled:
-        final_columns = transform_query_columns_for_error_upsampling(query_columns)
-        
     if use_rpc:
         return scoped_dataset.run_top_events_timeseries_query(
             params=snuba_params,
             query_string=query,
             y_axes=final_columns,
             ...
         )
     return scoped_dataset.top_events_timeseries(
         timeseries_columns=final_columns,
         ...
     )
 
 if use_rpc:
-    # Apply upsampling transformation just before RPC query execution
-    if upsampling_enabled:
-        final_columns = transform_query_columns_for_error_upsampling(query_columns)
-        
     return scoped_dataset.run_timeseries_query(
         params=snuba_params,
         query_string=query,
         y_axes=final_columns,
         ...
     )
 
-# Apply upsampling transformation just before standard query execution
-if upsampling_enabled:
-    final_columns = transform_query_columns_for_error_upsampling(query_columns)
-
 return scoped_dataset.timeseries_query(
     selected_columns=final_columns,
     ...
 )

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 5


Why: The suggestion correctly identifies repeated logic and proposes a valid refactoring to apply the column transformation once, which improves code readability and maintainability by reducing redundancy.

Low
