Skip to content

Conversation

@akshayutture-augment
Copy link
Contributor

armenzg and others added 30 commits June 20, 2025 12:49
…o 'low' (#93927)"

This reverts commit 8d04522.

Co-authored-by: roaga <[email protected]>
Missed in the initial commit, leading to some relevant logs being
unannotated.
We have had a few tasks get killed at 10% rollout.
Also add a test, so that this doesn't happen again
Fixes DE-129 and DE-156

---------

Co-authored-by: Cursor Agent <[email protected]>
These transitions should be matching
…` (#93946)

Use `project_id` on the replay record instead of the URL (where it does
not always exist).

---------

Co-authored-by: Cursor Agent <[email protected]>
Co-authored-by: getsantry[bot] <66042841+getsantry[bot]@users.noreply.github.com>
Also fixed `replay.view_html` -> `replay.view-html`

---------

Co-authored-by: Michelle Zhang <[email protected]>
…948)

gets `npx @typescript/native-preview` passing again
The conditions associated with a DCG can change over time, and it's good
if we can be completely confident that they're consistent within a given
task execution.
This is unused and most regex experiments have required broader changes
to ensure that regexes are evaluated in a specific order (ex:
traceparent). Removing this for now to simplify the code and very
slightly improve runtime performance.
From some testing (on feedback lists of all different lengths), this
prompt seems to work better. It doesn't write overly long sentences and
also does a better job at "summarizing" versus just mentioning a few
specific topics and leaving out others.
Just remove a couple custom Flex* classes in favor of the Flex primitive
This has been killed a few times.

Refs SENTRY-42M7
…n table (#93892)

<!-- Describe your PR here. -->

[ticket](https://linear.app/getsentry/issue/ID-156/grouping-info-remove-type-field-from-ui)
The Type field in the Grouping Info section of the issue details page
was redundant.
This removes the Type row from all variant types while keeping the
underlying data structure intact.

before
![Screenshot 2025-06-20 at 12 00
54 PM](https://github.com/user-attachments/assets/97ca72da-0a52-4446-9825-cd4fcb505adf)

after
![Screenshot 2025-06-20 at 11 59
29 AM](https://github.com/user-attachments/assets/a4284d2b-c9f5-442f-b010-7fe72a598e39)
### Changes
Related to this PR: getsentry/sentry#93810. This
is part 1 of the change, which is pulling out the new component and just
adding it to the repo. Also includes some simplification of the logic in
the base component.

Part 2 will be replacing tables in widgets.

### Before/After

There is no UI change as the table is not being used yet. There is a new
story page for the component.
…93943)

to prevent this issue from becoming too noisy, add a noise config
Unfortunately, 'event_data' went from being the variable for current
event context to being the complete parsed data from Redis, and we
continued logging it per group.
That's more data than we should be logging even arguably once, let alone
per group.
Adds some simple analytics to our endpoint so we can begin building a
dashboard in Amplitude.
Previously, explore supported multiple y axis per chart, so each
visualize supported multiple y axis. That functionality has since been
removed for simplicity so update the types here to match. Keep in mind
that saved queries still store them as an array so when
serializing/deserializing, we still need to treat it as an array.
We'll need the `useGetTraceItemAttributeKeys` hook in other places so
refactoring it so that it can exported.
mrduncan and others added 2 commits June 20, 2025 13:20
When the max segment ID is null the process fails. We should exit early
since if there aren't any segments to delete there's nothing to do.
@github-actions
Copy link
Contributor

🚨 Warning: This pull request contains Frontend and Backend changes!

It's discouraged to make changes to Sentry's Frontend and Backend in a single pull request. The Frontend and Backend are not atomically deployed. If the changes are interdependent of each other, they must be separated into two pull requests and be made forward or backwards compatible, such that the Backend or Frontend can be safely deployed independently.

Have questions? Please ask in the #discuss-dev-infra channel.

@ai-code-review-evaluations ai-code-review-evaluations deleted a comment from claude bot Nov 19, 2025
@ai-code-review-evaluations ai-code-review-evaluations deleted a comment from claude bot Nov 20, 2025
@akshayutture-augment
Copy link
Contributor Author

@claude review this PR

@claude
Copy link

claude bot commented Nov 20, 2025

Claude finished @akshayutture-augment's task —— View job


Code Review Complete ✅

Reviewing PR: Replays Self-Serve Bulk Delete System

Scope: 100 files changed (2312 additions, 981 deletions)


Overview

This PR appears to be a test PR that combines changes from multiple unrelated features rather than focusing solely on a "Replays Self-Serve Bulk Delete System" as the title suggests. The changes span:

  • Replay delete functionality improvements
  • Browser Reporting API implementation
  • Workflow engine detector updates
  • Frontend refactoring across multiple areas (Explore, Dashboards, Codecov, etc.)
  • New table widget visualization component
  • Various test updates

Critical Issues

1. Frontend/Backend Separation Violation ⚠️

File: Multiple files across frontend and backend

The GitHub bot warning is correct - this PR mixes frontend and backend changes extensively, which violates Sentry's deployment practices. According to CLAUDE.md, "The Frontend and Backend are not atomically deployed." This needs to be split into separate PRs.

Recommendation: Split this into separate PRs for frontend and backend changes, ensuring they are forward/backward compatible.


2. Null Handling Bug Fix in Replay Delete

File: src/sentry/replays/usecases/delete.py:89-90

if row["max_segment_id"] is None:
    return []

Good: This correctly handles the case where max_segment_id is None, preventing a crash. This aligns with the commit message "fix(replay): Add handling for null max_segment_id".

Issue: The logic assumes all segments 0 to max_segment_id exist (line 100), which could delete non-existent segments. While the code notes this is acceptable as a waste, consider if this impacts performance at scale.


3. ThreadPoolExecutor with max_workers=100 ⚠️

File: src/sentry/replays/usecases/delete.py:74

with cf.ThreadPoolExecutor(max_workers=100) as pool:
    pool.map(_delete_if_exists, _make_recording_filenames(project_id, row))

Issue: Using 100 concurrent workers for blob deletion could:

  • Overwhelm the storage service (GCS)
  • Lead to rate limiting
  • Cause memory issues if many replays are deleted simultaneously

Recommendation:

  • Reduce to 10-20 workers
  • Add rate limiting
  • Monitor GCS API quota usage

4. Browser Reporting Collector - Validation Issues ⚠️

File: src/sentry/issues/endpoints/browser_reporting_collector.py:50-60

def validate_timestamp(self, value: int) -> int:
    """Validate that age is absent, but timestamp is present."""
    if self.initial_data.get("age"):
        raise serializers.ValidationError("If timestamp is present, age must be absent")
    return value

def validate_age(self, value: int) -> int:
    """Validate that age is present, but not timestamp."""
    if self.initial_data.get("timestamp"):
        raise serializers.ValidationError("If age is present, timestamp must be absent")
    return value

Issue: Mutual exclusivity validation logic is fragile. If both fields are present, both validators will raise errors, resulting in confusing error messages (as seen in test line 171-174).

Recommendation: Move this validation to a class-level validate() method to handle mutual exclusivity more cleanly.


5. Missing Error Handling in AI Summary Endpoint ⚠️

File: src/sentry/replays/endpoints/project_replay_summarize_breadcrumbs.py:184-208

def make_seer_request(request_data: str) -> bytes:
    response = requests.post(...)
    
    if response.status_code != 200:
        logger.warning(...)
    
    response.raise_for_status()
    return response.content

Issues:

  • No timeout specified on the requests.post() call - could hang indefinitely
  • No retry logic for transient failures
  • No circuit breaker for Seer service failures

Recommendation:

response = requests.post(
    ...,
    timeout=30,  # Add timeout
    # Consider using retry decorator
)

6. Security: CSRF Exemption on Public Endpoint ⚠️

File: src/sentry/issues/endpoints/browser_reporting_collector.py:87-89

@csrf_exempt
@allow_cors_options
def post(self, request: Request, *args: Any, **kwargs: Any) -> Response:

Issue: While CSRF exemption is necessary for the Browser Reporting API (which sends cross-origin requests), this endpoint:

  • Has no authentication (permission_classes = ())
  • Is open to any origin via CORS
  • Could be abused for log spam or DDoS

Recommendations:

  1. Add rate limiting per IP/origin
  2. Implement request size limits
  3. Add origin validation/allowlist
  4. Consider API key authentication
  5. Monitor for abuse patterns

Current Protection: The endpoint is behind a feature flag (issues.browser_reporting.collector_endpoint_enabled), which is good.


7. Workflow Detector - Owner Field Update Logic

File: src/sentry/workflow_engine/endpoints/validators/base/detector.py:66-79

if "owner" in validated_data:
    owner = validated_data.get("owner")
    if owner:
        if owner.is_user:
            instance.owner_user_id = owner.id
            instance.owner_team_id = None
        elif owner.is_team:
            instance.owner_user_id = None
            instance.owner_team_id = owner.id
    else:
        # Clear owner if None is passed
        instance.owner_user_id = None
        instance.owner_team_id = None

Good: This correctly handles owner updates including clearing the owner when None is passed. The logic properly maintains mutual exclusivity between user and team owners.


Frontend Issues

8. Missing Error Boundaries in New Components ⚠️

File: static/app/views/dashboards/widgets/tableWidget/tableWidgetVisualization.tsx

The new TableWidgetVisualization component lacks error boundary wrapping. While the parent might have one, complex visualization components should handle their own errors gracefully.

Recommendation: Add error boundary or ensure parent components handle errors appropriately.


9. useGetTraceItemAttributeKeys - Missing Cleanup ℹ️

File: static/app/views/explore/hooks/useGetTraceItemAttributeKeys.tsx:64-116

The getTraceItemAttributeKeys function makes async API calls but doesn't handle cleanup if the component unmounts during the request.

Recommendation: Consider using an AbortController or the built-in cancellation from React Query if converting to a hook.


10. Frontend Refactoring - Inconsistent Patterns ℹ️

Multiple frontend files show refactoring from one pattern to another (e.g., component simplification, hook usage). While individually these look fine, the scale of changes makes it hard to verify:

  • static/app/components/codecov/* - Multiple simplifications
  • static/app/views/explore/* - Extensive refactoring
  • static/app/views/dashboards/* - New table widget

Recommendation: Consider breaking frontend refactoring into smaller, focused PRs for easier review.


Test Coverage

11. Good Test Coverage for Browser Reporting

File: tests/sentry/api/endpoints/test_browser_reporting_collector.py

The test file has comprehensive coverage:

  • Tests for 404 by default
  • Valid reports (both spec formats)
  • Invalid content types
  • Multiple reports
  • Field validation (missing fields, invalid types, mutual exclusivity)

Good Practice: The use of fixtures and clear test organization.


12. Deleted Migration Test ⚠️

File: tests/sentry/migrations/test_0917_convert_org_saved_searches_to_views.py (DELETED)

The corresponding migration file shows it was an error and never ran, so deleting the test makes sense. However, ensure the corrected migration (0920 or 0921) has proper test coverage.


Performance Concerns

13. Snuba Query with Large Limits ⚠️

File: src/sentry/integrations/source_code_management/commit_context.py:699

][:OPEN_PR_MAX_RECENT_ISSUES]  # 5000 issues

Fetching 5000 group IDs from the database could be expensive. Consider:

  • Adding pagination
  • Using a more restrictive time window
  • Adding indexes on relevant columns

14. Large Parameterization Regex ℹ️

File: src/sentry/grouping/parameterization.py

The parameterization logic uses complex regex patterns that could be CPU-intensive for large strings. The use of @lru_cache for tiktoken encoding is good, but consider profiling the overall parameterization performance.


Missing Documentation

15. New Analytics Event Lacks Context ℹ️

File: src/sentry/preprod/analytics.py

class PreprodArtifactApiAssembleEvent(analytics.Event):
    type = "preprod_artifact.api.assemble"

No docstring explaining when this event is triggered or what it tracks.

Recommendation: Add docstring explaining the event's purpose and when it fires.


Positive Highlights

  1. Null handling fix in replay delete prevents crashes
  2. Comprehensive test coverage for browser reporting endpoint
  3. Proper use of feature flags for experimental features
  4. Audit logging in workflow detector changes
  5. Type safety - Good use of TypeScript and TypedDict
  6. Code organization - New table widget is well-structured
  7. Graceful error handling in many places (Sentry SDK capture, logging)

Recommendations Summary

High Priority

  1. Split this PR into separate frontend and backend PRs
  2. Reduce ThreadPoolExecutor workers from 100 to 10-20
  3. Add rate limiting and origin validation to Browser Reporting endpoint
  4. Add timeout to Seer API request

Medium Priority

  1. Fix mutual exclusivity validation in Browser Reporting serializer
  2. Add error boundaries for new visualization components
  3. Review and optimize large Snuba queries

Low Priority

  1. Add documentation for new analytics events
  2. Consider breaking up large frontend refactoring PRs
  3. Profile parameterization performance

Final Verdict

This PR contains good individual changes but violates the fundamental principle of not mixing frontend and backend changes in a single PR. The title "Replays Self-Serve Bulk Delete System" is misleading as it contains changes across many unrelated features.

Recommendation:

  • Split into at least 2-3 focused PRs
  • Address the security and performance concerns noted above
  • Ensure proper testing of the migration changes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.