Skip to content

Conversation

@akshayutture-augment
Copy link
Contributor

…0528)

Since we'll be doing more with issue occurrences split out the concept
of incidents into it's own logic module, as well as incident_occurrence
into it's own module

Part of GH-80527
@akshayutture-augment
Copy link
Contributor Author

@claude review this PR

@claude
Copy link

claude bot commented Nov 20, 2025

Claude finished @akshayutture-augment's task —— View job


PR Review Complete

Todo List

  • Check for repository guidelines (CLAUDE.md)
  • Read changed files to understand the refactoring
  • Analyze code structure and logic
  • Provide comprehensive review feedback

Summary

This PR performs a clean refactoring that extracts incident and occurrence logic from mark_failed.py into two new, focused modules. The refactoring improves code organization and maintainability while preserving existing functionality.

Overall Assessment: ✅ Approved

The refactoring is well-executed with proper separation of concerns. The code is functionally equivalent to the original with one notable improvement: the function name change from create_issue_platform_occurrence to create_incident_occurrence is more semantic and descriptive.


Detailed Review

✅ What Was Done Well

  1. Clear Separation of Concerns: The refactoring properly separates three distinct responsibilities:

    • mark_failed.py: Monitor environment state management
    • incidents.py: Incident threshold detection and creation logic
    • incident_occurrence.py: Issue occurrence creation and reporting
  2. Function Renaming Improvement: Changed create_issue_platform_occurrencecreate_incident_occurrence (more descriptive and clearer)

  3. Type Organization: Moving SimpleCheckIn TypedDict to types.py is appropriate since it's used across multiple modules

  4. Import Cleanup: Removed unnecessary imports from mark_failed.py after extracting the code

  5. Preserved Functionality: The refactoring maintains exact behavioral equivalence - just moves code without changing logic


Issues Found

🐛 Typo in Docstring

Location: src/sentry/monitors/logic/incident_occurrence.py:131

def get_failure_reason(failed_checkins: Sequence[SimpleCheckIn]):
    """
    Builds a humam readible string from a list of failed check-ins.

Issue: "humam readible" should be "human readable"

Severity: Low (cosmetic)


Observations & Questions

1. Loop Over Checkins in try_incident_threshold

Location: src/sentry/monitors/logic/incidents.py:93-100

if not monitor_env.monitor.is_muted and not monitor_env.is_muted and incident:
    checkins = MonitorCheckIn.objects.filter(id__in=[c["id"] for c in previous_checkins])
    for checkin in checkins:
        create_incident_occurrence(
            previous_checkins,
            checkin,
            incident,
            received=received,
        )

Question: Is it intentional to create an occurrence for each check-in in the threshold window?

  • When failure_issue_threshold > 1, this creates multiple occurrences for the same incident
  • For example, if threshold is 3, this creates 3 separate occurrences all referencing the same failed_checkins list and incident
  • Is this for creating multiple events in the issue platform, or should it only create one occurrence for the incident?

Impact: This could result in duplicate notifications or issue events if unintentional


2. Potential Edge Case: Empty Check-ins After Filter

Location: src/sentry/monitors/logic/incidents.py:93

checkins = MonitorCheckIn.objects.filter(id__in=[c["id"] for c in previous_checkins])

Observation: If any check-ins in previous_checkins have been deleted between constructing the list and this query, the checkins QuerySet could have fewer items than previous_checkins

Question: Is there any scenario where check-ins could be deleted during this execution window? If so, should there be validation that checkins.count() == len(previous_checkins)?

Likelihood: Very low, but worth considering for data integrity


3. Code Quality: Import Organization

Location: src/sentry/monitors/logic/incident_occurrence.py:34-35

def create_incident_occurrence(...):
    from sentry.issues.issue_occurrence import IssueEvidence, IssueOccurrence
    from sentry.issues.producer import PayloadType, produce_occurrence_to_kafka

Observation: These imports are inside the function. This is a valid pattern to avoid circular imports, but worth documenting if that's the reason.

Suggestion: If there's no circular dependency issue, moving these to top-level imports would be more conventional Python style.


Testing Considerations

Since this is a refactoring without logic changes, the existing test suite should provide coverage. However, ensure:

  1. ✅ Unit tests for mark_failed() still pass
  2. ✅ Integration tests for the incident creation flow still pass
  3. ✅ Tests verify that occurrences are created correctly with the new module structure
  4. ⚠️ Consider adding a test specifically for the multi-occurrence behavior mentioned in observation chore(deps): bump react-mentions and @types/react-mentions #1 to document the expected behavior

Architecture & Design

Positive: This refactoring aligns well with the stated goal of "doing more with issue occurrences" (per the commit message referencing GH-80527). The new module structure makes it easier to:

  • Add new occurrence types
  • Modify incident detection logic independently
  • Test each component in isolation
  • Maintain and reason about the codebase

Final Recommendation

Approve with Minor Fixes: The refactoring is solid and improves code organization. The only required change is fixing the typo. The observations about occurrence creation behavior should be validated to ensure the current behavior is intentional, but they don't block merging if this matches the expected design.


Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants