Skip to content

Conversation

@hussam789
Copy link

@hussam789 hussam789 commented Oct 30, 2025

User description

PR #8


PR Type

Enhancement


Description

  • Extract incident creation logic into dedicated incidents.py module

  • Extract incident occurrence logic into dedicated incident_occurrence.py module

  • Move SimpleCheckIn type definition to types.py for reusability

  • Simplify mark_failed.py by delegating to new try_incident_threshold function


Diagram Walkthrough

flowchart LR
  A["mark_failed.py"] -- "calls" --> B["try_incident_threshold"]
  B -- "in incidents.py" --> C["Incident Logic"]
  C -- "calls" --> D["create_incident_occurrence"]
  D -- "in incident_occurrence.py" --> E["Occurrence Logic"]
  F["types.py"] -- "defines" --> G["SimpleCheckIn"]
  C -- "uses" --> G
  E -- "uses" --> G
Loading

File Walkthrough

Relevant files
Enhancement
incident_occurrence.py
New incident occurrence creation logic module                       

src/sentry/monitors/logic/incident_occurrence.py

  • New module containing incident occurrence creation logic
  • Implements create_incident_occurrence() function to produce Kafka
    events
  • Includes helper functions get_failure_reason() and
    get_monitor_environment_context()
  • Defines mappings for human-readable failure status messages
+171/-0 
incidents.py
New incident threshold and creation logic module                 

src/sentry/monitors/logic/incidents.py

  • New module containing incident threshold and creation logic
  • Implements try_incident_threshold() function to handle incident state
    transitions
  • Manages monitor environment status updates and incident creation
  • Delegates to create_incident_occurrence() for event production
+104/-0 
mark_failed.py
Refactor to delegate incident logic to new modules             

src/sentry/monitors/logic/mark_failed.py

  • Removed 254 lines of incident and occurrence creation logic
  • Removed SimpleCheckIn TypedDict definition
  • Removed helper functions: get_failure_reason(),
    get_monitor_environment_context()
  • Removed status mapping constants
  • Simplified to delegate incident handling to try_incident_threshold()
    function
  • Reduced imports to only necessary dependencies
+4/-264 
types.py
Add SimpleCheckIn type definition for reuse                           

src/sentry/monitors/types.py

  • Added SimpleCheckIn TypedDict definition with fields: id, date_added,
    status
  • Includes docstring describing it as a stripped down check-in object
  • Enables reuse across multiple logic modules
+10/-0   

…0528)

Since we'll be doing more with issue occurrences split out the concept
of incidents into it's own logic module, as well as incident_occurrence
into it's own module

Part of GH-80527
@qodo-code-review
Copy link

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
🟢
No security concerns identified No security vulnerabilities detected by AI analysis. Human verification advised for critical code.
Ticket Compliance
🎫 No ticket provided
  • Create ticket/issue
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status: Passed

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status: Passed

Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status: Passed

Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status:
Audit context: The new occurrence production and incident state transitions do not add explicit audit
logs capturing actor/user, action, and outcome, relying on existing signals/logging that
are not visible in this diff.

Referred Code
def create_incident_occurrence(
    failed_checkins: Sequence[SimpleCheckIn],
    failed_checkin: MonitorCheckIn,
    incident: MonitorIncident,
    received: datetime | None,
) -> None:
    from sentry.issues.issue_occurrence import IssueEvidence, IssueOccurrence
    from sentry.issues.producer import PayloadType, produce_occurrence_to_kafka

    monitor_env = failed_checkin.monitor_environment

    if monitor_env is None:
        return

    current_timestamp = datetime.now(timezone.utc)

    # Get last successful check-in to show in evidence display
    last_successful_checkin_timestamp = "Never"
    last_successful_checkin = monitor_env.get_last_successful_checkin()
    if last_successful_checkin:
        last_successful_checkin_timestamp = last_successful_checkin.date_added.isoformat()


 ... (clipped 65 lines)
Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Missing try/except: Kafka production and DB lookups lack explicit error handling or fallback behavior, which
could cause silent failures if upstream functions raise exceptions.

Referred Code
def create_incident_occurrence(
    failed_checkins: Sequence[SimpleCheckIn],
    failed_checkin: MonitorCheckIn,
    incident: MonitorIncident,
    received: datetime | None,
) -> None:
    from sentry.issues.issue_occurrence import IssueEvidence, IssueOccurrence
    from sentry.issues.producer import PayloadType, produce_occurrence_to_kafka

    monitor_env = failed_checkin.monitor_environment

    if monitor_env is None:
        return

    current_timestamp = datetime.now(timezone.utc)

    # Get last successful check-in to show in evidence display
    last_successful_checkin_timestamp = "Never"
    last_successful_checkin = monitor_env.get_last_successful_checkin()
    if last_successful_checkin:
        last_successful_checkin_timestamp = last_successful_checkin.date_added.isoformat()


 ... (clipped 65 lines)
Compliance status legend 🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

@qodo-code-review
Copy link

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Avoid creating multiple redundant occurrences

To avoid creating multiple redundant occurrences for a single monitor failure,
call create_incident_occurrence only once using the triggering failed_checkin,
instead of iterating through all previous failed check-ins.

src/sentry/monitors/logic/incidents.py [92-100]

 if not monitor_env.monitor.is_muted and not monitor_env.is_muted and incident:
-    checkins = MonitorCheckIn.objects.filter(id__in=[c["id"] for c in previous_checkins])
-    for checkin in checkins:
-        create_incident_occurrence(
-            previous_checkins,
-            checkin,
-            incident,
-            received=received,
-        )
+    create_incident_occurrence(
+        previous_checkins,
+        failed_checkin,
+        incident,
+        received=received,
+    )
  • Apply / Chat
Suggestion importance[1-10]: 8

__

Why: The suggestion correctly identifies that the code creates multiple redundant occurrences for a single monitor failure, and the proposed fix correctly simplifies the logic to create only one.

Medium
Use modified config in return value

In get_monitor_environment_context, return the modified config variable instead
of the original monitor_environment.monitor.config to ensure the human-readable
schedule_type is used.

src/sentry/monitors/logic/incident_occurrence.py [159-171]

 def get_monitor_environment_context(monitor_environment: MonitorEnvironment):
     config = monitor_environment.monitor.config.copy()
     if "schedule_type" in config:
         config["schedule_type"] = monitor_environment.monitor.get_schedule_type_display()
 
     return {
         "id": str(monitor_environment.monitor.guid),
         "slug": str(monitor_environment.monitor.slug),
         "name": monitor_environment.monitor.name,
-        "config": monitor_environment.monitor.config,
+        "config": config,
         "status": monitor_environment.get_status_display(),
         "type": monitor_environment.monitor.get_type_display(),
     }
  • Apply / Chat
Suggestion importance[1-10]: 7

__

Why: The suggestion correctly identifies a bug where the modified config is discarded, causing the function to return the original, unmodified configuration.

Medium
  • More

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants