ref(crons): Reorganize incident creation / issue occurrence logic #8
everettbu wants to merge 1 commit into monitor-incident-refactor-before
…0528) Since we'll be doing more with issue occurrences, split out the concept of incidents into its own logic module, and incident_occurrence into its own module. Part of GH-80527
Greptile Summary
This PR implements a significant refactoring of the monitor incident creation and issue occurrence logic within Sentry's cron monitoring system. The changes extract complex functionality from the monolithic mark_failed.py file into two new dedicated modules:
• incidents.py: Contains the try_incident_threshold function, which centralizes incident creation logic and handles the different monitor status states and failure thresholds
• incident_occurrence.py: Houses functions for creating issue occurrences in Kafka, generating human-readable failure messages, and building monitor environment context
The refactoring also introduces a new SimpleCheckIn TypedDict in types.py to optimize database queries by using lightweight dictionary objects instead of full Django model instances when processing check-ins.
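A minimal sketch of the idea behind SimpleCheckIn, assuming it carries only an id, a timestamp, and a status (the field names here are an assumption, not taken from the PR). Plain dictionaries stand in for the rows that a Django .values() query would return, so the example runs without a database:

```python
from datetime import datetime, timezone
from typing import TypedDict


class SimpleCheckIn(TypedDict):
    """Lightweight check-in record. Field names are assumed for illustration."""

    id: int
    date_added: datetime
    status: int


def latest_checkins(rows: list[dict], limit: int) -> list[SimpleCheckIn]:
    """Imitate ordering by newest first and selecting only three columns.

    `rows` stands in for database rows; extra columns are dropped, which is
    the point of using a small typed dict instead of a full model instance.
    """
    newest_first = sorted(rows, key=lambda r: r["date_added"], reverse=True)
    return [
        SimpleCheckIn(id=r["id"], date_added=r["date_added"], status=r["status"])
        for r in newest_first[:limit]
    ]


# Hypothetical rows with an extra "duration" column that is not needed here.
ROWS = [
    {"id": 1, "date_added": datetime(2024, 1, 1, tzinfo=timezone.utc), "status": 1, "duration": 120},
    {"id": 2, "date_added": datetime(2024, 1, 2, tzinfo=timezone.utc), "status": 2, "duration": 95},
    {"id": 3, "date_added": datetime(2024, 1, 3, tzinfo=timezone.utc), "status": 2, "duration": 80},
]

recent = latest_checkins(ROWS, limit=2)
```

Because a TypedDict is an ordinary dict at runtime, the type only documents and checks the shape; it adds no per-row construction cost beyond building the dictionary itself.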
The main mark_failed function now delegates incident threshold handling to the new try_incident_threshold function, creating a cleaner separation of concerns. This modular approach improves code organization by giving each module a single, focused responsibility: mark_failed.py handles monitor state updates, incidents.py manages incident threshold logic, and incident_occurrence.py handles issue creation and messaging.
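The threshold decision described above can be sketched as a pure function. This is a simplified illustration, not the PR's try_incident_threshold: the Status enum, the function name should_open_incident, and the rule "the newest N check-ins all failed" are assumptions made for the example, and the real function also handles monitor status transitions.

```python
from enum import Enum
from typing import Sequence


class Status(Enum):
    """Hypothetical check-in statuses for this sketch."""

    OK = "ok"
    ERROR = "error"
    MISSED = "missed"


def should_open_incident(recent: Sequence[Status], failure_threshold: int) -> bool:
    """Return True when the newest `failure_threshold` check-ins all failed.

    `recent` must be ordered newest first. A threshold below 1 is rejected
    because it would open an incident without any failure.
    """
    if failure_threshold < 1:
        raise ValueError("failure_threshold must be at least 1")
    window = recent[:failure_threshold]
    # Not enough history yet: do not open an incident.
    if len(window) < failure_threshold:
        return False
    return all(status is not Status.OK for status in window)
```

Keeping the decision free of database access makes it straightforward to unit test, which is one of the benefits the summary attributes to splitting the logic into its own module.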
This refactoring maintains identical functionality while making the codebase more maintainable and testable and bringing it closer to the single responsibility principle. The changes also include a performance optimization: check-in queries use .values() with the new SimpleCheckIn type instead of loading full model instances.
Confidence score: 3/5
• This refactoring appears safe but has some potential concurrency and database consistency concerns that need attention
• The score reflects database transaction safety issues in incidents.py and a potential bug in incident_occurrence.py, plus the lack of comprehensive testing for concurrent scenarios
• Files needing more attention: src/sentry/monitors/logic/incidents.py for database race conditions and src/sentry/monitors/logic/incident_occurrence.py for the context building bug
4 files reviewed, 2 comments
| def get_failure_reason(failed_checkins: Sequence[SimpleCheckIn]): | ||
| """ | ||
| Builds a humam readible string from a list of failed check-ins. |
syntax: 'humam' is misspelled and should be 'human'; 'readible' should likewise be 'readable'
| Builds a humam readible string from a list of failed check-ins. | |
| Builds a human-readable string from a list of failed check-ins. |
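For context, a function with this docstring could look roughly like the sketch below. It is not the PR's get_failure_reason: the name describe_failures, the use of status strings instead of SimpleCheckIn records, and the exact message wording are all assumptions for illustration.

```python
from collections import Counter
from typing import Sequence


def describe_failures(statuses: Sequence[str]) -> str:
    """Build a human-readable summary from failed check-in status labels.

    Statuses are counted and reported in order of first appearance.
    """
    if not statuses:
        return "No failed check-ins detected"
    counts = Counter(statuses)  # preserves first-seen order of keys
    parts = [f"{count} {status}" for status, count in counts.items()]
    noun = "check-in" if len(statuses) == 1 else "check-ins"
    return f"{' and '.join(parts)} {noun} detected"
```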
| "id": str(monitor_environment.monitor.guid), | ||
| "slug": str(monitor_environment.monitor.slug), | ||
| "name": monitor_environment.monitor.name, | ||
| "config": monitor_environment.monitor.config, |
logic: Using monitor_environment.monitor.config instead of the modified config variable defeats the purpose of the schedule_type transformation on line 162
| "config": monitor_environment.monitor.config, | |
| "config": config, |
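The pattern behind this suggestion can be sketched as follows. The names build_monitor_context and SCHEDULE_TYPE_LABELS, the numeric schedule types, and the dictionary-shaped monitor are hypothetical; only the id, slug, name, and config keys come from the diff above. The point is that the transformed copy, not the stored config, must be what ends up in the context.

```python
from typing import Any

# Hypothetical mapping from a stored schedule type to a display label.
SCHEDULE_TYPE_LABELS = {1: "crontab", 2: "interval"}


def build_monitor_context(monitor: dict[str, Any]) -> dict[str, Any]:
    """Build a context dict whose config carries a readable schedule_type."""
    # Copy first so the stored configuration is never mutated.
    config = dict(monitor["config"])
    if "schedule_type" in config:
        raw = config["schedule_type"]
        # Fall back to the raw value when no label is known.
        config["schedule_type"] = SCHEDULE_TYPE_LABELS.get(raw, raw)
    return {
        "id": str(monitor["guid"]),
        "slug": str(monitor["slug"]),
        "name": monitor["name"],
        # Use the transformed copy. Passing monitor["config"] here would
        # silently discard the schedule_type conversion above.
        "config": config,
    }
```

A test that asserts on the schedule_type value inside the returned context would catch the original mistake, since the untransformed value would leak through.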
Test 8