Commit 8ceafa1

JIRA <-> IMPACT sync for Status and Priority (#209)
* Impact Jira double sync --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
1 parent 9905833 commit 8ceafa1

11 files changed

+1160
-199
lines changed

docs/architecture/jira-integration.md

Lines changed: 86 additions & 55 deletions
Original file line number | Diff line number | Diff line change
@@ -5,9 +5,16 @@
55
The RAID module provides comprehensive bidirectional synchronization between Impact incidents and JIRA tickets, ensuring data consistency across both platforms.
66

77
**Applies to all P1-P5**:
8+
89
- All priorities create both `Incident` objects AND JIRA tickets
910
- The JIRA integration works identically for all priorities
1011

12+
**Double sync (both directions)**:
13+
14+
- Impact → Jira: on incident updates (status, priority, title, description, commander) via `incident_updated` signals; admin saves fall back to post_save handlers for status/priority
15+
- Jira → Impact: on Jira webhooks (status, priority, mapped fields) via webhook handlers
16+
- Loop-prevention cache ensures a change coming from one side is not re-sent back immediately
17+
1118
See [incident-workflow.md](incident-workflow.md) for architecture overview.
1219

1320
## Synchronization Architecture
@@ -37,25 +44,29 @@ See [incident-workflow.md](incident-workflow.md) for architecture overview.
3744
Centralizes all JIRA field preparation for both P1-P3 and P4-P5 workflows.
3845

3946
**P1-P3 (Critical)**:
47+
4048
- Trigger: `incident_channel_done` signal
4149
- Handler: `src/firefighter/raid/signals/incident_created.py`
4250
- Flow: Create Incident → Create Slack channel → Signal triggers JIRA ticket
4351

4452
**P4-P5 (Normal)**:
53+
4554
- Trigger: Form submission
4655
- Handler: `UnifiedIncidentForm._trigger_normal_incident_workflow()`
4756
- Flow: Direct call to `jira_client.create_issue()`
4857

4958
### Custom Fields Mapping
5059

5160
**Always Passed**:
61+
5262
- `customfield_11049` (environments): List of env values (PRD, STG, INT)
5363
- P1-P3: First environment only
5464
- P4-P5: All selected environments
5565
- `customfield_10201` (platform): Platform value (platform-FR, platform-All, etc.)
5666
- `customfield_10936` (business_impact): Computed from impacts_data
5767

5868
**Impact-Specific**:
69+
5970
- Customer: `zendesk_ticket_id`
6071
- Seller: `seller_contract_id`, `zoho_desk_ticket_id`, `is_key_account`, `is_seller_in_golden_list`
6172
- P4-P5: `suggested_team_routing`
@@ -66,117 +77,114 @@ Centralizes all JIRA field preparation for both P1-P3 and P4-P5 workflows.
6677

6778
### Impact → JIRA Sync
6879

69-
**Trigger**: Incident field updates in Impact
70-
**Handler**: `sync_incident_changes_to_jira()`
80+
**Trigger**: Incident field updates in Impact (via `incident_updated` with `updated_fields`), plus admin saves via post_save fallbacks for status/priority.
81+
82+
**Handlers**: `incident_updated_close_ticket_when_mitigated_or_postmortem` (status), `incident_updated_sync_priority_to_jira` (priority), post_save fallbacks for both.
7183

7284
**Syncable Fields**:
85+
7386
- `title` → `summary`
7487
- `description` → `description`
75-
- `priority` → `priority` (with value mapping)
76-
- `status` → `status` (with transitions)
88+
- `priority` → Jira `customfield_11064` (numeric 1–5, or mapped option)
89+
- `status` → Jira status (transitions via workflow)
7790
- `commander` → `assignee`
7891

7992
**Process**:
93+
8094
1. Check if RAID is enabled
81-
2. Validate update_fields parameter
82-
3. Filter for syncable fields only
83-
4. Apply loop prevention cache
84-
5. Call `sync_incident_to_jira()`
95+
2. Validate/update_fields
96+
3. Apply loop prevention
97+
4. Push status (Impact→Jira map)
98+
5. Push priority to Jira `customfield_11064`
8599

86100
### JIRA → Impact Sync
87101

88102
**Trigger**: JIRA webhook updates
103+
89104
**Handler**: `handle_jira_webhook_update()`
90105

91106
**Process**:
107+
92108
1. Parse webhook changelog data
93109
2. Identify changed fields
94-
3. Apply appropriate sync functions:
95-
- `sync_jira_status_to_incident()`
96-
- `sync_jira_priority_to_incident()`
97-
- `sync_jira_fields_to_incident()`
110+
3. For each changelog item, a single helper `_sync_jira_fields_to_incident()` handles:
111+
- Loop-prevention check (skips if mirrored Impact→Jira change)
112+
- Slack alert for the item
113+
- Status updates via `_handle_status_update`
114+
- Priority updates via `_handle_priority_update`
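The per-item flow described in the new step 3 can be sketched roughly as follows. This is an illustration only, not the project's `_sync_jira_fields_to_incident()`: `process_changelog`, the `handlers` mapping and the `is_mirrored` callback are hypothetical names, the Slack alert step is omitted, and the changelog field names used as keys are assumptions. The `changelog.items[].field` / `toString` shape follows the usual Jira webhook payload.

```python
from typing import Callable


def process_changelog(
    payload: dict,
    handlers: dict[str, Callable[[str], None]],
    is_mirrored: Callable[[str, str], bool],
) -> list[str]:
    """Apply one handler per changelog item and return the fields that were applied.

    `handlers` maps a changelog field name to a callable (hypothetical stand-ins
    for `_handle_status_update` / `_handle_priority_update`).
    `is_mirrored` is a hypothetical loop-prevention check.
    """
    applied: list[str] = []
    for item in payload.get("changelog", {}).get("items", []):
        field = item.get("field")
        value = item.get("toString")
        handler = handlers.get(field)
        if handler is None:
            continue  # field is not synced back to Impact
        if is_mirrored(field, value):
            continue  # change originated from Impact; do not echo it back
        handler(value)
        applied.append(field)
    return applied
```

Keeping the loop-prevention check ahead of the handler call means a mirrored change never reaches the status or priority handlers at all.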
98115

99116
## Field Mapping
100117

101118
### Status Mapping
102119

103120
**JIRA → Impact**:
121+
104122
```python
105123
JIRA_TO_IMPACT_STATUS_MAP = {
106-
"Open": IncidentStatus.INVESTIGATING,
107-
"To Do": IncidentStatus.INVESTIGATING,
108-
"In Progress": IncidentStatus.MITIGATING,
109-
"In Review": IncidentStatus.MITIGATING,
110-
"Resolved": IncidentStatus.MITIGATED,
111-
"Done": IncidentStatus.MITIGATED,
112-
"Closed": IncidentStatus.POST_MORTEM,
113-
"Reopened": IncidentStatus.INVESTIGATING,
114-
"Blocked": IncidentStatus.MITIGATING,
115-
"Waiting": IncidentStatus.MITIGATING,
124+
"Incoming": IncidentStatus.OPEN,
125+
"Pending resolution": IncidentStatus.OPEN,
126+
"in progress": IncidentStatus.MITIGATING,
127+
"Reporter validation": IncidentStatus.MITIGATED,
128+
"Closed": IncidentStatus.CLOSED,
116129
}
117130
```
118131

119132
**Impact → JIRA**:
133+
120134
```python
121135
IMPACT_TO_JIRA_STATUS_MAP = {
122-
IncidentStatus.OPEN: "Open",
123-
IncidentStatus.INVESTIGATING: "In Progress",
124-
IncidentStatus.MITIGATING: "In Progress",
125-
IncidentStatus.MITIGATED: "Resolved",
126-
IncidentStatus.POST_MORTEM: "Closed",
136+
IncidentStatus.OPEN: "Incoming",
137+
IncidentStatus.INVESTIGATING: "in progress",
138+
IncidentStatus.MITIGATING: "in progress",
139+
IncidentStatus.MITIGATED: "Reporter validation",
140+
IncidentStatus.POST_MORTEM: "Reporter validation",
127141
IncidentStatus.CLOSED: "Closed",
128142
}
129143
```
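One consequence of the two new maps is that they are not exact inverses: several Impact statuses share a Jira status. The sketch below uses a stand-in `IncidentStatus` enum (its numeric values are assumptions, not the project's) to show which statuses would change if a value were pushed to Jira and then read back without loop prevention.

```python
from enum import Enum


class IncidentStatus(Enum):
    # Stand-in for the real enum; the numeric values are illustrative assumptions.
    OPEN = 10
    INVESTIGATING = 20
    MITIGATING = 30
    MITIGATED = 40
    POST_MORTEM = 50
    CLOSED = 60


JIRA_TO_IMPACT_STATUS_MAP = {
    "Incoming": IncidentStatus.OPEN,
    "Pending resolution": IncidentStatus.OPEN,
    "in progress": IncidentStatus.MITIGATING,
    "Reporter validation": IncidentStatus.MITIGATED,
    "Closed": IncidentStatus.CLOSED,
}

IMPACT_TO_JIRA_STATUS_MAP = {
    IncidentStatus.OPEN: "Incoming",
    IncidentStatus.INVESTIGATING: "in progress",
    IncidentStatus.MITIGATING: "in progress",
    IncidentStatus.MITIGATED: "Reporter validation",
    IncidentStatus.POST_MORTEM: "Reporter validation",
    IncidentStatus.CLOSED: "Closed",
}


def round_trip(status: IncidentStatus) -> IncidentStatus:
    """Map Impact -> Jira -> Impact using the two tables above."""
    return JIRA_TO_IMPACT_STATUS_MAP[IMPACT_TO_JIRA_STATUS_MAP[status]]


# Statuses that would not survive an echo from Jira unchanged.
lossy = {s.name: round_trip(s).name for s in IncidentStatus if round_trip(s) is not s}
```

Here `lossy` contains `INVESTIGATING` (which comes back as `MITIGATING`) and `POST_MORTEM` (which comes back as `MITIGATED`), which is one reason the mirrored webhook has to be skipped rather than re-applied.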
130144

131145
### Priority Mapping
132146

133147
**JIRA → Impact**:
134-
```python
135-
JIRA_TO_IMPACT_PRIORITY_MAP = {
136-
"Highest": 1, # P1 - Critical
137-
"High": 2, # P2 - High
138-
"Medium": 3, # P3 - Medium
139-
"Low": 4, # P4 - Low
140-
"Lowest": 5, # P5 - Lowest
141-
}
142-
```
148+
149+
Uses the numeric Jira priority (1–5) and writes directly to Impact.
150+
151+
**Impact → JIRA**:
152+
153+
Uses the numeric Impact priority (1–5) and writes directly to Jira `customfield_11064`. Admin saves and UI edits both sync via signals/post_save fallback.
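Because both directions now pass the numeric priority straight through, some validation of the incoming value seems prudent. A minimal sketch, assuming a hypothetical helper `parse_priority` (not part of the codebase) that accepts only integers from 1 to 5 and rejects everything else:

```python
from typing import Optional


def parse_priority(raw: object) -> Optional[int]:
    """Return the priority as an int in 1..5, or None if the value is unusable."""
    try:
        value = int(str(raw).strip())
    except ValueError:
        return None  # e.g. a named option such as "High" or an empty string
    return value if 1 <= value <= 5 else None
```

A caller could skip the sync and log a warning when `None` comes back, in line with the "skip invalid fields" behaviour described under Graceful Degradation.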
143154

144155
## Loop Prevention
145156

146157
### Cache-Based Mechanism
147158

148159
**Function**: `should_skip_sync()`
160+
149161
**Cache Key Format**: `sync:{entity_type}:{entity_id}:{direction}`
162+
150163
**Timeout**: 30 seconds
151164

152165
**Process**:
166+
153167
1. Check if sync recently performed
154168
2. Set cache flag during sync
155169
3. Automatic expiration prevents permanent blocks
156170

157-
### Sync Directions
158-
159-
```python
160-
class SyncDirection(Enum):
161-
IMPACT_TO_JIRA = "impact_to_jira"
162-
JIRA_TO_IMPACT = "jira_to_impact"
163-
IMPACT_TO_SLACK = "impact_to_slack"
164-
SLACK_TO_IMPACT = "slack_to_impact"
165-
```
171+
**Webhook bounce guard**: Impact→Jira writes a short-lived cache key per change (`sync:impact_to_jira:{incident_id}:{field}:{value}`). Jira webhook processing checks and clears that key to skip the mirrored change, preventing loops for status and priority (including `customfield_11064`).
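The bounce guard can be sketched as a set-then-consume pattern. The example below is a simplified illustration: `TTLCache` is an in-memory stand-in for the Django cache, the function names are hypothetical, and reusing the 30-second timeout for the bounce key is an assumption (the text only says the key is short-lived). The key format is the one quoted above.

```python
import time


class TTLCache:
    """Tiny in-memory stand-in for the Django cache (illustration only)."""

    def __init__(self) -> None:
        self._store: dict = {}

    def set(self, key: str, value: object, timeout: float) -> None:
        self._store[key] = (value, time.monotonic() + timeout)

    def get(self, key: str) -> object:
        item = self._store.get(key)
        if item is None:
            return None
        value, expires_at = item
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired entries never block a sync
            return None
        return value

    def delete(self, key: str) -> None:
        self._store.pop(key, None)


cache = TTLCache()
SYNC_TIMEOUT_SECONDS = 30  # assumption: same timeout as the general sync cache


def bounce_key(incident_id: int, field: str, value: object) -> str:
    return f"sync:impact_to_jira:{incident_id}:{field}:{value}"


def mark_pushed_to_jira(incident_id: int, field: str, value: object) -> None:
    """Call right after Impact writes a change to Jira."""
    cache.set(bounce_key(incident_id, field, value), True, timeout=SYNC_TIMEOUT_SECONDS)


def is_mirrored_change(incident_id: int, field: str, value: object) -> bool:
    """Call from webhook processing; consumes the key so only one echo is skipped."""
    key = bounce_key(incident_id, field, value)
    if cache.get(key):
        cache.delete(key)
        return True
    return False
```

Including the value in the key means a genuine Jira-side change to a different value within the timeout window is still synced to Impact.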
166172

167173
## Error Handling
168174

169175
### Transaction Management
170176

171-
- All sync operations wrapped in `transaction.atomic()`
172-
- Rollback on any failure
177+
- Incident update creation uses `transaction.atomic()` (in `Incident.create_incident_update`) to ensure `IncidentUpdate` persistence and recovered-event handling.
178+
- Jira webhook and Impact→Jira signal handlers are not wrapped in `transaction.atomic()` today (they call out to Jira directly).
179+
- Rollback on any failure where wrapped; external Jira calls are best-effort and may partially succeed.
173180
- Detailed error logging with context
174181

175182
### Graceful Degradation
176183

177-
- Missing JIRA tickets: Log warning, continue
178-
- Field validation errors: Skip invalid fields
179-
- Network failures: Retry mechanism via Celery
184+
- Missing JIRA tickets: Log warning and continue (no rollback).
185+
- Field validation errors: Skip invalid fields (best-effort persist).
186+
- Jira/Slack calls: Best-effort with exception logging; no automatic retry in the sync handlers today.
187+
- Celery retries apply only to the dedicated Celery tasks (not the webhook/signal handlers).
180188

181189
## IncidentUpdate Integration
182190

@@ -194,6 +202,7 @@ class SyncDirection(Enum):
194202
### Loop Detection for IncidentUpdates
195203

196204
**Pattern**: Updates created by sync have:
205+
197206
- `created_by = None` (system update)
198207
- `message` contains "from Jira"
199208

@@ -253,23 +262,26 @@ def test_sync_incident_changes(self, mock_sync):
253262

254263
## Jira Post-Mortem Integration
255264

256-
### Overview
265+
### Post-mortem Overview
257266

258267
The Jira post-mortem feature creates dedicated post-mortem issues in Jira when an incident moves to the POST_MORTEM status. This provides a structured place to document root causes, impacts, and mitigation actions.
259268

260269
### Architecture
261270

262271
**Service Layer**: `src/firefighter/jira_app/service_postmortem.py`
272+
263273
- `JiraPostMortemService` - Main service for creating post-mortems
264274
- `create_postmortem_for_incident()` - Creates Jira post-mortem issue
265275
- `_generate_issue_fields()` - Generates content from templates
266276

267277
**Jira Client**: `src/firefighter/jira_app/client.py`
278+
268279
- `create_postmortem_issue()` - Creates the Jira issue
269280
- `_create_issue_link_safe()` - Links post-mortem to incident ticket (robust with fallbacks)
270281
- `assign_issue()` - Assigns to incident commander (graceful failure)
271282

272283
**Signal Handlers**: `src/firefighter/jira_app/signals/`
284+
273285
- `postmortem_created.py` - Triggers post-mortem creation on incident status change
274286
- `incident_key_events_updated.py` - Syncs key events from Slack to Jira timeline
275287

@@ -278,6 +290,7 @@ The Jira post-mortem feature creates dedicated post-mortem issues in Jira when a
278290
1. **Trigger**: Incident status changes to `POST_MORTEM` (via Slack modal or direct status update)
279291
2. **Signal**: `postmortem_created` signal sent with incident data
280292
3. **Content Generation**: Templates rendered with incident data:
293+
281294
- `incident_summary.txt` - Priority, category, created_at (excludes Status/Created)
282295
- `timeline.txt` - Chronological list of status changes and key events
283296
- `impact.txt` - Business impact description
@@ -294,13 +307,15 @@ The Jira post-mortem feature creates dedicated post-mortem issues in Jira when a
294307
### Issue Linking Strategy
295308

296309
**Problem**: Jira parent-child relationships have strict hierarchy rules. Setting a parent field can fail with:
297-
```
310+
311+
```text
298312
{"errors":{"parentId":"Given parent work item does not belong to appropriate hierarchy."}}
299313
```
300314

301315
**Solution**: Use flexible issue links instead of parent-child relationships.
302316

303317
**Implementation** (`_create_issue_link_safe()`):
318+
304319
1. Validate both issues exist
305320
2. Try multiple link types in order of preference:
306321
- "Relates" (standard bidirectional link)
@@ -310,6 +325,7 @@ The Jira post-mortem feature creates dedicated post-mortem issues in Jira when a
310325
4. Post-mortem creation always succeeds even if linking fails
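The fallback idea behind `_create_issue_link_safe()` can be sketched as below. This is not the client's actual code: `create_issue_link_with_fallback` and the injected `link_fn` are hypothetical, and the list of link types is supplied by the caller because only "Relates" is visible in this excerpt.

```python
import logging
from typing import Callable, Optional, Sequence

logger = logging.getLogger(__name__)


def create_issue_link_with_fallback(
    link_fn: Callable[[str, str, str], None],
    postmortem_key: str,
    incident_key: str,
    link_types: Sequence[str],
) -> Optional[str]:
    """Try each link type in order; return the one that worked, or None.

    `link_fn(link_type, inward_key, outward_key)` is a hypothetical callable
    wrapping whatever Jira client call actually creates the link.
    """
    for link_type in link_types:
        try:
            link_fn(link_type, postmortem_key, incident_key)
        except Exception:  # broad on purpose: linking is best-effort
            logger.warning(
                "Link type %r failed for %s -> %s", link_type, postmortem_key, incident_key
            )
            continue
        return link_type
    # No link type worked; the caller carries on without a link.
    return None
```

Returning `None` instead of raising keeps the post-mortem creation path independent of link availability, matching step 4 above.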
311326

312327
**Benefits**:
328+
313329
- Works across any issue types regardless of hierarchy
314330
- Graceful degradation if link types unavailable
315331
- Main workflow never blocked by linking failures
@@ -319,13 +335,15 @@ The Jira post-mortem feature creates dedicated post-mortem issues in Jira when a
319335
**Template**: `src/firefighter/jira_app/templates/jira/postmortem/timeline.txt`
320336

321337
**Content**:
338+
322339
- Incident creation event
323340
- All status changes with timestamps
324341
- All key events (detected, started, recovered, etc.) with optional messages
325342
- Sorted chronologically ascending by `event_ts`
326343

327344
**Format** (Jira Wiki Markup):
328-
```
345+
346+
```text
329347
h2. Timeline
330348
331349
|| Time || Event ||
@@ -338,53 +356,62 @@ h2. Timeline
338356
```
339357

340358
**Key Events Sync**:
359+
341360
- Key events entered in Slack are synced to Jira timeline via `incident_key_events_updated` signal
342361
- Ensures timeline is always up-to-date with the latest incident events
343362

344363
### Graceful Error Handling
345364

346365
**Assignment Failures**:
366+
347367
- `assign_issue()` returns boolean instead of raising exceptions
348368
- Logs WARNING instead of ERROR
349369
- Post-mortem creation succeeds even if commander assignment fails
350370

351371
**Invalid Emojis** (Test Environments):
372+
352373
- Bookmark creation wrapped in try/except `SlackApiError`
353374
- Custom emojis (`:jira_new:`, `:confluence:`) may not exist in test workspaces
354375
- Logs warnings but doesn't fail post-mortem workflow
355376

356377
**Issue Link Failures**:
378+
357379
- Multiple link type fallbacks
358380
- Validates issues before linking
359381
- Post-mortem always created even if linking fails
360382

361383
### Slack Integration
362384

363385
**Notification Message** (`SlackMessageIncidentPostMortemCreated`):
386+
364387
- Posted to incident channel and pinned
365388
- Contains links to all available post-mortems (Confluence + Jira)
366389
- Sent by `postmortem_created_send()` signal handler
367390

368391
**Initial Message Update** (`SlackMessageIncidentDeclaredAnnouncement`):
392+
369393
- The pinned initial incident announcement message is automatically updated
370394
- Shows all post-mortem links alongside incident ticket link
371395
- Uses `SlackMessageStrategy.UPDATE` to update existing message
372396
- Format:
373-
```
397+
398+
```text
374399
:jira_new: <link|Jira ticket>
375400
:confluence: <link|Confluence Post-mortem>
376401
:jira_new: <link|Jira Post-mortem (PM-123)>
377402
```
378403

379404
**Channel Bookmarks**:
405+
380406
- Bookmarks added for quick access to post-mortems
381407
- Confluence: `:confluence:` emoji
382408
- Jira: `:jira_new:` emoji with issue key
383409
- Gracefully handles missing custom emojis in test environments
384410

385-
### Configuration
411+
### Post-mortem Configuration
386412

387413
**Settings** (`settings.py`):
414+
388415
```python
389416
# Post-mortem project and issue type
390417
JIRA_POSTMORTEM_PROJECT_KEY = "POSTMORTEM" # or same as incident project
@@ -402,13 +429,15 @@ JIRA_POSTMORTEM_FIELDS = {
402429
```
403430

404431
**Environment Variables**:
432+
405433
```bash
406434
ENABLE_JIRA_POSTMORTEM=true # Enable Jira post-mortem feature
407435
```
408436

409437
### Testing
410438

411439
**Test Files**:
440+
412441
- `tests/test_jira_app/test_postmortem_service.py` - Service layer tests (4 tests)
413442
- Incident summary excludes Status and Created fields
414443
- Timeline includes status changes
@@ -424,6 +453,7 @@ ENABLE_JIRA_POSTMORTEM=true # Enable Jira post-mortem feature
424453
- Handling key events with/without messages
425454

426455
**Test Patterns**:
456+
427457
```python
428458
@pytest.mark.django_db
429459
@patch("firefighter.jira_app.service_postmortem.JiraClient")
@@ -447,6 +477,7 @@ def test_create_postmortem(mock_jira_client):
447477
### Database Models
448478

449479
**JiraPostMortem** (`src/firefighter/jira_app/models.py`):
480+
450481
- `incident` - OneToOne to Incident (related_name: `jira_postmortem_for`)
451482
- `jira_issue_key` - Jira issue key (e.g., "PM-123")
452483
- `jira_issue_id` - Jira internal ID
