Commit 307b547

New incident automations docs (#33696)
* New incident automations docs
* Add to the index
* Apply suggestions from code review
* Remove postmortem and viewer access, add link to blueprints
1 parent d499fe4 commit 307b547

4 files changed

+255
-0
lines changed

content/en/incident_response/incident_management/incident_settings/_index.md

Lines changed: 1 addition & 0 deletions
@@ -38,6 +38,7 @@ To create an incident type:
3838
{{< nextlink href="/incident_response/incident_management/incident_settings/property_fields" >}}Property Fields{{< /nextlink >}}
3939
{{< nextlink href="/incident_response/incident_management/incident_settings/responder_types" >}}Responder Types{{< /nextlink >}}
4040
{{< nextlink href="/incident_response/incident_management/incident_settings/templates" >}}Templates{{< /nextlink >}}
41+
{{< nextlink href="/incident_response/incident_management/incident_settings/automations" >}}Automations{{< /nextlink >}}
4142
{{< /whatsnext >}}
4243

4344
[1]: https://app.datadoghq.com/incidents/settings
Lines changed: 254 additions & 0 deletions
@@ -0,0 +1,254 @@
1+
---
2+
title: Automations
3+
site_support_id: actions
4+
further_reading:
5+
- link: "/actions/workflows/"
6+
tag: "Documentation"
7+
text: "Workflow Automation"
8+
- link: "/incident_response/incident_management/incident_settings/notification_rules"
9+
tag: "Documentation"
10+
text: "Notification Rules"
11+
---
12+
13+
## Overview
14+
15+
{{< img src="/incident_response/incident_automations_workflow.png" alt="Incident automations workflow diagram showing automation actions." style="width:100%;" >}}
16+
17+
Automations enable you to customize and extend incident management to fit your organization's specific processes. Automatically trigger actions based on incident events such as severity changes or state transitions.
18+
19+
Automations are powered by [Datadog Workflow Automation][1] and are included in your Incident Management billing at no additional cost.
20+
21+
## Prerequisites
22+
23+
To create and manage automations, you must have the following permissions:
24+
25+
- `Workflows Write` permission
26+
- `Incident Settings Write` **OR** `Incident Notification Settings Write` permission
27+
28+
To run automations on **private incidents**, use a user or service account with the `Private Incidents Global Access` permission. Without this permission, the automation cannot access incident data.
29+
30+
For more information on permissions, see [Datadog Role Permissions][2].
31+
32+
## Accessing automations
33+
34+
Automations are configured per [incident type][3]. To manage automations:
35+
36+
1. Navigate to [**Incidents > Settings**][4].
37+
2. Select an incident type from the list.
38+
3. Click the **Automations** tab.
39+
40+
From this page, you can view, create, enable, disable, and manage your automations.
41+
42+
<div class="alert alert-info">Any user with the <code>Incident Settings Write</code> or <code>Incident Notification Settings Write</code> permission can toggle automations on or off, even without edit access to the automation itself. This lets administrators quickly disable problematic automations.</div>
43+
44+
## Creating an automation
45+
46+
You can build automations entirely from the Incident Management settings UI. For more advanced workflows, open the automation in the [Workflow Automation][1] editor to access additional actions and logic capabilities.
47+
48+
When you click **New Automation**, you have two options for building your workflow:
49+
50+
### Start with a blueprint
51+
52+
Blueprints provide pre-configured automation templates for common use cases, such as sending a Slack message to the incident channel. Using a blueprint is the fastest way to get started.
53+
54+
### Choose an action
55+
56+
For custom processes, you can build an automation from scratch by starting with an individual action. You can choose from incident-specific actions or explore the full Datadog [Action Catalog][5], which contains thousands of integrations.
57+
58+
## Configuring triggers and conditions
59+
60+
### Trigger types
61+
62+
Select when your automation should run:
63+
64+
| Trigger Type | Description |
65+
|--------------|-------------|
66+
| **When the incident is declared** | Runs once when an incident is first declared and meets the defined conditions. |
67+
| **When the incident is declared or updated** | Runs when an incident is declared, or when a field change causes the incident to meet the conditions. |
68+
| **On a schedule** | Runs repeatedly on a per-incident basis (for example, every 10 minutes for each active incident that meets the conditions). Useful for periodic reminders and status checks. |
69+
70+
### Conditions
71+
72+
Define conditions to specify which incidents trigger the automation. Conditions are based on incident attributes such as Severity, State, Teams, or custom property fields.
73+
74+
* **Logic within a row**: Selecting multiple values for a single property (like `SEV-1` and `SEV-2`) uses `OR` logic.
75+
* **Logic across rows**: Adding multiple property filters uses `AND` logic.
76+
77+
**Example**: Set conditions for `severity:SEV-1`, `severity:SEV-2`, and `summary:is empty`. The automation runs when the incident is (SEV-1 **OR** SEV-2) **AND** the summary is empty.
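
The `OR`-within-a-row and `AND`-across-rows logic can be illustrated with a minimal Python sketch. This is not how Datadog evaluates conditions internally; the `matches` function, the dictionary layout, and the modeling of "is empty" as an empty string or missing value are all assumptions made for illustration.

```python
from typing import Any, Mapping, Sequence


def matches(incident: Mapping[str, Any], conditions: Mapping[str, Sequence[Any]]) -> bool:
    """Return True when every row matches (AND) and, within a row,
    the incident value equals any of the allowed values (OR)."""
    return all(
        incident.get(field) in allowed  # OR across the values of one row
        for field, allowed in conditions.items()  # AND across rows
    )


# Hypothetical representation of the example conditions.
conditions = {
    "severity": ["SEV-1", "SEV-2"],  # two values in one row: OR
    "summary": ["", None],           # "is empty", modeled as empty or missing
}

print(matches({"severity": "SEV-2", "summary": ""}, conditions))           # True
print(matches({"severity": "SEV-3", "summary": ""}, conditions))           # False
print(matches({"severity": "SEV-1", "summary": "DB outage"}, conditions))  # False
```

The third incident fails because the severity row matches but the summary row does not, and every row must match.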
78+
79+
{{< img src="/incident_response/incident_automations_conditions.png" alt="Screenshot showing incident automation conditions configuration in Datadog. Displays UI for setting trigger, severity, and summary conditions." style="width:90%;" >}}
80+
81+
## Building automation workflows
82+
83+
Automations use the Datadog Workflow Automation engine. Each automation is a workflow that can include multiple actions and logic steps.
84+
85+
### Using incident data
86+
87+
Automations have access to all incident data through the `incident` context variable, which includes:
88+
89+
- `incident.id`: The incident's unique identifier
90+
- `incident.attributes`: All incident attributes (severity, state, title, custom fields, and more)
91+
- `incident.fieldDiffs`: A list of fields that changed (for update triggers)
92+
93+
Use these variables in your automation actions by referencing them with double curly braces, such as `{{ incident.id }}`.
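
To make the substitution concrete, the following Python sketch mimics how a double-curly-brace reference could resolve against nested incident data. It is only an illustration: the `render` helper is hypothetical, and the keys under `incident.attributes` (`severity`, `title`) and the sample values are assumed, not taken from the actual workflow engine.

```python
import re
from typing import Any, Mapping

# Matches placeholders such as {{ incident.id }} or {{incident.attributes.title}}.
_PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def render(template: str, context: Mapping[str, Any]) -> str:
    """Replace each {{ dotted.path }} with the value found in `context`."""

    def lookup(match: re.Match) -> str:
        value: Any = context
        for key in match.group(1).split("."):
            value = value[key]  # raises KeyError if the path does not exist
        return str(value)

    return _PLACEHOLDER.sub(lookup, template)


# Assumed sample data; real incident attributes may be shaped differently.
context = {
    "incident": {
        "id": "abc-123",
        "attributes": {"severity": "SEV-1", "title": "Checkout errors"},
    }
}

message = render(
    "[{{ incident.attributes.severity }}] {{ incident.attributes.title }} ({{ incident.id }})",
    context,
)
print(message)  # [SEV-1] Checkout errors (abc-123)
```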
94+
95+
### Configuring actions
96+
97+
Each action in your automation requires configuration. For example, to send a message to an incident's Slack channel:
98+
99+
1. Add the **Get incident Slack channel** action.
100+
2. Set the input parameter to `{{ incident.id }}`.
101+
3. Add the **Send Slack message** action.
102+
4. Configure the message content using incident variables.
103+
104+
The workflow editor provides autocomplete for available variables and validates your configuration.
105+
106+
## Testing automations
107+
108+
There are two ways to test your automations:
109+
110+
### Option 1: Declare a test incident
111+
112+
1. Enable [test incidents][6] in your incident settings.
113+
2. Declare a test incident that matches your automation's conditions.
114+
3. View the automation execution in the incident timeline.
115+
116+
### Option 2: Test from an existing incident
117+
118+
1. Open the automation in the workflow editor.
119+
2. Click the **Run** button.
120+
3. Select **Test from incident**.
121+
4. Choose an existing incident to simulate the trigger.
122+
123+
This populates the `incident` context variable with data from the selected incident without actually triggering the automation for that incident.
124+
125+
## Viewing automation executions
126+
127+
### From the incident timeline
128+
129+
Every automation execution appears in the [incident timeline][7]. Timeline entries include:
130+
131+
- The automation name
132+
- Execution timestamp
133+
- Link to the detailed execution view
134+
- Execution status (success or failure)
135+
136+
You can filter the timeline to show only automation executions or exclude them entirely.
137+
138+
### From execution history
139+
140+
To view all executions of an automation:
141+
142+
1. Open the automation.
143+
2. Click **Execution** in the workflow editor.
144+
145+
The execution history shows:
146+
- All input parameters and their values
147+
- The `incident` context data
148+
- The `fieldDiffs` showing what changed
149+
- Step-by-step execution results
150+
- Any errors or failures
151+
152+
## Permissions and access control
153+
154+
### Edit access
155+
156+
By default, only the automation creator can edit an automation. To grant edit access to others:
157+
158+
1. Open the automation.
159+
2. Click **Edit Access**.
160+
3. Add users or service accounts.
161+
162+
<div class="alert alert-tip">Granting edit access allows others to use the Datadog API as you or as the service account. Use service accounts for shared automations to avoid issues when users leave the organization.</div>
163+
164+
165+
### Service accounts
166+
167+
Using a service account to run automations provides several benefits:
168+
169+
- Automations continue running if the creator leaves the organization
170+
- Better separation of duties and access control
171+
- Clearer audit trails
172+
173+
To use a service account:
174+
175+
1. Open the automation.
176+
2. Click **Run as Service Account**.
177+
3. Create a new service account with appropriate roles or select an existing one.
178+
179+
You must have the `Service Account Write` permission to configure service accounts for automations.
180+
181+
## Private incidents
182+
183+
Automations can run on private incidents with the following considerations:
184+
185+
### Required permissions
186+
187+
To run automations on private incidents, use a user or service account with the `Private Incidents Global Access` permission. Without this permission, the automation cannot access incident data.
188+
189+
### Security considerations
190+
191+
By default, execution history (including private incident data) is visible to anyone in your organization. To run automations on private incidents securely:
192+
193+
1. Use a service account with `Private Incidents Global Access` permission.
194+
2. Restrict viewer access to only users who should see private incident data.
195+
196+
## Differences from notification rules
197+
198+
Both automations and [notification rules][8] can respond to incident events, but they serve different purposes:
199+
200+
| Feature | Automations | Notification Rules |
201+
|---------|-------------|-------------------|
202+
| **Purpose** | Execute complex workflows and integrations | Send notifications to stakeholders |
203+
| **Triggers** | Declared, updated, or scheduled | Declared or updated |
204+
| **Actions** | Access to full Datadog Action Catalog | Limited to notification channels |
205+
| **Complexity** | Multi-step workflows with logic | Single notification per rule |
206+
| **Cost** | Included in Incident Management | Included in Incident Management |
207+
208+
Use notification rules for straightforward notifications and automations for complex, multi-step processes.
209+
210+
211+
## Use cases and examples
212+
213+
Use the following examples to help you build your own incident automations.
214+
215+
{{% collapse-content title="Add teams to the incident channel" level="h4" expanded=false %}}
216+
217+
**Trigger**: When declared or updated<br>
218+
**Condition**: Severity is `SEV-1` or `SEV-2`<br>
219+
**Actions**:
220+
1. Detect when the teams field changes
221+
2. Add new teams to the incident teams list
222+
3. Invite every user in each new team to the incident Slack channel
223+
224+
Access the [blueprint in Datadog][9].
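
The first step, detecting which teams were newly added, could look like the Python sketch below. The shape of each `fieldDiffs` entry is not documented here, so the `field`, `old`, and `new` keys are assumptions, as is the `added_teams` helper; inspect an execution's `fieldDiffs` to confirm the real structure before relying on this.

```python
from typing import Any, List, Mapping, Sequence


def added_teams(field_diffs: Sequence[Mapping[str, Any]]) -> List[str]:
    """Return teams present after the update but not before, in order."""
    added: List[str] = []
    for diff in field_diffs:
        if diff.get("field") != "teams":
            continue  # ignore changes to other fields
        before = set(diff.get("old") or [])
        for team in diff.get("new") or []:
            if team not in before and team not in added:
                added.append(team)
    return added


# Assumed diff shape, for illustration only.
diffs = [
    {"field": "severity", "old": "SEV-2", "new": "SEV-1"},
    {"field": "teams", "old": ["payments"], "new": ["payments", "sre", "database"]},
]
print(added_teams(diffs))  # ['sre', 'database']
```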
225+
{{% /collapse-content %}}
226+
227+
{{% collapse-content title="Periodic status reminders" level="h4" expanded=false %}}
228+
229+
**Trigger**: On a schedule (every 30 minutes)<br>
230+
**Condition**: Severity is `SEV-1` or `SEV-2`, State is `Active` or `Stable`<br>
231+
**Actions**:
232+
1. Check time since last update
233+
2. Send a Slack reminder if more than 30 minutes have passed since the last update
234+
3. Prompt commander to update incident status
235+
236+
Access the [blueprint in Datadog][10].
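
The time check in the first two steps amounts to a timestamp comparison. The Python sketch below shows the idea under stated assumptions: `needs_reminder` and `REMINDER_THRESHOLD` are hypothetical names, and how the last-update timestamp is obtained from the incident is left out.

```python
from datetime import datetime, timedelta, timezone

# Assumed threshold, matching the 30-minute schedule in this example.
REMINDER_THRESHOLD = timedelta(minutes=30)


def needs_reminder(
    last_updated: datetime,
    now: datetime,
    threshold: timedelta = REMINDER_THRESHOLD,
) -> bool:
    """Return True when strictly more than `threshold` has elapsed."""
    return now - last_updated > threshold


now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
print(needs_reminder(now - timedelta(minutes=45), now))  # True
print(needs_reminder(now - timedelta(minutes=10), now))  # False
```

Using timezone-aware timestamps on both sides avoids errors from mixing naive and aware `datetime` values.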
237+
238+
{{% /collapse-content %}}
239+
240+
241+
## Further reading
242+
243+
{{< partial name="whats-next/whats-next.html" >}}
244+
245+
[1]: /actions/workflows/
246+
[2]: /account_management/rbac/permissions/#case-and-incident-management
247+
[3]: /incident_response/incident_management/incident_settings/#incident-types
248+
[4]: https://app.datadoghq.com/incidents/settings
249+
[5]: https://app.datadoghq.com/actions/action-catalog
250+
[6]: /incident_response/incident_management/incident_settings/information#test-incidents
251+
[7]: /incident_response/incident_management/investigate/timeline
252+
[8]: /incident_response/incident_management/incident_settings/notification_rules
253+
[9]: https://app.datadoghq.com/workflow/blueprints/add-datadog-team-to-incident-channel
254+
[10]: https://app.datadoghq.com/workflow/blueprints/nudge-incident-commander-old-incident
