Commit 75e80e7

add new azure sql database alerts
1 parent 5a6b86b commit 75e80e7

1 file changed: +127 -1 lines changed

csp-mixin/alerts/azure-alerts.yml

Lines changed: 127 additions & 1 deletion
@@ -22,9 +22,135 @@ groups:
         keep_firing_for: 10m
         labels:
           severity: critical
-          service: 'Azure Virtual Machines.'
+          service: 'Azure Virtual Machines'
           namespace: cloud-provider-azure
         annotations:
           summary: 'VM unavailable.'
           description: 'The VM {{ $labels.resourceName }} is not functioning or crashed, which may require immediate action.'
           dashboard_uid: '58f33c50e66c911b0ad8a25aa438a96e'
+
+      - alert: AzureHighDtuConsumption
+        expr: |
+          avg by (job,resourceGroup,subscriptionName,resourceName) (azure_microsoft_sql_servers_databases_dtu_consumption_percent_average_percent{job=~".+",resourceGroup=~".+",subscriptionName=~".+",resourceName=~".+"}) > 90
+        for: 10m
+        keep_firing_for: 10m
+        labels:
+          severity: critical
+          service: 'Azure SQL database'
+          namespace: cloud-provider-azure
+        annotations:
+          summary: 'High DTU consumption.'
+          description: 'Check active queries and optimize indexes or consider scaling up DTUs to handle load in {{ $labels.resourceName }} database.'
+          dashboard_uid: '82c5b6cf30db5b601c5cc3f5d8d4284d'
+
+      - alert: AzureHighStorageUsage
+        expr: |
+          avg by (job,resourceGroup,subscriptionName,resourceName) (azure_microsoft_sql_servers_databases_storage_percent_maximum_percent{job=~".+",resourceGroup=~".+",subscriptionName=~".+",resourceName=~".+"}) > 85
+        for: 10m
+        keep_firing_for: 10m
+        labels:
+          severity: critical
+          service: 'Azure SQL database'
+          namespace: cloud-provider-azure
+        annotations:
+          summary: 'High Storage usage.'
+          description: 'Archive or delete old data, or scale up storage capacity in {{ $labels.resourceName }} database.'
+          dashboard_uid: '82c5b6cf30db5b601c5cc3f5d8d4284d'
+
+      - alert: AzureHighDeadlockCount
+        expr: |
+          sum by (job,resourceGroup,subscriptionName,resourceName) (azure_microsoft_sql_servers_databases_deadlock_total_count{job=~".+",resourceGroup=~".+",subscriptionName=~".+",resourceName=~".+"}) > 5
+        for: 10m
+        keep_firing_for: 10m
+        labels:
+          severity: info
+          service: 'Azure SQL database'
+          namespace: cloud-provider-azure
+        annotations:
+          summary: 'High Deadlock count.'
+          description: 'Check {{ $labels.resourceName }} database logs for deadlock details and optimize affected queries.'
+          dashboard_uid: '82c5b6cf30db5b601c5cc3f5d8d4284d'
+
+      - alert: AzureHighUserCpuUsage
+        expr: |
+          avg by (job,resourceGroup,subscriptionName,resourceName) (azure_microsoft_sql_servers_databases_cpu_percent_average_percent{job=~".+",resourceGroup=~".+",subscriptionName=~".+",resourceName=~".+"}) > 80
+        for: 10m
+        keep_firing_for: 10m
+        labels:
+          severity: warning
+          service: 'Azure SQL database'
+          namespace: cloud-provider-azure
+        annotations:
+          summary: 'High User CPU usage.'
+          description: 'Identify high CPU queries on {{ $labels.resourceName }} database and optimize them.'
+          dashboard_uid: '82c5b6cf30db5b601c5cc3f5d8d4284d'
+
+      - alert: AzureHighSystemFailedConnections
+        expr: |
+          sum by (job,resourceGroup,subscriptionName,resourceName) (azure_microsoft_sql_servers_databases_connection_failed_total_count{job=~".+",resourceGroup=~".+",subscriptionName=~".+",resourceName=~".+"}) > 10
+        for: 5m
+        keep_firing_for: 10m
+        labels:
+          severity: warning
+          service: 'Azure SQL database'
+          namespace: cloud-provider-azure
+        annotations:
+          summary: 'High number of System Failed connections.'
+          description: 'Check network problems, firewall restrictions or high resource consumption affecting application access to the database {{ $labels.resourceName }}.'
+          dashboard_uid: '82c5b6cf30db5b601c5cc3f5d8d4284d'
+
+      - alert: AzureHighUserFailedConnections
+        expr: |
+          sum by (job,resourceGroup,subscriptionName,resourceName) (azure_microsoft_sql_servers_databases_connection_failed_user_error_total_count{job=~".+",resourceGroup=~".+",subscriptionName=~".+",resourceName=~".+"}) > 10
+        for: 5m
+        keep_firing_for: 10m
+        labels:
+          severity: warning
+          service: 'Azure SQL database'
+          namespace: cloud-provider-azure
+        annotations:
+          summary: 'High number of User Failed connections.'
+          description: 'Check for authentication problems, network configuration errors, firewall issues, or resource constraints, affecting database accessibility for users on database {{ $labels.resourceName }}.'
+          dashboard_uid: '82c5b6cf30db5b601c5cc3f5d8d4284d'
+
+      - alert: AzureHighWorkerUsage
+        expr: |
+          avg by (job,resourceGroup,subscriptionName,resourceName) (azure_microsoft_sql_servers_databases_workers_percent_average_percent{job=~".+",resourceGroup=~".+",subscriptionName=~".+",resourceName=~".+"}) > 60
+        for: 5m
+        keep_firing_for: 10m
+        labels:
+          severity: critical
+          service: 'Azure SQL database'
+          namespace: cloud-provider-azure
+        annotations:
+          summary: 'High worker usage.'
+          description: 'Look for long execution queries, review the number of concurrent queries and requests being sent to the database or check if there are any blocking sessions or deadlocks into the {{ $labels.resourceName }} database.'
+          dashboard_uid: '82c5b6cf30db5b601c5cc3f5d8d4284d'
+
+      - alert: AzureHighDataIoUsage
+        expr: |
+          avg by (job,resourceGroup,subscriptionName,resourceName) (azure_microsoft_sql_servers_databases_physical_data_read_percent_average_percent{job=~".+",resourceGroup=~".+",subscriptionName=~".+",resourceName=~".+"}) > 90
+        for: 15m
+        keep_firing_for: 10m
+        labels:
+          severity: info
+          service: 'Azure SQL database'
+          namespace: cloud-provider-azure
+        annotations:
+          summary: 'High data IO usage.'
+          description: 'Review queries with high read or write activity, check if there are missing indexes or inefficient indexes that result in full table scans and assess the volume of transactions into the {{ $labels.resourceName }} database.'
+          dashboard_uid: '82c5b6cf30db5b601c5cc3f5d8d4284d'
+
+      - alert: AzureLowTempdbLogSpace
+        expr: |
+          avg by (job,resourceGroup,subscriptionName,resourceName) (azure_microsoft_sql_servers_databases_tempdb_log_used_percent_average_percent{job=~".+",resourceGroup=~".+",subscriptionName=~".+",resourceName=~".+"}) > 60
+        for: 5m
+        keep_firing_for: 10m
+        labels:
+          severity: critical
+          service: 'Azure SQL database'
+          namespace: cloud-provider-azure
+        annotations:
+          summary: 'Low tempdb log space.'
+          description: 'Look for active sessions that might be using TempDB intensively, identify stored procedures or queries that create temporary tables or objects, and also look for long-running or memory-intensive queries that rely heavily on TempDB into the {{ $labels.resourceName }} database.'
+          dashboard_uid: '82c5b6cf30db5b601c5cc3f5d8d4284d'
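
The new rules could be checked with a promtool rule unit test. The following is a minimal sketch for AzureHighDtuConsumption only, not part of this commit. It assumes that azure-alerts.yml can be loaded by promtool as-is and that the test file sits next to it. The file name azure-alerts_test.yml and all label values (azure-metrics, rg-test, sub-test, db-test) are made up for illustration.

```yaml
# azure-alerts_test.yml (hypothetical file name, assumed to sit next to azure-alerts.yml)
rule_files:
  - azure-alerts.yml

evaluation_interval: 1m

tests:
  - interval: 1m
    input_series:
      # DTU consumption held at 95% for 20 minutes; label values are invented for the test.
      - series: 'azure_microsoft_sql_servers_databases_dtu_consumption_percent_average_percent{job="azure-metrics",resourceGroup="rg-test",subscriptionName="sub-test",resourceName="db-test"}'
        values: '95x20'
    alert_rule_test:
      # After 5 minutes the alert is still pending ("for: 10m"), so nothing is firing yet.
      - eval_time: 5m
        alertname: AzureHighDtuConsumption
        exp_alerts: []
      # After 15 minutes the 10m "for" window has elapsed and the alert should be firing.
      - eval_time: 15m
        alertname: AzureHighDtuConsumption
        exp_alerts:
          - exp_labels:
              severity: critical
              service: 'Azure SQL database'
              namespace: cloud-provider-azure
              job: azure-metrics
              resourceGroup: rg-test
              subscriptionName: sub-test
              resourceName: db-test
            exp_annotations:
              summary: 'High DTU consumption.'
              description: 'Check active queries and optimize indexes or consider scaling up DTUs to handle load in db-test database.'
              dashboard_uid: '82c5b6cf30db5b601c5cc3f5d8d4284d'
```

Under these assumptions the test would be run with `promtool test rules azure-alerts_test.yml`. The same pattern should carry over to the other rules by swapping the metric name, threshold, `for` duration, and expected labels and annotations.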
