Commit e5d6876

Merge pull request #2 from kabisa/module-description
Module description
2 parents ae9b884 + 60d9a61 commit e5d6876

5 files changed: +253 -3 lines changed

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -2,3 +2,4 @@
 *.tfstate.*
 **/.terraform/
 **/secrets.auto.tfvars
+/examples/.terraform.lock.hcl

.pre-commit-config.yaml

Lines changed: 7 additions & 0 deletions
@@ -6,3 +6,10 @@ repos:
       - id: terraform-validate
       - id: tflint
       - id: shellcheck
+  - repo: git@github.com:kabisa/terraform-datadog-pre-commit-hook.git
+    rev: "1.2.2"
+    hooks:
+      - id: terraform-datadog-docs
+        exclude: ^README.md$
+        args:
+          - "."

README.md

Lines changed: 231 additions & 3 deletions
@@ -1,8 +1,236 @@
-# terraform datadog SQL Server monitoring
 
-## Getting Started
+![Datadog](https://imgix.datadoghq.com/img/about/presskit/logo-v/dd_vertical_purple.png)
 
-Pre-commit:
+[//]: # (This file is generated. Do not edit, module description can be added by editing / creating module_description.md)
+
+# Terraform module for Datadog Sql Server
+
+This module requires the [SQL Server integration](https://docs.datadoghq.com/integrations/sqlserver/?tab=host) to be configured.
+It provides basic SQL Server monitoring: locks, blocked processes, and connectivity.
+It's best to also use Datadog's APM instrumentation to understand how the application uses the database.
+Datadog has an upcoming feature to fully support deep-dive database monitoring.
+
+This module is part of a larger suite of modules that provide alerts in Datadog.
+Other modules can be found on the [Terraform Registry](https://registry.terraform.io/search/modules?namespace=kabisa&provider=datadog)
+
+We have two base modules we use to standardise development of our Monitor Modules:
+- [generic monitor](https://github.com/kabisa/terraform-datadog-generic-monitor) Used in 90% of our alerts
+- [service check monitor](https://github.com/kabisa/terraform-datadog-service-check-monitor)
+
+Modules are generated with this tool: https://github.com/kabisa/datadog-terraform-generator
+
+# Example Usage
+
+```terraform
+module "sql_server" {
+  source = "kabisa/sql-server/datadog"
+
+  notification_channel       = "[email protected]"
+  service                    = "SQL Server"
+  env                        = "prd"
+  alert_env                  = "prd"
+  filter_str                 = "role:sqlserver"
+  service_check_include_tags = ["role:sqlserver"]
+}
+```
+
+Monitors:
+* [Terraform module for Datadog Sql Server](#terraform-module-for-datadog-sql-server)
+  * [Connections](#connections)
+  * [Page Life Expectancy](#page-life-expectancy)
+  * [Can Connect](#can-connect)
+  * [Buffer Cache Hit Ratio](#buffer-cache-hit-ratio)
+  * [Database State](#database-state)
+  * [Lock Waits](#lock-waits)
+  * [Batches Compiled Percent](#batches-compiled-percent)
+  * [Procs Blocked](#procs-blocked)
+  * [Module Variables](#module-variables)
+
+# Getting started developing
+[pre-commit](http://pre-commit.com/) is used for Terraform linting and validation.
+
+Steps:
 - Install [pre-commit](http://pre-commit.com/). E.g. `brew install pre-commit`.
 - Run `pre-commit install` in this repo. (Every time you clone a repo with pre-commit enabled, you will need to run the `pre-commit install` command.)
 - That’s it! Now every time you commit a code change (`.tf` file), the hooks listed under `hooks:` in `.pre-commit-config.yaml` will execute.
+
+## Connections
+
+Query:
+```terraform
+avg(last_30m):max:sqlserver.stats.connections{tag:xxx} by {host} >= 500
+```
+
+| variable                      | default  | required | description                      |
+|-------------------------------|----------|----------|----------------------------------|
+| connections_enabled           | True     | No       |                                  |
+| connections_warning           | 400      | No       |                                  |
+| connections_critical          | 500      | No       |                                  |
+| connections_evaluation_period | last_30m | No       |                                  |
+| connections_note              | ""       | No       |                                  |
+| connections_docs              | ""       | No       |                                  |
+| connections_filter_override   | ""       | No       |                                  |
+| connections_alerting_enabled  | True     | No       |                                  |
+| connections_priority          | 3        | No       | Number from 1 (high) to 5 (low). |
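
Note on the Connections query: the defaults in the table line up with the parts of the query string (`last_30m`, `500`, and the `{tag:xxx}` filter). The sketch below shows one way such a string could be assembled from these variables. The variable names come from the tables in this README, but the `locals` block, the `output`, and the use of `coalesce()` are assumptions made for illustration and are not taken from the module's source.

```terraform
# Hypothetical sketch: how the module really builds its query is not shown in
# this commit. Only the variable names and defaults come from the README.
variable "filter_str" {
  type = string
}

variable "connections_filter_override" {
  type    = string
  default = ""
}

variable "connections_evaluation_period" {
  type    = string
  default = "last_30m"
}

variable "connections_critical" {
  type    = number
  default = 500
}

locals {
  # coalesce() returns the first argument that is neither null nor an empty
  # string, so the override is used only when it has been set.
  connections_filter = coalesce(var.connections_filter_override, var.filter_str)

  connections_query = "avg(${var.connections_evaluation_period}):max:sqlserver.stats.connections{${local.connections_filter}} by {host} >= ${var.connections_critical}"
}

output "connections_query" {
  value = local.connections_query
}
```

With `filter_str = "role:sqlserver"` and the defaults above, the interpolation produces a query of the same shape as the one documented for this monitor, with `role:sqlserver` in place of `tag:xxx`.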
+
+
+## Page Life Expectancy
+
+When this metric is low, pages stay in the cache only briefly and are often read from disk. Consider allocating more memory.
+
+Query:
+```terraform
+avg(last_1d):min:sqlserver.buffer.page_life_expectancy{tag:xxx} by {host} < 900
+```
+
+| variable                               | default                                  | required | description                      |
+|----------------------------------------|------------------------------------------|----------|----------------------------------|
+| page_life_expectancy_enabled           | True                                     | No       |                                  |
+| page_life_expectancy_warning           | 1800                                     | No       |                                  |
+| page_life_expectancy_critical          | 900                                      | No       |                                  |
+| page_life_expectancy_evaluation_period | last_1d                                  | No       |                                  |
+| page_life_expectancy_note              | ""                                       | No       |                                  |
+| page_life_expectancy_docs              | When this metric is low, pages stay in the cache only briefly and are often read from disk. Consider allocating more memory. | No       |                                  |
+| page_life_expectancy_filter_override   | ""                                       | No       |                                  |
+| page_life_expectancy_alerting_enabled  | True                                     | No       |                                  |
+| page_life_expectancy_priority          | 4                                        | No       | Number from 1 (high) to 5 (low). |
+
+
+## Can Connect
+
+| variable                     | default  | required | description  |
+|------------------------------|----------|----------|--------------|
+| can_connect_enabled          | True     | No       |              |
+| can_connect_alerting_enabled | True     | No       |              |
+| can_connect_warning          | 1        | No       |              |
+| can_connect_critical         | 1        | No       |              |
+| can_connect_priority         | 1        | No       |              |
+| can_connect_docs             | ""       | No       |              |
+| can_connect_note             | ""       | No       |              |
+
+
+## Buffer Cache Hit Ratio
+
+When this metric is low, pages are often read from disk. Consider allocating more memory.
+
+Query:
+```terraform
+avg(last_1d):min:sqlserver.buffer.cache_hit_ratio{tag:xxx} by {host} * 100 < 75
+```
+
+| variable                                 | default                                  | required | description                      |
+|------------------------------------------|------------------------------------------|----------|----------------------------------|
+| buffer_cache_hit_ratio_enabled           | True                                     | No       |                                  |
+| buffer_cache_hit_ratio_warning           | 90                                       | No       |                                  |
+| buffer_cache_hit_ratio_critical          | 75                                       | No       |                                  |
+| buffer_cache_hit_ratio_evaluation_period | last_1d                                  | No       |                                  |
+| buffer_cache_hit_ratio_note              | ""                                       | No       |                                  |
+| buffer_cache_hit_ratio_docs              | When this metric is low, pages are often read from disk. Consider allocating more memory. | No       |                                  |
+| buffer_cache_hit_ratio_filter_override   | ""                                       | No       |                                  |
+| buffer_cache_hit_ratio_alerting_enabled  | True                                     | No       |                                  |
+| buffer_cache_hit_ratio_priority          | 4                                        | No       | Number from 1 (high) to 5 (low). |
+
+
+## Database State
+
+Query:
+```terraform
+max(last_5m):max:sqlserver.database.state{tag:xxx} by {host,database,database_state_desc} >= 5
+```
+
+| variable                         | default  | required | description                      |
+|----------------------------------|----------|----------|----------------------------------|
+| database_state_enabled           | True     | No       |                                  |
+| database_state_warning           | 1        | No       |                                  |
+| database_state_critical          | 5        | No       |                                  |
+| database_state_evaluation_period | last_5m  | No       |                                  |
+| database_state_note              | ""       | No       |                                  |
+| database_state_docs              | ""       | No       |                                  |
+| database_state_filter_override   | ""       | No       |                                  |
+| database_state_alerting_enabled  | True     | No       |                                  |
+| database_state_priority          | 1        | No       | Number from 1 (high) to 5 (low). |
+
+
+## Lock Waits
+
+A high number of lock waits per second is caused by lock contention. Try reducing lock contention by using more fine-grained locking in the queries.
+
+Query:
+```terraform
+avg(last_30m):max:sqlserver.stats.lock_waits{tag:xxx} by {host} > 20
+```
+
+| variable                     | default                                  | required | description                      |
+|------------------------------|------------------------------------------|----------|----------------------------------|
+| lock_waits_enabled           | True                                     | No       |                                  |
+| lock_waits_warning           | 10                                       | No       |                                  |
+| lock_waits_critical          | 20                                       | No       |                                  |
+| lock_waits_evaluation_period | last_30m                                 | No       |                                  |
+| lock_waits_note              | ""                                       | No       |                                  |
+| lock_waits_docs              | A high number of lock waits per second is caused by lock contention. Try reducing lock contention by using more fine-grained locking in the queries. | No       |                                  |
+| lock_waits_filter_override   | ""                                       | No       |                                  |
+| lock_waits_alerting_enabled  | True                                     | No       |                                  |
+| lock_waits_priority          | 4                                        | No       | Number from 1 (high) to 5 (low). |
+
+
+## Batches Compiled Percent
+
+When this metric is high, a lot of queries need to be recompiled. Consider parameterizing more queries by using stored procedures, using forced parameterization or allocating more memory.
+
+Query:
+```terraform
+avg(last_1d):(max:sqlserver.stats.sql_compilations{tag:xxx} by {host} / max:sqlserver.stats.batch_requests{tag:xxx} by {host}) * 100 >= 20
+```
+
+| variable                                   | default                                  | required | description                      |
+|--------------------------------------------|------------------------------------------|----------|----------------------------------|
+| batches_compiled_percent_enabled           | True                                     | No       |                                  |
+| batches_compiled_percent_warning           | 10                                       | No       |                                  |
+| batches_compiled_percent_critical          | 20                                       | No       |                                  |
+| batches_compiled_percent_evaluation_period | last_1d                                  | No       |                                  |
+| batches_compiled_percent_note              | ""                                       | No       |                                  |
+| batches_compiled_percent_docs              | When this metric is high, a lot of queries need to be recompiled. Consider parameterizing more queries by using stored procedures, using forced parameterization or allocating more memory. | No       |                                  |
+| batches_compiled_percent_filter_override   | ""                                       | No       |                                  |
+| batches_compiled_percent_alerting_enabled  | True                                     | No       |                                  |
+| batches_compiled_percent_priority          | 4                                        | No       | Number from 1 (high) to 5 (low). |
+
+
+## Procs Blocked
+
+A high number of blocked procs can indicate deadlocks. Check for deadlocks by investigating which queries are waiting for locks to be released.
+
+Query:
+```terraform
+avg(last_10m):max:sqlserver.stats.procs_blocked{tag:xxx} by {host} >= 1
+```
+
+| variable                        | default                                  | required | description                      |
+|---------------------------------|------------------------------------------|----------|----------------------------------|
+| procs_blocked_enabled           | True                                     | No       |                                  |
+| procs_blocked_warning           | None                                     | No       |                                  |
+| procs_blocked_critical          | 1                                        | No       |                                  |
+| procs_blocked_evaluation_period | last_10m                                 | No       |                                  |
+| procs_blocked_note              | ""                                       | No       |                                  |
+| procs_blocked_docs              | A high number of blocked procs can indicate deadlocks. Check for deadlocks by investigating which queries are waiting for locks to be released. | No       |                                  |
+| procs_blocked_filter_override   | ""                                       | No       |                                  |
+| procs_blocked_alerting_enabled  | True                                     | No       |                                  |
+| procs_blocked_priority          | 3                                        | No       | Number from 1 (high) to 5 (low). |
+
+
+## Module Variables
+
+| variable                   | default  | required | description  |
+|----------------------------|----------|----------|--------------|
+| env                        |          | Yes      |              |
+| alert_env                  |          | Yes      |              |
+| filter_str                 |          | Yes      |              |
+| service                    |          | Yes      |              |
+| notification_channel       |          | Yes      |              |
+| additional_tags            | []       | No       |              |
+| name_prefix                | ""       | No       |              |
+| name_suffix                | ""       | No       |              |
+| locked                     | True     | No       |              |
+| service_check_include_tags | None     | No       |              |
+| service_check_exclude_tags | None     | No       |              |
+
+
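
Note on the README tables: every per-monitor variable is optional, so a caller only needs to set the ones that should differ from the defaults. The following is a minimal sketch of that idea, assuming the variables accept the value types their documented defaults suggest. The threshold values, the `staging` environment, and the `ops@example.com` address are hypothetical, and the exact effect of the `*_enabled` flag is assumed from its name.

```terraform
module "sql_server_staging" {
  source = "kabisa/sql-server/datadog"

  # Required module variables (see "Module Variables" in the README).
  notification_channel = "ops@example.com" # hypothetical recipient
  service              = "SQL Server"
  env                  = "staging"         # hypothetical environment name
  alert_env            = "staging"
  filter_str           = "role:sqlserver"

  # Raise the connection thresholds above the documented defaults (400 / 500).
  connections_warning  = 800
  connections_critical = 1000

  # Treat low page life expectancy as more urgent than the default priority 4.
  page_life_expectancy_priority = 3

  # Assumption: setting an *_enabled flag to false turns that monitor off.
  batches_compiled_percent_enabled = false
}
```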

examples/example.tf

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+module "sql_server" {
+  source = "kabisa/sql-server/datadog"
+
+  notification_channel       = "[email protected]"
+  service                    = "SQL Server"
+  env                        = "prd"
+  alert_env                  = "prd"
+  filter_str                 = "role:sqlserver"
+  service_check_include_tags = ["role:sqlserver"]
+}
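
Note on running the example: the module creates Datadog monitors, so the calling configuration needs a configured Datadog provider, which this commit does not include. The sketch below shows one possible setup. The variable names `datadog_api_key` and `datadog_app_key` are hypothetical, and keeping their values in the git-ignored `secrets.auto.tfvars` is an assumption based on the `.gitignore` entry, not something the commit states.

```terraform
terraform {
  required_providers {
    datadog = {
      source = "DataDog/datadog"
    }
  }
}

# Hypothetical variable names; the values could live in secrets.auto.tfvars,
# which .gitignore already excludes from version control.
variable "datadog_api_key" {
  type      = string
  sensitive = true
}

variable "datadog_app_key" {
  type      = string
  sensitive = true
}

provider "datadog" {
  api_key = var.datadog_api_key
  app_key = var.datadog_app_key
}
```

The `sensitive` argument needs Terraform 0.14 or later; on older versions it can simply be dropped.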

module_description.md

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,4 @@
1+
This module requires the [sql server integration](https://docs.datadoghq.com/integrations/sqlserver/?tab=host) to be configured.
2+
It has basic SQL Server monitoring. Locks, process blocked, connectivity.
3+
It's best to also use Datadog's APM instrumentation to understand the way the application is using the database.
4+
There's an upcoming feature in Datadog to fully support deep dive database monitoring.
