Commit e92f755

Merge pull request #133 from sysdiglabs/staging
Add new changes to production
2 parents ddcd07c + cd4f72d commit e92f755

42 files changed: 5905 additions and 146 deletions

apps/docker-engine.yaml

Lines changed: 3 additions & 3 deletions
@@ -4,13 +4,13 @@ kind: App
 name: "Docker"
 keywords:
 - Containers
-- Coming soon
+- Available
 availableVersions:
-- '19'
+- '18.09.9'
 shortDescription: "Docker container engine"
 description: |
   Docker Engine powers millions of applications worldwide, providing a standardized packaging format for diverse applications.
   Docker Engine is the industry’s de facto container runtime that runs on various Linux (CentOS, Debian, Fedora, Oracle Linux, RHEL, SUSE, and Ubuntu) and Windows Server operating systems. Docker creates simple tooling and a universal packaging approach that bundles up all application dependencies inside a container which is then run on Docker Engine. Docker Engine enables containerized applications to run anywhere consistently on any infrastructure, solving “dependency hell” for developers and operations teams, and eliminating the “it works on my laptop!” problem.
 icon: https://upload.wikimedia.org/wikipedia/commons/4/4e/Docker_%28container_engine%29_logo.svg
 website: https://www.docker.com/
-available: false
+available: true

apps/images/mssql.png

39.8 KB

apps/mssql.yaml

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+---
+apiVersion: v1
+kind: App
+name: "mssql"
+keywords:
+- Database
+- Available
+availableVersions:
+- '2019'
+shortDescription: "Microsoft SQL Server is a relational database management system developed by Microsoft"
+description: |
+  Microsoft SQL Server is a relational database management system developed by Microsoft. As a database server, it is a software product with the primary function of storing and retrieving data as requested by other software applications, which may run either on the same computer or on another computer across a network.
+icon: https://raw.githubusercontent.com/sysdiglabs/promcat-resources/master/apps/images/mssql.png
+website: https://www.microsoft.com/en-us/sql-server/sql-server-2019
+available: true

resources/aws-rds/ALERTS.md

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@ Disk will be full within 12 hours in instance.
 Average read latency over 250ms in instance.
 
 ## HighWriteLatency
-Average read latency over 250ms in instance.
+Average write latency over 250ms in instance.
 
 ## HighDiskQueue
 High disk queue depth in instance.

resources/aws-rds/alerts.yaml

Lines changed: 1 addition & 1 deletion
@@ -62,7 +62,7 @@ configurations:
       labels:
         severity: warning
       annotations:
-        summary: Average read latency over 250ms in instance {{$labels.dimension_DBInstanceIdentifier}}
+        summary: Average write latency over 250ms in instance {{$labels.dimension_DBInstanceIdentifier}}
     - alert: HighDiskQueue
       expr: |
         aws_rds_disk_queue_depth_average > 25

resources/ceph/include/ceph_grafana.json

Lines changed: 2 additions & 2 deletions
@@ -1923,8 +1923,8 @@
 {
   "current": {
     "selected": false,
-    "text": "Sysdig +carillan",
-    "value": "Sysdig +carillan"
+    "text": "Sysdig",
+    "value": "Sysdig"
   },
   "hide": 0,
   "includeAll": false,

resources/docker-engine/ALERTS.md

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+# Alerts
+## BuilderBuildsFailRateTooHigh
+The build failure rate is too high (instance {{ $labels.instance }})
+
+## DaemonContainerActionLatencyTooHigh
+The container action {{ $labels.action }} latency is too high for the instance {{ $labels.instance }}
+

resources/docker-engine/INSTALL.md

Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
+# Installing the exporter
+To expose the metrics, configure the Docker Engine with the following file, as described in the [documentation](https://docs.docker.com/config/daemon/prometheus/):
+
+To configure the Docker daemon as a Prometheus target, you need to specify the metrics-address. The best way to do this is via the daemon.json, which is located at one of the following locations by default. If the file does not exist, create it.
+
+- **Linux**: `/etc/docker/daemon.json`
+- **Windows Server**: `C:\ProgramData\docker\config\daemon.json`
+- **Docker Desktop for Mac / Docker Desktop for Windows**: Click the Docker icon in the toolbar, select **Preferences**, then select **Daemon**. Click **Advanced**.
+
+```json
+{
+  "metrics-addr" : "127.0.0.1:9323",
+  "experimental" : true
+}
+```
+
+# Sysdig Agent configuration
+For the Sysdig Agent to discover and scrape the endpoint automatically, enable the promscrape option in the agent configuration. An example Sysdig Agent configuration follows.
+
+```yaml
+prometheus.yaml: |
+  global:
+    scrape_interval: 10s
+  scrape_configs:
+  - job_name: docker
+    static_configs:
+    - targets:
+      - localhost:9323
+dragent.yaml: |-
+  use_promscrape: true
+  prometheus:
+    enabled: true
+    prom_service_discovery: true
+```
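
Once the daemon has been restarted with `metrics-addr` set, the endpoint can be checked before the agent is pointed at it. The following is a minimal sketch, not part of this commit: it assumes the daemon serves the Prometheus text exposition format at `http://127.0.0.1:9323/metrics`, and the helper names `count_samples` and `fetch_metrics` as well as the `SAMPLE` text are made up for illustration.

```python
from urllib.request import urlopen


def count_samples(exposition_text: str) -> int:
    """Count sample lines in Prometheus text exposition format.

    Blank lines and comment lines ("# HELP ...", "# TYPE ...") are skipped,
    so the result approximates the number of time series exposed.
    """
    count = 0
    for line in exposition_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            count += 1
    return count


def fetch_metrics(url: str = "http://127.0.0.1:9323/metrics",
                  timeout: float = 5.0) -> str:
    """Fetch the raw metrics text from the (assumed) daemon endpoint."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Illustrative input only; label values and numbers are invented.
SAMPLE = """\
# HELP builder_builds_failed_total Number of failed image builds
# TYPE builder_builds_failed_total counter
builder_builds_failed_total{reason="example"} 0
engine_daemon_container_actions_seconds_bucket{action="start",le="0.5"} 3
"""
```

On a host where the daemon is configured as above, `count_samples(fetch_metrics())` gives a rough series count to compare against the figure quoted in the README.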

resources/docker-engine/README.md

Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+# Docker engine
+Docker is an open platform for developing, shipping, and running applications. Docker enables you to separate your applications from your infrastructure so you can deliver software quickly. With Docker, you can manage your infrastructure in the same ways you manage your applications. By taking advantage of Docker’s methodologies for shipping, testing, and deploying code quickly, you can significantly reduce the delay between writing code and running it in production.
+
+# Metrics
+Docker offers the following metrics:
+- Builder metrics
+- Container state metrics
+- Subscribers metrics
+- Network, CPU, and memory metrics
+
+For further information, consult the [official Docker documentation](https://docs.docker.com/config/daemon/prometheus/).
+
+# Number of time series generated
+The number of time series generated for Docker is ~500.
+
+# Attributions
+Configuration files and dashboards maintained by the [Sysdig team](https://sysdig.com/).
Lines changed: 66 additions & 0 deletions
@@ -0,0 +1,66 @@
+apiVersion: v1
+kind: Alert
+app: Docker
+version: 1.0.0
+appVersion:
+- 18.09.9
+configurations:
+- kind: Prometheus
+  data: |
+    - alert: BuilderBuildsFailRateTooHigh
+      expr: sum(rate(builder_builds_failed_total[5m])) >
+      for: 5m
+      labels:
+        severity: critical
+      annotations:
+        summary: The build failure rate is too high (instance {{ $labels.instance }})
+    - alert: DaemonContainerActionLatencyTooHigh
+      expr: histogram_quantile(0.90, rate(engine_daemon_container_actions_seconds_bucket)) > 5
+      for: 5m
+      labels:
+        severity: critical
+      annotations:
+        summary: The container action {{ $labels.action }} latency is too high for the instance {{ $labels.instance }}
+- kind: Sysdig
+  data: |-
+    {
+      "alert": {
+        "condition": "sum(rate(builder_builds_failed_total[5m])) >",
+        "customNotification": {
+          "titleTemplate": "{{__alert_name__}} is {{__alert_status__}}",
+          "useNewTemplate": false
+        },
+        "enabled": true,
+        "name": "BuilderBuildsFailRateTooHigh",
+        "rateOfChange": false,
+        "reNotify": false,
+        "reNotifyMinutes": 5,
+        "severity": 4,
+        "severityLabel": "LOW",
+        "severityLevel": null,
+        "timespan": 600000000,
+        "type": "PROMETHEUS"
+      }
+    }
+- kind: Sysdig
+  data: |-
+    {
+      "alert": {
+        "condition": "histogram_quantile(0.90, rate(engine_daemon_container_actions_seconds_bucket)) > 5",
+        "customNotification": {
+          "titleTemplate": "{{__alert_name__}} is {{__alert_status__}}",
+          "useNewTemplate": false
+        },
+        "enabled": true,
+        "name": "DaemonContainerActionLatencyTooHigh",
+        "rateOfChange": false,
+        "reNotify": false,
+        "reNotifyMinutes": 5,
+        "severity": 4,
+        "severityLabel": "LOW",
+        "severityLevel": null,
+        "timespan": 600000000,
+        "type": "PROMETHEUS"
+      }
+    }
+descriptionFile: ALERTS.md
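
Review note: as captured here, the `BuilderBuildsFailRateTooHigh` expression ends in a bare `>` with no threshold, and the `DaemonContainerActionLatencyTooHigh` expression calls `rate()` on a selector with no range such as `[5m]`, which PromQL requires. The sketch below is a rough heuristic check for these two patterns, not a PromQL parser and not part of this commit; the function name `lint_expr` and the `0.1` threshold in the usage line are hypothetical.

```python
import re
from typing import List

# Functions that take a range vector; the list is deliberately short.
RANGE_FUNCS = ("rate", "irate", "increase", "delta")

# A comparison operator followed only by whitespace at the end of the expression.
_COMPARISON_AT_END = re.compile(r"(==|!=|>=|<=|>|<)\s*$")

# A call to one of RANGE_FUNCS whose argument contains no nested parentheses.
_RANGE_CALL = re.compile(r"\b(%s)\(([^()]*)\)" % "|".join(RANGE_FUNCS))


def lint_expr(expr: str) -> List[str]:
    """Return human-readable problems found in a PromQL-like expression."""
    problems = []
    if _COMPARISON_AT_END.search(expr.strip()):
        problems.append("comparison operator has no right-hand operand")
    for func, arg in _RANGE_CALL.findall(expr):
        if "[" not in arg:
            problems.append(f"{func}() argument has no range selector such as [5m]")
    return problems
```

For comparison, `lint_expr("sum(rate(builder_builds_failed_total[5m])) > 0.1")` returns an empty list. The threshold there is a placeholder, since the intended value is not stated in the commit.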
