
Commit 42c9264

Merge pull request #209 from sysdiglabs/staging
Deployment to production
2 parents f0db22f + ee5f1c7 commit 42c9264

14 files changed, +4281 -0 lines changed

apps/images/portworx.svg

Lines changed: 2 additions & 0 deletions

apps/portworx.yaml

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+---
+apiVersion: v1
+kind: App
+name: "Portworx"
+keywords:
+  - Storage
+  - Available
+availableVersions:
+  - '2.9'
+shortDescription: "Portworx is an end-to-end storage and data management solution for Kubernetes"
+description: |
+  Portworx provides a fully integrated solution for persistent storage, data protection, disaster recovery, data security, cross-cloud and data migrations, and automated capacity management for applications running on Kubernetes.
+icon: https://raw.githubusercontent.com/sysdiglabs/promcat-resources/master/apps/images/portworx.svg
+website: https://portworx.com/
+available: true

resources/portworx/ALERTS.md

Lines changed: 61 additions & 0 deletions
@@ -0,0 +1,61 @@
+# Alerts
+## No Quorum
+The Portworx cluster has no quorum.
+
+## Node Status Not OK
+A Portworx node reports a status other than OK.
+
+## Offline Nodes
+One or more Portworx nodes are offline.
+
+## Nodes Storage Full or Down
+One or more Portworx nodes have storage that is full or down.
+
+## Offline Storage Nodes
+One or more Portworx storage nodes are offline.
+
+## Unhealthy Node KVDB
+A Portworx node reports an unhealthy KVDB.
+
+## Cache read hit rate is low
+The Portworx cache read hit rate is below 80%.
+
+## Cache write hit rate is low
+The Portworx cache write hit rate is below 80%.
+
+## High Read Latency In Disk
+Read latency on a Portworx disk is above 100 ms.
+
+## High Write Latency In Disk
+Write latency on a Portworx disk is above 250 ms.
+
+## Low Cluster Capacity
+Less than 10% of the Portworx cluster disk capacity is available.
+
+## Disk Full In 48H
+Portworx cluster disk space is predicted to run out within 48 hours.
+
+## Disk Full In 12H
+Portworx cluster disk space is predicted to run out within 12 hours.
+
+## Pool Status Not Online
+A Portworx pool reports a status other than online.
+
+## High Write Latency In Pool
+Write latency on a Portworx pool is above 250 ms.
+
+## Pool Full In 48H
+A Portworx pool is predicted to run out of space within 48 hours.
+
+## Pool Full In 12H
+A Portworx pool is predicted to run out of space within 12 hours.
+
+## High Write Latency In Volume
+Write latency on a Portworx volume is above 250 ms.
+
+## High Read Latency In Volume
+Read latency on a Portworx volume is above 100 ms.
+
+## License Expiry
+The Portworx license is close to expiring.
+

resources/portworx/INSTALL.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,2 @@
1+
# Prerequisites
2+
Portworx instruments Prometheus metrics and annotates the pods with Prometheus annotations.

resources/portworx/README.md

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,12 @@
1+
# Portworx
2+
Portworx provides a fully integrated solution for persistent storage, data protection, disaster recovery, data security, cross-cloud and data migrations, and automated capacity management for applications running on Kubernetes.
3+
4+
# Prometheus and exporters
5+
Portworx already has a Prometheus endpoint with all the metrics exposed on the port 9001 (port 17001 if deployed in Openshift). In Kubernetes the pod is already annotated, so with the Sysdig agent you can scrape the endpoint right away.
6+
7+
# Metrics
8+
- Portworx cluster statistics
9+
- Portworx volumes statistics
10+
11+
# Attributions
12+
Configuration files, dashboards and alerts are maintained by [Sysdig team](https://sysdig.com/).
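
As a rough illustration of the endpoint described in the README, the following Python sketch builds the scrape URL and parses a few lines of Prometheus text output. Only the two port numbers come from the README; the `/metrics` path, the helper names `metrics_url` and `parse_samples`, and the sample values are assumptions made for illustration.

```python
from typing import Dict

# Ports quoted in the README. The "/metrics" path is an assumption.
DEFAULT_PORT = 9001
OPENSHIFT_PORT = 17001


def metrics_url(host: str, openshift: bool = False) -> str:
    """Build the scrape URL for a Portworx node (hypothetical helper)."""
    port = OPENSHIFT_PORT if openshift else DEFAULT_PORT
    return f"http://{host}:{port}/metrics"


def parse_samples(text: str) -> Dict[str, float]:
    """Parse Prometheus text-format lines into {series: value}.

    Comment lines (# HELP / # TYPE) and blank lines are skipped. The key
    keeps its label set, e.g. 'px_volume_iops{pvc="a"}'. Lines carrying an
    optional trailing timestamp are not handled by this sketch.
    """
    samples: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last whitespace-separated field on the line.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


# Made-up payload using metric names that appear in alerts.yaml.
EXAMPLE = """\
# TYPE px_cluster_status_quorum gauge
px_cluster_status_quorum 1
px_cluster_status_nodes_offline 0
"""

if __name__ == "__main__":
    print(metrics_url("px-node-1"))
    print(parse_samples(EXAMPLE))
```

Against a live node, the payload could be fetched with `urllib.request.urlopen(metrics_url(host))` and passed to `parse_samples`; in practice the Sysdig agent or a Prometheus server does this scraping.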

resources/portworx/alerts.yaml

Lines changed: 164 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,164 @@
1+
apiVersion: v1
2+
kind: Alert
3+
app: Portworx
4+
version: 1.0.0
5+
appVersion:
6+
- '2.9'
7+
descriptionFile: ALERTS.md
8+
configurations:
9+
- kind: Prometheus
10+
data: |-
11+
groups:
12+
- name: Portworx
13+
rules:
14+
- alert: '[Portworx] No Quorum'
15+
expr: "px_cluster_status_quorum != 1"
16+
for: 5m
17+
labels:
18+
severity: critical
19+
annotations:
20+
description: Portworx No Quorum.
21+
- alert: '[Portworx] Node Status Not OK'
22+
expr: "px_node_status_node_status != 2"
23+
for: 5m
24+
labels:
25+
severity: critical
26+
annotations:
27+
description: Portworx Node Status Not OK.
28+
- alert: '[Portworx] Offline Nodes'
29+
expr: "px_cluster_status_nodes_offline > 0"
30+
for: 5m
31+
labels:
32+
severity: critical
33+
annotations:
34+
description: Portworx Offline Nodes.
35+
- alert: '[Portworx] Nodes Storage Full or Down'
36+
expr: "px_cluster_status_nodes_storage_down > 0"
37+
for: 5m
38+
labels:
39+
severity: critical
40+
annotations:
41+
description: Portworx Nodes Storage Full or Down.
42+
- alert: '[Portworx] Offline Storage Nodes'
43+
expr: "px_cluster_status_storage_nodes_offline > 0"
44+
for: 5m
45+
labels:
46+
severity: critical
47+
annotations:
48+
description: Portworx Offline Storage Nodes.
49+
- alert: '[Portworx] Unhealthy Node KVDB'
50+
expr: "px_kvdb_health_state_node_view == 2"
51+
for: 5m
52+
labels:
53+
severity: critical
54+
annotations:
55+
description: Portworx Unhealthy Node KVDB.
56+
- alert: '[Portworx] Cache read hit rate is low'
57+
expr: "px_px_cache_read_hits/( px_px_cache_read_hits + px_px_cache_read_miss)< 0.80"
58+
for: 5m
59+
labels:
60+
severity: warning
61+
annotations:
62+
description: Portworx Cache read hit rate is low.
63+
- alert: '[Portworx] Cache write hit rate is low'
64+
expr: "px_px_cache_write_hits/( px_px_cache_write_hits + px_px_cache_write_miss)< 0.80"
65+
for: 5m
66+
labels:
67+
severity: warning
68+
annotations:
69+
description: Portworx Cache write hit rate is low.
70+
- alert: '[Portworx] High Read Latency In Disk'
71+
expr: |
72+
px_disk_stats_read_latency_seconds{ disk=~$disk} > 0.100
73+
for: 5m
74+
labels:
75+
severity: warning
76+
annotations:
77+
description: Portworx High Read Latency In Disk.
78+
- alert: '[Portworx] High Write Latency In Disk'
79+
expr: |
80+
px_disk_stats_write_latency_seconds{ disk=~$disk} > 0.250
81+
for: 5m
82+
labels:
83+
severity: warning
84+
annotations:
85+
description: Portworx High Write Latency In Disk.
86+
- alert: '[Portworx] Low Cluster Capacity'
87+
expr: |
88+
(sum (px_cluster_disk_available_bytes))/(sum (px_cluster_disk_total_bytes))< 0.10
89+
for: 5m
90+
labels:
91+
severity: critical
92+
annotations:
93+
description: Portworx Low Cluster Capacity.
94+
- alert: '[Portworx] Disk Full In 48H'
95+
expr: |
96+
predict_linear(px_cluster_disk_available_bytes[48h], 48 * 3600) < 0
97+
for: 5m
98+
labels:
99+
severity: warning
100+
annotations:
101+
description: Portworx Disk Full In 48H.
102+
- alert: '[Portworx] Disk Full In 12H'
103+
expr: |
104+
predict_linear(px_cluster_disk_available_bytes[12h], 12 * 3600) < 0
105+
for: 5m
106+
labels:
107+
severity: warning
108+
annotations:
109+
description: Portworx Disk Full In 12H.
110+
- alert: '[Portworx] Pool Status Not Online'
111+
expr: "px_pool_stats_status{ pool=~$pool} != 1"
112+
for: 5m
113+
labels:
114+
severity: warning
115+
annotations:
116+
description: Portworx Node Status Not Online.
117+
- alert: '[Portworx] High Write Latency In Pool'
118+
expr: |
119+
px_pool_stats_write_latency_seconds{ pool=~$pool} > 0.250
120+
for: 5m
121+
labels:
122+
severity: warning
123+
annotations:
124+
description: Portworx High Write Latency In Pool.
125+
- alert: '[Portworx] Pool Full In 48H'
126+
expr: |
127+
predict_linear(px_pool_stats_available_bytes{ pool=~$pool}[48h], 48 * 3600) < 0
128+
for: 5m
129+
labels:
130+
severity: warning
131+
annotations:
132+
description: Portworx Pool Full In 48H.
133+
- alert: '[Portworx] Pool Full In 12H'
134+
expr: |
135+
predict_linear(px_pool_stats_available_bytes{ pool=~$pool}[12h], 12 * 3600) < 0
136+
for: 5m
137+
labels:
138+
severity: warning
139+
annotations:
140+
description: Portworx Pool Full In 12H.
141+
- alert: '[Portworx] High Write Latency In Volume'
142+
expr: |
143+
px_volume_write_latency_seconds{ pvc=~$pvc} > 0.250
144+
for: 5m
145+
labels:
146+
severity: warning
147+
annotations:
148+
description: Portworx High Write Latency In Volume.
149+
- alert: '[Portworx] High Read Latency In Volume'
150+
expr: |
151+
px_volume_read_latency_seconds{ pvc=~$pvc} > 0.100
152+
for: 5m
153+
labels:
154+
severity: warning
155+
annotations:
156+
description: Portworx High Read Latency In Volume.
157+
- alert: '[Portworx] License Expiry'
158+
expr: |
159+
min(px_node_status_license_expiry) < 30
160+
for: 5m
161+
labels:
162+
severity: warning
163+
annotations:
164+
description: Portworx License Expiry.
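
To make two of the rule expressions in alerts.yaml more concrete, here is a minimal Python sketch of what `predict_linear(...) < 0` and the cache hit-rate ratio compute. It approximates Prometheus's `predict_linear` with an ordinary least-squares fit and assumes the evaluation time equals the timestamp of the last sample; the function names and the sample data are invented for illustration and are not part of the commit.

```python
from typing import Sequence, Tuple

GIB = 2 ** 30


def predict_linear(samples: Sequence[Tuple[float, float]], horizon_s: float) -> float:
    """Least-squares extrapolation `horizon_s` seconds past the last sample.

    `samples` is a sequence of (timestamp_seconds, value) pairs. This is a
    simplified stand-in for the PromQL function, not its exact implementation.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    cov = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    var = sum((t - mean_t) ** 2 for t, _ in samples)
    if var == 0:
        raise ValueError("samples must have distinct timestamps")
    slope = cov / var
    intercept = mean_v - slope * mean_t
    t_eval = samples[-1][0]  # assumption: evaluate at the last sample
    return slope * (t_eval + horizon_s) + intercept


def hit_rate(hits: float, misses: float) -> float:
    """hits / (hits + misses); NaN when there was no traffic at all."""
    total = hits + misses
    return hits / total if total else float("nan")


# Invented data: available space shrinks by 1 GiB per hour, from 40 GiB
# down to 28 GiB over a 12-hour window, sampled once per hour.
history = [(h * 3600.0, (40 - h) * GIB) for h in range(13)]

if __name__ == "__main__":
    # 28 GiB left at 1 GiB/hour: exhausted within 48 hours, not within 12.
    print("full in 48h:", predict_linear(history, 48 * 3600) < 0)
    print("full in 12h:", predict_linear(history, 12 * 3600) < 0)
    print("low hit rate:", hit_rate(70, 30) < 0.80)
```

In this sketch a NaN hit rate compares as false against the threshold, so a cache with no traffic does not trigger the condition.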

resources/portworx/dashboards.yaml

Lines changed: 28 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,28 @@
1+
apiVersion: v1
2+
kind: Dashboard
3+
app: Portworx
4+
version: 1.0.0
5+
appVersion:
6+
- '2.9'
7+
configurations:
8+
- name: Cluster
9+
kind: Sysdig
10+
image: portworx/images/portworx_cluster_sysdig.png
11+
description: |
12+
This dashboard offers information on:
13+
* Health
14+
* Network
15+
* Disk Stats
16+
* Pool Stats
17+
* Cache
18+
file: include/Portworx_Cluster.json
19+
- name: Volumes
20+
kind: Sysdig
21+
image: portworx/images/portworx_volumes_sysdig.png
22+
description: |
23+
This dashboard offers information on:
24+
* Volume Status
25+
* Volume Capacity & Usage
26+
* Volume Replication
27+
* Volume IOPS
28+
file: include/Portworx_Volumes.json
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+apiVersion: v1
+kind: Description
+app: Portworx
+version: 1.0.0
+appVersion:
+  - '2.9'
+descriptionFile: README.md