
Commit 1bdc0cc

Openshift haproxy promcat (#227)
* openshift haproxy promcat
* Change json dashboards names
* change openshift version from 4.3 to 4.7
* add description to openshift-haproxy
* alerts format fix
* install and setup guide update
* dashboards update
* configuration error fixed
* fix absent alert and delete old images
* alert change
* fix promQL alert

Co-authored-by: David de Torres <[email protected]>
1 parent 2137226 commit 1bdc0cc

18 files changed

+4665
-209
lines changed

apps/openshift-haproxy-router.yaml

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -7,10 +7,10 @@ keywords:
77
- Load-balancer
88
availableVersions:
99
- "3.11"
10-
- "4.3"
10+
- "4.7"
1111
shortDescription: "HAProxy ingress router for OpenShift"
1212
description: |
13-
#
13+
A highly available load balancer and proxy server for TCP and HTTP-based applications that automatically exposes services within the cluster through routes, and offers TLS termination, re-encryption, or SNI-passthrough on ports 80 and 443.
1414
icon: https://raw.githubusercontent.com/sysdiglabs/promcat-resources/master/apps/images/openshift-haproxy.png
1515
website: https://github.com/openshift/router
1616
available: true
Lines changed: 7 additions & 21 deletions
Original file line numberDiff line numberDiff line change
@@ -1,35 +1,21 @@
11
# Getting the authentication of the HAProxy router
2-
The metrics endpoint of the HAProxy router in OpenShift 3.11 has a simple HTTP authentication configuration with username and password.
2+
The metrics endpoint of the HAProxy router in OpenShift 3.11 has a basic HTTP authentication configuration with username and password.
33

44
To retrieve the username and password, run the following commands:
55
```
66
# USER
7-
kubectl -n default get deploymentConfig router -o json | jq -r '.spec.template.spec.containers[].env[] | select( .name | contains("STATS_USERNAME")) | .value'
7+
export USER=`kubectl -n default get deploymentConfig router -o json | jq -r '.spec.template.spec.containers[].env[] | select( .name | contains("STATS_USERNAME")) | .value'`
88
99
# PASSWORD
10-
kubectl -n default get deploymentConfig router -o json | jq -r '.spec.template.spec.containers[].env[] | select( .name | contains("STATS_PASSWORD")) | .value'
10+
export PASS=`kubectl -n default get deploymentConfig router -o json | jq -r '.spec.template.spec.containers[].env[] | select( .name | contains("STATS_PASSWORD")) | .value'`
1111
```
1212

1313
>Note: to execute these commands you will need the tool [jq](https://stedolan.github.io/jq/)
1414
15-
# Sysdig Agent configuration
16-
To configure Sysdig Agent to collect metrics from the HAProxy router in OpenShift 4.3, do the following:
15+
The Prometheus monitoring stack is installed with OpenShift Container Platform by default, so no additional configuration is needed in the `prometheus.yml` file.
1716

18-
1. Copy the values of the `USER` and `PASSWORD` retrieved in the previous step.
17+
You can now check the HAProxy router metrics (remember to port-forward port 1936):
1918

20-
2. Add them to the job section of the `prometheus.yaml` file as follows:
21-
```yaml
22-
scrape_configs:
23-
- job_name: 'haproxy-router'
24-
basic_auth:
25-
username: USER
26-
password: PASSWORD
27-
relabel_configs:
28-
- action: keep
29-
source_labels:
30-
- __meta_kubernetes_namespace
31-
- __meta_kubernetes_pod_name
32-
separator: '/'
33-
regex: 'default/router-1-.+'
3419
```
35-
See the example configuration given below.
20+
curl -u $USER:$PASS http://ROUTERIP:1936/metrics
21+
```

resources/openshift-haproxy-router/INSTALL.v4.3.md

Lines changed: 0 additions & 35 deletions
This file was deleted.
Lines changed: 21 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,21 @@
1+
# Getting the authentication of the HAProxy router
2+
The metrics endpoint of the HAProxy router in OpenShift 4.7 has a basic HTTP authentication configuration with username and password.
3+
4+
To retrieve the username and password, run the following commands:
5+
```
6+
# USER
7+
export USER=`echo $(kubectl -n openshift-ingress get secret router-stats-default -o json | jq -r '.data.statsUsername') | base64 --decode`
8+
9+
# PASSWORD
10+
export PASS=`echo $(kubectl -n openshift-ingress get secret router-stats-default -o json | jq -r '.data.statsPassword') | base64 --decode`
11+
```
12+
13+
>Note: to execute these commands you will need the tool [jq](https://stedolan.github.io/jq/)
14+
15+
The Prometheus monitoring stack is installed with OpenShift Container Platform by default, so no additional configuration is needed in the `prometheus.yml` file.
16+
17+
You can now check the HAProxy router metrics (remember to port-forward port 1936):
18+
19+
```
20+
curl -u $USER:$PASS http://ROUTERIP:1936/metrics
21+
```
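
The guide asks for a port-forward of port 1936 but does not show one. A minimal sketch follows, assuming the default IngressController's Deployment is named `router-default` (an assumption, not stated in the guide) and that `USER` and `PASS` were exported as shown above. The helper name `fetch_router_metrics` is hypothetical.

```shell
#!/bin/sh
# Sketch: fetch router metrics through a temporary port-forward.
# Assumptions (not from the guide): the router Deployment is called
# "router-default", and USER/PASS were exported by the commands above.

fetch_router_metrics() {
  namespace="${1:-openshift-ingress}"
  target="${2:-deployment/router-default}"

  # Forward local port 1936 to the router's stats port in the background.
  kubectl -n "$namespace" port-forward "$target" 1936:1936 >/dev/null 2>&1 &
  forward_pid=$!
  sleep 2  # give the tunnel a moment to come up

  # Query the metrics endpoint with the basic-auth credentials.
  curl -s -u "$USER:$PASS" http://localhost:1936/metrics
  status=$?

  # Tear the tunnel down again; ignore the error if it already exited.
  kill "$forward_pid" 2>/dev/null || true
  return "$status"
}
```

Calling `fetch_router_metrics | head` should print the first few metric lines if the assumptions hold. A different namespace or target can be passed as the first and second argument.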

resources/openshift-haproxy-router/README.md

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -2,12 +2,12 @@
22
OpenShift offers several ingress router options; one of them is based on HAProxy 2.0.
33

44
# Metrics
5-
The HAProxy ingress router instruments Prometheus metrics, and in OpenShift the endpoint is protected with user and password.
5+
The HAProxy ingress router exposes Prometheus metrics; in OpenShift, the endpoint is protected with a username and password by default.
66

77
## Number of time series generated
8-
The HAProxy ingress router generates ~400 time series.
8+
The HAProxy ingress router generates ~400 time series per HAProxy router pod.
99

1010
# Attributions
1111
The configuration files, dashboards, and alerts are maintained by [Sysdig team](https://sysdig.com/).
1212

13-
Using the [HAProxy Kubernetes ingress controller](https://github.com/haproxytech/kubernetes-ingress) and [OpenShift router](https://github.com/openshift/router) with the Apache 2.0 license.
13+
Using the [HAProxy Kubernetes ingress controller](https://github.com/haproxytech/kubernetes-ingress) and [OpenShift router](https://github.com/openshift/router) with the Apache 2.0 license.
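
The figure of ~400 time series per router pod can be sanity-checked against a live endpoint. The sketch below counts sample lines in the Prometheus text exposition format, where each non-comment line corresponds to one series. The helper name `count_series` is hypothetical, and the actual count will vary with the number of routes.

```shell
#!/bin/sh
# Count metric samples (non-comment, non-blank lines) in Prometheus
# text exposition format read from standard input.
count_series() {
  grep -v '^#' | grep -c '[^[:space:]]'
}

# Example usage, assuming port 1936 is forwarded to a router pod
# and USER/PASS hold the stats credentials:
#   curl -s -u "$USER:$PASS" http://localhost:1936/metrics | count_series
```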
Lines changed: 66 additions & 67 deletions
Original file line numberDiff line numberDiff line change
@@ -1,79 +1,78 @@
11
apiVersion: v1
22
kind: Alert
3-
app: OpenShift HAProxy Router
3+
app: 'OpenShift HAProxy Router'
44
version: 1.0.0
55
appVersion:
66
- '3.11'
7-
- '4.3'
7+
- '4.7'
88
descriptionFile: ALERTS.md
99
configurations:
1010
- kind: Prometheus
1111
data: |
1212
groups:
1313
- name: OpenShift-HAProxy-Router
1414
rules:
15-
- alert: RouterDown
16-
expr: |
17-
absent((count(haproxy_process_start_time_seconds) < 1))
18-
for: 10m
19-
labels:
20-
severity: page
21-
annotations:
22-
summary: Router HAProxy down. No instances running.
23-
- alert: DownTimeInService
24-
expr: |
25-
haproxy_backend_downtime_seconds_total > 0
26-
for: 10m
27-
labels:
28-
severity: page
29-
annotations:
30-
summary: DownTime detected in service. Route {{$labels.route}}, pod {{labels.pod}}
31-
- alert: RouteDown
32-
expr: |
33-
sum by (route) (haproxy_server_up==1) == 0
34-
for: 10m
35-
labels:
36-
severity: page
37-
annotations:
38-
summary: All servers are down in route {{$labels.route}}
39-
- alert: HighLatency
40-
expr: |
41-
max by (route)(haproxy_server_http_average_response_latency_milliseconds{route!=""}) > 250
42-
for: 10m
43-
labels:
44-
severity: page
45-
annotations:
46-
summary: High latency in at least one server for the route {{$labels.route}}
47-
- alert: PodHealthCheckFailure
48-
expr: |
49-
rate(haproxy_server_check_failures_total[5m]) > 0
50-
for: 10m
51-
labels:
52-
severity: page
53-
annotations:
54-
summary: Recurrent health check failure in pod {{$labels.pod}} and route {{$labels.route}}
55-
- alert: QueueNotEmptyInRoute
56-
expr: |
57-
sum by (route)(haproxy_server_current_queue{route!=""}) > 0
58-
for: 10m
59-
labels:
60-
severity: page
61-
annotations:
62-
summary: Queue not empty in route {{$labels.route}}
63-
- alert: HighErrorRateInRoute
64-
expr: |
65-
sum by (route) (rate(haproxy_server_http_responses_total{code!="2xx"}[5m])) /
66-
sum by (route) (rate(haproxy_server_http_responses_total{}[5m]))
67-
for: 10m
68-
labels:
69-
severity: page
70-
annotations:
71-
summary: High error rate in route {{$labels.route}}
72-
- alert: ConnectionErrorsInRoute
73-
expr: |
74-
sum by (route)(rate(haproxy_server_connection_errors_total{route!=""}[5m])) > 0
75-
for: 10m
76-
labels:
77-
severity: page
78-
annotations:
79-
summary: Recurring connection errors in route {{$labels.route}}
15+
- alert: '[OpenShift-HAProxy-Router] Router Down'
16+
expr: |
17+
absent(haproxy_process_start_time_seconds) == 1
18+
for: 10m
19+
labels:
20+
severity: critical
21+
annotations:
22+
description: Router HAProxy down. No instances running.
23+
- alert: '[OpenShift-HAProxy-Router] Percentage of routers low'
24+
expr: |
25+
count (haproxy_process_start_time_seconds)/sum (kube_workload_status_desired) < 0.75
26+
for: 10m
27+
labels:
28+
severity: critical
29+
annotations:
30+
description: Fewer than 75% of routers are up
31+
- alert: '[OpenShift-HAProxy-Router] Route Down'
32+
expr: |
33+
sum by (namespace,route)(haproxy_server_up) < 1
34+
for: 10m
35+
labels:
36+
severity: critical
37+
annotations:
38+
description: This alert detects if all servers are down in a route
39+
- alert: '[OpenShift-HAProxy-Router] High Latency'
40+
expr: |
41+
max by (namespace,route)(haproxy_server_http_average_response_latency_milliseconds{route!=""}) > 250
42+
for: 10m
43+
labels:
44+
severity: warning
45+
annotations:
46+
description: This alert detects high latency in at least one server of the route
47+
- alert: '[OpenShift-HAProxy-Router] Pod Health Check Failure'
48+
expr: |
49+
sum by (namespace,route,pod)(rate(haproxy_server_check_failures_total[5m])) > 0
50+
for: 10m
51+
labels:
52+
severity: critical
53+
annotations:
54+
description: This alert triggers when there is a recurrent pod health check failure.
55+
- alert: '[OpenShift-HAProxy-Router] Queue not empty in route'
56+
expr: |
57+
sum by (namespace,route)(haproxy_server_current_queue{route!=""}) > 0
58+
for: 10m
59+
labels:
60+
severity: warning
61+
annotations:
62+
description: This alert triggers when a queue is not empty in a route
63+
- alert: '[OpenShift-HAProxy-Router] High error rate in route'
64+
expr: |
65+
sum by (namespace,route) (rate(haproxy_server_http_responses_total{code!="2xx"}[5m])) /sum by (namespace,route) (rate(haproxy_server_http_responses_total[5m]))> 0.15
66+
for: 10m
67+
labels:
68+
severity: critical
69+
annotations:
70+
description: This alert triggers when there is a high error rate in a route.
71+
- alert: '[OpenShift-HAProxy-Router] Connection errors in route'
72+
expr: |
73+
sum by (namespace,route)(rate(haproxy_server_connection_errors_total{route!=""}[5m])) > 0
74+
for: 10m
75+
labels:
76+
severity: warning
77+
annotations:
78+
description: This alert triggers when there are recurring connection errors in a route
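
Before loading these rules, each expression can be tried ad hoc against a running Prometheus through its HTTP query API. This is a minimal sketch: `PROM_URL` and the helper name `prom_query` are assumptions, and the server must already scrape the router metrics.

```shell
#!/bin/sh
# Sketch: evaluate a PromQL expression through the Prometheus HTTP API.
# PROM_URL is an assumption; point it at a Prometheus that scrapes the router.
PROM_URL="${PROM_URL:-http://localhost:9090}"

prom_query() {
  # -G sends the data as a query string; --data-urlencode escapes the PromQL.
  curl -s -G "$PROM_URL/api/v1/query" --data-urlencode "query=$1"
}

# Example usage with the "Route Down" expression from the rules above:
#   prom_query 'sum by (namespace,route)(haproxy_server_up) < 1'
```

An empty `result` array in the JSON response means the condition currently matches no series, so the alert would not fire.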

resources/openshift-haproxy-router/dashboards.yaml

Lines changed: 5 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -4,11 +4,11 @@ app: 'OpenShift HAProxy Router'
44
version: 1.0.0
55
appVersion:
66
- "3.11"
7-
- "4.3"
7+
- "4.7"
88
configurations:
99
- name: 'HAProxy OC Ingress Overview'
1010
kind: Sysdig
11-
image: 'openshift-haproxy-router/images/HAProxy_OC_Ingress_Overview-sysdig-dashboard.png'
11+
image: 'openshift-haproxy-router/images/HAProxy_OC_Ingress_Overview-sysdig-dashboard-v4.7.png'
1212
description: |
1313
This dashboard offers information on:
1414
* Up Time
@@ -29,10 +29,10 @@ configurations:
2929
* Frontend Connections
3030
* Frontend Bytes Out
3131
* Frontend HTTP Requests
32-
file: include/HAProxy_OC_Ingress_Overview-sysdig-dashboard.json
32+
file: include/HAProxy_OC_Ingress_Overview-sysdig-dashboard-v4.7.json
3333
- name: 'HAProxy OC Service Golden Signals'
3434
kind: Sysdig
35-
image: 'openshift-haproxy-router/images/HAProxy_OC_Service_Golden_Signals-sysdig-dashboard.png'
35+
image: 'openshift-haproxy-router/images/HAProxy_OC_Service_Golden_Signals-sysdig-dashboard-v4.7.png'
3636
description: |
3737
This dashboard offers information on:
3838
* Servers
@@ -46,7 +46,7 @@ configurations:
4646
* Responses OK
4747
* Bytes Inbound
4848
* Bytes Outbound
49-
file: include/HAProxy_OC_Service_Golden_Signals-sysdig-dashboard.json
49+
file: include/HAProxy_OC_Service_Golden_Signals-sysdig-dashboard-v4.7.json
5050
- name: 'HAProxy OC Ingress Overview'
5151
kind: Grafana
5252
image: 'openshift-haproxy-router/images/HAProxy_OC_Ingress_Overview-grafana-dashboard.png'

resources/openshift-haproxy-router/description.yaml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -4,5 +4,5 @@ app: 'OpenShift HAProxy Router'
44
version: 1.0.0
55
appVersion:
66
- "3.11"
7-
- "4.3"
7+
- "4.7"
88
descriptionFile: README.md
2.36 MB
Binary file not shown.
