Commit 1ecbfe9

Merge pull request #284 from typeid/interceptor_deployment
Interceptor deployment
2 parents a48cde1 + e059d90 commit 1ecbfe9

9 files changed, 189 additions and 40 deletions
Dockerfile

Lines changed: 4 additions & 2 deletions
@@ -4,12 +4,14 @@ FROM $BUILDER_IMG as builder
 ADD . /opt
 WORKDIR /opt
 
-RUN git update-index --refresh; make CGO_ENABLED=0 cadctl-install-local-force
+RUN make CGO_ENABLED=0 build-cadctl
+RUN make CGO_ENABLED=0 build-interceptor
 
 
 FROM quay.io/app-sre/ubi8-ubi-minimal:8.10 as runner
 
-COPY --from=builder /opt/cadctl/cadctl /bin/cadctl
+COPY --from=builder /opt/bin/cadctl /bin/cadctl
+COPY --from=builder /opt/bin/interceptor /bin/interceptor
 
 ARG BUILD_DATE
 ARG VERSION

Makefile

Lines changed: 0 additions & 12 deletions
@@ -107,15 +107,3 @@ coverage:
 
 .PHONY: validate
 validate: generate-template-file isclean
-
-# Build targets
-cadctl/cadctl: cadctl/**/*.go pkg/**/*.go go.mod go.sum
-	GOBIN=$(PWD)/cadctl go install -ldflags="-s -w" -mod=readonly -trimpath $(PWD)/cadctl
-
-.PHONY: cadctl-install-local
-cadctl-install-local: cadctl/cadctl
-
-.PHONY: cadctl-install-local-force
-cadctl-install-local-force:
-	rm cadctl/cadctl >/dev/null 2>&1 || true
-	make cadctl-install-local

README.md

Lines changed: 24 additions & 16 deletions
@@ -6,16 +6,17 @@
 
 - [Configuration Anomaly Detection](#configuration-anomaly-detection)
 - [About](#about)
+- [Overview](#overview)
+- [Workflow](#workflow)
 - [Contributing](#contributing)
+- [Building](#building)
 - [Adding a new investigation](#adding-a-new-investigation)
 - [Testing locally](#testing-locally)
 - [Pre-requirements](#pre-requirements)
 - [Running cadctl for an incident ID](#running-cadctl-for-an-incident-id)
 - [Documentation](#documentation)
-- [CAD CLI](#cad-cli)
 - [Investigations](#investigations)
 - [Integrations](#integrations)
-- [Overview](#overview)
 - [Templates](#templates)
 - [Dashboards](#dashboards)
 - [Deployment](#deployment)
@@ -32,8 +33,29 @@
 
 Configuration Anomaly Detection (CAD) is responsible for reducing manual SRE effort by pre-investigating alerts, detecting cluster anomalies and sending relevant communications to the cluster owner.
 
+## Overview
+
+CAD consists of:
+- a tekton deployment including a custom tekton interceptor
+- the `cadctl` command line tool implementing alert remediations and pre-investigations
+
+### Workflow
+
+1) [PagerDuty Webhooks](https://support.pagerduty.com/docs/webhooks) are used to trigger Configuration-Anomaly-Detection when a [PagerDuty incident](https://support.pagerduty.com/docs/incidents) is created
+2) The webhook routes to a [Tekton EventListener](https://tekton.dev/docs/triggers/eventlisteners/)
+3) Received webhooks are filtered by a [Tekton Interceptor](https://tekton.dev/docs/triggers/interceptors/) that uses the payload to evaluate whether the alert has an implemented handler function in `cadctl` or not. If there is no handler implemented, the alert is directly forwarded to a human SRE.
+4) If `cadctl` implements a handler for the received payload/alert, a [Tekton PipelineRun](https://tekton.dev/docs/pipelines/pipelineruns/) is started.
+5) The pipeline runs `cadctl` which determines the handler function by itself based on the payload.
+
+![CAD Overview](./images/cad_overview/cad_architecture_dark.png#gh-dark-mode-only)
+![CAD Overview](./images/cad_overview/cad_architecture_light.png#gh-light-mode-only)
+
 ## Contributing
 
+### Building
+
+For build targets, see `make help`.
+
 ### Adding a new investigation
 
 CAD investigations are triggered by PagerDuty webhooks. Currently, CAD supports the following two formats of webhooks:
@@ -70,10 +92,6 @@ Example usage:`./test/generate_incident.sh ClusterHasGoneMissing 2b94brrrrrrrrrr
 
 ## Documentation
 
-### CAD CLI
-
-* [cadctl](./cadctl/README.md) -- Performs investigation workflow.
-
 ### Investigations
 
 Every alert managed by CAD corresponds to an investigation, representing the executed code associated with the alert.
@@ -87,16 +105,6 @@ Investigation specific documentation can be found in the according investigation
 * [OCM](https://github.com/openshift-online/ocm-sdk-go) -- Retrieving cluster info, sending service logs, and managing (post, delete) limited support reasons.
 * [osd-network-verifier](https://github.com/openshift/osd-network-verifier) -- Tool to verify the pre-configured networking components for ROSA and OSD CCS clusters.
 
-### Overview
-
-- CAD is a command line tool that is run in tekton pipelines.
-- The tekton service is running on an app-sre cluster.
-- CAD is triggered by PagerDuty webhooks configured on selected services, meaning that all alerts in that service trigger a CAD pipeline.
-- CAD uses the data received via the webhook to determine which investigation to start.
-
-![CAD Overview](./images/cad_overview/cad_architecture_dark.png#gh-dark-mode-only)
-![CAD Overview](./images/cad_overview/cad_architecture_light.png#gh-light-mode-only)
-
 ### Templates
 
 * [Update-Template](./hack/update-template/README.md) -- Updating configuration-anomaly-detection-template.Template.yaml.
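
Step 3 of the workflow added to the README (the interceptor decides whether `cadctl` has a handler for an alert) could look roughly like the Go sketch below. This is not the code from this commit. The `interceptorRequest` and `interceptorResponse` types are simplified stand-ins for the JSON that Tekton Triggers exchanges with an interceptor, and the `alert_name` field, the `supportedAlerts` registry and the helper names are assumptions made for illustration. `ClusterHasGoneMissing` is taken from the README's example usage line.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// interceptorRequest and interceptorResponse are simplified, hypothetical
// stand-ins for the JSON exchanged between Tekton Triggers and an interceptor.
type interceptorRequest struct {
	Body string `json:"body"` // raw webhook payload, passed along as a string
}

type interceptorResponse struct {
	Continue bool `json:"continue"` // false stops trigger processing
}

// webhookPayload models only the field this sketch routes on.
// The field name is an assumption; real PagerDuty payloads look different.
type webhookPayload struct {
	AlertName string `json:"alert_name"`
}

// supportedAlerts is a hypothetical registry of alerts that have a handler.
var supportedAlerts = map[string]bool{"ClusterHasGoneMissing": true}

// shouldContinue reports whether the payload names a supported alert.
// Unparseable payloads are rejected rather than forwarded to a pipeline.
func shouldContinue(body string) bool {
	var p webhookPayload
	if err := json.Unmarshal([]byte(body), &p); err != nil {
		return false
	}
	return supportedAlerts[p.AlertName]
}

// handle decodes the interceptor request and answers with a continue decision.
func handle(w http.ResponseWriter, r *http.Request) {
	var req interceptorRequest
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		http.Error(w, "malformed interceptor request", http.StatusBadRequest)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	_ = json.NewEncoder(w).Encode(interceptorResponse{Continue: shouldContinue(req.Body)})
}

// decide sends one payload through the handler in-process and returns the decision.
func decide(alertJSON string) (bool, error) {
	reqBody, err := json.Marshal(interceptorRequest{Body: alertJSON})
	if err != nil {
		return false, err
	}
	rec := httptest.NewRecorder()
	handle(rec, httptest.NewRequest(http.MethodPost, "/", bytes.NewReader(reqBody)))
	var resp interceptorResponse
	if err := json.NewDecoder(rec.Body).Decode(&resp); err != nil {
		return false, err
	}
	return resp.Continue, nil
}

func main() {
	for _, name := range []string{"ClusterHasGoneMissing", "SomeUnhandledAlert"} {
		ok, err := decide(fmt.Sprintf(`{"alert_name":%q}`, name))
		fmt.Println(name, ok, err)
	}
}
```

In this sketch a `continue: false` answer is what keeps a PipelineRun from starting, which corresponds to the README's "directly forwarded to a human SRE" branch.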

cadctl/README.md

Lines changed: 0 additions & 8 deletions
This file was deleted.

deploy/interceptor.yaml

Lines changed: 77 additions & 0 deletions
@@ -0,0 +1,77 @@
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: cad-interceptor-deployment
+spec:
+  replicas: 2
+  selector:
+    matchLabels:
+      app: cad-interceptor
+  template:
+    metadata:
+      labels:
+        app: cad-interceptor
+    spec:
+      containers:
+        - name: cad-interceptor
+          image: ${REGISTRY_IMG}:${IMAGE_TAG}
+          command: ["/bin/bash", "-c"]
+          args: ["interceptor"]
+          ports:
+            - containerPort: 8080
+          envFrom:
+            - secretRef:
+                name: cad-pd-token
+          resources:
+            limits:
+              cpu: "100m"
+              memory: "500Mi"
+            requests:
+              cpu: "10m"
+              memory: "100Mi"
+          readinessProbe:
+            httpGet:
+              path: /ready
+              port: 8080
+            initialDelaySeconds: 10
+            periodSeconds: 10
+            timeoutSeconds: 5
+            successThreshold: 1
+            failureThreshold: 3
+          securityContext:
+            allowPrivilegeEscalation: false
+            capabilities:
+              drop:
+                - ALL
+            runAsGroup: 65532
+            runAsNonRoot: true
+            runAsUser: 65532
+            seccompProfile:
+              type: RuntimeDefault
+      restartPolicy: Always
+      serviceAccountName: pipeline
+      terminationGracePeriodSeconds: 30
+---
+apiVersion: v1
+kind: Service
+metadata:
+  name: cad-interceptor-service
+spec:
+  selector:
+    app: cad-interceptor
+  ports:
+    - protocol: TCP
+      port: 8080
+      targetPort: 8080
+  type: ClusterIP
+---
+apiVersion: triggers.tekton.dev/v1beta1
+kind: Interceptor
+metadata:
+  name: cad-interceptor
+spec:
+  clientConfig:
+    service:
+      name: cad-interceptor-service
+      namespace: ${NAMESPACE_NAME}
+      port: 8080
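
The Deployment above probes `GET /ready` on port 8080, so the interceptor binary has to answer that path with a success status before the Service routes traffic to the pod. The Go sketch below shows one way such an endpoint could sit next to the interceptor route. It is an illustration under stated assumptions, not the code shipped in this commit: `newMux`, `probe` and `probeSucceeds` are hypothetical names, and the catch-all route is a placeholder. The rule that an HTTP probe passes on any status from 200 up to but not including 400 is standard Kubernetes behavior.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newMux wires a readiness endpoint next to a placeholder interceptor route.
// The route layout is an assumption; only /ready appears in the manifest.
func newMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/ready", func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	mux.HandleFunc("/", func(w http.ResponseWriter, _ *http.Request) {
		http.Error(w, "not implemented in this sketch", http.StatusNotImplemented)
	})
	return mux
}

// probe issues an in-process GET, similar to what an httpGet probe sends,
// and returns the resulting status code.
func probe(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

// probeSucceeds mirrors the Kubernetes rule for HTTP probes:
// status codes in the range [200, 400) count as success.
func probeSucceeds(code int) bool {
	return code >= 200 && code < 400
}

func main() {
	mux := newMux()
	for _, path := range []string{"/ready", "/anything-else"} {
		code := probe(mux, path)
		fmt.Println(path, code, probeSucceeds(code))
	}
	// In a real deployment the mux would be served on the container port,
	// for example with http.ListenAndServe(":8080", mux).
}
```

With `initialDelaySeconds: 10`, `periodSeconds: 10` and `failureThreshold: 3` from the manifest, a pod whose `/ready` keeps failing is marked unready after three consecutive failed probes.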

deploy/pipeline-trigger.yaml

Lines changed: 4 additions & 0 deletions
@@ -45,6 +45,10 @@ spec:
           params:
             - name: "filter"
               value: "header.canonical('X-Secret-Token').compareSecret('X_SECRET_TOKEN', 'cad-pd-token')"
+        # Enable after interceptor deployment is tested
+        # - ref:
+        #     name: "cad-interceptor"
+        #     kind: NamespacedInterceptor
       bindings:
         - ref: cad-check-trigger
       template:

interceptor/go.mod

Lines changed: 1 addition & 1 deletion
@@ -174,4 +174,4 @@ require (
 	sigs.k8s.io/yaml v1.4.0 // indirect
 )
 
-replace github.com/openshift/configuration-anomaly-detection => ../
+replace github.com/openshift/configuration-anomaly-detection => ../

interceptor/go.sum

Lines changed: 1 addition & 1 deletion
@@ -1078,4 +1078,4 @@ sigs.k8s.io/structured-merge-diff/v4 v4.4.1/go.mod h1:N8hJocpFajUSSeSJ9bOZ77Vzej
 sigs.k8s.io/yaml v1.2.0/go.mod h1:yfXDCHCao9+ENCvLSE62v9VSji2MKu5jeNfTrofGhJc=
 sigs.k8s.io/yaml v1.3.0/go.mod h1:GeOyir5tyXNByN85N/dRIT9es5UQNerPYEKK56eTBm8=
 sigs.k8s.io/yaml v1.4.0 h1:Mk1wCc2gy/F0THH0TAp1QYyJNzRm2KCLy3o5ASXVI5E=
-sigs.k8s.io/yaml v1.4.0/go.mod h1:Ejl7/uTz7PSA4eKMyQCUTnhZYNmLIl+5c2lQPGR2BPY=
+sigs.k8s.io/yaml v1.4.0/go.mod h1:Ejl7/uTz7PSA4eKMyQCUTnhZYNmLIl+5c2lQPGR2BPY=

openshift/template.yaml

Lines changed: 78 additions & 0 deletions
@@ -10,6 +10,84 @@ parameters:
 - name: NAMESPACE_NAME
   value: configuration-anomaly-detection
 objects:
+- apiVersion: apps/v1
+  kind: Deployment
+  metadata:
+    name: cad-interceptor-deployment
+  spec:
+    replicas: 2
+    selector:
+      matchLabels:
+        app: cad-interceptor
+    template:
+      metadata:
+        labels:
+          app: cad-interceptor
+      spec:
+        containers:
+        - args:
+          - interceptor
+          command:
+          - /bin/bash
+          - -c
+          envFrom:
+          - secretRef:
+              name: cad-pd-token
+          image: ${REGISTRY_IMG}:${IMAGE_TAG}
+          name: cad-interceptor
+          ports:
+          - containerPort: 8080
+          readinessProbe:
+            failureThreshold: 3
+            httpGet:
+              path: /ready
+              port: 8080
+            initialDelaySeconds: 10
+            periodSeconds: 10
+            successThreshold: 1
+            timeoutSeconds: 5
+          resources:
+            limits:
+              cpu: 100m
+              memory: 500Mi
+            requests:
+              cpu: 10m
+              memory: 100Mi
+          securityContext:
+            allowPrivilegeEscalation: false
+            capabilities:
+              drop:
+              - ALL
+            runAsGroup: 65532
+            runAsNonRoot: true
+            runAsUser: 65532
+            seccompProfile:
+              type: RuntimeDefault
+        restartPolicy: Always
+        serviceAccountName: pipeline
+        terminationGracePeriodSeconds: 30
+- apiVersion: v1
+  kind: Service
+  metadata:
+    name: cad-interceptor-service
+  spec:
+    ports:
+    - port: 8080
+      protocol: TCP
+      targetPort: 8080
+    selector:
+      app: cad-interceptor
+    type: ClusterIP
+- apiVersion: triggers.tekton.dev/v1beta1
+  kind: Interceptor
+  metadata:
+    name: cad-interceptor
+  spec:
+    clientConfig:
+      service:
+        name: cad-interceptor-service
+        namespace: ${NAMESPACE_NAME}
+        port: 8080
 - apiVersion: triggers.tekton.dev/v1beta1
   kind: TriggerBinding
   metadata:
