Commit 241ed3b

Merge pull request #364 from RaphaelBut/describe-integrations
Improve Readme regarding usage of integrations (awsclient, ocm...)
2 parents e4623ac + 12f666f commit 241ed3b

1 file changed

+20
-9
lines changed

README.md

Lines changed: 20 additions & 9 deletions
Original file line number | Diff line number | Diff line change
@@ -66,8 +66,26 @@ The required investigation is identified by CAD based on the incident and its pa
6666
As PagerDuty itself does not provide finer granularity for webhooks than service-based, CAD filters out the alerts it should investigate. For more information, please refer to https://support.pagerduty.com/docs/webhooks.
6767

6868
To add a new alert investigation:
69-
- create a mapping for the alert to the `GetInvestigation` function in `mapping.go` and write a corresponding CAD investigation (e.g. `Investigate()` in `chgm.go`).
70-
- if the alert is not yet routed to CAD, add a webhook to the service your alert fires on. For production, the service should also have an escalation policy that escalates to SRE on CAD automation timeout.
69+
- Create a mapping for the alert in `registry.go` and write a corresponding CAD investigation (e.g. `Investigate()` in `chgm.go`).
70+
- `investigation.Resources` contains initialized clients for the cluster's AWS environment, OCM, and more. See [Integrations](#integrations).
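
As an illustration of the mapping step, here is a minimal Go sketch of how an alert-to-investigation registry could look. It is not the actual content of `registry.go`: the `Investigation` interface, the `Matches` method, the `registry` slice, the `lookup` helper, and the alert titles are all hypothetical names chosen for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// Investigation is a hypothetical interface; the real CAD type may differ.
type Investigation interface {
	Name() string
	// Matches reports whether this investigation handles the given alert title.
	Matches(alertTitle string) bool
}

// chgm is a stand-in for the ClusterHasGoneMissing investigation.
type chgm struct{}

func (chgm) Name() string { return "ClusterHasGoneMissing" }

func (chgm) Matches(alertTitle string) bool {
	// The matched substring is an assumption made for this sketch.
	return strings.Contains(alertTitle, "has gone missing")
}

// registry lists every investigation known to this sketch (hypothetical).
var registry = []Investigation{chgm{}}

// lookup returns the first registered investigation that matches the alert,
// or nil when no investigation is registered for it.
func lookup(alertTitle string) Investigation {
	for _, inv := range registry {
		if inv.Matches(alertTitle) {
			return inv
		}
	}
	return nil
}

func main() {
	if inv := lookup("cluster has gone missing"); inv != nil {
		fmt.Println("investigating with", inv.Name())
	}
	if lookup("some unrelated alert") == nil {
		fmt.Println("no investigation registered; alert is filtered out")
	}
}
```

In this sketch, adding a new alert investigation means implementing the interface and appending the new type to `registry`; alerts without a match are skipped, which mirrors the filtering described above.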
71+
72+
### Integrations
73+
74+
> **Note:** When writing an investigation, you can use these integrations right away.
75+
The integrations are initialized for you and passed to the investigation via `investigation.Resources`.
76+
77+
78+
* [AWS](https://github.com/aws/aws-sdk-go) -- Logging into the cluster, retrieving instance info and AWS CloudTrail events.
79+
- See `pkg/aws`
80+
* [PagerDuty](https://github.com/PagerDuty/go-pagerduty) -- Retrieving alert info, escalating or silencing incidents, and adding notes.
81+
- See `pkg/pagerduty`
82+
* [OCM](https://github.com/openshift-online/ocm-sdk-go) -- Retrieving cluster info, sending service logs, and managing (post, delete) limited support reasons.
83+
- See `pkg/ocm`
84+
- If CAD lacks permission to query an OCM resource, add the missing permission to the Configuration-Anomaly-Detection role in uhc-account-manager.
85+
* [osd-network-verifier](https://github.com/openshift/osd-network-verifier) -- Tool to verify the pre-configured networking components for ROSA and OSD CCS clusters.
86+
* [k8sclient](https://pkg.go.dev/sigs.k8s.io/controller-runtime/pkg/client) -- Interacting with the cluster's kube-api.
87+
- Requires RBAC definitions for your investigation to be added to `metadata.yaml`
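
To make the idea of pre-initialized clients concrete, the following Go sketch shows an investigation that only consumes clients handed to it. It is an assumption-laden illustration, not CAD's real API: the `Resources` fields, the `OCMClient` and `PagerDutyClient` interfaces, their methods, and the fake implementations are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// OCMClient and PagerDutyClient are hypothetical, narrowed interfaces.
// The real clients in pkg/ocm and pkg/pagerduty may look different.
type OCMClient interface {
	PostLimitedSupportReason(clusterID, summary string) error
}

type PagerDutyClient interface {
	AddNote(note string) error
}

// Resources mimics the idea of investigation.Resources: everything is
// initialized by the caller before the investigation runs.
type Resources struct {
	ClusterID string
	OCM       OCMClient
	PagerDuty PagerDutyClient
}

// Investigate uses the provided clients without constructing any itself.
func Investigate(r *Resources) error {
	if r == nil || r.OCM == nil || r.PagerDuty == nil {
		return errors.New("resources not initialized")
	}
	if err := r.OCM.PostLimitedSupportReason(r.ClusterID, "example summary"); err != nil {
		return fmt.Errorf("posting limited support reason: %w", err)
	}
	return r.PagerDuty.AddNote("limited support reason posted for " + r.ClusterID)
}

// fakeOCM and fakePD record calls so the sketch runs without any network access.
type fakeOCM struct{ posted []string }

func (f *fakeOCM) PostLimitedSupportReason(clusterID, summary string) error {
	f.posted = append(f.posted, clusterID+": "+summary)
	return nil
}

type fakePD struct{ notes []string }

func (f *fakePD) AddNote(note string) error {
	f.notes = append(f.notes, note)
	return nil
}

func main() {
	ocm, pd := &fakeOCM{}, &fakePD{}
	err := Investigate(&Resources{ClusterID: "cluster-123", OCM: ocm, PagerDuty: pd})
	fmt.Println(err, ocm.posted, pd.notes)
}
```

Depending on small interfaces rather than concrete clients is a design choice of this sketch; it keeps the investigation testable with fakes, as `main` demonstrates.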
88+
7189

7290
## Testing locally
7391

@@ -98,13 +116,6 @@ Every alert managed by CAD corresponds to an investigation, representing the exe
98116

99117
Investigation-specific documentation can be found in the corresponding investigation folder, e.g. for [ClusterHasGoneMissing](./pkg/investigations/chgm/README.md).
100118

101-
### Integrations
102-
103-
* [AWS](https://github.com/aws/aws-sdk-go) -- Logging into the cluster, retrieving instance info and AWS CloudTrail events.
104-
* [PagerDuty](https://github.com/PagerDuty/go-pagerduty) -- Retrieving alert info, escalating or silencing incidents, and adding notes.
105-
* [OCM](https://github.com/openshift-online/ocm-sdk-go) -- Retrieving cluster info, sending service logs, and managing (post, delete) limited support reasons.
106-
* [osd-network-verifier](https://github.com/openshift/osd-network-verifier) -- Tool to verify the pre-configured networking components for ROSA and OSD CCS clusters.
107-
108119
### Templates
109120

110121
* [Update-Template](./hack/update-template/README.md) -- Updating configuration-anomaly-detection-template.Template.yaml.
