Commit c135e90

docs: custom metrics collection on EKS (#107)
* Add custom metrics collection docs
* fixup! Add custom metrics collection docs
* Add screenshot
1 parent a8fcfe4 commit c135e90

4 files changed, +63 -25 lines changed

docs/eks/destroy.md

Lines changed: 24 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,24 @@
+# Destroy resources
+
+If you leave this stack running, you will continue to incur charges. To remove all resources
+created by Terraform, [refresh your Grafana API key](https://aws-observability.github.io/terraform-aws-observability-accelerator/eks/#6-grafana-api-key) and run the command below.
+
+Be careful: this command removes everything created by Terraform. If you wish
+to keep your Amazon Managed Grafana or Amazon Managed Service for Prometheus workspaces, remove them
+from your Terraform state before running the destroy command.
+
+```bash
+terraform destroy
+```
+
+To remove resources from your Terraform state, run
+
+```bash
+# grafana workspace
+terraform state rm "module.eks_observability_accelerator.module.managed_grafana[0].aws_grafana_workspace.this[0]"
+
+# prometheus workspace
+terraform state rm "module.eks_observability_accelerator.aws_prometheus_workspace.this[0]"
+```
+
+> **Note:** To view all the features proposed by this module, visit the [module documentation](https://github.com/aws-observability/terraform-aws-observability-accelerator/tree/main/modules/workloads/infra).
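
Before removing anything from state, it can help to confirm the exact resource addresses, since they depend on how the module is instantiated. The following is a minimal sketch, not part of this commit: `workspace_addresses` is a hypothetical helper that filters the output of `terraform state list` down to Grafana and Prometheus workspace entries. It matches on the resource type name only, so data sources of the same type would also be listed.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of the module): reads state addresses on
# stdin and prints only those that refer to an Amazon Managed Grafana or
# Amazon Managed Service for Prometheus workspace.
workspace_addresses() {
  # `|| true` keeps the exit status at 0 when nothing matches.
  grep -E 'aws_(grafana|prometheus)_workspace\.' || true
}

# Intended usage, run from the directory that holds your Terraform state:
#   terraform state list | workspace_addresses
```

Each address printed this way can be passed to `terraform state rm` as shown in the documentation above.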

docs/eks.md renamed to docs/eks/index.md

Lines changed: 32 additions & 21 deletions
@@ -1,21 +1,25 @@
-# Amazon EKS cluster monitoring
+# Amazon EKS cluster metrics

 This example demonstrates how to monitor your Amazon Elastic Kubernetes Service
 (Amazon EKS) cluster with the Observability Accelerator's EKS
 [infrastructure module](https://github.com/aws-observability/terraform-aws-observability-accelerator/tree/main/modules/workloads/infra).

-Monitoring Amazon Elastic Kubernetes Service (Amazon EKS) has two categories:
+Monitoring Amazon Elastic Kubernetes Service (Amazon EKS) for metrics has two categories:
 the control plane and the Amazon EKS nodes (with Kubernetes objects).
 The Amazon EKS control plane consists of control plane nodes that run the Kubernetes software,
 such as etcd and the Kubernetes API server. To read more on the components of an Amazon EKS cluster,
 please read the [service documentation](https://docs.aws.amazon.com/eks/latest/userguide/clusters.html).

 The Amazon EKS infrastructure Terraform module focuses on metrics collection to Amazon
-Managed Service for Prometheus using the [AWS Distro for OpenTelemetry Operator](https://docs.aws.amazon.com/eks/latest/userguide/opentelemetry.html) for Amazon EKS.
-Additionally, it provides default dashboards to get a comprehensible visibility on the nodes,
+Managed Service for Prometheus using the [AWS Distro for OpenTelemetry Operator](https://docs.aws.amazon.com/eks/latest/userguide/opentelemetry.html) for Amazon EKS. It deploys the [node exporter](https://github.com/prometheus/node_exporter) and [kube-state-metrics](https://github.com/kubernetes/kube-state-metrics) in your cluster.
+
+It provides default dashboards that give you comprehensive visibility into the health of your nodes,
 namespaces, pods, and kubelet operations health. Finally, you get curated Prometheus recording rules
 and alerts to operate your cluster.

+You can also collect custom Prometheus metrics from your applications running
+on your EKS cluster.
+
 ## Prerequisites

 Make sure to complete the [prerequisites section](https://aws-observability.github.io/terraform-aws-observability-accelerator/concepts/#prerequisites)
@@ -132,28 +136,35 @@ Open the Amazon Managed Service for Prometheus console and view the details of y
 To set up your alert receiver with Amazon SNS, follow [this documentation](https://docs.aws.amazon.com/prometheus/latest/userguide/AMP-alertmanager-receiver.html)


-## Destroy resources
+## Custom metrics collection

-If you leave this stack running, you will continue to incur charges. To remove all resources
-created by Terraform, [refresh your Grafana API key](#6-grafana-api-key) and run the command below.
+In addition to the cluster metrics, if you are interested in collecting Prometheus
+metrics from your pods, you can set up `custom metrics collection`.
+This instructs the ADOT collector to scrape your applications' metrics based
+on the configuration you provide. You can also exclude some of the metrics to save costs.

-Be careful, this command will removing everything created by Terraform. If you wish
-to keep your Amazon Managed Grafana or Amazon Managed Service for Prometheus workspaces. Remove them
-from your terraform state before running the destroy command.
-
-```bash
-terraform destroy
-```
+Using the example, you can edit `examples/existing-cluster-with-base-and-infra/main.tf`.
+In the `module "workloads_infra" {` block, add the following config (make sure the values match your use case):

-To remove resources from your Terraform state, run
+```hcl
+enable_custom_metrics = true

-```bash
-# grafana workspace
-terraform state rm "module.eks_observability_accelerator.module.managed_grafana[0].aws_grafana_workspace.this[0]"
+custom_metrics_config = {
+  # list of application ports (example)
+  ports = [8000, 8080]

-# prometheus workspace
-terraform state rm "module.eks_observability_accelerator.aws_prometheus_workspace.this[0]"
+  # list of series prefixes you want to discard from ingestion
+  dropped_series_prefix = ["go_gcc"]
+}
 ```

+After applying Terraform, you can query Prometheus from Grafana for your application metrics,
+create alerts, and build your own dashboards. In the Explore section of Grafana, the
+following query returns the containers exposing metrics that matched the custom metrics
+collection, grouped by cluster and node.
+
+```promql
+sum(up{job="custom-metrics"}) by (container_name, cluster, nodename)
+```

-> **Note:** To view all the features proposed by this module, visit the [module documentation](https://github.com/aws-observability/terraform-aws-observability-accelerator/tree/main/modules/workloads/infra).
+<img width="2560" alt="Screenshot 2023-01-31 at 11 16 21" src="https://user-images.githubusercontent.com/10175027/215869004-e05f557d-c81a-41fb-a452-ede9f986cb27.png">
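
To judge what a `dropped_series_prefix` entry would discard, you can preview which metric names an application exposes under a given prefix. The sketch below is an illustration only, not part of this commit: `series_with_prefix` is a hypothetical helper, and it assumes the prefix is compared against the start of the metric name. How the module actually matches prefixes is not shown in this diff.

```bash
#!/usr/bin/env bash

# Hypothetical helper: reads Prometheus text-format metrics on stdin and
# prints the unique metric names that start with the prefix given as $1.
series_with_prefix() {
  local prefix="$1"
  awk -v p="$prefix" '
    /^#/ { next }                 # skip HELP and TYPE comment lines
    {
      name = $0
      sub(/[{ ].*/, "", name)     # drop labels and the sample value
      if (name != "" && index(name, p) == 1) print name
    }
  ' | sort -u
}

# Intended usage (the /metrics path and the go_gc prefix are assumptions):
#   curl -s http://localhost:8000/metrics | series_with_prefix go_gc
```

Running this against one of the ports listed in `custom_metrics_config` gives a rough idea of how many series a prefix covers before you apply the change.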

docs/index.md

Lines changed: 4 additions & 3 deletions
@@ -3,18 +3,19 @@
 Welcome to the AWS Observability Accelerator for Terraform!

 The AWS Observability accelerator is a set of Terraform modules to help you
-configure Observability for your workloads and environemnts with AWS
+configure Observability for your container workloads and environments with AWS
 Observability services. This project proposes a core module to bootstrap
-your cluster with the AWS Distro for OpenTelemetry (ADOT) Operator for EKS,
+your Amazon EKS cluster with the AWS Distro for OpenTelemetry (ADOT) Operator for EKS,
 Amazon Managed Service for Prometheus, Amazon Managed Grafana.
+
 Additionally we have a set of workload modules to leverage curated ADOT
 collector configurations, Grafana dashboards, Prometheus recording rules and alerts.

 <img width="1501" alt="image" src="https://user-images.githubusercontent.com/10175027/193913383-94aaf4e2-58c6-4779-935b-e40528e86c03.png">

 ## Getting started

-This project provides a set of Terraform modules to enable metrics collection,
+This project provides a set of Terraform modules to enable metrics and traces collection,
 dashboards and alerts for monitoring:

 - Amazon EKS clusters infrastructure

mkdocs.yml

Lines changed: 3 additions & 1 deletion
@@ -25,7 +25,9 @@ theme:
 nav:
   - Home: index.md
   - Concepts: concepts.md
-  - Amazon EKS Cluster Monitoring: eks.md
+  - Amazon EKS:
+    - Infrastructure monitoring: eks/index.md
+    - Teardown: eks/destroy.md
   - Workload Monitoring:
     - Java/JMX: workloads/java.md
     - Nginx: workloads/nginx.md
