Commit 620516c

RAMathews and Mathews authored
Add docs for adothealth monitoring (#224)
* added docs for ADOT health monitoring
* Update index.md - Added screen-shots of the dashboard
* Update index.md - Corrected some typos and formatting
* Update index.md - Made corrections based on comments
* Update index.md

---------

Co-authored-by: Mathews <[email protected]>
1 parent 86cc948 commit 620516c

2 files changed: +56 -0 lines changed

docs/adothealth/index.md

Lines changed: 54 additions & 0 deletions
@@ -0,0 +1,54 @@
1+
# Monitoring ADOT collector health
2+
3+
The OpenTelemetry collector emits metrics about itself that let you monitor the entire telemetry pipeline. The [EKS monitoring module](https://aws-observability.github.io/terraform-aws-observability-accelerator/eks/) enables those metrics by default with the AWS Distro for OpenTelemetry (ADOT) collector and provides a Grafana dashboard named `OpenTelemetry Health Collector`. The dashboard shows telemetry about the ADOT collector itself, which helps when you need to troubleshoot the collector or understand how many resources it is consuming.
4+
5+
!!!note
6+
The dashboard and metrics used are not specific to Amazon EKS, but applicable to any environment running an OpenTelemetry collector.
7+
8+
The diagram below shows an example data flow and the components of an ADOT collector:
9+
10+
![ADOTCollectorComponents](https://github.com/RAMathews/terraform-aws-observability-accelerator/assets/114662591/1db25d84-c1ca-4468-bb0d-42c8bafd1942)
11+
12+
The dashboard has five sections. Each section shows [metrics](https://aws-observability.github.io/observability-best-practices/guides/operational/adot-at-scale/operating-adot-collector/#collecting-health-metrics-from-the-collector) relevant to the various [components](https://opentelemetry.io/docs/demo/collector-data-flow-dashboard/#data-flow-overview) of the ADOT collector:
13+
14+
### Receivers
15+
Shows the rate and count of spans and metric points that each receiver accepted or refused as they were pushed into the telemetry pipeline.
16+
17+
### Processors
18+
Shows the rate and count of accepted and refused spans and metric points pushed into the next component in the pipeline. The batch metrics help you understand how often metrics are sent to the exporter and how large each batch is.
19+
20+
![receivers_processors](https://github.com/RAMathews/terraform-aws-observability-accelerator/assets/114662591/9a2edc27-9472-4a58-a244-d69f2bc7f41f)
21+
22+
### Exporters
23+
Shows the rate and count of accepted and refused spans and metric points that each exporter pushes to its destinations. It also shows the size and capacity of the retry queue. Use these metrics to determine whether the collector is having trouble sending trace or metric data to the configured destination.
24+
25+
![exporters](https://github.com/RAMathews/terraform-aws-observability-accelerator/assets/114662591/77e20ac5-64bb-42ca-9db6-4d13ca7b27de)
26+
27+
### Collectors
28+
Shows the collector’s operational metrics (memory, CPU, uptime). Use them to understand how many resources the collector is consuming.
29+
30+
![collectors](https://github.com/RAMathews/terraform-aws-observability-accelerator/assets/114662591/25151edd-6132-479a-9331-71aa69a91d5e)
31+
32+
### Data Flow
33+
Shows how metrics and spans flow through the collector’s components.
34+
35+
![dataflow](https://github.com/RAMathews/terraform-aws-observability-accelerator/assets/114662591/61fe684d-8ed3-4645-9210-f16158442b7d)
36+
37+
!!!note
38+
To read more about the metrics used and how to use the dashboard, see the [upstream documentation](https://opentelemetry.io/docs/demo/collector-data-flow-dashboard/).
39+
40+
## Deploy instructions
41+
42+
Because ADOT health monitoring is enabled by default in the EKS monitoring module, follow [this example’s instructions](https://aws-observability.github.io/terraform-aws-observability-accelerator/eks/#prerequisites). The ADOT collector health dashboard is available after deployment.
43+
44+
## Disable ADOT health monitoring
45+
46+
You can disable ADOT collector health metrics by setting the [variable](https://github.com/aws-observability/terraform-aws-observability-accelerator/blob/main/modules/eks-monitoring/variables.tf) `enable_adotcollector_metrics` to `false`.
47+
48+
```
variable "enable_adotcollector_metrics" {
  description = "Enables collection of ADOT collector metrics"
  type        = bool
  default     = true
}
```
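
The block above is the variable's declaration inside the module, so the default is normally overridden where the module is called. The following is a minimal sketch of that, not the project's documented example: the module label `eks_monitoring` and the `source` address are assumptions made for illustration, and the module's other required inputs are left out.

```hcl
# Hypothetical root-module call. The label "eks_monitoring" and the source
# address below are assumptions for illustration, not taken from this commit.
module "eks_monitoring" {
  source = "github.com/aws-observability/terraform-aws-observability-accelerator//modules/eks-monitoring"

  # ... other inputs required by the module are omitted in this sketch ...

  # Override the default (true) to stop collecting ADOT collector health metrics.
  enable_adotcollector_metrics = false
}
```

Setting the value at the call site keeps the module's own `variables.tf` untouched, so the default stays `true` for other deployments that use the same module.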

mkdocs.yml

Lines changed: 2 additions & 0 deletions
@@ -35,6 +35,8 @@ nav:
3535
- Viewing logs: eks/logs.md
3636
- Tracing: eks/tracing.md
3737
- Teardown: eks/destroy.md
38+
- AWS Distro for OpenTelemetry (ADOT):
39+
- Monitoring ADOT collector health: adothealth/index.md
3840
- Amazon CloudWatch Container Insights:
3941
- Amazon EKS: container-insights/eks.md
4042
- Monitoring Managed Service for Prometheus Workspaces: workloads/managed-prometheus.md
