Commit b84e9b8

vchintal and bonclay7 authored
Addition of EKS multi-cluster observability example (#155)
* Adding EKS multicluster observability example
* EKS multicluster example - precommit fix
* Corrected the path to an example
* Made the multicluster example simpler
* Pre-commit changes
* Saved the images to Github and linked them in docs
* Comments and language edits
* Fix trailing spaces
* Non-controversial naming and simpler variables
* Added region to the data gathering
* Formatting changes

---------

Co-authored-by: Rodrigue Koffi <[email protected]>
1 parent 5b52c7b commit b84e9b8

8 files changed: +434 -0 lines changed

docs/eks/multicluster.md

Lines changed: 96 additions & 0 deletions
@@ -0,0 +1,96 @@
# AWS EKS Multicluster Observability

This example shows how to use the [AWS Observability Accelerator](https://github.com/aws-observability/terraform-aws-observability-accelerator) with more than one EKS cluster and how to verify the metrics collected from all the clusters in the dashboards of a common `Amazon Managed Grafana` workspace.

## Prerequisites

#### 1. EKS clusters

Using the example [eks-cluster-with-vpc](../../examples/eks-cluster-with-vpc/), create two EKS clusters with the names:
1. `eks-cluster-1`
2. `eks-cluster-2`

#### 2. Amazon Managed Service for Prometheus (AMP) workspace

We recommend that you create a new AMP workspace. To do that, you can run the following command.

Ensure you have the following necessary IAM permissions:
* `aps:CreateWorkspace`

```sh
export TF_VAR_managed_prometheus_workspace_id=$(aws amp create-workspace --alias observability-accelerator --query='workspaceId' --output text)
```

#### 3. Amazon Managed Grafana (AMG) workspace

To run this example you need an AMG workspace. If you have
an existing workspace, create an environment variable as described below.
To create a new workspace, visit our supporting example for managed Grafana.

!!! note
    For the URL `https://g-xyz.grafana-workspace.eu-central-1.amazonaws.com`, the workspace ID would be `g-xyz`.

```sh
export TF_VAR_managed_grafana_workspace_id=g-xxx
```
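
As a small convenience, the workspace ID can also be derived from the workspace URL with plain shell parameter expansion. This is a minimal sketch and not part of the example itself: `workspace_id_from_url` is a hypothetical helper name, and the URL is the placeholder from the note above, so substitute your own workspace URL.

```sh
# Hypothetical helper (not part of the accelerator): prints the first DNS
# label of an AMG workspace URL, which is the workspace ID.
workspace_id_from_url() {
  _host="${1#*://}"      # drop the scheme, e.g. "https://"
  _host="${_host%%/*}"   # drop any path after the hostname
  printf '%s\n' "${_host%%.*}"   # keep the text before the first dot
}

# Placeholder URL from the note above; replace it with your workspace URL.
export TF_VAR_managed_grafana_workspace_id="$(workspace_id_from_url "https://g-xyz.grafana-workspace.eu-central-1.amazonaws.com")"
```

Deriving the ID from the URL avoids copy-paste mistakes, but exporting the ID directly as shown earlier works just as well.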

#### 4. Grafana API Key

AMG provides a control plane API for generating Grafana API keys.
As a security best practice, we will provide Terraform with a short-lived API key to
run the `apply` or `destroy` command.

Ensure you have the following necessary IAM permissions:
* `grafana:CreateWorkspaceApiKey`
* `grafana:DeleteWorkspaceApiKey`

```sh
export TF_VAR_grafana_api_key=`aws grafana create-workspace-api-key --key-name "observability-accelerator-$(date +%s)" --key-role ADMIN --seconds-to-live 1200 --workspace-id $TF_VAR_managed_grafana_workspace_id --query key --output text`
```
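
Before moving on to the setup, it can help to confirm that the three `TF_VAR_*` variables exported in the prerequisites are actually set, since a failed AWS CLI call would leave one of them empty. The following is a minimal sketch; `check_prereqs` is a hypothetical helper name and not part of the example.

```sh
# Hypothetical pre-flight helper: reports every prerequisite variable that is
# empty or unset and returns non-zero if any is missing.
check_prereqs() {
  missing=0
  for name in TF_VAR_managed_prometheus_workspace_id \
              TF_VAR_managed_grafana_workspace_id \
              TF_VAR_grafana_api_key; do
    eval "value=\${$name:-}"   # read the variable whose name is in $name
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Running `check_prereqs && terraform init` would then stop early, with a list of the missing variables, instead of failing partway through a Terraform run.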

## Setup

#### 1. Download sources and initialize Terraform

```sh
git clone https://github.com/aws-observability/terraform-aws-observability-accelerator.git
cd terraform-aws-observability-accelerator/examples/eks-multicluster
terraform init
```

#### 2. Deploy

Check in `variables.tf` that two EKS clusters are targeted for deployment, with the names/IDs:
1. `eks-cluster-1`
2. `eks-cluster-2`

The two deployments differ in one respect: when Terraform sets up observability for the EKS cluster behind the variable `eks_cluster_1_id`, it also sets up:
* Dashboard folder and files in `AMG`
* Prometheus and Java alerting and recording rules in `AMP`

!!! warning
    To override the defaults, create a `terraform.tfvars` and change the default values of the variables.
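
As an illustration of such an override, the sketch below writes a `terraform.tfvars` from the shell. The variable names are the ones referenced in this example's Terraform files; the region values are hypothetical placeholders, so replace them (and the cluster names, if yours differ) with your own.

```sh
# Write a terraform.tfvars next to the example's .tf files.
# The values below are placeholders, not the example's defaults.
cat > terraform.tfvars <<'EOF'
eks_cluster_1_id     = "eks-cluster-1"
eks_cluster_1_region = "us-west-2"
eks_cluster_2_id     = "eks-cluster-2"
eks_cluster_2_region = "us-west-2"
EOF
```

Terraform loads `terraform.tfvars` from the working directory automatically, so no extra flag is needed on `terraform apply`.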

Run the following command to deploy:

```sh
terraform apply --auto-approve
```

## Verifying Multicluster Observability

Once you have successfully run the above setup, you should be able to see dashboards similar to the images shown below in your `Amazon Managed Grafana` workspace.

Note how you are able to use the `cluster` dropdown to filter the dashboards to metrics collected from a specific EKS cluster.

<img width="2557" alt="eks-multicluster-1" src="https://user-images.githubusercontent.com/4762573/233949110-ce275d06-7ad8-494c-b527-d9c2a0fb6645.png">

<img width="2560" alt="eks-multicluster-2" src="https://user-images.githubusercontent.com/4762573/233949227-f401f81e-e0d6-4242-96ad-0bcd39ad4e2d.png">

## Cleanup

To clean up entirely, run the following command:

```sh
terraform destroy --auto-approve
```
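
Because the Grafana API key in the prerequisites is created with `--seconds-to-live 1200`, it may have expired by the time you clean up, in which case a fresh key would need to be exported before `terraform destroy`. The following is a minimal sketch of such a check, assuming you recorded the key's creation time yourself; `key_expired` and `KEY_TTL_SECONDS` are hypothetical names and not part of the example.

```sh
KEY_TTL_SECONDS=1200   # matches --seconds-to-live in the prerequisites

# Succeeds (returns 0) if a key created at epoch time $1 has expired
# by epoch time $2; $2 defaults to the current time.
key_expired() {
  created_at="$1"
  now="${2:-$(date +%s)}"
  [ $(( now - created_at )) -ge "$KEY_TTL_SECONDS" ]
}
```

For instance, storing `KEY_CREATED_AT=$(date +%s)` right after creating the key lets you run `key_expired "$KEY_CREATED_AT" && echo "re-create the API key first"` before destroying.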

examples/eks-multicluster/README.md

Lines changed: 96 additions & 0 deletions
@@ -0,0 +1,96 @@
# AWS EKS Multicluster Observability

This example shows how to use the [AWS Observability Accelerator](https://github.com/aws-observability/terraform-aws-observability-accelerator) with more than one EKS cluster and how to verify the metrics collected from all the clusters in the dashboards of a common `Amazon Managed Grafana` workspace.

## Prerequisites

#### 1. EKS clusters

Using the example [eks-cluster-with-vpc](../../examples/eks-cluster-with-vpc/), create two EKS clusters with the names:
1. `eks-cluster-1`
2. `eks-cluster-2`

#### 2. Amazon Managed Service for Prometheus (AMP) workspace

We recommend that you create a new AMP workspace. To do that, you can run the following command.

Ensure you have the following necessary IAM permissions:
* `aps:CreateWorkspace`

```sh
export TF_VAR_managed_prometheus_workspace_id=$(aws amp create-workspace --alias observability-accelerator --query='workspaceId' --output text)
```

#### 3. Amazon Managed Grafana (AMG) workspace

To run this example you need an AMG workspace. If you have
an existing workspace, create an environment variable as described below.
To create a new workspace, visit our supporting example for managed Grafana.

!!! note
    For the URL `https://g-xyz.grafana-workspace.eu-central-1.amazonaws.com`, the workspace ID would be `g-xyz`.

```sh
export TF_VAR_managed_grafana_workspace_id=g-xxx
```

#### 4. Grafana API Key

AMG provides a control plane API for generating Grafana API keys.
As a security best practice, we will provide Terraform with a short-lived API key to
run the `apply` or `destroy` command.

Ensure you have the following necessary IAM permissions:
* `grafana:CreateWorkspaceApiKey`
* `grafana:DeleteWorkspaceApiKey`

```sh
export TF_VAR_grafana_api_key=`aws grafana create-workspace-api-key --key-name "observability-accelerator-$(date +%s)" --key-role ADMIN --seconds-to-live 1200 --workspace-id $TF_VAR_managed_grafana_workspace_id --query key --output text`
```

## Setup

#### 1. Download sources and initialize Terraform

```sh
git clone https://github.com/aws-observability/terraform-aws-observability-accelerator.git
cd terraform-aws-observability-accelerator/examples/eks-multicluster
terraform init
```

#### 2. Deploy

Check in `variables.tf` that two EKS clusters are targeted for deployment, with the names/IDs:
1. `eks-cluster-1`
2. `eks-cluster-2`

The two deployments differ in one respect: when Terraform sets up observability for the EKS cluster behind the variable `eks_cluster_1_id`, it also sets up:
* Dashboard folder and files in `AMG`
* Prometheus and Java alerting and recording rules in `AMP`

!!! warning
    To override the defaults, create a `terraform.tfvars` and change the default values of the variables.

Run the following command to deploy:

```sh
terraform apply --auto-approve
```

## Verifying Multicluster Observability

Once you have successfully run the above setup, you should be able to see dashboards similar to the images shown below in your `Amazon Managed Grafana` workspace.

Note how you are able to use the `cluster` dropdown to filter the dashboards to metrics collected from a specific EKS cluster.

<img width="2557" alt="eks-multicluster-1" src="https://user-images.githubusercontent.com/4762573/233949110-ce275d06-7ad8-494c-b527-d9c2a0fb6645.png">

<img width="2560" alt="eks-multicluster-2" src="https://user-images.githubusercontent.com/4762573/233949227-f401f81e-e0d6-4242-96ad-0bcd39ad4e2d.png">

## Cleanup

To clean up entirely, run the following command:

```sh
terraform destroy --auto-approve
```

examples/eks-multicluster/data.tf

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
data "aws_eks_cluster_auth" "eks_cluster_1" {
  name     = var.eks_cluster_1_id
  provider = aws.eks_cluster_1
}

data "aws_eks_cluster_auth" "eks_cluster_2" {
  name     = var.eks_cluster_2_id
  provider = aws.eks_cluster_2
}

data "aws_eks_cluster" "eks_cluster_1" {
  name     = var.eks_cluster_1_id
  provider = aws.eks_cluster_1
}

data "aws_eks_cluster" "eks_cluster_2" {
  name     = var.eks_cluster_2_id
  provider = aws.eks_cluster_2
}

examples/eks-multicluster/main.tf

Lines changed: 102 additions & 0 deletions
@@ -0,0 +1,102 @@
module "aws_observability_accelerator" {
  source                              = "../../../terraform-aws-observability-accelerator"
  aws_region                          = var.eks_cluster_1_region
  enable_managed_prometheus           = false
  enable_alertmanager                 = true
  create_dashboard_folder             = true
  create_prometheus_data_source       = true
  grafana_api_key                     = var.grafana_api_key
  managed_prometheus_workspace_region = null
  managed_prometheus_workspace_id     = var.managed_prometheus_workspace_id
  managed_grafana_workspace_id        = var.managed_grafana_workspace_id

  providers = {
    aws = aws.eks_cluster_1
  }
}

module "eks_cluster_1_monitoring" {
  source                 = "../../../terraform-aws-observability-accelerator//modules/eks-monitoring"
  eks_cluster_id         = var.eks_cluster_1_id
  enable_amazon_eks_adot = true
  enable_cert_manager    = true
  enable_java            = true

  # This configuration section results in actions performed on AMG and AMP, and it needs to be done just once.
  # Hence, it is performed in conjunction with the setup of the eks_cluster_1 EKS cluster.
  enable_dashboards      = true
  enable_alerting_rules  = true
  enable_recording_rules = true

  grafana_api_key                       = var.grafana_api_key
  dashboards_folder_id                  = module.aws_observability_accelerator.grafana_dashboards_folder_id
  managed_prometheus_workspace_id       = module.aws_observability_accelerator.managed_prometheus_workspace_id
  managed_prometheus_workspace_endpoint = module.aws_observability_accelerator.managed_prometheus_workspace_endpoint
  managed_prometheus_workspace_region   = module.aws_observability_accelerator.managed_prometheus_workspace_region

  java_config = {
    enable_alerting_rules  = true
    enable_recording_rules = true
    scrape_sample_limit    = 1
  }

  prometheus_config = {
    global_scrape_interval = "60s"
    global_scrape_timeout  = "15s"
    scrape_sample_limit    = 2000
  }

  providers = {
    aws        = aws.eks_cluster_1
    kubernetes = kubernetes.eks_cluster_1
    helm       = helm.eks_cluster_1
    grafana    = grafana
  }

  depends_on = [
    module.aws_observability_accelerator
  ]
}

module "eks_cluster_2_monitoring" {
  source                 = "../../../terraform-aws-observability-accelerator//modules/eks-monitoring"
  eks_cluster_id         = var.eks_cluster_2_id
  enable_amazon_eks_adot = true
  enable_cert_manager    = true
  enable_java            = true

  # Since the following were enabled in conjunction with the setup of the eks_cluster_1 EKS cluster, we skip
  # them for the eks_cluster_2 EKS cluster.
  enable_dashboards      = false
  enable_alerting_rules  = false
  enable_recording_rules = false

  grafana_api_key                       = var.grafana_api_key
  dashboards_folder_id                  = module.aws_observability_accelerator.grafana_dashboards_folder_id
  managed_prometheus_workspace_id       = module.aws_observability_accelerator.managed_prometheus_workspace_id
  managed_prometheus_workspace_endpoint = module.aws_observability_accelerator.managed_prometheus_workspace_endpoint
  managed_prometheus_workspace_region   = module.aws_observability_accelerator.managed_prometheus_workspace_region

  java_config = {
    enable_alerting_rules  = false # addressed while setting up the eks_cluster_1 EKS cluster
    enable_recording_rules = false # addressed while setting up the eks_cluster_1 EKS cluster
    scrape_sample_limit    = 1
  }

  prometheus_config = {
    global_scrape_interval = "60s"
    global_scrape_timeout  = "15s"
    scrape_sample_limit    = 2000
  }

  providers = {
    aws        = aws.eks_cluster_2
    kubernetes = kubernetes.eks_cluster_2
    helm       = helm.eks_cluster_2
    grafana    = grafana
  }

  depends_on = [
    module.aws_observability_accelerator
  ]
}

examples/eks-multicluster/outputs.tf

Whitespace-only changes.
Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
provider "kubernetes" {
  host                   = data.aws_eks_cluster.eks_cluster_1.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.eks_cluster_1.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.eks_cluster_1.token
  alias                  = "eks_cluster_1"
}

provider "kubernetes" {
  host                   = data.aws_eks_cluster.eks_cluster_2.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.eks_cluster_2.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.eks_cluster_2.token
  alias                  = "eks_cluster_2"
}

provider "helm" {
  kubernetes {
    host                   = data.aws_eks_cluster.eks_cluster_1.endpoint
    cluster_ca_certificate = base64decode(data.aws_eks_cluster.eks_cluster_1.certificate_authority[0].data)
    token                  = data.aws_eks_cluster_auth.eks_cluster_1.token
  }
  alias = "eks_cluster_1"
}

provider "helm" {
  kubernetes {
    host                   = data.aws_eks_cluster.eks_cluster_2.endpoint
    cluster_ca_certificate = base64decode(data.aws_eks_cluster.eks_cluster_2.certificate_authority[0].data)
    token                  = data.aws_eks_cluster_auth.eks_cluster_2.token
  }
  alias = "eks_cluster_2"
}

provider "aws" {
  region = var.eks_cluster_1_region
  alias  = "eks_cluster_1"
}

provider "aws" {
  region = var.eks_cluster_2_region
  alias  = "eks_cluster_2"
}

provider "grafana" {
  url  = module.aws_observability_accelerator.managed_grafana_workspace_endpoint
  auth = var.grafana_api_key
}
