Commit 4019db6

Support Fluentbit to CloudWatch logs (#140)
* Import and customize fluenbit add-on
* Enable fluent bit logs
* Bump helm addon version
* Dropping account id as it seems to create scraping errors
* Create separate log groups per namespace
* Apply pre-commit
* Remove conflicting global label
* Add config object for logs
* Enable logs in examples
* Add logs docs
* Fix broken link
* Add screenshots
* Update docs
* Typos
1 parent c358008 commit 4019db6

File tree

20 files changed

+386
-36
lines changed

README.md

Lines changed: 40 additions & 21 deletions
Original file line numberDiff line numberDiff line change
@@ -4,13 +4,14 @@
44

55
Welcome to the AWS Observability Accelerator for Terraform!
66

7-
The AWS Observability Accelerator for Terraform is a set of opinionated modules to
8-
help you set up observability for your AWS environments with
7+
The AWS Observability Accelerator for Terraform is a set of opinionated modules
8+
to help you set up observability for your AWS environments with
99
AWS-managed observability services such as Amazon Managed Service for Prometheus,
10-
Amazon Managed Grafana and AWS Distro for OpenTelemetry (ADOT).
10+
Amazon Managed Grafana, AWS Distro for OpenTelemetry (ADOT) and Amazon CloudWatch.
1111

12-
We provide curated metrics, traces collection, alerting rules and Grafana dashboards
13-
for your EKS infrastructure, Java/JMX, NGINX based workloads and custom applications.
12+
We provide curated metrics, logs, traces collection, alerting rules and Grafana
13+
dashboards for your EKS infrastructure, Java/JMX, NGINX based workloads and
14+
your custom applications.
1415

1516
You also can monitor your Amazon Managed Service for Prometheus workspaces ingestion,
1617
costs, active series with [this module](./modules/managed-prometheus-monitoring).
@@ -42,18 +43,20 @@ v2+ releases introduces couple of breaking changes compared to previous versions
4243

4344
### Base Module
4445

45-
The base module allows you to configure the AWS Observability services for your cluster and
46-
the AWS Distro for OpenTelemetry (ADOT) Operator as the signals collection mechanism.
46+
The base module allows you to configure the AWS Observability services for your
47+
cluster and the AWS Distro for OpenTelemetry (ADOT) Operator as the signals
48+
collection mechanism.
4749

48-
This is the minimum configuration to have a new Amazon Managed Service for Prometheus Workspace
49-
and ADOT Operator deployed for you and ready to receive your data.
50-
The base module serve as an anchor to the workload modules and cannot run on its own.
50+
This is the minimum configuration to have a new Amazon Managed Service for
51+
Prometheus Workspace and ADOT Operator deployed for you and ready to receive
52+
your data. The base module serves as an anchor to the workload modules and
53+
cannot run on its own.
5154

5255
```hcl
5356
module "aws_observability_accelerator" {
5457
# use release tags and check for the latest versions
5558
# https://github.com/aws-observability/terraform-aws-observability-accelerator/releases
56-
source = "github.com/aws-observability/terraform-aws-observability-accelerator?ref=v1.6.1"
59+
source = "github.com/aws-observability/terraform-aws-observability-accelerator?ref=v2.1.0"
5760
5861
aws_region = "eu-west-1"
5962
eks_cluster_id = "my-eks-cluster"
@@ -70,7 +73,7 @@ You can optionally reuse an existing Amazon Managed Servce for Prometheus Worksp
7073
module "aws_observability_accelerator" {
7174
# use release tags and check for the latest versions
7275
# https://github.com/aws-observability/terraform-aws-observability-accelerator/releases
73-
source = "github.com/aws-observability/terraform-aws-observability-accelerator?ref=v1.6.1"
76+
source = "github.com/aws-observability/terraform-aws-observability-accelerator?ref=v2.1.0"
7477
7578
aws_region = "eu-west-1"
7679
eks_cluster_id = "my-eks-cluster"
@@ -91,13 +94,13 @@ View all the configuration options in the module documentation below.
9194
### Workload modules
9295

9396
[Workloads modules](./modules) are provided, which essentially provide curated
94-
metrics collection, alerting rules and Grafana dashboards.
97+
metrics, logs, traces collection, alerting rules and Grafana dashboards.
9598

96-
#### Infrastructure monitoring
99+
#### Amazon EKS monitoring
97100

98101
```hcl
99-
module "workloads_infra" {
100-
source = "aws-observability/terraform-aws-observability-accelerator/workloads/infra"
102+
module "eks_monitoring" {
103+
source = "github.com/aws-observability/terraform-aws-observability-accelerator//modules/eks-monitoring?ref=v2.1.0"
101104
102105
eks_cluster_id = module.eks_observability_accelerator.eks_cluster_id
103106
@@ -106,6 +109,9 @@ module "workloads_infra" {
106109
107110
managed_prometheus_workspace_endpoint = module.eks_observability_accelerator.managed_prometheus_workspace_endpoint
108111
managed_prometheus_workspace_region = module.eks_observability_accelerator.managed_prometheus_workspace_region
112+
113+
enable_logs = true
114+
enable_tracing = true
109115
}
110116
```
111117

@@ -118,17 +124,30 @@ Check the the [complete example](./examples/existing-cluster-with-base-and-infra
118124

119125
## Motivation
120126

121-
Kubernetes is a powerful and extensible container orchestration technology that allows you to deploy and manage containerized applications at scale. The extensible nature of Kubernetes also allows you to use a wide range of popular open-source tools, commonly referred to as add-ons, in Kubernetes clusters. With such a large number of tools and design choices available, building a tailored EKS cluster that meets your application’s specific needs can take a significant amount of time. It involves integrating a wide range of open-source tools and AWS services and requires deep expertise in AWS and Kubernetes.
127+
To gain deep visibility into your workloads and environments, AWS proposes a
128+
set of secure, scalable, highly available, production-grade managed open
129+
source services such as Amazon Managed Service for Prometheus, Amazon Managed
130+
Grafana and Amazon OpenSearch.
131+
132+
AWS customers have asked for best-practices and guidance to collect metrics, logs
133+
and traces from their containerized applications and microservices with ease of
134+
deployment. Customers can use the AWS Observability Accelerator to configure their
135+
metrics and traces collection, leveraging [AWS Distro for OpenTelemetry](https://aws-otel.github.io/),
136+
to have opinionated dashboards and alerts available in only minutes.
122137

123-
AWS customers have asked for examples that demonstrate how to integrate the landscape of Kubernetes tools and make it easy for them to provision complete, opinionated EKS clusters that meet specific application requirements. Customers can use AWS Observability Accelerator to configure and deploy purpose built EKS clusters, and start onboarding workloads in days, rather than months.
124138

125139
## Support & Feedback
126140

127-
AWS Observability Accelerator for Terraform is maintained by AWS Solution Architects. It is not part of an AWS service and support is provided best-effort by the AWS Observability Accelerator community.
141+
AWS Observability Accelerator for Terraform is maintained by AWS Solution
142+
Architects. It is not part of an AWS service and support is provided best-effort
143+
by the AWS Observability Accelerator community.
128144

129-
To post feedback, submit feature ideas, or report bugs, please use the [Issues](https://github.com/aws-observability/terraform-aws-observability-accelerator/issues) section of this GitHub repo.
145+
To post feedback, submit feature ideas, or report bugs, please use the
146+
[Issues](https://github.com/aws-observability/terraform-aws-observability-accelerator/issues)
147+
section of this GitHub repo.
130148

131-
If you are interested in contributing, see the [Contribution guide](https://github.com/aws-observability/terraform-aws-observability-accelerator/blob/main/CONTRIBUTING.md).
149+
If you are interested in contributing, see the
150+
[Contribution guide](https://github.com/aws-observability/terraform-aws-observability-accelerator/blob/main/CONTRIBUTING.md).
132151

133152
---
134153

docs/concepts.md

Lines changed: 3 additions & 1 deletion
@@ -123,4 +123,6 @@ classDiagram
123123

124124
## Getting started with AWS Observability services
125125

126-
If you are new to AWS Observability services, or want to dive deeper into them, check our [One Observability Workshop](https://catalog.workshops.aws/observability/) for a hands-on experience in a self-paced environement or at an AWS venue.
126+
If you are new to AWS Observability services, or want to dive deeper into them,
127+
check our [One Observability Workshop](https://catalog.workshops.aws/observability/)
128+
for a hands-on experience in a self-paced environment or at an AWS venue.

docs/eks/logs.md

Lines changed: 67 additions & 0 deletions
@@ -0,0 +1,67 @@
1+
# Viewing Logs
2+
3+
By default, we deploy a FluentBit daemon set in the cluster to collect worker
4+
logs for all namespaces. Logs collection can be disabled with
5+
`enable_logs = false`. Logs are collected and exported to Amazon CloudWatch Logs,
6+
which enables you to centralize the logs from all of your systems, applications,
7+
and AWS services that you use, in a single, highly scalable service.
8+
9+
Further configuration options are available in the [module documentation](https://github.com/aws-observability/terraform-aws-observability-accelerator/tree/main/modules/eks-monitoring#inputs).
10+
This guide shows how you can leverage CloudWatch Logs in Amazon Managed Grafana
11+
for your cluster and application logs.
12+
13+
## Using CloudWatch Logs as data source in Grafana
14+
15+
Follow [the documentation](https://docs.aws.amazon.com/grafana/latest/userguide/using-amazon-cloudwatch-in-AMG.html)
16+
to enable Amazon CloudWatch as a data source. Make sure to provide permissions.
17+
18+
!!! tip
19+
If you created your workspace with our [provided example](https://aws-observability.github.io/terraform-aws-observability-accelerator/helpers/managed-grafana/),
20+
Amazon CloudWatch data source has already been setup for you.
21+
22+
All logs are delivered to CloudWatch log groups that follow this naming pattern:
23+
`/aws/eks/observability-accelerator/{cluster-name}/{namespace}`. Log streams
24+
follow `{container-name}.{pod-name}`. In Grafana, querying and analyzing logs
25+
is done with [CloudWatch Logs Insights](https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/AnalyzingLogData.html).
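
The naming pattern above can also be used from Terraform, for instance to look up the log group of one namespace. This is a minimal sketch, not part of the commit: the cluster name, the namespace, and the `namespace_logs` and `namespace_log_group_arn` names are assumptions chosen for illustration, and the log group must already exist.

```hcl
# Hypothetical values for illustration; replace with your own cluster and namespace.
locals {
  cluster_name = "my-eks-cluster"
  namespace    = "kube-system"

  # Naming pattern from the docs:
  # /aws/eks/observability-accelerator/{cluster-name}/{namespace}
  log_group_name = "/aws/eks/observability-accelerator/${local.cluster_name}/${local.namespace}"
}

# Reads an existing log group; Terraform reports an error if it is not found.
data "aws_cloudwatch_log_group" "namespace_logs" {
  name = local.log_group_name
}

# Exposes the ARN, e.g. to scope an IAM policy to this namespace's logs.
output "namespace_log_group_arn" {
  value = data.aws_cloudwatch_log_group.namespace_logs.arn
}
```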
26+
27+
### Example - ADOT collector logs
28+
29+
Select one or more log groups and run the following query. The example below
30+
queries AWS Distro for OpenTelemetry (ADOT) logs.
31+
32+
```console
33+
fields @timestamp, log
34+
| order @timestamp desc
35+
| limit 100
36+
```
37+
38+
<img width="1987" alt="Screenshot 2023-03-27 at 19 08 35" src="https://user-images.githubusercontent.com/10175027/228037030-95005f47-ff46-4f7a-af74-d31809c52fcd.png">
39+
40+
41+
### Example - Using time series visualizations
42+
43+
[CloudWatch Logs syntax](https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CWL_QuerySyntax.html)
44+
provides powerful functions to extract data from your logs. The `stats()`
45+
function allows you to calculate aggregate statistics with log field values.
46+
This is useful for visualizing non-metric data from your applications.
47+
48+
In the example below, we use the following query to graph the number of metrics
49+
collected by the ADOT collector.
50+
51+
```console
52+
fields @timestamp, log
53+
| parse log /"#metrics": (?<metrics_count>\d+)}/
54+
| stats avg(metrics_count) by bin(5m)
55+
| limit 100
56+
```
57+
58+
!!! tip
59+
You can add logs in your dashboards with logs panel types or time series
60+
depending on your query results type.
61+
62+
<img width="2056" alt="image" src="https://user-images.githubusercontent.com/10175027/228037186-12691590-0bfe-465b-a83b-5c4f583ebf96.png">
63+
64+
!!! warning
65+
Querying CloudWatch logs will incur costs per GB scanned. Use small time
66+
windows and limits in your queries. Check out the CloudWatch
67+
[pricing page](https://aws.amazon.com/cloudwatch/pricing/) for more information.
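
If you run the time series query above often, one option is to save it as a CloudWatch Logs Insights query definition with Terraform. The sketch below is an assumption layered on top of this guide, not something the commit adds: the resource label, the query name, the cluster name and the namespace are hypothetical placeholders, and the namespace in particular must match wherever your ADOT collector actually runs.

```hcl
locals {
  # Hypothetical placeholders; adjust to your cluster and collector namespace.
  cluster_name   = "my-eks-cluster"
  adot_namespace = "my-adot-namespace"
}

resource "aws_cloudwatch_query_definition" "adot_metrics_count" {
  name = "observability-accelerator/adot-metrics-count" # hypothetical name

  # Log group naming pattern as documented in this guide.
  log_group_names = [
    "/aws/eks/observability-accelerator/${local.cluster_name}/${local.adot_namespace}",
  ]

  # Same query as in the time series example. Backslashes are kept
  # literally inside a heredoc, so the regex needs no extra escaping.
  query_string = <<-EOT
    fields @timestamp, log
    | parse log /"#metrics": (?<metrics_count>\d+)}/
    | stats avg(metrics_count) by bin(5m)
    | limit 100
  EOT
}
```

The saved query then shows up in the CloudWatch console under Logs Insights, which keeps the query text in version control next to the rest of your configuration.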

docs/index.md

Lines changed: 13 additions & 9 deletions
@@ -5,32 +5,36 @@ Welcome to the AWS Observability Accelerator for Terraform!
55
The AWS Observability Accelerator for Terraform is a set of opinionated modules to
66
help you set up observability for your AWS environments with
77
AWS-managed observability services such as Amazon Managed Service for Prometheus,
8-
Amazon Managed Grafana and AWS Distro for OpenTelemetry (ADOT).
8+
Amazon Managed Grafana, AWS Distro for OpenTelemetry (ADOT) and Amazon CloudWatch.
99

10-
We provide curated metrics, traces collection, alerting rules and Grafana dashboards
11-
for your EKS infrastructure, Java/JMX, NGINX based workloads and custom applications.
10+
We provide curated metrics, logs, traces collection, alerting rules and Grafana
11+
dashboards for your EKS infrastructure, Java/JMX, NGINX based workloads and
12+
your custom applications.
13+
14+
You also can monitor your Amazon Managed Service for Prometheus workspaces ingestion,
15+
costs, active series with [this module](https://aws-observability.github.io/terraform-aws-observability-accelerator/workloads/managed-prometheus/).
1216

1317
<img width="1501" alt="image" src="images/dark-o11y-accelerator-amp-xray.png">
1418

1519
## Getting started
1620

17-
This project provides a set of Terraform modules to enable metrics and traces collection,
18-
dashboards and alerts for monitoring:
21+
This project provides a set of Terraform modules to enable metrics, logs and
22+
traces collection, dashboards and alerts for monitoring:
1923

20-
- Amazon EKS clusters infrastructure
24+
- Amazon EKS clusters infrastructure and applications
2125
- NGINX workloads (running on Amazon EKS)
2226
- Java/JMX workloads (running on Amazon EKS)
2327
- Amazon Managed Service for Prometheus workspaces with Amazon CloudWatch
2428

25-
These modules can be directly configured in your existing Terraform configurations or ready
26-
to be deployed in our packaged
29+
These modules can be directly configured in your existing Terraform
30+
configurations or ready to be deployed in our packaged
2731
[examples](https://github.com/aws-observability/terraform-aws-observability-accelerator/tree/main/examples)
2832

2933
!!! tip
3034
We have supporting examples for quick setup such as:
3135

3236
- Creating a new Amazon EKS cluster and a VPC
33-
- Creating and configure an Amazon Managed Grafana workspace with SSO (coming soon)
37+
- Creating and configure an Amazon Managed Grafana workspace with SSO
3438

3539
## Motivation
3640

examples/existing-cluster-java/main.tf

Lines changed: 2 additions & 0 deletions
@@ -84,6 +84,8 @@ module "eks_monitoring" {
8484
scrape_sample_limit = 2000
8585
}
8686

87+
enable_logs = true
88+
8789
tags = local.tags
8890

8991
depends_on = [

examples/existing-cluster-nginx/main.tf

Lines changed: 2 additions & 0 deletions
@@ -73,6 +73,8 @@ module "eks_monitoring" {
7373
managed_prometheus_workspace_endpoint = module.aws_observability_accelerator.managed_prometheus_workspace_endpoint
7474
managed_prometheus_workspace_region = module.aws_observability_accelerator.managed_prometheus_workspace_region
7575

76+
enable_logs = true
77+
7678
tags = local.tags
7779

7880
depends_on = [

examples/existing-cluster-with-base-and-infra/main.tf

Lines changed: 2 additions & 0 deletions
@@ -89,6 +89,8 @@ module "eks_monitoring" {
8989
global_scrape_timeout = "15s"
9090
}
9191

92+
enable_logs = true
93+
9294
tags = local.tags
9395

9496
depends_on = [

mkdocs.yml

Lines changed: 1 addition & 0 deletions
@@ -29,6 +29,7 @@ nav:
2929
- Infrastructure monitoring: eks/index.md
3030
- Java/JMX: eks/java.md
3131
- Nginx: eks/nginx.md
32+
- Viewing logs: eks/logs.md
3233
- Teardown: eks/destroy.md
3334
- Monitoring Managed Service for Prometheus Workspaces: workloads/managed-prometheus.md
3435
- Supporting Examples:

modules/eks-monitoring/README.md

Lines changed: 7 additions & 3 deletions
@@ -2,11 +2,12 @@
22

33
This module provides EKS cluster monitoring with the following resources:
44

5-
- AWS Distro For OpenTelemetry Operator and Collector
5+
- AWS Distro For OpenTelemetry Operator and Collector for Metrics and Traces
6+
- Logs with [AWS for FluentBit](https://github.com/aws/aws-for-fluent-bit)
67
- AWS Managed Grafana Dashboard and data source
78
- Alerts and recording rules with AWS Managed Service for Prometheus
89

9-
This module is inspired from the open source [kube-prometheus-stack](https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack)
10+
This module makes use of the open source [kube-prometheus-stack](https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack)
1011

1112
<!-- BEGINNING OF PRE-COMMIT-TERRAFORM DOCS HOOK -->
1213
## Requirements
@@ -32,7 +33,8 @@ This module is inspired from the open source [kube-prometheus-stack](https://git
3233

3334
| Name | Source | Version |
3435
|------|--------|---------|
35-
| <a name="module_helm_addon"></a> [helm\_addon](#module\_helm\_addon) | github.com/aws-ia/terraform-aws-eks-blueprints//modules/kubernetes-addons/helm-addon | v4.13.1 |
36+
| <a name="module_fluentbit_logs"></a> [fluentbit\_logs](#module\_fluentbit\_logs) | ./add-ons/aws-for-fluentbit | n/a |
37+
| <a name="module_helm_addon"></a> [helm\_addon](#module\_helm\_addon) | github.com/aws-ia/terraform-aws-eks-blueprints//modules/kubernetes-addons/helm-addon | v4.26.0 |
3638
| <a name="module_java_monitoring"></a> [java\_monitoring](#module\_java\_monitoring) | ./patterns/java | n/a |
3739
| <a name="module_nginx_monitoring"></a> [nginx\_monitoring](#module\_nginx\_monitoring) | ./patterns/nginx | n/a |
3840
| <a name="module_operator"></a> [operator](#module\_operator) | ./add-ons/adot-operator | n/a |
@@ -70,6 +72,7 @@ This module is inspired from the open source [kube-prometheus-stack](https://git
7072
| <a name="input_enable_dashboards"></a> [enable\_dashboards](#input\_enable\_dashboards) | Enables or disables curated dashboards | `bool` | `true` | no |
7173
| <a name="input_enable_java"></a> [enable\_java](#input\_enable\_java) | Enable Java workloads monitoring, alerting and default dashboards | `bool` | `false` | no |
7274
| <a name="input_enable_kube_state_metrics"></a> [enable\_kube\_state\_metrics](#input\_enable\_kube\_state\_metrics) | Enables or disables Kube State metrics exporter. Disabling this might affect some data in the dashboards | `bool` | `true` | no |
75+
| <a name="input_enable_logs"></a> [enable\_logs](#input\_enable\_logs) | Using AWS For FluentBit to collect cluster and application logs to Amazon CloudWatch | `bool` | `true` | no |
7376
| <a name="input_enable_nginx"></a> [enable\_nginx](#input\_enable\_nginx) | Enable NGINX workloads monitoring, alerting and default dashboards | `bool` | `false` | no |
7477
| <a name="input_enable_node_exporter"></a> [enable\_node\_exporter](#input\_enable\_node\_exporter) | Enables or disables Node exporter. Disabling this might affect some data in the dashboards | `bool` | `true` | no |
7578
| <a name="input_enable_tracing"></a> [enable\_tracing](#input\_enable\_tracing) | (Experimental) Enables tracing with AWS X-Ray. This changes the deploy mode of the collector to daemon set. Requirement: adot add-on <= 0.58-build.0 | `bool` | `false` | no |
@@ -78,6 +81,7 @@ This module is inspired from the open source [kube-prometheus-stack](https://git
7881
| <a name="input_irsa_iam_role_path"></a> [irsa\_iam\_role\_path](#input\_irsa\_iam\_role\_path) | IAM role path for IRSA roles | `string` | `"/"` | no |
7982
| <a name="input_java_config"></a> [java\_config](#input\_java\_config) | Configuration object for Java/JMX monitoring | <pre>object({<br> enable_alerting_rules = bool<br> scrape_sample_limit = number<br> })</pre> | <pre>{<br> "enable_alerting_rules": true,<br> "scrape_sample_limit": 1000<br>}</pre> | no |
8083
| <a name="input_ksm_config"></a> [ksm\_config](#input\_ksm\_config) | Kube State metrics configuration | <pre>object({<br> create_namespace = bool<br> k8s_namespace = string<br> helm_chart_name = string<br> helm_chart_version = string<br> helm_release_name = string<br> helm_repo_url = string<br> helm_settings = map(string)<br> helm_values = map(any)<br><br> scrape_interval = string<br> scrape_timeout = string<br> })</pre> | <pre>{<br> "create_namespace": true,<br> "helm_chart_name": "kube-state-metrics",<br> "helm_chart_version": "4.24.0",<br> "helm_release_name": "kube-state-metrics",<br> "helm_repo_url": "https://prometheus-community.github.io/helm-charts",<br> "helm_settings": {},<br> "helm_values": {},<br> "k8s_namespace": "kube-system",<br> "scrape_interval": "60s",<br> "scrape_timeout": "15s"<br>}</pre> | no |
84+
| <a name="input_logs_config"></a> [logs\_config](#input\_logs\_config) | Configuration object for logs collection | <pre>object({<br> cw_log_retention_days = number<br> })</pre> | <pre>{<br> "cw_log_retention_days": 90<br>}</pre> | no |
8185
| <a name="input_managed_prometheus_workspace_endpoint"></a> [managed\_prometheus\_workspace\_endpoint](#input\_managed\_prometheus\_workspace\_endpoint) | Amazon Managed Prometheus Workspace Endpoint | `string` | `""` | no |
8286
| <a name="input_managed_prometheus_workspace_id"></a> [managed\_prometheus\_workspace\_id](#input\_managed\_prometheus\_workspace\_id) | Amazon Managed Prometheus Workspace ID | `string` | `null` | no |
8387
| <a name="input_managed_prometheus_workspace_region"></a> [managed\_prometheus\_workspace\_region](#input\_managed\_prometheus\_workspace\_region) | Amazon Managed Prometheus Workspace's Region | `string` | `null` | no |
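
As a rough illustration of the `enable_logs` and `logs_config` inputs listed in the table above, the fragment below overrides the default 90-day retention. It is a sketch under assumptions: the 30-day value and the cluster name are arbitrary examples, and the other inputs the module needs (workspace endpoint, region, and so on) are omitted for brevity.

```hcl
module "eks_monitoring" {
  source = "github.com/aws-observability/terraform-aws-observability-accelerator//modules/eks-monitoring?ref=v2.1.0"

  eks_cluster_id = "my-eks-cluster" # example value

  # Logs collection is on by default; set to false to skip the FluentBit add-on.
  enable_logs = true

  # Overrides the documented default of 90 days (30 is an example value).
  logs_config = {
    cw_log_retention_days = 30
  }

  # ... remaining module inputs omitted in this sketch
}
```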
