[processor/awsentity] Add documentation for entity processor #1741
Open: zhihonl wants to merge 1 commit into main from entity-processor-doc
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,239 @@ | ||
| # AWS Entity Processor | ||
|
|
||
| <!-- status autogenerated section --> | ||
| | Status | | | ||
| | ------------- |-----------| | ||
| | Stability | [beta]: metrics | | ||
| | Distributions | [amazon-cloudwatch-agent] | | ||
| | Issues | [Open issues](https://github.com/aws/amazon-cloudwatch-agent/issues?q=is%3Aopen+is%3Aissue+label%3Aprocessor%2Fawsentity) [Closed issues](https://github.com/aws/amazon-cloudwatch-agent/issues?q=is%3Aclosed+is%3Aissue+label%3Aprocessor%2Fawsentity) | | ||
| | [Code Owners](https://github.com/aws/amazon-cloudwatch-agent/blob/main/CONTRIBUTING.md#becoming-a-code-owner) | [@aws/amazon-cloudwatch-agent-team](https://www.github.com/orgs/aws/teams/amazon-cloudwatch-agent-team) | | ||
|
|
||
| [beta]: https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/component-stability.md#beta | ||
| [amazon-cloudwatch-agent]: https://github.com/aws/amazon-cloudwatch-agent | ||
| <!-- end autogenerated section --> | ||
|
|
||
| The AWS Entity processor enriches telemetry data with AWS-specific entity information for improved observability and correlation in Amazon CloudWatch. It processes metrics to add entity attributes that help identify and categorize AWS resources and services. Please refer to [config.go](./config.go) for the config spec. | ||
|
|
||
| This processor supports two main entity types: | ||
| - **Service**: Enriches telemetry with service-level entity information | ||
| - **Resource**: Enriches telemetry with AWS resource-level entity information | ||
|
|
||
| ## Configuration | ||
|
|
||
| The processor supports the following configuration options: | ||
|
|
||
| ```yaml | ||
| processors: | ||
|   awsentity: | ||
|     # Determines if the processor should scrape OTEL datapoint attributes for entity information. | ||
|     # Mainly used for components that emit all attributes at the datapoint level instead of the resource level. | ||
|     scrape_datapoint_attribute: false | ||
|
|
||
|     # Explicitly set the cluster name for scenarios where auto-detection is not possible. | ||
|     cluster_name: "my-cluster" | ||
|
|
||
|     # Kubernetes mode (eks, k8s_ec2, k8s_onprem) | ||
|     kubernetes_mode: "eks" | ||
|
|
||
|     # Platform mode (ec2, ecs, eks, onprem) | ||
|     platform: "ec2" | ||
|
|
||
|     # Entity type determines the type of entity processing (Service or Resource) | ||
|     entity_type: "Service" | ||
|
|
||
|     # Transform entity configuration for overriding entity attributes | ||
|     transform_entity: | ||
|       key_attributes: | ||
|         - key: "Name" | ||
|           value: "my-service" | ||
|         - key: "Environment" | ||
|           value: "production" | ||
|       attributes: | ||
|         - key: "AWS.ServiceNameSource" | ||
|           value: "UserConfiguration" | ||
| ``` | ||
|
|
||
| ### Configuration Parameters | ||
|
|
||
| | Parameter | Type | Default | Description | | ||
| |-----------|------|---------|-------------| | ||
| | `scrape_datapoint_attribute` | bool | `false` | Whether to scrape datapoint attributes for entity information | | ||
| | `cluster_name` | string | `""` | Explicit cluster name when auto-detection fails | | ||
| | `kubernetes_mode` | string | `""` | Kubernetes deployment mode (`eks`, `k8s_ec2`, `k8s_onprem`) | | ||
| | `platform` | string | `""` | Platform where agent is running (`ec2`, `ecs`, `eks`, `onprem`) | | ||
| | `entity_type` | string | `""` | Type of entity processing (`Service` or `Resource`) | | ||
| | `transform_entity` | object | `nil` | Configuration for overriding entity attributes | | ||
|
|
||
| ## Entity Types | ||
|
|
||
| ### Service Entity Type | ||
|
|
||
| When `entity_type` is set to `Service`, the processor: | ||
|
|
||
| 1. **Extracts service information** from resource attributes: | ||
| - `service.name` - Service name | ||
| - `deployment.environment` - Deployment environment | ||
| - `aws.log.group.names` - Associated log groups | ||
|
|
||
| 2. **Adds platform-specific attributes** based on the deployment environment: | ||
| - **EC2**: Instance ID, Auto Scaling Group, Account ID | ||
| - **EKS**: Cluster, Namespace, Workload, Node information | ||
| - **K8s**: Cluster, Namespace, Workload, Node information | ||
|
|
||
| 3. **Implements fallback mechanisms**: | ||
| - Service name fallback to Kubernetes workload name for unknown services | ||
| - Environment fallback to cluster/namespace combination | ||
| - Service name source detection (Instrumentation, UserConfiguration, K8sWorkload, etc.) | ||
|
|
||
| ### Resource Entity Type | ||
|
|
||
| When `entity_type` is set to `Resource`, the processor: | ||
|
|
||
| 1. **Adds AWS resource identification**: | ||
| - Resource type (e.g., `AWS::EC2::Instance`) | ||
| - Resource identifier (e.g., EC2 instance ID) | ||
| - AWS account ID | ||
|
|
||
| 2. **Sets platform type** based on deployment: | ||
| - `AWS::EC2` for EC2 instances | ||
| - `AWS::EKS` for EKS clusters | ||
| - `K8s` for Kubernetes clusters | ||
|
|
||
| ## Platform Support | ||
|
|
||
| ### EC2 Platform | ||
|
|
||
| For EC2 deployments, the processor: | ||
| - Retrieves instance metadata (Instance ID, Account ID) | ||
| - Detects Auto Scaling Group membership | ||
| - Applies service name fallbacks based on IAM roles or instance tags | ||
| - Sets default environment to `ec2:default` or `ec2:{asg-name}` | ||
|
|
||
| ### EKS Platform | ||
|
|
||
| For EKS deployments, the processor: | ||
| - Extracts Kubernetes metadata (cluster, namespace, workload, node) | ||
| - Uses workload name as service name fallback for unknown services | ||
| - Sets environment to `eks:{cluster-name}/{namespace}` | ||
| - Correlates pod information with service mappings | ||
|
|
||
| ### Kubernetes Platform | ||
|
|
||
| For generic Kubernetes deployments, the processor: | ||
| - Extracts Kubernetes metadata similar to EKS | ||
| - Sets environment to `k8s:{cluster-name}/{namespace}` | ||
| - Supports both EC2-based and on-premises Kubernetes clusters | ||
|
|
||
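The default environment strings listed for each platform above can be sketched as a single helper. This is a minimal illustration, not the processor's actual code: the function name `defaultEnvironment`, its parameters, and the platform labels `"ec2"`, `"eks"`, and `"k8s"` are assumptions made for this example; only the output formats come from the README text.

```go
package main

import "fmt"

// defaultEnvironment returns the fallback environment string described in the
// Platform Support section. The function and its parameters are hypothetical;
// only the output formats are taken from the README.
func defaultEnvironment(platform, asgName, clusterName, namespace string) string {
	switch platform {
	case "eks", "k8s":
		// eks:{cluster-name}/{namespace} or k8s:{cluster-name}/{namespace}
		return fmt.Sprintf("%s:%s/%s", platform, clusterName, namespace)
	case "ec2":
		// ec2:{asg-name} when the instance belongs to an Auto Scaling Group,
		// otherwise ec2:default.
		if asgName != "" {
			return "ec2:" + asgName
		}
		return "ec2:default"
	default:
		// Platforms not covered by the README are left empty in this sketch.
		return ""
	}
}

func main() {
	fmt.Println(defaultEnvironment("ec2", "", "", ""))
	fmt.Println(defaultEnvironment("ec2", "web-asg", "", ""))
	fmt.Println(defaultEnvironment("eks", "", "my-cluster", "payments"))
}
```

Keeping the format in one place makes it easy to see that the EKS and generic Kubernetes cases differ only in their prefix.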
| ## Entity Transformation | ||
|
|
||
| The `transform_entity` configuration allows overriding entity attributes: | ||
|
|
||
| ```yaml | ||
| transform_entity: | ||
|   key_attributes: | ||
|     - key: "Name" | ||
|       value: "override-service-name" | ||
|     - key: "Environment" | ||
|       value: "production" | ||
|   attributes: | ||
|     - key: "AWS.ServiceNameSource" | ||
|       value: "UserConfiguration" | ||
| ``` | ||
|
|
||
| ### Supported Key Attributes | ||
| - `Name` - Service name | ||
| - `Environment` - Deployment environment | ||
| - `Type` - Entity type | ||
| - `ResourceType` - AWS resource type | ||
| - `Identifier` - Resource identifier | ||
| - `AwsAccountId` - AWS account ID | ||
|
|
||
| ### Supported Attributes | ||
| - `K8s.Namespace` - Kubernetes namespace | ||
| - `K8s.Workload` - Kubernetes workload | ||
| - `K8s.Node` - Kubernetes node | ||
| - `PlatformType` - Platform type | ||
| - `EC2.InstanceId` - EC2 instance ID | ||
| - `EC2.AutoScalingGroup` - Auto Scaling Group | ||
| - `AWS.ServiceNameSource` - Service name source | ||
|
|
||
| ## Service Name Sources | ||
|
|
||
| The processor tracks service name sources with the following priority: | ||
|
|
||
| 1. **Instrumentation** - From OpenTelemetry SDK instrumentation | ||
| 2. **UserConfiguration** - From user-provided configuration | ||
| 3. **K8sWorkload** - From Kubernetes workload name | ||
| 4. **ClientIamRole** - From EC2 IAM role | ||
| 5. **Unknown** - When source cannot be determined | ||
|
|
||
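One way to read the priority list above is "take the first source that actually produced a name". The sketch below illustrates that reading under stated assumptions: `sourcePriority` and `resolveServiceName` are hypothetical names, candidates are passed in as a plain map, and an empty name is returned for `Unknown` because the README does not say what placeholder the processor uses.

```go
package main

import "fmt"

// sourcePriority mirrors the ordering in the README, highest priority first.
// "Unknown" is not listed because it is the result when nothing else matches.
var sourcePriority = []string{
	"Instrumentation",
	"UserConfiguration",
	"K8sWorkload",
	"ClientIamRole",
}

// resolveServiceName picks the name from the highest-priority source that has
// a non-empty candidate. It is a hypothetical helper, not the processor's API.
func resolveServiceName(candidates map[string]string) (name, source string) {
	for _, src := range sourcePriority {
		if n := candidates[src]; n != "" {
			return n, src
		}
	}
	return "", "Unknown"
}

func main() {
	name, source := resolveServiceName(map[string]string{
		"K8sWorkload":   "checkout",
		"ClientIamRole": "checkout-role",
	})
	fmt.Println(name, source)
}
```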
| ## Examples | ||
|
|
||
| ### Basic Service Entity Configuration | ||
|
|
||
| ```yaml | ||
| processors: | ||
|   awsentity: | ||
|     entity_type: "Service" | ||
|     platform: "ec2" | ||
| ``` | ||
|
|
||
| ### EKS Service Entity Configuration | ||
|
|
||
| ```yaml | ||
| processors: | ||
|   awsentity: | ||
|     entity_type: "Service" | ||
|     platform: "ec2" | ||
|     kubernetes_mode: "eks" | ||
|     cluster_name: "my-eks-cluster" | ||
| ``` | ||
|
|
||
| ### Resource Entity Configuration | ||
|
|
||
| ```yaml | ||
| processors: | ||
|   awsentity: | ||
|     entity_type: "Resource" | ||
|     platform: "ec2" | ||
| ``` | ||
|
|
||
| ### Service Entity with Transformation | ||
|
|
||
| ```yaml | ||
| processors: | ||
|   awsentity: | ||
|     entity_type: "Service" | ||
|     platform: "ec2" | ||
|     transform_entity: | ||
|       key_attributes: | ||
|         - key: "Name" | ||
|           value: "my-application" | ||
|         - key: "Environment" | ||
|           value: "production" | ||
|       attributes: | ||
|         - key: "AWS.ServiceNameSource" | ||
|           value: "UserConfiguration" | ||
| ``` | ||
|
|
||
| ## Datapoint Attribute Scraping | ||
|
|
||
| When `scrape_datapoint_attribute` is enabled, the processor examines individual metric datapoints for entity information. This is useful for components that emit attributes at the datapoint level rather than resource level (e.g., Telegraf plugins). | ||
|
|
||
| The processor will: | ||
| 1. Extract service name and environment from datapoint attributes | ||
| 2. Remove these attributes from datapoints to avoid duplication | ||
| 3. Apply the extracted information at the resource level | ||
|
|
||
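The three scraping steps above can be sketched with plain maps standing in for the collector's attribute types. Everything here is illustrative: `scrapeDatapointAttributes` and `entityKeys` are hypothetical names, and the choice to keep an existing resource-level value rather than overwrite it is an assumption of this sketch, not documented behavior.

```go
package main

import "fmt"

// entityKeys are the datapoint attributes this sketch treats as entity
// information (service name and environment, as in the README).
var entityKeys = []string{"service.name", "deployment.environment"}

// scrapeDatapointAttributes moves entity attributes from each datapoint to the
// resource. Plain maps are used instead of the collector's pdata types to keep
// the example self-contained.
func scrapeDatapointAttributes(resource map[string]string, datapoints []map[string]string) {
	for _, dp := range datapoints {
		for _, key := range entityKeys {
			value, ok := dp[key]
			if !ok {
				continue
			}
			// Step 2: remove the attribute from the datapoint to avoid duplication.
			delete(dp, key)
			// Step 3: apply it at the resource level. Assumption: the first
			// value seen wins and an existing resource value is kept.
			if _, exists := resource[key]; !exists {
				resource[key] = value
			}
		}
	}
}

func main() {
	resource := map[string]string{}
	datapoints := []map[string]string{
		{"service.name": "checkout", "deployment.environment": "prod", "host": "a"},
		{"service.name": "checkout", "host": "b"},
	}
	scrapeDatapointAttributes(resource, datapoints)
	fmt.Println(resource)
	fmt.Println(datapoints)
}
```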
| ## Log Group Association | ||
|
|
||
| For Service entities, the processor creates associations between log groups and services when both `aws.log.group.names` and service information are present. This enables correlation between metrics and logs in CloudWatch. | ||
|
|
||
| ## Validation | ||
|
|
||
| The processor validates entity configurations: | ||
| - Key attributes must use allowed attribute names | ||
| - Attribute values cannot be empty | ||
| - Platform-specific required fields must be present | ||
|
|
||
| Refer to [config_test.go](./config_test.go) for detailed configuration examples and validation scenarios. | ||
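The first two validation rules above (allowed names, non-empty values) can be sketched as follows. The `KeyPair` type and `validateKeyAttributes` function are hypothetical names for this example; the allowed names are copied from the "Supported Key Attributes" list, and the real rules live in config.go and config_test.go.

```go
package main

import "fmt"

// KeyPair is a hypothetical stand-in for one entry under key_attributes.
type KeyPair struct {
	Key   string
	Value string
}

// allowedKeyAttributes is copied from the "Supported Key Attributes" list.
var allowedKeyAttributes = map[string]bool{
	"Name":         true,
	"Environment":  true,
	"Type":         true,
	"ResourceType": true,
	"Identifier":   true,
	"AwsAccountId": true,
}

// validateKeyAttributes rejects unknown attribute names and empty values.
// Platform-specific required fields are not checked in this sketch.
func validateKeyAttributes(pairs []KeyPair) error {
	for _, p := range pairs {
		if !allowedKeyAttributes[p.Key] {
			return fmt.Errorf("unsupported key attribute %q", p.Key)
		}
		if p.Value == "" {
			return fmt.Errorf("key attribute %q has an empty value", p.Key)
		}
	}
	return nil
}

func main() {
	err := validateKeyAttributes([]KeyPair{
		{Key: "Name", Value: "my-application"},
		{Key: "Region", Value: "us-east-1"},
	})
	fmt.Println(err)
}
```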
It would be good to link the public documentation on what an entity is and what it can be used for: https://docs.aws.amazon.com/AmazonCloudWatch/latest/APIReference/API_Entity.html
Also, adding the processor on its own is not enough: it does not work natively with the CloudWatch exporter. The exporters need to parse these attributes and plug them into the API. That is worth calling out.