aws · zhihonl · Jun 23, 2025 · lisguo · Jun 30, 2025
@@ -0,0 +1,239 @@
+# AWS Entity Processor
+
+<!-- status autogenerated section -->
+| Status        |           |
+| ------------- |-----------|
+| Stability     | [beta]: metrics   |
+| Distributions | [amazon-cloudwatch-agent] |
+| Issues        | [![Open issues](https://img.shields.io/github/issues-search/open?query=is%3Aissue%20is%3Aopen%20label%3Aprocessor%2Fawsentity%20&label=open&color=orange)](https://github.com/aws/amazon-cloudwatch-agent/issues?q=is%3Aopen+is%3Aissue+label%3Aprocessor%2Fawsentity) [![Closed issues](https://img.shields.io/github/issues-search/closed?query=is%3Aissue%20is%3Aclosed%20label%3Aprocessor%2Fawsentity%20&label=closed&color=blue)](https://github.com/aws/amazon-cloudwatch-agent/issues?q=is%3Aclosed+is%3Aissue+label%3Aprocessor%2Fawsentity) |
+| [Code Owners](https://github.com/aws/amazon-cloudwatch-agent/blob/main/CONTRIBUTING.md#becoming-a-code-owner)    | [@aws/amazon-cloudwatch-agent-team](https://www.github.com/orgs/aws/teams/amazon-cloudwatch-agent-team) |
+
+[beta]: https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/component-stability.md#beta
+[amazon-cloudwatch-agent]: https://github.com/aws/amazon-cloudwatch-agent
+<!-- end autogenerated section -->
+
+The AWS Entity processor enriches telemetry with AWS-specific entity information so that it can be identified and correlated in Amazon CloudWatch. It adds entity attributes to metrics that identify and categorize AWS resources and services. See [config.go](./config.go) for the configuration spec.
+
+This processor supports two main entity types:
+- **Service**: Enriches telemetry with service-level entity information
+- **Resource**: Enriches telemetry with AWS resource-level entity information
+
+## Configuration
+
+The processor supports the following configuration options:
+
+```yaml
+processors:
+  awsentity:
+    # Determines if the processor should scrape OTEL datapoint attributes for entity information.
+    # Mainly used for components that emit all attributes to datapoint level instead of resource level.
+    scrape_datapoint_attribute: false
+
+    # Explicitly set the cluster name for scenarios where auto-detection is not possible.
+    cluster_name: "my-cluster"
+
+    # Kubernetes mode (eks, k8s_ec2, k8s_onprem)
+    kubernetes_mode: "eks"
+
+    # Platform mode (ec2, ecs, eks, onprem)
+    platform: "ec2"
+
+    # Entity type determines the type of entity processing (Service or Resource)
+    entity_type: "Service"
+
+    # Transform entity configuration for overriding entity attributes
+    transform_entity:
+      key_attributes:
+        - key: "Name"
+          value: "my-service"
+        - key: "Environment"
+          value: "production"
+      attributes:
+        - key: "AWS.ServiceNameSource"
+          value: "UserConfiguration"
+```
+
+### Configuration Parameters
+
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `scrape_datapoint_attribute` | bool | `false` | Whether to scrape datapoint attributes for entity information |
+| `cluster_name` | string | `""` | Explicit cluster name when auto-detection fails |
+| `kubernetes_mode` | string | `""` | Kubernetes deployment mode (`eks`, `k8s_ec2`, `k8s_onprem`) |
+| `platform` | string | `""` | Platform where agent is running (`ec2`, `ecs`, `eks`, `onprem`) |
+| `entity_type` | string | `""` | Type of entity processing (`Service` or `Resource`) |
+| `transform_entity` | object | `nil` | Configuration for overriding entity attributes |
+
+## Entity Types
+
+### Service Entity Type
+
+When `entity_type` is set to `Service`, the processor:
+
+1. **Extracts service information** from resource attributes:
+   - `service.name` - Service name
+   - `deployment.environment` - Deployment environment
+   - `aws.log.group.names` - Associated log groups
+
+2. **Adds platform-specific attributes** based on the platform the agent runs on:
+   - **EC2**: Instance ID, Auto Scaling Group, Account ID
+   - **EKS**: Cluster, Namespace, Workload, Node information
+   - **K8s**: Cluster, Namespace, Workload, Node information
+
+3. **Implements fallback mechanisms**:
+   - Falls back to the Kubernetes workload name when the service name is unknown
+   - Falls back to a cluster/namespace combination when no environment is provided
+   - Records the service name source (Instrumentation, UserConfiguration, K8sWorkload, etc.)
+
+### Resource Entity Type
+
+When `entity_type` is set to `Resource`, the processor:
+
+1. **Adds AWS resource identification**:
+   - Resource type (e.g., `AWS::EC2::Instance`)
+   - Resource identifier (e.g., EC2 instance ID)
+   - AWS account ID
+
+2. **Sets platform type** based on deployment:
+   - `AWS::EC2` for EC2 instances
+   - `AWS::EKS` for EKS clusters
+   - `K8s` for Kubernetes clusters
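+
+As a rough illustration of the fields listed above, the entity attached for an EC2 instance might carry values like the following. This is a sketch of the data, not the processor's actual output format and not a configuration fragment. The instance ID and account ID are placeholders, and the field names are borrowed from the key attribute and attribute names accepted by `transform_entity`.
+
+```yaml
+# Illustrative only: placeholder values, not real processor output.
+key_attributes:
+  ResourceType: "AWS::EC2::Instance"
+  Identifier: "i-0123456789abcdef0"   # placeholder EC2 instance ID
+  AwsAccountId: "123456789012"        # placeholder AWS account ID
+attributes:
+  PlatformType: "AWS::EC2"
+```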
+
+## Platform Support
+
+### EC2 Platform
+
+For EC2 deployments, the processor:
+- Retrieves instance metadata (Instance ID, Account ID)
+- Detects Auto Scaling Group membership
+- Applies service name fallbacks based on IAM roles or instance tags
+- Sets default environment to `ec2:default` or `ec2:{asg-name}`
+
+### EKS Platform
+
+For EKS deployments, the processor:
+- Extracts Kubernetes metadata (cluster, namespace, workload, node)
+- Uses workload name as service name fallback for unknown services
+- Sets environment to `eks:{cluster-name}/{namespace}`
+- Correlates pod information with service mappings
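+
+To make the fallbacks above concrete, suppose a pod belonging to a workload named `checkout` runs in namespace `payments` on cluster `my-eks-cluster`, and the telemetry carries no usable service name or environment. Under those assumptions the resulting Service entity would look roughly like the sketch below. All names are hypothetical, and the layout is illustrative rather than the processor's actual output format.
+
+```yaml
+# Illustrative only: hypothetical workload, namespace, cluster, and node names.
+key_attributes:
+  Name: "checkout"                             # falls back to the workload name
+  Environment: "eks:my-eks-cluster/payments"   # eks:{cluster-name}/{namespace}
+attributes:
+  K8s.Namespace: "payments"
+  K8s.Workload: "checkout"
+  K8s.Node: "node-1"
+  PlatformType: "AWS::EKS"
+  AWS.ServiceNameSource: "K8sWorkload"
+```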
+
+### Kubernetes Platform
+
+For generic Kubernetes deployments, the processor:
+- Extracts Kubernetes metadata similar to EKS
+- Sets environment to `k8s:{cluster-name}/{namespace}`
+- Supports both EC2-based and on-premises Kubernetes clusters
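+
+A configuration for an on-premises cluster might look like the following sketch. The option values come from the lists in the Configuration section; pairing `platform: "onprem"` with `kubernetes_mode: "k8s_onprem"` is an assumption, and `my-k8s-cluster` is a placeholder name.
+
+```yaml
+processors:
+  awsentity:
+    entity_type: "Service"
+    # Assumed pairing for a cluster that does not run on EC2.
+    platform: "onprem"
+    kubernetes_mode: "k8s_onprem"
+    # Placeholder name, set explicitly in case auto-detection is not possible.
+    cluster_name: "my-k8s-cluster"
+```
+
+For a self-managed cluster on EC2 instances, `kubernetes_mode: "k8s_ec2"` would presumably be the corresponding value.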
+
+## Entity Transformation
+
+The `transform_entity` configuration allows overriding entity attributes:
+
+```yaml
+transform_entity:
+  key_attributes:
+    - key: "Name"
+      value: "override-service-name"
+    - key: "Environment"
+      value: "production"
+  attributes:
+    - key: "AWS.ServiceNameSource"
+      value: "UserConfiguration"
+```
+
+### Supported Key Attributes
+- `Name` - Service name
+- `Environment` - Deployment environment
+- `Type` - Entity type
+- `ResourceType` - AWS resource type
+- `Identifier` - Resource identifier
+- `AwsAccountId` - AWS account ID
+
+### Supported Attributes
+- `K8s.Namespace` - Kubernetes namespace
+- `K8s.Workload` - Kubernetes workload
+- `K8s.Node` - Kubernetes node
+- `PlatformType` - Platform type
+- `EC2.InstanceId` - EC2 instance ID
+- `EC2.AutoScalingGroup` - Auto Scaling Group
+- `AWS.ServiceNameSource` - Service name source
+
+## Service Name Sources
+
+The processor records where each service name came from. The possible sources, in priority order (highest first), are:
+
+1. **Instrumentation** - From OpenTelemetry SDK instrumentation
+2. **UserConfiguration** - From user-provided configuration
+3. **K8sWorkload** - From Kubernetes workload name
+4. **ClientIamRole** - From EC2 IAM role
+5. **Unknown** - When source cannot be determined
+
+## Examples
+
+### Basic Service Entity Configuration
+
+```yaml
+processors:
+  awsentity:
+    entity_type: "Service"
+    platform: "ec2"
+```
+
+### EKS Service Entity Configuration
+
+```yaml
+processors:
+  awsentity:
+    entity_type: "Service"
+    platform: "ec2"
+    kubernetes_mode: "eks"
+    cluster_name: "my-eks-cluster"
+```
+
+### Resource Entity Configuration
+
+```yaml
+processors:
+  awsentity:
+    entity_type: "Resource"
+    platform: "ec2"
+```
+
+### Service Entity with Transformation
+
+```yaml
+processors:
+  awsentity:
+    entity_type: "Service"
+    platform: "ec2"
+    transform_entity:
+      key_attributes:
+        - key: "Name"
+          value: "my-application"
+        - key: "Environment"
+          value: "production"
+      attributes:
+        - key: "AWS.ServiceNameSource"
+          value: "UserConfiguration"
+```
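+
+### Adding the Processor to a Metrics Pipeline
+
+The examples above only define the processor. Like any collector processor, it takes effect once it is listed in a pipeline. The fragment below is a minimal sketch of that wiring for metrics, the signal this processor currently supports. The `otlp` receiver and `awscloudwatch` exporter names are placeholders for whatever components the deployment already uses, and they would need their own definitions elsewhere in the file.
+
+```yaml
+processors:
+  awsentity:
+    entity_type: "Service"
+    platform: "ec2"
+
+service:
+  pipelines:
+    metrics:
+      receivers: [otlp]            # placeholder receiver
+      processors: [awsentity]
+      exporters: [awscloudwatch]   # placeholder exporter
+```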
+
+## Datapoint Attribute Scraping
+
+When `scrape_datapoint_attribute` is enabled, the processor examines individual metric datapoints for entity information. This is useful for components that emit attributes at the datapoint level rather than resource level (e.g., Telegraf plugins).
+
+The processor will:
+1. Extract service name and environment from datapoint attributes
+2. Remove these attributes from datapoints to avoid duplication
+3. Apply the extracted information at the resource level
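+
+A minimal configuration that turns this behavior on might look like the sketch below. The `entity_type` and `platform` values are only examples; the relevant setting is `scrape_datapoint_attribute`, which defaults to `false`.
+
+```yaml
+processors:
+  awsentity:
+    entity_type: "Service"
+    platform: "ec2"
+    # Look for service name and environment on each datapoint
+    # in addition to the resource attributes.
+    scrape_datapoint_attribute: true
+```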
+
+## Log Group Association
+
+For Service entities, the processor creates associations between log groups and services when both `aws.log.group.names` and service information are present. This enables correlation between metrics and logs in CloudWatch.
+
+## Validation
+
+The processor validates entity configurations:
+- Key attributes must use allowed attribute names
+- Attribute values cannot be empty
+- Platform-specific required fields must be present
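+
+As a sketch of what the first two rules would reject, the configuration below contains two deliberate mistakes. The key `ServiceName` is a made-up name that does not appear in the supported key attribute list, and the second entry has an empty value. The exact error reported for each case is not shown here.
+
+```yaml
+processors:
+  awsentity:
+    entity_type: "Service"
+    platform: "ec2"
+    transform_entity:
+      key_attributes:
+        - key: "ServiceName"   # invalid: not a supported key attribute (use "Name")
+          value: "my-service"
+        - key: "Environment"
+          value: ""            # invalid: attribute values cannot be empty
+```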
+
+Refer to [config_test.go](./config_test.go) for detailed configuration examples and validation scenarios.