diff --git a/plugins/processors/awsentity/README.md b/plugins/processors/awsentity/README.md
new file mode 100644
index 0000000000..871d34d0fc
--- /dev/null
+++ b/plugins/processors/awsentity/README.md
@@ -0,0 +1,239 @@
+# AWS Entity Processor
+
+
+| Status | |
+| ------------- |-----------|
+| Stability | [beta]: metrics |
+| Distributions | [amazon-cloudwatch-agent] |
+| Issues | [![Open issues](https://img.shields.io/github/issues-search/open?query=is%3Aissue%20is%3Aopen%20label%3Aprocessor%2Fawsentity%20&label=open&color=orange)](https://github.com/aws/amazon-cloudwatch-agent/issues?q=is%3Aopen+is%3Aissue+label%3Aprocessor%2Fawsentity) [![Closed issues](https://img.shields.io/github/issues-search/closed?query=is%3Aissue%20is%3Aclosed%20label%3Aprocessor%2Fawsentity%20&label=closed&color=blue)](https://github.com/aws/amazon-cloudwatch-agent/issues?q=is%3Aclosed+is%3Aissue+label%3Aprocessor%2Fawsentity) |
+| [Code Owners](https://github.com/aws/amazon-cloudwatch-agent/blob/main/CONTRIBUTING.md#becoming-a-code-owner) | [@aws/amazon-cloudwatch-agent-team](https://www.github.com/orgs/aws/teams/amazon-cloudwatch-agent-team) |
+
+[beta]: https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/component-stability.md#beta
+[amazon-cloudwatch-agent]: https://github.com/aws/amazon-cloudwatch-agent
+
+
+The AWS Entity processor enriches telemetry data with AWS-specific entity information for improved observability and correlation in Amazon CloudWatch. It processes metrics to add entity attributes that help identify and categorize AWS resources and services. Please refer to [config.go](./config.go) for the config spec.
+
+This processor supports two main entity types:
+- **Service**: Enriches telemetry with service-level entity information
+- **Resource**: Enriches telemetry with AWS resource-level entity information
+
+## Configuration
+
+The processor supports the following configuration options:
+
+```yaml
+processors:
+  awsentity:
+    # Determines if the processor should scrape OTEL datapoint attributes for entity information.
+    # Mainly used for components that emit all attributes to datapoint level instead of resource level.
+    scrape_datapoint_attribute: false
+
+    # Explicitly provide the Cluster's Name for scenarios where auto-detection is not possible.
+    cluster_name: "my-cluster"
+
+    # Kubernetes mode (eks, k8s_ec2, k8s_onprem)
+    kubernetes_mode: "eks"
+
+    # Platform mode (ec2, ecs, eks, onprem)
+    platform: "ec2"
+
+    # Entity type determines the type of entity processing (Service or Resource)
+    entity_type: "Service"
+
+    # Transform entity configuration for overriding entity attributes
+    transform_entity:
+      key_attributes:
+        - key: "Name"
+          value: "my-service"
+        - key: "Environment"
+          value: "production"
+      attributes:
+        - key: "AWS.ServiceNameSource"
+          value: "UserConfiguration"
+```
+
+### Configuration Parameters
+
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `scrape_datapoint_attribute` | bool | `false` | Whether to scrape datapoint attributes for entity information |
+| `cluster_name` | string | `""` | Explicit cluster name when auto-detection fails |
+| `kubernetes_mode` | string | `""` | Kubernetes deployment mode (`eks`, `k8s_ec2`, `k8s_onprem`) |
+| `platform` | string | `""` | Platform where agent is running (`ec2`, `ecs`, `eks`, `onprem`) |
+| `entity_type` | string | `""` | Type of entity processing (`Service` or `Resource`) |
+| `transform_entity` | object | `nil` | Configuration for overriding entity attributes |
+
+## Entity Types
+
+### Service Entity Type
+
+When `entity_type` is set to `Service`, the processor:
+
+1. **Extracts service information** from resource attributes:
+   - `service.name` - Service name
+   - `deployment.environment` - Deployment environment
+   - `aws.log.group.names` - Associated log groups
+
+2. **Adds platform-specific attributes** based on the deployment environment:
+   - **EC2**: Instance ID, Auto Scaling Group, Account ID
+   - **EKS**: Cluster, Namespace, Workload, Node information
+   - **K8s**: Cluster, Namespace, Workload, Node information
+
+3. **Implements fallback mechanisms**:
+   - Service name fallback to Kubernetes workload name for unknown services
+   - Environment fallback to cluster/namespace combination
+   - Service name source detection (Instrumentation, UserConfiguration, K8sWorkload, etc.)
+
+### Resource Entity Type
+
+When `entity_type` is set to `Resource`, the processor:
+
+1. **Adds AWS resource identification**:
+   - Resource type (e.g., `AWS::EC2::Instance`)
+   - Resource identifier (e.g., EC2 instance ID)
+   - AWS account ID
+
+2. **Sets platform type** based on deployment:
+   - `AWS::EC2` for EC2 instances
+   - `AWS::EKS` for EKS clusters
+   - `K8s` for Kubernetes clusters
+
+## Platform Support
+
+### EC2 Platform
+
+For EC2 deployments, the processor:
+- Retrieves instance metadata (Instance ID, Account ID)
+- Detects Auto Scaling Group membership
+- Applies service name fallbacks based on IAM roles or instance tags
+- Sets default environment to `ec2:default` or `ec2:{asg-name}`
+
+### EKS Platform
+
+For EKS deployments, the processor:
+- Extracts Kubernetes metadata (cluster, namespace, workload, node)
+- Uses workload name as service name fallback for unknown services
+- Sets environment to `eks:{cluster-name}/{namespace}`
+- Correlates pod information with service mappings
+
+### Kubernetes Platform
+
+For generic Kubernetes deployments, the processor:
+- Extracts Kubernetes metadata similar to EKS
+- Sets environment to `k8s:{cluster-name}/{namespace}`
+- Supports both EC2-based and on-premises Kubernetes clusters
+
+## Entity Transformation
+
+The `transform_entity` configuration allows overriding entity attributes:
+
+```yaml
+transform_entity:
+  key_attributes:
+    - key: "Name"
+      value: "override-service-name"
+    - key: "Environment"
+      value: "production"
+  attributes:
+    - key: "AWS.ServiceNameSource"
+      value: "UserConfiguration"
+```
+
+### Supported Key Attributes
+- `Name` - Service name
+- `Environment` - Deployment environment
+- `Type` - Entity type
+- `ResourceType` - AWS resource type
+- `Identifier` - Resource identifier
+- `AwsAccountId` - AWS account ID
+
+### Supported Attributes
+- `K8s.Namespace` - Kubernetes namespace
+- `K8s.Workload` - Kubernetes workload
+- `K8s.Node` - Kubernetes node
+- `PlatformType` - Platform type
+- `EC2.InstanceId` - EC2 instance ID
+- `EC2.AutoScalingGroup` - Auto Scaling Group
+- `AWS.ServiceNameSource` - Service name source
+
+## Service Name Sources
+
+The processor tracks service name sources with the following priority:
+
+1. **Instrumentation** - From OpenTelemetry SDK instrumentation
+2. **UserConfiguration** - From user-provided configuration
+3. **K8sWorkload** - From Kubernetes workload name
+4. **ClientIamRole** - From EC2 IAM role
+5. **Unknown** - When source cannot be determined
+
+## Examples
+
+### Basic Service Entity Configuration
+
+```yaml
+processors:
+  awsentity:
+    entity_type: "Service"
+    platform: "ec2"
+```
+
+### EKS Service Entity Configuration
+
+```yaml
+processors:
+  awsentity:
+    entity_type: "Service"
+    platform: "ec2"
+    kubernetes_mode: "eks"
+    cluster_name: "my-eks-cluster"
+```
+
+### Resource Entity Configuration
+
+```yaml
+processors:
+  awsentity:
+    entity_type: "Resource"
+    platform: "ec2"
+```
+
+### Service Entity with Transformation
+
+```yaml
+processors:
+  awsentity:
+    entity_type: "Service"
+    platform: "ec2"
+    transform_entity:
+      key_attributes:
+        - key: "Name"
+          value: "my-application"
+        - key: "Environment"
+          value: "production"
+      attributes:
+        - key: "AWS.ServiceNameSource"
+          value: "UserConfiguration"
+```
+
+## Datapoint Attribute Scraping
+
+When `scrape_datapoint_attribute` is enabled, the processor examines individual metric datapoints for entity information. This is useful for components that emit attributes at the datapoint level rather than resource level (e.g., Telegraf plugins).
+
+The processor will:
+1. Extract service name and environment from datapoint attributes
+2. Remove these attributes from datapoints to avoid duplication
+3. Apply the extracted information at the resource level
+
+## Log Group Association
+
+For Service entities, the processor creates associations between log groups and services when both `aws.log.group.names` and service information are present. This enables correlation between metrics and logs in CloudWatch.
+
+## Validation
+
+The processor validates entity configurations:
+- Key attributes must use allowed attribute names
+- Attribute values cannot be empty
+- Platform-specific required fields must be present
+
+Refer to [config_test.go](./config_test.go) for detailed configuration examples and validation scenarios.
\ No newline at end of file
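
The first two validation rules listed in the README's Validation section can be sketched in standalone Go as follows. This is an illustration only, not the processor's actual validation code: `keyAttribute`, `allowedKeyAttributes`, and `validateKeyAttributes` are hypothetical names, and the allowed set is copied from the README's "Supported Key Attributes" list. The platform-specific required-field checks are not modeled because the README does not spell them out.

```go
package main

import "fmt"

// keyAttribute mirrors one key/value pair under
// transform_entity.key_attributes. Hypothetical type, for illustration.
type keyAttribute struct {
	Key   string
	Value string
}

// allowedKeyAttributes is copied from the README's
// "Supported Key Attributes" list (an assumption about the real allow-list).
var allowedKeyAttributes = map[string]bool{
	"Name":         true,
	"Environment":  true,
	"Type":         true,
	"ResourceType": true,
	"Identifier":   true,
	"AwsAccountId": true,
}

// validateKeyAttributes returns the first violation it finds:
// a key outside the allowed set, or an empty value.
func validateKeyAttributes(attrs []keyAttribute) error {
	for _, a := range attrs {
		if !allowedKeyAttributes[a.Key] {
			return fmt.Errorf("unsupported key attribute %q", a.Key)
		}
		if a.Value == "" {
			return fmt.Errorf("empty value for key attribute %q", a.Key)
		}
	}
	return nil
}

func main() {
	valid := []keyAttribute{
		{Key: "Name", Value: "my-application"},
		{Key: "Environment", Value: "production"},
	}
	invalid := []keyAttribute{{Key: "Color", Value: "blue"}}

	fmt.Println("valid config error:", validateKeyAttributes(valid))
	fmt.Println("invalid config error:", validateKeyAttributes(invalid))
}
```

Returning on the first violation keeps the sketch short; collecting every violation before returning would likely be friendlier for users fixing a config file, and `config_test.go` remains the authoritative source for what the processor really accepts.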