239 changes: 239 additions & 0 deletions plugins/processors/awsentity/README.md
@@ -0,0 +1,239 @@
# AWS Entity Processor

<!-- status autogenerated section -->
| Status | |
| ------------- |-----------|
| Stability | [beta]: metrics |
| Distributions | [amazon-cloudwatch-agent] |
| Issues | [![Open issues](https://img.shields.io/github/issues-search/open?query=is%3Aissue%20is%3Aopen%20label%3Aprocessor%2Fawsentity%20&label=open&color=orange)](https://github.com/aws/amazon-cloudwatch-agent/issues?q=is%3Aopen+is%3Aissue+label%3Aprocessor%2Fawsentity) [![Closed issues](https://img.shields.io/github/issues-search/closed?query=is%3Aissue%20is%3Aclosed%20label%3Aprocessor%2Fawsentity%20&label=closed&color=blue)](https://github.com/aws/amazon-cloudwatch-agent/issues?q=is%3Aclosed+is%3Aissue+label%3Aprocessor%2Fawsentity) |
| [Code Owners](https://github.com/aws/amazon-cloudwatch-agent/blob/main/CONTRIBUTING.md#becoming-a-code-owner) | [@aws/amazon-cloudwatch-agent-team](https://www.github.com/orgs/aws/teams/amazon-cloudwatch-agent-team) |

[beta]: https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/component-stability.md#beta
[amazon-cloudwatch-agent]: https://github.com/aws/amazon-cloudwatch-agent
<!-- end autogenerated section -->

The AWS Entity processor enriches telemetry data with AWS-specific entity information for improved observability and correlation in Amazon CloudWatch. It processes metrics to add entity attributes that help identify and categorize AWS resources and services. Please refer to [config.go](./config.go) for the config spec.
Review comment (Contributor):

It would be good to link the public documentation on what an entity is and what it can be used for: https://docs.aws.amazon.com/AmazonCloudWatch/latest/APIReference/API_Entity.html

Also, adding the processor alone does not work natively with the CloudWatch exporter. The exporters need to parse these attributes and plug them into the API. This is worth calling out.


This processor supports two main entity types:
- **Service**: Enriches telemetry with service-level entity information
- **Resource**: Enriches telemetry with AWS resource-level entity information

## Configuration

The processor supports the following configuration options:

```yaml
processors:
  awsentity:
    # Determines if the processor should scrape OTEL datapoint attributes for entity information.
    # Mainly used for components that emit all attributes to datapoint level instead of resource level.
    scrape_datapoint_attribute: false

    # Explicitly provide the Cluster's Name for scenarios where auto-detection is not possible.
    cluster_name: "my-cluster"

    # Kubernetes mode (eks, k8s_ec2, k8s_onprem)
    kubernetes_mode: "eks"

    # Platform mode (ec2, ecs, eks, onprem)
    platform: "ec2"

    # Entity type determines the type of entity processing (Service or Resource)
    entity_type: "Service"

    # Transform entity configuration for overriding entity attributes
    transform_entity:
      key_attributes:
        - key: "Name"
          value: "my-service"
        - key: "Environment"
          value: "production"
      attributes:
        - key: "AWS.ServiceNameSource"
          value: "UserConfiguration"
```

### Configuration Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `scrape_datapoint_attribute` | bool | `false` | Whether to scrape datapoint attributes for entity information |
| `cluster_name` | string | `""` | Explicit cluster name when auto-detection fails |
| `kubernetes_mode` | string | `""` | Kubernetes deployment mode (`eks`, `k8s_ec2`, `k8s_onprem`) |
| `platform` | string | `""` | Platform where agent is running (`ec2`, `ecs`, `eks`, `onprem`) |
| `entity_type` | string | `""` | Type of entity processing (`Service` or `Resource`) |
| `transform_entity` | object | `nil` | Configuration for overriding entity attributes |

## Entity Types

### Service Entity Type

When `entity_type` is set to `Service`, the processor:

1. **Extracts service information** from resource attributes:
   - `service.name` - Service name
   - `deployment.environment` - Deployment environment
   - `aws.log.group.names` - Associated log groups

2. **Adds platform-specific attributes** based on the deployment environment:
   - **EC2**: Instance ID, Auto Scaling Group, Account ID
   - **EKS**: Cluster, Namespace, Workload, Node information
   - **K8s**: Cluster, Namespace, Workload, Node information

3. **Implements fallback mechanisms**:
   - Service name fallback to Kubernetes workload name for unknown services
   - Environment fallback to cluster/namespace combination
   - Service name source detection (Instrumentation, UserConfiguration, K8sWorkload, etc.)

### Resource Entity Type

When `entity_type` is set to `Resource`, the processor:

1. **Adds AWS resource identification**:
   - Resource type (e.g., `AWS::EC2::Instance`)
   - Resource identifier (e.g., EC2 instance ID)
   - AWS account ID

2. **Sets platform type** based on deployment:
   - `AWS::EC2` for EC2 instances
   - `AWS::EKS` for EKS clusters
   - `K8s` for Kubernetes clusters
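
To make the fields above concrete, the following Go sketch assembles them for an EC2 instance. The `resourceEntity` type and the `newEC2ResourceEntity` function are hypothetical names invented for this illustration and are not taken from the processor's source. The key names (`ResourceType`, `Identifier`, `AwsAccountId`, `PlatformType`) are the ones this README lists as supported.

```go
package main

import "fmt"

// resourceEntity is a hypothetical container used only in this sketch.
// The processor itself records entity fields as telemetry attributes.
type resourceEntity struct {
	KeyAttributes map[string]string
	Attributes    map[string]string
}

// newEC2ResourceEntity fills in the resource identification fields and the
// platform type described above for a single EC2 instance.
func newEC2ResourceEntity(instanceID, accountID string) resourceEntity {
	return resourceEntity{
		KeyAttributes: map[string]string{
			"ResourceType": "AWS::EC2::Instance",
			"Identifier":   instanceID,
			"AwsAccountId": accountID,
		},
		Attributes: map[string]string{
			"PlatformType": "AWS::EC2",
		},
	}
}

func main() {
	e := newEC2ResourceEntity("i-0123456789abcdef0", "123456789012")
	fmt.Println(e.KeyAttributes["ResourceType"], e.KeyAttributes["Identifier"], e.Attributes["PlatformType"])
}
```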

## Platform Support

### EC2 Platform

For EC2 deployments, the processor:
- Retrieves instance metadata (Instance ID, Account ID)
- Detects Auto Scaling Group membership
- Applies service name fallbacks based on IAM roles or instance tags
- Sets default environment to `ec2:default` or `ec2:{asg-name}`

### EKS Platform

For EKS deployments, the processor:
- Extracts Kubernetes metadata (cluster, namespace, workload, node)
- Uses workload name as service name fallback for unknown services
- Sets environment to `eks:{cluster-name}/{namespace}`
- Correlates pod information with service mappings

### Kubernetes Platform

For generic Kubernetes deployments, the processor:
- Extracts Kubernetes metadata similar to EKS
- Sets environment to `k8s:{cluster-name}/{namespace}`
- Supports both EC2-based and on-premises Kubernetes clusters
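
The environment defaults described in the three platform sections can be summarized in a short Go sketch. This is only an illustration of the string formats stated above. The function name `defaultEnvironment`, its parameters, and the platform labels `"ec2"`, `"eks"`, and `"k8s"` are assumptions made for the example, not the processor's actual API.

```go
package main

import "fmt"

// defaultEnvironment returns the fallback environment string for a platform.
// The platform labels are chosen for this sketch only.
func defaultEnvironment(platform, asgName, clusterName, namespace string) string {
	switch platform {
	case "ec2":
		// ec2:{asg-name} when the instance belongs to an Auto Scaling Group.
		if asgName != "" {
			return "ec2:" + asgName
		}
		return "ec2:default"
	case "eks":
		return fmt.Sprintf("eks:%s/%s", clusterName, namespace)
	case "k8s":
		return fmt.Sprintf("k8s:%s/%s", clusterName, namespace)
	default:
		// Unknown platform: no default is derived in this sketch.
		return ""
	}
}

func main() {
	fmt.Println(defaultEnvironment("ec2", "", "", ""))                   // ec2:default
	fmt.Println(defaultEnvironment("ec2", "web-asg", "", ""))            // ec2:web-asg
	fmt.Println(defaultEnvironment("eks", "", "my-cluster", "payments")) // eks:my-cluster/payments
}
```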

## Entity Transformation

The `transform_entity` configuration allows overriding entity attributes:

```yaml
transform_entity:
  key_attributes:
    - key: "Name"
      value: "override-service-name"
    - key: "Environment"
      value: "production"
  attributes:
    - key: "AWS.ServiceNameSource"
      value: "UserConfiguration"
```

### Supported Key Attributes
- `Name` - Service name
- `Environment` - Deployment environment
- `Type` - Entity type
- `ResourceType` - AWS resource type
- `Identifier` - Resource identifier
- `AwsAccountId` - AWS account ID

### Supported Attributes
- `K8s.Namespace` - Kubernetes namespace
- `K8s.Workload` - Kubernetes workload
- `K8s.Node` - Kubernetes node
- `PlatformType` - Platform type
- `EC2.InstanceId` - EC2 instance ID
- `EC2.AutoScalingGroup` - Auto Scaling Group
- `AWS.ServiceNameSource` - Service name source
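
One way to picture the override behavior is as a simple overwrite of detected values by configured ones. The Go sketch below assumes exactly that. The types `keyValue` and `transformEntity` and the function `applyTransform` are hypothetical and exist only for this illustration; the processor's real types may differ.

```go
package main

import "fmt"

// keyValue mirrors one key/value pair from the transform_entity YAML.
type keyValue struct {
	Key   string
	Value string
}

// transformEntity is a hypothetical in-memory form of transform_entity.
type transformEntity struct {
	KeyAttributes []keyValue
	Attributes    []keyValue
}

// applyTransform overwrites detected entity fields with configured overrides.
// Fields that are not mentioned in the transform are left unchanged.
func applyTransform(keyAttrs, attrs map[string]string, t transformEntity) {
	for _, kv := range t.KeyAttributes {
		keyAttrs[kv.Key] = kv.Value
	}
	for _, kv := range t.Attributes {
		attrs[kv.Key] = kv.Value
	}
}

func main() {
	keyAttrs := map[string]string{"Name": "detected-name", "Environment": "ec2:default"}
	attrs := map[string]string{"AWS.ServiceNameSource": "Instrumentation"}

	applyTransform(keyAttrs, attrs, transformEntity{
		KeyAttributes: []keyValue{{"Name", "override-service-name"}, {"Environment", "production"}},
		Attributes:    []keyValue{{"AWS.ServiceNameSource", "UserConfiguration"}},
	})

	// prints: override-service-name production UserConfiguration
	fmt.Println(keyAttrs["Name"], keyAttrs["Environment"], attrs["AWS.ServiceNameSource"])
}
```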

## Service Name Sources

The processor tracks service name sources with the following priority:

1. **Instrumentation** - From OpenTelemetry SDK instrumentation
2. **UserConfiguration** - From user-provided configuration
3. **K8sWorkload** - From Kubernetes workload name
4. **ClientIamRole** - From EC2 IAM role
5. **Unknown** - When source cannot be determined
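
The priority order above can be read as "take the first candidate that is non-empty". The Go sketch below implements that reading. The function `resolveServiceName` and its parameters are hypothetical, and returning an empty name for the `Unknown` case is an assumption of this example rather than documented processor behavior.

```go
package main

import "fmt"

// resolveServiceName picks the first non-empty candidate in priority order
// and reports which source it came from.
func resolveServiceName(instrumented, configured, workload, iamRole string) (name, source string) {
	candidates := []struct {
		value  string
		source string
	}{
		{instrumented, "Instrumentation"},
		{configured, "UserConfiguration"},
		{workload, "K8sWorkload"},
		{iamRole, "ClientIamRole"},
	}
	for _, c := range candidates {
		if c.value != "" {
			return c.value, c.source
		}
	}
	// No candidate was available (assumption: the name is left empty).
	return "", "Unknown"
}

func main() {
	name, source := resolveServiceName("", "", "checkout-deployment", "app-role")
	fmt.Println(name, source) // checkout-deployment K8sWorkload
}
```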

## Examples

### Basic Service Entity Configuration

```yaml
processors:
  awsentity:
    entity_type: "Service"
    platform: "ec2"
```

### EKS Service Entity Configuration

```yaml
processors:
  awsentity:
    entity_type: "Service"
    platform: "ec2"
    kubernetes_mode: "eks"
    cluster_name: "my-eks-cluster"
```

### Resource Entity Configuration

```yaml
processors:
  awsentity:
    entity_type: "Resource"
    platform: "ec2"
```

### Service Entity with Transformation

```yaml
processors:
  awsentity:
    entity_type: "Service"
    platform: "ec2"
    transform_entity:
      key_attributes:
        - key: "Name"
          value: "my-application"
        - key: "Environment"
          value: "production"
      attributes:
        - key: "AWS.ServiceNameSource"
          value: "UserConfiguration"
```

## Datapoint Attribute Scraping

When `scrape_datapoint_attribute` is enabled, the processor examines individual metric datapoints for entity information. This is useful for components that emit attributes at the datapoint level rather than resource level (e.g., Telegraf plugins).

The processor will:
1. Extract service name and environment from datapoint attributes
2. Remove these attributes from datapoints to avoid duplication
3. Apply the extracted information at the resource level
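
The three steps above can be sketched in Go using plain maps in place of the collector's attribute types. This is a simplified illustration, not the processor's code. The function name `scrapeDatapointAttributes` is hypothetical, and two behaviors are assumptions of the sketch: the datapoint keys are taken to be `service.name` and `deployment.environment`, and an existing resource-level value is not overwritten.

```go
package main

import "fmt"

// entityKeys are the attributes this sketch moves from datapoints to the resource.
var entityKeys = []string{"service.name", "deployment.environment"}

// scrapeDatapointAttributes extracts entity attributes from each datapoint,
// removes them from the datapoint, and applies them at the resource level.
func scrapeDatapointAttributes(resource map[string]string, datapoints []map[string]string) {
	for _, dp := range datapoints {
		for _, key := range entityKeys {
			value, ok := dp[key]
			if !ok {
				continue
			}
			// Step 2: remove from the datapoint to avoid duplication.
			delete(dp, key)
			// Step 3: apply at the resource level (first value wins here).
			if _, exists := resource[key]; !exists {
				resource[key] = value
			}
		}
	}
}

func main() {
	resource := map[string]string{}
	datapoints := []map[string]string{
		{"service.name": "telegraf-app", "deployment.environment": "staging", "cpu": "cpu0"},
	}
	scrapeDatapointAttributes(resource, datapoints)
	fmt.Println(resource["service.name"], resource["deployment.environment"], len(datapoints[0]))
}
```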

## Log Group Association

For Service entities, the processor creates associations between log groups and services when both `aws.log.group.names` and service information are present. This enables correlation between metrics and logs in CloudWatch.

## Validation

The processor validates entity configurations:
- Key attributes must use allowed attribute names
- Attribute values cannot be empty
- Platform-specific required fields must be present

Refer to [config_test.go](./config_test.go) for detailed configuration examples and validation scenarios.
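
As a rough illustration of the first two validation rules, the Go sketch below checks key attributes against the names listed under Supported Key Attributes and rejects empty values. The names `keyValue`, `allowedKeyAttributes`, and `validateKeyAttributes`, as well as the error messages, are hypothetical and written for this example only; the authoritative behavior is whatever `config.go` and `config_test.go` define.

```go
package main

import "fmt"

// keyValue mirrors one key/value pair from the transform_entity YAML.
type keyValue struct {
	Key   string
	Value string
}

// allowedKeyAttributes holds the key attribute names listed in this README.
var allowedKeyAttributes = map[string]bool{
	"Name":         true,
	"Environment":  true,
	"Type":         true,
	"ResourceType": true,
	"Identifier":   true,
	"AwsAccountId": true,
}

// validateKeyAttributes returns an error for the first entry that uses an
// unsupported name or has an empty value.
func validateKeyAttributes(kvs []keyValue) error {
	for _, kv := range kvs {
		if !allowedKeyAttributes[kv.Key] {
			return fmt.Errorf("unsupported key attribute %q", kv.Key)
		}
		if kv.Value == "" {
			return fmt.Errorf("key attribute %q has an empty value", kv.Key)
		}
	}
	return nil
}

func main() {
	fmt.Println(validateKeyAttributes([]keyValue{{"Name", "my-service"}}))
	fmt.Println(validateKeyAttributes([]keyValue{{"Owner", "team-a"}}))
	fmt.Println(validateKeyAttributes([]keyValue{{"Environment", ""}}))
}
```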