@mxiamxia mxiamxia commented Jun 5, 2025

This PR introduces the core structure for the CloudWatch EMF exporter for sending custom metrics. It lets customers send their own custom OTel metrics to the CloudWatch metrics backend without installing a Collector or Agent.

This PR includes:

  • Basic EMF log structure creation
  • Gauge metric conversion and export
  • Unit mapping from OTel to CloudWatch units
  • Metric grouping by attributes and timestamps
  • Supports DELTA temporality for CloudWatch compatibility
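
The unit mapping item above could look roughly like the following sketch. The table contents and the `to_cloudwatch_unit` name are assumptions for illustration, not the exporter's actual table: OTel instruments usually carry UCUM-style units such as `ms` or `By`, while CloudWatch expects names such as `Milliseconds` or `Bytes`.

```python
from typing import Optional

# Hypothetical OTel (UCUM-style) -> CloudWatch unit table; the real exporter's
# table may contain different entries.
UNIT_MAPPING = {
    "s": "Seconds",
    "ms": "Milliseconds",
    "us": "Microseconds",
    "By": "Bytes",
    "bit": "Bits",
    "%": "Percent",
}

# A subset of the unit names CloudWatch accepts, used to pass through units
# that are already in CloudWatch form.
CLOUDWATCH_UNITS = {
    "Seconds", "Milliseconds", "Microseconds",
    "Bytes", "Kilobytes", "Megabytes", "Gigabytes",
    "Bits", "Percent", "Count",
}


def to_cloudwatch_unit(otel_unit: Optional[str]) -> Optional[str]:
    """Return a CloudWatch unit name, or None when no mapping is known."""
    if not otel_unit:
        return None
    if otel_unit in CLOUDWATCH_UNITS:
        return otel_unit
    return UNIT_MAPPING.get(otel_unit)
```

Returning `None` for an unknown unit lets the caller omit the `Unit` field from the metric definition rather than send a value CloudWatch might reject.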

Future PRs will add:

  • Support for Sum, Histogram, and ExponentialHistogram metrics
  • Advanced batching and CloudWatch Logs constraints
  • Enhanced error handling and validation

Testing:

  • Comprehensive unit tests for core functionality
  • Mock-based testing to avoid AWS dependencies
  • Tests for initialization, conversion, and basic export flow
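
from opentelemetry import metrics

# Obtain a meter from the globally configured MeterProvider.
meter = metrics.get_meter(__name__)

# Create a synchronous gauge and record one data point with two attributes.
cpu_usage_gauge = meter.create_gauge(
    name="system_cpu_usage_percent",
    description="Current CPU usage percentage",
    unit="percent",
)
cpu_usage_gauge.set(0.2, {"host": "server-01", "region": "us-east-1"})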
{
    "_aws": {
        "Timestamp": 1749774096539,
        "CloudWatchMetrics": [
            {
                "Namespace": "MyApplication1",
                "Dimensions": [
                    [
                        "host",
                        "region"
                    ]
                ],
                "Metrics": [
                    {
                        "Name": "system_cpu_usage_percent",
                        "Unit": "percent"
                    }
                ]
            }
        ]
    },
    "Version": "1",
    "otel.resource.telemetry.sdk.language": "python",
    "otel.resource.telemetry.sdk.name": "opentelemetry",
    "otel.resource.telemetry.sdk.version": "1.27.0",
    "otel.resource.service.name": "my-service",
    "otel.resource.service.version": "0.1.0",
    "otel.resource.deployment.environment": "production",
    "system_cpu_usage_percent": 0.2,
    "host": "server-01",
    "region": "us-east-1"
}
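
To make the shape of the log above easier to follow, here is a minimal sketch of how such an EMF document might be assembled from one group of data points. The `create_emf_log` function and its parameters are hypothetical and are not taken from the exporter's code; the field layout simply mirrors the example above.

```python
import json


def create_emf_log(namespace, timestamp_ms, attributes, metrics, resource_attributes=None):
    """Build one EMF log dict.

    metrics is a list of (name, unit_or_None, value) tuples that share the
    same attributes and timestamp.
    """
    log = {"_aws": {"Timestamp": timestamp_ms, "CloudWatchMetrics": []}, "Version": "1"}

    # Resource attributes are flattened under an "otel.resource." prefix.
    for key, value in (resource_attributes or {}).items():
        log[f"otel.resource.{key}"] = str(value)

    # Each metric gets a definition in the directive and a top-level value.
    definitions = []
    for name, unit, value in metrics:
        definition = {"Name": name}
        if unit:
            definition["Unit"] = unit
        definitions.append(definition)
        log[name] = value

    # Data point attributes become top-level fields and one dimension set.
    for key, value in attributes.items():
        log[key] = str(value)

    directive = {"Namespace": namespace, "Metrics": definitions}
    if attributes:
        directive["Dimensions"] = [list(attributes.keys())]
    log["_aws"]["CloudWatchMetrics"].append(directive)
    return log


example = create_emf_log(
    "MyApplication1",
    1749774096539,
    {"host": "server-01", "region": "us-east-1"},
    [("system_cpu_usage_percent", "percent", 0.2)],
    {"service.name": "my-service"},
)
print(json.dumps(example, indent=4))
```

In this sketch the `Dimensions` key is left out when a data point has no attributes, so the metric is published without dimensions.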

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@mxiamxia mxiamxia requested a review from a team as a code owner June 5, 2025 04:12
@mxiamxia mxiamxia force-pushed the pr1-emf-exporter-foundation branch 2 times, most recently from bc2ce38 to 8491507 Compare June 5, 2025 04:19
@mxiamxia mxiamxia force-pushed the pr1-emf-exporter-foundation branch 12 times, most recently from 22ab59b to 5b6d407 Compare June 10, 2025 23:40
This PR introduces the core structure for the CloudWatch EMF (Embedded Metric Format)
exporter for OpenTelemetry Python. This foundation includes:

Core Features:
- CloudWatchEMFExporter class with full initialization
- Basic EMF log structure creation
- Gauge metric conversion and export
- Unit mapping from OTel to CloudWatch units
- CloudWatch Logs integration with log group/stream management
- Metric grouping by attributes and timestamps
- Basic error handling and logging
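
A rough sketch of the grouping step listed above, assuming each data point is a plain dict with hypothetical `name`, `value`, `attributes`, and `timestamp_ms` keys (the exporter's real record type is not shown in this PR description). Data points that share both the attribute set and the timestamp land in the same group, so they can be written as one EMF log:

```python
from collections import defaultdict


def group_records(records):
    """Group data points by (sorted attribute items, timestamp)."""
    groups = defaultdict(list)
    for record in records:
        # Sorting makes the key independent of attribute insertion order.
        attrs_key = tuple(sorted(record["attributes"].items()))
        groups[(attrs_key, record["timestamp_ms"])].append(record)
    return dict(groups)


records = [
    {"name": "cpu", "value": 0.2, "attributes": {"host": "a", "region": "x"}, "timestamp_ms": 1000},
    {"name": "mem", "value": 0.5, "attributes": {"region": "x", "host": "a"}, "timestamp_ms": 1000},
    {"name": "cpu", "value": 0.3, "attributes": {"host": "b", "region": "x"}, "timestamp_ms": 1000},
]
grouped = group_records(records)
```

Here the first two records fall into one group even though their attribute dicts were built in a different order, and the third forms its own group.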

Architecture:
- Follows OTel MetricExporter interface
- Uses boto3 for CloudWatch Logs integration
- Implements proper resource attribute handling
- Supports DELTA temporality for CloudWatch compatibility

Testing:
- Comprehensive unit tests for core functionality
- Mock-based testing to avoid AWS dependencies
- Tests for initialization, conversion, and basic export flow

Future PRs will add:
- Support for Sum, Histogram, and ExponentialHistogram metrics
- Advanced batching and CloudWatch Logs constraints
- Enhanced error handling and validation
@mxiamxia mxiamxia force-pushed the pr1-emf-exporter-foundation branch from 5b6d407 to 1b34c6f Compare June 11, 2025 00:00
@thpierce thpierce left a comment

Did not review _create_emf_log, _send_log_event, export, or unit tests. Will try to take another crack at this this week.

@mxiamxia mxiamxia force-pushed the pr1-emf-exporter-foundation branch 3 times, most recently from f44f3be to 34a406c Compare June 13, 2025 00:44
@mxiamxia mxiamxia force-pushed the pr1-emf-exporter-foundation branch from 34a406c to 7de65e4 Compare June 13, 2025 00:50
@mxiamxia mxiamxia force-pushed the pr1-emf-exporter-foundation branch from 195e95a to a0da19f Compare June 14, 2025 00:45
@srprash srprash left a comment

Did not look at the unit tests. Will do so in the next pass.

srprash commented Jun 19, 2025

LGTM. Thanks for addressing the comments. :)

@mxiamxia mxiamxia merged commit f7cd7b6 into aws-observability:main Jun 19, 2025
13 checks passed
jj22ee added a commit to aws-observability/aws-otel-js-instrumentation that referenced this pull request Jul 1, 2025

*Issue #, if available:*
JS equivalent of:
- aws-observability/aws-otel-python-instrumentation#382
- aws-observability/aws-otel-python-instrumentation#409
- aws-observability/aws-otel-python-instrumentation#410

*Description of changes:*


By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.
liustve added a commit to aws-observability/aws-otel-java-instrumentation that referenced this pull request Dec 5, 2025
*Description of changes:*
Java version of these PRs:
- aws-observability/aws-otel-python-instrumentation#382
- aws-observability/aws-otel-python-instrumentation#409
- aws-observability/aws-otel-python-instrumentation#410
- aws-observability/aws-otel-python-instrumentation#434

This PR introduces the complete CloudWatch EMF (Embedded Metric Format)
exporter implementation for sending OpenTelemetry metrics directly to
CloudWatch without requiring a Collector or Agent.

In order to enable this exporter, users MUST set the following
environment variables:

- `OTEL_METRICS_EXPORTER=awsemf`
- `OTEL_EXPORTER_OTLP_LOGS_HEADERS=x-aws-log-group=<log-group-name>,x-aws-log-stream=<log-stream-name>,x-aws-metric-namespace=<namespace>`
- `AWS_REGION=<region>` OR `AWS_DEFAULT_REGION=<region>`
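
The enablement rule above can be sketched as a small check. This is written in Python for illustration (the commit itself targets Java), and `parse_emf_headers` and `is_emf_exporter_enabled` are hypothetical names, not functions from either code base. The sketch assumes all three headers are required, as the description states.

```python
REQUIRED_HEADERS = ("x-aws-log-group", "x-aws-log-stream", "x-aws-metric-namespace")


def parse_emf_headers(raw):
    """Parse 'k1=v1,k2=v2' into a dict, skipping malformed pairs."""
    headers = {}
    for pair in raw.split(","):
        key, sep, value = pair.strip().partition("=")
        if sep and key.strip():
            headers[key.strip()] = value.strip()
    return headers


def is_emf_exporter_enabled(env):
    """Return True only when every required setting is present in env."""
    if env.get("OTEL_METRICS_EXPORTER") != "awsemf":
        return False
    if not (env.get("AWS_REGION") or env.get("AWS_DEFAULT_REGION")):
        return False
    headers = parse_emf_headers(env.get("OTEL_EXPORTER_OTLP_LOGS_HEADERS", ""))
    return all(headers.get(name) for name in REQUIRED_HEADERS)
```

Passing the environment in as a mapping (for example `os.environ`) keeps the check easy to exercise with parameterized tests for valid and invalid configurations.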

This PR includes:

- EMF MetricRecord translation for a unified representation of all
OTel metric types, and EMF log creation with unit mapping from
OpenTelemetry units to CloudWatch-compatible units
- Automatic log group and stream creation with retry logic
- CloudWatch Logs integration with batching and constraint
handling (256KB event limit, 1MB request limit, timestamp limits)
- Support for metric grouping by attributes and timestamps for EMF log
generation
- Support for Gauge, Sum, Histogram, and ExponentialHistogram metric
types
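
As an illustration of the size constraints named in the list above, the following Python sketch splits already-serialized EMF logs into batches. It uses the 256KB event limit and 1MB request limit from the description and the 26 bytes of per-event overhead that CloudWatch Logs counts toward the batch size. The `batch_events` name and the choice to drop oversized events are assumptions, and timestamp limits are left out for brevity.

```python
MAX_EVENT_BYTES = 256 * 1024      # per-event limit named in this PR
MAX_BATCH_BYTES = 1024 * 1024     # per-request limit named in this PR
PER_EVENT_OVERHEAD = 26           # bytes CloudWatch Logs adds per event


def batch_events(messages):
    """Split serialized log messages into batches that respect both limits."""
    batches = []
    current, current_size = [], 0
    for message in messages:
        size = len(message.encode("utf-8")) + PER_EVENT_OVERHEAD
        if size > MAX_EVENT_BYTES:
            # Assumption: skip oversized events; a real exporter might
            # truncate them or log a warning instead.
            continue
        if current and current_size + size > MAX_BATCH_BYTES:
            batches.append(current)
            current, current_size = [], 0
        current.append(message)
        current_size += size
    if current:
        batches.append(current)
    return batches
```

Sizes are measured on the UTF-8 encoding rather than the string length, since the limits apply to bytes.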

**TODO**:
- In the next PR, integrate the Console EMF exporter into AWS
Lambda environments to validate EMF log formatting and ensure consistent
behavior in the Lambda runtime:
aws-observability/aws-otel-python-instrumentation#437

**Testing**:
- Added unit tests to validate EMF exporter configuration scenarios,
including parameterized tests for both valid configurations (supporting
AWS_REGION and AWS_DEFAULT_REGION) and invalid configurations (missing
headers, wrong exporter type, missing region). The tests ensure the EMF
exporter is correctly enabled only when all required environment
variables are properly configured.

- Manual end to end testing with the following environment variables to
ensure the EMF logs show up:
   - `AWS_REGION=us-east-1`
   - `OTEL_METRICS_EXPORTER=awsemf`
   - `OTEL_EXPORTER_OTLP_LOGS_HEADERS=x-aws-log-group=test,x-aws-log-stream=default,x-aws-metric-namespace=testNamespace`
   - `OTEL_RESOURCE_ATTRIBUTES=service.name=testService,aws.log.group.names=test,cloud.resource_id=agent-12345`
   - `OTEL_LOGS_EXPORTER=none`
   - `OTEL_TRACES_EXPORTER=none`

Example EMF log emitted:

```
{
    "otel.resource.process.command_args": "[/Library/Java/JavaVirtualMachines/amazon-corretto-21.jdk/Contents/Home/bin/java, -javaagent:/Users/liustve/aws-otel-java-instrumentation/otelagent/build/libs/aws-opentelemetry-agent-2.18.0-SNAPSHOT.jar, -Dfile.encoding=UTF-8, -Dsun.stdout.encoding=UTF-8, -Dsun.stderr.encoding=UTF-8, -jar, /Users/liustve/aws-otel-java-instrumentation/sample-apps/springboot/build/libs/springboot-2.11.0-SNAPSHOT.jar]",
    "otel.resource.host.arch": "aarch64",
    "otel.resource.host.name": "7cf34dd812df",
    "otel.resource.service.instance.id": "a0399d3c-b856-43ae-b374-fe66dee41ce8",
    "otel.resource.aws.log.group.names": "test",
    "jvm.class.unloaded": 1,
    "otel.resource.service.name": "testSErvice",
    "_aws": {
        "CloudWatchMetrics": [
            {
                "Metrics": [
                    {
                        "Unit": "Count",
                        "Name": "jvm.cpu.count"
                    },
                    {
                        "Unit": "Count",
                        "Name": "jvm.class.loaded"
                    },
                    {
                        "Name": "jvm.cpu.recent_utilization"
                    },
                    {
                        "Unit": "Seconds",
                        "Name": "jvm.cpu.time"
                    },
                    {
                        "Unit": "Count",
                        "Name": "jvm.class.count"
                    },
                    {
                        "Unit": "Count",
                        "Name": "jvm.class.unloaded"
                    }
                ],
                "Namespace": "testNamespace"
            }
        ],
        "Timestamp": 1758761023497
    },
    "otel.resource.cloud.resource_id": "agent-12345",
    "jvm.class.count": 14264,
    "Version": "1",
    "otel.resource.process.pid": "46822",
    "otel.resource.os.description": "Mac OS X 15.6.1",
    "otel.resource.telemetry.distro.name": "opentelemetry-java-instrumentation",
    "otel.resource.os.type": "darwin",
    "otel.resource.telemetry.sdk.name": "opentelemetry",
    "otel.resource.telemetry.distro.version": "2.18.0-aws-SNAPSHOT",
    "otel.resource.process.runtime.description": "Amazon.com Inc. OpenJDK 64-Bit Server VM 21.0.8+9-LTS",
    "otel.resource.process.runtime.version": "21.0.8+9-LTS",
    "jvm.cpu.recent_utilization": 0,
    "otel.resource.process.executable.path": "/Library/Java/JavaVirtualMachines/amazon-corretto-21.jdk/Contents/Home/bin/java",
    "otel.resource.telemetry.sdk.version": "1.52.0",
    "jvm.cpu.count": 12,
    "jvm.class.loaded": 14265,
    "otel.resource.process.runtime.name": "OpenJDK Runtime Environment",
    "otel.resource.telemetry.sdk.language": "java",
    "jvm.cpu.time": 7.756965
}
```
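
When checking an emitted log like the one above by hand, a small consistency check can help. The Python sketch below is a hypothetical helper, not part of the exporter: it verifies that every metric and dimension declared under `_aws.CloudWatchMetrics` also appears as a top-level field.

```python
def find_emf_problems(emf_log):
    """Return a list of human-readable problems found in one EMF log dict."""
    problems = []
    aws = emf_log.get("_aws", {})
    if not isinstance(aws.get("Timestamp"), int):
        problems.append("missing or non-integer _aws.Timestamp")
    for directive in aws.get("CloudWatchMetrics", []):
        if not directive.get("Namespace"):
            problems.append("directive without Namespace")
        # Every declared metric needs a matching top-level value.
        for metric in directive.get("Metrics", []):
            name = metric.get("Name")
            if name not in emf_log:
                problems.append(f"metric {name!r} declared but has no value")
        # Every declared dimension key needs a matching top-level field.
        for dimension_set in directive.get("Dimensions", []):
            for key in dimension_set:
                if key not in emf_log:
                    problems.append(f"dimension {key!r} declared but has no value")
    return problems
```

Loading a log line with `json.loads` and passing the result to this function returns an empty list when the declared metrics and dimensions all have values.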

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
4 participants