Commit 80be26b

Handle issues introduced by OTLP gRPC protocol (#170)
In this commit, we are handling issues that arise from gRPC. Essentially, if we build gRPC artifacts into our Docker image, it causes the Docker image to only be compatible with applications built using the same Python version. To solve this, we are doing two things: 1) we are removing gRPC artifacts from the docker image and 2) we are changing the default OTLP protocol to be HTTP. If customers attempt to set the protocol as gRPC for ApplicationSignals, we will set the default endpoint correctly.

Also we are changing Docker image to build with Python 3.11, which is what we were originally doing when we encountered this issue (reference: 5b3ed74). This is what the upstream does (see [autoinstrumentation/python/Dockerfile](https://github.com/open-telemetry/opentelemetry-operator/blob/b5bb0ae34720d4be2d229dafecb87b61b37699b0/autoinstrumentation/python/Dockerfile)), and having parity here is beneficial to us.

Testing:

* Create `app.py`:
```
from time import sleep
import boto3
try:
    boto3.client('s3').list_buckets()
except Exception:
    sleep(100)
```
* Run `./scripts/build_and_install_distro.sh`
* Run:
```
export OTEL_PYTHON_DISTRO="aws_distro"
export OTEL_PYTHON_CONFIGURATOR="aws_configurator"
export OTEL_METRICS_EXPORTER="none"
unset OTEL_EXPORTER_OTLP_PROTOCOL
unset OTEL_AWS_APPLICATION_SIGNALS_ENABLED
```
* Run `opentelemetry-instrument python ./app.py`
```
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=4318): Max retries exceeded with url: /v1/traces (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fc647cd5e50>: Failed to establish a new connection: [Errno 111] Connection refused'))
```
* Run `export OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf; opentelemetry-instrument python ./app.py`
```
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=4318): Max retries exceeded with url: /v1/traces (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f20ac96e490>: Failed to establish a new connection: [Errno 111] Connection refused'))
```
* Run `export OTEL_EXPORTER_OTLP_PROTOCOL=grpc; opentelemetry-instrument python ./app.py`
```
Transient error StatusCode.UNAVAILABLE encountered while exporting traces to localhost:4317, retrying in 1s.
```
* Run
```
unset OTEL_EXPORTER_OTLP_PROTOCOL
export OTEL_METRIC_EXPORT_INTERVAL=1000
export OTEL_AWS_APPLICATION_SIGNALS_ENABLED=True
```
* Run `opentelemetry-instrument python ./app.py`
```
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=4316): Max retries exceeded with url: /v1/metrics (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f8f6c5aa5b0>: Failed to establish a new connection: [Errno 111] Connection refused'))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=4318): Max retries exceeded with url: /v1/traces (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f8f6dd9c9d0>: Failed to establish a new connection: [Errno 111] Connection refused'))
```
* Run `export OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf; opentelemetry-instrument python ./app.py`
```
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=4318): Max retries exceeded with url: /v1/traces (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff5868efdc0>: Failed to establish a new connection: [Errno 111] Connection refused'))
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=4318): Max retries exceeded with url: /v1/traces (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff5868efdc0>: Failed to establish a new connection: [Errno 111] Connection refused'))
```
* Run `export OTEL_EXPORTER_OTLP_PROTOCOL=grpc; opentelemetry-instrument python ./app.py`
```
Transient error StatusCode.UNAVAILABLE encountered while exporting metrics to localhost:4315, retrying in 1s.
Transient error StatusCode.UNAVAILABLE encountered while exporting traces to localhost:4317, retrying in 1s.
```

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
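
The protocol and endpoint defaults exercised by the test runs above can be summarized in a small sketch. This is an illustration only, not the code in this commit: the function name `resolve_application_signals_target` is hypothetical, the deprecated endpoint variable that the configurator also consults is left out, and raising `ValueError` for an unrecognized protocol is an assumption.

```python
import os
from typing import Mapping, Tuple

# Default Application Signals metric endpoints per protocol, as seen in the
# test output above (port 4316 for HTTP, port 4315 for gRPC).
_DEFAULT_ENDPOINTS = {
    "http/protobuf": "http://localhost:4316/v1/metrics",
    "grpc": "localhost:4315",
}


def resolve_application_signals_target(environ: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (protocol, endpoint) for the Application Signals metric exporter.

    Hypothetical helper: the metrics-specific protocol variable wins over the
    general one, and HTTP is the fallback when neither is set.
    """
    protocol = environ.get(
        "OTEL_EXPORTER_OTLP_METRICS_PROTOCOL",
        environ.get("OTEL_EXPORTER_OTLP_PROTOCOL", "http/protobuf"),
    )
    if protocol not in _DEFAULT_ENDPOINTS:
        # Assumption: this sketch rejects unknown protocols outright.
        raise ValueError(f"Unsupported OTLP protocol: {protocol}")
    endpoint = environ.get(
        "OTEL_AWS_APPLICATION_SIGNALS_EXPORTER_ENDPOINT",
        _DEFAULT_ENDPOINTS[protocol],
    )
    return protocol, endpoint
```

Passing a plain dictionary instead of `os.environ` makes it easy to check each combination from the test matrix without touching the process environment.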
1 parent 1219b5d commit 80be26b

5 files changed: +110 -28 lines changed

Dockerfile

Lines changed: 9 additions & 7 deletions
@@ -3,20 +3,22 @@
 # The packages are installed in the `/autoinstrumentation` directory. This is required as when instrumenting the pod by CWOperator,
 # one init container will be created to copy all the content in `/autoinstrumentation` directory to app's container. Then
 # update the `PYTHONPATH` environment variable accordingly. Then in the second stage, copy the directory to `/autoinstrumentation`.
-
-# Using Python 3.10 because we are utilizing the opentelemetry-exporter-otlp-proto-grpc exporter,
-# which relies on grpcio as a dependency. grpcio has strict dependencies on the OS and Python version.
-# Also mentioned in Docker build template in the upstream repository:
-# https://github.com/open-telemetry/opentelemetry-operator/blob/b5bb0ae34720d4be2d229dafecb87b61b37699b0/autoinstrumentation/python/requirements.txt#L2
-# For further details, please refer to: https://github.com/MicrosoftDocs/azure-docs/blob/main/articles/azure-functions/recover-python-functions.md#the-python-interpre[…]tions-python-worker
-FROM python:3.10 AS build
+FROM python:3.11 AS build
 
 WORKDIR /operator-build
 
 ADD aws-opentelemetry-distro/ ./aws-opentelemetry-distro/
 
 RUN mkdir workspace && pip install --target workspace ./aws-opentelemetry-distro
 
+# Remove opentelemetry-exporter-otlp-proto-grpc and grpcio, as grpcio has strict dependencies on the Python version and
+# will cause confusing failures if gRPC protocol is used. Now if gRPC protocol is requested by the user, instrumentation
+# will complain that grpc is not installed, which is more understandable. References:
+# * https://github.com/open-telemetry/opentelemetry-operator/blob/b5bb0ae34720d4be2d229dafecb87b61b37699b0/autoinstrumentation/python/requirements.txt#L2
+# * https://github.com/MicrosoftDocs/azure-docs/blob/main/articles/azure-functions/recover-python-functions.md#troubleshoot-cannot-import-cygrpc
+RUN pip uninstall opentelemetry-exporter-otlp-proto-grpc -y
+RUN pip uninstall grpcio -y
+
 FROM public.ecr.aws/amazonlinux/amazonlinux:minimal
 
 # Required to copy attribute files to distributed docker images
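
Because the build stage installs the distro with `pip install --target workspace`, it may be worth confirming that `grpc` is really absent from the directory that ends up in `/autoinstrumentation`. The following is a minimal sketch of such a check, not part of this commit; the helper name `is_bundled` is hypothetical, and it only handles top-level module names such as `grpc`.

```python
from importlib.machinery import PathFinder


def is_bundled(target_dir: str, top_level_module: str) -> bool:
    """Report whether `top_level_module` can be found inside `target_dir`.

    Only `target_dir` is searched, so packages installed elsewhere in the
    interpreter's environment do not affect the answer.
    """
    return PathFinder.find_spec(top_level_module, [target_dir]) is not None
```

For example, `is_bundled("workspace", "grpc")` could be run at the end of the build stage, failing the build if it returns `True`.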

aws-opentelemetry-distro/src/amazon/opentelemetry/distro/aws_opentelemetry_configurator.py

Lines changed: 18 additions & 9 deletions
@@ -17,7 +17,6 @@
 )
 from amazon.opentelemetry.distro.aws_span_metrics_processor_builder import AwsSpanMetricsProcessorBuilder
 from amazon.opentelemetry.distro.sampler.aws_xray_remote_sampler import AwsXRayRemoteSampler
-from opentelemetry.exporter.otlp.proto.grpc.metric_exporter import OTLPMetricExporter as OTLPGrpcOTLPMetricExporter
 from opentelemetry.exporter.otlp.proto.http.metric_exporter import OTLPMetricExporter as OTLPHttpOTLPMetricExporter
 from opentelemetry.sdk._configuration import (
     _get_exporter_names,
@@ -274,17 +273,10 @@ def __new__(cls, *args, **kwargs):
     # pylint: disable=no-self-use
     def create_exporter(self):
         protocol = os.environ.get(
-            OTEL_EXPORTER_OTLP_METRICS_PROTOCOL, os.environ.get(OTEL_EXPORTER_OTLP_PROTOCOL, "grpc")
+            OTEL_EXPORTER_OTLP_METRICS_PROTOCOL, os.environ.get(OTEL_EXPORTER_OTLP_PROTOCOL, "http/protobuf")
         )
         _logger.debug("AWS Application Signals export protocol: %s", protocol)
 
-        application_signals_endpoint = os.environ.get(
-            APPLICATION_SIGNALS_EXPORTER_ENDPOINT_CONFIG,
-            os.environ.get(APP_SIGNALS_EXPORTER_ENDPOINT_CONFIG, "http://localhost:4315"),
-        )
-
-        _logger.debug("AWS Application Signals export endpoint: %s", application_signals_endpoint)
-
         temporality_dict: Dict[type, AggregationTemporality] = {}
         for typ in [
             Counter,
@@ -298,10 +290,27 @@ def create_exporter(self):
             temporality_dict[typ] = AggregationTemporality.DELTA
 
         if protocol == "http/protobuf":
+            application_signals_endpoint = os.environ.get(
+                APPLICATION_SIGNALS_EXPORTER_ENDPOINT_CONFIG,
+                os.environ.get(APP_SIGNALS_EXPORTER_ENDPOINT_CONFIG, "http://localhost:4316/v1/metrics"),
+            )
+            _logger.debug("AWS Application Signals export endpoint: %s", application_signals_endpoint)
             return OTLPHttpOTLPMetricExporter(
                 endpoint=application_signals_endpoint, preferred_temporality=temporality_dict
             )
         if protocol == "grpc":
+            # pylint: disable=import-outside-toplevel
+            # Delay import to only occur if gRPC specifically requested. Vended Docker image will not have gRPC bundled,
+            # so importing it at the class level can cause runtime failures.
+            from opentelemetry.exporter.otlp.proto.grpc.metric_exporter import (
+                OTLPMetricExporter as OTLPGrpcOTLPMetricExporter,
+            )
+
+            application_signals_endpoint = os.environ.get(
+                APPLICATION_SIGNALS_EXPORTER_ENDPOINT_CONFIG,
+                os.environ.get(APP_SIGNALS_EXPORTER_ENDPOINT_CONFIG, "localhost:4315"),
+            )
+            _logger.debug("AWS Application Signals export endpoint: %s", application_signals_endpoint)
             return OTLPGrpcOTLPMetricExporter(
                 endpoint=application_signals_endpoint, preferred_temporality=temporality_dict
             )
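
The delayed import in the gRPC branch means a missing `grpc` package only surfaces when gRPC is actually requested. As an optional variant, not what this commit does, the import could be wrapped so the failure points at the configuration instead of a bare `ImportError`. The helper name `load_exporter_module` and the wording of the error message are assumptions made for this sketch.

```python
import importlib
from types import ModuleType


def load_exporter_module(module_name: str) -> ModuleType:
    """Import an optional exporter module on demand.

    Hypothetical helper: the import happens only when called, and a missing
    package is re-raised with a hint about the protocol setting.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise RuntimeError(
            f"Could not import {module_name!r}. If gRPC is not installed, "
            "consider OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf."
        ) from exc
```

A caller in the gRPC branch might use `load_exporter_module("opentelemetry.exporter.otlp.proto.grpc.metric_exporter").OTLPMetricExporter`, at the cost of losing static import checking for that name.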

aws-opentelemetry-distro/src/amazon/opentelemetry/distro/aws_opentelemetry_distro.py

Lines changed: 24 additions & 5 deletions
@@ -5,12 +5,29 @@
 from amazon.opentelemetry.distro.patches._instrumentation_patch import apply_instrumentation_patches
 from opentelemetry.distro import OpenTelemetryDistro
 from opentelemetry.environment_variables import OTEL_PROPAGATORS, OTEL_PYTHON_ID_GENERATOR
-from opentelemetry.sdk.environment_variables import OTEL_EXPORTER_OTLP_METRICS_DEFAULT_HISTOGRAM_AGGREGATION
+from opentelemetry.sdk.environment_variables import (
+    OTEL_EXPORTER_OTLP_METRICS_DEFAULT_HISTOGRAM_AGGREGATION,
+    OTEL_EXPORTER_OTLP_PROTOCOL,
+)
 
 
 class AwsOpenTelemetryDistro(OpenTelemetryDistro):
     def _configure(self, **kwargs):
-        """
+        """Sets up default environment variables and apply patches
+
+        Set default OTEL_EXPORTER_OTLP_PROTOCOL to be HTTP. This must be run before super(), which attempts to set the
+        default to gRPC. If we run afterwards, we don't know if the default was set by base OpenTelemetryDistro or if it
+        was set by the user. We are setting to HTTP as gRPC does not work out of the box for the vended docker image,
+        due to gRPC having a strict dependency on the Python version the artifact was built for (OTEL observed this:
+        https://github.com/open-telemetry/opentelemetry-operator/blob/461ba68e80e8ac6bf2603eb353547cd026119ed2/autoinstrumentation/python/requirements.txt#L2-L3)
+
+        Also sets default OTEL_PROPAGATORS, OTEL_PYTHON_ID_GENERATOR, and
+        OTEL_EXPORTER_OTLP_METRICS_DEFAULT_HISTOGRAM_AGGREGATION to ensure good compatibility with X-Ray and Application
+        Signals.
+
+        Also applies patches to upstream instrumentation - usually these are stopgap measures until we can contribute
+        long-term changes to upstream.
+
         kwargs:
             apply_patches: bool - apply patches to upstream instrumentation. Default is True.
 
@@ -19,13 +36,15 @@ def _configure(self, **kwargs):
         OTEL_EXPORTER_OTLP_METRICS_DEFAULT_HISTOGRAM_AGGREGATION environment variable. Need to work with upstream to
         make it to be configurable.
         """
+        os.environ.setdefault(OTEL_EXPORTER_OTLP_PROTOCOL, "http/protobuf")
+
         super(AwsOpenTelemetryDistro, self)._configure()
+
+        os.environ.setdefault(OTEL_PROPAGATORS, "xray,tracecontext,b3,b3multi")
+        os.environ.setdefault(OTEL_PYTHON_ID_GENERATOR, "xray")
         os.environ.setdefault(
             OTEL_EXPORTER_OTLP_METRICS_DEFAULT_HISTOGRAM_AGGREGATION, "base2_exponential_bucket_histogram"
         )
-        os.environ.setdefault(OTEL_PROPAGATORS, "xray,tracecontext,b3,b3multi")
-        os.environ.setdefault(OTEL_PYTHON_ID_GENERATOR, "xray")
 
-        # Apply patches to upstream instrumentation - usually stopgap measures until we can contribute long-term changes
         if kwargs.get("apply_patches", True):
             apply_instrumentation_patches()
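
The docstring's point about ordering can be shown with a toy model. The classes below are stand-ins written for this sketch, not the real distro classes; `BaseDistro` assumes, as the docstring states, that the upstream distro defaults the protocol to gRPC via `setdefault`.

```python
PROTOCOL_KEY = "OTEL_EXPORTER_OTLP_PROTOCOL"


class BaseDistro:
    """Stand-in for the upstream distro, assumed to default to gRPC."""

    def configure(self, environ: dict) -> None:
        environ.setdefault(PROTOCOL_KEY, "grpc")


class DefaultBeforeSuper(BaseDistro):
    """Mirrors the ordering chosen in this commit."""

    def configure(self, environ: dict) -> None:
        environ.setdefault(PROTOCOL_KEY, "http/protobuf")
        super().configure(environ)  # no effect on the key: it is already set


class DefaultAfterSuper(BaseDistro):
    """The ordering the docstring warns against."""

    def configure(self, environ: dict) -> None:
        super().configure(environ)  # sets "grpc" first
        environ.setdefault(PROTOCOL_KEY, "http/protobuf")  # too late to apply
```

In this model `DefaultBeforeSuper` yields `http/protobuf` on an empty environment while `DefaultAfterSuper` yields `grpc`. Both leave a value set by the user untouched, which is why `setdefault` is used rather than plain assignment.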

aws-opentelemetry-distro/tests/amazon/opentelemetry/distro/test_aws_opentelementry_configurator.py

Lines changed: 58 additions & 7 deletions
@@ -9,6 +9,7 @@
 from amazon.opentelemetry.distro.attribute_propagating_span_processor import AttributePropagatingSpanProcessor
 from amazon.opentelemetry.distro.aws_metric_attributes_span_exporter import AwsMetricAttributesSpanExporter
 from amazon.opentelemetry.distro.aws_opentelemetry_configurator import (
+    ApplicationSignalsExporterProvider,
     AwsOpenTelemetryConfigurator,
     _custom_import_sampler,
     _customize_exporter,
@@ -21,6 +22,9 @@
 from amazon.opentelemetry.distro.sampler._aws_xray_sampling_client import _AwsXRaySamplingClient
 from amazon.opentelemetry.distro.sampler.aws_xray_remote_sampler import AwsXRayRemoteSampler
 from opentelemetry.environment_variables import OTEL_LOGS_EXPORTER, OTEL_METRICS_EXPORTER, OTEL_TRACES_EXPORTER
+from opentelemetry.exporter.otlp.proto.common._internal.metrics_encoder import OTLPMetricExporterMixin
+from opentelemetry.exporter.otlp.proto.grpc.metric_exporter import OTLPMetricExporter as OTLPGrpcOTLPMetricExporter
+from opentelemetry.exporter.otlp.proto.http.metric_exporter import OTLPMetricExporter as OTLPHttpOTLPMetricExporter
 from opentelemetry.sdk.environment_variables import OTEL_TRACES_SAMPLER, OTEL_TRACES_SAMPLER_ARG
 from opentelemetry.sdk.resources import Resource
 from opentelemetry.sdk.trace import Span, SpanProcessor, Tracer, TracerProvider
@@ -29,18 +33,28 @@
 from opentelemetry.trace import get_tracer_provider
 
 
-# This class setup Tracer Provider Globally, which can only set once
-# if there is another setup for tracer provider, may cause issue
 class TestAwsOpenTelemetryConfigurator(TestCase):
+    """Tests AwsOpenTelemetryConfigurator and AwsOpenTelemetryDistro
+
+    NOTE: This class setup Tracer Provider Globally, which can only be set once. If there is another setup for tracer
+    provider, it may cause issues for those tests.
+    """
+
     @classmethod
     def setUpClass(cls):
-        os.environ.setdefault(OTEL_TRACES_EXPORTER, "none")
-        os.environ.setdefault(OTEL_METRICS_EXPORTER, "none")
-        os.environ.setdefault(OTEL_LOGS_EXPORTER, "none")
-        os.environ.setdefault(OTEL_TRACES_SAMPLER, "traceidratio")
-        os.environ.setdefault(OTEL_TRACES_SAMPLER_ARG, "0.01")
+        # Run AwsOpenTelemetryDistro to set up environment, then validate expected env values.
         aws_open_telemetry_distro: AwsOpenTelemetryDistro = AwsOpenTelemetryDistro()
         aws_open_telemetry_distro.configure(apply_patches=False)
+        validate_distro_environ()
+
+        # Overwrite exporter configs to keep tests clean, set sampler configs for tests
+        os.environ[OTEL_TRACES_EXPORTER] = "none"
+        os.environ[OTEL_METRICS_EXPORTER] = "none"
+        os.environ[OTEL_LOGS_EXPORTER] = "none"
+        os.environ[OTEL_TRACES_SAMPLER] = "traceidratio"
+        os.environ[OTEL_TRACES_SAMPLER_ARG] = "0.01"
+
+        # Run configurator and get trace provider
         aws_otel_configurator: AwsOpenTelemetryConfigurator = AwsOpenTelemetryConfigurator()
         aws_otel_configurator.configure()
         cls.tracer_provider: TracerProvider = get_tracer_provider()
@@ -249,3 +263,40 @@ def test_customize_span_processors(self):
         second_processor: SpanProcessor = mock_tracer_provider.add_span_processor.call_args_list[1].args[0]
         self.assertIsInstance(second_processor, AwsSpanMetricsProcessor)
         os.environ.pop("OTEL_AWS_APPLICATION_SIGNALS_ENABLED", None)
+
+    def test_application_signals_exporter_provider(self):
+        # Check default protocol - HTTP, as specified by AwsOpenTelemetryDistro.
+        exporter: OTLPMetricExporterMixin = ApplicationSignalsExporterProvider().create_exporter()
+        self.assertIsInstance(exporter, OTLPHttpOTLPMetricExporter)
+        self.assertEqual("http://localhost:4316/v1/metrics", exporter._endpoint)
+
+        # Overwrite protocol to gRPC.
+        os.environ["OTEL_EXPORTER_OTLP_PROTOCOL"] = "grpc"
+        exporter: SpanExporter = ApplicationSignalsExporterProvider().create_exporter()
+        self.assertIsInstance(exporter, OTLPGrpcOTLPMetricExporter)
+        self.assertEqual("localhost:4315", exporter._endpoint)
+
+        # Overwrite protocol back to HTTP.
+        os.environ["OTEL_EXPORTER_OTLP_PROTOCOL"] = "http/protobuf"
+        exporter: SpanExporter = ApplicationSignalsExporterProvider().create_exporter()
+        self.assertIsInstance(exporter, OTLPHttpOTLPMetricExporter)
+        self.assertEqual("http://localhost:4316/v1/metrics", exporter._endpoint)
+
+
+def validate_distro_environ():
+    tc: TestCase = TestCase()
+    # Set by OpenTelemetryDistro
+    tc.assertEqual("otlp", os.environ.get("OTEL_TRACES_EXPORTER"))
+    tc.assertEqual("otlp", os.environ.get("OTEL_METRICS_EXPORTER"))
+
+    # Set by AwsOpenTelemetryDistro
+    tc.assertEqual("http/protobuf", os.environ.get("OTEL_EXPORTER_OTLP_PROTOCOL"))
+    tc.assertEqual(
+        "base2_exponential_bucket_histogram", os.environ.get("OTEL_EXPORTER_OTLP_METRICS_DEFAULT_HISTOGRAM_AGGREGATION")
+    )
+    tc.assertEqual("xray,tracecontext,b3,b3multi", os.environ.get("OTEL_PROPAGATORS"))
+    tc.assertEqual("xray", os.environ.get("OTEL_PYTHON_ID_GENERATOR"))
+
+    # Not set
+    tc.assertEqual(None, os.environ.get("OTEL_TRACES_SAMPLER"))
+    tc.assertEqual(None, os.environ.get("OTEL_TRACES_SAMPLER_ARG"))
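
The new test switches `OTEL_EXPORTER_OTLP_PROTOCOL` by assigning to `os.environ` and then setting it back by hand. One possible alternative, offered only as a sketch and not as what the commit does, is to scope each override with `unittest.mock.patch.dict`, which restores the previous environment even if an assertion fails. The helper name `create_with_protocol` is hypothetical.

```python
import os
from typing import Callable, TypeVar
from unittest.mock import patch

T = TypeVar("T")

PROTOCOL_KEY = "OTEL_EXPORTER_OTLP_PROTOCOL"


def create_with_protocol(protocol: str, factory: Callable[[], T]) -> T:
    """Call `factory` while the protocol variable is temporarily overridden.

    patch.dict puts os.environ back to its prior state when the block exits,
    whether or not `factory` raises.
    """
    with patch.dict(os.environ, {PROTOCOL_KEY: protocol}):
        return factory()
```

In the test above, `factory` would be something like `lambda: ApplicationSignalsExporterProvider().create_exporter()`.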

contract-tests/tests/test/amazon/base/contract_test_base.py

Lines changed: 1 addition & 0 deletions
@@ -90,6 +90,7 @@ def setUp(self) -> None:
             .with_env("OTEL_METRIC_EXPORT_INTERVAL", "50")
             .with_env("OTEL_AWS_APPLICATION_SIGNALS_ENABLED", "true")
             .with_env("OTEL_METRICS_EXPORTER", "none")
+            .with_env("OTEL_EXPORTER_OTLP_PROTOCOL", "grpc")
             .with_env("OTEL_BSP_SCHEDULE_DELAY", "1")
             .with_env("OTEL_AWS_APPLICATION_SIGNALS_EXPORTER_ENDPOINT", f"http://collector:{_MOCK_COLLECTOR_PORT}")
             .with_env("OTEL_EXPORTER_OTLP_TRACES_ENDPOINT", f"http://collector:{_MOCK_COLLECTOR_PORT}")
