Commit 387441e

srprash and thpierce authored
Fix Gevent patch regression with correct import order (#522)
### Issue

[This PR](#218) first fixed the Gevent monkey patching issue by calling the patch operation when the ADOT Python starts up and before the modules are loaded.

Later on in v0.10.0, [this commit](aed584f#diff-8edce37d4d49a64bdb9e89ba42e612aeea53757a4f317a94b80b883ce54008e9) broke the above fix by adding the line `from amazon.opentelemetry.distro.aws_opentelemetry_configurator` which transitively imports the `requests` library before the gevent patch operation is called, causing inconsistencies in the patched libraries resulting in seeing the `RecursionError: maximum recursion depth exceeded` error reappearing.

### Changes

1. Adding back the earlier logic to run the gevent's monkey patch (only if `gevent` is installed) which was removed in [this commit](1326576#diff-acd1d4a6916a3593419e6117394fe1c7d07b69d1562917dc0a2af45cd7ee8526) to unblock development.
2. Refactoring the patch into its own file and adding comprehensive comments to avoid the same issue in future.
3. Updated unit tests to mock the `gevent` module and validate only the ADOT logic without running the gevent monkey patch. The actual behavior of the patch should be tested in a higher level test such as a contract test.

### Testing

Did a manual E2E testing using a sample flask application. Confirmed the following:

- Saw `RecursionError` in the application with ADOT Python v0.10.0
- No such error when using ADOT with this fix.

#### Setup

##### Sample app

```python
from flask import Flask, request, jsonify
import logging
import sys
from datetime import datetime
import requests

# Configure logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)

app = Flask(__name__)
logger = logging.getLogger(__name__)

@app.route('/requests')
def make_request():
    res = requests.get('https://httpbin.org/get', timeout=5)
    return jsonify({
        'status': 'success',
        'status_code': res.status_code,
        'url': res.url,
        'timestamp': datetime.now().isoformat()
    })

if __name__ == '__main__':
    app.run(debug=False, host='0.0.0.0', port=5001)
```

##### Running with problematic version (see the `MonkeyPatchWarning`)

```shell
((venv) ) ➜ flask git:(fix_gevent_regression) ✗ opentelemetry-instrument gunicorn -w 1 -k gevent --bind 0.0.0.0:5001 app:app
/Users/srprash/aws-otel-python-instrumentation/venv/lib/python3.12/site-packages/amazon/opentelemetry/distro/patches/_instrumentation_patch.py:36: MonkeyPatchWarning: Monkey-patching ssl after ssl has already been imported may lead to errors, including RecursionError on Python 3.6. It may also silently lead to incorrect behaviour on Python 3.7. Please monkey-patch earlier. See gevent/gevent#1016. Modules that had direct imports (NOT patched): ['urllib3.util.ssl_ (/Users/srprash/aws-otel-python-instrumentation/venv/lib/python3.12/site-packages/urllib3/util/ssl_.py)', 'urllib3.util (/Users/srprash/aws-otel-python-instrumentation/venv/lib/python3.12/site-packages/urllib3/util/__init__.py)'].
  monkey.patch_all()
Failed to get k8s token: [Errno 2] No such file or directory: '/var/run/secrets/kubernetes.io/serviceaccount/token'
AwsEksResourceDetector failed: [Errno 2] No such file or directory: '/var/run/secrets/kubernetes.io/serviceaccount/token'
AwsEcsResourceDetector failed: Missing ECS_CONTAINER_METADATA_URI therefore process is not on ECS.
AwsEc2ResourceDetector failed: <urlopen error [Errno 65] No route to host>
[2025-10-29 11:00:55 -0700] [54080] [INFO] Starting gunicorn 23.0.0
[2025-10-29 11:00:55 -0700] [54080] [INFO] Listening at: http://0.0.0.0:5001 (54080)
[2025-10-29 11:00:55 -0700] [54080] [INFO] Using worker: gevent
[2025-10-29 11:00:55 -0700] [54290] [INFO] Booting worker with pid: 54290
2025-10-29 11:02:12,464 - app - ERROR - Exception on /requests [GET]
```

##### Running with the fix

```shell
((venv) ) ➜ flask git:(fix_gevent_regression) ✗ opentelemetry-instrument gunicorn -w 1 -k gevent --bind 0.0.0.0:5001 app:app
Failed to get k8s token: [Errno 2] No such file or directory: '/var/run/secrets/kubernetes.io/serviceaccount/token'
AwsEksResourceDetector failed: [Errno 2] No such file or directory: '/var/run/secrets/kubernetes.io/serviceaccount/token'
AwsEcsResourceDetector failed: Missing ECS_CONTAINER_METADATA_URI therefore process is not on ECS.
AwsEc2ResourceDetector failed: <urlopen error [Errno 65] No route to host>
[2025-10-29 11:04:45 -0700] [60707] [INFO] Starting gunicorn 23.0.0
[2025-10-29 11:04:45 -0700] [60707] [INFO] Listening at: http://0.0.0.0:5001 (60707)
[2025-10-29 11:04:45 -0700] [60707] [INFO] Using worker: gevent
[2025-10-29 11:04:45 -0700] [60734] [INFO] Booting worker with pid: 60734
```

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

---------

Co-authored-by: Thomas Pierce <[email protected]>
1 parent 6ae6225 commit 387441e
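
The commit message says the unit tests mock the `gevent` module so that only the ADOT logic is exercised. The test files are not shown on this page, so the following is only a sketch of one way such a mock could work, not the tests from this commit. `apply_patch_sketch` and `run_with_fake_gevent` are hypothetical names; the stand-in function simply mirrors the delayed `from gevent import monkey` import pattern.

```python
import sys
from unittest import mock


def apply_patch_sketch():
    """Hypothetical stand-in for a patch entry point that imports gevent lazily."""
    # The import is resolved through sys.modules, so a fake entry is picked up here.
    from gevent import monkey

    monkey.patch_all()


def run_with_fake_gevent():
    """Run the stand-in with a fake gevent and return the mocked patch_all."""
    fake_gevent = mock.MagicMock(name="gevent")
    fake_modules = {"gevent": fake_gevent, "gevent.monkey": fake_gevent.monkey}
    # patch.dict restores sys.modules on exit, so the fake does not leak out.
    with mock.patch.dict(sys.modules, fake_modules):
        apply_patch_sketch()
    return fake_gevent.monkey.patch_all
```

Calling `run_with_fake_gevent()` returns a mock whose `call_count` can be asserted on, without any real monkey patching taking place in the test process.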

5 files changed: +420 -1 lines changed

CHANGELOG.md

Lines changed: 2 additions & 0 deletions
@@ -17,3 +17,5 @@ If your change does not need a CHANGELOG entry, add the "skip changelog" label t
   ([#497](https://github.com/aws-observability/aws-otel-python-instrumentation/pull/497))
 - Fix timeout handling for exceeded deadline in retry logic in OTLPAwsLogsExporter
   ([#501](https://github.com/aws-observability/aws-otel-python-instrumentation/pull/501))
+- Fix Gevent patch regression with correct import order
+  ([#522](https://github.com/aws-observability/aws-otel-python-instrumentation/pull/522))

aws-opentelemetry-distro/src/amazon/opentelemetry/distro/aws_opentelemetry_distro.py

Lines changed: 11 additions & 0 deletions
@@ -1,5 +1,16 @@
 # Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
 # SPDX-License-Identifier: Apache-2.0
+
+# flake8: noqa: E402
+# pylint: disable=wrong-import-position
+# ========================================================================
+# Apply the Gevent's patching as the very first step in the distro.
+# IMPORTANT: Do not put any imports before the following 2 lines.
+# Read the comments in the _gevent_patches.py for details.
+from amazon.opentelemetry.distro.patches._gevent_patches import apply_gevent_monkey_patch
+
+apply_gevent_monkey_patch()
+# ========================================================================
 import importlib
 import os
 import sys
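
The ordering constraint in this file ("do not put any imports before the following 2 lines") can be spot-checked with a small diagnostic. This is a minimal sketch and not part of the commit: `SENSITIVE_MODULES` and `already_imported` are hypothetical names, and the watch list is an assumption chosen for illustration. It only reports which of the named modules are already present in `sys.modules`, which is the condition gevent warns about for `ssl`.

```python
import sys

# Hypothetical watch list for this sketch: modules whose early import is
# assumed to matter for gevent's monkey patching.
SENSITIVE_MODULES = ("ssl", "requests", "urllib3")


def already_imported(names=SENSITIVE_MODULES, loaded=None):
    """Return the subset of `names` already present in the module cache."""
    # `loaded` defaults to the live module cache; a plain dict can be passed for testing.
    loaded = sys.modules if loaded is None else loaded
    return [name for name in names if name in loaded]
```

A non-empty result is a hint rather than proof of a problem; gevent's own `MonkeyPatchWarning`, as seen in the log in the commit message, remains the authoritative signal.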
Lines changed: 82 additions & 0 deletions
@@ -0,0 +1,82 @@
+# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
+# SPDX-License-Identifier: Apache-2.0
+
+import os
+from importlib.metadata import PackageNotFoundError, version
+from logging import Logger, getLogger
+
+from packaging.requirements import Requirement
+
+_logger: Logger = getLogger(__name__)
+
+# Env variable to control Gevent monkey patching behavior in ADOT.
+# Read more about the Gevent monkey patching: https://www.gevent.org/intro.html#monkey-patching
+# Possible values are 'all', 'none', and
+# comma separated list 'os, thread, time, sys, socket, select, ssl, subprocess, builtins, signal, queue, contextvars'.
+# When set to 'none', gevent's monkey patching is skipped.
+# When set to 'all' (default behavior), gevent patch is executed for all modules as per
+# https://www.gevent.org/api/gevent.monkey.html#gevent.monkey.patch_all.
+# When set to a comma separated list of modules, only those are processed for gevent's patch.
+AWS_GEVENT_PATCH_MODULES = "AWS_GEVENT_PATCH_MODULES"
+
+
+def _is_gevent_installed() -> bool:
+    """Is the gevent package installed?"""
+    req = Requirement("gevent")
+    try:
+        dist_version = version(req.name)
+        _logger.debug("Gevent is installed: %s", dist_version)
+    except PackageNotFoundError as exc:
+        _logger.debug("Gevent is not installed. %s", exc)
+        return False
+    return True
+
+
+def apply_gevent_monkey_patch():
+    # This patch differs from other instrumentation patches in this directory as it addresses
+    # application compatibility rather than telemetry functionality. It prevents breaking user
+    # applications that run on Gevent and use libraries like boto3, requests, or urllib3 when
+    # instrumented with ADOT.
+    #
+    # Without this patch, users encounter "RecursionError: maximum recursion depth exceeded"
+    # because by the time Gevent monkey-patches modules (such as ssl), those modules have already
+    # been imported by ADOT. Specifically, aws_xray_remote_sampler imports requests, which
+    # transitively imports ssl, leaving these modules in an inconsistent state for Gevent.
+    #
+    # Gevent recommends monkey-patching as early as possible:
+    # https://www.gevent.org/intro.html#monkey-patching
+    #
+    # Since ADOT initialization occurs before user application code, we perform the monkey-patch
+    # here to ensure proper module state for Gevent-based applications.
+
+    # Only apply the gevent monkey patch if gevent is installed is user application space.
+    if _is_gevent_installed():
+        try:
+            gevent_patch_module = os.environ.get(AWS_GEVENT_PATCH_MODULES, "all")
+
+            if gevent_patch_module != "none":
+                # pylint: disable=import-outside-toplevel
+                # Delay import to only occur if monkey patch is needed (e.g. gevent is used to run application).
+                from gevent import monkey
+
+                if gevent_patch_module == "all":
+                    monkey.patch_all()
+                else:
+                    module_list = [module.strip() for module in gevent_patch_module.split(",")]
+
+                    monkey.patch_all(
+                        socket="socket" in module_list,
+                        time="time" in module_list,
+                        select="select" in module_list,
+                        thread="thread" in module_list,
+                        os="os" in module_list,
+                        ssl="ssl" in module_list,
+                        subprocess="subprocess" in module_list,
+                        sys="sys" in module_list,
+                        builtins="builtins" in module_list,
+                        signal="signal" in module_list,
+                        queue="queue" in module_list,
+                        contextvars="contextvars" in module_list,
+                    )
+        except Exception as exc:  # pylint: disable=broad-except
+            _logger.error("Failed to monkey patch gevent, exception: %s", exc)
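
How the `AWS_GEVENT_PATCH_MODULES` value maps onto `monkey.patch_all(...)` arguments can be illustrated without importing gevent at all. The following is a sketch under stated assumptions rather than the committed implementation: `PATCHABLE`, `patch_kwargs`, and `unknown_names` are hypothetical names, and the module list is copied from the keyword arguments in the diff above. The `unknown_names` helper is an addition for illustration; the committed code appears to ignore unrecognised names silently.

```python
from typing import Dict, List, Optional

# Keyword names copied from the monkey.patch_all(...) call in the diff.
PATCHABLE = (
    "os", "thread", "time", "sys", "socket", "select",
    "ssl", "subprocess", "builtins", "signal", "queue", "contextvars",
)


def _split(raw: str) -> List[str]:
    """Split a comma separated value and trim whitespace around each item."""
    return [item.strip() for item in raw.split(",")]


def patch_kwargs(raw: str = "all") -> Optional[Dict[str, bool]]:
    """Map the env var value to keyword arguments for a patch_all-style call.

    Returns None when patching should be skipped ('none'), an empty dict
    when the callee's defaults should apply ('all'), and otherwise one
    boolean per known module.
    """
    if raw == "none":
        return None
    if raw == "all":
        return {}
    requested = _split(raw)
    return {name: name in requested for name in PATCHABLE}


def unknown_names(raw: str) -> List[str]:
    """Hypothetical helper: list requested names that are not patchable."""
    if raw in ("all", "none"):
        return []
    return [name for name in _split(raw) if name and name not in PATCHABLE]
```

For example, `patch_kwargs("ssl, socket")` produces a dict in which only `ssl` and `socket` are `True`, matching the behavior of the `else` branch in the diff, where every module not listed is passed as `False`.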
