Skip to content

Commit 26c6922

Browse files
fix(bootstrap): do not load new modules after module cleanup [backport 2.3] (#7745)
Backport 5c8290a from #7649 to 2.3. We refactor the logic of the sitecustomize bootstrap script to make it harder to introduce new initialisation logic that can break some of the boilerplate bootstrapping logic, such as the module cleanup mechanism for the support of frameworks like gevent. We move all the logic into a dedicated module, preload, and expose post_preload hooks in case that some code needs to be run after the module cleanup. However, the problem is now slightly shifted towards these hooks. Care must be taken not to register any hooks that import modules that are not already imported during the execution of the preload script. ## Testing Strategy The test that was supposed to catch the presence of the `threading` module after initialisation when cleaning up the imported modules failed because RC was disabled globally in tests with `DD_REMOTE_CONFIGURATION_ENABLED=1` in the `riotfile.py` script. We have modified the test to ensure that RC is enabled. ## Further Details Because the changes in this PR include a refactor, we summarise the part of the old coding that actually caused the issue in the first place. The `sitecustomize.py` source included the following coding ```python # Only the import of the original sitecustomize.py is allowed after this # point. if sys.version_info >= (3, 7) and config._otel_enabled: from opentelemetry.trace import set_tracer_provider from ddtrace.opentelemetry import TracerProvider set_tracer_provider(TracerProvider()) ... if config._remote_config_enabled: from ddtrace.internal.remoteconfig.worker import remoteconfig_poller remoteconfig_poller.enable() if asm_config._asm_enabled or config._remote_config_enabled: from ddtrace.appsec._remoteconfiguration import enable_appsec_rc enable_appsec_rc() ``` The first `if` block introduces two problems: - it force-loads the `opentelemetry` module - it imports more modules from `ddtrace` _after_ the comment that instructs not to add imports beyond that point Note that the current implementation of the `opentelemetry` intergration also force-loads the `opentelemetry` module when `TracerProvider` is imported. Therefore, the whole logic should be registered as a module import hook. The other `if` blocks further down in the `sitecustomize.py` source also import from `ddtrace`. In fact, they import features that require threading support, which ultimately forces the gevent-crucial `threading` module to be re-loaded too early after the loaded module clean-up. ## Checklist - [x] Change(s) are motivated and described in the PR description. - [x] Testing strategy is described if automated tests are not included in the PR. - [x] Risk is outlined (performance impact, potential for breakage, maintainability, etc). - [x] Change is maintainable (easy to change, telemetry, documentation). - [x] [Library release note guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html) are followed. If no release note is required, add label `changelog/no-changelog`. - [x] Documentation is included (in-code, generated user docs, [public corp docs](https://github.com/DataDog/documentation/)). - [x] Backport labels are set (if [applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)) ## Reviewer Checklist - [x] Title is accurate. - [x] No unnecessary changes are introduced. - [x] Description motivates each change. - [x] Avoids breaking [API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces) changes unless absolutely necessary. - [x] Testing strategy adequately addresses listed risk(s). - [x] Change is maintainable (easy to change, telemetry, documentation). - [x] Release note makes sense to a user of the library. - [x] Reviewer has explicitly acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment. - [x] Backport labels are set in a manner that is consistent with the [release branch maintenance policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting) - [x] If this PR touches code that signs or publishes builds or packages, or handles credentials of any kind, I've requested a review from `@DataDog/security-design-and-guidance`. - [x] This PR doesn't touch any of that. Co-authored-by: Gabriele N. Tornetta <[email protected]>
1 parent 0e6113e commit 26c6922

File tree

5 files changed

+139
-89
lines changed

5 files changed

+139
-89
lines changed

ddtrace/bootstrap/preload.py

Lines changed: 109 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,109 @@
1+
"""
2+
Bootstrapping code that is run when using the `ddtrace-run` Python entrypoint
3+
Add all monkey-patching that needs to run by default here
4+
"""
5+
import os # noqa
6+
7+
from ddtrace import config # noqa
8+
from ddtrace.debugging._config import di_config # noqa
9+
from ddtrace.debugging._config import ed_config # noqa
10+
from ddtrace.settings.profiling import config as profiling_config # noqa
11+
from ddtrace.internal.logger import get_logger # noqa
12+
from ddtrace.internal.module import ModuleWatchdog # noqa
13+
from ddtrace.internal.runtime.runtime_metrics import RuntimeWorker # noqa
14+
from ddtrace.internal.tracemethods import _install_trace_methods # noqa
15+
from ddtrace.internal.utils.formats import asbool # noqa
16+
from ddtrace.internal.utils.formats import parse_tags_str # noqa
17+
from ddtrace.settings.asm import config as asm_config # noqa
18+
from ddtrace import tracer
19+
20+
21+
import typing as t
22+
23+
# Register operations to be performned after the preload is complete. In
24+
# general, we might need to perform some cleanup operations after the
25+
# initialisation of the library, while also execute some more code after that.
26+
# _____ ___ _________ _____ ______ _____ ___ _ _ _____
27+
# |_ _|| \/ || ___ \| _ || ___ \|_ _| / _ \ | \ | ||_ _|
28+
# | | | . . || |_/ /| | | || |_/ / | | / /_\ \| \| | | |
29+
# | | | |\/| || __/ | | | || / | | | _ || . ` | | |
30+
# _| |_ | | | || | \ \_/ /| |\ \ | | | | | || |\ | | |
31+
# \___/ \_| |_/\_| \___/ \_| \_| \_/ \_| |_/\_| \_/ \_/
32+
# Do not register any functions that import ddtrace modules that have not been
33+
# imported yet.
34+
post_preload = []
35+
36+
37+
def register_post_preload(func: t.Callable) -> None:
38+
post_preload.append(func)
39+
40+
41+
log = get_logger(__name__)
42+
43+
44+
if profiling_config.enabled:
45+
log.debug("profiler enabled via environment variable")
46+
import ddtrace.profiling.auto # noqa: F401
47+
48+
if di_config.enabled or ed_config.enabled:
49+
from ddtrace.debugging import DynamicInstrumentation
50+
51+
DynamicInstrumentation.enable()
52+
53+
if config._runtime_metrics_enabled:
54+
RuntimeWorker.enable()
55+
56+
if asbool(os.getenv("DD_IAST_ENABLED", False)):
57+
from ddtrace.appsec._iast._utils import _is_python_version_supported
58+
59+
if _is_python_version_supported():
60+
from ddtrace.appsec._iast._ast.ast_patching import _should_iast_patch
61+
from ddtrace.appsec._iast._loader import _exec_iast_patched_module
62+
63+
ModuleWatchdog.register_pre_exec_module_hook(_should_iast_patch, _exec_iast_patched_module)
64+
65+
if config._remote_config_enabled:
66+
from ddtrace.internal.remoteconfig.worker import remoteconfig_poller
67+
68+
remoteconfig_poller.enable()
69+
70+
if asm_config._asm_enabled or config._remote_config_enabled:
71+
from ddtrace.appsec._remoteconfiguration import enable_appsec_rc
72+
73+
enable_appsec_rc()
74+
75+
if config._otel_enabled:
76+
77+
@ModuleWatchdog.after_module_imported("opentelemetry.trace")
78+
def _(_):
79+
from opentelemetry.trace import set_tracer_provider
80+
81+
from ddtrace.opentelemetry import TracerProvider
82+
83+
set_tracer_provider(TracerProvider())
84+
85+
86+
if asbool(os.getenv("DD_TRACE_ENABLED", default=True)):
87+
from ddtrace import patch_all
88+
89+
@register_post_preload
90+
def _():
91+
# We need to clean up after we have imported everything we need from
92+
# ddtrace, but before we register the patch-on-import hooks for the
93+
# integrations.
94+
modules_to_patch = os.getenv("DD_PATCH_MODULES")
95+
modules_to_str = parse_tags_str(modules_to_patch)
96+
modules_to_bool = {k: asbool(v) for k, v in modules_to_str.items()}
97+
patch_all(**modules_to_bool)
98+
99+
if config.trace_methods:
100+
_install_trace_methods(config.trace_methods)
101+
102+
if "DD_TRACE_GLOBAL_TAGS" in os.environ:
103+
env_tags = os.getenv("DD_TRACE_GLOBAL_TAGS")
104+
tracer.set_tags(parse_tags_str(env_tags))
105+
106+
107+
@register_post_preload
108+
def _():
109+
tracer._generate_diagnostic_logs()

ddtrace/bootstrap/sitecustomize.py

Lines changed: 17 additions & 86 deletions
Original file line numberDiff line numberDiff line change
@@ -2,6 +2,17 @@
22
Bootstrapping code that is run when using the `ddtrace-run` Python entrypoint
33
Add all monkey-patching that needs to run by default here
44
"""
5+
# _____ ___ _________ _____ ______ _____ ___ _ _ _____
6+
# |_ _|| \/ || ___ \| _ || ___ \|_ _| / _ \ | \ | ||_ _|
7+
# | | | . . || |_/ /| | | || |_/ / | | / /_\ \| \| | | |
8+
# | | | |\/| || __/ | | | || / | | | _ || . ` | | |
9+
# _| |_ | | | || | \ \_/ /| |\ \ | | | | | || |\ | | |
10+
# \___/ \_| |_/\_| \___/ \_| \_| \_/ \_| |_/\_| \_/ \_/
11+
# DO NOT MODIFY THIS FILE!
12+
# Only do so if you know what you're doing. This file contains boilerplate code
13+
# to allow injecting a custom sitecustomize.py file into the Python process to
14+
# perform the correct initialisation for the library. All the actual
15+
# initialisation logic should be placed in preload.py.
516
from ddtrace import LOADED_MODULES # isort:skip
617

718
import logging # noqa
@@ -11,17 +22,10 @@
1122

1223
from ddtrace import config # noqa
1324
from ddtrace._logger import _configure_log_injection
14-
from ddtrace.debugging._config import di_config # noqa
15-
from ddtrace.debugging._config import ed_config # noqa
16-
from ddtrace.internal.compat import PY2 # noqa
1725
from ddtrace.internal.logger import get_logger # noqa
1826
from ddtrace.internal.module import ModuleWatchdog # noqa
1927
from ddtrace.internal.module import find_loader # noqa
20-
from ddtrace.internal.runtime.runtime_metrics import RuntimeWorker # noqa
21-
from ddtrace.internal.tracemethods import _install_trace_methods # noqa
2228
from ddtrace.internal.utils.formats import asbool # noqa
23-
from ddtrace.internal.utils.formats import parse_tags_str # noqa
24-
from ddtrace.settings.asm import config as asm_config # noqa
2529

2630
# Debug mode from the tracer will do the same here, so only need to do this otherwise.
2731
if config.logs_injection:
@@ -43,21 +47,13 @@
4347
)
4448

4549

46-
if PY2:
47-
_unloaded_modules = []
48-
49-
5050
def is_module_installed(module_name):
5151
return find_loader(module_name) is not None
5252

5353

5454
def cleanup_loaded_modules():
5555
def drop(module_name):
5656
# type: (str) -> None
57-
if PY2:
58-
# Store a reference to deleted modules to avoid them being garbage
59-
# collected
60-
_unloaded_modules.append(sys.modules[module_name])
6157
del sys.modules[module_name]
6258

6359
MODULES_REQUIRING_CLEANUP = ("gevent",)
@@ -88,16 +84,10 @@ def drop(module_name):
8884
"google.protobuf", # the upb backend in >= 4.21 does not like being unloaded
8985
]
9086
)
91-
if PY2:
92-
KEEP_MODULES_PY2 = frozenset(["encodings", "codecs", "copy_reg"])
9387
for m in list(_ for _ in sys.modules if _ not in LOADED_MODULES):
9488
if any(m == _ or m.startswith(_ + ".") for _ in KEEP_MODULES):
9589
continue
9690

97-
if PY2:
98-
if any(m == _ or m.startswith(_ + ".") for _ in KEEP_MODULES_PY2):
99-
continue
100-
10191
drop(m)
10292

10393
# TODO: The better strategy is to identify the core modues in LOADED_MODULES
@@ -127,61 +117,9 @@ def _(threading):
127117

128118

129119
try:
130-
from ddtrace import tracer
131-
132-
profiling = asbool(os.getenv("DD_PROFILING_ENABLED", False))
133-
134-
if profiling:
135-
log.debug("profiler enabled via environment variable")
136-
import ddtrace.profiling.auto # noqa: F401
137-
138-
if di_config.enabled or ed_config.enabled:
139-
from ddtrace.debugging import DynamicInstrumentation
140-
141-
DynamicInstrumentation.enable()
120+
import ddtrace.bootstrap.preload as preload # Perform the actual initialisation
142121

143-
if config._runtime_metrics_enabled:
144-
RuntimeWorker.enable()
145-
146-
if asbool(os.getenv("DD_IAST_ENABLED", False)):
147-
from ddtrace.appsec._iast._utils import _is_python_version_supported
148-
149-
if _is_python_version_supported():
150-
from ddtrace.appsec._iast._ast.ast_patching import _should_iast_patch
151-
from ddtrace.appsec._iast._loader import _exec_iast_patched_module
152-
153-
ModuleWatchdog.register_pre_exec_module_hook(_should_iast_patch, _exec_iast_patched_module)
154-
155-
if asbool(os.getenv("DD_TRACE_ENABLED", default=True)):
156-
from ddtrace import patch_all
157-
158-
# We need to clean up after we have imported everything we need from
159-
# ddtrace, but before we register the patch-on-import hooks for the
160-
# integrations.
161-
cleanup_loaded_modules()
162-
modules_to_patch = os.getenv("DD_PATCH_MODULES")
163-
modules_to_str = parse_tags_str(modules_to_patch)
164-
modules_to_bool = {k: asbool(v) for k, v in modules_to_str.items()}
165-
patch_all(**modules_to_bool)
166-
167-
if config.trace_methods:
168-
_install_trace_methods(config.trace_methods)
169-
else:
170-
cleanup_loaded_modules()
171-
172-
# Only the import of the original sitecustomize.py is allowed after this
173-
# point.
174-
175-
if "DD_TRACE_GLOBAL_TAGS" in os.environ:
176-
env_tags = os.getenv("DD_TRACE_GLOBAL_TAGS")
177-
tracer.set_tags(parse_tags_str(env_tags))
178-
179-
if sys.version_info >= (3, 7) and config._otel_enabled:
180-
from opentelemetry.trace import set_tracer_provider
181-
182-
from ddtrace.opentelemetry import TracerProvider
183-
184-
set_tracer_provider(TracerProvider())
122+
cleanup_loaded_modules()
185123

186124
# Check for and import any sitecustomize that would have normally been used
187125
# had ddtrace-run not been used.
@@ -223,22 +161,15 @@ def _(threading):
223161
else:
224162
log.debug("additional sitecustomize found in: %s", sys.path)
225163

226-
if config._remote_config_enabled:
227-
from ddtrace.internal.remoteconfig.worker import remoteconfig_poller
228-
229-
remoteconfig_poller.enable()
230-
231-
if asm_config._asm_enabled or config._remote_config_enabled:
232-
from ddtrace.appsec._remoteconfiguration import enable_appsec_rc
233-
234-
enable_appsec_rc()
235-
236164
config._ddtrace_bootstrapped = True
237165
# Loading status used in tests to detect if the `sitecustomize` has been
238166
# properly loaded without exceptions. This must be the last action in the module
239167
# when the execution ends with a success.
240168
loaded = True
241-
tracer._generate_diagnostic_logs()
169+
170+
for f in preload.post_preload:
171+
f()
172+
242173
except Exception:
243174
loaded = False
244175
log.warning("error configuring Datadog tracing", exc_info=True)
Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,5 @@
1+
---
2+
fixes:
3+
- |
4+
Fix a regression with the support for gevent that could have occurred if
5+
some products, like ASM, telemetry, were enabled.

tests/internal/test_auto.py

Lines changed: 6 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,12 @@
11
import pytest
22

33

4-
@pytest.mark.subprocess(env=dict(DD_UNLOAD_MODULES_FROM_SITECUSTOMIZE="true"))
4+
@pytest.mark.subprocess(
5+
env=dict(
6+
DD_UNLOAD_MODULES_FROM_SITECUSTOMIZE="true",
7+
DD_REMOTE_CONFIGURATION_ENABLED="1",
8+
)
9+
)
510
def test_auto():
611
import sys
712

tests/telemetry/test_telemetry_metrics_e2e.py

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -122,7 +122,7 @@ def test_span_creation_and_finished_metrics_datadog(test_agent_session, ddtrace_
122122
@pytest.mark.skipif(sys.version_info < (3, 7), reason="OpenTelemetry dropped support for python<=3.6")
123123
def test_span_creation_and_finished_metrics_otel(test_agent_session, ddtrace_run_python_code_in_subprocess):
124124
code = """
125-
import opentelemetry
125+
import opentelemetry.trace
126126
127127
ot = opentelemetry.trace.get_tracer(__name__)
128128
for _ in range(9):
@@ -172,7 +172,7 @@ def test_span_creation_and_finished_metrics_opentracing(test_agent_session, ddtr
172172
def test_span_creation_no_finish(test_agent_session, ddtrace_run_python_code_in_subprocess):
173173
code = """
174174
import ddtrace
175-
import opentelemetry
175+
import opentelemetry.trace
176176
from ddtrace import opentracer
177177
178178
ddtracer = ddtrace.tracer

0 commit comments

Comments
 (0)