Description
Describe your environment
OS: Ubuntu
Python version: 3.12.0
SDK version: 1.38.0
API version: 1.38.0
What happened?
When an exception interrupts a running stack of with tracer.start_as_current_span(...): blocks, all the current spans are finalized and ended correctly. When a signal terminates the process, this is not the case. For example:
with tracer.start_as_current_span("normal_run_example") as span:
    span.set_attribute("example.attribute", "normal_execution")
    raise Exception("Normal Exception")
This records the exception, ends the span, and shuts down the tracer provider, making sure everything is exported before shutdown. The following does not:
import os
import signal
import time

with tracer.start_as_current_span("normal_run_example") as span:
    span.set_attribute("example.attribute", "normal_execution")
    time.sleep(0.2)
    with tracer.start_as_current_span("nested_run_example") as span2:
        time.sleep(0.5)
        # Terminate the process while both spans are still open
        os.kill(os.getpid(), signal.SIGTERM)
        time.sleep(10)
Steps to Reproduce
See the examples above.
Expected Result
Even when the process is terminated by a signal, the open spans should be ended and the provider shut down, similar to the precautions already taken with atexit handlers on a normal interpreter exit.
Actual Result
The running trace does not arrive at the collector.
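One possible stopgap that needs no SDK change is to convert the signal into an ordinary exception, so that the with blocks unwind just as in the first example and the interpreter exits through its normal path, where atexit handlers run. The sketch below uses only the standard library: FakeSpan is a hypothetical stand-in for a span opened with tracer.start_as_current_span(...), and raise_system_exit is an illustrative name, not an OpenTelemetry API.

```python
import signal


class FakeSpan:
    """Hypothetical stand-in for a span context manager; not an OpenTelemetry class."""

    def __init__(self, name, ended):
        self.name = name
        self.ended = ended  # shared list recording the order in which spans end

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.ended.append(self.name)  # a real span would be ended here
        return False  # never swallow the exception


def raise_system_exit(signum, frame):
    # Convert the signal into SystemExit so every enclosing `with` unwinds.
    raise SystemExit(128 + signum)


def run_demo():
    ended = []
    previous = signal.signal(signal.SIGTERM, raise_system_exit)
    try:
        with FakeSpan("normal_run_example", ended):
            with FakeSpan("nested_run_example", ended):
                signal.raise_signal(signal.SIGTERM)
    except SystemExit as exc:
        exit_code = exc.code
    else:
        exit_code = None
    finally:
        # Restore whatever handler was installed before the demo.
        signal.signal(
            signal.SIGTERM,
            previous if previous is not None else signal.SIG_DFL,
        )
    return ended, exit_code


if __name__ == "__main__":
    print(run_demo())
```

In a real program the SystemExit would simply be left to propagate. Note the trade-off: the process then exits with a regular status code (128 + signum is only a shell convention) instead of being reported as killed by the signal, and the handler must be installed from the main thread.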
Additional context
Currently, the token needed to restore the previous context is only accessible inside opentelemetry.trace.use_span. If that token were attached to the created span as a private attribute, a signal handler could do the following:
import signal

from opentelemetry import trace
from opentelemetry.context import detach

# Assumption: the SDK TracerProvider has already been set globally
provider = trace.get_tracer_provider()

def shutdown_otel(signum=None):
    # Gracefully end the current span hierarchy if it is still running
    curr = trace.get_current_span()
    while curr and curr.is_recording():
        curr.end()
        token = getattr(curr, "_ctx_token", None)  # <-- proposed attribute, to be set in `use_span`
        if token:
            detach(token)
        curr = trace.get_current_span()
    # Gracefully shut down the tracer provider
    try:
        provider.shutdown()
    except Exception:
        pass
    # Restore the previous signal handler and re-raise the signal
    if signum is not None:
        signal.signal(signum, prev_handlers[signum])
        signal.raise_signal(signum)

# Run shutdown on interceptable termination signals
prev_handlers = {}
for s in ("SIGINT", "SIGTERM", "SIGHUP"):
    sig = getattr(signal, s, None)
    if sig is None:
        continue
    prev_handlers[sig] = signal.signal(sig, lambda signum, frame, _s=sig: shutdown_otel(_s))
Would you like to implement a fix?
Yes