-
Notifications
You must be signed in to change notification settings - Fork 597
fix: handle shutdown in logs exporter #3255
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
fix: handle shutdown in logs exporter #3255
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #3255 +/- ##
=======================================
- Coverage 80.8% 80.7% -0.1%
=======================================
Files 129 129
Lines 23203 23212 +9
=======================================
Hits 18750 18750
- Misses 4453 4462 +9 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
scottgerring
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! LGTM
| } | ||
| None => Err(tonic::Status::failed_precondition( | ||
| "exporter is already shut down", | ||
| "metrics exporter is already shut down", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is helpful; I don't think the user would necessarily see which failed otherwise
Co-authored-by: Scott Gerring <[email protected]>
|
@stefanobaghino that was a quick turnaround :D |
| use opentelemetry_sdk::error::{OTelSdkError, OTelSdkResult}; | ||
| use opentelemetry_sdk::logs::{LogBatch, LogExporter}; | ||
| use std::sync::Arc; | ||
| use std::sync::{Arc, Mutex}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are there any repercussions of moving to std Mutex instead of tokio one inside the async export call?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's fine, and its the same pattern used in TonicsMetricsClient for the last couple years, which as far as I can tell we've not had problems with (linky to blame).
In the export, we hold the lock only so long as to clone the client and call the interceptor - and significantly, not across any await points, which is the main contra-indicator for std mutexes:
let (mut client, metadata, extensions) = self
.inner
.lock()
.map_err(|e| tonic::Status::internal(format!("Failed to acquire lock: {e:?}")))
.and_then(|mut inner| match &mut *inner {
Some(inner) => {
let (m, e, _) = inner
.interceptor
.call(Request::new(()))
.map_err(|e| {
// Convert interceptor errors to tonic::Status for retry classification
tonic::Status::internal(format!("interceptor error: {e:?}"))
})? // lock releasedI think if you called shutdown_with_timeout on a single-threaded tokio runtime while the runtime is still trying to do an export, and the export was at that exact moment in the code I highlighted above, then you could potentially deadlock it, but this relies on amazing timing, and I think we already warn against doing exactly this:
opentelemetry-rust/opentelemetry-sdk/src/logs/batch_log_processor.rs
Lines 99 to 102 in df412fe
| /// **Warning**: When using tokio's current-thread runtime, `shutdown()`, which | |
| /// is a blocking call ,should not be called from your main thread. This can | |
| /// cause deadlock. Instead, call `shutdown()` from a separate thread or use | |
| /// tokio's `spawn_blocking`. |
I note there are some other suggestions (e.g. spawning a thread) to do this particular thing in the linked issue, but this feels a bit overwrought.
What are you thinking @cijothomas ?
Fixes #2777
Changes
This follows the approach suggested here by @scottgerring:
Merge requirement checklist
CHANGELOG.mdfiles updated for non-trivial, user-facing changes