@nicolastakashi nicolastakashi commented Jul 27, 2025

Changes

Making forceflush_timeout configurable through BatchExporterConfigBuilder

Merge requirement checklist

  • CONTRIBUTING guidelines followed
  • Unit tests added/updated (if applicable)
  • Appropriate CHANGELOG.md files updated for non-trivial, user-facing changes
  • Changes in public API reviewed (if applicable)

@nicolastakashi nicolastakashi requested a review from a team as a code owner July 27, 2025 11:27
@nicolastakashi nicolastakashi changed the title from "[CHORE] making forceflush_timeout configurable" to "chore: making forceflush_timeout configurable" Jul 27, 2025

codecov bot commented Jul 27, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 80.2%. Comparing base (42139cb) to head (29c0fb7).
⚠️ Report is 19 commits behind head on main.

Additional details and impacted files
@@          Coverage Diff          @@
##            main   #3087   +/-   ##
=====================================
  Coverage   80.1%   80.2%           
=====================================
  Files        126     126           
  Lines      21957   21985   +28     
=====================================
+ Hits       17603   17632   +29     
+ Misses      4354    4353    -1     


/// Maximum force flush timeout.
pub(crate) const OTEL_BLRP_FORCEFLUSH_TIMEOUT: &str = "OTEL_BLRP_FORCEFLUSH_TIMEOUT";
/// Default maximum force flush timeout.
pub(crate) const OTEL_BLRP_FORCEFLUSH_TIMEOUT_DEFAULT: Duration = Duration::from_millis(5_000);
Contributor

nit: why not use 5s here?

Member

@cijothomas cijothomas left a comment

OTEL_BLRP_FORCEFLUSH_TIMEOUT is not part of the OTel specification. We generally don't want to add env variables not covered by a spec already.

Additionally, I don't think this is the right way to support a timeout for flush. The flush API itself can take a timeout:
flush(); // existing one, uses the default 5 sec
flush_with_timeout(timeout); // new one, uses the passed-in timeout
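
To illustrate the shape suggested above, here is a minimal sketch in plain Rust. It is not the SDK's code: `SketchProcessor`, `FlushError`, and `export_delay` are hypothetical names invented for this example, and the export is simulated with a sleeping worker thread. The only points it shows are that the no-argument call delegates to the timeout-taking call with a 5-second default, and that the caller can otherwise pick the timeout per call.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Default force-flush timeout (5 seconds), as mentioned in the comment above.
const DEFAULT_FLUSH_TIMEOUT: Duration = Duration::from_secs(5);

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum FlushError {
    Timeout(Duration),
}

/// Hypothetical processor whose simulated export takes `export_delay`.
struct SketchProcessor {
    export_delay: Duration,
}

impl SketchProcessor {
    /// Existing-style entry point: delegates with the default timeout.
    fn force_flush(&self) -> Result<(), FlushError> {
        self.force_flush_with_timeout(DEFAULT_FLUSH_TIMEOUT)
    }

    /// New-style entry point: the caller chooses the timeout for this call.
    fn force_flush_with_timeout(&self, timeout: Duration) -> Result<(), FlushError> {
        let (tx, rx) = mpsc::channel();
        let delay = self.export_delay;
        // Simulate the export on a worker thread and signal completion.
        thread::spawn(move || {
            thread::sleep(delay);
            // The receiver may already be gone if the caller timed out.
            let _ = tx.send(());
        });
        // Wait for completion, but no longer than `timeout`.
        rx.recv_timeout(timeout)
            .map_err(|_| FlushError::Timeout(timeout))
    }
}

fn main() {
    let processor = SketchProcessor {
        export_delay: Duration::from_millis(200),
    };
    // Completes well within the 5-second default.
    println!("{:?}", processor.force_flush());
    // Gives up after 10 ms because the simulated export takes 200 ms.
    println!("{:?}", processor.force_flush_with_timeout(Duration::from_millis(10)));
}
```

With this shape the timeout is a per-call argument rather than global configuration, so no new environment variable is needed.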

@nicolastakashi
Author

Thanks for your review.
Do you mean adding a force_flush_with_timeout method?

@cijothomas
Member

Yes. Shutdown already has that pattern.

(I haven't seen much demand for force_flush with timeout - any particular scenario that you are facing that requires this?)

@nicolastakashi
Author

I'm hitting flush timeouts when pushing to certain remote endpoints.

@cijothomas
Member

@nicolastakashi Thanks for clarifying. However, my question was more about why you need to call flush at all. The SDK automatically flushes telemetry periodically (the interval is configurable), and if you are about to shut down the app, there is shutdown, which also performs a flush. force_flush() was introduced to the spec for some very specific scenarios, and I have seen a lot of people misuse it (e.g. calling flush() after every log emission or metric update!).
I am curious to know which scenario requires an explicit flush in your case.

(No objections to adding flush_with_timeout, I just want to make sure it's being added for a good reason. It was only recently that we added shutdown_with_timeout...)
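
The behaviour described above (periodic export in the background, plus a final drain on shutdown) can be sketched with a toy processor. This is only an illustration under stated assumptions, not the SDK's implementation: `ToyBatchProcessor`, `Msg`, `emit`, and the shared `exported` vector are hypothetical names, and "exporting" just moves records into that vector.

```rust
use std::sync::mpsc::{self, RecvTimeoutError, Sender};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};
use std::time::{Duration, Instant};

/// Messages sent to the background worker (hypothetical).
enum Msg {
    Record(String),
    Shutdown,
}

/// Toy stand-in for a batch processor: buffers records and exports them
/// on a fixed interval and once more on shutdown.
struct ToyBatchProcessor {
    tx: Sender<Msg>,
    worker: Option<JoinHandle<()>>,
}

impl ToyBatchProcessor {
    fn new(interval: Duration, exported: Arc<Mutex<Vec<String>>>) -> Self {
        let (tx, rx) = mpsc::channel();
        let worker = thread::spawn(move || {
            let mut buffer: Vec<String> = Vec::new();
            let mut next_export = Instant::now() + interval;
            loop {
                // Wait for a message, but only until the next scheduled export.
                let wait = next_export.saturating_duration_since(Instant::now());
                match rx.recv_timeout(wait) {
                    Ok(Msg::Record(record)) => buffer.push(record),
                    // Interval elapsed: export whatever is buffered.
                    Err(RecvTimeoutError::Timeout) => {
                        exported.lock().unwrap().append(&mut buffer);
                        next_export = Instant::now() + interval;
                    }
                    // Shutdown (or sender dropped): drain the buffer and stop.
                    Ok(Msg::Shutdown) | Err(RecvTimeoutError::Disconnected) => {
                        exported.lock().unwrap().append(&mut buffer);
                        break;
                    }
                }
            }
        });
        Self { tx, worker: Some(worker) }
    }

    /// Queue a record; no explicit flush is needed afterwards.
    fn emit(&self, record: &str) {
        let _ = self.tx.send(Msg::Record(record.to_string()));
    }

    /// Ask the worker to drain its buffer, then wait for it to finish.
    fn shutdown(&mut self) {
        let _ = self.tx.send(Msg::Shutdown);
        if let Some(worker) = self.worker.take() {
            let _ = worker.join();
        }
    }
}

fn main() {
    let exported = Arc::new(Mutex::new(Vec::new()));
    let mut processor = ToyBatchProcessor::new(Duration::from_millis(20), Arc::clone(&exported));
    processor.emit("first");
    thread::sleep(Duration::from_millis(100)); // the periodic export picks this up
    processor.emit("second");
    processor.shutdown(); // drains anything still buffered
    println!("{:?}", exported.lock().unwrap());
}
```

In this sketch neither record needs an explicit flush call: the first is exported by the interval timer, and the second is exported no later than shutdown.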

@nicolastakashi
Author

I'm using it for synthetic telemetry testing. I also tried calling provider shutdown, but it times out as well, since behind the scenes it calls flush.

@cijothomas
Member

That is not true. If shutdown is not working as designed, then we can investigate and fix that. Could you open a separate issue with minimal repro steps?
(If the repo doesn't already have tests for shutdown/timeout, then that is a gap we want to address too!)

@nicolastakashi
Author

Sorry, you are right. After reviewing it, I noticed that and found the reason I'm facing timeouts.
I'm willing to apply the changes to this PR if you think this is something that makes sense. WDYT?

@cijothomas
Member

Thanks for your offer to contribute. However, for the reasons I mentioned earlier, we don't want to continue this PR as-is. If we want to offer a timeout for flush, it should be done by accepting a timeout in the flush() call itself, just like shutdown(). We can accept a PR adding that capability.
That said, I haven't seen any requests to add that capability to flush before, as flush itself is a rarely required API!

Would love it if you can help with any other open issues or other fixes/improvements.

@cijothomas cijothomas closed this Aug 8, 2025