@pellared
Member

@pellared pellared commented Sep 1, 2025

Fixes #4643

What

Clarify that changes of disabled config MUST be eventually visible.

Why

Currently, an implementation that never synchronizes modification of and access to the disabled config is compliant with the specification. This is not correct, because the user may then never see the effect of a configuration change.

Per #4645 (comment)

Languages remain free to document and implement their own refresh/caching behaviors as long as they are eventually consistent.

@pellared pellared changed the title from "remove non-normative, confusing remark" to "Remove confusing non-normative remark about immediate visibility of disabled changes" Sep 1, 2025
@pellared pellared marked this pull request as ready for review September 1, 2025 11:33
@pellared pellared requested review from a team as code owners September 1, 2025 11:33
@pellared pellared added the clarification, area:sdk, spec:metrics, spec:logs, and spec:trace labels Sep 1, 2025
@carlosalberto
Contributor

The Enabled operation/check was explicitly created as a best-effort concept IIRC. I'd like to get the opinion of @jack-berg as he may have more context on this.

@trask
Member

trask commented Sep 1, 2025

cc @jackshirazi

of the two most common logging frameworks in Java (log4j and logback), I believe one of them ensures immediate visibility and one of them doesn't:

@pellared
Member Author

pellared commented Sep 3, 2025

The Enabled operation/check was explicitly created as a best-effort concept IIRC. I'd like to get the opinion of @jack-berg as he may have more context on this.

I checked the history and this was indeed introduced here by @jack-berg:

Then I removed some part here:

In the meantime PRs like these have been merged:

Therefore, I think that the non-normative remarks are no longer valid.
Let's wait for @jack-berg to double-check.

@jack-berg
Member

We write the spec for two types of readers: end users of the API and language maintainers. I don't understand why we want to remove non-normative but useful bits of implementation advice for language maintainers. A language maintainer shouldn't need to go through issue / PR history (or open a new issue requesting clarification) to get practical advice for implementation.

@pellared
Member Author

pellared commented Sep 3, 2025

useful bits of implementation advice for language maintainers

I question whether it is useful, practical, or even good advice. I do not think we should include advice that is very subjective.

A language maintainer shouldn't need to go through issue / PR history (or open a new issue requesting clarification) to get practical advice for implementation.

I had to do it because the advice does not seem reasonable.

@jack-berg
Member

jack-berg commented Sep 3, 2025

Then add a clarification that the purpose of this is to ensure that the implementation of enabled doesn't get in the way of performance 🤷

@pellared
Member Author

pellared commented Sep 3, 2025

Then add a clarification that the purpose of this is to ensure that the implementation of enabled doesn't get in the way of performance 🤷

@jack-berg, PTAL 96bc8e2

@pellared pellared changed the title from "Remove confusing non-normative remark about immediate visibility of disabled changes" to "Change the non-normative remark about immediate visibility of disabled changes" Sep 3, 2025
@pellared pellared requested a review from Copilot September 3, 2025 15:29
Contributor

Copilot AI left a comment

Pull Request Overview

This PR removes a non-normative remark about immediate visibility of disabled changes and replaces it with guidance on efficient implementation. The change addresses implementation-detail ambiguity that conflicts with the normative MUST semantics of the Enabled API.

  • Removes statement that SDKs don't need to make disabled changes immediately visible
  • Adds guidance for efficient access and synchronization of the disabled flag
  • Applies consistent changes across trace, metrics, and logs SDK specifications

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| specification/trace/sdk.md | Updates disabled flag documentation with efficient implementation guidance |
| specification/metrics/sdk.md | Replaces visibility statement with performance-focused synchronization guidance |
| specification/logs/sdk.md | Aligns disabled flag documentation with other SDK specifications |

@pellared pellared requested a review from trask September 8, 2025 07:39
@pellared pellared marked this pull request as draft September 9, 2025 16:12
@pellared pellared changed the title from "Change the non-normative remark about immediate visibility of disabled changes" to "Change the non-normative remark about visibility of disabled changes" Sep 11, 2025
@pellared pellared closed this Sep 11, 2025
@pellared pellared reopened this Sep 11, 2025
@pellared pellared changed the title from "Change the non-normative remark about visibility of disabled changes" to "Clarify that changes of disabled config MUST be eventually visible" Sep 11, 2025
@pellared pellared requested a review from Copilot September 11, 2025 15:54
Co-authored-by: Liudmila Molkova <[email protected]>
@carlosalberto
Contributor

We are waiting on Java (or any other language) to have the actual eventual behavior described here (golang already implements an always-visible behavior, etc.)

@pellared
Member Author

@open-telemetry/java-maintainers , PTAL

@jack-berg
Member

jack-berg commented Sep 26, 2025

Has any language SDK implemented any bits of dynamic configuration besides Java? I see a lot of approvals on this, but as @trask's benchmarks show, checking a volatile variable (i.e. the fastest way to ensure eventual consistency in Java) on every log creation / span creation / metric measurement recording is noteworthy from a perf standpoint. The perf impact is especially noteworthy for metric recording, which is expected to be allocation free and extremely fast.
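
For illustration only, a minimal Java sketch of the volatile-flag approach mentioned above. The class and method names (`TracerConfigHolder`, `setDisabled`, `isEnabled`) are hypothetical and are not taken from opentelemetry-java; the sketch only shows where the volatile read lands on the hot path.

```java
// Hypothetical sketch: names are illustrative, not an actual SDK API.
final class TracerConfigHolder {
    // volatile: a write by the thread applying a config change becomes
    // visible to later reads on other threads, without taking a lock.
    private volatile boolean disabled;

    TracerConfigHolder(boolean disabled) {
        this.disabled = disabled;
    }

    // Called by whatever component applies a configuration update.
    void setDisabled(boolean disabled) {
        this.disabled = disabled;
    }

    // Called on the hot path (span creation, log emission, metric recording).
    // This is the single volatile read whose cost is being discussed.
    boolean isEnabled() {
        return !disabled;
    }
}
```

With a plain (non-volatile) field, the Java memory model would not guarantee that a reader thread ever observes the update, which is the case the PR wants to rule out.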

I would be surprised if other languages implemented this particular bit of dynamic config and didn't end up with similar perf questions.

One option I suggested was to provide some sort of SDK initialization time config option where a user can indicate that they may be updating the config dynamically. If they don't indicate updates may occur, then we adjust the implementation to avoid the perf hit and prevent updates. If they do indicate updates may occur, then the implementation adjusts to guarantee changes are seen.
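
A rough Java sketch of that opt-in idea, under stated assumptions: the `DisabledFlag` interface, the two implementations, and the `allowDynamicUpdates` parameter are all hypothetical names invented for this illustration and do not exist in any OpenTelemetry SDK.

```java
// Hypothetical sketch of an initialization-time opt-in; not an actual SDK API.
interface DisabledFlag {
    boolean isDisabled();
    void set(boolean disabled);
}

// Chosen when the user did NOT opt in: a final field, no volatile read,
// and updates are rejected.
final class FixedFlag implements DisabledFlag {
    private final boolean disabled;

    FixedFlag(boolean disabled) {
        this.disabled = disabled;
    }

    @Override
    public boolean isDisabled() {
        return disabled;
    }

    @Override
    public void set(boolean disabled) {
        throw new UnsupportedOperationException(
            "dynamic config updates were not enabled at initialization");
    }
}

// Chosen when the user opted in: a volatile field, so changes are
// guaranteed to become visible to other threads.
final class DynamicFlag implements DisabledFlag {
    private volatile boolean disabled;

    DynamicFlag(boolean disabled) {
        this.disabled = disabled;
    }

    @Override
    public boolean isDisabled() {
        return disabled;
    }

    @Override
    public void set(boolean disabled) {
        this.disabled = disabled;
    }
}

final class DisabledFlags {
    private DisabledFlags() {}

    // allowDynamicUpdates stands in for the hypothetical init-time option.
    static DisabledFlag create(boolean initiallyDisabled, boolean allowDynamicUpdates) {
        return allowDynamicUpdates
            ? new DynamicFlag(initiallyDisabled)
            : new FixedFlag(initiallyDisabled);
    }
}
```

The trade-off in this sketch is that the choice is fixed at construction time: users who never update config avoid the volatile read, and users who do must say so up front.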

@pellared
Member Author

pellared commented Sep 29, 2025

expected to be allocation free and extremely fast.

Is 80ns, allocation free, slow? What share of an end-to-end scenario are we talking about? You can fit 12,500,000 80ns-long operations into one second.

If this is considered slow then I would question the whole idea of the dynamic configuration feature. Otherwise, what is the point of having a feature that provides guarantees that it "eventually" works? Maybe you want to provide one SDK implementation which handles dynamic config and a second which does not?

One option open-telemetry/opentelemetry-java#7700 (comment) was to provide some sort of SDK initialization time config option where a user can indicate that they may be updating the config dynamically.

I find this implementation-specific. The solution depends on the concurrency and memory model of the given runtime/language.

@jack-berg
Member

jack-berg commented Sep 29, 2025

See this old blog post where I did a pretty deep investigation into perf of different java metric systems and found record time of otel java anywhere from 20ns-110ns. https://opentelemetry.io/blog/2024/java-metric-systems-compared/#results

Micrometer and Prometheus are similar so I def do not want to add 80ns per record for a niche use case.

Member

@jack-berg jack-berg left a comment

Approve because I think this is the right sentiment. But in practice in otel java, I think we'll need to make dynamic config something users opt in to the possibility of at initialization time. That way, we can save the vast majority of users who won't use dynamic config the performance hit. I don't think this needs to be part of the spec, but if other languages implement dynamic config and come across similar challenges, perhaps we can update the spec later.

@trask
Member

trask commented Sep 29, 2025

80ns

just wanted to clarify that the benchmark is performing 100 volatile reads in this time

this was the only way I could get a difference to show up between volatile and non-volatile reads
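
As a back-of-the-envelope check, assuming the 80ns figure covers exactly the 100 volatile reads and ignoring any other benchmark overhead, the per-read cost works out to:

$$
\frac{80\ \text{ns}}{100\ \text{reads}} = 0.8\ \text{ns per read}
$$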

as @laurit notes in open-telemetry/opentelemetry-java#7700 (comment), this difference between volatile and non-volatile reads is probably due to a compiler optimization:

As far as I know on x86, where these tests are run, volatile reads don't use a barrier. Based on that I'd guess that the difference in read only perf is not caused directly by volatile reads being slower but rather some sort of compiler optimization.

@jack-berg
Member

.8ns is much better

@jack-berg
Member

as @laurit notes in open-telemetry/opentelemetry-java#7700 (comment), this difference between volatile and non-volatile reads is probably due to a compiler optimization

That's what the Internet says about volatile as well. Apparently x86 reads already have memory semantics that are the same-(ish?) as volatile and all volatile does is prevent compiler optimizations. That doesn't mean that volatile reads won't be more costly on other architectures, but we can cross that bridge when needed.

I'm happy to pay an extra .8ns per operation for the simplicity of using volatile.

@pellared
Member Author

We are waiting on Java (or any other language) to have the actual eventual behavior described here (golang already implements an always-visible behavior, etc.)

@carlosalberto , I think this PR can be merged.

@carlosalberto carlosalberto added this pull request to the merge queue Sep 30, 2025
Merged via the queue into open-telemetry:main with commit 20b87ec Sep 30, 2025
7 checks passed
@carlosalberto carlosalberto mentioned this pull request Oct 10, 2025
github-merge-queue bot pushed a commit that referenced this pull request Oct 17, 2025
### Traces

- Restore `TraceIdRatioBased` and give it a deprecation timeline. Update recommended
  warnings based on feedback in issue [#4601](#4601).
  ([#4627](#4627))
- Changes of `TracerConfig.disabled` MUST be eventually visible.
  ([#4645](#4645))
- Remove text related to the former experimental probability sampling specification.
  ([#4673](#4673))

### Metrics

- Changes of `MeterConfig.disabled` MUST be eventually visible.
  ([#4645](#4645))

### Logs

- Add `minimum_severity` and `trace_based` logger configuration parameters.
  ([#4612](#4612))
- Changes of `LoggerConfig.disabled` MUST be eventually visible.
  ([#4645](#4645))

---------

Co-authored-by: Armin Ruech <[email protected]>
Labels

area:sdk, clarification, spec:logs, spec:metrics, spec:trace

Development

Successfully merging this pull request may close these issues.

Remove "It is not necessary for implementations to ensure that changes to any of these parameters are immediately visible to callers of Enabled."

8 participants