add blog post announcing proposed changes/oteps for stability work #8208
Merged
austinlparker
merged 44 commits into
open-telemetry:main
from
austinlparker:ap/addStabProjAnnounceBlog
Nov 8, 2025
afd3604
add blog post draft for stability oteps
austinlparker 477b714
fixall
austinlparker 1399c8c
update java agent claim
austinlparker e79910c
Update content/en/blog/2025/stability-proposal-announcement.md
austinlparker f7765a7
Update content/en/blog/2025/stability-proposal-announcement.md
austinlparker 951307c
Update content/en/blog/2025/stability-proposal-announcement.md
austinlparker a6aa277
Update content/en/blog/2025/stability-proposal-announcement.md
austinlparker f0c0b25
Update content/en/blog/2025/stability-proposal-announcement.md
austinlparker 50a3116
Update content/en/blog/2025/stability-proposal-announcement.md
austinlparker 25af399
Update content/en/blog/2025/stability-proposal-announcement.md
austinlparker 9d91f14
Update content/en/blog/2025/stability-proposal-announcement.md
austinlparker c51f3c7
Update content/en/blog/2025/stability-proposal-announcement.md
austinlparker 6dd36b0
feedback
austinlparker 0c168f2
mega-feedback pass
austinlparker c1d0aac
nit
austinlparker d4a0968
Apply suggestion from @svrnm
austinlparker 4ef11ce
feedback
austinlparker 690d45c
feedback
austinlparker 959017e
Update content/en/blog/2025/stability-proposal-announcement.md
austinlparker 1f66ca2
Update content/en/blog/2025/stability-proposal-announcement.md
austinlparker f9b0c2e
feedback
austinlparker 32c2807
Update content/en/blog/2025/stability-proposal-announcement.md
austinlparker 7203b1a
Update content/en/blog/2025/stability-proposal-announcement.md
austinlparker fefb107
Update content/en/blog/2025/stability-proposal-announcement.md
austinlparker 4bfa151
Update content/en/blog/2025/stability-proposal-announcement.md
austinlparker bf74d4a
Update content/en/blog/2025/stability-proposal-announcement.md
austinlparker 710d246
Update content/en/blog/2025/stability-proposal-announcement.md
austinlparker bac80fd
Update content/en/blog/2025/stability-proposal-announcement.md
austinlparker 84f2fd6
address feedback
austinlparker fdaf70d
tweak language
austinlparker 8c3b66a
address trask's inline PR comments from nov 6
austinlparker 71ba313
Update content/en/blog/2025/stability-proposal-announcement.md
austinlparker 2fe8b49
Update content/en/blog/2025/stability-proposal-announcement.md
austinlparker a226c26
Update content/en/blog/2025/stability-proposal-announcement.md
austinlparker 22eb98e
chore: update refcache after rebase
austinlparker c27dbb6
fix: resolve markdown linting errors
austinlparker 000bf2f
chore: add words to cspell dictionary
austinlparker 5be7064
fix: use page-local cspell dictionary for blog post
austinlparker 4b05fb2
Revert changes to hugo.yaml
chalin b930dc7
Merge branch 'main' into ap/addStabProjAnnounceBlog
austinlparker 37a4796
tweak to intro
austinlparker c3d72de
fix lints
austinlparker 2b07a0c
Update content/en/blog/2025/stability-proposal-announcement.md
austinlparker 5c38ae6
Apply suggestions from code review
austinlparker
306 changes: 306 additions & 0 deletions
content/en/blog/2025/stability-proposal-announcement.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,306 @@ | ||
| --- | ||
| title: Evolving OpenTelemetry's Stabilization and Release Practices | ||
| linkTitle: Stability Proposal Announcement | ||
| date: 2025-11-07 | ||
| author: OpenTelemetry Governance Committee | ||
| sig: Governance Committee | ||
| cSpell:ignore: deprioritize incentivized rollouts | ||
| --- | ||
|
|
||
| ## Summary | ||
|
|
||
| OpenTelemetry is, by any metric, one of the largest and most exciting projects | ||
| in the cloud native space. Over the past five years, this community has come | ||
| together to build one of the most essential observability projects in history. | ||
| We're not resting on our laurels, though. The project consistently seeks out, | ||
| and listens to, feedback from a wide array of stakeholders. What we're hearing | ||
| from you is that in order to move to the next level, we need to adjust our | ||
| priorities and focus on stability, reliability, and organization of project | ||
| releases and artifacts like documentation and examples. | ||
|
|
||
| Over the past year, we've run a variety of user interviews and surveys, and had | ||
| open discussions across a range of venues. These discussions have demonstrated | ||
| that the complexity and lack of stability in OpenTelemetry create impediments | ||
| to production deployments. | ||
|
|
||
| This blog post lays out the objectives and goals that the Governance Committee | ||
| believes are crucial to addressing this feedback. We're starting with this post | ||
| in order to have these discussions in public. | ||
|
|
||
| ### Our Goals | ||
|
|
||
| - Ensure that all OpenTelemetry distributions are 'stable by default' and | ||
| provide standardized mechanisms for users to opt-in to experimental or | ||
| unstable features. | ||
| - Have a single, clear, and consistent set of criteria for stability that | ||
| includes documentation, performance testing, benchmarks, etc. | ||
| - Make it easier for instrumentation libraries to stabilize and encourage | ||
| federation of semantic conventions. | ||
| - Introduce 'epoch releases' that are easier for end-user organizations to | ||
| consume. | ||
|
|
||
| **We'd appreciate your feedback!** | ||
|
|
||
| From maintainers and contributors, we'd appreciate your feedback on this | ||
| proposal in general and on specifics, such as implementation timelines, the | ||
| requirements for moving stability levels, and how to handle telemetry output | ||
| migrations. | ||
|
|
||
| From end-users, we'd appreciate your feedback on how you'd prefer to adopt | ||
| releases of OpenTelemetry, and how you currently do so. As we evaluate different | ||
| versioning and release strategies, it would be helpful to understand how you're | ||
| currently rolling out changes -- especially in polyglot environments. We also | ||
| would appreciate your feedback on documentation and performance benchmarking for | ||
| components such as instrumentation libraries, the Collector, etc. | ||
|
|
||
| From integrators, vendors, and the wider ecosystem, we would appreciate feedback | ||
| and constructive proposals on instrumentation and semantic convention metadata | ||
| and discovery. For integrators that are building on top of, or alongside, | ||
| OpenTelemetry we would love to know how we can make it easier for you and your | ||
| users to consume OpenTelemetry, as well as how we can make it easier for you to | ||
| publish and maintain your own instrumentation. | ||
|
|
||
| Further sections of this blog have other specific asks that we'd appreciate your | ||
| feedback on. Please remember that the specific ways we accomplish these goals | ||
| are not set in stone -- that's why we want your feedback on the proposals! If | ||
| you think there's a better way to accomplish these goals, please use the | ||
| discussion to let us know. | ||
|
|
||
| [Join the discussion!](https://github.com/open-telemetry/community/discussions/3098) | ||
|
|
||
| ## Why are we doing this? | ||
|
|
||
| OpenTelemetry has grown into a massive, complex ecosystem. We support four | ||
| different telemetry signals (tracing, metrics, logs, and profiles) across more | ||
| than a dozen programming languages. Each language has its own runtime | ||
| requirements and execution environments. The | ||
| [specification compliance matrix](https://github.com/open-telemetry/opentelemetry-specification/blob/main/spec-compliance-matrix.md) | ||
| shows just how much we're trying to accomplish – and it's overwhelming. | ||
|
|
||
| This complexity creates real barriers to adoption. Organizations ready to deploy | ||
| OpenTelemetry in production encounter unexpected roadblocks: configuration that | ||
| breaks between minor versions, performance regressions that only appear at | ||
| scale, and the challenge of coordinating rollouts across hundreds or thousands | ||
| of services. Many teams end up delaying or scaling back their OpenTelemetry | ||
| deployments as a result. | ||
|
|
||
| For maintainers, this complexity makes their job harder than it needs to be. | ||
| There's a lack of clear milestones and guidance about what's 'most important' at | ||
| any given time. Stability efforts involve a lot of churn and there's often | ||
| conflicting guidance about where you should focus your time. | ||
|
|
||
| Addressing these concerns should be a high priority for the project, both for | ||
| the health of our maintainers and contributors and to allow us to continue to | ||
| grow and scale as we mature, especially as we become more deeply integrated | ||
| into the cloud native ecosystem. | ||
|
|
||
| The Governance Committee believes that these changes need community involvement | ||
| and discussion to be a success, so we’re taking this opportunity to announce our | ||
| intention and open a | ||
| [GitHub discussion](https://github.com/open-telemetry/community/discussions/3098) | ||
| in order to get feedback from users, maintainers, and contributors. We do not | ||
| anticipate that these changes will be completed overnight, and want to assure | ||
| everyone that we will continue to prioritize our existing commitments to users | ||
| and maintainers even as we consider necessary changes for the overall wellbeing | ||
| and maturity of the project. | ||
|
|
||
| ## 1. Stable By Default | ||
|
|
||
| Stability guarantees have been a long-held principle in OpenTelemetry, with | ||
| exceedingly high bars. There is a tension between this and user needs that we'd | ||
| like to discuss. | ||
|
|
||
| ### Background | ||
|
|
||
| OpenTelemetry is a specification for how cloud native software -- libraries, | ||
| frameworks, infrastructure abstractions, executable code, etc. -- produces and | ||
| communicates telemetry data about its operation. This specification is designed | ||
| to be exhaustive, comprehensive, and low-level. Many of the elements of the | ||
| specification are hard-won knowledge from the combined decades of experience its | ||
| authors have with building, operating, or designing telemetry systems at planet | ||
| scale. | ||
|
|
||
| A specification with no implementation is not a useful thing for end users, | ||
| though. Developers and operators approach telemetry through a variety of lenses. | ||
| Some organizations have high standards for observability, with entire teams | ||
| dedicated to building internal monitoring and instrumentation frameworks. Other | ||
| organizations view observability and monitoring as a second- or third-order | ||
| priority -- something that needs to happen, but not something that's | ||
| incentivized. OpenTelemetry, as a specification, needs to serve all of these | ||
| users and their use cases. | ||
|
|
||
| To make OpenTelemetry useful, we need to provide an 'on-ramp' from existing | ||
| methods and modes, existing tools and strategies, which means we need to provide | ||
| implementations of not just the specification, but _applications_ of it as well. | ||
| In practice, this means we need to distribute libraries to add OpenTelemetry | ||
| instrumentation to existing HTTP servers and clients, or Collector receivers to | ||
| scrape metrics from MySQL and translate them into OTLP. | ||
|
|
||
| Most of the value our community derives from OpenTelemetry comes directly from | ||
| instrumentation libraries and Collector components – not the core SDKs. While we | ||
| organize these as `contrib` repositories to distinguish them from core | ||
| components, end users don't see or care about this distinction. They just want | ||
| instrumentation that works. | ||
|
|
||
| For maintainers and project leadership, our stability goals and the nature of | ||
| `contrib` present a significant challenge. Users want stable, well-tested, and | ||
| performant releases -- that _also_ perform the same function as commercial | ||
| instrumentation agents. | ||
|
|
||
| ### Goals and Objectives | ||
|
|
||
| At a high level, we have three objectives in this area: | ||
|
|
||
| 1. All components across all repositories (including semantic conventions) | ||
| should adhere to a consistent way of communicating stability, through a | ||
| metadata file/information, that can be discovered and parsed in a | ||
| programmatic way. The exact format should be defined through an OTEP and | ||
| incorporated into the specification. | ||
| 2. Stability requirements should be expanded to include more requirements around | ||
| documentation and where it's hosted, example code, performance benchmarks | ||
| (where applicable), implementation cookbooks, and other artifacts as | ||
| necessary. | ||
| 3. Stable distributions of OpenTelemetry should only enable stable components by | ||
| default. Users should be able to select a desired minimum stability level | ||
| with a documented and consistent configuration option. | ||
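
To make objectives 1 and 3 concrete, here is a minimal sketch of how a distribution might read per-component stability metadata and enable only components at or above a chosen minimum level. Everything in it is an assumption for illustration: the level names and their ordering, the `name` and `stability` fields, and the component names are all hypothetical, since the actual metadata format would be defined through an OTEP.

```python
from dataclasses import dataclass
from enum import IntEnum


class Stability(IntEnum):
    """Hypothetical ordered stability levels; the real set would come from the OTEP."""
    DEVELOPMENT = 0
    ALPHA = 1
    BETA = 2
    STABLE = 3


@dataclass(frozen=True)
class ComponentMetadata:
    """Hypothetical per-component metadata record."""
    name: str
    stability: Stability


def parse_metadata(raw: dict) -> ComponentMetadata:
    """Convert an already-parsed metadata document (a dict) into a record."""
    return ComponentMetadata(
        name=raw["name"],
        stability=Stability[raw["stability"].upper()],
    )


def enabled_components(components, minimum=Stability.STABLE):
    """Return the names of components at or above the minimum stability level."""
    return [c.name for c in components if c.stability >= minimum]


# Illustrative component names only; these are not real package names.
components = [
    parse_metadata({"name": "http-server-instrumentation", "stability": "stable"}),
    parse_metadata({"name": "mysql-receiver", "stability": "beta"}),
    parse_metadata({"name": "experimental-profiler", "stability": "development"}),
]

default_set = enabled_components(components)  # stable only, by default
opted_in = enabled_components(components, minimum=Stability.BETA)  # explicit opt-in
```

The default argument is what makes this "stable by default": a user has to pass a lower minimum explicitly to get anything that is not stable.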
|
|
||
| We appreciate that these would be a big change for maintainers, especially those | ||
| who have shipped v1+ of their libraries. We would deeply appreciate your | ||
| feedback on these objectives in the | ||
| [discussion](https://github.com/open-telemetry/community/discussions/3098). | ||
|
|
||
| ## 2. Instrumentation Stability and Semantic Conventions | ||
|
|
||
| In order to achieve our stability goals, we'll need to address semantic | ||
| convention stability and processes as well. | ||
|
|
||
| ### Semantic Convention Challenges | ||
|
|
||
| Semantic conventions evolve slowly and deliberately because they must work | ||
| across diverse telemetry systems. While OpenTelemetry is designed for | ||
| interconnected signals flowing together, users deploy many different storage and | ||
| analysis engines to consume this data. Each backend has its own constraints and | ||
| capabilities. Maintainers must balance competing concerns – keeping cardinality | ||
| manageable, ensuring attributes are useful but not overly specific, and making | ||
| conventions that work well regardless of where the data ends up. | ||
|
|
||
| The downside of this is that progress on semantic conventions can be slow, and | ||
| this slowness impacts all consumers of the conventions. Many instrumentation | ||
| libraries are currently stuck on pre-release versions because they depend on | ||
| experimental semantic conventions. Outside contributors are stuck between | ||
| emitting unspecified telemetry or trying to engage in the process, which | ||
| requires a long commitment. Finally, we're internally inconsistent in | ||
| instrumentation across the project; some libraries are mapped to conventions, | ||
| others exist independently of them. | ||
|
|
||
| ### Instrumentation and Convention Goals | ||
|
|
||
| Our goals here are designed to achieve three outcomes. | ||
|
|
||
| 1. Instrumentation stability should be decoupled from semantic convention | ||
| stability. We have a lot of stable instrumentation that is safe to run in | ||
| production, but has data that may change in the future. Users have told us | ||
| that conflating these two levels of stability is confusing and limits their | ||
| options. | ||
| 2. Semantic conventions should be more federated; OpenTelemetry should not be | ||
| the final word on what conventions exist, and instead should focus on | ||
| creating core conventions that can be extended and built upon. | ||
| 3. Semantic convention development and iteration should not be a blocker on | ||
| distribution maintainers. | ||
|
|
||
| To this end, we have a few recommendations we'd like to codify into the | ||
| specification. First, our position around instrumentation libraries in | ||
| OpenTelemetry is that they exist as concrete implementations of the semantic | ||
| conventions. This gives us a concrete target for 'first party' instrumentation | ||
| libraries that we wish to support in distributions. In addition, maintainers | ||
| should prioritize instrumentations that align to existing conventions and | ||
| deprioritize others. | ||
|
|
||
| Second, we'd like to make it easier for maintainers to ship stable | ||
| instrumentations. If an instrumentation's API surface is stable, then we believe | ||
| that semantic convention stability should not block the stabilization of that | ||
| instrumentation library. This means that we'll need to be thoughtful in | ||
| providing migration pathways for telemetry as operators upgrade to new major | ||
| versions of instrumentation libraries. | ||
|
|
||
| Finally, we'd like to make it easier for third parties to publish their own | ||
| semantic conventions by formalizing and stabilizing necessary parts of the | ||
| semantic conventions in order for other organizations to ship conventions for | ||
| their libraries, frameworks, tech stacks, etc. | ||
|
|
||
| In order to accomplish this, we're looking for feedback on several areas from | ||
| maintainers and end-users -- especially around the maturity/lifecycle of | ||
| semantic conventions, as well as what's missing in terms of federating semantic | ||
| conventions. We are more flexible on proposals here, but our outcomes aren't. | ||
| Remember, a core goal of the project is to encourage other libraries, tools, and | ||
| frameworks to | ||
| [natively adopt OpenTelemetry](https://www.youtube.com/watch?v=l8xiNOCIdLY) -- | ||
| semantic conventions are a big part of that. | ||
|
|
||
| ## 3. Confident and Stable Releases | ||
|
|
||
| ### The Challenge | ||
|
|
||
| OpenTelemetry isn't just a single binary deployed into a Kubernetes cluster. | ||
| Subtle differences in everything from configuration to telemetry output between | ||
| different versions of instrumentation libraries, Collector receivers, and SDKs | ||
| can cause a real headache for adopters. In addition, the rapid release cadence | ||
| of many components causes real difficulty for end users, especially around the | ||
| Collector. Enterprise deployments and upgrades are slow, deliberate things -- | ||
| teams simply do not have the bandwidth to validate and roll out changes at the | ||
| cadence we ship. | ||
|
|
||
| ### Release Goals and Strategy | ||
|
|
||
| Ultimately, our goal here is to make it easier for large organizations to deploy | ||
| OpenTelemetry. Please keep in mind that in many organizations, 'deployment' and | ||
| 'upgrades' are non-trivial tasks that involve many teams and stakeholders across | ||
| different business units or areas of responsibility, including security. | ||
|
|
||
| Our current proposal is the creation of a Release SIG that will be responsible | ||
| for creating a schedule of 'epoch' releases for OpenTelemetry. These epoch | ||
| versions would essentially be a manifest pointing to a tested, documented, and | ||
| stable set of components that meet project stability requirements. | ||
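
As a rough illustration of the "manifest" idea, the sketch below models an epoch as a named set of pinned component versions and checks whether every pinned component meets the stability requirements. The class names, fields, component names, and version strings are all hypothetical; the proposal does not define a manifest format, and a Release SIG may choose something quite different.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PinnedComponent:
    """One entry in a hypothetical epoch manifest."""
    name: str
    version: str
    meets_stability_requirements: bool


@dataclass(frozen=True)
class EpochManifest:
    """A hypothetical epoch: a named set of pinned component versions."""
    epoch: str
    components: tuple

    def blockers(self) -> list:
        """Names of pinned components that do not yet meet the requirements."""
        return [c.name for c in self.components if not c.meets_stability_requirements]

    def is_releasable(self) -> bool:
        """An epoch is releasable only when nothing blocks it."""
        return not self.blockers()


# Placeholder names and versions for illustration only.
manifest = EpochManifest(
    epoch="example-epoch",
    components=(
        PinnedComponent("example-sdk", "1.0.0", True),
        PinnedComponent("example-collector-receiver", "0.9.0", False),
    ),
)
```

Note that the manifest only points at versions that components already published on their own schedule, which matches the intent that epochs do not change how individual components are versioned or released.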
|
|
||
| This is not a trivial undertaking, to be clear. The stability efforts described above | ||
| will communicate many of the requirements these epoch releases must follow, after all. To our | ||
| maintainers and contributors, this effort is not intended to change how | ||
| individual components, SDKs, or APIs are versioned or released. Rather, we want | ||
| to provide tested, stable release combinations that work well together for end | ||
| users who need that stability. | ||
|
|
||
| For end-users, we would appreciate feedback on how you are currently managing | ||
| upgrades, what you'd like to see in this area, and your current challenges in | ||
| deployment and upgrade of both SDKs and Collectors. | ||
|
|
||
| ## Looking Forward | ||
|
|
||
| These changes are a reflection of the impact and importance of OpenTelemetry to | ||
| the cloud native software community. | ||
| [OpenTelemetry has been the second highest velocity project in the CNCF over the past few years](https://www.cncf.io/wp-content/uploads/2025/04/CNCF-Annual-Report-2024_v2.pdf), | ||
| and | ||
| [nearly 50% of surveyed cloud native end user companies have adopted the project](https://www.cncf.io/wp-content/uploads/2025/04/cncf_annual_survey24_031225a.pdf). | ||
| These changes set up the next chapter of our success and help OpenTelemetry | ||
| become truly ubiquitous. | ||
|
|
||
| Our mission as a project is not changing, but our priorities are. | ||
|
|
||
| 1. Stability and usability for all developers and users. | ||
| 2. Clear packaging, installation, and usage paths. | ||
| 3. Predictability and consistency. | ||
|
|
||
| For contributors and maintainers, what does this mean? We'll fast-track | ||
| proposals that align with these priorities. If there's feature work or | ||
| instrumentation that doesn't align to this, that's fine -- we'd ask that you | ||
| work on it outside the project and discover where our existing integration | ||
| points and patterns don't work. That's good feedback, and will help us improve | ||
| the specification for everyone. | ||
|
|
||
| For maintainers, contributors, and integrators -- we would appreciate your | ||
| feedback in | ||
| [this GitHub Discussion](https://github.com/open-telemetry/community/discussions/3098) | ||
| on the topics and proposals raised here. You can also send feedback on this | ||
| proposal to [[email protected]](mailto:[email protected]) or on | ||
| the CNCF Slack in the #opentelemetry channel. We also look forward to meeting | ||
| the cloud native community in person at KubeCon next week -- please join us | ||
| there with comments! | ||