Add experimental composite sampler #4714

Open: 7 commits to merge into base: main

@anuraaga anuraaga commented Aug 7, 2025

This is a reopening of open-telemetry/opentelemetry-python-contrib#3668 as was recommended to target this repo.

Looking at some other similar _ exports for experimental features, they seem to be complete concepts (e.g. logs), while this is a variant of an existing concept, the sampler. So I tried the name _sampling_experimental to clarify that it is an experimental part of sampling. Let me know any thoughts.

Description

Adds an implementation of consistent samplers

https://opentelemetry.io/docs/specs/otel/trace/tracestate-probability-sampling/
https://opentelemetry.io/docs/specs/otel/trace/sdk/#built-in-composablesamplers

Based on the Java implementation

https://github.com/open-telemetry/opentelemetry-java-contrib/tree/main/consistent-sampling/src/main/java/io/opentelemetry/contrib/sampler/consistent56

Some differences from Java

  • Names follow the published experimental spec rather than OTEP
  • Does not add non-standard samplers for now, e.g. ratelimited, rule based
  • Some trace state validation is assumed to be done by the SDK and invalid cases aren't tested (the API doesn't accept string but SDK TraceState)

/cc @xrmx @tammy-baylis-swi
/cc @PeterF778 as original author in Java if interested

Type of change

  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

How Has This Been Tested?

  • Unit tests

Does This PR Require a Core Repo Change?

  • No.

Checklist:

See contributing.md for styleguide, changelog guidelines, and more.

  • Followed the style guidelines of this project
  • Changelogs have been updated
  • Unit tests have been added
  • Documentation has been updated

@anuraaga anuraaga requested a review from a team as a code owner August 7, 2025 04:18
Contributor

@xrmx xrmx left a comment

Added a bunch of comments; I have yet to review the PR after looking at the doc here:
https://opentelemetry.io/docs/specs/otel/trace/tracestate-probability-sampling/

import random


def random_trace_id() -> int:
Contributor

Maybe you can use RandomIdGenerator.generate_trace_id from opentelemetry.sdk.trace.id_generator?

@@ -0,0 +1,37 @@
from typing import Optional, Sequence
Contributor

@xrmx xrmx Aug 11, 2025

In new files we tend to prefer adding from __future__ import annotations and using | None instead of Optional
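
As a minimal sketch of that convention (the function name below is hypothetical and not taken from the PR), postponed annotations let the | None form stand in for Optional:

```python
from __future__ import annotations

from typing import Sequence


def first_description(descriptions: Sequence[str] | None) -> str | None:
    """Return the first description, or None when nothing was provided."""
    if not descriptions:
        return None
    return descriptions[0]
```

With the __future__ import, annotations are not evaluated at definition time, so the | None syntax also works on Python versions older than 3.10.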

_threshold: int
_description: str

def __init__(self, sampling_probability: float):
Contributor

Thank you for moving this PR!

Please could you add docstrings to these new functions? They would be extra helpful while the consistent probabilistic sampler spec is still new.

For this one, you could copy relevant parts from Requirements for the basic samplers and potentially link to or mention the OTEPs.

Author

I have added docs to the public functions including those details from the sampler. Let me know if there's anything more we can add
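
For illustration only, a docstring on a probability-based sampler could spell out how the threshold is derived. The sketch below is an assumption based on the linked tracestate probability sampling spec (56 bits of randomness, rejection threshold written as hex with trailing zeros dropped); the function names are hypothetical and this is not the PR's actual code.

```python
MAX_THRESHOLD = 1 << 56  # the spec works with 56 bits of randomness


def probability_to_threshold(sampling_probability: float) -> int:
    """Convert a sampling probability into a rejection threshold.

    A span is kept when its 56-bit randomness value is greater than or
    equal to the threshold, so probability 1.0 maps to threshold 0.
    """
    if not 0.0 <= sampling_probability <= 1.0:
        raise ValueError("sampling_probability must be within [0, 1]")
    return MAX_THRESHOLD - round(sampling_probability * MAX_THRESHOLD)


def encode_threshold(threshold: int) -> str:
    """Encode a threshold as up to 14 hex digits without trailing zeros."""
    if not 0 <= threshold < MAX_THRESHOLD:
        raise ValueError("threshold must be within [0, 2**56)")
    if threshold == 0:
        return "0"
    return format(threshold, "014x").rstrip("0")
```

Under these assumptions a probability of 0.5 encodes as "8" and 0.25 as "c".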

sampled = False
adjusted_count_correct = False

decision = Decision.RECORD_AND_SAMPLE if sampled else Decision.DROP
Contributor

I don't see it in my reading of the spec so far, but were there any discussions elsewhere about the outcome Decision.RECORD_ONLY?

Yes. According to the current version of the specification, RECORD_ONLY decision is never generated by Consistent Probability Samplers. Any cases which require such decisions will require customizations.
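
A minimal sketch of that decision rule, assuming the spec's comparison of the 56-bit randomness value (taken here from the low bits of the trace ID) against the rejection threshold. The Decision enum is a local stand-in for the SDK's enum, and decide is a hypothetical helper rather than code from this PR:

```python
from enum import Enum

RANDOMNESS_MASK = (1 << 56) - 1  # low 56 bits of the trace ID


class Decision(Enum):
    """Local stand-in for the SDK's sampling Decision enum."""

    DROP = 0
    RECORD_ONLY = 1
    RECORD_AND_SAMPLE = 2


def decide(trace_id: int, threshold: int) -> Decision:
    """Keep the span when its randomness reaches the threshold."""
    randomness = trace_id & RANDOMNESS_MASK
    sampled = randomness >= threshold
    # Only two outcomes are possible here; RECORD_ONLY is never produced.
    return Decision.RECORD_AND_SAMPLE if sampled else Decision.DROP
```

A sampler that needs RECORD_ONLY would have to wrap or customize this logic, as noted above.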

Comment on lines 31 to 34
parent_span = get_current_span(parent_ctx)
parent_span_ctx = parent_span.get_span_context()
is_root = not parent_span_ctx.is_valid
if is_root:
Contributor

There is a chance of an AttributeError, so this could be more like the original ParentBased:

Suggested change
- parent_span = get_current_span(parent_ctx)
- parent_span_ctx = parent_span.get_span_context()
- is_root = not parent_span_ctx.is_valid
- if is_root:
+ parent_span_ctx = get_current_span(
+     parent_ctx
+ ).get_span_context()
+ if parent_span_ctx is not None and parent_span_ctx.is_valid:

Author

Shouldn't the span context be either a valid one, or the invalid one but never None? I see the parent based one seems to have that check, but it doesn't match the type annotations, so unless there's a known reason to doubt the typing, I like to follow them if possible
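
To illustrate the typing argument, here is a sketch with stand-in classes (these are not the real opentelemetry API types). It assumes the behaviour the annotations describe: a lookup with no active span yields an invalid span context rather than None, so checking is_valid alone is enough.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpanContext:
    """Stand-in span context; zero IDs mark it as invalid."""

    trace_id: int
    span_id: int

    @property
    def is_valid(self) -> bool:
        return self.trace_id != 0 and self.span_id != 0


INVALID_SPAN_CONTEXT = SpanContext(trace_id=0, span_id=0)


def get_parent_span_context(context: dict) -> SpanContext:
    """Return the parent's span context, or the invalid sentinel (never None)."""
    return context.get("current-span", INVALID_SPAN_CONTEXT)


def is_root(context: dict) -> bool:
    """A span is a root when its parent span context is not valid."""
    return not get_parent_span_context(context).is_valid
```

If the real API could return None despite its annotations, the extra None check from the suggestion would still be needed.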

Author

@anuraaga anuraaga left a comment

Thanks all. While looking at the tracestate handling section of the spec, I realized I had completely missed the SDK portions... So I have compared against those and renamed things to match. Notably, the word consistent isn't used much and the concept seems to be composite sampling.

One note: I made an editorial decision on the public API. One thing I noticed in sampling.py is that some samplers are constants while others are classes, which seemed inconsistent. Here I hid all the classes and only expose functions, so that singletons can be used or not as needed behind a consistent surface. Happy to go with anything the maintainers prefer though.

@anuraaga anuraaga changed the title Add experimental consistent sampler Add experimental composite sampler Aug 15, 2025
4 participants