@jj22ee
Contributor

@jj22ee jj22ee commented May 13, 2025

Which problem is this PR solving?

Short description of the changes

This PR is a followup to #2750

  • Add Sampling RuleCache
    • Caches a list of SamplingRuleAppliers, ordered by rule priority and then by rule name. Each Rule Applier corresponds to a Sampling Rule returned by GetSamplingRules. Each call to GetSamplingRules updates only the Rules whose properties have changed, which preserves the state of unchanged Rules. As a result, when Reservoir and Statistics are implemented in the Rules (later), they will persist for unchanged Rules.
    • The RuleCache determines the highest-priority Rule that a span matches (via the set of {ResourceAttributes,SpanAttributes}).
  • Update SamplingRuleApplier to perform Fixed Rate Sampling and to include a method that applies matching logic against a set of {ResourceAttributes,SpanAttributes}, using the wildcard and attribute matching from Utils
  • Initial class for FallbackSampler
  • Update X-Ray Remote Sampler to depend on the SamplingRuleApplier from RuleCache and on the FallbackSampler to perform shouldSample
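
For readers skimming the description, the state-preserving update described in the RuleCache bullet can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `RuleSketch`, `ApplierSketch`, `RuleCacheSketch`, `sameRule`, and the `matchedCount` field are hypothetical stand-ins, and the rule shape is reduced to three properties.

```typescript
// Hypothetical, simplified rule shape; the real SamplingRuleApplier holds more fields.
interface RuleSketch {
  RuleName: string;
  Priority: number;
  FixedRate: number;
}

class ApplierSketch {
  // Stand-in for per-rule state (reservoir, statistics) that should survive updates.
  public matchedCount = 0;
  public readonly rule: RuleSketch;

  constructor(rule: RuleSketch) {
    this.rule = rule;
  }
}

// Two rules count as unchanged only if every property is equal.
function sameRule(a: RuleSketch, b: RuleSketch): boolean {
  return a.RuleName === b.RuleName && a.Priority === b.Priority && a.FixedRate === b.FixedRate;
}

// Plain code-unit comparison, used as the tie-breaker between equal priorities.
function compareNames(a: string, b: string): number {
  return a < b ? -1 : a > b ? 1 : 0;
}

class RuleCacheSketch {
  private appliers: ApplierSketch[] = [];

  public updateRules(newRules: RuleSketch[]): void {
    // Index the current appliers by rule name for direct lookup.
    const oldByName = new Map<string, ApplierSketch>();
    for (const applier of this.appliers) {
      oldByName.set(applier.rule.RuleName, applier);
    }

    this.appliers = newRules
      .map(rule => {
        const old = oldByName.get(rule.RuleName);
        // Reuse the old applier, and therefore its state, when nothing changed.
        return old !== undefined && sameRule(old.rule, rule) ? old : new ApplierSketch(rule);
      })
      // Ascending priority first, then rule name.
      .sort(
        (a, b) => a.rule.Priority - b.rule.Priority || compareNames(a.rule.RuleName, b.rule.RuleName)
      );
  }

  public getAppliers(): readonly ApplierSketch[] {
    return this.appliers;
  }
}
```

In this sketch a rule that disappears from the new list is simply dropped, and a rule with any changed property gets a fresh applier, so only untouched rules carry their counters forward.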

@jj22ee jj22ee requested a review from a team as a code owner May 13, 2025 23:13
@github-actions github-actions bot requested a review from yiyuan-he May 13, 2025 23:13
@codecov

codecov bot commented May 13, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.80%. Comparing base (2d4092e) to head (22f7c7f).
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2824      +/-   ##
==========================================
+ Coverage   89.76%   89.80%   +0.03%     
==========================================
  Files         187      187              
  Lines        9149     9149              
  Branches     1885     1885              
==========================================
+ Hits         8213     8216       +3     
+ Misses        936      933       -3     

see 2 files with indirect coverage changes

@jj22ee
Contributor Author

jj22ee commented May 14, 2025

cc @lukeina2z, who is familiar with Sampling, for review. Currently everything is implemented except for updating Sampling Targets and the Rate Limiting (Reservoir) Sampler.

}

public updateRules(newRuleAppliers: SamplingRuleApplier[]): void {
  const oldRuleAppliersMap: { [key: string]: SamplingRuleApplier } = {};
Contributor

nit: You could use a Map<string, SamplingRuleApplier> instead of a plain object here for better perf.

Contributor Author

Thanks! I incorporated your suggestion.

  attributes: Attributes
): SamplingRuleApplier | undefined {
  return this.ruleAppliers.find(
    rule =>
Contributor

Could this, in some cases, find the default rule before a higher priority rule? Will this matter?

Contributor Author

This use of find() relies on the this.ruleAppliers list always being sorted by priority (ascending integers), where the default rule is always hardcoded by AWS X-Ray to have the last priority. The sorting can be assumed because the list is sorted here whenever the Sampling rules are updated. This assumption is important, so I've added a comment.
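
To make the assumption concrete, here is a small sketch of why find() is only safe on a sorted list. It is not the code in this PR: `MatchableRule`, `sortRules`, `firstMatch`, and the sample rules are hypothetical, and the priority value 10000 for the default rule is used purely for illustration.

```typescript
type Attrs = Record<string, string>;

// Hypothetical, reduced rule shape used only for this illustration.
interface MatchableRule {
  name: string;
  priority: number;
  matches: (attrs: Attrs) => boolean;
}

// Ascending priority, then name, returned as a new array.
function sortRules(rules: MatchableRule[]): MatchableRule[] {
  return [...rules].sort(
    (a, b) => a.priority - b.priority || (a.name < b.name ? -1 : a.name > b.name ? 1 : 0)
  );
}

// find() returns the first match, so the result depends on list order.
function firstMatch(rules: MatchableRule[], attrs: Attrs): MatchableRule | undefined {
  return rules.find(rule => rule.matches(attrs));
}

// The default rule matches everything and is deliberately listed first here.
const unsortedRules: MatchableRule[] = [
  { name: 'Default', priority: 10000, matches: () => true },
  { name: 'checkout', priority: 10, matches: attrs => attrs['service.name'] === 'checkout' },
];
```

Calling `firstMatch(unsortedRules, ...)` picks the catch-all default rule for every span, which is the case the reviewer asked about; calling it on `sortRules(unsortedRules)` lets the more specific rule win and leaves the default rule as the last resort.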

// If scheme is not present, assume it's bad instrumentation and ignore.
if (schemeEndIndex > -1) {
  // urlparse("scheme://netloc/path;parameters?query#fragment")
  httpTarget = new URL(httpUrl).pathname;
Contributor

Ensure that this never receives a malformed httpUrl; new URL() will throw if it does.

Contributor Author

Nice catch, I chose to rely on a try...catch to handle this.
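
A guarded version of this parsing step might look like the sketch below. The helper name `extractHttpTarget` and the choice to return `undefined` on failure are assumptions made for illustration; the thread does not say what the merged code does in the catch branch.

```typescript
// Hypothetical helper; returns the URL path, or undefined when it cannot be determined.
function extractHttpTarget(httpUrl: string): string | undefined {
  const schemeEndIndex = httpUrl.indexOf('://');
  // If scheme is not present, assume it's bad instrumentation and ignore.
  if (schemeEndIndex === -1) {
    return undefined;
  }
  try {
    // The URL constructor throws a TypeError on malformed input.
    return new URL(httpUrl).pathname;
  } catch {
    // Assumption for this sketch: treat an unparsable URL the same as a missing one.
    return undefined;
  }
}
```

With this shape, a malformed `http.url` attribute degrades to "no target to match on" instead of an exception escaping from the sampling path.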

import { expect } from 'expect';
import { FallbackSampler } from '../src/fallback-sampler';

describe('FallBackSampler', () => {
Contributor

Should we add additional tests here? For example:

  • Sampling rate verification - test that 5% of traces are sampled over many iterations
  • Deterministic behavior - test that the same trace id always gets the same sampling decision
  • Parameter forwarding - test that all parameters are correctly passed to the underlying sampler

Contributor Author

Currently, the fallback sampler is just using the TraceIdRatioBasedSampler, which is covered under: https://github.com/open-telemetry/opentelemetry-js/blob/main/packages/opentelemetry-sdk-trace-base/test/common/sampler/TraceIdRatioBasedSampler.test.ts.

The next PR will add the Rate Limiter logic to the fallback sampler, where the Sampling Reservoir functionality will be tested.
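
For context on the first two suggested tests, the sketch below shows the kind of property they would check. It uses a self-contained stand-in for a trace-ID ratio decision rather than the OpenTelemetry TraceIdRatioBasedSampler itself; `shouldSampleSketch`, `randomTraceId`, and `observedRate` are hypothetical names, and the folding scheme is an assumption chosen for illustration.

```typescript
// Stand-in decision function: NOT the OpenTelemetry implementation.
// It folds a hex trace ID into an unsigned 32-bit value and compares it to a threshold.
function shouldSampleSketch(traceId: string, ratio: number): boolean {
  let folded = 0;
  for (let i = 0; i < traceId.length; i += 8) {
    // XOR each 8-hex-character chunk, then coerce back to an unsigned 32-bit integer.
    folded = (folded ^ parseInt(traceId.slice(i, i + 8), 16)) >>> 0;
  }
  return folded < Math.floor(ratio * 0xffffffff);
}

// Builds a random 32-hex-character trace ID from four 32-bit chunks.
function randomTraceId(): string {
  let id = '';
  for (let i = 0; i < 4; i++) {
    id += Math.floor(Math.random() * 0x100000000).toString(16).padStart(8, '0');
  }
  return id;
}

// Fraction of randomly generated trace IDs that the stand-in samples.
function observedRate(ratio: number, iterations: number): number {
  let sampled = 0;
  for (let i = 0; i < iterations; i++) {
    if (shouldSampleSketch(randomTraceId(), ratio)) {
      sampled++;
    }
  }
  return sampled / iterations;
}
```

Because the decision depends only on the trace ID and the ratio, repeated calls with the same ID agree by construction, and a rate check only needs a tolerance wide enough to absorb random variation.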

@yiyuan-he
Contributor

It would be helpful to add some documentation explaining the fallback behavior when remote rules are unavailable.

@jj22ee
Contributor Author

jj22ee commented Jun 12, 2025

The fallback behaviour is currently documented in:

  • FallbackSampler class comment
  • FallbackSampler's toString() method

@yiyuan-he yiyuan-he added the has:owner-approval Approved by Component Owner label Jun 16, 2025
Member

@pichlermarc pichlermarc left a comment

proxy-approving for @yiyuan-he (component-owner)

@pichlermarc pichlermarc enabled auto-merge (squash) June 30, 2025 08:07
@pichlermarc pichlermarc merged commit 0baef34 into open-telemetry:main Jun 30, 2025
20 checks passed
Labels

has:owner-approval Approved by Component Owner

5 participants