Skip to content

Commit 1dd239c

Browse files
Yun-KimZStriker19
andauthored
refactor(tracer): move span sampling to end of start_span (#5352)
Currently, the tracer's `_start_span()` method performs sampling rules before all tags are set on the span in question. This was indirectly the cause of the bug fixed in #5339, since the sampling happened before all tags were set. This PR moves the sampling logic to the end of the method after all other tags have been set. In doing so, this also removes the duplicate code introduced in #5339. No new tests are required since there is no new functionality introduced, just the order of when sampling decisions are made in `start_span()` has been shifted. ## Checklist - [x] Change(s) are motivated and described in the PR description. - [x] Testing strategy is described if automated tests are not included in the PR. - [x] Risk is outlined (performance impact, potential for breakage, maintainability, etc). - [x] Change is maintainable (easy to change, telemetry, documentation). - [x] [Library release note guidelines](https://ddtrace.readthedocs.io/en/stable/contributing.html#Release-Note-Guidelines) are followed. - [x] Documentation is included (in-code, generated user docs, [public corp docs](https://github.com/DataDog/documentation/)). - [x] Author is aware of the performance implications of this PR as reported in the benchmarks PR comment. ## Reviewer Checklist - [x] Title is accurate. - [x] No unnecessary changes are introduced. - [x] Description motivates each change. - [x] Avoids breaking [API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces) changes unless absolutely necessary. - [x] Testing strategy adequately addresses listed risk(s). - [x] Change is maintainable (easy to change, telemetry, documentation). - [x] Release note makes sense to a user of the library. - [x] Reviewer is aware of, and discussed the performance implications of this PR as reported in the benchmarks PR comment. --------- Co-authored-by: Zachary Groves <[email protected]>
1 parent 554acdb commit 1dd239c

File tree

1 file changed

+28
-29
lines changed

1 file changed

+28
-29
lines changed

ddtrace/tracer.py

Lines changed: 28 additions & 29 deletions
Original file line numberDiff line numberDiff line change
@@ -695,34 +695,6 @@ def _start_span(
695695
if config.report_hostname:
696696
span.set_tag_str(HOSTNAME_KEY, hostname.get_hostname())
697697

698-
if config.env:
699-
span.set_tag_str(ENV_KEY, config.env) # env tag is used by _sampler.sample
700-
span.sampled = self._sampler.sample(span)
701-
# Old behavior
702-
# DEV: The new sampler sets metrics and priority sampling on the span for us
703-
if not isinstance(self._sampler, DatadogSampler):
704-
if span.sampled:
705-
# When doing client sampling in the client, keep the sample rate so that we can
706-
# scale up statistics in the next steps of the pipeline.
707-
if isinstance(self._sampler, RateSampler):
708-
span.set_metric(SAMPLE_RATE_METRIC_KEY, self._sampler.sample_rate)
709-
710-
if self._priority_sampler:
711-
# At this stage, it's important to have the service set. If unset,
712-
# priority sampler will use the default sampling rate, which might
713-
# lead to oversampling (that is, dropping too many traces).
714-
if self._priority_sampler.sample(span):
715-
context.sampling_priority = AUTO_KEEP
716-
else:
717-
context.sampling_priority = AUTO_REJECT
718-
else:
719-
if self._priority_sampler:
720-
# If dropped by the local sampler, distributed instrumentation can drop it too.
721-
context.sampling_priority = AUTO_REJECT
722-
else:
723-
# We must always mark the span as sampled so it is forwarded to the agent
724-
span.sampled = True
725-
726698
if not span._parent:
727699
span.set_tag_str("runtime-id", get_runtime_id())
728700
span._metrics[PID] = self._pid
@@ -753,12 +725,39 @@ def _start_span(
753725
if service and service not in self._services and self._is_span_internal(span):
754726
self._services.add(service)
755727

728+
if not trace_id:
729+
span.sampled = self._sampler.sample(span)
730+
# Old behavior
731+
# DEV: The new sampler sets metrics and priority sampling on the span for us
732+
if not isinstance(self._sampler, DatadogSampler):
733+
if span.sampled:
734+
# When doing client sampling in the client, keep the sample rate so that we can
735+
# scale up statistics in the next steps of the pipeline.
736+
if isinstance(self._sampler, RateSampler):
737+
span.set_metric(SAMPLE_RATE_METRIC_KEY, self._sampler.sample_rate)
738+
739+
if self._priority_sampler:
740+
# At this stage, it's important to have the service set. If unset,
741+
# priority sampler will use the default sampling rate, which might
742+
# lead to oversampling (that is, dropping too many traces).
743+
if self._priority_sampler.sample(span):
744+
context.sampling_priority = AUTO_KEEP
745+
else:
746+
context.sampling_priority = AUTO_REJECT
747+
else:
748+
if self._priority_sampler:
749+
# If dropped by the local sampler, distributed instrumentation can drop it too.
750+
context.sampling_priority = AUTO_REJECT
751+
else:
752+
# We must always mark the span as sampled so it is forwarded to the agent
753+
span.sampled = True
754+
756755
# Only call span processors if the tracer is enabled
757756
if self.enabled:
758757
for p in self._span_processors:
759758
p.on_span_start(span)
760-
761759
self._hooks.emit(self.__class__.start_span, span)
760+
762761
return span
763762

764763
start_span = _start_span

0 commit comments

Comments
 (0)