@@ -399,40 +402,78 @@ The default sampler is `ParentBased(root=AlwaysOn)`.
399
402
400
403
#### TraceIdRatioBased
401
404
402
-
* The `TraceIdRatioBased` MUST ignore the parent `SampledFlag`. To respect the
403
-
parent `SampledFlag`, the `TraceIdRatioBased` should be used as a delegate of
404
-
the `ParentBased` sampler specified below.
405
-
* Description MUST return a string of the form `"TraceIdRatioBased{RATIO}"`
406
-
with `RATIO` replaced with the Sampler instance's trace sampling ratio
407
-
represented as a decimal number. The precision of the number SHOULD follow
408
-
implementation language standards and SHOULD be high enough to identify when
409
-
Samplers have different ratios. For example, if a TraceIdRatioBased Sampler
410
-
had a sampling ratio of 1 to every 10,000 spans it COULD return
411
-
`"TraceIdRatioBased{0.000100}"` as its description.
412
-
413
-
TODO: Add details about how the `TraceIdRatioBased` is implemented as a function
414
-
of the `TraceID`. [#1413](https://github.com/open-telemetry/opentelemetry-specification/issues/1413)
415
-
416
-
##### Requirements for `TraceIdRatioBased` sampler algorithm
417
-
418
-
* The sampling algorithm MUST be deterministic. A trace identified by a given
419
-
`TraceId` is sampled or not independent of language, time, etc. To achieve this,
420
-
implementations MUST use a deterministic hash of the `TraceId` when computing
421
-
the sampling decision. By ensuring this, running the sampler on any child `Span`
422
-
will produce the same decision.
423
-
* A `TraceIdRatioBased` sampler with a given sampling rate MUST also sample all
424
-
traces that any `TraceIdRatioBased` sampler with a lower sampling rate would
425
-
sample. This is important when a backend system may want to run with a higher
426
-
sampling rate than the frontend system, this way all frontend traces will
427
-
still be sampled and extra traces will be sampled on the backend only.
428
-
***WARNING:** Since the exact algorithm is not specified yet (see TODO above),
429
-
there will probably be changes to it in any language SDK once it is, which
430
-
would break code that relies on the algorithm results.
431
-
Only the configuration and creation APIs can be considered stable.
432
-
It is recommended to use this sampler algorithm only for root spans
433
-
(in combination with [`ParentBased`](#parentbased)) because different language
434
-
SDKs or even different versions of the same language SDKs may produce inconsistent
435
-
results for the same input.
405
+
**Status**: [Development](../document-status.md)
406
+
407
+
The `TraceIdRatioBased` sampler implements simple, ratio-based probability sampling using randomness features specified in the [W3C Trace Context Level 2][W3CCONTEXTMAIN] Candidate Recommendation.
408
+
OpenTelemetry follows W3C Trace Context Level 2, which specifies 56 bits of randomness,
409
+
and [specifies how to make consistent probability sampling decisions using 56 bits of randomness][CONSISTENTSAMPLING].
410
+
411
+
The `TraceIdRatioBased` sampler MUST ignore the parent `SampledFlag`.
412
+
To respect the parent `SampledFlag`, see the `ParentBased` sampler specified below.
413
+
414
+
Note that the "ratio-based" part of this Sampler's name implies that
415
+
it makes a probability decision directly from the TraceID, even though
416
+
it was not originally specified in an exact way. In the present
417
+
specification, the Sampler decision is more nuanced: only a portion of
418
+
the identifier is used, after checking whether the OpenTelemetry
419
+
TraceState field contains an explicit randomness value.
The `TraceIdRatioBased` sampler is typically configured using a 32-bit or 64-bit floating point number to express the sampling ratio.
426
+
The minimum valid sampling ratio is `2^-56`, and the maximum valid sampling ratio is 1.0.
427
+
From an input sampling ratio, a rejection threshold value is calculated; see [consistent-probability sampler requirements][CONSISTENTSAMPLING] for details on converting sampling ratios into thresholds with variable precision.
Given a Sampler configured with a sampling threshold `T` and Context with randomness value `R` (typically, the 7 rightmost bytes of the trace ID), when `ShouldSample()` is called, it uses the expression `R >= T` to decide whether to return `RECORD_AND_SAMPLE` or `DROP`.
434
+
435
+
* If the randomness value (R) is greater than or equal to the rejection threshold (T), that is, when (R >= T), return `RECORD_AND_SAMPLE`; otherwise, return `DROP`.
436
+
* When (R >= T), the OpenTelemetry TraceState SHOULD be modified to include the key-value `th:T` for rejection threshold value (T), as specified for the [OpenTelemetry TraceState `th` sub-key][TRACESTATEHANDLING].
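
The decision rule above can be sketched as follows. This is a minimal illustration, not an SDK implementation: the function names are hypothetical, the trace ID is assumed to be a 32-digit hexadecimal string with no explicit `rv` value present, and the `th` encoding is assumed to drop trailing zeros (the reverse of extending the value to 14 digits).

```python
RANDOMNESS_BITS = 56
MAX_ADJUSTED_COUNT = 1 << RANDOMNESS_BITS  # 2^56 distinct randomness values


def randomness_from_trace_id(trace_id_hex: str) -> int:
    """Return R, the 7 rightmost bytes (14 hex digits) of the trace ID."""
    if len(trace_id_hex) != 32:
        raise ValueError("trace ID must be 32 hexadecimal digits")
    return int(trace_id_hex[-14:], 16)


def encode_threshold(threshold: int) -> str:
    """Encode T as a `th` value (assumption: trailing zeros are dropped)."""
    if not 0 <= threshold < MAX_ADJUSTED_COUNT:
        raise ValueError("threshold must fit in 56 bits")
    digits = format(threshold, "014x").rstrip("0")
    return digits or "0"


def should_sample(threshold: int, randomness: int):
    """Apply R >= T; return the decision and the `th` entry to record."""
    if randomness >= threshold:
        return "RECORD_AND_SAMPLE", "th:" + encode_threshold(threshold)
    return "DROP", None
```

For instance, under these assumptions a trace ID ending in `ce929d0e0e4736` is sampled at threshold `0xc0000000000000` and dropped at threshold `0xd0000000000000`.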
@@ -510,31 +551,31 @@ For root span contexts, the SDK SHOULD implement the TraceID randomness requirem
510
551
511
552
For root span contexts, the SDK SHOULD set the `Random` flag in the trace flags when it generates TraceIDs that meet the [W3C Trace Context Level 2 randomness requirements][W3CCONTEXTTRACEID].
512
553
513
-
#### Explicit trace randomness
554
+
#### Explicit randomness
514
555
515
-
Explicit trace randomness is a mechanism that enables API users and
556
+
Explicit randomness is a mechanism that enables API users and
516
557
SDK authors to control trace randomness. The following recommendation
517
558
applies to Trace SDKs that have disregarded the recommendation on
518
559
TraceID randomness, above. It has two parts.
519
560
520
-
##### Do not overwrite explicit trace randomness
561
+
##### Do not overwrite explicit randomness
521
562
522
563
API users control the initial TraceState of a root span, so they can
523
-
provide explicit trace randomness for a trace by defining the [`rv`
564
+
provide explicit randomness for a trace by defining the [`rv`
524
565
sub-key of the OpenTelemetry TraceState][OTELRVALUE]. SDKs and Samplers
525
-
MUST NOT overwrite explicit trace randomness in an OpenTelemetry TraceState
566
+
MUST NOT overwrite explicit randomness in an OpenTelemetry TraceState
526
567
value.
527
568
528
-
##### Root samplers set explicit trace randomness for non-random TraceIDs
569
+
##### Root samplers set explicit randomness for non-random TraceIDs
529
570
530
571
When the SDK has generated a TraceID that does not meet the [W3C Trace
* [Sampling threshold value `th`](#sampling-threshold-value-th)
19
+
* [Explicit randomness value `rv`](#explicit-randomness-value-rv)
20
+
21
+
<!-- tocstop -->
22
+
23
+
</details>
24
+
9
25
In alignment with the [TraceContext](https://www.w3.org/TR/trace-context/) specification, this section uses the
10
26
Augmented Backus-Naur Form (ABNF) notation of [RFC5234](https://www.w3.org/TR/trace-context/#bib-rfc5234),
11
27
including the DIGIT rule in that document.
@@ -88,18 +104,68 @@ if ok {
88
104
89
105
The following values have been defined by OpenTelemetry.
90
106
91
-
### Explicit randomness value `rv`
107
+
### Sampling threshold value `th`
92
108
93
-
The OpenTelemetry TraceState `rv` sub-key defines an alternative source of randomness called the _explicit randomness value_.
94
-
Values of `rv` MUST be exactly 14 lower-case hexadecimal digits:
109
+
The OpenTelemetry TraceState `th` sub-key defines a sampling threshold, which conveys effective sampling probability.
110
+
Valid values of the `th` sub-key contain between 1 and 14 lowercase hexadecimal digits.
95
111
96
112
```
97
113
hexdigit = DIGIT / %x61-66 ; 0-9 or lowercase a-f
98
114
```
99
115
116
+
To decode the threshold from the OpenTelemetry TraceState `th` value, first extend the value with trailing zeros to make 14 digits.
117
+
Then, parse the 14-digit value as a 56-bit unsigned hexadecimal number, yielding a rejection threshold.
118
+
119
+
OpenTelemetry defines consistent sampling in terms of a 56-bit trace randomness value compared with the 56-bit rejection threshold.
120
+
When the randomness value is less than the rejection threshold, the span is not sampled.
121
+
122
+
The threshold value `0` indicates that no spans are being rejected, corresponding with 100% sampling.
123
+
For example, the following TraceState value identifies a trace with 100% sampling:
124
+
125
+
```
126
+
tracestate: ot=th:0
127
+
```
128
+
129
+
To calculate sampling probability from the rejection threshold, define a constant `MaxAdjustedCount` equal to 2^56, the number of distinct 56-bit values.
130
+
The sampling probability is defined as:
131
+
132
+
```
133
+
Probability = (MaxAdjustedCount - Threshold) / MaxAdjustedCount
134
+
```
135
+
136
+
Threshold can be calculated from Probability:
137
+
138
+
```
139
+
Threshold = MaxAdjustedCount * (1 - Probability)
140
+
```
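
The two formulas above can be checked with exact arithmetic. The sketch below is illustrative only: the function names are hypothetical, and truncating to an integer is an assumed rounding choice, since the precision rules for thresholds are defined elsewhere.

```python
from fractions import Fraction

MAX_ADJUSTED_COUNT = 2 ** 56  # number of distinct 56-bit values


def probability_from_threshold(threshold: int) -> Fraction:
    """Probability = (MaxAdjustedCount - Threshold) / MaxAdjustedCount."""
    return Fraction(MAX_ADJUSTED_COUNT - threshold, MAX_ADJUSTED_COUNT)


def threshold_from_probability(probability: Fraction) -> int:
    """Threshold = MaxAdjustedCount * (1 - Probability), truncated (assumption)."""
    if not Fraction(1, MAX_ADJUSTED_COUNT) <= probability <= 1:
        raise ValueError("probability must be between 2^-56 and 1")
    return int(MAX_ADJUSTED_COUNT * (1 - probability))
```

Using `Fraction` rather than floating point keeps the round trip exact for probabilities such as 1/4, which maps to the threshold `0xc0000000000000`.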
141
+
142
+
In sampling, the term _adjusted count_ refers to the effective number of items represented by a sampled item of telemetry.
143
+
The adjusted count of a span is the inverse of its sampling probability and can be derived from the threshold as follows.
For example, here is a W3C TraceState value including an OpenTelemetry sampling threshold value:
150
+
151
+
```
152
+
tracestate: ot=th:c
153
+
```
154
+
155
+
This corresponds with 25% sampling probability, as follows:
156
+
157
+
- The hexadecimal value `c` is extended to `c0000000000000` for 56 bits
158
+
- The rejection threshold is `0xc0000000000000`; as a fraction of `0x100000000000000`, this means 75% of randomness values are rejected
159
+
- The sampling probability is 25%.
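
The worked example above can be reproduced with a short sketch. The function names here are hypothetical and the input validation is an assumption about how a parser might reject malformed values.

```python
MAX_ADJUSTED_COUNT = 2 ** 56
LOWER_HEX = set("0123456789abcdef")


def decode_threshold(th_value: str) -> int:
    """Extend `th` with trailing zeros to 14 digits, then parse as hex."""
    if not 1 <= len(th_value) <= 14 or not set(th_value) <= LOWER_HEX:
        raise ValueError("th must be 1 to 14 lowercase hexadecimal digits")
    return int(th_value.ljust(14, "0"), 16)


def sampling_probability(th_value: str) -> float:
    """Fraction of randomness values that are not rejected."""
    threshold = decode_threshold(th_value)
    return (MAX_ADJUSTED_COUNT - threshold) / MAX_ADJUSTED_COUNT


def adjusted_count(th_value: str) -> float:
    """Inverse of the sampling probability."""
    return 1.0 / sampling_probability(th_value)
```

With `th:c`, the decoded threshold is `0xc0000000000000`, the sampling probability is 0.25, and each sampled span represents an adjusted count of 4.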
160
+
161
+
### Explicit randomness value `rv`
162
+
163
+
The OpenTelemetry TraceState `rv` sub-key defines an alternative source of randomness called the _explicit randomness value_.
164
+
Values of `rv` MUST be exactly 14 lower-case hexadecimal digits:
165
+
100
166
The explicit randomness value is meant to be used instead of extracting randomness from TraceIDs, therefore it contains the same number of bits as W3C Trace Context Level 2 recommends for TraceIDs.
101
167
102
-
Lowercase hexadecimal digits are specified to enable direct lexicographical comparison between a sampling thresohld and either the TraceID (as it appears in the `traceparent` header) or the explicit randomness value (as it appears in the `tracestate` header).
168
+
Lowercase hexadecimal digits are specified to enable direct lexicographical comparison between a sampling threshold and either the TraceID (as it appears in the `traceparent` header) or the explicit randomness value (as it appears in the `tracestate` header).
103
169
104
170
Explicit randomness values are meant to propagate through [span contexts](../context/README.md) unmodified.
105
171
Explicit randomness values SHOULD NOT be erased from the OpenTelemetry TraceState or modified once associated with a new TraceID, so that sampling decisions made using the explicit randomness value are consistent across signals.
@@ -110,4 +176,5 @@ For example, here is a W3C TraceState value including an OpenTelemetry explicit
110
176
tracestate: ot=rv:6e6d1a75832a2f
111
177
```
112
178
113
-
This corresponds with the explicit randomness value, an unsigned integer value, of 0x6e6d1a75832a2f. This randomness value is meant to be used instead of the least-significant 56 bits of the TraceID. In this example, the 56-bit fraction (i.e., 0x6e6d1a75832a2f / 0x100000000000000 = 43.1%) supports making a consistent positive sampling decision at probabilities ranging from 56.9% through 100% (i.e., rejection thresohld values 0x6e6d1a75832a2f through 0), the same as for a hexadecimal TraceID ending in 6e6d1a75832a2f without explicit randomness value.
179
+
This corresponds with the explicit randomness value, an unsigned integer value, of 0x6e6d1a75832a2f. This randomness value is meant to be used instead of the least-significant 56 bits of the TraceID.
180
+
In this example, the 56-bit fraction (i.e., 0x6e6d1a75832a2f / 0x100000000000000 = 43.1%) supports making a consistent positive sampling decision at probabilities ranging from 56.9% through 100% (i.e., rejection threshold values 0x6e6d1a75832a2f through 0), the same as for a hexadecimal TraceID ending in 6e6d1a75832a2f without explicit randomness value.
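
As a rough illustration of the example above, the sketch below parses the `rv` value and compares it against a threshold using plain string comparison, which agrees with numeric comparison because both sides are 14 lowercase hexadecimal digits. The function names are hypothetical.

```python
LOWER_HEX = set("0123456789abcdef")


def parse_rv(rv_value: str) -> int:
    """Parse an explicit randomness value of exactly 14 lowercase hex digits."""
    if len(rv_value) != 14 or not set(rv_value) <= LOWER_HEX:
        raise ValueError("rv must be exactly 14 lowercase hexadecimal digits")
    return int(rv_value, 16)


def is_sampled(rv_value: str, th_value: str) -> bool:
    """Lexicographic R >= T after padding `th` to 14 digits with zeros."""
    parse_rv(rv_value)  # validate the randomness value first
    return rv_value >= th_value.ljust(14, "0")


rv = "6e6d1a75832a2f"
fraction = parse_rv(rv) / 2 ** 56  # roughly 0.431
```

Here `is_sampled(rv, "6e6d1a75832a2f")` and `is_sampled(rv, "0")` are both true, while a slightly higher threshold such as `6e6d1a75832a3` rejects the trace.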