[DRAFT] Add deliveryTraceparent annotation for external trace context#9377
[DRAFT] Add deliveryTraceparent annotation for external trace context#9377ci-operator wants to merge 6 commits intotektoncd:mainfrom
Conversation
Add support for the `tekton.dev/deliveryTraceparent` annotation to allow external systems to provide a W3C traceparent that becomes the parent span for PipelineRun execution traces. This enables distributed tracing across system boundaries - when an external system creates a PipelineRun, it can set the deliveryTraceparent annotation with its current trace context, and Tekton will adopt that as the parent span rather than creating a new root span. The annotation is only used when: - No SpanContext exists in status (not already tracing) - No pipelinerunSpanContext annotation exists (takes precedence) - The traceparent value is valid W3C format ```release-note PipelineRuns now accept a `tekton.dev/deliveryTraceparent` annotation containing a W3C traceparent string. When set, Tekton adopts the provided trace context as the parent span for execution traces, enabling distributed tracing across system boundaries. The annotation is only used when no existing trace context is present. ``` Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
/kind feature |
|
Thank you for your PR. |
Unlike pipelinerunSpanContext which adopts the span directly, deliveryTraceparent now creates a child span under the external parent. This ensures sibling PipelineRuns sharing the same delivery parent are distinguishable in traces. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
@afrittoli excellent question - there is (described) a subtle but important distinction between deliveryTraceparent and the pipelinerunSpanContext (other than the JSON encoding). However, I failed to properly implement the distinction as documented. I've updated phrasing to be more clear than "adopt as parent" - what this feature now does with the deliveryTraceparent is create all pipelinerunSpanContext traces as children under that same delivery trace, potentially as siblings of one another. Using pipelinerunSpanContext in this manner would simply allow that trace context to become the parent of multiple interleaved taskrun traces (which doesn't solve our problem and makes trace navigation difficult). I've fixed up the code and clarified the documentation - to make this a bit more clear, the following is what would occur to a trace by using the pipelinerunSpanContext as a parent for a delivery span (and how this first commit was implemented): DeliverySpan (traceparent: 00-aaa...aaa-bbb...bbb-01) Whereas, now, the following tree is what we intend to create with PipelineRun span contexts as siblings of one another. DeliverySpan (traceparent: 00-aaa...aaa-bbb...bbb-01) |
docs/developers/tracing.md
Outdated
| The `tekton.dev/pipelinerunSpanContext` annotation is used for internal Tekton | ||
| trace propagation and adopts the provided span context directly. In contrast, | ||
| `tekton.dev/deliveryTraceparent` creates a **new child span** under the external | ||
| parent. This distinction is important when multiple PipelineRuns share the same | ||
| delivery parent (e.g., multiple integration tests triggered by a single event) - | ||
| each PipelineRun will have its own execution span, making them distinguishable | ||
| in traces. |
There was a problem hiding this comment.
NIT: adding diagrams here like those you used in #9377 (comment) would be nice indeed :)
There was a problem hiding this comment.
Added diagrams like in the comment above with updated documentation, now also aligning with proper remote-parentage phrasing.
afrittoli
left a comment
There was a problem hiding this comment.
Thanks for this, see my comments.
I think the most critical question is about the span context.
| logger.Warnf("invalid delivery traceparent annotation value: %s", deliveryTraceparent) | ||
| } else { | ||
| ctxWithTrace, span := tracerProvider.Tracer(TracerName).Start(parentCtx, "PipelineRun:Reconciler") | ||
| defer span.End() |
There was a problem hiding this comment.
Do you mean the span to end at the end of this reconcile cycle or at the end of the PipelineRun?
There was a problem hiding this comment.
While this behavior is meant to mimic the root span creation (which similarly ends immediately), I'm now recognizing that this is not the behavior we need for propagating an existing remote span context - in effect, we shouldn't create a setup span at all.
I've added a commit to fix this, following standard OTel mechanics for adopting a remote parent instead of creating a new span.
| expectSpanContextStatus: true, | ||
| expectValidSpanContext: true, | ||
| }, { | ||
| name: "pipelinerun-spancontext-takes-precedence-over-delivery-traceparent", |
There was a problem hiding this comment.
NIT: Maybe add test that checks that an already-populated status.SpanContext takes precedence over both annotations.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ations Add test case verifying that an already-populated Status.SpanContext takes precedence over both pipelinerunSpanContext and deliveryTraceparent annotations. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
|
||
| logger.Debugf("adopted delivery traceparent as remote parent: %s", deliveryTraceparent) | ||
| pr.Status.SpanContext = spanContext | ||
| return parentCtx |
There was a problem hiding this comment.
This will return the context with delivery as a parent.
I think we should allow it to continue with normal Tekton root span.
The idea is we must store delivery traceparent as metadata for external controllers and to preserve the execution tracing independence.
| SpanContextAnnotation = "tekton.dev/pipelinerunSpanContext" | ||
| // TaskRunSpanContextAnnotation is the name of the Annotation used for propagating SpanContext to TaskRun | ||
| TaskRunSpanContextAnnotation = "tekton.dev/taskrunSpanContext" | ||
| // DeliveryTraceparentAnnotation is the name of the Annotation for external parent trace context |
There was a problem hiding this comment.
// DeliveryTraceparentAnnotation is the name of the Annotation for storing external delivery trace context.
|
@ci-operator please update the description to follow the PR template when submitting a pull request |
vdemeester
left a comment
There was a problem hiding this comment.
Thanks for the contribution — the use case (linking PipelineRun traces to an external parent) is valuable.
The main issue has already been flagged by @raks-tt and @afrittoli: the code adopts the external context verbatim rather than creating a child span, which contradicts the documentation and removes the key distinction from pipelinerunSpanContext. Adding a couple of additional findings on the test side.
| pro.Inject(parentCtx, propagation.MapCarrier(spanContext)) | ||
|
|
||
| logger.Debugf("adopted delivery traceparent as remote parent: %s", deliveryTraceparent) | ||
| pr.Status.SpanContext = spanContext | ||
| return parentCtx | ||
| } |
There was a problem hiding this comment.
[P1] +1 to @raks-tt's comment. The code injects the parent context directly into SpanContext and returns it — no child span is created. This makes deliveryTraceparent functionally identical to pipelinerunSpanContext (just a different input format), contradicting the docs which say it "creates a new child span under the external parent."
To match the documented behavior, this block should call tracerProvider.Tracer(TracerName).Start(parentCtx, "PipelineRun:Reconciler") (like the root span path on lines 83-86), then inject the resulting child span context into pr.Status.SpanContext.
| parentTraceID string | ||
| parentTraceIDPrefix string | ||
| parentSpanID string |
There was a problem hiding this comment.
[P2] parentTraceIDPrefix, parentSpanID, and the strings import are dead code — no test case populates these fields. The assertion logic on lines 188-197 is also unreachable. This looks like leftover scaffolding from when the code was intended to create a child span.
If the code is updated to create a child span, these fields become useful and the with-delivery-traceparent-annotation test should set parentTraceIDPrefix (same trace ID) and parentSpanID (different from child). If not, remove them.
| tracerProvider: tracesdk.NewTracerProvider(), | ||
| expectSpanContextStatus: true, | ||
| expectValidSpanContext: true, | ||
| parentTraceID: "00-0f57e147e992b304d977436289d10628-73d5909e31793992-01", |
There was a problem hiding this comment.
[P2] This asserts traceparent equals the parent's exact value including span ID (73d5909e31793992). If the code is fixed to create a child span as documented, this assertion will fail — the child span will have the same trace ID but a different span ID. This test currently codifies the doc/code mismatch.
Replace stale "creates a new child span" language for deliveryTraceparent with accurate description of the remote parent adoption mechanism. Add tree diagrams comparing the two trace propagation annotations: - pipelinerunSpanContext: nested parent-child chain for Tekton-internal PipelineRun fan-out - deliveryTraceparent: flat siblings under an external parent for cross-system tracing Include a contrast diagram showing why flat siblings keep traces navigable when multiple PipelineRuns share the same external parent. Update precedence rule to match current behavior. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The transition from intermediate child span to direct remote parent adoption (25438a6) left behind dead test scaffolding and an outdated description in docs/labels.md. Clean up both so all artifacts match the current behavior. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add support for the
tekton.dev/deliveryTraceparentannotation to allow external systems to provide a W3C traceparent that becomes the remote parent for PipelineRun execution traces.This enables distributed tracing across system boundaries - when an external system creates a PipelineRun, it can set the deliveryTraceparent annotation with its current trace context, and Tekton will adopt that as the remote parent rather than creating a new root span.
The annotation is only used when: