Commit 4c22807

Merge pull request #7737 from dashpole/trace_instrumentation
Add tracing instrumentation documentation
2 parents a174968 + 93cb3d9 commit 4c22807

2 files changed: +76 -1 lines changed

contributors/devel/sig-instrumentation/instrumentation.md renamed to contributors/devel/sig-instrumentation/metric-instrumentation.md

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-## Instrumenting Kubernetes
+## Instrumenting Kubernetes with Metrics
 
 The following references and outlines general guidelines for metric instrumentation
 in Kubernetes components. Components are instrumented using the
Lines changed: 75 additions & 0 deletions
@@ -0,0 +1,75 @@
## Instrumenting Kubernetes with Traces

This document outlines general guidelines for trace instrumentation
in Kubernetes components. Components are instrumented using the
[OpenTelemetry Go client library](https://github.com/open-telemetry/opentelemetry-go).
For non-Go components, [libraries in other languages](https://opentelemetry.io/docs/languages/)
are available.

Traces are exported via gRPC using the [OpenTelemetry Protocol](https://opentelemetry.io/docs/specs/otel/protocol/)
(OTLP), which is open and well understood by a wide range of third-party
applications and vendors in the cloud-native ecosystem.

The [general instrumentation advice](https://opentelemetry.io/docs/concepts/instrumentation/libraries/)
from the OpenTelemetry documentation applies. This document reiterates common pitfalls and some
Kubernetes-specific considerations.

### When to instrument

While spans are sampled to avoid high costs, recording too many spans will
force consumers to lower the sampling rate, and will "drown out" important
spans. If your component has more than two or three nested spans, you are
likely over-using trace instrumentation. Most trace instrumentation in
Kubernetes components falls into one of two categories:

1. Spans for incoming or outgoing network calls.
2. Spans when initiating new work, such as reconciling an object, which may result in network calls.

For network-based telemetry, Kubernetes components should use OpenTelemetry
instrumentation libraries for
[HTTP](https://pkg.go.dev/go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp) and
[gRPC](https://pkg.go.dev/go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc).

**Note:** When creating spans at the start of reconciling an object, only
create the span if changes are actually required. Avoid creating "empty" spans
that simply compare the desired and actual state of an object without
performing any real work or making any network requests.

### Configuration and Setup

Kubernetes components should expose a flag, `--tracing-config-file`, which accepts a file containing a
[TracingConfiguration](https://kubernetes.io/docs/reference/config-api/apiserver-config.v1beta1/#apiserver-k8s-io-v1beta1-TracingConfiguration)
object. The `component-base/tracing` library provides a `NewProvider()` helper
to convert a TracingConfiguration to a TracerProvider, which can be used to
record spans. Components should avoid using OpenTelemetry globals, and instead
pass the configured TracerProvider to the libraries that use it. Components
should use the W3C Traceparent and Baggage propagators, as provided by the
`Propagators()` helper.

### Context Propagation

Generally, components should not interact directly with OpenTelemetry
Propagators, other than by passing them to libraries. Context propagation
across network boundaries is handled by the
[HTTP](https://pkg.go.dev/go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp) and
[gRPC](https://pkg.go.dev/go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc)
network client and server instrumentation libraries.

Components need to propagate Go's `context.Context`, whether it comes from an
incoming network call or from a span created when initiating new work, to any
outgoing network calls so that spans are properly connected into traces.

### Naming and Style

Follow the OpenTelemetry [guidelines for span naming](https://opentelemetry.io/docs/specs/otel/trace/api/#span), and the OpenTelemetry [guidelines for attributes](https://opentelemetry.io/docs/specs/semconv/general/attribute-naming/).

### Tracing stability

Tracing instrumentation in Kubernetes components does not currently have
stability guarantees, but component owners should be aware of which changes are
breaking for users so that such changes are made with proper consideration. In
particular, it is breaking for users when a component stops propagating
context in a way that breaks parent/child relationships for spans, removes
spans without replacement, or removes an attribute from a span without
replacement. Component owners should not treat general modifications of spans
(e.g. renaming the span, or renaming an attribute) as breaking.
