 This Kubernetes Enhancement Proposal (KEP) proposes enhancing the API Server to allow tracing requests.
 
-
 ## Motivation
 
 Along with metrics and logs, traces are a useful form of telemetry to aid with debugging incoming requests. The API Server currently uses a poor-man's form of tracing (see [github.com/kubernetes/utils/trace](https://github.com/kubernetes/utils/tree/master/trace)), but we can make use of distributed tracing to improve the ease of use and enable easier analysis of trace data. Trace data is structured, providing the detail necessary to debug requests, and context propagation allows plugins, such as admission webhooks, to add to API Server requests.
@@ -60,7 +63,6 @@ Along with metrics and logs, traces are a useful form of telemetry to aid with d
 
 * The API Server generates and exports spans for incoming and outgoing requests.
 * The API Server propagates context from incoming requests to outgoing requests.
-* Kubectl clients can easily specify that a request should be traced.
 
 ### Non-Goals
 
@@ -69,22 +71,69 @@ Along with metrics and logs, traces are a useful form of telemetry to aid with d
 * Trace operations from all Kubernetes resource types in a generic manner (i.e. without manual instrumentation)
 * Change metrics or logging (e.g. to support trace-metric correlation)
-We will wrap the API Server's http server and http clients with [othttp](https://github.com/open-telemetry/opentelemetry-go/tree/master/plugin/othttp) to get spans for incoming and outgoing http requests, and add the [otgrpc](https://github.com/grpc-ecosystem/grpc-opentracing/tree/master/go/otgrpc) DialOption to the etcd grpc client. This generates spans for all sampled incoming requests and propagates context with all client requests. For incoming requests, this would go below [WithRequestInfo](https://github.com/kubernetes/kubernetes/blob/9eb097c4b07ea59c674a69e19c1519f0d10f2fa8/staging/src/k8s.io/apiserver/pkg/server/config.go#L676) in the filter stack, as it must be after authentication and authorization, before the panic filter, and is closest in function to the WithRequestInfo filter.
+We will wrap the API Server's http server and http clients with [othttp](https://github.com/open-telemetry/opentelemetry-go/tree/master/plugin/othttp) to get spans for incoming and outgoing http requests. This generates spans for all sampled incoming requests and propagates context with all client requests. For incoming requests, this would go below [WithRequestInfo](https://github.com/kubernetes/kubernetes/blob/9eb097c4b07ea59c674a69e19c1519f0d10f2fa8/staging/src/k8s.io/apiserver/pkg/server/config.go#L676) in the filter stack, as it must be after authentication and authorization, before the panic filter, and is closest in function to the WithRequestInfo filter.
 
 Note that some clients of the API Server, such as webhooks, may make reentrant calls to the API Server. To gain the full benefit of tracing, such clients should propagate context with requests back to the API Server.
 
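Review note: the propagation described in this paragraph can be illustrated with a minimal, standard-library-only sketch. It is not the othttp wrapper the KEP proposes. The `traceparent` header name (from W3C Trace Context), the `withTraceContext` filter, the `propagatingTransport` wrapper, and the webhook URL are all assumptions introduced for illustration. The sketch shows a server filter that lifts the incoming trace header into the request context, and a client transport that copies it onto outgoing requests, which is also what a reentrant webhook would need to do.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// traceHeader is the W3C Trace Context header name. Using it here is an
// assumption for illustration; the KEP delegates propagation to othttp.
const traceHeader = "traceparent"

type ctxKey struct{}

// withTraceContext is a hypothetical server-side filter that stores the
// incoming trace header in the request context.
func withTraceContext(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if tp := r.Header.Get(traceHeader); tp != "" {
			r = r.WithContext(context.WithValue(r.Context(), ctxKey{}, tp))
		}
		next.ServeHTTP(w, r)
	})
}

// propagatingTransport is a hypothetical client-side wrapper that injects the
// trace header found in the request context into the outgoing request.
type propagatingTransport struct{ base http.RoundTripper }

func (t propagatingTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	if tp, ok := req.Context().Value(ctxKey{}).(string); ok {
		req = req.Clone(req.Context()) // RoundTrippers must not mutate the caller's request
		req.Header.Set(traceHeader, tp)
	}
	return t.base.RoundTrip(req)
}

// recordingTransport stands in for the network and records the trace header
// that would have been sent to the downstream service.
type recordingTransport struct{ seen string }

func (t *recordingTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	t.seen = req.Header.Get(traceHeader)
	return &http.Response{StatusCode: http.StatusOK, Body: http.NoBody, Header: make(http.Header), Request: req}, nil
}

// propagate runs one incoming request through the filter, lets the handler
// call a (fake) webhook, and returns the trace header the webhook would see.
func propagate(traceparent string) string {
	rec := &recordingTransport{}
	client := &http.Client{Transport: propagatingTransport{base: rec}}
	handler := withTraceContext(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		out, err := http.NewRequestWithContext(r.Context(), http.MethodPost, "http://webhook.example/validate", nil)
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		resp, err := client.Do(out)
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadGateway)
			return
		}
		resp.Body.Close()
	}))
	in := httptest.NewRequest(http.MethodGet, "/api/v1/pods", nil)
	if traceparent != "" {
		in.Header.Set(traceHeader, traceparent)
	}
	handler.ServeHTTP(httptest.NewRecorder(), in)
	return rec.seen
}

func main() {
	fmt.Println(propagate("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"))
}
```

The key design point is that the outgoing request is built from the incoming request's context, so anything that drops that context (for example, building requests with `context.Background()`) silently breaks the trace.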
-### Vendor OpenTelemetry and the OT Exporter
+### Exporting Spans
 
 This KEP proposes the use of the [OpenTelemetry tracing framework](https://opentelemetry.io/) to create and export spans to configured backends.
 
-The API Server will use the [OpenTelemetry exporter format](https://github.com/open-telemetry/opentelemetry-proto), which exports traces to a local port. This format is compatible with the [OpenTelemetry Collector](https://github.com/open-telemetry/opentelemetry-collector), which allows importing and configuring exporters for trace storage backends to be done out-of-tree in addition to other useful features. The exporter stores spans in memory, and uses the [batching processor](https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/sdk.md#batching-processor) to batch requests and send them asynchronously.
+The API Server will use the [OpenTelemetry exporter format](https://github.com/open-telemetry/opentelemetry-proto), and the [OTLP exporter](https://github.com/open-telemetry/opentelemetry-go/tree/master/exporters/otlp#opentelemetry-collector-go-exporter), which can export traces. This format is easy to use with the [OpenTelemetry Collector](https://github.com/open-telemetry/opentelemetry-collector), which allows importing and configuring exporters for trace storage backends to be done out-of-tree in addition to other useful features.
+
+### Running the OpenTelemetry Collector
+
+The [OpenTelemetry Collector](https://github.com/open-telemetry/opentelemetry-collector) can be run as a sidecar, a daemonset, a deployment, or a combination in which the daemonset buffers telemetry and forwards to the deployment for aggregation (e.g. tail-based sampling) and routing to a telemetry backend. To support these various setups, the API Server should be able to send traffic either to a local (on the master) collector, or to a cluster service (in the cluster).
+
+### APIServer Configuration and EgressSelectors
+
+The API Server controls where traffic is sent using an [EgressSelector](https://github.com/kubernetes/enhancements/blob/master/keps/sig-api-machinery/20190226-network-proxy.md), and has separate controls for `Master`, `Cluster`, and `Etcd` traffic. As described above, we would like to support either sending telemetry to a URL using the `Master` egress, or a service using the `Cluster` egress. To accomplish this, we will introduce a flag, `--opentelemetry-config-file`, that will point to the file that defines the opentelemetry exporter configuration. That file will have the following format:
-Add configuration to the API Server required to configure the opentelemetry exporter, including the address and egress proxy to send spans to. The [egress proxy](https://github.com/kubernetes/enhancements/blob/master/keps/sig-api-machinery/20190226-network-proxy.md) can be added to the opentelemetry exporter by adding a ContextDialer grpc DialOption similar to the one used by the apiserver's etcd client. This will add a new "OpenTelemetry" [EgressType](https://github.com/kubernetes/kubernetes/blob/4b9b9ab75376b7b53876ab6b2be42d0940c7eb26/staging/src/k8s.io/apiserver/pkg/server/egressselector/egress_selector.go#L53) to the API Server's configuration.
+// OpenTelemetryClientConfiguration provides versioned configuration for opentelemetry clients.
+type OpenTelemetryClientConfiguration struct {
+	metav1.TypeMeta `json:",inline"`
+
+	// +optional
+	// URL of the collector that's running on the master.
+	// if URL is specified, APIServer uses the egressType Master when sending tracing data to the collector.
+	// If specified, the port on the service that is hosting the webhook.
+	// Defaults to 443 for backward compatibility.
+	// `port` should be a valid port number (1-65535, inclusive).
+	// +optional
+	Port *int32 `json:"port,omitempty" protobuf:"varint,4,opt,name=port"`
+}
+```
 
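Review note: as a rough illustration of how this configuration might be consumed, the sketch below resolves a config into an egress type and address. It uses simplified, hypothetical stand-ins for the types in the excerpt: the `Namespace` and `Name` fields, the `resolveTarget` helper, and the rule that exactly one of `URL` or `Service` must be set are assumptions, not part of the proposal. Only the `Master`/`Cluster` mapping, the default port of 443, and the 1-65535 range come from the text above.

```go
package main

import (
	"errors"
	"fmt"
)

// ServiceReference is a simplified, hypothetical stand-in for the service
// reference in the excerpt. Namespace and Name are assumed fields.
type ServiceReference struct {
	Namespace string
	Name      string
	Port      *int32 // optional; defaults to 443
}

// TracingConfig is a simplified, hypothetical stand-in for
// OpenTelemetryClientConfiguration.
type TracingConfig struct {
	URL     *string
	Service *ServiceReference
}

// target describes where spans are sent and which egress type is used.
type target struct {
	Egress  string
	Address string
}

// resolveTarget maps a config to an egress type and address. Requiring
// exactly one of URL or Service is an assumption made for this sketch.
func resolveTarget(cfg TracingConfig) (target, error) {
	switch {
	case cfg.URL != nil && cfg.Service != nil:
		return target{}, errors.New("only one of URL or Service may be set")
	case cfg.URL != nil:
		// A URL points at a collector on the master, so use the Master egress.
		return target{Egress: "Master", Address: *cfg.URL}, nil
	case cfg.Service != nil:
		port := int32(443) // default stated in the Port field comment
		if cfg.Service.Port != nil {
			port = *cfg.Service.Port
		}
		if port < 1 || port > 65535 {
			return target{}, fmt.Errorf("invalid port %d: must be in 1-65535", port)
		}
		// A service lives in the cluster, so use the Cluster egress.
		addr := fmt.Sprintf("%s.%s.svc:%d", cfg.Service.Name, cfg.Service.Namespace, port)
		return target{Egress: "Cluster", Address: addr}, nil
	default:
		return target{}, errors.New("one of URL or Service must be set")
	}
}

func main() {
	svc := TracingConfig{Service: &ServiceReference{Namespace: "observability", Name: "otel-collector"}}
	t, err := resolveTarget(svc)
	if err != nil {
		panic(err)
	}
	fmt.Println(t.Egress, t.Address)
}
```

Whether both fields being set should be an error or a precedence rule is a design question the proposal would need to answer explicitly.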
 ### Controlling use of the OpenTelemetry library
 
@@ -101,11 +150,16 @@ Alpha
 Beta
 
 -[] Tracing 100% of requests does not break scalability tests.
+-[] OpenTelemetry reaches GA
 -[] Publish documentation on examples of how to use the OT Collector with kubernetes
 
 
 ## Alternatives considered
 
+### Introducing a new EgressSelector type
+
+Instead of a configuration file to choose between a URL on the `Master` network, or a service on the `Cluster` network, we considered introducing a new `OpenTelemetry` egress type, which could be configured separately. However, we aren't actually introducing a new destination for traffic, so it is more conventional to make use of existing egress types. We will also likely want to add additional configuration for the OpenTelemetry client in the future.
+
 ### Other OpenTelemetry Exporters
 
 This KEP suggests that we utilize the OpenTelemetry exporter format in all components. Alternative options include: