Skip to content

Commit 894a89a

Browse files
authored
Merge pull request #4793 from serathius/migration
Add instructions for structured logging migration
2 parents e3db935 + 5e71635 commit 894a89a

File tree

1 file changed

+383
-0
lines changed

1 file changed

+383
-0
lines changed
Lines changed: 383 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,383 @@
1+
# Structured Logging migration instructions
2+
3+
This document describes instructions for migration proposed by [Structured Logging KEP]. It describes new structured
4+
functions introduced in `klog` (Kubernetes logging library) and how log calls should be changed to utilize new features.
5+
This document was written for the initial migration of `kubernetes/kubernetes` repository proposed for Alpha stage, but
6+
should be applicable at later stages or for other projects using `klog` logging library.
7+
8+
[Structured Logging KEP]: https://github.com/kubernetes/enhancements/tree/master/keps/sig-instrumentation/1602-structured-logging
9+
10+
## Goal of Alpha migration
11+
12+
The first step is to introduce structure to the high percentage of logs generated in Kubernetes by changing only a
13+
small number of logs API calls. Based on criteria described in the [selecting most important logs] section, the selected
14+
22 log calls are estimated to impact 99.9% of log volume. The up to date list of these log calls is provided in the
15+
[Enhancement Issue].
16+
17+
[Enhancement Issue]: https://github.com/kubernetes/enhancements/issues/1602
18+
[selecting most important logs]: https://github.com/kubernetes/enhancements/tree/master/keps/sig-instrumentation/1602-structured-logging#selecting-most-important-logs
19+
20+
## Structured logging in Kubernetes
21+
22+
With this enhancement a set of new functions were added to `klog`. Structured logging functions the follow interface
23+
based on [logr], which has a different design than other `klog` functions which are based on [glog]. It is recommended
24+
to familiarize yourself with [logr].
25+
26+
[logr]: https://github.com/go-logr/logr
27+
[glog]: https://github.com/golang/glog
28+
29+
Here are the prototypes of functions added to `klog` that will be utilized during migration:
30+
```go
31+
package klog
32+
33+
// InfoS structured logs to the INFO log.
34+
// The msg argument used to add constant description to the log line.
35+
// The key/value pairs would be join by "=" ; a newline is always appended.
36+
//
37+
// Examples:
38+
// >> klog.InfoS("Pod status updated", "pod", klog.KObj(pod), "status", "ready")
39+
// output:
40+
// >> I1025 00:15:15.525108 1 controller_utils.go:116] "Pod status updated" pod="kube-system/kubedns" status="ready"
41+
func InfoS(msg string, keysAndValues ...interface{})
42+
43+
// ErrorS structured logs to the ERROR, WARNING, and INFO logs.
44+
// the err argument used as "err" field of log line.
45+
// The msg argument used to add constant description to the log line.
46+
// The key/value pairs would be join by "=" ; a newline is always appended.
47+
//
48+
// Examples:
49+
// >> klog.ErrorS(err, "Failed to update pod status")
50+
// output:
51+
// >> E1025 00:15:15.525108 1 controller_utils.go:114] "Failed to update pod status" err="timeout"
52+
func ErrorS(err error, msg string, keysAndValues ...interface{})
53+
54+
// KObj is used to create ObjectRef when logging information about Kubernetes objects
55+
// Examples:
56+
// >> klog.InfoS("Pod status updated", "pod", klog.KObj(pod), "status", "ready")
57+
// output:
58+
// >> I1025 00:15:15.525108 1 controller_utils.go:116] "Pod status updated" pod="kube-system/kubedns" status="ready"
59+
func KObj(obj KMetadata) ObjectRef
60+
61+
// KRef is used to create ObjectRef when logging information about Kubernetes objects without access to metav1.Object
62+
// Examples:
63+
// >> klog.InfoS("Pod status updated", "pod", klog.KRef(podNamespace, podName), "status", "ready")
64+
// output:
65+
// >> I1025 00:15:15.525108 1 controller_utils.go:116] "Pod status updated" pod="kube-system/kubedns" status="ready"
66+
func KRef(namespace, name string) ObjectRef
67+
68+
// ObjectRef represents a reference to a kubernetes object used for logging purpose
69+
// In text logs it is serialized into "{namespace}/{name}" or "{name}" if namespace is empty
70+
type ObjectRef struct {
71+
Name string `json:"name"`
72+
Namespace string `json:"namespace,omitempty"`
73+
}
74+
75+
// KMetadata is a subset of the kubernetes k8s.io/apimachinery/pkg/apis/meta/v1.Object interface
76+
// this interface may expand in the future, but will always be a subset of the
77+
// kubernetes k8s.io/apimachinery/pkg/apis/meta/v1.Object interface
78+
type KMetadata interface {
79+
GetName() string
80+
GetNamespace() string
81+
}
82+
```
83+
84+
## Migration
85+
86+
1. Change log functions to structured equivalent
87+
1. Remove string formatting from log message
88+
1. Name arguments
89+
1. Use `klog.KObj` and `klog.KRef` for Kubernetes object references
90+
1. Verify log output
91+
92+
## Change log functions to structured equivalent
93+
94+
Structured logging functions follow a different logging interface design than other functions in `klog`. They follow
95+
minimal design from [logr] thus there is no one-to-one mapping.
96+
97+
Simplified mapping between functions:
98+
* `klog.Infof`, `klog.Info`, `klog.Infoln`, `klog.InfoDepth` -> `klog.InfoS`
99+
* `klog.V(N).Infof`, `klog.V(N).Info`, `klog.V(N).Infoln` -> `klog.V(N).InfoS`
100+
* `klog.Warning`, `klog.Warningf`, `klog.Warningln`, `klog.WarningDepth` -> `klog.InfoS`
101+
* `klog.Error`, `klog.Errorf`, `klog.Errorln`, `klog.ErrorDepth` -> `klog.ErrorS`
102+
* `klog.Fatal`, `klog.Fatalf`, `klog.Fatalln`, `klog.FatalDepth` -> `klog.ErrorS`
103+
104+
### Removing Depth
105+
106+
Functions with depth (`klog.InfoDepth`, `klog.WarningDepth`, `klog.ErrorDepth`, `klog.FatalDepth`) are used to indicate
107+
that the source of the log (added as metadata in log) is different than the invocation of logging library. This is
108+
usually used when implementing logging util functions. As logr interface doesn't support depth, those functions should
109+
return logging arguments instead of calling `klog` directly.
110+
111+
For example
112+
```go
113+
func Handle(w http.ReponseWriter, r *http.Request) {
114+
logHTTPRequest(r)
115+
handle(w, r)
116+
}
117+
118+
func logHTTPRequest(r *http.Request) {
119+
klog.InfoDepth(1, "Received HTTP %s request", r.Method)
120+
}
121+
```
122+
should be replaced with
123+
```go
124+
func Handle(w http.ReponseWriter, r *http.Request) {
125+
klog.Info("Received HTTP request", httpRequestLog(r)...)
126+
handle(w, r)
127+
}
128+
129+
func httpRequestLog(r *http.Request) []interface{} {
130+
return []interface{}{
131+
"verb", r.Method,
132+
}
133+
}
134+
135+
```
136+
137+
### Using ErrorS
138+
139+
With `klog` structured logging borrowing the interface from [logr] it also inherits it's differences in semantic of
140+
error function. Logs generated by `ErrorS` command may be enhanced with additional debug information
141+
(such as stack traces) or be additionally sent to special error recording tools. Errors should be used to indicate
142+
unexpected behaviours in code, like unexpected errors returned by subroutine function calls.
143+
144+
Calling `ErrorS` with `nil` as error is semi-acceptable if there is error condition that deserves a stack trace at this
145+
origin point. For expected errors (`errors` that can happen during routine operations) please consider using
146+
`klog.InfoS` and pass error in `err` key instead.
147+
148+
### Replacing Fatal calls
149+
150+
Use of Fatal should be discouraged and it's not available in new functions. Instead of depending on the logger to exit
151+
the process, you should call `os.Exit()` yourself.
152+
153+
## Remove string formatting from log message
154+
155+
With structured logging, log messages are no longer formatted, leaving argument marshalling up to the logging client
156+
implementation. This allows messages to be a static description of event.
157+
158+
All string formatting (`%d`, `%v`, `%w`, `%s`) should be removed and log message string simplified.
159+
Describing arguments in log messages is no longer needed and should be removed leaving only a description of what
160+
happened.
161+
162+
Additionally we can improve messages to comply with good practices:
163+
* Start from a capital letter.
164+
* Do not end the message with a period.
165+
* Use active voice. Use complete sentences when there is an acting subject ("A could not do B") or omit the subject if
166+
the subject would be the program itself ("Could not do B").
167+
* Use past tense ("Could not delete B" instead of "Cannot delete B")
168+
* When referring to an object, state what type of object it is. ("Deleted pod" instead of "Deleted")
169+
170+
For example
171+
```go
172+
klog.Infof("delete pod %s with propagation policy %s", ...)
173+
```
174+
should be changed to
175+
```go
176+
klog.Infof("Deleted pod", ...)
177+
```
178+
179+
Some logs are constructed solely from string formats. In those cases a message needs to be derived from the context of
180+
the log call.
181+
182+
For example http access logs
183+
```go
184+
func LogHTTP(r *http.Request) {
185+
klog.Infof("%s %s: (%v) %v%v%v [%s %s]", ...)
186+
}
187+
```
188+
should be changed to
189+
```go
190+
func LogHTTP(r *http.Request) {
191+
klog.InfoS("Received HTTP request", ...)
192+
}
193+
```
194+
195+
### Name arguments
196+
197+
Even though new structured logging functions have very similar function prototype `func (string, ...interface{})` it
198+
has different meaning for variadic arguments. Instead of just passing arguments, now we are passing key value pairs of
199+
argument name and argument value. This means when migrating a log call we need to add an additional string before each
200+
argument, that will be used as it's name.
201+
202+
For example
203+
```go
204+
func LogHTTP(r *http.Request) {
205+
klog.Infof("Received HTTP request, path: %s, method: %s", r.Path, r.Method)
206+
}
207+
```
208+
should be changed to
209+
```go
210+
func LogHTTP(r *http.Request) {
211+
klog.InfoS("Received HTTP request", "path", r.Path, "method", r.Method)
212+
}
213+
```
214+
215+
Names of arguments should use [lowerCamelCase] and be alphanumeric. Arguments names in one log call should be unique.
216+
Names should be picked based on semantic meaning of value itself, not the context in which is used (log message should
217+
imply the context). For example names like `status` should be used over (`desiredStatus`, `oldStatus`, `badStatus`) thus
218+
allowing to query and join different log lines of the `status` field.
219+
220+
Kubernetes objects should be referenced using only their kind, no matter their api group or version. Example argument
221+
names: `deployment`, `pod`, `node`, `replicaSet`. For objects of unknown type, is ok to log them under `object` key with
222+
addition of `apiVersion` and `kind` fields describing the k8s object type.
223+
224+
In situations when we want to the log value of the same meaning twice (e.g. transition between state) it is ok to name
225+
an additional argument based on context, but leaving one most current/correct value with canonical name.
226+
227+
Examples of keys (strongly suggested, will be extended when pattern emerge, no standard schema yet):
228+
* `err` - error when using `klog.InfoS`. Used for expected errors that are not `klog.ErrorS`.
229+
* `object` - reference to k8s objects of unknown type. Should be used with `kind` and `apiVersion`.
230+
* `kind` - kind of k8s object of unknown type.
231+
* `apiVersion` - API version of k8s object of unknown type.
232+
233+
Example:
234+
235+
```go
236+
func ChangeStatus(newStatus, currentStatus string) {
237+
err := changeStatus(newStatus)
238+
if err != nil {
239+
klog.ErrorS(err, "Failed changing status", "desiredStatus", newStatus, "status", currentStatus)
240+
}
241+
klog.InfoS("Changed status", "previousStatus", currentStatus, "status", newStatus)
242+
}
243+
```
244+
245+
[lowerCamelCase]: https://en.wiktionary.org/wiki/lowerCamelCase
246+
247+
### Use `klog.KObj` and `klog.KRef` for Kubernetes objects
248+
249+
As part of structured logging migration we want to ensure that kubernetes objects references are consistent within the
250+
codebase. Two new utility functions were introduced to klog `klog.KObj` and `klog.KRef`. Any reference
251+
(name, uid, namespace) to Kubernetes Object (Pod, Node, Deployment, CRD) should be rewritten to utilize those functions.
252+
In situations when object `UID` is would be beneficial for log, it should be added as separate field with `UID` suffix.
253+
254+
For example
255+
```go
256+
func updatePod(pod *covev1.Pod) {
257+
...
258+
klog.Infof("Updated pod %s in namespace %s", pod.Name, pod.Namespace)
259+
}
260+
```
261+
should be changed to
262+
```go
263+
func updatePod(pod *covev1.Pod) {
264+
...
265+
klog.InfoS("Updated pod", "pod", klog.KObj(pod))
266+
}
267+
```
268+
And
269+
```go
270+
func updatePod(pod *covev1.Pod) {
271+
...
272+
klog.Infof("Updated pod with uid: %s", pod.Uid)
273+
}
274+
```
275+
should be changed to
276+
```go
277+
func updatePod(pod *covev1.Pod) {
278+
...
279+
klog.InfoS("Updated pod", "pod", klog.KObj(pod), "podUID", pod.Uid)
280+
}
281+
```
282+
283+
`klog.KObj` requires passing a kubernetes object (struct implementing `metav1.Object` interface). In situations where
284+
the object is not available, we can use `klog.KRef`. Still it is suggested to rewrite the code to use object pointer
285+
instead of strings where possible.
286+
287+
```go
288+
func updatePod(podName, podNamespace string) {
289+
...
290+
klog.InfoS("Updated pod", "pod", klog.KRef(podNamespace, podName))
291+
}
292+
```
293+
294+
For non-namespaced object we can pass empty string to namespace argument
295+
296+
```go
297+
func updateNode(nodeName string) {
298+
...
299+
klog.InfoS("Updated node", "node", klog.KRef("", nodeName))
300+
}
301+
```
302+
303+
### Verify log output
304+
305+
With the introduction of structured functions log arguments will be formatted automatically instead of depending on the
306+
caller. This means that we can remove the burden of picking the format by caller and ensure greater log consistency, but during
307+
migration it's important to ensure that we avoid degradation of log quality. We should ensure that during migration we
308+
preserve properties like:
309+
* meaning of event described by log
310+
* verbosity of stored information
311+
312+
PRs migrating logs should include examples of outputted logs before and after the change, thus helping reviewers
313+
understand the impact of change.
314+
315+
Example code to compare [httplog.go#168](https://github.com/kubernetes/kubernetes/blob/15c3f1b11/staging/src/k8s.io/apiserver/pkg/server/httplog/httplog.go#L168)
316+
```
317+
package main
318+
319+
import (
320+
"fmt"
321+
"k8s.io/klog/v2"
322+
"net/http"
323+
"time"
324+
)
325+
326+
type respLogger struct {
327+
status int
328+
statusStack string
329+
addedInfo string
330+
req *http.Request
331+
}
332+
333+
func (rl *respLogger) Log(latency time.Duration) {
334+
klog.InfoDepth(1, fmt.Sprintf("verb=%q URI=%q latency=%v resp=%v UserAgent=%q srcIP=%q: %v%v",
335+
rl.req.Method, rl.req.RequestURI,
336+
latency, rl.status,
337+
rl.req.UserAgent(), rl.req.RemoteAddr,
338+
rl.statusStack, rl.addedInfo,
339+
))
340+
}
341+
342+
func (rl *respLogger) LogArgs(latency time.Duration) []interface{} {
343+
return []interface{}{
344+
"verb", rl.req.Method,
345+
"URI", rl.req.RequestURI,
346+
"latency", latency,
347+
"resp", rl.status,
348+
"userAgent", rl.req.UserAgent(),
349+
"srcIP", rl.req.RemoteAddr,
350+
}
351+
}
352+
353+
func main() {
354+
klog.InitFlags(nil)
355+
356+
// Setup
357+
rl := respLogger{
358+
status: 200,
359+
req: &http.Request{
360+
Method: "GET",
361+
Header: map[string][]string{"User-Agent": {"Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0. 2272.118 Safari/537.36."}},
362+
RemoteAddr: "127.0.0.1",
363+
RequestURI: "/metrics",
364+
},
365+
}
366+
latency := time.Second
367+
368+
// Before migration
369+
rl.Log(latency)
370+
371+
// After migration
372+
klog.InfoS("Received HTTP request", rl.LogArgs(latency)...)
373+
}
374+
```
375+
376+
Log output before migration
377+
```
378+
I0528 19:15:22.737538 47512 logtest.go:52] verb="GET" URI="/metrics" latency=1s resp=200 UserAgent="Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0. 2272.118 Safari/537.36." srcIP="127.0.0.1":
379+
```
380+
After
381+
```
382+
I0528 19:15:22.737588 47512 logtest.go:55] "Received HTTP request" verb="GET" URI="/metrics" latency="1s" resp=200 userAgent="Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0. 2272.118 Safari/537.36." srcIP="127.0.0.1"
383+
```

0 commit comments

Comments
 (0)