|
| 1 | +--- |
| 2 | +layout: blog |
| 3 | +title: "Contextual logging in Kubernetes 1.29: Better troubleshooting and enhanced logging" |
| 4 | +slug: contextual-logging-in-kubernetes-1-29 |
| 5 | +date: 2023-12-20T09:30:00-08:00 |
| 6 | +canonicalUrl: https://www.kubernetes.dev/blog/2023/12/20/contextual-logging-in-kubernetes-1-29/ |
| 7 | +--- |
| 8 | + |
| 9 | +**Authors**: [Mengjiao Liu](https://github.com/mengjiao-liu/) (DaoCloud), [Patrick Ohly](https://github.com/pohly) (Intel) |
| 10 | + |
| 11 | +On behalf of the [Structured Logging Working Group](https://github.com/kubernetes/community/blob/master/wg-structured-logging/README.md) |
| 12 | +and [SIG Instrumentation](https://github.com/kubernetes/community/tree/master/sig-instrumentation#readme), |
| 13 | +we are pleased to announce that the contextual logging feature |
| 14 | +introduced in Kubernetes v1.24 has now been successfully migrated to |
| 15 | +two components (kube-scheduler and kube-controller-manager) |
| 16 | +as well as some directories. This feature aims to provide more useful logs |
| 17 | +for better troubleshooting of Kubernetes and to empower developers to enhance Kubernetes. |
| 18 | + |
| 19 | +## What is contextual logging? |
| 20 | + |
| 21 | +[Contextual logging](https://github.com/kubernetes/enhancements/tree/master/keps/sig-instrumentation/3077-contextual-logging) |
| 22 | +is based on the [go-logr](https://github.com/go-logr/logr#a-minimal-logging-api-for-go) API. |
| 23 | +The key idea is that libraries are passed a logger instance by their caller |
| 24 | +and use that for logging instead of accessing a global logger. |
| 25 | +The binary decides the logging implementation, not the libraries. |
| 26 | +The go-logr API is designed around structured logging and supports attaching |
| 27 | +additional information to a logger. |
| 28 | + |
| 29 | +This enables additional use cases: |
| 30 | + |
| 31 | +- The caller can attach additional information to a logger: |
| 32 | + - [WithName](<https://pkg.go.dev/github.com/go-logr/logr#Logger.WithName>) adds a "logger" key with the names concatenated by a dot as value |
| 33 | + - [WithValues](<https://pkg.go.dev/github.com/go-logr/logr#Logger.WithValues>) adds key/value pairs |
| 34 | + |
| 35 | + When passing this extended logger into a function, and the function uses it |
| 36 | + instead of the global logger, the additional information is then included |
| 37 | + in all log entries, without having to modify the code that generates the log entries. |
| 38 | + This is useful in highly parallel applications where it can become hard to identify |
| 39 | + all log entries for a certain operation, because the output from different operations gets interleaved. |
| 40 | + |
| 41 | +- When running unit tests, log output can be associated with the current test. |
| 42 | + Then, when a test fails, only the log output of the failed test gets shown by go test. |
| 43 | + That output can also be more verbose by default because it will not get shown for successful tests. |
| 44 | + Tests can be run in parallel without interleaving their output. |
| 45 | + |
| 46 | +One of the design decisions for contextual logging was to allow attaching a logger as value to a `context.Context`. |
| 47 | +Since the logger encapsulates all aspects of the intended logging for the call, |
| 48 | +it is *part* of the context, and not just *using* it. A practical advantage is that many APIs |
| 49 | +already have a `ctx` parameter or can add one. This provides additional advantages, like being able to |
| 50 | +get rid of `context.TODO()` calls inside the functions. |
| 51 | + |
| 52 | +## How to use it |
| 53 | + |
| 54 | +The contextual logging feature is alpha starting from Kubernetes v1.24, |
| 55 | +so it requires the `ContextualLogging` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) to be enabled. |
| 56 | +If you want to test the feature while it is alpha, you need to enable this feature gate |
| 57 | +on the `kube-controller-manager` and the `kube-scheduler`. |
| 58 | + |
| 59 | +For the `kube-scheduler`, there is one thing to note, in addition to enabling |
| 60 | +the `ContextualLogging` feature gate, instrumentation also depends on log verbosity. |
| 61 | +To avoid slowing down the scheduler with the logging instrumentation for contextual logging added for 1.29, |
| 62 | +it is important to choose carefully when to add additional information: |
| 63 | +- At `-v3` or lower, only `WithValues("pod")` is used once per scheduling cycle. |
| 64 | + This has the intended effect that all log messages for the cycle include the pod information. |
| 65 | + Once contextual logging is GA, "pod" key/value pairs can be removed from all log calls. |
| 66 | +- At `-v4` or higher, richer log entries get produced where `WithValues` is also used for the node (when applicable) |
| 67 | + and `WithName` is used for the current operation and plugin. |
| 68 | + |
| 69 | +Here is an example that demonstrates the effect: |
| 70 | +> I1113 08:43:37.029524 87144 default_binder.go:53] "Attempting to bind pod to node" **logger="Bind.DefaultBinder"** **pod**="kube-system/coredns-69cbfb9798-ms4pq" **node**="127.0.0.1" |
| 71 | +
|
| 72 | +The immediate benefit is that the operation and plugin name are visible in `logger`. |
| 73 | +`pod` and `node` are already logged as parameters in individual log calls in `kube-scheduler` code. |
| 74 | +Once contextual logging is supported by more packages outside of `kube-scheduler`, |
| 75 | +they will also be visible there (for example, client-go). Once it is GA, |
| 76 | +log calls can be simplified to avoid repeating those values. |
| 77 | + |
| 78 | +In `kube-controller-manager`, `WithName` is used to add the user-visible controller name to log output, |
| 79 | +for example: |
| 80 | + |
| 81 | +> I1113 08:43:29.284360 87141 graph_builder.go:285] "garbage controller monitor not synced: no monitors" **logger="garbage-collector-controller"** |
| 82 | +
|
| 83 | +The `logger=”garbage-collector-controller”` was added by the `kube-controller-manager` core |
| 84 | +when instantiating that controller and appears in all of its log entries - at least as long as the code |
| 85 | +that it calls supports contextual logging. Further work is needed to convert shared packages like client-go. |
| 86 | + |
| 87 | +## Performance impact |
| 88 | + |
| 89 | +Supporting contextual logging in a package, i.e. accepting a logger from a caller, is cheap. |
| 90 | +No performance impact was observed for the `kube-scheduler`. As noted above, |
| 91 | +adding `WithName` and `WithValues` needs to be done more carefully. |
| 92 | + |
| 93 | +In Kubernetes 1.29, enabling contextual logging at production verbosity (`-v3` or lower) |
| 94 | +caused no measurable slowdown for the `kube-scheduler` and is not expected for the `kube-controller-manager` either. |
| 95 | +At debug levels, a 28% slowdown for some test cases is still reasonable given that the resulting logs make debugging easier. |
| 96 | +For details, see the [discussion around promoting the feature to beta](https://github.com/kubernetes/enhancements/pull/4219#issuecomment-1807811995). |
| 97 | + |
| 98 | +## Impact on downstream users |
| 99 | +Log output is not part of the Kubernetes API and changes regularly in each release, |
| 100 | +whether it is because developers work on the code or because of the ongoing conversion |
| 101 | +to structured and contextual logging. |
| 102 | + |
| 103 | +If downstream users have dependencies on specific logs, |
| 104 | +they need to be aware of how this change affects them. |
| 105 | + |
| 106 | +## Further reading |
| 107 | + |
| 108 | +- Read the [Contextual Logging in Kubernetes 1.24](https://www.kubernetes.dev/blog/2022/05/25/contextual-logging/) article. |
| 109 | +- Read the [KEP-3077: contextual logging](https://github.com/kubernetes/enhancements/tree/master/keps/sig-instrumentation/3077-contextual-logging). |
| 110 | + |
| 111 | +## Get involved |
| 112 | + |
| 113 | +If you're interested in getting involved, we always welcome new contributors to join us. |
| 114 | +Contextual logging provides a fantastic opportunity for you to contribute to Kubernetes development and make a meaningful impact. |
| 115 | +By joining [Structured Logging WG](https://github.com/kubernetes/community/tree/master/wg-structured-logging), |
| 116 | +you can actively participate in the development of Kubernetes and make your first contribution. |
| 117 | +It's a great way to learn and engage with the community while gaining valuable experience. |
| 118 | + |
| 119 | +We encourage you to explore the repository and familiarize yourself with the ongoing discussions and projects. |
| 120 | +It's a collaborative environment where you can exchange ideas, ask questions, and work together with other contributors. |
| 121 | + |
| 122 | +If you have any questions or need guidance, don't hesitate to reach out to us |
| 123 | +and you can do so on our [public Slack channel](https://kubernetes.slack.com/messages/wg-structured-logging). |
| 124 | +If you're not already part of that Slack workspace, you can visit [https://slack.k8s.io/](https://slack.k8s.io/) |
| 125 | +for an invitation. |
| 126 | + |
| 127 | +We would like to express our gratitude to all the contributors who provided excellent reviews, |
| 128 | +shared valuable insights, and assisted in the implementation of this feature (in alphabetical order): |
| 129 | + |
| 130 | +- Aldo Culquicondor ([alculquicondor](https://github.com/alculquicondor)) |
| 131 | +- Andy Goldstein ([ncdc](https://github.com/ncdc)) |
| 132 | +- Feruzjon Muyassarov ([fmuyassarov](https://github.com/fmuyassarov)) |
| 133 | +- Freddie ([freddie400](https://github.com/freddie400)) |
| 134 | +- JUN YANG ([yangjunmyfm192085](https://github.com/yangjunmyfm192085)) |
| 135 | +- Kante Yin ([kerthcet](https://github.com/kerthcet)) |
| 136 | +- Kiki ([carlory](https://github.com/carlory)) |
| 137 | +- Lucas Severo Alve ([knelasevero](https://github.com/knelasevero)) |
| 138 | +- Maciej Szulik ([soltysh](https://github.com/soltysh)) |
| 139 | +- Mengjiao Liu ([mengjiao-liu](https://github.com/mengjiao-liu)) |
| 140 | +- Naman Lakhwani ([Namanl2001](https://github.com/Namanl2001)) |
| 141 | +- Oksana Baranova ([oxxenix](https://github.com/oxxenix)) |
| 142 | +- Patrick Ohly ([pohly](https://github.com/pohly)) |
| 143 | +- songxiao-wang87 ([songxiao-wang87](https://github.com/songxiao-wang87)) |
| 144 | +- Tim Allclai ([tallclair](https://github.com/tallclair)) |
| 145 | +- ZhangYu ([Octopusjust](https://github.com/Octopusjust)) |
| 146 | +- Ziqi Zhao ([fatsheep9146](https://github.com/fatsheep9146)) |
| 147 | +- Zac ([249043822](https://github.com/249043822)) |
0 commit comments