Commit 43d81d8

Merge pull request kubernetes#2846 from serathius/klog
KEP-2845: Initial draft
2 parents bf1ebdf + 036a2b1 commit 43d81d8

2 files changed

+362
-0
lines changed
  • keps/sig-instrumentation/2845-deprecate-klog-specific-flags-in-k8s-components

Lines changed: 335 additions & 0 deletions
@@ -0,0 +1,335 @@
1+
# KEP-2845: Deprecate klog specific flags in Kubernetes Components
2+
3+
<!-- toc -->
- [Release Signoff Checklist](#release-signoff-checklist)
- [Summary](#summary)
- [Motivation](#motivation)
  - [Goals](#goals)
  - [Non-Goals](#non-goals)
- [Proposal](#proposal)
  - [Removed klog flags](#removed-klog-flags)
  - [Logging defaults](#logging-defaults)
    - [Split stdout and stderr](#split-stdout-and-stderr)
    - [Logging headers](#logging-headers)
  - [User Stories](#user-stories)
    - [Writing logs to files](#writing-logs-to-files)
  - [Caveats](#caveats)
  - [Risks and Mitigations](#risks-and-mitigations)
    - [Users don't want to use go-runner as replacement.](#users-dont-want-to-use-go-runner-as-replacement)
    - [Log processing in parent process causes performance problems](#log-processing-in-parent-process-causes-performance-problems)
- [Design Details](#design-details)
  - [Test Plan](#test-plan)
  - [Graduation Criteria](#graduation-criteria)
    - [Alpha](#alpha)
    - [Beta](#beta)
    - [GA](#ga)
  - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy)
  - [Version Skew Strategy](#version-skew-strategy)
- [Implementation History](#implementation-history)
- [Drawbacks](#drawbacks)
- [Alternatives](#alternatives)
  - [Continue supporting all klog features](#continue-supporting-all-klog-features)
  - [Release klog 3.0 with removed features](#release-klog-30-with-removed-features)
<!-- /toc -->
34+
35+
## Release Signoff Checklist
36+
37+
Items marked with (R) are required *prior to targeting to a milestone / release*.
38+
39+
- [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
40+
- [ ] (R) KEP approvers have approved the KEP status as `implementable`
41+
- [ ] (R) Design details are appropriately documented
42+
- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
43+
- [ ] e2e Tests for all Beta API Operations (endpoints)
44+
- [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
45+
- [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free
46+
- [ ] (R) Graduation criteria is in place
47+
- [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
48+
- [ ] (R) Production readiness review completed
49+
- [ ] (R) Production readiness review approved
50+
- [ ] "Implementation History" section is up-to-date for milestone
51+
- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
52+
- [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
53+
54+
[kubernetes.io]: https://kubernetes.io/
55+
[kubernetes/enhancements]: https://git.k8s.io/enhancements
56+
[kubernetes/kubernetes]: https://git.k8s.io/kubernetes
57+
[kubernetes/website]: https://git.k8s.io/website
58+
59+
## Summary
60+
61+
This KEP proposes to deprecate, and eventually remove, a subset of the klog
command line flags from Kubernetes components, with the goal of making logging
in k8s core components simpler and easier for the community to maintain and
extend.
64+
65+
## Motivation
66+
67+
Early on, Kubernetes adopted the glog library for logging. There was no larger
motivation for picking glog, as the Go ecosystem was in its infancy at that
time and there were no alternatives. As the needs of the Kubernetes community
grew, glog was not flexible enough, prompting the creation of its fork, klog.
By forking we inherited a lot of glog features that we never intended to
support. The introduction of alternative log formats like JSON created a
conundrum: should we implement all klog features for JSON? Most of them don't
make sense, and the method for their configuration leaves much to be desired.
Klog features are controlled by a set of global flags that remain the last
bastion of global state in the k/k repository. Those flags don't have a single
naming standard (some start with a log prefix, some do not), don't comply with
k8s flag naming (they use underscores instead of hyphens) and have many other
problems. We need to revisit how logging configuration is done in klog, so it
can work with alternative log formats and comply with current best practices.
81+
82+
Lack of investment and a growing number of klog features have impacted project
quality. Klog has multiple problems, including:
* performance is much worse than alternatives, for example 7-8x worse than the
  [JSON format](https://github.com/kubernetes/enhancements/tree/master/keps/sig-instrumentation/1602-structured-logging#logger-implementation-performance)
* it doesn't support the throughput needed to fulfill Kubernetes scalability
  requirements
  [kubernetes/kubernetes#90804](https://github.com/kubernetes/kubernetes/pull/90804)
* complexity and confusion caused by maintaining backward compatibility for
  legacy glog features and flags. For example
  [kubernetes/klog#54](https://github.com/kubernetes/klog/issues/54)
91+
92+
Fixing all those issues would require a big investment in logging, but would
not solve the underlying problem of having to maintain a logging library. We
have already seen cases like [kubernetes/kubernetes#90804](https://github.com/kubernetes/kubernetes/pull/90804)
where it's easier to reimplement a klog feature in an external project than to
fix the problem in klog. To conclude, we should aim to reduce maintenance cost
and improve quality by narrowing the scope of the logging library.
98+
99+
As for which configuration options should be standardized across all logging
formats, I would look to the 12 factor app standard (https://12factor.net/). It
defines logs as streams of events and discourages applications from taking on
responsibility for log file management, log rotation and any other processing
that can be done externally. This is something that Kubernetes already
encourages by collecting stdout and stderr logs and making them available via
kubectl logs. It's somewhat confusing that K8s components don't comply with K8s
best practices.
107+
108+
### Goals
109+
110+
* Unblock development of alternative logging formats
111+
* Narrow the scope of logging with a more opinionated approach and a smaller set of features
* Reduce complexity of logging configuration and follow the standard component configuration mechanism.
113+
114+
### Non-Goals
115+
116+
* Change klog output format
117+
118+
## Proposal
119+
120+
I propose to remove klog specific feature flags in Kubernetes core components
(kube-apiserver, kube-scheduler, kube-controller-manager, kubelet) and set them
to agreed good defaults. Of the klog flags, we would remove all except "-v"
and "--vmodule". With the removal of the flags that route logs based on type,
we want to change the default routing to one that works better as a default.
Changing the defaults will be done via a multi-release process that introduces
some temporary flags, which will be removed at the same time as the other klog
flags.
127+
128+
### Removed klog flags
129+
130+
To adopt the 12 factor app standard for logging, we would drop all flags that
extend logging beyond event streams. This change should be scoped to only those
components and should not affect the broader klog community.
133+
134+
Flags that should be deprecated:
135+
136+
* --log-dir, --log-file, --log-flush-frequency - responsible for writing to
  files and syncing to disk.
  Motivation: Not critical, as there are easy-to-set-up alternatives such as
  shell redirection, systemd service management or the docker log driver.
  Removing them reduces complexity and allows development of non-text loggers
  like one writing to journal.
* --logtostderr, --alsologtostderr, --one-output, --stderrthreshold -
  responsible for enabling/disabling writing to stderr (vs file).
  Motivation: Routing logs can be easily implemented by any log processor, such
  as Fluentd, Fluentbit or Logstash.
* --log-file-max-size, --skip-log-headers - responsible for configuration of
  file rotation.
  Motivation: Not needed if writing to files is removed.
* --add-dir-header, --skip-headers - klog format specific flags.
  Motivation: They don't apply to other log formats.
* --log-backtrace-at - A legacy glog feature.
  Motivation: No trace of anyone using this feature.
153+
154+
Flag deprecation should comply with standard k8s policy and require 3 releases before removal.
155+
156+
This leaves two flags that should be implemented by all log formats:

* -v - controls global log verbosity of Info logs
* --vmodule - controls log verbosity of Info logs at the per-file level

Those flags were chosen because they affect which logs are written,
directly impacting log volume and component performance.
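
To illustrate how the two remaining flags could interact, here is a minimal Go sketch. It assumes glog-style semantics: an Info log at level N is written when the global verbosity is at least N, or when the first `--vmodule` pattern matching the source file name allows at least N. The names `moduleLevel` and `enabled` are hypothetical and are not part of the klog API.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// moduleLevel is one hypothetical "pattern=N" entry parsed from --vmodule.
type moduleLevel struct {
	pattern string
	level   int
}

// enabled reports whether an Info log at the given verbosity level,
// emitted from sourceFile, should be written.
func enabled(level int, sourceFile string, global int, modules []moduleLevel) bool {
	// The global -v setting applies to every file.
	if global >= level {
		return true
	}
	// Patterns are matched against the file name without directory or ".go".
	name := strings.TrimSuffix(path.Base(sourceFile), ".go")
	for _, m := range modules {
		if ok, _ := path.Match(m.pattern, name); ok {
			return m.level >= level
		}
	}
	return false
}

func main() {
	modules := []moduleLevel{{pattern: "sched*", level: 4}}
	fmt.Println(enabled(2, "pkg/kubelet/kubelet.go", 2, modules))
	fmt.Println(enabled(4, "pkg/kubelet/kubelet.go", 2, modules))
	fmt.Println(enabled(4, "pkg/scheduler/scheduler.go", 2, modules))
}
```

Because both checks only decide whether a line is emitted, a logger for any output format could reuse the same gate in front of its own encoder.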
163+
164+
### Logging defaults
165+
166+
With the removal of configuration alternatives, we need to make sure that the
defaults make sense. The logging features implemented by klog and the proposed
actions are:
168+
* Routing logs based on type/verbosity - Should be reconsidered.
169+
* Writing logs to file - Feature removed.
170+
* Log file rotation based on file size - Feature removed.
171+
* Configuration of log headers - Use the current defaults.
172+
* Adding stacktrace - Feature removed.
173+
174+
For log routing I propose to adopt the UNIX convention of writing info logs to
stdout and errors to stderr. For log headers I propose to use the current
default.
177+
178+
#### Split stdout and stderr
179+
180+
As logs should be treated as event streams, I propose that we separate two
main streams, "info" and "error", based on the log method called. As error logs
should usually be treated with higher priority, having two streams prevents a
single pipeline from being clogged (for example
[kubernetes/klog#209](https://github.com/kubernetes/klog/issues/209)).
For logging formats writing to standard streams, we should follow the UNIX
convention of mapping "info" logs to stdout and "error" logs to stderr.
187+
188+
Splitting stdout from stderr would be a breaking change in both klog and
Kubernetes components. However, we expect only minimal impact on users, as
redirecting both streams is a common practice. In the rare cases that are
impacted, adapting to this change should be a one-line change. Still, we want
to give users a proper heads-up before making this change, so we will hide the
change behind a new logging flag `--logtostdout`. This flag will be used to
avoid introducing a breaking change in klog.
195+
196+
With this flag we can follow a multi-release plan to minimize user impact (each
step should be done in a separate Kubernetes release):
1. Introduce the flag in a disabled state and start using it in tests.
1. Announce flag availability and encourage users to adopt it.
1. Enable the flag by default and deprecate it (allows users to flip back to the previous behavior).
1. Remove the flag following the deprecation policy.
202+
203+
#### Logging headers
204+
205+
The default logging header configuration results in klog writing the log type
(error/info), the timestamp when the log was created and the code line
responsible for generating it. All this information is useful and should be
utilized by modern logging solutions. Log type is useful for filtering logs
when looking for an issue. The log generation timestamp is useful for
preserving the ordering of logs and should always be preferred over the time of
ingestion, which can be much later. The source code location is important for
identifying how a log line was generated.
212+
213+
Example:
214+
```
I0605 22:03:07.224378 3228948 logger.go:59] "Log using InfoS" key="value"
```
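
As a rough illustration of how a log processor could use these header fields, the Go sketch below parses the example line with a regular expression. The pattern is inferred from that single example, so treat it as an assumption rather than a specification of the klog format. The names `entry` and `parseLine` are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// headerRE matches the header layout of the example line: severity letter,
// month and day, time with microseconds, a numeric id, file:line, message.
var headerRE = regexp.MustCompile(
	`^([IWEF])(\d{2})(\d{2}) (\d{2}:\d{2}:\d{2}\.\d{6})\s+(\d+) ([^:\s]+):(\d+)\] (.*)$`)

// entry holds the fields recovered from one log line.
type entry struct {
	Severity string // log type, for example I (info) or E (error)
	Month    string
	Day      string
	Time     string
	ID       string // numeric id printed after the timestamp
	File     string
	Line     string
	Message  string
}

// parseLine returns the parsed entry, or false if the line has no header.
func parseLine(line string) (entry, bool) {
	m := headerRE.FindStringSubmatch(line)
	if m == nil {
		return entry{}, false
	}
	return entry{
		Severity: m[1], Month: m[2], Day: m[3], Time: m[4],
		ID: m[5], File: m[6], Line: m[7], Message: m[8],
	}, true
}

func main() {
	e, ok := parseLine(`I0605 22:03:07.224378 3228948 logger.go:59] "Log using InfoS" key="value"`)
	fmt.Println(ok, e.Severity, e.Month, e.Day, e.Time, e.File, e.Line)
	fmt.Println(e.Message)
}
```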
217+
218+
### User Stories
219+
220+
#### Writing logs to files
221+
222+
We should use go-runner as an official fallback for users that want to retain
writing logs to files. go-runner runs as a parent process to the component's
binary, reading its stdout/stderr, and is able to route them to files. As
go-runner is already released as part of the official K8s images, it should be
as simple as changing:
226+
227+
```
/usr/local/bin/kube-apiserver --log-file=/var/log/kube-apiserver.log
```

to

```
/go-runner --log-file=/var/log/kube-apiserver.log /usr/local/bin/kube-apiserver
```
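
The parent-process pattern itself is small. The Go sketch below shows the general idea under stated assumptions: it is not go-runner's implementation, and its positional argument layout (`<log-file> <command> [args...]`) as well as the names `parseArgs` and `runLogged` are invented for this example.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"os"
	"os/exec"
)

// parseArgs splits "<log-file> <command> [args...]" into its parts.
// This argument layout is invented for the sketch; it is not go-runner's CLI.
func parseArgs(args []string) (logFile string, command []string, err error) {
	if len(args) < 2 {
		return "", nil, errors.New("usage: wrapper <log-file> <command> [args...]")
	}
	return args[0], args[1:], nil
}

// runLogged starts the child process and sends both of its output streams
// to sink, so the child itself never needs to know about log files.
func runLogged(sink io.Writer, command []string) error {
	cmd := exec.Command(command[0], command[1:]...)
	cmd.Stdout = sink
	cmd.Stderr = sink
	return cmd.Run()
}

func main() {
	logFile, command, err := parseArgs(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	// Append to the log file so restarts do not truncate earlier logs.
	f, err := os.OpenFile(logFile, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o644)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	defer f.Close()
	if err := runLogged(f, command); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```

A production wrapper would additionally need to forward signals and propagate the child's exit code, which this sketch leaves out for brevity.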
236+
237+
### Caveats
238+
239+
Is it ok for K8s components to drop support for a subset of klog flags?
240+
241+
Technically K8s already doesn't support klog flags. Klog flags are renamed to
242+
comply with K8s flag naming convention (underscores are replaced with hyphens).
243+
Full klog support was never promised to users and removal of those flags should
244+
be treated as removal of any other flag.
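
The renaming mentioned above amounts to a simple name mapping. A minimal Go sketch follows. `normalizeFlagName` is a hypothetical helper written for this illustration and is not claimed to be the function Kubernetes uses. The input names are the underscore forms of flags listed in this KEP.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeFlagName converts an underscore-separated flag name into the
// hyphen-separated form used by Kubernetes components.
func normalizeFlagName(name string) string {
	return strings.ReplaceAll(name, "_", "-")
}

func main() {
	for _, name := range []string{"log_dir", "log_file_max_size", "skip_headers", "vmodule"} {
		fmt.Printf("--%s -> --%s\n", name, normalizeFlagName(name))
	}
}
```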
245+
246+
Is it ok for K8s components to drop support for writing to files?

Writing directly to files is an important feature still used by users, but this
doesn't necessitate direct support in components. By providing an external
solution like go-runner, we can allow the community to develop more advanced
features while maintaining a high quality implementation within components.
Having a more extensible solution developed externally should be more
beneficial to the community than forcing a closed list of features on everyone.
253+
254+
### Risks and Mitigations
255+
256+
#### Users don't want to use go-runner as replacement.
257+
258+
There are multiple alternatives that allow users to redirect logs to a file.
The exact solution depends on the user's preferred way to run the process, but
they share one property: all of them support consuming stdout/stderr. Examples
include shell redirection, systemd service management or the
[docker logging driver](https://docs.docker.com/config/containers/logging/configure/).
Not all of them support log rotation, but it's the user's responsibility to
know the complementary tooling that provides it, for example tools like
[logrotate](https://linux.die.net/man/8/logrotate).
266+
267+
#### Log processing in parent process causes performance problems
268+
269+
Passing logs through a parent process is a normal Linux pattern used by
systemd-run, docker or containerd. For Kubernetes we already use go-runner in
scalability testing to read apiserver logs and write them to a file. Before we
reach Beta we should conduct detailed throughput testing of go-runner to
validate its upper limit, but we don't expect any performance problems based on
the architecture alone.
275+
276+
## Design Details
277+
278+
### Test Plan
279+
280+
Go-runner is already used for scalability tests. We should ensure that we cover
281+
all existing klog features.
282+
283+
### Graduation Criteria
284+
285+
#### Alpha
286+
287+
- Klog can be configured without registering flags
288+
- Kubernetes logging configuration drops global state
289+
- Go-runner is feature complementary to klog flags planned for deprecation
290+
- Projects in Kubernetes Org are migrated to go-runner
291+
- Add --logtostdout flag to klog disabled by default
292+
- Use --logtostdout in kubernetes tests
293+
294+
#### Beta
295+
296+
- Go-runner project is well maintained and documented
297+
- Documentation on migrating off klog flags is publicly available
298+
- Kubernetes klog flags are marked as deprecated
299+
- Enable --logtostdout in Kubernetes components by default
300+
301+
#### GA
302+
303+
- Kubernetes klog specific flags are removed (including --logtostdout)
304+
305+
### Upgrade / Downgrade Strategy
306+
307+
N/A
308+
309+
### Version Skew Strategy
310+
311+
N/A
312+
313+
## Implementation History
314+
315+
- 20/06/2021 - Original proposal created in https://github.com/kubernetes/kubernetes/issues/99270
316+
- 30/07/2021 - First KEP draft was created
317+
318+
## Drawbacks
319+
320+
Deprecating klog features outside klog might create confusion in the community.
A large part of the community doesn't know that klog was created out of
necessity and is not the end goal for logging in Kubernetes. We should do our
due diligence to let the community know about our plans and their impact on
external components depending on klog.
325+
326+
## Alternatives
327+
328+
### Continue supporting all klog features
329+
At some point we should migrate all logging
330+
configuration to Options or Configuration. Doing so while supporting all klog
331+
features makes their future removal much harder.
332+
333+
### Release klog 3.0 with removed features
334+
Removing those features from klog itself cannot be done without affecting the
whole k8s community, instead of just the k8s core components.
Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
1+
title: Deprecate klog specific flags in Kubernetes components
2+
kep-number: 2845
3+
authors:
4+
- "@serathius"
5+
owning-sig: sig-instrumentation
6+
participating-sigs:
7+
- sig-arch
8+
status: provisional
9+
creation-date: 2021-07-30
10+
reviewers:
11+
- TBD
12+
approvers:
13+
- ehashman
14+
15+
see-also:
16+
- "/keps/sig-instrumentation/1602-structured-logging"
17+
replaces: []
18+
stage: alpha
19+
latest-milestone: "v1.23"
20+
milestone:
  alpha: "v1.23"
  beta: "v1.24"
  stable: "v1.25"
24+
25+
feature-gates: []
26+
disable-supported: true
27+
metrics: []
