19
19
- [kubelet](#kubelet)
20
20
- [kubectl](#kubectl)
21
21
- [Test Plan](#test-plan)
22
+ - [Prerequisite testing updates](#prerequisite-testing-updates)
23
+ - [Unit tests](#unit-tests)
24
+ - [Integration tests](#integration-tests)
25
+ - [e2e tests](#e2e-tests)
22
26
- [Graduation Criteria](#graduation-criteria)
23
27
- [Alpha -> Beta Graduation](#alpha---beta-graduation)
24
28
- [Beta -> GA Graduation](#beta---ga-graduation)
@@ -96,16 +100,16 @@ This would work for:
96
100
## Proposal
97
101
98
102
### Implement client for logs endpoint viewer (OS agnostic)
99
- - Extend `kubectl logs` to work with node objects.
103
+ - Implement a new `kubectl node-logs` to work with node objects.
100
104
- Implement a client for the `/var/log/` kubelet endpoint viewer.
101
105
102
106
### Linux distros with systemd / journald
103
107
Supplement the `/var/log/` endpoint viewer on the kubelet with a thin shim
104
- over the `journal` directory that shells out to journalctl. Then extend
105
- `kubectl logs` to also work with node objects.
108
+ over the `journal` directory that shells out to journalctl. Then implement
109
+ `kubectl node-logs` to also work with node objects.
106
110
107
111
### Linux distributions without systemd / journald
108
- Running the new "kubectl logs nodes" command against services on nodes that do
112
+ Running the new "kubectl node-logs" command against services on nodes that do
109
113
not use systemd / journald should return "OS not supported". However getting
110
114
logs from `/var/log/` should work on all systems.
111
115
@@ -118,22 +122,22 @@ Reuse the kubelet API for querying the Linux journal for invoking the
118
122
Consider a scenario where pods / containers are refusing to come up on certain
119
123
nodes. As mentioned in the motivation section, troubleshooting this scenario
120
124
requires the cluster administrator to SSH into nodes to scan the logs. Allowing
121
- them to use `kubectl logs` to do the same as they would to debug issues with a
125
+ them to use `kubectl node-logs` to do the same as they would to debug issues with a
122
126
pod / container would greatly simplify their debug workflow. This also opens up
123
127
opportunities for tooling and simplifying automated log gathering. The feature
124
128
can also be used to debug issues with Kubernetes services especially in Windows
125
129
nodes that run as native Windows services and not as DaemonSets or Deployments.
126
130
127
- Here are some example of how a cluser administrator would use this feature:
131
+ Here are some examples of how a cluster administrator would use this feature:
128
132
```
129
133
# Show kubelet and crio journal logs from all masters
130
- kubectl logs nodes --role master -s kubelet -s crio
134
+ kubectl node-logs --role master -q kubelet -q crio
131
135
132
136
# Show kubelet log file (/var/log/kubelet/kubelet.log) from all Windows worker nodes
133
- kubectl logs nodes --label kubernetes.io/os=windows -s kubelet
137
+ kubectl node-logs --label kubernetes.io/os=windows -q kubelet
134
138
135
139
# Display docker runtime WinEvent log entries from a specific Windows worker node
136
- kubectl logs nodes <node-name> --service docker
140
+ kubectl node-logs <node-name> --query docker
137
141
```
138
142
139
143
### Risks and Mitigations
@@ -163,7 +167,7 @@ that is lacking a client. Given its existence we can supplement that with a
163
167
wafer thin shim over the /journal directory that shells out to journalctl. This
164
168
allows us to extend the endpoint for getting logs from the system journal on
165
169
Linux systems that support systemd. To enable filtering of logs, we can reuse
166
- the existing filters supported by journalctl. The `kubectl logs` will have
170
+ the existing filters supported by journalctl. The `kubectl node-logs` will have
167
171
command line options for specifying these filters when interacting with node
168
172
objects.
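
As a rough illustration of how such a shim might translate filter options into a journalctl invocation, here is a minimal Python sketch. The function name `build_journalctl_args` and its parameters are hypothetical and are not taken from the reference implementation; only the journalctl flags themselves (`--unit`, `--since`, `--until`, `--lines`, `--grep`, `--output`, `--no-pager`, `--utc`) are existing journalctl options. The sketch only builds the argument list and does not run anything.

```python
from typing import List, Optional, Sequence


def build_journalctl_args(
    services: Sequence[str],
    since: Optional[str] = None,
    until: Optional[str] = None,
    tail: Optional[int] = None,
    grep: Optional[str] = None,
    output: str = "short",
) -> List[str]:
    """Build an argv list for journalctl (hypothetical helper; nothing is executed)."""
    args = ["journalctl", "--utc", "--no-pager", f"--output={output}"]
    for service in services:
        # journalctl accepts --unit multiple times and matches any listed unit.
        args.append(f"--unit={service}")
    if since:
        args.append(f"--since={since}")
    if until:
        args.append(f"--until={until}")
    if tail is not None and tail >= 0:
        # A negative value is treated here as "no limit" (an assumption of this sketch).
        args.append(f"--lines={tail}")
    if grep:
        args.append(f"--grep={grep}")
    return args
```

Passing the result as an argument list (rather than a shell string) to whatever process-spawning call the shim uses would avoid shell interpretation of user-supplied filter values.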
169
173
@@ -191,13 +195,13 @@ configured. Here are some examples:
191
195
The `/var/log/` endpoint is enabled using the `enableSystemLogHandler` kubelet
192
196
configuration option. To gain access to this new feature this option needs to
193
197
be enabled. In addition when introducing this feature it will be hidden behind a
194
- `NodeLogs` feature gate in the kubelet that needs to be explicitly enabled. So
198
+ `NodeLogViewer` feature gate in the kubelet that needs to be explicitly enabled. So
195
199
you need to enable both options to get access to this new feature and disabling
196
200
`enableSystemLogHandler` will disable the new feature irrespective of the
197
- `NodeLogs` feature gate.
201
+ `NodeLogViewer` feature gate.
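
The interaction of the two switches described above can be summarized in a small Python sketch. The function name `node_log_viewer_available` is hypothetical and only models the rule stated in the text; it is not how the kubelet actually evaluates its configuration.

```python
def node_log_viewer_available(enable_system_log_handler: bool,
                              node_log_viewer_gate: bool) -> bool:
    """Model of the enablement rule: both switches must be on (illustrative only)."""
    # Disabling enableSystemLogHandler turns the feature off regardless of the gate.
    return enable_system_log_handler and node_log_viewer_gate


# Enumerate all four combinations of the two switches.
for handler in (False, True):
    for gate in (False, True):
        print(f"enableSystemLogHandler={handler} NodeLogViewer={gate} "
              f"-> available={node_log_viewer_available(handler, gate)}")
```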
198
202
199
- A reference implementation of this feature without the feature gate is
200
- available [here](https://github.com/kubernetes/kubernetes/pull/96120).
203
+ A reference implementation of this feature is available
204
+ [here](https://github.com/kubernetes/kubernetes/pull/96120).
201
205
202
206
#### kubectl
203
207
@@ -209,36 +213,42 @@ to appropriate resource type and associated endpoints, it will allow us to
209
213
restrict node logs access to only cluster administrators as long as the cluster
210
214
is set up in that manner. Access to the `node/logs` sub-resource needs to be
211
215
explicitly granted as a user with access to `nodes` will not automatically have
212
- access to `node/logs`.
216
+ access to `node/logs`. In the alpha phase the functionality will be behind the
217
+ `kubectl alpha node-logs` sub-command. The functionality will be moved to
218
+ `kubectl node-logs` in the beta phase. However the examples will reference the
219
+ final destination i.e. `kubectl node-logs`.
213
220
214
- The `logs` sub-command for node objects will follow a heuristics approach when
221
+ The `logs --query` sub-command for node objects will follow a heuristics approach when
215
222
asked to query for logs from a Windows or Linux service. If asked to get the
216
223
logs from a service `foobar`, it will first assume `foobar` logs to the Linux
217
224
journal / Windows eventing mechanisms (Application, System, and ETW). If unable
218
225
to get logs from these, it will attempt to get logs from `/var/log/foobar.log`,
219
226
`/var/log/foobar/foobar.log`, `/var/log/foobar*INFO` or
220
- `/var/log/foobar/foobar*INFO` in that order.
227
+ `/var/log/foobar/foobar*INFO` in that order. Alternatively an explicit file
228
+ location can be passed to the `--query` option.
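
The lookup order for a service query can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the reference implementation: the function names, the `has_native_logs` callback (standing in for the journal / Windows event check) and the choice to return the first glob match are all hypothetical. Handling of an explicit file location is left out because the text does not specify how it is distinguished from a service name.

```python
import glob
import os
from typing import Callable, List, Optional, Tuple


def candidate_log_patterns(service: str, log_dir: str = "/var/log") -> List[str]:
    """File patterns tried, in order, when the service has no native logs."""
    return [
        os.path.join(log_dir, f"{service}.log"),
        os.path.join(log_dir, service, f"{service}.log"),
        os.path.join(log_dir, f"{service}*INFO"),
        os.path.join(log_dir, service, f"{service}*INFO"),
    ]


def resolve_service_logs(
    service: str,
    has_native_logs: Callable[[str], bool],
    log_dir: str = "/var/log",
) -> Optional[Tuple[str, str]]:
    """Return ("native", service), ("file", path), or None if nothing is found."""
    # First assume the service logs to the journal / Windows eventing mechanisms.
    if has_native_logs(service):
        return ("native", service)
    # Otherwise fall back to the file locations in the documented order.
    for pattern in candidate_log_patterns(service, log_dir):
        matches = sorted(glob.glob(pattern))
        if matches:
            # Picking the first match is an assumption of this sketch.
            return ("file", matches[0])
    return None
```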
221
229
Here are some examples and explanation of the options that will be added.
222
230
```
223
231
Examples:
224
232
# Show kubelet logs from all masters
225
- kubectl logs nodes --role master -s kubelet
233
+ kubectl node-logs --role master -q kubelet
226
234
227
235
# Show docker logs from Windows nodes
228
- kubectl logs nodes -l kubernetes.io/os=windows -s docker
236
+ kubectl node-logs -l kubernetes.io/os=windows -q docker
237
+
238
+ # Show foo.log from Windows nodes
239
+ kubectl node-logs -l kubernetes.io/os=windows -q /foo/foo.log
229
240
230
241
Options:
231
- --case-sensitive=true: Filters are case sensitive by default. Pass --case-sensitive=false to do a case insensitive filter.
232
242
-g, --grep='': Filter log entries by the provided regex pattern. Only applies to node journal logs.
233
243
-o, --output='': Display journal logs in an alternate format (short, cat, json, short-unix). Only applies to node journal logs.
234
244
--raw=false: Perform no transformation of the returned data.
235
245
--role='': Set a label selector by node role.
236
246
-l, --selector='': Selector (label query) to filter on.
237
- --since='': Return logs after a specific ISO timestamp or relative date. Only applies to node journal or Get-WinEvent logs.
238
- --tail=0: Return up to this many lines (not more than 100k) from the end of the log. Only applies to node journal or Get-WinEvent logs.
247
+ --since-time='': Return logs after a specific ISO timestamp.
248
+ --tail=-1: Return up to this many lines (not more than 100k) from the end of the log.
239
249
--sort=timestamp: Interleave logs by sorting the output. Defaults on when viewing node journal logs.
240
- -s, --service=[]: Return log entries from the specified service(s).
241
- --until='': Return logs before a specific ISO timestamp or relative date.
250
+ -q, --query=[]: Return log entries that match any of the specified service(s).
251
+ --until-time='': Return logs before a specific ISO timestamp.
242
252
```
243
253
244
254
The `--sort=timestamp` feature will introduce log unification across node
@@ -247,43 +257,78 @@ to see logs across nodes from the same time. Similarly for pods, it will allow
247
257
seeing logs across containers aligned by time.
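
One way to picture the timestamp interleaving is a k-way merge of per-node log streams. The Python sketch below is illustrative only: the function name and data shapes are hypothetical, and it assumes each node's entries are already in chronological order and carry timestamps in the same ISO 8601 UTC format, so that string comparison matches time order.

```python
import heapq
from typing import Dict, Iterator, List, Tuple

LogEntry = Tuple[str, str]  # (ISO 8601 UTC timestamp, message)


def interleave_by_timestamp(
    per_node: Dict[str, List[LogEntry]],
) -> Iterator[Tuple[str, str, str]]:
    """Merge per-node logs into one stream of (timestamp, node, message)."""
    streams = (
        [(ts, node, msg) for ts, msg in entries]
        for node, entries in per_node.items()
    )
    # heapq.merge lazily merges already-sorted inputs into one sorted stream.
    return heapq.merge(*streams)


logs = {
    "node-a": [("2022-06-01T10:00:00Z", "a1"), ("2022-06-01T10:00:05Z", "a2")],
    "node-b": [("2022-06-01T10:00:02Z", "b1")],
}
for ts, node, msg in interleave_by_timestamp(logs):
    print(ts, node, msg)
```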
248
258
249
259
Given that the feature will be introduced behind a feature gate, by default
250
- `kubectl logs nodes` will return a feature not enabled message. When the
251
- feature is enabled in alpha phase, `kubectl logs nodes` will display a
252
- warning message that the feature is in alpha. When the `--service` option
260
+ `kubectl node-logs` will return a functionality not available message. When the
261
+ feature is enabled in alpha phase, `kubectl node-logs` will display a
262
+ warning message that the feature is in alpha. When the `--query` option
253
263
is used against Linux nodes that do not support systemd/journald and the service
254
- does not log to `/var/log`, an OS not supported message will be returned.
264
+ does not log to `/var/log`, the same functionality not available message will be
265
+ returned.
255
266
256
267
### Test Plan
268
+
269
+ [x] I/we understand the owners of the involved components may require updates to
270
+ existing tests to make this code solid enough prior to committing the changes necessary
271
+ to implement this enhancement.
272
+
273
+ ##### Prerequisite testing updates
274
+
275
+ ##### Unit tests
276
+
257
277
Add unit tests to kubelet and kubectl that exercise the new arguments that
258
278
have been added. A reference implementation of the tests can be seen
259
- [here](https://github.com/kubernetes/kubernetes/pull/96120/commits/c606a38ec38ccfe486033495a1dc433279ce71f8#diff-1d703a87c6d6156adf2d0785ec0174bb365855d4883f5758c05fda1fee8f7f1bR1)
279
+ [here](https://github.com/kubernetes/kubernetes/pull/96120/commits/253dbad91a3896680da74da32595f02120f56cfa#diff-1d703a87c6d6156adf2d0785ec0174bb365855d4883f5758c05fda1fee8f7f1b)
280
+
281
+ Given that a new kubelet package is introduced as part of this feature there is
282
+ no existing test coverage to link to.
283
+
284
+ ##### Integration tests
285
+
286
+ Given that we need the kubelet running locally to test this feature, integration
287
+ tests will not be possible for this feature.
288
+
289
+ ##### e2e tests
290
+
291
+ We will add a test that queries the kubelet service logs on Windows and Linux nodes.
292
+ On Windows nodes, the same kubelet service logs will be queried by explicitly
293
+ specifying the log file. On Linux the explicit log file query will be tested by
294
+ querying a random file present in /var/log.
295
+
296
+ On the Linux side tests will be added to the [kubelet node](https://github.com/kubernetes/kubernetes/blob/master/test/e2e/node/kubelet.go)
297
+ e2e tests. For Windows a new set of tests will be added to the existing
298
+ [e2e tests](https://github.com/kubernetes/kubernetes/tree/master/test/e2e/windows).
299
+
300
+ - node: https://storage.googleapis.com/k8s-triage/index.html?sig=node
301
+ - windows: https://storage.googleapis.com/k8s-triage/index.html?sig=windows
260
302
261
303
### Graduation Criteria
262
304
263
- The plan is to introduce the feature as alpha in the v1.22 time frame behind the
264
- `NodeLogs` feature gate.
305
+ The plan is to introduce the feature as alpha in the v1.25 time frame behind the
306
+ `NodeLogViewer` kubelet feature gate and using the `kubectl alpha node-logs`
307
+ sub-command.
265
308
266
309
#### Alpha -> Beta Graduation
267
310
268
- The plan is to graduate the feature to beta in the v1.23 time frame. At that
311
+ The plan is to graduate the feature to beta in the v1.26 time frame. At that
269
312
point we would have collected feedback from cluster administrators and
270
313
developers who have enabled the feature. Based on this feedback and issues
271
314
opened we should consider adding a kubelet-side throttle for viewing the
272
315
logs. In addition we will garner feedback on the heuristic approach and based on
273
316
that we will decide if we need to introduce options to explicitly differentiate
274
317
between file vs journal / WinEvent logs.
275
318
319
+ The kubectl implementation will move from `kubectl alpha node-logs` to
320
+ `kubectl node-logs`.
276
321
#### Beta -> GA Graduation
277
322
278
- The plan is to graduate the feature to GA in the v1.24 time frame at which point
323
+ The plan is to graduate the feature to GA in the v1.27 time frame at which point
279
324
any major issues should have been surfaced and addressed during the alpha and
280
325
beta phases.
281
326
282
327
### Upgrade / Downgrade Strategy
283
328
284
329
### Version Skew Strategy
285
330
286
- If a kubectl version that has the new `logs nodes` option is used against a node
331
+ If a kubectl version that has the new `node-logs` option is used against a node
287
332
that is using a kubelet that does not have the extended `/var/log` endpoint
288
333
viewer, the result should be "feature not supported".
289
334
@@ -293,13 +338,13 @@ viewer, the result should be "feature not supported".
293
338
294
339
* **How can this feature be enabled / disabled in a live cluster?**
295
340
- [x] Feature gate
296
- - Feature gate name: NodeLogs
341
+ - Feature gate name: NodeLogViewer
297
342
- Components depending on the feature gate: kubelet
298
343
299
344
* **Does enabling the feature change any default behavior?** No
300
345
301
346
* **Can the feature be disabled once it has been enabled (i.e. can we roll back
302
- the enablement)?** Yes. It can be disabled by disabling the `NodeLogs` feature
347
+ the enablement)?** Yes. It can be disabled by disabling the `NodeLogViewer` feature
303
348
gate in the kubelet.
304
349
305
350
* **What happens if we reenable the feature if it was previously rolled back?**
@@ -373,5 +418,5 @@ logs. The Windows side would require privileged container support. However this
373
418
would not help scenarios where containers are not launching successfully on the
374
419
nodes.
375
420
376
- For the kubectl changes an alternative to extending `kubect logs` would be to
377
- introduce a plugin or add a new sub-command under `kubectl alpha`.
421
+ For the kubectl changes an alternative to introducing `kubectl node-logs` would be to
422
+ introduce a plugin.
422
+ introduce a plugin.