Commit 97f9cf3

Merge pull request kubernetes#3341 from aravindhp/kep-2258-update-api-milestone-1.25
Windows/KEP-2258: Update API and milestones for Node Log Viewer
2 parents fba79ff + 185a441 commit 97f9cf3

File tree

2 files changed

+87
-43
lines changed

keps/sig-windows/2258-node-service-log-viewer/README.md

Lines changed: 83 additions & 38 deletions
Original file line numberDiff line numberDiff line change
@@ -19,6 +19,10 @@
1919
- [kubelet](#kubelet)
2020
- [kubectl](#kubectl)
2121
- [Test Plan](#test-plan)
22+
- [Prerequisite testing updates](#prerequisite-testing-updates)
23+
- [Unit tests](#unit-tests)
24+
- [Integration tests](#integration-tests)
25+
- [e2e tests](#e2e-tests)
2226
- [Graduation Criteria](#graduation-criteria)
2327
- [Alpha -> Beta Graduation](#alpha---beta-graduation)
2428
- [Beta -> GA Graduation](#beta---ga-graduation)
@@ -96,16 +100,16 @@ This would work for:
96100
## Proposal
97101

98102
### Implement client for logs endpoint viewer (OS agnostic)
99-
- Extend `kubectl logs` to work with node objects.
103+
- Implement a new `kubectl node-logs` to work with node objects.
100104
- Implement a client for the `/var/log/` kubelet endpoint viewer.
101105

102106
### Linux distros with systemd / journald
103107
Supplement the `/var/log/` endpoint viewer on the kubelet with a thin shim
104-
over the `journal` directory that shells out to journalctl. Then extend
105-
`kubectl logs` to also work with node objects.
108+
over the `journal` directory that shells out to journalctl. Then implement
109+
`kubectl node-logs` to also work with node objects.
106110

107111
### Linux distributions without systemd / journald
108-
Running the new "kubectl logs nodes" command against services on nodes that do
112+
Running the new "kubectl node-logs" command against services on nodes that do
109113
not use systemd / journald should return "OS not supported". However getting
110114
logs from `/var/log/` should work on all systems.
111115

@@ -118,22 +122,22 @@ Reuse the kubelet API for querying the Linux journal for invoking the
118122
Consider a scenario where pods / containers are refusing to come up on certain
119123
nodes. As mentioned in the motivation section, troubleshooting this scenario
120124
involves the cluster administrator to SSH into nodes to scan the logs. Allowing
121-
them to use `kubectl logs` to do the same as they would to debug issues with a
125+
them to use `kubectl node-logs` to do the same as they would to debug issues with a
122126
pod / container would greatly simplify their debug workflow. This also opens up
123127
opportunities for tooling and simplifying automated log gathering. The feature
124128
can also be used to debug issues with Kubernetes services especially in Windows
125129
nodes that run as native Windows services and not as DaemonSets or Deployments.
126130

127-
Here are some example of how a cluser administrator would use this feature:
131+
Here are some examples of how a cluster administrator would use this feature:
128132
```
129133
# Show kubelet and crio journal logs from all masters
130-
kubectl logs nodes --role master -s kubelet -s crio
134+
kubectl node-logs --role master -q kubelet -q crio
131135
132136
# Show kubelet log file (/var/log/kubelet/kubelet.log) from all Windows worker nodes
133-
kubectl logs nodes --label kubernetes.io/os=windows -s kubelet
137+
kubectl node-logs --label kubernetes.io/os=windows -q kubelet
134138
135139
# Display docker runtime WinEvent log entries from a specific Windows worker node
136-
kubectl logs nodes <node-name> --service docker
140+
kubectl node-logs <node-name> --query docker
137141
```
138142

139143
### Risks and Mitigations
@@ -163,7 +167,7 @@ that is lacking a client. Given its existence we can supplement that with a
163167
wafer thin shim over the /journal directory that shells out to journalctl. This
164168
allows us to extend the endpoint for getting logs from the system journal on
165169
Linux systems that support systemd. To enable filtering of logs, we can reuse
166-
the existing filters supported by journalctl. The `kubectl logs` will have
170+
the existing filters supported by journalctl. The `kubectl node-logs` will have
167171
command line options for specifying these filters when interacting with node
168172
objects.
169173

@@ -191,13 +195,13 @@ configured. Here are some examples:
191195
The `/var/log/` endpoint is enabled using the `enableSystemLogHandler` kubelet
192196
configuration options. To gain access to this new feature this option needs to
193197
be enabled. In addition when introducing this feature it will be hidden behind a
194-
`NodeLogs` feature gate in the kubelet that needs to be explicitly enabled. So
198+
`NodeLogViewer` feature gate in the kubelet that needs to be explicitly enabled. So
195199
you need to enable both options to get access to this new feature and disabling
196200
`enableSystemLogHandler` will disable the new feature irrespective of the
197-
`NodeLogs` feature gate.
201+
`NodeLogViewer` feature gate.
198202

199-
A reference implementation of this feature without the feature gate is
200-
available [here](https://github.com/kubernetes/kubernetes/pull/96120).
203+
A reference implementation of this feature is available
204+
[here](https://github.com/kubernetes/kubernetes/pull/96120).
201205

202206
#### kubectl
203207

@@ -209,36 +213,42 @@ to appropriate resource type and associated endpoints, it will allow us to
209213
restrict node logs access to only cluster administrators as long as the cluster
210214
is set up in that manner. Access to the `node/logs` sub-resource needs to be
211215
explicitly granted as a user with access to `nodes` will not automatically have
212-
access to `node/logs`.
216+
access to `node/logs`. In the alpha phase the functionality will be behind
217+
`kubectl alpha node-logs` sub-command. The functionality will be moved to
218+
`kubectl node-logs` in the beta phase. However the examples will reference the
219+
final destination i.e. `kubectl node-logs`.
213220

214-
The `logs` sub-command for node objects will follow a heuristics approach when
221+
The `logs --query` sub-command for node objects will follow a heuristics approach when
215222
asked to query for logs from a Windows or Linux service. If asked to get the
216223
logs from a service `foobar`, it will first assume `foobar` logs to the Linux
217224
journal / Windows eventing mechanisms (Application, System, and ETW). If unable
218225
to get logs from these, it will attempt to get logs from `/var/log/foobar.log`,
219226
`/var/log/foobar/foobar.log`, `/var/log/foobar*INFO` or
220-
`/var/log/foobar/foobar*INFO` in that order.
227+
`/var/log/foobar/foobar*INFO` in that order. Alternatively an explicit file
228+
location can be passed to the `--query` option.
221229
Here are some examples and explanation of the options that will be added.
222230
```
223231
Examples:
224232
# Show kubelet logs from all masters
225-
kubectl logs nodes --role master -s kubelet
233+
kubectl node-logs --role master -q kubelet
226234
227235
# Show docker logs from Windows nodes
228-
kubectl logs nodes -l kubernetes.io/os=windows -s docker
236+
kubectl node-logs -l kubernetes.io/os=windows -q docker
237+
238+
# Show foo.log from Windows nodes
239+
kubectl node-logs -l kubernetes.io/os=windows -q /foo/foo.log
229240
230241
Options:
231-
--case-sensitive=true: Filters are case sensitive by default. Pass --case-sensitive=false to do a case insensitive filter.
232242
-g, --grep='': Filter log entries by the provided regex pattern. Only applies to node journal logs.
233243
-o, --output='': Display journal logs in an alternate format (short, cat, json, short-unix). Only applies to node journal logs.
234244
--raw=false: Perform no transformation of the returned data.
235245
--role='': Set a label selector by node role.
236246
-l, --selector='': Selector (label query) to filter on.
237-
--since='': Return logs after a specific ISO timestamp or relative date. Only applies to node journal or Get-WinEvent logs.
238-
--tail=0: Return up to this many lines (not more than 100k) from the end of the log. Only applies to node journal or Get-WinEvent logs.
247+
--since-time='': Return logs after a specific ISO timestamp.
248+
--tail=-1: Return up to this many lines (not more than 100k) from the end of the log.
239249
--sort=timestamp: Interleave logs by sorting the output. Defaults on when viewing node journal logs.
240-
-s, --service=[]: Return log entries from the specified service(s).
241-
--until='': Return logs before a specific ISO timestamp or relative date.
250+
-q, --query=[]: Return log entries that match any of the specified service(s).
251+
--until-time='': Return logs before a specific ISO timestamp.
242252
```
243253

244254
The `--sort=timestamp` feature will introduce log unification across node
@@ -247,43 +257,78 @@ to see logs across nodes from the same time. Similarly for pods, it will allow
247257
seeing logs across containers aligned by time.
248258

249259
Given that the feature will be introduced behind a feature gate, by default
250-
`kubectl logs nodes` will return a feature not enabled message. When the
251-
feature is enabled in alpha phase, `kubectl logs nodes` will display a
252-
warning message that the feature is in alpha. When the `--service` option
260+
`kubectl node-logs` will return a functionality not available message. When the
261+
feature is enabled in alpha phase, `kubectl node-logs` will display a
262+
warning message that the feature is in alpha. When the `--query` option
253263
is used against Linux nodes that do not support systemd/journald and the service
254-
does not log to `/var/log`, an OS not supported message will be returned.
264+
does not log to `/var/log`, the same functionality not available message will be
265+
returned.
255266

256267
### Test Plan
268+
269+
[x] I/we understand the owners of the involved components may require updates to
270+
existing tests to make this code solid enough prior to committing the changes necessary
271+
to implement this enhancement.
272+
273+
##### Prerequisite testing updates
274+
275+
##### Unit tests
276+
257277
Add unit tests to kubelet and kubectl that exercise the new arguments that
258278
have been added. A reference implementation of the tests can be seen
259-
[here](https://github.com/kubernetes/kubernetes/pull/96120/commits/c606a38ec38ccfe486033495a1dc433279ce71f8#diff-1d703a87c6d6156adf2d0785ec0174bb365855d4883f5758c05fda1fee8f7f1bR1)
279+
[here](https://github.com/kubernetes/kubernetes/pull/96120/commits/253dbad91a3896680da74da32595f02120f56cfa#diff-1d703a87c6d6156adf2d0785ec0174bb365855d4883f5758c05fda1fee8f7f1b)
280+
281+
Given that a new kubelet package is introduced as part of this feature there is
282+
no existing test coverage to link to.
283+
284+
##### Integration tests
285+
286+
Given that we need the kubelet running locally to test this feature, integration
287+
tests will not be possible for this feature.
288+
289+
##### e2e tests
290+
291+
We will add a test that queries the kubelet service logs on Windows and Linux nodes.
292+
On Windows nodes, the same kubelet service logs will be queried by explicitly
293+
specifying the log file. On Linux, the explicit log file query will be tested by
294+
querying a random file present in /var/log.
295+
296+
On the Linux side tests will be added to [kubelet node](https://github.com/kubernetes/kubernetes/blob/master/test/e2e/node/kubelet.go)
297+
e2e tests. For Windows a new set of tests will be added to the existing
298+
[e2e tests](https://github.com/kubernetes/kubernetes/tree/master/test/e2e/windows).
299+
300+
- node: https://storage.googleapis.com/k8s-triage/index.html?sig=node
301+
- windows: https://storage.googleapis.com/k8s-triage/index.html?sig=windows
260302

261303
### Graduation Criteria
262304

263-
The plan is to introduce the feature as alpha in the v1.22 time frame behind the
264-
`NodeLogs` feature gate.
305+
The plan is to introduce the feature as alpha in the v1.25 time frame behind the
306+
`NodeLogViewer` kubelet feature gate and using the `kubectl alpha node-logs`
307+
sub-command.
265308

266309
#### Alpha -> Beta Graduation
267310

268-
The plan is to graduate the feature to beta in the v1.23 time frame. At that
311+
The plan is to graduate the feature to beta in the v1.26 time frame. At that
269312
point we would have collected feedback from cluster administrators and
270313
developers who have enabled the feature. Based on this feedback and issues
271314
opened we should consider adding a kubelet side throttle for viewing the
272315
logs. In addition we will garner feedback on the heuristic approach and based on
273316
that we will decide if we need to introduce options to explicitly differentiate
274317
between file vs journal / WinEvent logs.
275318

319+
The kubectl implementation will move from `kubectl alpha node-logs` to
320+
`kubectl node-logs`.
276321
#### Beta -> GA Graduation
277322

278-
The plan is to graduate the feature to GA in the v1.24 time frame at which point
323+
The plan is to graduate the feature to GA in the v1.27 time frame at which point
279324
any major issues should have been surfaced and addressed during the alpha and
280325
beta phases.
281326

282327
### Upgrade / Downgrade Strategy
283328

284329
### Version Skew Strategy
285330

286-
If a kubectl version that has the new `logs nodes` option is used against a node
331+
If a kubectl version that has the new `node-logs` option is used against a node
287332
that is using a kubelet that does not have the extended `/var/log` endpoint
288333
viewer, the result should be "feature not supported".
289334

@@ -293,13 +338,13 @@ viewer, the result should be "feature not supported".
293338

294339
* **How can this feature be enabled / disabled in a live cluster?**
295340
- [x] Feature gate
296-
- Feature gate name: NodeLogs
341+
- Feature gate name: NodeLogViewer
297342
- Components depending on the feature gate: kubelet
298343

299344
* **Does enabling the feature change any default behavior?** No
300345

301346
* **Can the feature be disabled once it has been enabled (i.e. can we roll back
302-
the enablement)?** Yes. It can be disabled by disabling the `NodeLogs` feature
347+
the enablement)?** Yes. It can be disabled by disabling the `NodeLogViewer` feature
303348
gate in the kubelet.
304349

305350
* **What happens if we reenable the feature if it was previously rolled back?**
@@ -373,5 +418,5 @@ logs. The Windows side would require privileged container support. However this
373418
would not help scenarios where containers are not launching successfully on the
374419
nodes.
375420

376-
For the kubectl changes an alternative to extending `kubect logs` would be to
377-
introduce a plugin or add a new sub-command under `kubectl alpha`.
421+
For the kubectl changes an alternative to introducing `kubectl node-logs` would be to
422+
introduce a plugin.
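
Reviewer note: the README changes above describe a fallback order for `--query`. The journal / Windows event sources are tried first, then four file patterns under `/var/log/`, unless an explicit file location is passed. The following Go sketch only illustrates that ordering. The names `candidateLogPaths` and `isExplicitFile` are hypothetical, and the rule used to tell a file path from a service name (presence of a path separator) is an assumption, since the KEP does not specify one.

```go
package main

import (
	"fmt"
	"strings"
)

// candidateLogPaths lists, in the order given by the KEP, the file patterns
// tried after the journal / Windows event lookup for a service has failed.
// Hypothetical helper; not taken from the kubelet source.
func candidateLogPaths(logDir, service string) []string {
	logDir = strings.TrimSuffix(logDir, "/")
	return []string{
		fmt.Sprintf("%s/%s.log", logDir, service),
		fmt.Sprintf("%s/%s/%s.log", logDir, service, service),
		fmt.Sprintf("%s/%s*INFO", logDir, service),
		fmt.Sprintf("%s/%s/%s*INFO", logDir, service, service),
	}
}

// isExplicitFile guesses whether a --query value is a file location rather
// than a service name. Assumption: any path separator means "file".
func isExplicitFile(query string) bool {
	return strings.ContainsAny(query, `/\`)
}

func main() {
	for _, q := range []string{"kubelet", "/foo/foo.log"} {
		if isExplicitFile(q) {
			fmt.Println("explicit file:", q)
			continue
		}
		fmt.Println("service:", q)
		for _, p := range candidateLogPaths("/var/log/", q) {
			fmt.Println("  fallback:", p)
		}
	}
}
```

The patterns ending in `*INFO` are globs, so a real implementation would still need to expand them against the node's file system; the sketch stops at producing the ordered list.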

keps/sig-windows/2258-node-service-log-viewer/kep.yaml

Lines changed: 4 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -12,6 +12,7 @@ status: implementable
1212
reviewers:
1313
- "@marosset"
1414
- "@immuzz"
15+
- "@thockin"
1516
approvers:
1617
- "@marosset"
1718
prr-approvers:
@@ -24,18 +25,16 @@ stage: alpha
2425
# The most recent milestone for which work toward delivery of this KEP has been
2526
# done. This can be the current (upcoming) milestone, if it is being actively
2627
# worked on.
27-
latest-milestone: "v1.24"
28+
latest-milestone: "v1.25"
2829

2930
# The milestone at which this feature was, or is targeted to be, at each stage.
3031
milestone:
31-
alpha: "v1.24"
32-
beta: "v1.25"
33-
stable: "v1.26"
32+
alpha: "v1.25"
3433

3534
# The following PRR answers are required at alpha release
3635
# List the feature gate name and the components for which it must be enabled
3736
feature-gates:
38-
- name: NodeLogs
37+
- name: NodeLogViewer
3938
components:
4039
- kubelet
4140
disable-supported: true
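
Reviewer note: the README diff states that the feature is reachable only when both the `enableSystemLogHandler` kubelet option and the `NodeLogViewer` feature gate are enabled, and that disabling `enableSystemLogHandler` turns the feature off regardless of the gate. A minimal Go sketch of that rule follows. The `kubeletSettings` struct and `nodeLogQueryAvailable` function are hypothetical stand-ins, not the kubelet's actual types.

```go
package main

import "fmt"

// kubeletSettings is a hypothetical stand-in for the two switches named in
// the KEP: the enableSystemLogHandler option and the NodeLogViewer gate.
type kubeletSettings struct {
	EnableSystemLogHandler bool
	NodeLogViewerGate      bool
}

// nodeLogQueryAvailable reports whether the node log query feature would be
// exposed: both switches must be on.
func nodeLogQueryAvailable(s kubeletSettings) bool {
	return s.EnableSystemLogHandler && s.NodeLogViewerGate
}

func main() {
	cases := []kubeletSettings{
		{EnableSystemLogHandler: true, NodeLogViewerGate: true},
		{EnableSystemLogHandler: true, NodeLogViewerGate: false},
		{EnableSystemLogHandler: false, NodeLogViewerGate: true},
	}
	for _, c := range cases {
		fmt.Printf("%+v -> available=%v\n", c, nodeLogQueryAvailable(c))
	}
}
```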
