Skip to content

Commit 64e639a

Browse files
committed
Address PRR and other review comments
1 parent 84adca7 commit 64e639a

File tree

1 file changed

+24
-19
lines changed

1 file changed

+24
-19
lines changed

keps/sig-node/2400-node-swap/README.md

Lines changed: 24 additions & 19 deletions
Original file line numberDiff line numberDiff line change
@@ -164,6 +164,7 @@ cgroupsv2 improved memory management algos, such as oomd, currently require
164164
swap. Hence, having a small amount of swap available on nodes could improve
165165
better resource pressure handling and recovery.
166166

167+
- https://man7.org/linux/man-pages/man8/systemd-oomd.service.8.html
167168
- https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html#id1
168169
- https://chrisdown.name/2018/01/02/in-defence-of-swap.html
169170
- https://media.ccc.de/v/ASG2018-175-oomd
@@ -258,10 +259,10 @@ per-workload parameter, and `swappiness` is optional and can be set globally,
258259
we are choosing to only expose `memory-swap` which will adjust swap available
259260
to workloads.
260261

261-
Since we are not currently setting `memory-swap` in the CRI, the default
262-
behaviour is to allocate the same amount of swap for a workload as memory
263-
requested. We will update the default to not permit the use of swap by setting
264-
`memory-swap` equal to `limit`.
262+
Since we are not currently setting `memory-swap` in the CRI, the current
263+
default behaviour when `--fail-swap-on=false` is set is to allocate the same
264+
amount of swap for a workload as memory requested. We will update the default
265+
to not permit the use of swap by setting `memory-swap` equal to `limit`.
265266

266267
[runtime specification]: https://github.com/opencontainers/runtime-spec/blob/1c3f411f041711bbeecf35ff7e93461ea6789220/config-linux.md#memory
267268

@@ -347,9 +348,9 @@ type LimitedSwapConfiguration struct {
347348
}
348349
```
349350

350-
The `MemorySwapConfiguration.SwapBehavior` setting will have the following
351-
effects, based on the [Docker] and open container specification for the
352-
`--memory-swap` flag:
351+
We want to expose all possible swap settings based on the [Docker] and open
352+
container specification for the `--memory-swap` flag. Thus, the
353+
`MemorySwapConfiguration.SwapBehavior` setting will have the following effects:
353354

354355
* If `SwapBehavior` is not set or set to `"NoSwap"`, containers do not have
355356
access to swap. This value effectively prevents a container from using swap,
@@ -361,7 +362,7 @@ effects, based on the [Docker] and open container specification for the
361362
and swap.
362363
* If `SwapBehavior` is set to `"UnlimitedSwap"`, the container is allowed to
363364
use unlimited swap, up to the maximum amount available on the host system.
364-
* If `SwapBehavior` is set to a `"LimitedSwap"`, then the `LimitedSwap`
365+
* If `SwapBehavior` is set to `"LimitedSwap"`, then the `LimitedSwap`
365366
configuration must also be set. `LimitedSwap.PerWorkloadMemorySwapLimit`
366367
represents the system-wide maximum limit for combined memory and swap usage
367368
of a container. For example, if the limit is set to `1Gi`:
@@ -387,13 +388,9 @@ swap usage in container runtimes. We will introduce a parameter
387388
// resources.
388389
message LinuxContainerResources {
389390
...
390-
// Memory limit in bytes. Default: 0 (not specified).
391-
int64 memory_limit_in_bytes = 4;
392391
// Memory + swap limit in bytes. Default: 0 (not specified).
393392
int64 memory_swap_limit_in_bytes = 9;
394393
...
395-
// List of HugepageLimits to limit the HugeTLB usage of container per page size. Default: nil (not specified).
396-
repeated HugepageLimit hugepage_limits = 8;
397394
}
398395
```
399396

@@ -511,12 +508,16 @@ Pick one of these and delete the rest.
511508
- [x] Feature gate (also fill in values in `kep.yaml`)
512509
- Feature gate name: NodeSwapEnabled
513510
- Components depending on the feature gate: API Server, Kubelet
514-
- [ ] Other
515-
- Describe the mechanism:
511+
- [x] Other
512+
- Describe the mechanism: `--fail-swap-on=false` flag for kubelet must also
513+
be set at kubelet start
516514
- Will enabling / disabling the feature require downtime of the control
517-
plane?
515+
plane? Yes. Flag must be set on kubelet start. To disable, kubelet must be
516+
restarted. Hence, there would be brief control component downtime on a
517+
given node.
518518
- Will enabling / disabling the feature require downtime or reprovisioning
519519
of a node? (Do not assume `Dynamic Kubelet Config` feature is enabled).
520+
Yes. See above; disabling would require brief node downtime.
520521

521522
###### Does enabling the feature change any default behavior?
522523

@@ -541,10 +542,14 @@ feature, can it break the existing applications?).
541542
NOTE: Also set `disable-supported` to `true` or `false` in `kep.yaml`.
542543
-->
543544

544-
No. I don’t think it makes much sense to be able to provide users a meaningful
545-
ability to disable the feature flag at runtime, as this would be highly
546-
disruptive to workloads and difficult to implement. To turn this off, the
547-
kubelet would need to be restarted.
545+
No. The feature flag can be disabled while the `--fail-swap-on=false` flag is
546+
set, but this would result in undefined behaviour.
547+
548+
To turn this off, the kubelet would need to be restarted. If a cluster admin
549+
wants to disable swap on the node without repartitioning the node, they could
550+
stop the kubelet, set `swapoff` on the node, and restart the kubelet with
551+
`--fail-swap-on=true`. The setting of the feature flag will be ignored in this
552+
case.
548553

549554
###### What happens if we reenable the feature if it was previously rolled back?
550555

0 commit comments

Comments
 (0)