Commit 3e4fc78

Merge pull request #29060 from ehashman/swap-blog
1.22 feature blog for alpha swap support
2 parents 9c7c238 + 39e39c0 commit 3e4fc78

1 file changed

+142
-0
lines changed

@@ -0,0 +1,142 @@
---
layout: blog
title: 'New in Kubernetes v1.22: alpha support for using swap memory'
date: 2021-08-09
slug: run-nodes-with-swap-alpha
---

**Author:** Elana Hashman (Red Hat)

The 1.22 release introduced alpha support for configuring swap memory usage for
Kubernetes workloads on a per-node basis.

In prior releases, Kubernetes did not support the use of swap memory on Linux,
as it is difficult to provide guarantees and account for pod memory utilization
when swap is involved. As part of Kubernetes' earlier design, swap support was
considered out of scope, and a kubelet would by default fail to start if swap
was detected on a node.

However, there are a number of [use cases](https://github.com/kubernetes/enhancements/blob/9d127347773ad19894ca488ee04f1cd3af5774fc/keps/sig-node/2400-node-swap/README.md#user-stories)
that would benefit from Kubernetes nodes supporting swap, including improved
node stability, better support for applications with high memory overhead but
smaller working sets, the use of memory-constrained devices, and memory
flexibility.

Hence, over the past two releases, [SIG Node](https://github.com/kubernetes/community/tree/master/sig-node#readme) has
been working to gather appropriate use cases and feedback, and propose a design
for adding swap support to nodes in a controlled, predictable manner so that
Kubernetes users can perform testing and provide data to continue building
cluster capabilities on top of swap. The alpha graduation of swap memory
support for nodes is our first milestone towards this goal!

## How does it work?

There are a number of possible ways that one could envision swap use on a node.
To keep the scope manageable for this initial implementation, when swap is
already provisioned and available on a node, [we have proposed](https://github.com/kubernetes/enhancements/blob/9d127347773ad19894ca488ee04f1cd3af5774fc/keps/sig-node/2400-node-swap/README.md#proposal)
that the kubelet can be configured such that:

- It can start with swap on.
- It will direct the Container Runtime Interface to allocate zero swap memory
  to Kubernetes workloads by default.
- You can configure the kubelet to specify swap utilization for the entire
  node.

Swap configuration on a node is exposed to a cluster admin via the
[`memorySwap` field in the KubeletConfiguration](/docs/reference/config-api/kubelet-config.v1beta1/).
As a cluster administrator, you can specify the node's behaviour in the
presence of swap memory by setting `memorySwap.swapBehavior`.

This is possible through the addition of a `memory_swap_limit_in_bytes` field
to the container runtime interface (CRI). The kubelet's config will control how
much swap memory the kubelet instructs the container runtime to allocate to
each container via the CRI. The container runtime will then write the swap
settings to the container-level cgroup.

## How do I use it?

On a node where swap memory is already provisioned, you can enable Kubernetes
use of swap by enabling the `NodeSwap` feature gate on the kubelet, and
disabling either the `failSwapOn` [configuration setting](/docs/reference/config-api/kubelet-config.v1beta1/#kubelet-config-k8s-io-v1beta1-KubeletConfiguration)
or the `--fail-swap-on` command line flag.

You can also optionally configure `memorySwap.swapBehavior` in order to
specify how a node will use swap memory. For example,

```yaml
memorySwap:
  swapBehavior: LimitedSwap
```

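Putting the settings described above together, a complete kubelet configuration file for a test node might look like the following. This is a minimal sketch rather than a recommended production setup: it assumes swap is already provisioned on the host, and it shows only the fields discussed in this post.

```yaml
# Sketch of a kubelet configuration for a node that already has swap provisioned.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  NodeSwap: true              # alpha in v1.22, so it must be enabled explicitly
failSwapOn: false             # allow the kubelet to start when swap is detected
memorySwap:
  swapBehavior: LimitedSwap   # optional; this is also the behaviour applied when unset
```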
The available configuration options for `swapBehavior` are:

- `LimitedSwap` (default): Kubernetes workloads are limited in how much swap
  they can use. Workloads on the node not managed by Kubernetes can still swap.
- `UnlimitedSwap`: Kubernetes workloads can use as much swap memory as they
  request, up to the system limit.

If configuration for `memorySwap` is not specified and the feature gate is
enabled, by default the kubelet will apply the same behaviour as the
`LimitedSwap` setting.

The behaviour of the `LimitedSwap` setting depends on whether the node is
running with v1 or v2 of control groups (also known as "cgroups"):

- **cgroups v1:** Kubernetes workloads can use any combination of memory and
  swap, up to the pod's memory limit, if set.
- **cgroups v2:** Kubernetes workloads cannot use swap memory.

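Since `LimitedSwap` leaves Kubernetes workloads with no swap on cgroups v2, trying out swap on such a node would mean selecting the other mode listed above. A minimal sketch of that stanza, assuming the rest of the kubelet configuration already enables the `NodeSwap` feature gate and disables `failSwapOn`:

```yaml
memorySwap:
  swapBehavior: UnlimitedSwap   # workloads may use swap up to the system limit
```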
### Caveats

Having swap available on a system reduces predictability. Swap's performance is
worse than regular memory, sometimes by many orders of magnitude, which can
cause unexpected performance regressions. Furthermore, swap changes a system's
behaviour under memory pressure, and applications cannot directly control what
portions of their memory usage are swapped out. Since enabling swap permits
greater memory usage for workloads in Kubernetes that cannot be predictably
accounted for, it also increases the risk of noisy neighbours and unexpected
packing configurations, as the scheduler cannot account for swap memory usage.

The performance of a node with swap memory enabled depends on the underlying
physical storage. When swap memory is in use, performance will be significantly
worse in an I/O operations per second (IOPS) constrained environment, such as a
cloud VM with I/O throttling, when compared to faster storage mediums like
solid-state drives or NVMe.

Hence, we do not recommend the use of swap for certain performance-constrained
workloads or environments. Cluster administrators and developers should
benchmark their nodes and applications before using swap in production
scenarios, and [we need your help](#how-do-i-get-involved) with that!

## Looking ahead

The Kubernetes 1.22 release introduces alpha support for swap memory on nodes,
and we will continue to work towards beta graduation in the 1.23 release. This
will include:

* Adding support for controlling swap consumption at the Pod level via cgroups.
  * This will include the ability to set a system-reserved quantity of swap
    from what kubelet detects on the host.
* Determining a set of metrics for node QoS in order to evaluate the
  performance and stability of nodes with and without swap enabled.
* Collecting feedback from test use cases.
  * We will consider introducing new configuration modes for swap, such as a
    node-wide swap limit for workloads.

## How can I learn more?

You can review the current [documentation](https://kubernetes.io/docs/concepts/architecture/nodes/#swap-memory)
on the Kubernetes website.

For more information, and to assist with testing and provide feedback, please
see [KEP-2400](https://github.com/kubernetes/enhancements/issues/2400) and its
[design proposal](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/2400-node-swap/README.md).

## How do I get involved?

Your feedback is always welcome! SIG Node [meets regularly](https://github.com/kubernetes/community/tree/master/sig-node#meetings)
and [can be reached](https://github.com/kubernetes/community/tree/master/sig-node#contact)
via [Slack](https://slack.k8s.io/) (channel **#sig-node**), or the SIG's
[mailing list](https://groups.google.com/forum/#!forum/kubernetes-sig-node).
Feel free to reach out to me, Elana Hashman (**@ehashman** on Slack and GitHub)
if you'd like to help.
