@@ -220,13 +220,108 @@ Since swap provisioning is out of scope of this proposal, this enhancement poses
## Design Details

- \[In progress\]
-
- Need to add specifics here for:
-
- - Changes to `--fail-on-swap` flag
- - CRI config details
- - Where changes will need to be made so that dockershim and the CRI are consistent with swap control
+ ### TL;DR
+
+ In a nutshell, the following implementation steps are planned for Memory Swap
+ Support in 1.22 GKE alpha:
+
+ 1. Add a feature gate `SupportNodeMemorySwap` that guards the memory
+    swap support feature
+ 2. Keep the default value of the kubelet flag `--fail-on-swap` as `true` in order
+    to minimize the blast radius
+ 3. Introduce two new kubelet configuration parameters, `MemorySwapLimit` and `Swappiness`
+ 4. Introduce two new CRI parameters, `memory_swap_limit_in_bytes` and `memory_swappiness`
+ 5. Wire the settings end to end, from the kubelet config file to the CRI
+
+ ### Expected User Behaviour
+
+ For alpha, the feature gate `SupportNodeMemorySwap` is disabled by default, and
+ the `--fail-on-swap` flag value is the same as in 1.21. Therefore, from a Kubernetes
+ user’s perspective, there are no behavior changes out of the box.
+
+ Users who are ready to explore the Memory Swap feature in 1.22 Alpha
+ will need to complete the following steps:
+
+ 1. provision swap on the node and enable the `SupportNodeMemorySwap` feature gate, AND
+ 2. set the `--fail-on-swap` flag to `false`
+
+ Then, the user can start experimenting with and fine-tuning the kubelet configuration
+ parameters `MemorySwapLimit` and/or `Swappiness` and observe the changes.
+
+ ### New Kubelet Configuration
+
+ We will be introducing two new parameters to the `KubeletConfiguration` struct
+ defined in
+ [https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/apis/config/types.go](https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/apis/config/types.go).
+ These two parameters, if set, will apply to every container on the Node
+ where the kubelet is running.
+
+ | Name| Description| Default Value| Feature Gate|
+ | --- | --- | --- | --- |
+ | MemorySwapLimit| This parameter sets the total memory limit (memory + swap), which bounds how much memory a container is allowed to swap to disk.| -2, which disables swap| SupportNodeMemorySwap|
+ | MemorySwappiness| This parameter sets how aggressively the kernel will swap memory pages. By default, the host kernel can swap out a percentage of anonymous pages used by a container. Users can set a value between 0 and 100 to tune this percentage.| Unset, which will use the host value| SupportNodeMemorySwap|
+
+ #### MemorySwapLimit details
+
+ MemorySwapLimit is a kubelet configuration parameter that only takes effect on a
+ container that has a memory limit set, either explicitly from
+ [PodSpec](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#requests-and-limits)
+ or implicitly from [Resource
+ Quota](https://kubernetes.io/docs/concepts/policy/resource-quotas/).
+
+ For a container with a memory limit set, the MemorySwapLimit setting will have the
+ following effects, [similar to
+ docker](https://docs.docker.com/config/containers/resource_constraints/#--memory-swap-details):
+
+ * If MemorySwapLimit is set to a positive integer,
+   * If the memory limit of the container is greater than or equal to
+     MemorySwapLimit, then no swap is allowed and the container does not have
+     access to swap.
+   * If the memory limit of the container is less than MemorySwapLimit, then
+     MemorySwapLimit represents the total amount of memory and swap that can be
+     used. For example, for a container with memory limit set to 300m, and
+     `MemorySwapLimit` set to 1g, the container can use 300m of memory and
+     700m (1g - 300m) of swap.
+ * If MemorySwapLimit is set to 0, a container with a memory limit set can
+   use as much swap as its memory limit, if the host has swap
+   configured. For instance, if a container sets memory="300m" as its limit
+   and MemorySwapLimit is set to 0, the container can use 600m in
+   total of memory and swap.
+ * If MemorySwapLimit is explicitly set to -1, the container is allowed to use
+   unlimited swap, up to the amount available on the host system.
+ * If MemorySwapLimit is explicitly set to -2, the container does not have
+   access to swap. This value effectively prevents a container from using swap.
+
+ In summary, for users experimenting with this feature:
+
+ | MemorySwapLimit| Container memory limit (explicit or implicit)| Expected Behavior| Comment|
+ | --- | --- | --- | --- |
+ | Any| not set| N/A| Same as docker|
+ | -2| N| no swap allowed, this is the default value||
+ | -1| N| unlimited swap| Same as docker|
+ | 0| N| container can use up to N swap (i.e. 2N memory+swap)| Same as docker|
+ | X where X > 0| N where N < X| container can use up to X-N swap (i.e. X memory+swap)| Same as docker|
+ | X where X > 0| N where N >= X| no swap allowed (i.e. N memory only)| Same as docker|
+
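The MemorySwapLimit rules summarized above can be sketched as a small decision function. This is only an illustration of the described semantics, not code from the proposal or the kubelet: the name `swap_allowance`, the use of `None` for "no memory limit", and `math.inf` for "unlimited" are all assumptions made for the example. Values are plain integers in one shared unit (here megabytes, treating 1g as 1000m as the prose example does).

```python
import math
from typing import Optional, Union


def swap_allowance(memory_swap_limit: int,
                   memory_limit: Optional[int]) -> Optional[Union[int, float]]:
    """Return how much swap a container may use (hypothetical helper).

    memory_swap_limit: the node-wide MemorySwapLimit value.
    memory_limit: the container memory limit, or None if no limit is set.
    Returns None when the setting does not apply (the "N/A" case).
    """
    if memory_limit is None:
        # MemorySwapLimit only takes effect when a memory limit is set.
        return None
    if memory_swap_limit == -2:
        # Default: the container has no access to swap.
        return 0
    if memory_swap_limit == -1:
        # Unlimited, in practice bounded by the swap available on the host.
        return math.inf
    if memory_swap_limit == 0:
        # As much swap as the memory limit (2N memory+swap in total).
        return memory_limit
    if memory_swap_limit > 0:
        # X is the memory+swap total; no swap at all when N >= X.
        return max(memory_swap_limit - memory_limit, 0)
    raise ValueError(f"unsupported MemorySwapLimit: {memory_swap_limit}")


if __name__ == "__main__":
    # The 300m limit with a 1g MemorySwapLimit example from the text.
    print(swap_allowance(1000, 300))
```

Returning the swap portion rather than the memory+swap total keeps the two "no swap" cases (`-2`, and a positive value not larger than the memory limit) directly comparable.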
+ #### MemorySwappiness details
+
+ * A value of 0 turns off anonymous page swapping.
+ * A value of 100 sets all anonymous pages as swappable.
+ * By default, if you do not set MemorySwappiness, the value is inherited from
+   the host machine.
+
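As a rough illustration of the MemorySwappiness rules above, the following sketch resolves the value a container would end up with. The function name `effective_swappiness` and the use of `None` to mean "unset" are assumptions for the example, not part of the proposal.

```python
from typing import Optional


def effective_swappiness(configured: Optional[int], host_value: int) -> int:
    """Resolve the swappiness applied to a container (hypothetical helper).

    configured: the MemorySwappiness setting, or None when it is unset.
    host_value: the swappiness currently configured on the host.
    """
    if configured is None:
        # Unset: inherit the value from the host machine.
        return host_value
    if not 0 <= configured <= 100:
        # The proposal only allows values between 0 and 100.
        raise ValueError(f"MemorySwappiness out of range: {configured}")
    return configured


if __name__ == "__main__":
    # 60 is used here only as a sample host setting.
    print(effective_swappiness(None, 60))
    print(effective_swappiness(0, 60))
```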
+ ### CRI Changes
+
+ We will be introducing the following two parameters,
+ `memory_swap_limit_in_bytes` and `memory_swappiness`, to `message
+ LinuxContainerResources` defined in
+ [https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/cri-api/pkg/apis/runtime/v1/api.proto#L563-L580](https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/cri-api/pkg/apis/runtime/v1/api.proto#L563-L580)
+
+ | Name| Type| Description| Default Value| Feature Gate|
+ | --- | --- | --- | --- | --- |
+ | `memory_swap_limit_in_bytes` | int64| set/show limit of memory+swap usage| Default 0, which is unspecified.| SupportNodeMemorySwap|
+ | `memory_swappiness` | int64| set/show swappiness parameter| Default 0, which is unspecified.| SupportNodeMemorySwap|
### Test Plan