Commit c6eebed

blog post for MatchLabelKeys in PodAffinity (#46833)
* blog post for MatchLabelKeys in PodAffinity * address comments * fix wording * fix: rolling upgrade -> rolling update
1 parent e31a274 commit c6eebed

1 file changed

+156
-0
lines changed

@@ -0,0 +1,156 @@
---
layout: blog
title: 'Kubernetes 1.31: MatchLabelKeys in PodAffinity graduates to beta'
date: 2024-08-16
slug: matchlabelkeys-podaffinity
author: >
  Kensei Nakada (Tetrate)
---

Kubernetes 1.29 introduced new fields `MatchLabelKeys` and `MismatchLabelKeys` in PodAffinity and PodAntiAffinity.

In Kubernetes 1.31, this feature moves to beta and the corresponding feature gate (`MatchLabelKeysInPodAffinity`) is enabled by default.
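
On clusters still running Kubernetes 1.29 or 1.30, where the gate is not on by default, it has to be switched on explicitly. The fragment below is a minimal sketch, assuming a kubeadm-managed cluster that uses the `kubeadm.k8s.io/v1beta3` configuration API. Setting the gate on both kube-apiserver and kube-scheduler is an assumption made here, so check the KEP for the components that actually consume it.

```yaml
# Sketch only: assumes kubeadm with the v1beta3 config API.
# Which components need the gate is an assumption; verify against KEP-3633.
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
apiServer:
  extraArgs:
    feature-gates: "MatchLabelKeysInPodAffinity=true"
scheduler:
  extraArgs:
    feature-gates: "MatchLabelKeysInPodAffinity=true"
```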

## `MatchLabelKeys` - Enhanced scheduling for versatile rolling updates

During a rolling update of a workload (e.g., a Deployment), a cluster may have Pods from multiple versions at the same time.
However, the scheduler cannot distinguish between old and new versions based on the `LabelSelector` specified in PodAffinity or PodAntiAffinity. As a result, it will co-locate or disperse Pods regardless of their versions.

This can lead to sub-optimal scheduling outcomes, for example:
- New version Pods are co-located with old version Pods (PodAffinity), even though those old version Pods will eventually be removed once the rolling update completes.
- Old version Pods are distributed across all available topologies, preventing new version Pods from finding nodes due to PodAntiAffinity.

`MatchLabelKeys` is a set of Pod label keys that addresses this problem.
The scheduler looks up the values of these keys from the new Pod's labels and combines them with `LabelSelector`
so that PodAffinity matches Pods that have the same key-value pairs in their labels.

By using the [pod-template-hash](/docs/concepts/workloads/controllers/deployment/#pod-template-hash-label) label in `MatchLabelKeys`,
you can ensure that only Pods of the same version are evaluated for PodAffinity or PodAntiAffinity.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: application-server
...
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - database
        topologyKey: topology.kubernetes.io/zone
        matchLabelKeys:
        - pod-template-hash
```

The above `matchLabelKeys` is translated in Pods as follows:

```yaml
kind: Pod
metadata:
  name: application-server
  labels:
    pod-template-hash: xyz
...
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - database
          - key: pod-template-hash # Added from matchLabelKeys; Only Pods from the same replicaset will match this affinity.
            operator: In
            values:
            - xyz
        topologyKey: topology.kubernetes.io/zone
        matchLabelKeys:
        - pod-template-hash
```
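
The same approach applies to the PodAntiAffinity case, where old version Pods would otherwise block new version Pods. The fragment below is a minimal sketch, assuming a hypothetical Deployment whose Pods carry an `app: application-server` label and should be spread one per node. The label value and the choice of `kubernetes.io/hostname` as the topology key are illustrative assumptions.

```yaml
# Hypothetical fragment of a Deployment's Pod template (spec.template.spec).
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: app
          operator: In
          values:
          - application-server   # assumed label, for illustration only
      topologyKey: kubernetes.io/hostname
      matchLabelKeys:
      - pod-template-hash        # only Pods with the same pod-template-hash repel each other
```

With `pod-template-hash` in `matchLabelKeys`, the anti-affinity term only considers Pods from the same revision, so old version Pods no longer prevent new version Pods from finding a node.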

## `MismatchLabelKeys` - Service isolation

`MismatchLabelKeys` is a set of Pod label keys, like `MatchLabelKeys`.
The values of these keys are looked up from the new Pod's labels and merged with `LabelSelector` as `key notin (value)`
so that PodAffinity does _not_ match Pods that have the same key-value pairs in their labels.

Suppose all Pods for each tenant get a `tenant` label via a controller or a manifest management tool like Helm.

Although the value of the `tenant` label is unknown when composing each workload's manifest,
the cluster admin wants to achieve exclusive 1:1 tenant-to-domain placement for tenant isolation.

`MismatchLabelKeys` works for this use case.
By applying the following affinity globally using a mutating webhook,
the cluster admin can ensure that the Pods from the same tenant will land on the same domain exclusively,
meaning Pods from other tenants won't land on the same domain.

```yaml
affinity:
  podAffinity:      # ensures the Pods of this tenant land on the same node pool
    requiredDuringSchedulingIgnoredDuringExecution:
    - matchLabelKeys:
        - tenant
      topologyKey: node-pool
  podAntiAffinity:  # ensures only Pods from this tenant land on the same node pool
    requiredDuringSchedulingIgnoredDuringExecution:
    - mismatchLabelKeys:
        - tenant
      labelSelector:
        matchExpressions:
        - key: tenant
          operator: Exists
      topologyKey: node-pool
```

The above `matchLabelKeys` and `mismatchLabelKeys` are translated as follows:

```yaml
kind: Pod
metadata:
  name: application-server
  labels:
    tenant: service-a
spec:
  affinity:
    podAffinity:      # ensures the Pods of this tenant land on the same node pool
      requiredDuringSchedulingIgnoredDuringExecution:
      - matchLabelKeys:
          - tenant
        topologyKey: node-pool
        labelSelector:
          matchExpressions:
          - key: tenant
            operator: In
            values:
            - service-a
    podAntiAffinity:  # ensures only Pods from this tenant land on the same node pool
      requiredDuringSchedulingIgnoredDuringExecution:
      - mismatchLabelKeys:
          - tenant
        labelSelector:
          matchExpressions:
          - key: tenant
            operator: Exists
          - key: tenant
            operator: NotIn
            values:
            - service-a
        topologyKey: node-pool
```
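
To see why the tenant value does not need to be known in advance, consider a second, hypothetical Pod labeled `tenant: service-b` that receives the same webhook-applied affinity. Following the translation described above, its selectors would be expected to look like the sketch below. The Pod name and the `service-b` value are illustrative assumptions.

```yaml
kind: Pod
metadata:
  name: another-application-server   # hypothetical name
  labels:
    tenant: service-b                # hypothetical tenant value
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - matchLabelKeys:
          - tenant
        topologyKey: node-pool
        labelSelector:
          matchExpressions:
          - key: tenant
            operator: In
            values:
            - service-b              # merged from matchLabelKeys
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - mismatchLabelKeys:
          - tenant
        labelSelector:
          matchExpressions:
          - key: tenant
            operator: Exists
          - key: tenant
            operator: NotIn
            values:
            - service-b              # merged from mismatchLabelKeys
        topologyKey: node-pool
```

The affinity stanza written by the webhook is identical for both tenants. Only the merged `In` and `NotIn` values differ, because they are taken from each Pod's own `tenant` label.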

## Getting involved

These features are managed by Kubernetes [SIG Scheduling](https://github.com/kubernetes/community/tree/master/sig-scheduling).

Please join us and share your feedback. We look forward to hearing from you!

## How can I learn more?

- [The official PodAffinity documentation](/docs/concepts/scheduling-eviction/assign-pod-node/#inter-pod-affinity-and-anti-affinity)
- [KEP-3633: Introduce MatchLabelKeys and MismatchLabelKeys to PodAffinity and PodAntiAffinity](https://github.com/kubernetes/enhancements/blob/master/keps/sig-scheduling/3633-matchlabelkeys-to-podaffinity/README.md#story-2)
