Commit 0b32f89

Merge pull request #26820 from ahg-g/nss
Document Pod affinity namespaceSelector
2 parents a65f9ac + 07c5dbc commit 0b32f89

3 files changed: +73 -0 lines changed


content/en/docs/concepts/policy/resource-quotas.md

Lines changed: 58 additions & 0 deletions
@@ -189,6 +189,7 @@ Resources specified on the quota outside of the allowed set results in a validat
| `BestEffort` | Match pods that have best effort quality of service. |
| `NotBestEffort` | Match pods that do not have best effort quality of service. |
| `PriorityClass` | Match pods that reference the specified [priority class](/docs/concepts/configuration/pod-priority-preemption). |
| `CrossNamespacePodAffinity` | Match pods that have cross-namespace pod [(anti)affinity terms](/docs/concepts/scheduling-eviction/assign-pod-node). |

The `BestEffort` scope restricts a quota to tracking the following resource:

@@ -429,6 +430,63 @@ memory 0 20Gi
pods 0 10
```

### Cross-namespace Pod Affinity Quota

{{< feature-state for_k8s_version="v1.21" state="alpha" >}}

Operators can use the `CrossNamespacePodAffinity` quota scope to limit which namespaces are allowed to
have pods with affinity terms that cross namespaces. Specifically, it controls which pods are allowed
to set the `namespaces` or `namespaceSelector` fields in pod affinity terms.

Preventing users from using cross-namespace affinity terms might be desired, since a pod
with anti-affinity constraints can block pods from all other namespaces
from getting scheduled in a failure domain.

Using this scope, operators can prevent certain namespaces (`foo-ns` in the example below)
from having pods that use cross-namespace pod affinity by creating a resource quota object in
that namespace with the `CrossNamespacePodAffinity` scope and a hard limit of 0:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: disable-cross-namespace-affinity
  namespace: foo-ns
spec:
  hard:
    pods: "0"
  scopeSelector:
    matchExpressions:
    - scopeName: CrossNamespacePodAffinity
      operator: Exists
```

If operators want to disallow using `namespaces` and `namespaceSelector` by default, and
only allow it for specific namespaces, they could configure `CrossNamespacePodAffinity`
as a limited resource by setting the kube-apiserver flag `--admission-control-config-file`
to the path of the following configuration file:

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
- name: "ResourceQuota"
  configuration:
    apiVersion: apiserver.config.k8s.io/v1
    kind: ResourceQuotaConfiguration
    limitedResources:
    - resource: pods
      matchScopes:
      - scopeName: CrossNamespacePodAffinity
        operator: Exists
```

With the above configuration, pods can use `namespaces` and `namespaceSelector` in pod affinity only
if the namespace where they are created has a resource quota object with the
`CrossNamespacePodAffinity` scope and a hard limit greater than or equal to the number of pods using those fields.
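
As an illustration of the rule above, a namespace could be opted in with a quota like the following. This is a minimal sketch, not part of the change itself: the object name, the namespace `bar-ns`, and the limit of 10 are hypothetical, and the scope name is the one listed in the scope table (`CrossNamespacePodAffinity`).

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: allow-cross-namespace-affinity   # hypothetical name
  namespace: bar-ns                      # hypothetical namespace being opted in
spec:
  hard:
    pods: "10"   # at most 10 pods here may set namespaces/namespaceSelector
  scopeSelector:
    matchExpressions:
    - scopeName: CrossNamespacePodAffinity
      operator: Exists
```

With such a quota in place, an eleventh pod in `bar-ns` that sets either field would be expected to fail admission, while pods that do not use cross-namespace affinity terms are not counted by this quota.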

This feature is alpha and disabled by default. You can enable it by setting the
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
`PodAffinityNamespaceSelector` in both kube-apiserver and kube-scheduler.
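
How the gate is set depends on how the control plane is deployed. As a rough sketch, assuming both components run as static Pods with a kubeadm-style manifest layout (an assumption, not something this change specifies), the `--feature-gates` flag would be added to each component's command:

```yaml
# Hypothetical fragment of a kube-apiserver static Pod manifest; only the
# --feature-gates line matters here, the surrounding layout is assumed.
spec:
  containers:
  - name: kube-apiserver
    command:
    - kube-apiserver
    - --feature-gates=PodAffinityNamespaceSelector=true
---
# The same flag on kube-scheduler, again in an assumed static Pod layout.
spec:
  containers:
  - name: kube-scheduler
    command:
    - kube-scheduler
    - --feature-gates=PodAffinityNamespaceSelector=true
```

If other feature gates are already set, the new entry would be appended to the existing comma-separated list rather than added as a second flag.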

## Requests compared to Limits {#requests-vs-limits}

When allocating compute resources, each container may specify a request and a limit value for either CPU or memory.

content/en/docs/concepts/scheduling-eviction/assign-pod-node.md

Lines changed: 12 additions & 0 deletions
@@ -271,6 +271,18 @@ If omitted or empty, it defaults to the namespace of the pod where the affinity/
All `matchExpressions` associated with `requiredDuringSchedulingIgnoredDuringExecution` affinity and anti-affinity
must be satisfied for the pod to be scheduled onto a node.

#### Namespace selector
{{< feature-state for_k8s_version="v1.21" state="alpha" >}}

Users can also select matching namespaces using `namespaceSelector`, which is a label query over the set of namespaces.
The affinity term is applied to the union of the namespaces selected by `namespaceSelector` and the ones listed in the `namespaces` field.
Note that an empty `namespaceSelector` (`{}`) matches all namespaces, while a null or empty `namespaces` list combined with a
null `namespaceSelector` means "this pod's namespace".
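
To make the union semantics concrete, here is a minimal sketch of a Pod whose anti-affinity term sets both fields. All names, labels, and the image are hypothetical placeholders; only the `namespaceSelector` and `namespaces` fields come from the text above.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: with-namespace-selector        # hypothetical
  namespace: team-a                    # hypothetical
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - cache
        # Namespaces carrying this (hypothetical) label are selected...
        namespaceSelector:
          matchLabels:
            team: platform
        # ...and are unioned with the namespaces listed explicitly here.
        namespaces:
        - shared-services
        topologyKey: kubernetes.io/hostname
  containers:
  - name: app
    image: registry.k8s.io/pause:3.9   # placeholder image
```

In this sketch the term considers pods labeled `app=cache` in `shared-services` and in every namespace labeled `team=platform`. Because `namespaces` is non-empty and `namespaceSelector` is non-null, the Pod's own namespace `team-a` is only included if it carries that label.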

This feature is alpha and disabled by default. You can enable it by setting the
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
`PodAffinityNamespaceSelector` in both kube-apiserver and kube-scheduler.

#### More Practical Use-cases

Interpod Affinity and AntiAffinity can be even more useful when they are used with higher

content/en/docs/reference/command-line-tools-reference/feature-gates.md

Lines changed: 3 additions & 0 deletions
@@ -140,6 +140,7 @@ different Kubernetes components.
| `NonPreemptingPriority` | `true` | Beta | 1.19 | |
| `PodDisruptionBudget` | `false` | Alpha | 1.3 | 1.4 |
| `PodDisruptionBudget` | `true` | Beta | 1.5 | |
| `PodAffinityNamespaceSelector` | `false` | Alpha | 1.21 | |
| `PodOverhead` | `false` | Alpha | 1.16 | 1.17 |
| `PodOverhead` | `true` | Beta | 1.18 | |
| `ProcMountType` | `false` | Alpha | 1.12 | |
@@ -671,6 +672,8 @@ Each feature gate is designed for enabling/disabling a specific feature:
- `PersistentLocalVolumes`: Enable the usage of `local` volume type in Pods.
  Pod affinity has to be specified if requesting a `local` volume.
- `PodDisruptionBudget`: Enable the [PodDisruptionBudget](/docs/tasks/run-application/configure-pdb/) feature.
- `PodAffinityNamespaceSelector`: Enable the [Pod Affinity Namespace Selector](/docs/concepts/scheduling-eviction/assign-pod-node/#namespace-selector)
  and [CrossNamespacePodAffinity](/docs/concepts/policy/resource-quotas/#cross-namespace-pod-affinity-quota) quota scope features.
- `PodOverhead`: Enable the [PodOverhead](/docs/concepts/scheduling-eviction/pod-overhead/)
  feature to account for pod overheads.
- `PodPriority`: Enable the descheduling and preemption of Pods based on their
