Commit c779b21

Merge pull request #24783 from thockin/kep-1659-doc-topology-labels
Better docs for standard topology labels
2 parents db658b2 + 300c2e8 commit c779b21

8 files changed: +32 -49 lines changed

content/en/docs/concepts/configuration/pod-priority-preemption.md

Lines changed: 1 addition & 1 deletion
@@ -271,7 +271,7 @@ preempted. Here's an example:
271271
* Pod P is being considered for Node N.
272272
* Pod Q is running on another Node in the same Zone as Node N.
273273
* Pod P has Zone-wide anti-affinity with Pod Q (`topologyKey:
274-
failure-domain.beta.kubernetes.io/zone`).
274+
topology.kubernetes.io/zone`).
275275
* There are no other cases of anti-affinity between Pod P and other Pods in
276276
the Zone.
277277
* In order to schedule Pod P on Node N, Pod Q can be preempted, but scheduler

content/en/docs/concepts/scheduling-eviction/assign-pod-node.md

Lines changed: 2 additions & 2 deletions
@@ -200,8 +200,8 @@ The affinity on this pod defines one pod affinity rule and one pod anti-affinity
200200
while the `podAntiAffinity` is `preferredDuringSchedulingIgnoredDuringExecution`. The
201201
pod affinity rule says that the pod can be scheduled onto a node only if that node is in the same zone
202202
as at least one already-running pod that has a label with key "security" and value "S1". (More precisely, the pod is eligible to run
203-
on node N if node N has a label with key `failure-domain.beta.kubernetes.io/zone` and some value V
204-
such that there is at least one node in the cluster with key `failure-domain.beta.kubernetes.io/zone` and
203+
on node N if node N has a label with key `topology.kubernetes.io/zone` and some value V
204+
such that there is at least one node in the cluster with key `topology.kubernetes.io/zone` and
205205
value V that is running a pod that has a label with key "security" and value "S1".) The pod anti-affinity
206206
rule says that the pod cannot be scheduled onto a node if that node is in the same zone as a pod with
207207
label having key "security" and value "S2". See the

content/en/docs/concepts/storage/storage-classes.md

Lines changed: 1 addition & 1 deletion
@@ -209,7 +209,7 @@ parameters:
209209
volumeBindingMode: WaitForFirstConsumer
210210
allowedTopologies:
211211
- matchLabelExpressions:
212-
- key: failure-domain.beta.kubernetes.io/zone
212+
- key: topology.kubernetes.io/zone
213213
values:
214214
- us-central1-a
215215
- us-central1-b

content/en/docs/concepts/storage/volumes.md

Lines changed: 1 addition & 1 deletion
@@ -449,7 +449,7 @@ spec:
449449
required:
450450
nodeSelectorTerms:
451451
- matchExpressions:
452-
- key: failure-domain.beta.kubernetes.io/zone
452+
- key: topology.kubernetes.io/zone
453453
operator: In
454454
values:
455455
- us-central1-a

content/en/docs/reference/access-authn-authz/admission-controllers.md

Lines changed: 2 additions & 2 deletions
@@ -534,8 +534,8 @@ and kubelets will not be allowed to modify labels with that prefix.
534534
* `kubernetes.io/os`
535535
* `beta.kubernetes.io/instance-type`
536536
* `node.kubernetes.io/instance-type`
537-
* `failure-domain.beta.kubernetes.io/region`
538-
* `failure-domain.beta.kubernetes.io/zone`
537+
* `failure-domain.beta.kubernetes.io/region` (deprecated)
538+
* `failure-domain.beta.kubernetes.io/zone` (deprecated)
539539
* `topology.kubernetes.io/region`
540540
* `topology.kubernetes.io/zone`
541541
* `kubelet.kubernetes.io/`-prefixed labels

content/en/docs/reference/command-line-tools-reference/kubelet.md

Lines changed: 1 addition & 1 deletion
@@ -967,7 +967,7 @@ WindowsEndpointSliceProxying=true|false (ALPHA - default=false)<br/>
967967
<td colspan="2">--node-labels mapStringString</td>
968968
</tr>
969969
<tr>
970-
<td></td><td style="line-height: 130%; word-wrap: break-word;">&lt;Warning: Alpha feature&gt; Labels to add when registering the node in the cluster. Labels must be `key=value` pairs separated by `,`. Labels in the `kubernetes.io` namespace must begin with an allowed prefix (`kubelet.kubernetes.io`, `node.kubernetes.io`) or be in the specifically allowed set (`beta.kubernetes.io/arch`, `beta.kubernetes.io/instance-type`, `beta.kubernetes.io/os`, `failure-domain.beta.kubernetes.io/region`, `failure-domain.beta.kubernetes.io/zone`, `failure-domain.kubernetes.io/region`, `failure-domain.kubernetes.io/zone`, `kubernetes.io/arch`, `kubernetes.io/hostname`, `kubernetes.io/instance-type`, `kubernetes.io/os`)</td>
970+
<td></td><td style="line-height: 130%; word-wrap: break-word;">&lt;Warning: Alpha feature&gt;Labels to add when registering the node in the cluster. Labels must be `key=value pairs` separated by `,`. Labels in the `kubernetes.io` namespace must begin with an allowed prefix (`kubelet.kubernetes.io`, `node.kubernetes.io`) or be in the specifically allowed set (`beta.kubernetes.io/arch`, `beta.kubernetes.io/instance-type`, `beta.kubernetes.io/os`, `failure-domain.beta.kubernetes.io/region`, `failure-domain.beta.kubernetes.io/zone`, `kubernetes.io/arch`, `kubernetes.io/hostname`, `kubernetes.io/os`, `node.kubernetes.io/instance-type`, `topology.kubernetes.io/region`, `topology.kubernetes.io/zone`)</td>
971971
</tr>
972972

973973
<tr>

content/en/docs/reference/kubernetes-api/labels-annotations-taints.md

Lines changed: 22 additions & 39 deletions
@@ -38,14 +38,16 @@ This label has been deprecated. Please use `kubernetes.io/arch` instead.
3838

3939
This label has been deprecated. Please use `kubernetes.io/os` instead.
4040

41-
## kubernetes.io/hostname
41+
## kubernetes.io/hostname {#kubernetesiohostname}
4242

4343
Example: `kubernetes.io/hostname=ip-172-20-114-199.ec2.internal`
4444

4545
Used on: Node
4646

4747
The Kubelet populates this label with the hostname. Note that the hostname can be changed from the "actual" hostname by passing the `--hostname-override` flag to the `kubelet`.
4848

49+
This label is also used as part of the topology hierarchy. See [topology.kubernetes.io/zone](#topologykubernetesiozone) for more information.
50+
4951
## beta.kubernetes.io/instance-type (deprecated)
5052

5153
{{< note >}} Starting in v1.17, this label is deprecated in favor of [node.kubernetes.io/instance-type](#nodekubernetesioinstance-type). {{< /note >}}
@@ -63,71 +65,52 @@ to rely on the Kubernetes scheduler to perform resource-based scheduling. You sh
6365

6466
## failure-domain.beta.kubernetes.io/region (deprecated) {#failure-domainbetakubernetesioregion}
6567

66-
See [failure-domain.beta.kubernetes.io/zone](#failure-domainbetakubernetesiozone).
68+
See [topology.kubernetes.io/region](#topologykubernetesioregion).
6769

6870
{{< note >}} Starting in v1.17, this label is deprecated in favor of [topology.kubernetes.io/region](#topologykubernetesioregion). {{< /note >}}
6971

7072
## failure-domain.beta.kubernetes.io/zone (deprecated) {#failure-domainbetakubernetesiozone}
7173

72-
Example:
73-
74-
`failure-domain.beta.kubernetes.io/region=us-east-1`
75-
76-
`failure-domain.beta.kubernetes.io/zone=us-east-1c`
77-
78-
Used on: Node, PersistentVolume
79-
80-
On the Node: The `kubelet` populates this with the zone information as defined by the `cloudprovider`.
81-
This will be set only if you are using a `cloudprovider`. However, you should consider setting this
82-
on the nodes if it makes sense in your topology.
83-
84-
On the PersistentVolume: The `PersistentVolumeLabel` admission controller will automatically add zone labels to PersistentVolumes, on GCE and AWS.
85-
86-
Kubernetes will automatically spread the Pods in a replication controller or service across nodes in a single-zone cluster (to reduce the impact of failures). With multiple-zone clusters, this spreading behaviour is extended across zones (to reduce the impact of zone failures). This is achieved via _SelectorSpreadPriority_.
87-
88-
_SelectorSpreadPriority_ is a best effort placement. If the zones in your cluster are heterogeneous (for example: different numbers of nodes, different types of nodes, or different pod resource requirements), this placement might prevent equal spreading of your Pods across zones. If desired, you can use homogenous zones (same number and types of nodes) to reduce the probability of unequal spreading.
89-
90-
The scheduler (through the _VolumeZonePredicate_ predicate) also will ensure that Pods, that claim a given volume, are only placed into the same zone as that volume. Volumes cannot be attached across zones.
91-
92-
The actual values of zone and region don't matter. Nor is the node hierarchy rigidly defined.
93-
The expectation is that failures of nodes in different zones should be uncorrelated unless the entire region has failed. For example, zones should typically avoid sharing a single network switch. The exact mapping depends on your particular infrastructure - a three rack installation will choose a very different setup to a multi-datacenter configuration.
94-
95-
If `PersistentVolumeLabel` does not support automatic labeling of your PersistentVolumes, you should consider
96-
adding the labels manually (or adding support for `PersistentVolumeLabel`). With `PersistentVolumeLabel`, the scheduler prevents Pods from mounting volumes in a different zone. If your infrastructure doesn't have this constraint, you don't need to add the zone labels to the volumes at all.
74+
See [topology.kubernetes.io/zone](#topologykubernetesiozone).
9775

9876
{{< note >}} Starting in v1.17, this label is deprecated in favor of [topology.kubernetes.io/zone](#topologykubernetesiozone). {{< /note >}}
9977

10078
## topology.kubernetes.io/region {#topologykubernetesioregion}
10179

80+
Example:
81+
82+
`topology.kubernetes.io/region=us-east-1`
83+
10284
See [topology.kubernetes.io/zone](#topologykubernetesiozone).
10385

10486
## topology.kubernetes.io/zone {#topologykubernetesiozone}
10587

10688
Example:
10789

108-
`topology.kubernetes.io/region=us-east-1`
109-
11090
`topology.kubernetes.io/zone=us-east-1c`
11191

11292
Used on: Node, PersistentVolume
11393

114-
On the Node: The `kubelet` populates this with the zone information as defined by the `cloudprovider`.
115-
This will be set only if you are using a `cloudprovider`. However, you should consider setting this
116-
on the nodes if it makes sense in your topology.
94+
On Node: The `kubelet` or the external `cloud-controller-manager` populates this with the information as provided by the `cloudprovider`. This will be set only if you are using a `cloudprovider`. However, you should consider setting this on nodes if it makes sense in your topology.
11795

118-
On the PersistentVolume: The `PersistentVolumeLabel` admission controller will automatically add zone labels to PersistentVolumes, on GCE and AWS.
96+
On PersistentVolume: topology-aware volume provisioners will automatically set node affinity constraints on `PersistentVolumes`.
11997

120-
Kubernetes will automatically spread the Pods in a replication controller or service across nodes in a single-zone cluster (to reduce the impact of failures). With multiple-zone clusters, this spreading behaviour is extended across zones (to reduce the impact of zone failures). This is achieved via _SelectorSpreadPriority_.
98+
A zone represents a logical failure domain. It is common for Kubernetes clusters to span multiple zones for increased availability. While the exact definition of a zone is left to infrastructure implementations, common properties of a zone include very low network latency within a zone, no-cost network traffic within a zone, and failure independence from other zones. For example, nodes within a zone might share a network switch, but nodes in different zones should not.
99+
100+
A region represents a larger domain, made up of one or more zones. It is uncommon for Kubernetes clusters to span multiple regions. While the exact definition of a zone or region is left to infrastructure implementations, common properties of a region include higher network latency between them than within them, non-zero cost for network traffic between them, and failure independence from other zones or regions. For example, nodes within a region might share power infrastructure (e.g. a UPS or generator), but nodes in different regions typically would not.
101+
102+
Kubernetes makes a few assumptions about the structure of zones and regions:
103+
1) regions and zones are hierarchical: zones are strict subsets of regions and no zone can be in 2 regions
104+
2) zone names are unique across regions; for example region "africa-east-1" might be comprised of zones "africa-east-1a" and "africa-east-1b"
105+
106+
It should be safe to assume that topology labels do not change. Even though labels are strictly mutable, consumers of them can assume that a given node is not going to be moved between zones without being destroyed and recreated.
107+
108+
Kubernetes can use this information in various ways. For example, the scheduler automatically tries to spread the Pods in a ReplicaSet across nodes in a single-zone cluster (to reduce the impact of node failures, see [kubernetes.io/hostname](#kubernetesiohostname)). With multiple-zone clusters, this spreading behavior also applies to zones (to reduce the impact of zone failures). This is achieved via _SelectorSpreadPriority_.
121109

122110
_SelectorSpreadPriority_ is a best effort placement. If the zones in your cluster are heterogeneous (for example: different numbers of nodes, different types of nodes, or different pod resource requirements), this placement might prevent equal spreading of your Pods across zones. If desired, you can use homogenous zones (same number and types of nodes) to reduce the probability of unequal spreading.
123111

124112
The scheduler (through the _VolumeZonePredicate_ predicate) also will ensure that Pods, that claim a given volume, are only placed into the same zone as that volume. Volumes cannot be attached across zones.
125113

126-
The actual values of zone and region don't matter. Nor is the node hierarchy rigidly defined.
127-
The expectation is that failures of nodes in different zones should be uncorrelated unless the entire region has failed. For example, zones should typically avoid sharing a single network switch. The exact mapping depends on your particular infrastructure - a three rack installation will choose a very different setup to a multi-datacenter configuration.
128-
129114
If `PersistentVolumeLabel` does not support automatic labeling of your PersistentVolumes, you should consider
130115
adding the labels manually (or adding support for `PersistentVolumeLabel`). With `PersistentVolumeLabel`, the scheduler prevents Pods from mounting volumes in a different zone. If your infrastructure doesn't have this constraint, you don't need to add the zone labels to the volumes at all.
131116

132-
133-
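The new text for `topology.kubernetes.io/zone` states that zones are strict subsets of regions and that no zone can be in 2 regions. A minimal Python sketch of how a consumer might check that assumption against a set of node labels follows. The function name and the input shape (one plain label dict per node) are assumptions made for illustration; they are not part of this commit or of any Kubernetes API.

```python
from collections import defaultdict

ZONE_KEY = "topology.kubernetes.io/zone"
REGION_KEY = "topology.kubernetes.io/region"


def zones_in_multiple_regions(nodes):
    """Return {zone: sorted regions} for zones seen under more than one region.

    `nodes` is an iterable of label dicts, one per node. This input shape is
    hypothetical; real code would read node labels from the API server.
    Nodes missing either topology label are skipped.
    """
    regions_by_zone = defaultdict(set)
    for labels in nodes:
        zone = labels.get(ZONE_KEY)
        region = labels.get(REGION_KEY)
        if zone is None or region is None:
            continue
        regions_by_zone[zone].add(region)
    # A zone that maps to two or more regions breaks the documented hierarchy.
    return {
        zone: sorted(regions)
        for zone, regions in regions_by_zone.items()
        if len(regions) > 1
    }
```

An empty result means every labeled zone was seen under exactly one region, which is the structure the documentation says Kubernetes assumes.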

content/en/examples/pods/pod-with-pod-affinity.yaml

Lines changed: 2 additions & 2 deletions
@@ -12,7 +12,7 @@ spec:
1212
operator: In
1313
values:
1414
- S1
15-
topologyKey: failure-domain.beta.kubernetes.io/zone
15+
topologyKey: topology.kubernetes.io/zone
1616
podAntiAffinity:
1717
preferredDuringSchedulingIgnoredDuringExecution:
1818
- weight: 100
@@ -23,7 +23,7 @@ spec:
2323
operator: In
2424
values:
2525
- S2
26-
topologyKey: failure-domain.beta.kubernetes.io/zone
26+
topologyKey: topology.kubernetes.io/zone
2727
containers:
2828
- name: with-pod-affinity
2929
image: k8s.gcr.io/pause:2.0
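Most hunks in this commit apply the same substitution: a deprecated `failure-domain.beta.kubernetes.io/...` key is replaced by its `topology.kubernetes.io/...` equivalent. As a rough sketch of applying that substitution to a manifest that has already been parsed into Python dicts and lists, something like the following could work. The helper name `migrate_topology_keys` is hypothetical, and the mapping covers only the two key pairs named in this commit.

```python
# Deprecated label keys and their replacements, as named in this commit.
DEPRECATED_TO_CURRENT = {
    "failure-domain.beta.kubernetes.io/zone": "topology.kubernetes.io/zone",
    "failure-domain.beta.kubernetes.io/region": "topology.kubernetes.io/region",
}


def migrate_topology_keys(obj):
    """Return a copy of `obj` with deprecated topology keys replaced.

    Handles both places the keys appear in the manifests above: as mapping
    keys (labels) and as string values (`topologyKey:` and `key:` fields).
    If a mapping carries both the old and the new key, the entry that comes
    later in iteration order wins; callers may want to detect that case.
    """
    if isinstance(obj, dict):
        return {
            DEPRECATED_TO_CURRENT.get(key, key): migrate_topology_keys(value)
            for key, value in obj.items()
        }
    if isinstance(obj, list):
        return [migrate_topology_keys(item) for item in obj]
    if isinstance(obj, str):
        return DEPRECATED_TO_CURRENT.get(obj, obj)
    return obj
```

The function returns a new structure rather than editing in place, so the original manifest stays available for comparison.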
