
Commit d897e3d

committed
providing new standard for node distribution in draft version - to be updated
Signed-off-by: Piotr <[email protected]>
1 parent e28ed22 commit d897e3d

File tree: 1 file changed, +130 -0 lines changed

@@ -0,0 +1,130 @@
---
title: Kubernetes Node Distribution and Availability
type: Standard
status: Draft
replaces: scs-0214-v1-k8s-node-distribution.md
track: KaaS
---

## Introduction

A Kubernetes instance is provided as a cluster, which consists of a set of machines,
so-called nodes. A cluster is composed of a control plane and at least one worker node.
The control plane manages the worker nodes and therefore the pods in the cluster by making
decisions about scheduling, event detection and rights management. Inside the control plane,
multiple components exist, which can be duplicated and distributed over multiple nodes
inside the cluster. Typically, no user workloads are run on these nodes in order to
separate the control plane components from user workloads, which could otherwise pose a security risk.

The Kubernetes project maintains multiple release versions, with the three most recent minor
versions actively supported, along with a fourth version in development.
Each new minor version replaces the oldest version at the end of its support period,
which typically spans approximately 14 months, comprising a 12-month standard support period
followed by a 2-month end-of-life (EOL) phase for critical updates.

### Glossary

The following terms are used throughout this document:

| Term          | Meaning                                                                                                                                                                      |
|---------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Worker        | Virtual or bare-metal machine, which hosts workloads of customers.                                                                                                           |
| Control Plane | Virtual or bare-metal machine, which hosts the container orchestration layer that exposes the API and interfaces to define, deploy, and manage the lifecycle of containers. |
| Machine       | Virtual or bare-metal entity with computational capabilities.                                                                                                                |
| Failure Zone  | A logical entity representing a group of physical machines that share a risk of failure due to their proximity or dependency on common resources.                           |

## Motivation

In normal day-to-day operation, it is not unusual for operational failures to occur, whether
due to wear and tear of hardware, software misconfigurations, external problems or
user errors. Whatever the source of such an outage, it always means downtime for
operations and users and possibly even data loss.
Therefore, a Kubernetes cluster in a production environment should be distributed over
multiple "failure zones" in order to provide fault-tolerance and high availability.
This is especially important for the control plane of the cluster, since it contains the
state of the whole cluster. A failure of this component could mean an unrecoverable failure
of the whole cluster.

## Design Considerations

Most design considerations of this standard follow the previously written Decision Record
[Kubernetes Nodes Anti Affinity][scs-0213-v1] as well as the Kubernetes documents on
[High Availability][k8s-ha] and [Best practices for large clusters][k8s-large-clusters].

The SCS prefers distributed, highly available systems due to advantages such as fault tolerance and
data redundancy. It also acknowledges the costs and overhead for providers associated with this effort,
given that hardware and infrastructure may be dedicated to fail-over safety and duplication.

The [Best practices for large clusters][k8s-large-clusters] documentation describes the concept
of a failure zone. This term is context-dependent and describes a group of physical machines that are close
enough—physically or logically—that a specific issue could affect all machines in the zone.
To mitigate this, critical data and services should not be confined to one failure zone.
How a failure zone is defined depends on the risk model and infrastructure capabilities of the provider,
ranging from single machines or racks to entire datacenters or regions. Failure zones are therefore logical
entities that should not be strictly defined in this document.

## Decision

This standard formulates the requirements for the distribution of Kubernetes nodes in order to provide a fault-tolerant
and highly available Kubernetes cluster infrastructure. Since some providers only have small environments to work
with and therefore cannot comply with this standard, it is treated as a RECOMMENDED standard,
from which providers can OPT OUT.

### Control Plane Requirements

1. **Distribution Across Physical Machines**: Control plane nodes MUST be distributed over multiple physical
   machines to avoid single points of failure, aligning with Kubernetes best practices.
2. **Failure Zone Placement**: At least one control plane instance MUST be run in each defined failure zone.
   More instances in each failure zone are RECOMMENDED to enhance fault tolerance within each zone.

### Worker Node Requirements

- Worker nodes are RECOMMENDED to be distributed over multiple failure zones. This policy makes
  it OPTIONAL to provide a worker node in each failure zone, meaning that worker nodes
  can also be scaled vertically first before scaling horizontally.
- Worker node distribution MUST be indicated to the user through some kind of labeling
  in order to enable (anti-)affinity for workloads over failure zones (see the example below).

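For example, with the `topology.kubernetes.io/zone` label defined below in place, a workload owner could spread
replicas across failure zones using a `topologySpreadConstraints` stanza such as the following sketch. The
Deployment name, app label, and image are hypothetical and not part of this standard:

```yaml
# Illustrative sketch only; names and image are placeholders, not requirements.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone    # failure zone label required by this standard
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels:
              app: example-app
      containers:
        - name: app
          image: registry.example.com/example-app:latest   # placeholder image
```

Using `whenUnsatisfiable: ScheduleAnyway` keeps the workload schedulable even if a provider exposes fewer
failure zones than replicas.
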
To provide metadata about node distribution and enable efficient workload scheduling and testing of this standard,
providers MUST label their Kubernetes nodes with the following labels. These labels MUST remain current with the
deployment's state.

- `topology.kubernetes.io/zone`
  - Corresponds with the label described in the [K8s labels documentation][k8s-labels-docs].
    This label provides a logical failure zone identifier on the provider side,
    such as a server rack in the same electrical circuit. It is typically autopopulated by either
    the kubelet or external mechanisms like the cloud controller.

- `topology.kubernetes.io/region`
  - This label groups multiple failure zones into a region, such as a building with multiple racks.
    It is typically autopopulated by the kubelet or a cloud controller.

- `topology.scs.community/host-id`
  - This SCS-specific label MUST contain the unique hostID of the physical machine running the hypervisor,
    helping identify the physical machine's distribution.

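As an illustration of how these labels might appear on a node object, consider the following sketch; the node name
and label values are hypothetical:

```yaml
# Hypothetical excerpt of a node object carrying the required distribution labels.
apiVersion: v1
kind: Node
metadata:
  name: worker-0                                  # placeholder node name
  labels:
    topology.kubernetes.io/zone: zone-a           # logical failure zone, e.g. one rack
    topology.kubernetes.io/region: region-1       # group of failure zones, e.g. one building
    topology.scs.community/host-id: host-0f3c2a   # hostID of the physical hypervisor machine
```
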
## Conformance Tests

The `k8s-node-distribution-check.py` script assesses node distribution using a user-provided kubeconfig file.
It verifies compliance based on the `topology.scs.community/host-id`, `topology.kubernetes.io/zone`,
`topology.kubernetes.io/region`, and `node-role.kubernetes.io/control-plane` labels.
The script produces errors if node distribution does not meet the standard's requirements and generates
warnings if labels appear incomplete.

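For orientation only, the following is a heavily simplified sketch of this kind of distribution check, written
against the official `kubernetes` Python client. It is not the actual `k8s-node-distribution-check.py` and covers
only the control plane aspect:

```python
# Simplified sketch, NOT the real k8s-node-distribution-check.py:
# checks that control plane nodes are spread over multiple hosts and zones.
from kubernetes import client, config

config.load_kube_config()  # uses the user-provided kubeconfig

hosts, zones = set(), set()
for node in client.CoreV1Api().list_node().items:
    labels = node.metadata.labels or {}
    if "node-role.kubernetes.io/control-plane" in labels:
        hosts.add(labels.get("topology.scs.community/host-id"))
        zones.add(labels.get("topology.kubernetes.io/zone"))

if None in hosts or None in zones:
    print("WARNING: distribution labels are missing on some control plane nodes")
if len(hosts - {None}) < 2:
    print("ERROR: control plane is not distributed over multiple physical machines")
```
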
## Previous Standard Versions

This version extends [version 1](scs-0214-v1-k8s-node-distribution.md) by enhancing node labeling requirements.

[k8s-ha]: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/high-availability/
[k8s-large-clusters]: https://kubernetes.io/docs/setup/best-practices/cluster-large/
[scs-0213-v1]: https://github.com/SovereignCloudStack/standards/blob/main/Standards/scs-0213-v1-k8s-nodes-anti-affinity.md
[k8s-labels-docs]: https://kubernetes.io/docs/reference/labels-annotations-taints/#topologykubernetesiozone

0 commit comments