docs/services/kubernetes/kubernetes-upgrades.md (3 additions, 3 deletions)
@@ -3,7 +3,7 @@

 To maintain a secure, stable, and supported platform, we regularly upgrade our Kubernetes clusters. We use **[RKE2](https://docs.rke2.io/)** as our Kubernetes distribution.

-## 🔄 Upgrade Flow
+## Upgrade Flow

 **Phased Rollout**

@@ -16,7 +16,7 @@ To maintain a secure, stable, and supported platform, we regularly upgrade our K
 - Timing may depend on compatibility with **other infrastructure components** (e.g., storage, CNI plugins, monitoring tools).
 - However, all clusters will be upgraded **before the current Kubernetes version reaches End of Life (EOL)**.

-## ⚠️ Upgrade Impact
+## Upgrade Impact

 The **impact of a Kubernetes upgrade can vary**, depending on the nature of the changes involved:

@@ -32,7 +32,7 @@ The **impact of a Kubernetes upgrade can vary**, depending on the nature of the

 ??? Note "Applications that follow cloud-native best practices (e.g., readiness probes, multiple replicas, graceful shutdown handling) are **less likely to be impacted** by upgrades."

-## ✅ What You Can Expect
+## What You Can Expect

 - Upgrades are performed using safe, tested procedures with minimal risk to production workloads.
 - TDS clusters serve as a **canary environment**, allowing us to identify issues early.
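
Review note on the "graceful shutdown handling" practice mentioned in the note above: the docs do not show what this looks like in an application, so here is a minimal sketch in Python. The names `GracefulShutdown` and `run_worker` are hypothetical and not part of the documented platform. The sketch assumes the application is stopped with SIGTERM, which is the signal Kubernetes sends by default when a pod is terminated, for example during a node drain.

```python
import signal
import threading


class GracefulShutdown:
    """Tracks whether the process has been asked to stop (hypothetical helper)."""

    def __init__(self) -> None:
        self._stop = threading.Event()

    def install(self) -> None:
        # Register the handler for SIGTERM; call this once from the main thread.
        signal.signal(signal.SIGTERM, self.handle)

    def handle(self, signum, frame) -> None:
        # Only set a flag here; the main loop decides when it is safe to exit.
        self._stop.set()

    @property
    def requested(self) -> bool:
        return self._stop.is_set()


def run_worker(shutdown: GracefulShutdown, jobs: list) -> list:
    """Process jobs in order, but stop taking new ones once shutdown is requested."""
    done = []
    for job in jobs:
        if shutdown.requested:
            break  # leave remaining jobs for another replica to pick up
        done.append(job)  # placeholder for real work
    return done
```

A real service would call `install()` at startup and finish in-flight requests within the pod's termination grace period. Combined with multiple replicas and readiness probes, this is what lets a drain proceed without dropping work.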
docs/services/kubernetes/node-updates.md (4 additions, 4 deletions)
@@ -3,7 +3,7 @@

 To ensure the **security** and **stability** of our infrastructure, CSCS will perform **monthly OS updates** on all nodes of our Kubernetes clusters.

-## 🔄 Maintenance Schedule
+## Maintenance Schedule

 - **Frequency**: Every **first week of the month**
 - **Reboot Window**: **Monday to Friday**, between **09:00 and 15:00**
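
Review note on the schedule above: users may want to check whether a given moment falls inside the reboot window, for example to avoid launching long jobs. The sketch below is illustrative only. It assumes "first week of the month" means calendar days 1 through 7, that the window is half-open (09:00 inclusive, 15:00 exclusive), and that timestamps are already in the cluster's local time zone. None of these details are stated in the docs, and `in_reboot_window` is a hypothetical helper.

```python
from datetime import datetime, time

WINDOW_START = time(9, 0)   # 09:00, from the documented schedule
WINDOW_END = time(15, 0)    # 15:00, treated as exclusive (assumption)


def in_reboot_window(moment: datetime) -> bool:
    """Return True if `moment` falls inside the monthly reboot window."""
    first_week = 1 <= moment.day <= 7      # assumption: days 1-7 of the month
    is_weekday = moment.weekday() < 5      # Monday is 0, Friday is 4
    in_hours = WINDOW_START <= moment.time() < WINDOW_END
    return first_week and is_weekday and in_hours
```

For example, `in_reboot_window(datetime(2024, 7, 1, 10, 0))` is true (a Monday in the first week), while the same time on Saturday 6 July or Monday 8 July is outside the window.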
@@ -13,15 +13,15 @@ These updates include important security patches and system updates for the oper

 ??? Note "Nodes will be rebooted only if required by the updates."

-## 🚨 Urgent Security Patches
+## Urgent Security Patches

 In the event of a **critical zero-day vulnerability**, we will apply patches and perform reboots (if required) **as soon as possible**, outside of the regular update schedule if needed.

 - Affected nodes will be updated **immediately** to protect the platform.
 - Users will be notified ahead of time **when possible**.
 - Standard safety and rolling reboot practices will still be followed.

-## 🛠️ Reboot Management with Kured
+## Reboot Management with Kured

 We use [**Kured** (KUbernetes REboot Daemon)](https://github.com/kubereboot/kured) to safely automate the reboot process. Kured ensures that:

@@ -30,7 +30,7 @@ We use [**Kured** (KUbernetes REboot Daemon)](https://github.com/kubereboot/kure
 - Reboots occur **only during the defined window**
 - Nodes are **cordoned**, **drained**, and **gracefully reintegrated** after reboot.

-## ✅ Application Requirements
+## Application Requirements

 To avoid service disruption during node maintenance, applications **must be designed for high availability**. Specifically: