Commit c67f555

Sync Product PR #115 (Add Roll Back instructions for K3s) (#421)
* Add roll back instructions

Signed-off-by: Lucas Saintarbor <lucas.saintarbor@suse.com>
Co-authored-by: Brad Davidson <brad@oatmail.org>
1 parent 6811ca2 commit c67f555

4 files changed: +502 -0 lines changed

docs/upgrades/roll-back.md

Lines changed: 167 additions & 0 deletions
@@ -0,0 +1,167 @@
---
title: Rolling Back K3s
---

# Rolling Back K3s

You can roll back the K3s Kubernetes version after an upgrade by combining a K3s binary downgrade with a datastore restore. Rollback can be performed on all cluster types: single-node SQLite, external datastore, and embedded etcd. When rolling back to a previous Kubernetes minor version, you must have a datastore snapshot that was taken on the minor version you are rolling back to.

:::warning
If you cannot restore the database, you cannot roll back to a previous minor version.
:::

## Important Considerations

- **Backups:** Before upgrading, make sure you have a valid database backup or etcd snapshot taken while the cluster was running the older version of K3s. Without a backup, a rollback is impossible.
- **Potential Data Loss:** The `k3s-killall.sh` script forcefully terminates K3s processes and may result in data loss if applications are not properly shut down.
- **Version Specifics:** Always verify K3s and component versions before and after the rollback.
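
Whether a datastore restore is needed depends on whether the rollback crosses a Kubernetes minor version. The following is a minimal sketch of that check, assuming version strings follow the `vMAJOR.MINOR.PATCH+k3sN` release tag shape. The function names `k3s_minor` and `needs_datastore_restore` are hypothetical helpers, not part of K3s.

```bash
#!/bin/sh
# Hypothetical helpers (not part of K3s): detect whether a rollback crosses
# a Kubernetes minor version, in which case a datastore snapshot is required.

# Print "MAJOR.MINOR" for a version string such as v1.30.2+k3s1.
k3s_minor() {
    version="${1#v}"          # strip the leading "v"
    major="${version%%.*}"    # text before the first dot
    rest="${version#*.}"      # text after the first dot
    minor="${rest%%.*}"       # text before the next dot
    printf '%s.%s\n' "$major" "$minor"
}

# Succeed (exit 0) when the current and target versions differ in minor version.
needs_datastore_restore() {
    [ "$(k3s_minor "$1")" != "$(k3s_minor "$2")" ]
}

# Example (version numbers are illustrative only):
# needs_datastore_restore v1.30.2+k3s1 v1.29.6+k3s1 && echo "snapshot from 1.29 required"
```

The current version can be read from `k3s --version` before running the check.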

## Rolling Back a K3s Cluster

<Tabs>
<TabItem value='SQLite' default>

To roll back a K3s cluster when using a SQLite database, replace the `.db` file with the copy of the `.db` file you made while backing up your database.
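
A minimal sketch of that file swap, assuming K3s is already stopped on the node, the default data directory is in use (so the SQLite files live under `/var/lib/rancher/k3s/server/db`), and the backup is a copy of that directory. The function name and the backup path are hypothetical.

```bash
#!/bin/sh
# Hypothetical helper (not part of K3s): swap the SQLite datastore directory
# for a backup copy. Assumes K3s is already stopped on this node.
restore_sqlite_db() {
    backup_dir="$1"   # copy of the db directory made before the upgrade
    db_dir="$2"       # live datastore directory

    [ -d "$backup_dir" ] || { echo "backup not found: $backup_dir" >&2; return 1; }
    [ ! -e "$db_dir.pre-rollback" ] || { echo "$db_dir.pre-rollback already exists" >&2; return 1; }

    # Keep the post-upgrade data aside instead of deleting it.
    if [ -e "$db_dir" ]; then
        mv "$db_dir" "$db_dir.pre-rollback" || return 1
    fi
    cp -a "$backup_dir" "$db_dir"
}

# Example (paths are assumptions; adjust to your backup location):
# restore_sqlite_db /root/k3s-db-backup /var/lib/rancher/k3s/server/db
```

Moving the current directory aside rather than deleting it keeps the post-upgrade data available if the restored copy turns out to be unusable.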

</TabItem>

<TabItem value='Embedded etcd' default>

To roll back a K3s cluster when using embedded etcd, follow these steps:

1. If the cluster is running and the Kubernetes API is available, gracefully stop workloads by draining all nodes:

    ```bash
    kubectl drain --ignore-daemonsets --delete-emptydir-data <NODE-ONE-NAME> <NODE-TWO-NAME> <NODE-THREE-NAME> ...
    ```

1. On each node, stop the K3s service and all running pod processes:

    ```bash
    k3s-killall.sh
    ```

1. On each node, roll back the K3s binary to the previous version, but *do not* start K3s.

    - Clusters with Internet Access:

        - Server nodes:

            ```bash
            curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=vX.Y.Z+k3s1 INSTALL_K3S_EXEC="server" INSTALL_K3S_SKIP_START="true" sh -
            ```

        - Agent nodes:

            ```bash
            curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=vX.Y.Z+k3s1 INSTALL_K3S_EXEC="agent" INSTALL_K3S_SKIP_START="true" sh -
            ```

    - Air-gapped Clusters:

        - Download the artifacts and run the [install script](../installation/airgap.md#2-install-k3s) locally. Set the environment variable `INSTALL_K3S_SKIP_START="true"` when running the install script to prevent K3s from starting.

1. On the first server node, or the node without a `server:` entry in its [K3s config file](../installation/configuration.md), initiate the cluster restore. Refer to the [Snapshot Restore Steps](../cli/etcd-snapshot.md#snapshot-restore-steps) for more information:

    ```bash
    k3s server --cluster-reset --cluster-reset-restore-path=<PATH-TO-SNAPSHOT>
    ```

    :::warning
    This will overwrite all data in the etcd datastore. Verify the snapshot's integrity before restoring. Be aware that large snapshots can take a long time to restore.
    :::

1. Start the K3s service on the first server node:

    ```bash
    systemctl start k3s
    ```

1. On the other server nodes, remove the K3s database directory:

    ```bash
    rm -rf /var/lib/rancher/k3s/server/db
    ```

1. Start the K3s service on the other server nodes:

    ```bash
    systemctl start k3s
    ```

1. Start the K3s service on all agent nodes. When installed with the install script, the agent unit is named `k3s-agent`:

    ```bash
    systemctl start k3s-agent
    ```

1. Verify the K3s service status with `systemctl status k3s` on server nodes and `systemctl status k3s-agent` on agent nodes.
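
The per-node binary rollback in the steps above differs only in the node role and the target version. A minimal sketch that prints the install command for review instead of running it follows. `rollback_install_cmd` is a hypothetical helper, and the version in the usage line is only an example.

```bash
#!/bin/sh
# Hypothetical helper (not part of K3s): print the install-script command
# that rolls a node back to a given version without starting K3s.
rollback_install_cmd() {
    role="$1"      # "server" or "agent"
    version="$2"   # target release, e.g. v1.29.6+k3s1

    case "$role" in
        server|agent) ;;
        *) echo "unknown role: $role" >&2; return 1 ;;
    esac

    printf 'curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=%s INSTALL_K3S_EXEC="%s" INSTALL_K3S_SKIP_START="true" sh -\n' \
        "$version" "$role"
}

# Example: print the command for an agent node, then run it by hand.
# rollback_install_cmd agent v1.29.6+k3s1
```

Printing the command rather than executing it leaves the operator in control of when each node is changed.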

</TabItem>

<TabItem value='External Database' default>

To roll back a K3s cluster when using an external database (e.g., PostgreSQL, MySQL), follow these steps:

1. If the cluster is running and the Kubernetes API is available, gracefully stop workloads by draining all nodes:

    ```bash
    kubectl drain --ignore-daemonsets --delete-emptydir-data <NODE-ONE-NAME> <NODE-TWO-NAME> <NODE-THREE-NAME> ...
    ```

    :::note

    This process may disrupt running applications.

    :::

1. On each node, stop the K3s service and all running pod processes:

    ```bash
    k3s-killall.sh
    ```

1. Restore a database snapshot taken before upgrading K3s and verify the integrity of the database. For example, if you're using PostgreSQL, run the following command:

    ```bash
    pg_restore -U <DB-USER> -d <DB-NAME> <BACKUP-FILE>
    ```

1. On each node, roll back the K3s binary to the previous version.

    - Clusters with Internet Access:

        - Server nodes:

            ```bash
            curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=vX.Y.Z+k3s1 INSTALL_K3S_EXEC="server" sh -
            ```

        - Agent nodes:

            ```bash
            curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=vX.Y.Z+k3s1 INSTALL_K3S_EXEC="agent" sh -
            ```

    - Air-gapped Clusters:

        - Download the artifacts and run the [install script](../installation/airgap.md#2-install-k3s) locally. Verify the K3s version after install with `k3s --version` and reapply any custom configurations that were used before the upgrade.
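
1. Start the K3s service on each node. When installed with the install script, the unit is named `k3s` on server nodes and `k3s-agent` on agent nodes:

    ```bash
    systemctl start k3s        # server nodes
    systemctl start k3s-agent  # agent nodes
    ```

1. Verify the K3s service status with `systemctl status k3s` on server nodes and `systemctl status k3s-agent` on agent nodes.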
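
Which systemd unit to start depends on the node's role. A minimal sketch, assuming the install script's default unit names (`k3s` for servers, `k3s-agent` for agents) and no custom `INSTALL_K3S_NAME`. `k3s_unit_for_role` is a hypothetical helper.

```bash
#!/bin/sh
# Hypothetical helper (not part of K3s): map a node role to the systemd unit
# name created by the install script with default settings.
k3s_unit_for_role() {
    case "$1" in
        server) echo k3s ;;
        agent)  echo k3s-agent ;;
        *) echo "unknown role: $1" >&2; return 1 ;;
    esac
}

# Example: start and check the right unit on an agent node.
# systemctl start "$(k3s_unit_for_role agent)"
# systemctl status "$(k3s_unit_for_role agent)"
```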

</TabItem>
</Tabs>

## Verification

After the rollback, verify the following:

- K3s version: `k3s --version`
- Kubernetes cluster health: `kubectl get nodes`
- Application functionality: confirm that workloads behave as expected.
- K3s logs: check for errors (for example, with `journalctl -u k3s` on systemd hosts).
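
The node health and version checks above can be combined into one pass over the node list. A minimal sketch, assuming the default `kubectl get nodes --no-headers` column order (NAME, STATUS, ROLES, AGE, VERSION). `nodes_needing_attention` is a hypothetical helper, and the version in the usage line is only an example.

```bash
#!/bin/sh
# Hypothetical helper: read `kubectl get nodes --no-headers` output on stdin
# and print nodes that are not Ready or not on the expected version.
nodes_needing_attention() {
    expected="$1"   # version rolled back to, e.g. v1.29.6+k3s1
    awk -v want="$expected" '
        $2 !~ /^Ready/ { print $1 ": status " $2; next }
        $5 != want     { print $1 ": version " $5 }
    '
}

# Example (requires cluster access):
# kubectl get nodes --no-headers | nodes_needing_attention v1.29.6+k3s1
```

Empty output means every node reports `Ready` and runs the expected version.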
