docs/docs/04-For Operators/04-maintenance.md: 5 additions & 1 deletion
@@ -23,13 +23,17 @@ If you use the Gardener integration of metal-stack do not skip any patch release
## Releases
Before upgrading your metal-stack installation, review the release notes carefully: they contain important information on required pre-upgrade actions and notable changes. These notes are currently shared via a dedicated Slack channel and are also available in the release on GitHub. Once you are prepared, you can deploy a new metal-stack version by updating the `metal_stack_release_version` variable in your Ansible configuration and triggering the corresponding deployment jobs in your CI.
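As a rough illustration of the version bump, the following sketch rewrites the `metal_stack_release_version` entry in the text of an Ansible vars file. It assumes the variable is defined as a plain `key: value` line in YAML; the helper name `set_release_version` and the version strings are hypothetical, and the location of the vars file depends on your deployment.

```python
import re


def set_release_version(vars_text: str, new_version: str) -> str:
    """Return vars_text with metal_stack_release_version set to new_version.

    Assumes the variable appears exactly once as a top-level `key: value` line.
    """
    pattern = re.compile(
        r"^(metal_stack_release_version:[ \t]*).*$", flags=re.MULTILINE
    )
    # A callable replacement avoids any backreference parsing of new_version.
    updated, count = pattern.subn(lambda m: m.group(1) + new_version, vars_text)
    if count != 1:
        raise ValueError(
            f"expected exactly one metal_stack_release_version entry, found {count}"
        )
    return updated


if __name__ == "__main__":
    # Placeholder content and version numbers, for illustration only.
    example = "metal_stack_release_version: v0.1.0\n"
    print(set_release_version(example, "v0.2.0"), end="")
```

Failing loudly when the variable is missing or defined more than once avoids silently deploying an unintended version; the deployment jobs in your CI still have to be triggered afterwards.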
+
metal-stack offers prebuilt system images for firewalls and worker machines, which can be downloaded from `images.metal-stack.io`. In offline or air-gapped setups, these images must either be synced into the partition-local [image-cache](https://github.com/metal-stack/metal-image-cache-sync) after they were added to the metal-api or be manually downloaded in advance and uploaded to your local S3-compatible storage. Ensure that the image paths and metadata are correctly maintained so the system can retrieve them during provisioning.
If you are using metal-stack in combination with Gardener and you do not run pre-production stages, we advise running some basic functional tests after upgrading metal-stack to ensure the installation is fully functional (e.g. reconciling a few shoot clusters that are used for evaluation purposes, or creating and deleting a shoot cluster).
+
metal-images for firewalls and worker nodes follow independent release cycles, typically driven by the need for security patches or system updates. When new images are made available, the machines must be re-provisioned to apply the updates. When using metal-stack in a Kubernetes context, this results in a rolling update of the cluster worker groups.
In a Gardener setup, image updates can be triggered by referencing the new image in the shoot spec.
+
Because all outbound traffic passes through the firewall node, updating the firewall image results in a short downtime of around 30 seconds. This interruption only occurs if the firewall image has actually changed. The process works as follows: a new firewall node is provisioned and configured in parallel with the existing one. Once setup is complete, traffic is switched over to the new node, and the old firewall is then decommissioned. This limits the disruption to the brief switchover.
+
The worker nodes are rolled out one after the other and, if possible, the containers are redistributed to the machines that are still available. However, for unclustered stateful workloads like databases, temporary disruptions may occur during node restarts.
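The one-node-at-a-time rollout can be pictured with a toy model. The sketch below is not how Kubernetes or Gardener schedules workloads; it only assumes that each node is drained onto the least-loaded remaining node and then rejoins empty with the new image. All names (`drain`, `rolling_update`, `worker-a`, and so on) are hypothetical.

```python
def drain(placement: dict[str, list[str]], node: str) -> dict[str, list[str]]:
    """Move every workload off `node` onto the least-loaded remaining node."""
    remaining = {n: list(w) for n, w in placement.items() if n != node}
    if not remaining:
        raise RuntimeError("no node left to take over the workloads")
    for workload in placement[node]:
        # Ties are broken by node name to keep the result deterministic.
        target = min(remaining, key=lambda n: (len(remaining[n]), n))
        remaining[target].append(workload)
    return remaining


def rolling_update(
    placement: dict[str, list[str]],
) -> tuple[list[str], dict[str, list[str]]]:
    """Replace nodes one after the other; return the order and final placement."""
    current = {n: list(w) for n, w in placement.items()}
    order = []
    for node in sorted(placement):
        current = drain(current, node)
        current[node] = []  # the re-provisioned node rejoins without workloads
        order.append(node)
    return order, current


if __name__ == "__main__":
    order, final = rolling_update({"worker-a": ["web"], "worker-b": ["db"]})
    print(order, final)
```

In this model every workload is moved at least once, which is why a single-instance stateful workload such as a database sees a temporary interruption while its node is replaced.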
## Rollback
-
metal-stack employs forward-only database migrations (e.g., for RethinkDB), and each release undergoes thorough integration testing. However, rollback procedures are not included in test coverage. To maintain data integrity and system reliability, rolling back a full release is not supported and strongly discouraged. In the event of issues after an upgrade, it is possible to downgrade specific components rather than reverting the entire system.
+
metal-stack employs forward-only database migrations (e.g., for RethinkDB), and each release undergoes thorough integration testing. However, rollback procedures are not included in test coverage. To maintain data integrity and system reliability, rolling back a full release is not supported and strongly discouraged. In the event of issues after an upgrade, it is possible to downgrade specific components rather than reverting the entire system.