
Commit c7cf563

Update perform-ece-hosts-maintenance.md
1 parent b7f454d commit c7cf563

1 file changed: +65 -1 lines changed


deploy-manage/maintenance/ece/perform-ece-hosts-maintenance.md

Lines changed: 65 additions & 1 deletion
@@ -20,12 +20,13 @@ These steps show how you can safely perform maintenance on hosts in your ECE ins
2020
You can perform these maintenance actions on the hosts in your ECE installation using one of these methods:
2121

2222
* [By disabling the Docker daemon (nondestructive)](#ece-perform-host-maintenance-docker-disable)
23+
* [By disabling the Podman-related services (nondestructive)](#ece-perform-host-maintenance-podman-disable) - **For Podman only**
2324
* [By deleting the host (destructive)](#ece-perform-host-maintenance-delete-runner)
2425
* [By shutting down the host (less destructive)](#ece-perform-host-maintenance-delete-runner)
2526

2627
Which method you choose depends on how invasive your host maintenance needs to be. If your host maintenance could affect ECE, use the destructive method that first deletes the host from your installation. These methods include a step that moves any hosted {{es}} clusters and {{kib}} instances off the affected hosts and are generally considered safe, provided that your ECE installation still has sufficient resources available to operate after the host has been removed.
2728

28-
## By disabling the Docker daemon [ece-perform-host-maintenance-docker-disable]
29+
## By disabling the Docker daemon (nondestructive) [ece-perform-host-maintenance-docker-disable]
2930

3031
This method lets you perform maintenance actions on hosts without first removing the associated host from your {{ece}} installation. It works by disabling the Docker daemon. The host remains a part of your ECE installation throughout these steps but will be offline and the resources it provides will not be available.
3132

@@ -71,6 +72,69 @@ To perform host maintenance:
7172

7273
After the host shows a green status in the Cloud UI, it is fully functional again and can be used as before.
7374

75+
## By disabling the Podman-related services (nondestructive) [ece-perform-host-maintenance-podman-disable]
76+
77+
:::{note}
78+
This section only applies to Podman.
79+
:::
80+
81+
This method lets you perform maintenance actions on hosts without first removing the associated host from your {{ece}} installation. It works by disabling the Podman-related services. The host remains a part of your ECE installation throughout these steps, but it is offline and the resources it provides are not available.
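
Before starting, it may help to record whether the three Podman units are currently enabled, so that the same state can be confirmed once maintenance is finished. The following is a minimal sketch, not part of the ECE tooling. It assumes the unit names used in the steps of this section; `report_units` and the `CHECK_CMD` override are hypothetical names introduced here so the loop can be exercised without systemd.

```sh
#!/bin/sh
# Sketch only: report_units and CHECK_CMD are illustrative names, not ECE tooling.

# The three units that this procedure disables and later re-enables.
UNITS="podman.service podman.socket podman-restart.service"

# Print "<unit> <state>" for each unit. The query command defaults to
# "systemctl is-enabled" and can be overridden through CHECK_CMD for testing.
report_units() {
    for unit in $UNITS; do
        # is-enabled exits non-zero for disabled units, so ignore the exit
        # status and keep only the printed state.
        state=$(${CHECK_CMD:-systemctl is-enabled} "$unit" 2>/dev/null) || true
        printf '%s %s\n' "$unit" "${state:-unknown}"
    done
}
```

Calling `report_units` once before the units are disabled and once after they are re-enabled gives two listings that can be compared by eye.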
82+
83+
To perform host maintenance:
84+
85+
1. Recommended: If the host holds the allocator role and you have enough spare capacity:
86+
1. [Enable maintenance mode](enable-maintenance-mode.md) on the allocator.
87+
2. [Move all nodes off the allocator](move-nodes-instances-from-allocators.md) and to other allocators in your installation. Moving all nodes lets you retain the same level of redundancy for highly available {{es}} clusters and ensures that other clusters without high availability remain available.
88+
::::{important}
89+
Skipping Step 1 will affect the availability of clusters with nodes on the allocator.
90+
::::
91+
92+
2. Disable the Podman service, socket, and restart service:
93+
94+
```sh
sudo systemctl disable podman.service
sudo systemctl disable podman.socket
sudo systemctl disable podman-restart.service
```
99+
100+
3. Reboot the host:
101+
102+
```sh
sudo reboot
```
105+
106+
After rebooting, confirm there are no running containers:
107+
- `sudo podman ps` (the output should list no containers, only the header row)
108+
109+
If any `frc-*` or `fac-*` container is still running, stop it:
110+
- `sudo podman stop $(sudo podman ps -a --filter "name=fac" --filter "name=frc" --format "{{.ID}}")`
111+
112+
4. Perform your maintenance on the host, such as patching the operating system.
113+
5. Re-enable the Podman-related services:
114+
115+
```sh
sudo systemctl enable podman.service
sudo systemctl enable podman.socket
sudo systemctl enable podman-restart.service
```
120+
121+
6. Reboot the host again:
122+
123+
```sh
sudo reboot
```
126+
127+
Confirm the containers have started:
128+
- `sudo podman ps -a` (the `-a` flag also lists stopped containers, so none are overlooked)
130+
131+
132+
7. If you enabled maintenance mode in Step 1: Take the allocator out of maintenance mode.
133+
8. Optional for allocators: ECE will start using the allocator again as you create new or change existing clusters, but it will not automatically redistribute nodes to an allocator after it becomes available. If you want to move nodes back to the same allocator after host maintenance, you need to manually [move the nodes](move-nodes-instances-from-allocators.md) and specify the allocator as a target.
134+
9. Verify that all ECE services and deployments are back up by checking that the host shows a green status in the Cloud UI.
135+
136+
After the host shows a green status in the Cloud UI, it is fully functional again and can be used as before.
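
The container checks after each reboot in the Podman steps above can be scripted. The following is a minimal sketch, not part of the ECE tooling. The function names `filter_ece_containers` and `maintenance_state` are hypothetical, and the sketch assumes that the relevant containers are the ones whose names start with `frc-` or `fac-`, as in the stop command above.

```sh
#!/bin/sh
# Sketch only: these helper names are illustrative, not part of ECE or Podman.

# Read container names (one per line) on stdin and print those that start
# with frc- or fac-. grep exits non-zero when nothing matches, so "|| true"
# keeps the function's exit status at zero in that case.
filter_ece_containers() {
    grep -E '^(frc|fac)-' || true
}

# Print "running" if stdin lists at least one frc-/fac- container,
# otherwise print "clean".
maintenance_state() {
    if [ -z "$(filter_ece_containers)" ]; then
        echo "clean"
    else
        echo "running"
    fi
}

# Example use on a host (requires Podman; shown as a comment so that loading
# the functions has no side effects):
#   sudo podman ps --format '{{.Names}}' | maintenance_state
```

After the first reboot the expected result is `clean`. After the second reboot, once the services are re-enabled, `running` suggests that the containers have come back, although the green status in the Cloud UI remains the authoritative check.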
137+
74138
## By deleting the host (destructive) [ece-perform-host-maintenance-delete-runner]
75139

76140
This method lets you perform potentially destructive maintenance actions on hosts. It works by deleting the associated host, which removes the host from your {{ece}} installation. To add the host to your ECE installation again after host maintenance is complete, you must reinstall ECE.
