|
| 1 | +--- |
| 2 | +title: Updating your Nutanix cluster firmware |
| 3 | +slug: nutanix-cluster-firmware-update |
| 4 | +excerpt: Find out how to update your Nutanix cluster firmware |
| 5 | +section: Upgrade |
| 6 | +order: 01 |
| 7 | +updated: 2023-03-08 |
| 8 | +--- |
| 9 | + |
| 10 | +**Last updated 8th March 2023** |
| 11 | + |
| 12 | +## Objective |
| 13 | + |
| 14 | +This guide explains how to update the firmware of your Nutanix cluster by putting each node into maintenance mode and then rebooting it in rescue mode, one node at a time. |
| 15 | + |
| 16 | +The OVHcloud teams then apply the firmware updates and restart the node once they are done. |
| 17 | + |
| 18 | +> [!warning] |
| 19 | +> Before beginning any action, log in to your [OVHcloud Control Panel](https://www.ovh.com/auth/?action=gotomanager&from=https://www.ovh.co.uk/&ovhSubsidiary=GB) and create a support ticket to request a firmware update, providing the OVHcloud support teams with the technical details of your cluster. |
| 20 | +
|
| 21 | +**Find out how to update your Nutanix cluster firmware.** |
| 22 | + |
| 23 | +## Requirements |
| 24 | + |
| 25 | +- A Nutanix cluster in your OVHcloud account |
| 26 | +- Access to the [OVHcloud Control Panel](https://www.ovh.com/auth/?action=gotomanager&from=https://www.ovh.co.uk/&ovhSubsidiary=GB) |
| 27 | +- Familiarity with the OVHcloud API (see the guide [First steps to use the OVHcloud API](https://docs.ovh.com/gb/en/api/first-steps-with-ovh-api/)) |
| 28 | + |
| 29 | +## Instructions |
| 30 | + |
| 31 | +Before any action, log in to your Prism Element interface and perform the following tasks: |
| 32 | + |
| 33 | +- Check that the cluster's "**Data Resiliency Status**" is `OK` |
| 34 | + |
| 35 | +This can be verified on the main dashboard of your Prism Element (PE) interface: |
| 36 | + |
| 37 | +{.thumbnail} |
| 38 | + |
| 39 | +- Run an NCC check |
| 40 | + |
| 41 | +In the Prism Element interface, click `Health`{.action} from the main menu. |
| 42 | + |
| 43 | +{.thumbnail} |
| 44 | + |
| 45 | +Then click `Actions`{.action} to the right and click `Run NCC Checks`{.action}. |
| 46 | + |
| 47 | +{.thumbnail} |
| 48 | + |
| 49 | +Select `All checks`{.action} and click `Run`{.action}. |
| 50 | + |
| 51 | +{.thumbnail} |
| 52 | + |
| 53 | +Once the checks are complete, a log file called `/home/nutanix/data/logs/ncc-output-latest.log` is generated. |
| 54 | + |
| 55 | +Please analyze it carefully. If it reports errors or failures relating to the cluster or service state, do not continue and contact OVHcloud support. |
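
To make the log review less error-prone, the failing entries can be filtered out first. The following is a minimal sketch, assuming that the NCC log reports failed or errored checks on lines starting with `FAIL:` or `ERR :` (the exact layout may differ between NCC versions). The function name `ncc_failures` is hypothetical and is not part of the Nutanix tooling.

```bash
# Hypothetical helper: print the numbered lines of an NCC log that look like
# failures or errors. Assumption: such lines start with "FAIL:" or "ERR :".
ncc_failures() {
  # $1: path to the NCC log file
  grep -nE '^[[:space:]]*(FAIL|ERR)[[:space:]]*:' "$1"
}
```

On a CVM, it could be called as `ncc_failures /home/nutanix/data/logs/ncc-output-latest.log`. No output and a non-zero exit status mean that no matching line was found; this does not replace reading the full log.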
| 56 | + |
| 57 | +> [!primary] |
| 58 | +> You can also run the NCC checks from a CVM by typing the following command in a terminal: |
| 59 | +
|
| 60 | +```bash |
| 61 | +ncc health_checks run_all |
| 62 | +``` |
| 63 | + |
| 64 | +### Enabling maintenance mode |
| 65 | + |
| 66 | +Nodes are updated one at a time, so the Nutanix cluster continues to work properly. |
| 67 | + |
| 68 | +To log in to a CVM, you can launch IPMI from your OVHcloud Control Panel or use a terminal. |
| 69 | + |
| 70 | +> [!primary] |
| 71 | +> Before putting a host in maintenance mode, ensure that the remaining hosts have enough resources (CPU, memory, storage) to host the VMs migrated from it. |
| 72 | +
|
| 73 | +#### Connect to CVM |
| 74 | + |
| 75 | +At the login prompt, log in with root credentials to access the host terminal.<br> |
| 76 | +Then open an SSH connection to any CVM with Nutanix credentials to access the CVM terminal. |
| 77 | + |
| 78 | +{.thumbnail} |
| 79 | + |
| 80 | +#### Check nodes state |
| 81 | + |
| 82 | +Once logged in, list the hosts with the command below and check that: |
| 83 | + |
| 84 | +- `Node state` is set to `AcropolisNormal` for all nodes. |
| 85 | +- The `Schedulable` column is set to `True` for all nodes. |
| 86 | + |
| 87 | +Run the following command: |
| 88 | + |
| 89 | +```bash |
| 90 | +acli host.list |
| 91 | +``` |
| 92 | + |
| 93 | +{.thumbnail} |
| 94 | + |
| 95 | +If all checks are OK, verify that the host can be put in maintenance mode. To do so, use the following command: |
| 96 | + |
| 97 | +```bash |
| 98 | +acli host.enter_maintenance_mode_check <Hypervisor_IP> |
| 99 | +``` |
| 100 | + |
| 101 | +{.thumbnail} |
| 102 | + |
| 103 | +#### Put a node in maintenance mode |
| 104 | + |
| 105 | +> [!primary] |
| 106 | +> VMs with specific policies (such as affinity or CPU passthrough) must be stopped manually before entering maintenance mode, as they will not be migrated. |
| 107 | +
|
| 108 | +If all hosts are eligible for maintenance mode, put the first host in maintenance mode with the following command: |
| 109 | + |
| 110 | +```bash |
| 111 | +acli host.enter_maintenance_mode 192.168.0.1 wait=true |
| 112 | +``` |
| 113 | + |
| 114 | +{.thumbnail} |
| 115 | + |
| 116 | +> [!warning] |
| 117 | +> When a host enters maintenance mode, all the VMs it hosts are migrated to other hosts without any interruption. |
| 118 | +
|
| 119 | +#### Shut down the CVM |
| 120 | + |
| 121 | +Once the host is in maintenance mode, the CVM can be shut down with the following command: |
| 122 | + |
| 123 | +```bash |
| 124 | +cvm_shutdown -P now |
| 125 | +``` |
| 126 | + |
| 127 | +{.thumbnail} |
| 128 | + |
| 129 | +With root credentials, open a terminal on the node that hosts the CVM and confirm that the CVM is stopped: |
| 130 | + |
| 131 | +```bash |
| 132 | +virsh list --all |
| 133 | +``` |
| 134 | + |
| 135 | +{.thumbnail} |
| 136 | + |
| 137 | +On the main dashboard, the "**Data Resiliency Status**" becomes `Critical` because the cluster is now running with 2 nodes. |
| 138 | + |
| 139 | +{.thumbnail} |
| 140 | + |
| 141 | +The CVM is now shut down. |
| 142 | + |
| 143 | +### Reboot to rescue mode |
| 144 | + |
| 145 | +Log in to the [OVHcloud Control Panel](https://www.ovh.com/auth/?action=gotomanager&from=https://www.ovh.co.uk/&ovhSubsidiary=GB), go to the `Hosted Private Cloud`{.action} section, choose the `Nutanix`{.action} solution and select your cluster. |
| 146 | + |
| 147 | +{.thumbnail} |
| 148 | + |
| 149 | +Identify the node to boot in rescue mode by using the following OVHcloud API call: |
| 150 | + |
| 151 | +> [!api] |
| 152 | +> |
| 153 | +> @api {GET} /nutanix/{serviceName} |
| 154 | +> |
| 155 | +
|
| 156 | +- `serviceName`: enter the cluster name |
| 157 | + |
| 158 | +You can then identify your node name: |
| 159 | + |
| 160 | +{.thumbnail} |
| 161 | + |
| 162 | +Once you have retrieved the name of the node to reboot in rescue mode, select this node in your OVHcloud Control Panel. |
| 163 | + |
| 164 | +In the `Boot` section, click the `...`{.action} button then click `Edit`{.action}. |
| 165 | + |
| 166 | +{.thumbnail} |
| 167 | + |
| 168 | +Change the netboot by choosing `rescue mode`{.action}, choose the `rescue-customer`{.action} version and click `Next`{.action}. |
| 169 | + |
| 170 | +{.thumbnail} |
| 171 | + |
| 172 | +Confirm your choice. |
| 173 | + |
| 174 | +{.thumbnail} |
| 175 | + |
| 176 | +A green message then confirms that the netboot has been updated. |
| 177 | + |
| 178 | +Click the `...`{.action} button again and click `Restart`{.action}. |
| 179 | + |
| 180 | +{.thumbnail} |
| 181 | + |
| 182 | +The server will reboot. Optionally, you can open an IPMI session to follow the reboot of your node. |
| 183 | + |
| 184 | +When the node has booted on `rescue-customer`, update your support ticket with this information to notify the OVHcloud support teams that they can proceed with the firmware update. |
| 185 | + |
| 186 | +Our support teams will finish the necessary updates, meaning they will: |
| 187 | + |
| 188 | +- restart the node on the local disk, which will start the Nutanix system and the CVM automatically. |
| 189 | +- update the ticket to let you know that you can take the node out of maintenance mode. |
| 190 | + |
| 191 | +At this point, the node is up and running. Follow the next step to exit maintenance mode. |
| 192 | + |
| 193 | +### Exit from maintenance mode |
| 194 | + |
| 195 | +After updating the node, our teams reboot it from the local disk. The Nutanix software loads AOS and the CVM starts automatically. |
| 196 | + |
| 197 | +Once the system is up and running, log in to the CVM and run the following command: |
| 198 | + |
| 199 | +```bash |
| 200 | +acli host.list |
| 201 | +``` |
| 202 | + |
| 203 | +As you can see in the output image below, the first node is still in maintenance mode. |
| 204 | + |
| 205 | +{.thumbnail} |
| 206 | + |
| 207 | +To take the node out of maintenance mode, run the following command: |
| 208 | + |
| 209 | +```bash |
| 210 | +acli host.exit_maintenance_mode 192.168.0.1 |
| 211 | +``` |
| 212 | + |
| 213 | +{.thumbnail} |
| 214 | + |
| 215 | +The host exits from `maintenance` state and goes back to `Normal` state. |
| 216 | + |
| 217 | +The VMs that were migrated away from this node are automatically moved back to it. |
| 218 | + |
| 219 | +On the main dashboard, the "**Data Resiliency Status**" reverts to `OK` and the cluster returns to its nominal state. |
| 220 | + |
| 221 | +{.thumbnail} |
| 222 | + |
| 223 | +Proceed with the remaining nodes one at a time with the same steps. |
| 224 | + |
| 225 | +Please do not open a new ticket. Instead, add a comment to the same ticket for each node, specifying the server name (e.g. `ns123456`). |
| 226 | + |
| 227 | +## Go further <a name="gofurther"></a> |
| 228 | + |
| 229 | +Join our community of users on <https://community.ovh.com/en/>. |