diff --git a/docs/upgrade/automatic.md b/docs/upgrade/automatic.md index 5acd0bc180f..0f308305b48 100644 --- a/docs/upgrade/automatic.md +++ b/docs/upgrade/automatic.md @@ -21,6 +21,7 @@ The following table shows the upgrade path of all supported versions. | Upgrade from version | Supported new version(s) | |----------------------|--------------------------| +| [v1.1.2](./v1-1-2-to-v1-2-0.md) | v1.2.0 | | [v1.1.0, v1.1.1](./v1-1-to-v1-1-2.md) | v1.1.2 | | [v1.0.3](./v1-0-3-to-v1-1-1.md) | v1.1.0, v1.1.1 (v1.1.1 is recommended) | | [v1.0.2](./previous-releases/v1-0-2-to-v1-0-3.md) | v1.0.3 | @@ -29,8 +30,6 @@ The following table shows the upgrade path of all supported versions. ## Start an upgrade -Note we are still working towards zero-downtime upgrade, due to some known issues please follow the steps below before you upgrade your Harvester cluster: - :::caution - Before you upgrade your Harvester cluster, we highly recommend: @@ -39,6 +38,7 @@ Note we are still working towards zero-downtime upgrade, due to some known issue - Do not operate the cluster during an upgrade. For example, creating new VMs, uploading new images, etc. - Make sure your hardware meets the **preferred** [hardware requirements](../install/requirements.md#hardware-requirements). This is due to there will be intermediate resources consumed by an upgrade. - Make sure each node has at least 30 GiB of free system partition space (`df -h /usr/local/`). If any node in the cluster has less than 30 GiB of free system partition space, the upgrade will be denied. Check [free system partition space requirement](#free-system-partition-space-requirement) for more information. +- Run the pre-check script on a Harvester control-plane node. Please pick a script according to your cluster's version: https://github.com/harvester/upgrade-helpers/tree/main/pre-check. 
::: diff --git a/docs/upgrade/previous-releases/_category_.json b/docs/upgrade/previous-releases/_category_.json index 1ba1c95a661..1ebd14bcc34 100644 --- a/docs/upgrade/previous-releases/_category_.json +++ b/docs/upgrade/previous-releases/_category_.json @@ -1,4 +1,4 @@ { - "position": 4, + "position": 5, "label": "Previous Releases" } \ No newline at end of file diff --git a/docs/upgrade/troubleshooting.md b/docs/upgrade/troubleshooting.md index 685623aadb8..096627061f4 100644 --- a/docs/upgrade/troubleshooting.md +++ b/docs/upgrade/troubleshooting.md @@ -1,5 +1,5 @@ --- -sidebar_position: 4 +sidebar_position: 6 sidebar_label: Troubleshooting title: "Troubleshooting" --- diff --git a/docs/upgrade/v1-0-3-to-v1-1-1.md b/docs/upgrade/v1-0-3-to-v1-1-1.md index 38f6625d080..04238bc40f6 100644 --- a/docs/upgrade/v1-0-3-to-v1-1-1.md +++ b/docs/upgrade/v1-0-3-to-v1-1-1.md @@ -1,5 +1,5 @@ --- -sidebar_position: 3 +sidebar_position: 4 sidebar_label: Upgrade from v1.0.3/v1.1.0 to v1.1.1 title: "Upgrade from v1.0.3/v1.1.0 to v1.1.1" --- diff --git a/docs/upgrade/v1-1-2-to-v1-2-0.md b/docs/upgrade/v1-1-2-to-v1-2-0.md new file mode 100644 index 00000000000..e180716a90b --- /dev/null +++ b/docs/upgrade/v1-1-2-to-v1-2-0.md @@ -0,0 +1,157 @@ +--- +sidebar_position: 2 +sidebar_label: Upgrade from v1.1.2 to v1.2.0 +title: "Upgrade from v1.1.2 to v1.2.0" +--- + + + + + + +## General information + +:::tip + +Before you start an upgrade, you can run the pre-check script to make sure the cluster is in a stable state. For details, see the [pre-check script](https://github.com/harvester/upgrade-helpers/tree/main/pre-check/v1.1.x). +::: + +Once an upgradable version is available, the Harvester GUI Dashboard page shows an upgrade button. For more details, refer to [start an upgrade](./automatic.md#start-an-upgrade). + +To upgrade in an air-gapped environment, refer to [prepare an air-gapped upgrade](./automatic.md#prepare-an-air-gapped-upgrade). 
+ + +## Known issues + +--- + +### 1. An upgrade can't start and reports `"validator.harvesterhci.io" denied the request: managed chart rancher-monitoring is not ready, please wait for it to be ready` + +If a cluster is configured with a **storage network**, an upgrade can't start with the following message. + +![](/img/v1.2/upgrade/known_issues/3839-error.png) + +- Related issue: + - [[Doc] upgrade stuck while upgrading system service with alertmanager and prometheus](https://github.com/harvester/harvester/issues/3839) +- Workaround: + - https://github.com/harvester/harvester/issues/3839#issuecomment-1534438192 + +--- + +### 2. An upgrade is stuck in `Creating Upgrade Repository` + +During an upgrade, **Creating Upgrade Repository** is stuck in the **Pending** state: + + ![](/img/v1.2/upgrade/known_issues/4246-pending.png) + +Please perform the following steps to check if the cluster runs into the issue: + +1. Check the upgrade repository pod: + + ![](/img/v1.2/upgrade/known_issues/4246-upgrade-repo-pod.png) + + If the `virt-launcher-upgrade-repo-hvst-` pod stays in `ContainerCreating`, your cluster might have run into this issue. In this case, proceed with step 2. + +2. Check the upgrade repository volume in the Longhorn GUI. + + 1. [Go to Longhorn GUI](../troubleshooting/harvester.md#access-embedded-rancher-and-longhorn-dashboards). + 2. Navigate to the **Volume** page. + 3. Check the upgrade repository VM volume. It should be attached to a pod called `virt-launcher-upgrade-repo-hvst-`. If one of the volume's replicas stays in `Stopped` (gray color), the cluster is running into the issue. + + ![](/img/v1.2/upgrade/known_issues/4246-pending-replica.png) + +- Related issue: + - [[BUG] upgrade stuck on create upgrade VM](https://github.com/harvester/harvester/issues/4246) +- Workaround: + - Delete the `Stopped` replica from Longhorn GUI. Or, + - [Start over the upgrade](./troubleshooting.md#start-over-an-upgrade). + +--- + +### 3. 
An upgrade is stuck when pre-draining a node + +Starting from v1.1.0, Harvester waits for all volumes to become healthy (when node count >= 3) before upgrading a node. If an upgrade is stuck in the "pre-draining" state, check whether the volumes are healthy. + +Visit ["Access Embedded Longhorn"](../troubleshooting/harvester.md#access-embedded-rancher-and-longhorn-dashboards) to see how to access the embedded Longhorn GUI. + +You can also check the pre-drain job logs. Refer to [Phase 4: Upgrade nodes](./troubleshooting.md#phase-4-upgrade-nodes) in the troubleshooting guide. + +--- + +### 4. An upgrade is stuck in upgrading the first node: Job was active longer than the specified deadline + +An upgrade fails, as shown in the screenshot below: + +![](/img/v1.2/upgrade/known_issues/2894-deadline.png) + + +- Related issue: + - [[BUG] Upgrade stuck in upgrading first node: Job was active longer than specified deadline](https://github.com/harvester/harvester/issues/2894) +- Workaround: + - https://github.com/harvester/harvester/issues/2894#issuecomment-1274069690 + + +--- + +### 5. An upgrade is stuck in the Pre-drained state + +An upgrade might get stuck in the "pre-drained" state: + +![](/img/v1.2/upgrade/known_issues/3730-stuck.png) + +This could be caused by a misconfigured PodDisruptionBudget (PDB). To check if that's the case, perform the following steps: + +1. Assume the stuck node is `harvester-node-1`. +1. Check the `instance-manager-e` or `instance-manager-r` pod names on the stuck node: + + ``` + $ kubectl get pods -n longhorn-system --field-selector spec.nodeName=harvester-node-1 | grep instance-manager + instance-manager-r-d4ed2788 1/1 Running 0 3d8h + ``` + + The output above shows that the `instance-manager-r-d4ed2788` pod is on the node. + +1. Check the Rancher logs and verify that the `instance-manager-e` or `instance-manager-r` pod can't be drained: + + ``` + $ kubectl logs deployment/rancher -n cattle-system + ... 
+ 2023-03-28T17:10:52.199575910Z 2023/03/28 17:10:52 [INFO] [planner] rkecluster fleet-local/local: waiting: draining etcd node(s) custom-4f8cb698b24a,custom-a0f714579def + 2023-03-28T17:10:55.034453029Z evicting pod longhorn-system/instance-manager-r-d4ed2788 + 2023-03-28T17:10:55.080933607Z error when evicting pods/"instance-manager-r-d4ed2788" -n "longhorn-system" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget. + ``` + +1. Run the following command to check if there is a PDB associated with the stuck node: + + ``` + $ kubectl get pdb -n longhorn-system -o yaml | yq '.items[] | select(.spec.selector.matchLabels."longhorn.io/node"=="harvester-node-1") | .metadata.name' + instance-manager-r-466e3c7f + ``` + +1. Check which node owns the instance manager associated with this PDB: + + ``` + $ kubectl get instancemanager instance-manager-r-466e3c7f -n longhorn-system -o yaml | yq -e '.spec.nodeID' + harvester-node-2 + ``` + + If the output doesn't match the stuck node (in this example output, `harvester-node-2` doesn't match the stuck node `harvester-node-1`), the cluster has run into this issue. + +1. Before applying the workaround, check if all volumes are healthy: + + ``` + kubectl get volumes -n longhorn-system -o yaml | yq '.items[] | select(.status.state == "attached")| .status.robustness' + ``` + + Every line of the output should be `healthy`. If this is not the case, you might want to uncordon nodes to make the volumes healthy again. + +1. 
Remove the misconfigured PDB: + + ``` + kubectl delete pdb instance-manager-r-466e3c7f -n longhorn-system + ``` + +- Related issue: + - [[BUG] 3 Node AirGapped Cluster Upgrade Stuck v1.1.0->v1.1.2-rc4](https://github.com/harvester/harvester/issues/3730 ) + +--- diff --git a/docs/upgrade/v1-1-to-v1-1-2.md b/docs/upgrade/v1-1-to-v1-1-2.md index 4c6df9a0528..ec81bc11533 100644 --- a/docs/upgrade/v1-1-to-v1-1-2.md +++ b/docs/upgrade/v1-1-to-v1-1-2.md @@ -1,5 +1,5 @@ --- -sidebar_position: 2 +sidebar_position: 3 sidebar_label: Upgrade from v1.1.0/v1.1.1 to v1.1.2 title: "Upgrade from v1.1.0/v1.1.1 to v1.1.2" --- diff --git a/i18n/zh/docusaurus-plugin-content-docs/current/upgrade/automatic.md b/i18n/zh/docusaurus-plugin-content-docs/current/upgrade/automatic.md index 287ed27a7f7..f8b0490e7d2 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/current/upgrade/automatic.md +++ b/i18n/zh/docusaurus-plugin-content-docs/current/upgrade/automatic.md @@ -17,6 +17,7 @@ Description: 升级 Harvester 有两种方法。你可以使用 ISO 镜像或通 | 原版本 | 支持的新版本 | |----------------------|--------------------------| +| [v1.1.2](./v1-1-2-to-v1-2-0.md) | v1.2.0 | | [v1.1.0, v1.1.1](./v1-1-to-v1-1-2.md) | v1.1.2 | | [v1.0.3](./v1-0-3-to-v1-1-1.md) | v1.1.0, v1.1.1(建议使用 v1.1.1) | | [v1.0.2](./previous-releases/v1-0-2-to-v1-0-3.md) | v1.0.3 | diff --git a/i18n/zh/docusaurus-plugin-content-docs/current/upgrade/previous-releases/_category_.json b/i18n/zh/docusaurus-plugin-content-docs/current/upgrade/previous-releases/_category_.json index 1e97d2b3abd..37322cf1c5a 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/current/upgrade/previous-releases/_category_.json +++ b/i18n/zh/docusaurus-plugin-content-docs/current/upgrade/previous-releases/_category_.json @@ -1,4 +1,4 @@ { - "position": 4, + "position": 5, "label": "以前的版本" } \ No newline at end of file diff --git a/i18n/zh/docusaurus-plugin-content-docs/current/upgrade/troubleshooting.md b/i18n/zh/docusaurus-plugin-content-docs/current/upgrade/troubleshooting.md 
index 6f907a3a592..a878bc75fb9 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/current/upgrade/troubleshooting.md +++ b/i18n/zh/docusaurus-plugin-content-docs/current/upgrade/troubleshooting.md @@ -1,5 +1,5 @@ --- -sidebar_position: 4 +sidebar_position: 6 sidebar_label: 故障排除 title: "故障排除" --- diff --git a/i18n/zh/docusaurus-plugin-content-docs/current/upgrade/v1-0-3-to-v1-1-1.md b/i18n/zh/docusaurus-plugin-content-docs/current/upgrade/v1-0-3-to-v1-1-1.md index 3c96168d547..94f5de4f882 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/current/upgrade/v1-0-3-to-v1-1-1.md +++ b/i18n/zh/docusaurus-plugin-content-docs/current/upgrade/v1-0-3-to-v1-1-1.md @@ -1,5 +1,5 @@ --- -sidebar_position: 3 +sidebar_position: 4 sidebar_label: 从 v1.0.3/v1.1.0 升级到 v1.1.1 title: "从 v1.0.3/v1.1.0 升级到 v1.1.1" --- diff --git a/i18n/zh/docusaurus-plugin-content-docs/current/upgrade/v1-1-2-to-v1-2-0.md b/i18n/zh/docusaurus-plugin-content-docs/current/upgrade/v1-1-2-to-v1-2-0.md new file mode 100644 index 00000000000..fc4aafd1836 --- /dev/null +++ b/i18n/zh/docusaurus-plugin-content-docs/current/upgrade/v1-1-2-to-v1-2-0.md @@ -0,0 +1,5 @@ +--- +sidebar_position: 2 +sidebar_label: Upgrade from v1.1.2 to v1.2.0 +title: "Upgrade from v1.1.2 to v1.2.0" +--- diff --git a/i18n/zh/docusaurus-plugin-content-docs/current/upgrade/v1-1-to-v1-1-2.md b/i18n/zh/docusaurus-plugin-content-docs/current/upgrade/v1-1-to-v1-1-2.md index c768b385456..9afe941674b 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/current/upgrade/v1-1-to-v1-1-2.md +++ b/i18n/zh/docusaurus-plugin-content-docs/current/upgrade/v1-1-to-v1-1-2.md @@ -1,5 +1,5 @@ --- -sidebar_position: 2 +sidebar_position: 3 sidebar_label: 从 v1.1.0/v1.1.1 升级到 v1.1.2 title: "从 v1.1.0/v1.1.1 升级到 v1.1.2" --- diff --git a/static/img/v1.2/upgrade/known_issues/3839-error.png b/static/img/v1.2/upgrade/known_issues/3839-error.png new file mode 100644 index 00000000000..d241d6b1a2f Binary files /dev/null and 
b/static/img/v1.2/upgrade/known_issues/3839-error.png differ diff --git a/static/img/v1.2/upgrade/known_issues/4246-pending-replica.png b/static/img/v1.2/upgrade/known_issues/4246-pending-replica.png new file mode 100644 index 00000000000..db2e6d6b0be Binary files /dev/null and b/static/img/v1.2/upgrade/known_issues/4246-pending-replica.png differ diff --git a/static/img/v1.2/upgrade/known_issues/4246-pending.png b/static/img/v1.2/upgrade/known_issues/4246-pending.png new file mode 100644 index 00000000000..9841f218084 Binary files /dev/null and b/static/img/v1.2/upgrade/known_issues/4246-pending.png differ diff --git a/static/img/v1.2/upgrade/known_issues/4246-upgrade-repo-pod.png b/static/img/v1.2/upgrade/known_issues/4246-upgrade-repo-pod.png new file mode 100644 index 00000000000..e55e14f539f Binary files /dev/null and b/static/img/v1.2/upgrade/known_issues/4246-upgrade-repo-pod.png differ