
Commit c828d39

Merge pull request #299780 from santhosh-kumar-cm/patch-1
[operator-nexus] R4.4 - Update howto-cluster-runtime-upgrade.md to provide context and info regarding the new management groups.
2 parents e71e6bd + 204ef91 commit c828d39

2 files changed: +5 −4 lines

articles/operator-nexus/concepts-cluster-upgrade-overview.md

Lines changed: 2 additions & 2 deletions
@@ -5,7 +5,7 @@ author: matternst7258
 ms.author: matthewernst
 ms.service: azure-operator-nexus
 ms.topic: conceptual
-ms.date: 11/11/2024
+ms.date: 05/21/2025
 ms.custom: template-concept
 ---
 

@@ -34,7 +34,7 @@ Patch runtime release is produced monthly in between the minor releases. These r
 
 Starting a runtime upgrade is defined under [Upgrading cluster runtime via Azure CLI](./howto-cluster-runtime-upgrade.md).
 
-The runtime upgrade starts by upgrading the three management servers designated as the control plane nodes. The spare control plane server is the first server to upgrade. The last control plane server deprovisions and transitions to `Available` state. These servers are updated serially and proceed only when each completes. The remaining management servers are upgraded into four different groups and completed one group at a time.
+The runtime upgrade starts by upgrading the three management servers designated as the control plane nodes. The spare control plane server is the first server to upgrade. The last control plane server deprovisions and transitions to the `Available` state. These servers are updated serially, each proceeding only after the previous one completes. The remaining management servers are now divided into two management groups instead of a single group. Each group is upgraded sequentially, in two stages, with a 50% success threshold per group. This capability lets components running on the management servers maintain resiliency during the runtime upgrade by applying affinity rules; in this release, each CSN uses it by placing one instance in each management group. No customer interaction with this functionality is required, though additional labels identifying the groups may appear on management nodes.
 
 > [!Note]
 > Customers may observe the spare server with a different runtime version. This is expected.
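The two-group sequencing described in the new paragraph above can be sketched as follows. This is a hypothetical illustration, not the actual Nexus upgrade code: the group names, node names, and the way the 50% success threshold is evaluated are all assumptions made for clarity.

```python
def upgrade_management_groups(groups, upgrade_node, success_threshold=0.5):
    """Upgrade each management group sequentially.

    A group passes when at least `success_threshold` (50% per the docs)
    of its nodes upgrade successfully; the next group starts only after
    the current one passes.
    """
    for name, nodes in groups.items():
        succeeded = sum(1 for node in nodes if upgrade_node(node))
        if succeeded / len(nodes) < success_threshold:
            raise RuntimeError(f"group {name!r} fell below the success threshold")
    return True

# Example: two hypothetical groups of four management servers each.
groups = {
    "mgmt-group-1": ["ms1", "ms2", "ms3", "ms4"],
    "mgmt-group-2": ["ms5", "ms6", "ms7", "ms8"],
}
upgrade_management_groups(groups, upgrade_node=lambda node: True)  # → True
```

The sketch shows why the groups matter for resiliency: a component that keeps one instance in each group (as each CSN does in this release) always has an instance in a group that is not currently being upgraded.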

articles/operator-nexus/howto-cluster-runtime-upgrade.md

Lines changed: 3 additions & 2 deletions
@@ -6,7 +6,7 @@ ms.author: bpinto
 ms.service: azure-operator-nexus
 ms.custom: azure-operator-nexus, devx-track-azurecli
 ms.topic: how-to
-ms.date: 02/25/2025
+ms.date: 05/21/2025
 # ms.custom: template-include
 ---
 

@@ -160,7 +160,8 @@ az networkcloud cluster update-version --cluster-name "<CLUSTER>" \
 ```
 
 The runtime upgrade is a long process. The upgrade first upgrades the management nodes and then sequentially Rack-by-Rack for the worker nodes.
-The upgrade is considered to be finished when 80% of worker nodes per rack and 100% of management nodes are successfully upgraded.
+The management servers are now divided into two management groups instead of a single group. This capability lets components running on the management servers maintain resiliency during the runtime upgrade by applying affinity rules; in this release, each CSN uses it by placing one instance in each management group. No customer interaction with this functionality is required, though additional labels identifying the groups may appear on management nodes.
+The upgrade is considered finished when 80% of worker nodes per rack and 50% of management nodes in each group are successfully upgraded.
 Workloads might be impacted while the worker nodes in a rack are in the process of being upgraded, however workloads in all other racks aren't impacted. Consideration of workload placement in light of this implementation design is encouraged.
 
 Upgrading all the nodes takes multiple hours, depending upon how many racks exist for the Cluster.
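The revised completion criteria (80% of worker nodes per rack, 50% of management nodes per group) can be sketched as a simple check. This is an illustrative assumption about how the thresholds combine, not an actual Nexus API; the `(upgraded, total)` tuples are a made-up data shape.

```python
def upgrade_finished(worker_racks, mgmt_groups,
                     worker_threshold=0.8, mgmt_threshold=0.5):
    """Return True when every rack meets the worker threshold and
    every management group meets the management threshold.

    Both arguments map a name to an (upgraded, total) node count.
    """
    racks_ok = all(done / total >= worker_threshold
                   for done, total in worker_racks.values())
    groups_ok = all(done / total >= mgmt_threshold
                    for done, total in mgmt_groups.values())
    return racks_ok and groups_ok

# Example: rack2 has only 7 of 10 workers upgraded (70% < 80%),
# so the upgrade is not yet considered finished.
workers = {"rack1": (8, 10), "rack2": (7, 10)}
mgmt = {"mgmt-group-1": (2, 4), "mgmt-group-2": (3, 4)}
upgrade_finished(workers, mgmt)  # → False
```

Note the per-rack and per-group framing: one lagging rack or group holds back completion even if the fleet-wide average clears the thresholds.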
