articles/operator-nexus/concepts-nexus-availability.md (3 additions & 3 deletions)
@@ -55,7 +55,7 @@ Go through the following steps to help plan an Operator Nexus deployment.
4. Plan the distribution of workload instances to determine the number of racks needed in each site type, allowing for the fact that each rack is an Operator Nexus zone. The platform can enforce affinity/anti-affinity rules at the scope of these zones, to ensure workload instances are distributed in such a way as to be resilient to failures of individual servers or racks. See [this article](./howto-virtual-machine-placement-hints.md) for more on affinity/anti-affinity rules. The Operator Nexus Azure Kubernetes Service (NAKS) controller automatically distributes nodes within a cluster across the available servers in a zone as uniformly as possible, within other constraints. As a result, failure of any single server has the minimum impact on the total capacity remaining.
-5. Factor in the [threshold redundancy](./howto-cluster-runtime-upgrade#configure-compute-threshold-parameters-for-runtime-upgrade-using-cluster-updatestrategy) that is required within each site on upgrade. This configuration option indicates to the orchestration engine the minimum number of worker nodes that must be available in order for a platform upgrade to be considered successful and allowed to proceed. Reserving these nodes eats into any capacity headroom. Setting a higher bar decreases the overall deployment's resilience to failure of individual nodes, but improves efficiency of utilization of the available capacity.
+5. Factor in the [threshold redundancy](./howto-cluster-runtime-upgrade.md#configure-compute-threshold-parameters-for-runtime-upgrade-using-cluster-updatestrategy) that is required within each site on upgrade. This configuration option indicates to the orchestration engine the minimum number of worker nodes that must be available in order for a platform upgrade to be considered successful and allowed to proceed. Reserving these nodes eats into any capacity headroom. Setting a higher bar decreases the overall deployment's resilience to failure of individual nodes, but improves efficiency of utilization of the available capacity.
6. Operator Nexus supports between 1 and 8 racks per site inclusive, with each rack containing 4, 8, 12 or 16 servers. All racks must be identical in terms of number of servers. See [here](./reference-near-edge-compute.md) for specifics of the resource available for workloads. See the following diagram, and also [this article](./reference-limits-and-quotas.md) for other limits and quotas that might have an impact.
@@ -119,13 +119,13 @@ Operator Nexus upgrade is initiated by the customer, but it's then managed by th
- The process is only active on one rack in the selected site at a time. Although upgrade is done in-place, there's still some impact to the worker nodes in the rack during the upgrade.
-For more information about the upgrade process, see [this article](./howto-cluster-runtime-upgrade#upgrading-cluster-runtime-using-cli). For more information about ensuring control-plane resiliency, see [this one](./concepts-rack-resiliency.md).
+For more information about the upgrade process, see [this article](./howto-cluster-runtime-upgrade.md#upgrading-cluster-runtime-using-cli). For more information about ensuring control-plane resiliency, see [this one](./concepts-rack-resiliency.md).
## Designing and Operating High Availability Workloads for Operator Nexus
Workloads should ideally follow a cloud-native design, with N+k clusters that can be deployed across multiple nodes and racks within a site, using the Operator Nexus zone concept.
-The Well Architected Framework guidance on [mission critical](/well-architected/mission-critical/) and [carrier grade](/well-architected/carrier-grade/) workloads on Azure also applies to workloads on Operator Nexus.
+The Well Architected Framework guidance on [mission critical](../well-architected/mission-critical/) and [carrier grade](../well-architected/carrier-grade/) workloads on Azure also applies to workloads on Operator Nexus.
Designing and implementing highly available workloads on any platform requires a top-down approach. Start with an understanding of the availability required from the solution as a whole. Consider the key elements of the solution and their predicted availability. Then determine how these attributes need to be combined in order to achieve the solution level goals.