Commit b12d1a4

Merge pull request #275011 from dramasamy/fixnaks
NAKS to NKS, and title fix
2 parents: 2e90531 + 209aacd

1 file changed

articles/operator-nexus/concepts-nexus-kubernetes-placement.md

Lines changed: 62 additions & 61 deletions
@@ -9,24 +9,24 @@ ms.date: 04/19/2024
ms.custom: template-concept
---

-# Background
+# Resource placement in Azure Operator Nexus Kubernetes

Operator Nexus instances are deployed at the customer premises. Each instance
comprises one or more racks of bare metal servers.

-When a user creates a Nexus Kubernetes Cluster (NAKS), they specify a count and
+When a user creates a Nexus Kubernetes Cluster (NKS), they specify a count and
a [stock keeping unit](./reference-nexus-kubernetes-cluster-sku.md) (SKU) for
virtual machines (VM) that make up the Kubernetes Control Plane and one or more
Agent Pools. Agent Pools are the set of Worker Nodes on which a customer's
containerized network functions run.

The Nexus platform is responsible for deciding the bare metal server on which
-each NAKS VM launches.
+each NKS VM launches.

-## How the Nexus Platform Schedules a NAKS VM
+## How the Nexus platform schedules a Nexus Kubernetes Cluster VM

Nexus first identifies the set of potential bare metal servers that meet all of
-the resource requirements of the NAKS VM SKU. For example, if the user
+the resource requirements of the NKS VM SKU. For example, if the user
specified an `NC_G48_224_v1` VM SKU for their agent pool, Nexus collects the
bare metal servers that have available capacity for 48 vCPU, 224Gi of RAM, etc.

@@ -35,39 +35,39 @@ Plane being scheduled. If this field isn't empty, Nexus filters the list of
potential bare metal servers to only those servers in the specified
availability zones (racks). This behavior is a *hard scheduling constraint*. If
there are no bare metal servers in the filtered list, Nexus *doesn't schedule*
-the NAKS VM and the cluster fails to provision.
+the NKS VM and the cluster fails to provision.

Once Nexus identifies a list of potential bare metal servers on which to place
-the NAKS VM, Nexus then picks one of the bare metal servers after applying the
+the NKS VM, Nexus then picks one of the bare metal servers after applying the
following sorting rules:

-1. Prefer bare metal servers in availability zones (racks) that don't have NAKS
-   VMs from this NAKS Cluster. In other words, *spread the NAKS VMs for a NAKS
+1. Prefer bare metal servers in availability zones (racks) that don't have NKS
+   VMs from this NKS Cluster. In other words, *spread the NKS VMs for an NKS
   Cluster across availability zones*.

1. Prefer bare metal servers within a single availability zone (rack) that
-   don't have other NAKS VMs from the same NAKS Cluster. In other words,
-   *spread the NAKS VMs for a NAKS Cluster across bare metal servers within an
+   don't have other NKS VMs from the same NKS Cluster. In other words,
+   *spread the NKS VMs for an NKS Cluster across bare metal servers within an
   availability zone*.

-1. If the NAKS VM SKU is either `NC_G48_224_v1` or `NC_P46_224_v1`, prefer
+1. If the NKS VM SKU is either `NC_G48_224_v1` or `NC_P46_224_v1`, prefer
   bare metal servers that already house `NC_G48_224_v1` or `NC_P46_224_v1`
-   NAKS VMs from other NAKS Clusters. In other words, *group the extra-large
-   VMs from different NAKS Clusters on the same bare metal servers*. This rule
+   NKS VMs from other NKS Clusters. In other words, *group the extra-large
+   VMs from different NKS Clusters on the same bare metal servers*. This rule
   "bin packs" the extra-large VMs in order to reduce fragmentation of the
   available compute resources.

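To make the filter-and-sort behavior above easier to follow, here is a minimal, purely illustrative Python sketch of the placement decision. It is not the Nexus scheduler's implementation: the server model, capacity accounting, and tie-breaking are simplified assumptions based only on the rules described in this article (NUMA alignment, hugepages, and other real constraints are ignored).

```python
from dataclasses import dataclass, field

EXTRA_LARGE_SKUS = {"NC_G48_224_v1", "NC_P46_224_v1"}


@dataclass
class BareMetalServer:
    name: str
    rack: int                       # availability zone (rack number)
    free_vcpu: int
    free_ram_gi: int
    vms: list = field(default_factory=list)  # list of (cluster_name, sku) tuples


def place_vm(servers, cluster, sku, vcpu, ram_gi, allowed_zones=None):
    """Pick a bare metal server for one NKS VM, mirroring the rules above."""
    # Hard constraints: resource fit plus the optional AvailabilityZones filter.
    candidates = [
        s for s in servers
        if s.free_vcpu >= vcpu and s.free_ram_gi >= ram_gi
        and (allowed_zones is None or s.rack in allowed_zones)
    ]
    if not candidates:
        return None  # no placement possible; the cluster would fail to provision

    racks_with_this_cluster = {
        s.rack for s in servers for c, _ in s.vms if c == cluster
    }

    def preference(s):
        # False sorts before True, so each term encodes "preferred" as False.
        return (
            s.rack in racks_with_this_cluster,               # rule 1: spread across racks
            any(c == cluster for c, _ in s.vms),             # rule 2: spread within a rack
            not (sku in EXTRA_LARGE_SKUS                     # rule 3: bin-pack extra-large SKUs
                 and any(k in EXTRA_LARGE_SKUS for c, k in s.vms if c != cluster)),
        )

    chosen = min(candidates, key=preference)
    chosen.free_vcpu -= vcpu
    chosen.free_ram_gi -= ram_gi
    chosen.vms.append((cluster, sku))
    return chosen
```

Calling `place_vm` once per VM in an agent pool (for example with `cluster="cluster-a"`, `sku="NC_P46_224_v1"`, `vcpu=46`, `ram_gi=224`) reproduces the zone-spreading and extra-large bin-packing preferences in order.
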
-## Example Placement Scenarios
+## Example placement scenarios

The following sections highlight behavior that Nexus users should expect
-when creating NAKS Clusters against an Operator Nexus environment.
+when creating NKS Clusters against an Operator Nexus environment.

-> **Hint**: You can see which bare metal server your NAKS VMs were scheduled to
-> by examining the `nodes.bareMetalMachineId` property of the NAKS
+> **Hint**: You can see which bare metal server your NKS VMs were scheduled to
+> by examining the `nodes.bareMetalMachineId` property of the NKS
> KubernetesCluster resource or viewing the "Host" column in Azure Portal's
> display of Kubernetes Cluster Nodes.

-:::image type="content" source="media/nexus-kubernetes/show-baremetal-host.png" alt-text="A screenshot showing bare metal server for NAKS VMs.":::
+:::image type="content" source="media/nexus-kubernetes/show-baremetal-host.png" lightbox="media/nexus-kubernetes/show-baremetal-host.png" alt-text="A screenshot showing bare metal server for NKS VMs.":::

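If you want this information from a script instead of the portal, the sketch below reads a KubernetesCluster resource that was exported to JSON (for example with the Azure CLI `networkcloud` extension; the exact command and the property layout are assumptions to verify against your installed tooling) and prints each node's bare metal host.

```python
import json

# Hypothetical export, e.g.:
#   az networkcloud kubernetescluster show -g <rg> -n <cluster> > cluster.json
# The property layout below is an assumption; adjust the keys to match what
# your tooling actually returns.
with open("cluster.json") as f:
    cluster = json.load(f)

nodes = cluster.get("nodes") or cluster.get("properties", {}).get("nodes", [])
for node in nodes:
    name = node.get("name", "<unknown>")
    bmm_id = node.get("bareMetalMachineId") or "<not reported>"
    # The last segment of the ARM resource ID is the bare metal machine name.
    print(f"{name} -> {bmm_id.rsplit('/', 1)[-1]}")
```
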
The example Operator Nexus environment has these specifications:

@@ -77,12 +77,12 @@ The example Operator Nexus environment has these specifications:

[numa]: https://en.wikipedia.org/wiki/Non-uniform_memory_access

-### Empty Environment
+### Empty environment

Given an empty Operator Nexus environment with the given capacity, we create
three differently sized Nexus Kubernetes Clusters.

-The NAKS Clusters have these specifications, and we assume for the purposes of
+The NKS Clusters have these specifications, and we assume for the purposes of
this exercise that the user creates the three Clusters in the following order:

Cluster A
@@ -124,17 +124,16 @@ Cluster C Agent Pool #1 has 12 VMs restricted to AvailabilityZones [1, 4] so it
has 12 VMs on 12 bare metal servers, six in each of racks 1 and 4.

Extra-large VMs (the `NC_P46_224_v1` SKU) from different clusters are placed
-on the same bare metal servers (see rule #3 in
-[How the Nexus Platform Schedules a VM][#how-the-nexus-platform-schedule-a-vm]).
+on the same bare metal servers (see rule #3 in [How the Nexus platform schedules a Nexus Kubernetes Cluster VM](#how-the-nexus-platform-schedules-a-nexus-kubernetes-cluster-vm)).

Here's a visualization of a layout the user might see after deploying Clusters
A, B, and C into an empty environment.

-:::image type="content" source="media/nexus-kubernetes/after-first-deployment.png" alt-text="Diagram showing possible layout of VMs after first deployment.":::
+:::image type="content" source="media/nexus-kubernetes/after-first-deployment.png" lightbox="media/nexus-kubernetes/after-first-deployment.png" alt-text="Diagram showing possible layout of VMs after first deployment.":::

-### Half-full Environment
+### Half-full environment

-We now run through an example of launching another NAKS Cluster when the target
+We now run through an example of launching another NKS Cluster when the target
environment is half-full. The target environment is half-full after Clusters A,
B, and C are deployed into the target environment.

@@ -164,7 +163,7 @@ If a Cluster D control plane VM lands on rack 7 or 8, it's likely that one
Cluster D Agent Pool #1 VM lands on the same bare metal server as that Cluster
D control plane VM. This behavior is due to Agent Pool #1 being "pinned" to
racks 7 and 8. Capacity constraints in those racks cause the scheduler to
-collocate a control plane VM and an Agent Pool #1 VM from the same NAKS
+collocate a control plane VM and an Agent Pool #1 VM from the same NKS
Cluster.

Cluster D's Agent Pool #2 has three VMs on different bare metal servers on each
@@ -176,12 +175,12 @@ and Agent Pool #2 are collocated on the same bare metal servers in racks 7 and
Here's a visualization of a layout the user might see after deploying Cluster
D into the target environment.

-:::image type="content" source="media/nexus-kubernetes/after-second-deployment.png" alt-text="Diagram showing possible layout of VMs after second deployment.":::
+:::image type="content" source="media/nexus-kubernetes/after-second-deployment.png" lightbox="media/nexus-kubernetes/after-second-deployment.png" alt-text="Diagram showing possible layout of VMs after second deployment.":::

-### Nearly full Environment
+### Nearly full environment

In our example target environment, four of the eight racks are
-close to capacity. Let's try to launch another NAKS Cluster.
+close to capacity. Let's try to launch another NKS Cluster.

Cluster E has the following specifications:

@@ -197,71 +196,73 @@ into the target environment.
| E | Agent Pool #1 | `NC_P46_224_v1` | 32 | 8 | 8 | **4** | **3, 4 or 5** |

Cluster E's Agent Pool #1 will spread unevenly over all eight racks. Racks 7
-and 8 will have three NAKS VMs from Agent Pool #1 instead of the expected four
-NAKS VMs because there's no more capacity for the extra-large SKU VMs in those
+and 8 will have three NKS VMs from Agent Pool #1 instead of the expected four
+NKS VMs because there's no more capacity for the extra-large SKU VMs in those
racks after scheduling Clusters A through D. Because racks 7 and 8 don't have
-capacity for the fourth extra-large SKU in Agent Pool #1, five NAKS VMs will
+capacity for the fourth extra-large SKU in Agent Pool #1, five NKS VMs will
land on the two least-utilized racks. In our example, those least-utilized
racks were racks 3 and 6.

Here's a visualization of a layout the user might see after deploying Cluster
E into the target environment.

-:::image type="content" source="media/nexus-kubernetes/after-third-deployment.png" alt-text="Diagram showing possible layout of VMs after third deployment.":::
+:::image type="content" source="media/nexus-kubernetes/after-third-deployment.png" lightbox="media/nexus-kubernetes/after-third-deployment.png" alt-text="Diagram showing possible layout of VMs after third deployment.":::

-## Placement during a Runtime Upgrade
+## Placement during a runtime upgrade

As of April 2024 (Network Cloud 2304.1 release), runtime upgrades are performed
using a rack-by-rack strategy. Bare metal servers in rack 1 are reimaged all at
once. The upgrade process pauses until all the bare metal servers successfully
restart and tell Nexus that they're ready to receive workloads.

-> Note: It is possible to instruct Operator Nexus to only reimage a portion of
+> [!NOTE]
+> It is possible to instruct Operator Nexus to only reimage a portion of
> the bare metal servers in a rack at once; however, the default is to reimage
> all bare metal servers in a rack in parallel.

When an individual bare metal server is reimaged, all workloads running on that
-bare metal server, including all NAKS VMs, lose power and connectivity. Workload
-containers running on NAKS VMs will, in turn, lose power and connectivity.
-After one minute of not being able to reach those workload containers, the NAKS
+bare metal server, including all NKS VMs, lose power and connectivity. Workload
+containers running on NKS VMs will, in turn, lose power and connectivity.
+After one minute of not being able to reach those workload containers, the NKS
Cluster's Kubernetes Control Plane will mark the corresponding Pods as
-unhealthy. If the Pods are members of a Deployment or StatefulSet, the NAKS
+unhealthy. If the Pods are members of a Deployment or StatefulSet, the NKS
Cluster's Kubernetes Control Plane attempts to launch replacement Pods to
bring the observed replica count of the Deployment or StatefulSet back to the
desired replica count.

New Pods only launch if there's available capacity for the Pod in the remaining
-healthy NAKS VMs. As of April 2024 (Network Cloud 2304.1 release), new NAKS VMs
-aren't created to replace NAKS VMs that were on the bare metal server being
+healthy NKS VMs. As of April 2024 (Network Cloud 2304.1 release), new NKS VMs
+aren't created to replace NKS VMs that were on the bare metal server being
reimaged.

-Once the bare metal server is successfully reimaged and able to accept new NAKS
-VMs, the NAKS VMs that were originally on the same bare metal server relaunch
+Once the bare metal server is successfully reimaged and able to accept new NKS
+VMs, the NKS VMs that were originally on the same bare metal server relaunch
on the newly reimaged bare metal server. Workload containers may then be
-scheduled to those NAKS VMs, potentially restoring the Deployments or
-StatefulSets that had Pods on NAKS VMs that were on the bare metal server.
+scheduled to those NKS VMs, potentially restoring the Deployments or
+StatefulSets that had Pods on NKS VMs that were on the bare metal server.

-> **Note**: This behavior may seem to the user as if the NAKS VMs did not
+> [!NOTE]
+> This behavior may seem to the user as if the NKS VMs did not
> "move" from the bare metal server, when in fact a new instance of an identical
-> NAKS VM was launched on the newly reimaged bare metal server that retained the
+> NKS VM was launched on the newly reimaged bare metal server that retained the
> same bare metal server name as before reimaging.

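Because an entire rack can be reimaged at once, workloads whose replicas all land on nodes in one rack can lose every replica simultaneously. A common mitigation is a topology spread constraint on the Deployment. The sketch below builds such a manifest as a plain Python dictionary; the `topology.kubernetes.io/zone` label, the image, and the `my-cnf` name are illustrative assumptions, so confirm which zone or rack labels your NKS worker nodes actually carry before relying on it.

```python
import json

# Illustrative Deployment that spreads replicas across availability zones so a
# single-rack reimage during a runtime upgrade can't take down every replica.
# Assumes NKS worker nodes carry the standard topology.kubernetes.io/zone label;
# verify with `kubectl get nodes --show-labels` before use.
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "my-cnf"},
    "spec": {
        "replicas": 4,
        "selector": {"matchLabels": {"app": "my-cnf"}},
        "template": {
            "metadata": {"labels": {"app": "my-cnf"}},
            "spec": {
                "topologySpreadConstraints": [
                    {
                        "maxSkew": 1,
                        "topologyKey": "topology.kubernetes.io/zone",
                        "whenUnsatisfiable": "ScheduleAnyway",
                        "labelSelector": {"matchLabels": {"app": "my-cnf"}},
                    }
                ],
                "containers": [
                    {"name": "my-cnf", "image": "registry.example.com/my-cnf:1.0"}
                ],
            },
        },
    },
}

print(json.dumps(deployment, indent=2))  # pipe into `kubectl apply -f -` if desired
```
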
-## Best Practices
+## Best practices

When working with Operator Nexus, keep the following best practices in mind.

* Avoid specifying `AvailabilityZones` for an Agent Pool.
-* Launch larger NAKS Clusters before smaller ones.
+* Launch larger NKS Clusters before smaller ones.
* Reduce the Agent Pool's Count before reducing the VM SKU size.

### Avoid specifying AvailabilityZones for an Agent Pool

As you can tell from the above placement scenarios, specifying
-`AvailabilityZones` for an Agent Pool is the primary reason that NAKS VMs from
-the same NAKS Cluster would end up on the same bare metal server. By specifying
+`AvailabilityZones` for an Agent Pool is the primary reason that NKS VMs from
+the same NKS Cluster would end up on the same bare metal server. By specifying
`AvailabilityZones`, you "pin" the Agent Pool to a subset of racks and
therefore limit the number of potential bare metal servers in that set of racks
-for other NAKS Clusters and other Agent Pool VMs in the same NAKS Cluster to
+for other NKS Clusters and other Agent Pool VMs in the same NKS Cluster to
land on.

Therefore, our first best practice is to avoid specifying `AvailabilityZones`
@@ -274,27 +275,27 @@ two or three VMs in an agent pool. You might consider setting
`AvailabilityZones` for that agent pool to `[1,3,5,7]` or `[0,2,4,6]` to
increase availability during runtime upgrades.

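As an illustration of the exception above, the sketch below assembles two agent pool definitions as plain Python dictionaries: a small pool pinned to alternating racks and a large pool left unpinned. The property names (`count`, `vmSkuName`, `availabilityZones`) and the small SKU name are assumptions for illustration only; check the Nexus Kubernetes cluster API or Bicep reference for the exact payload shape before using them.

```python
# Illustrative agent pool settings only; property names, payload shape, and the
# small SKU name are assumptions to verify against the Nexus Kubernetes cluster
# API reference for your environment.
small_agent_pool = {
    "name": "agentpool1",
    "mode": "User",
    "count": 3,                    # small pool: pinning to alternating racks
    "vmSkuName": "NC_P10_56_v1",   # hypothetical small SKU name for the example
    "availabilityZones": ["1", "3", "5", "7"],
}

large_agent_pool = {
    "name": "agentpool2",
    "mode": "User",
    "count": 24,                   # large pool: let the scheduler spread it
    "vmSkuName": "NC_P46_224_v1",
    # No availabilityZones: all racks remain candidates (the default, and the
    # first best practice in this article).
}

for pool in (small_agent_pool, large_agent_pool):
    zones = pool.get("availabilityZones", "all racks (unpinned)")
    print(f"{pool['name']}: {pool['count']} x {pool['vmSkuName']} -> zones {zones}")
```
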
-### Launch larger NAKS Clusters before smaller ones
+### Launch larger NKS Clusters before smaller ones

-As of April 2024, and the Network Cloud 2403.1 release, NAKS Clusters are
+As of April 2024, and the Network Cloud 2403.1 release, NKS Clusters are
scheduled in the order in which they're created. To most efficiently pack your
-target environment, we recommend you create larger NAKS Clusters before
+target environment, we recommend you create larger NKS Clusters before
smaller ones. Likewise, we recommend you schedule larger Agent Pools before
smaller ones.

This recommendation is important for Agent Pools using the extra-large
`NC_G48_224_v1` or `NC_P46_224_v1` SKU. Scheduling the Agent Pools with the
greatest count of these extra-large SKU VMs creates a larger set of bare metal
-servers upon which other extra-large SKU VMs from Agent Pools in other NAKS
+servers upon which other extra-large SKU VMs from Agent Pools in other NKS
Clusters can collocate.

-### Reduce the Agent Pool's Count before reducing the VM SKU size
+### Reduce the Agent Pool's count before reducing the VM SKU size

-If you run into capacity constraints when launching a NAKS Cluster or Agent
+If you run into capacity constraints when launching an NKS Cluster or Agent
Pool, reduce the Count of the Agent Pool before adjusting the VM SKU size. For
-example, if you attempt to create a NAKS Cluster with an Agent Pool with VM SKU
+example, if you attempt to create an NKS Cluster with an Agent Pool with VM SKU
size of `NC_P46_224_v1` and a Count of 24 and get back a failure to provision
-the NAKS Cluster due to insufficient resources, you may be tempted to use a VM
+the NKS Cluster due to insufficient resources, you may be tempted to use a VM
SKU Size of `NC_P36_168_v1` and continue with a Count of 24. However, due to
requirements for workload VMs to be aligned to a single NUMA cell on a bare
metal server, it's likely that that same request results in similar
