Commit e1c92cd

updates per acrolinx suggestions
1 parent b7eafb1 commit e1c92cd

1 file changed

+69
-90
lines changed

articles/operator-nexus/howto-run-instance-readiness-testing.md

Lines changed: 69 additions & 90 deletions
@@ -1,14 +1,3 @@
1-
---
2-
title: "Azure Operator Nexus: How to run Instance Readiness Testing"
3-
description: Learn how to run instance readiness testing.
4-
author: lesage-oded
5-
ms.author: odedlesage
6-
ms.service: azure-operator-nexus
7-
ms.topic: how-to
8-
ms.date: 02/29/2024
9-
ms.custom: template-how-to
10-
---
11-
121
# Azure Operator Nexus Instance Readiness Test (IRT)
132

143
The Instance Readiness Test (IRT) framework is an optional/add-on tool for the Nexus platform. It enables operators to verify the successful deployment and readiness of the Azure Operator Nexus instance for workload deployment. This verification applies to both initial deployment and subsequent upgrades of the Nexus. It runs a series of tests and provides the test results as an html report.
@@ -27,10 +16,25 @@ The Instance Readiness Test (IRT) framework is an optional/add-on tool for the N
2716
## Tests executed with IRT
2817
- Validate that l3 domains in the fabric subscription and resource group exist after all tests on the resources under test are done.
2918
- Validate that there are l3 networks created in the testing resource group after all tests on the resources under test are done.
19+
- Validate that ApiserverAuditRequestsRejectedTotal metric data is present within the last 10 minutes.
20+
Every average metric should be greater than 0.
21+
- Validate that ContainerMemoryUsageBytes metric data is present within the last 10 minutes.
22+
Every average metric should be greater than 0.
23+
- Validate that CorednsDnsRequestsTotal metric data is present within the last 10 minutes.
24+
Every average metric should be greater than 0.
25+
- Validate that EtcdServerIsLeader metric data is present within the last 10 minutes. Every count metric should be greater than 0.
3026
- Validate that FelixClusterNumHosts metric data is present within the last 10 minutes.
3127
Every average metric should be greater than 0.
28+
- Validate that IdracPowerOn metric data is present within the last 10 minutes. Every count metric should be greater than 0.
29+
- Validate that KubeDaemonsetStatusCurrentNumberScheduled metric data is present within the last 10 minutes. Every average metric should be greater than 0.
30+
- Validate that KubeletRunningPods metric data is present within the last 10 minutes.
31+
Every average metric should be greater than 0.
32+
- Validate that KubevirtInfo metric data is present within the last 10 minutes.
33+
Every average metric should be greater than 0.
3234
- Validate that NodeOsInfo metric data for a baremetal machine is present within the last 10 minutes.
3335
Every count metric should be greater than 0.
36+
- Validate that TyphaConnectionsAccepted metric data is present within the last 10 minutes.
37+
Every average metric should be greater than 0.
3438
- Test the transmission of IPv4 TCP data between two virtual machines using iPerf3 and affinity settings in the ARM template.
3539
The test ensures that the data throughput exceeds 60 Mbps.
3640
- Test the transmission of IPv6 TCP data between two virtual machines using iPerf3 and affinity settings in the ARM template.
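
The metric checks in this list share one pattern: data must be present in the window, and every aggregated value must be greater than 0. The following is a minimal Bash sketch of that pattern, assuming the values have already been fetched. `all_positive` is a hypothetical helper for illustration and isn't part of IRT.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of IRT: succeeds only when at least one
# value is supplied and every value is greater than 0.
all_positive() {
  # No data points means the metric was absent in the window.
  [ "$#" -gt 0 ] || return 1
  local value
  for value in "$@"; do
    # awk handles decimal values, which the integer-only [ -gt ] can't.
    awk -v v="$value" 'BEGIN { exit !(v + 0 > 0) }' || return 1
  done
}
```

For example, `all_positive 3 0.5 12` succeeds, while `all_positive 3 0 12` and a call with no values both fail.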
@@ -64,17 +68,17 @@ The Instance Readiness Test (IRT) framework is an optional/add-on tool for the N
6468
Stderr should be empty and no packet loss should be observed.
6569
- Test IPv6 ping between a NAKS cluster pod and a VM with jumbo frames enabled.
6670
Stderr should be empty and no packet loss should be observed.
67-
- Validate PVC has been created successfully.
68-
- Validate PV has been created successfully.
69-
- Test creating a PVC with volumeMode Block and accessMode RWO.
70-
- Validate that all the nexus-shared and nexus-volume volumes that were added are mounted in sts 0
71+
- Validate PersistentVolumeClaim is created successfully.
72+
- Validate PersistentVolume is created successfully.
73+
- Test creating a PVC with volumeMode Block and accessMode RWO.
74+
- Validate that all the nexus-shared and nexus-volume volumes that were added are mounted in sts 0.
7175
- Validate that all the nexus-shared and nexus-volume volumes that were added are mounted in sts 1.
7276
- Validate that nfs storage mounted on sts 0 is writable.
7377
- Validate that nfs storage file written to sts 0 can be read.
7478
- Validate that shared nfs storage mounted on sts 0 is writable.
7579
- Validate that shared nfs file written to sts 0 can be read.
7680
- Validate that nfs storage mounted on sts 1 is writable.
77-
- Validate that shared file written to sts 0 can be read in sts 1
81+
- Validate that shared file written to sts 0 can be read in sts 1.
7882
- Validate that shared nfs storage mounted on sts 1 is writable.
7983
- Validate that shared file written to sts 0 and sts 1 can be read from sts 1.
8084
- Validate that shared file written to sts 0 and sts 1 can be read from sts 0.
@@ -106,19 +110,19 @@ For access to the nexus-samples GitHub repository
106110

107111
## Environment Requirements
108112

109-
- A Linux environment (Ubuntu suggested) capable of calling Azure APIs
110-
- Support for other Linux distros e.g. RedHat, Mariner, etc. depends on being able to install the necessary tooling. See [Install Dependencies](#install-dependencies) section.
113+
- A Linux environment (Ubuntu suggested) capable of calling Azure APIs.
114+
- Support for other Linux distros (for example, RedHat or Mariner) depends on being able to install the necessary tooling. See the [Install Dependencies](#install-dependencies) section.
111115
- Any machine that has the required packages installed should be able to use the scripts.
112-
- Knowledge of networks to use for the test
116+
- Knowledge of networks to use for the test.
113117
* Networks to use for the test are specified in a "networks-blueprint.yml" file, see [Input Configuration](#input-configuration).
114-
- A way to download the IRT release package e.g. curl, wget, etc
115-
- The ability to create a service principal with the correct roles
116-
- The ability to read secrets from the KeyVault, see [Service Principal] (#create-service-principal-and-security-group) section for more details
117-
- The ability to create security groups in your Active Directory tenant
118+
- A way to download the IRT release package, for example, curl or wget.
119+
- The ability to create a service principal with the correct roles.
120+
- The ability to read secrets from the KeyVault. See the [Service Principal](#create-service-principal-and-security-group) section for more details.
121+
- The ability to create security groups in your Active Directory tenant.
118122

119123
## Input Configuration
120124

121-
Start by building your input file. The IRT tarball provides `irt-input.example.yml` as an example. Follow the [instructions](#download-irt) to download the tarball. Please note that these values **will not work for your instances**. You need to manually change them and rename the file to `irt-input.yml`. We provide the example input file as a stub to help you configure new input files. The example outlines overridable values and their usage. The **[One Time Setup](#one-time-setup)** assists you in setting input values by writing key/value pairs to the config file as they execute.
125+
Start by building your input file. The IRT tarball provides `irt-input.example.yml` as an example. Download the tarball by following the [instructions](#download-irt). Note that these values **will not work for your instances**. You need to manually change them and rename the file to `irt-input.yml`. We provide the example input file as a stub to help you configure new input files. The example outlines overridable values and their usage. The **[One Time Setup](#one-time-setup)** assists you in setting input values by writing key/value pairs to the config file as they execute.
122126

123127
You can provide the network information in a `networks-blueprint.yml` file, similar to the `networks-blueprint.example.yml` that we provide, or append it to the `irt-input.yml` file. The `networks-blueprint.example.yml` defines the schema for IRT. The test creates the networks, so provide network details that aren't in use. Currently, IRT has the following network requirements:
124128

@@ -131,11 +135,11 @@ You can provide the network information in a `networks-blueprint.yml` file, simi
131135
## One Time Setup
132136

133137
### Download IRT
134-
IRT is distributed via tarball from the release section of the [nexus-samples](https://aka.ms/nexus-irt) GitHub repo
138+
IRT is distributed via tarball from the release section of the [nexus-samples](https://aka.ms/nexus-irt) GitHub repo.
135139
1. Find the release package marked with 'Latest'. Download it, extract it, and navigate to the `irt` directory.
136-
1. Extract the tarball to the local file system: `mkdir -p irt && tar xf nexus-irt.tar.gz --directory ./irt`
137-
1. Switch to the new directory `cd irt`
138-
1. See RELEASE-CHANGELOG.md for any notable updates or changes
140+
1. Extract the tarball to the local file system: `mkdir -p irt && tar xf nexus-irt.tar.gz --directory ./irt`.
141+
1. Switch to the new directory `cd irt`.
142+
1. See RELEASE-CHANGELOG.md for any notable updates or changes.
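
The extract step above can be wrapped in a small function that fails early when the download is missing. This is a sketch only: `extract_irt` is a hypothetical name, and the tarball path is passed in rather than assumed.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper around the documented mkdir/tar commands.
# $1: path to the downloaded tarball; $2: destination (defaults to ./irt).
extract_irt() {
  local tarball="$1"
  local dest="${2:-./irt}"
  # Fail early with a clear message if the download is missing.
  [ -f "$tarball" ] || { echo "tarball not found: $tarball" >&2; return 1; }
  mkdir -p "$dest" && tar xf "$tarball" --directory "$dest"
}
```

A typical call would be `extract_irt nexus-irt.tar.gz && cd irt`.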
139143

140144
### Install Dependencies
141145
There are multiple dependencies expected to be available during execution. Review this list:
@@ -185,7 +189,7 @@ The supplemental script, `create-service-principal.sh` creates a service princip
185189

186190
Additionally, the script creates the necessary security group, and adds the service principal to the security group. If the security group exists, it adds the service principal to the existing security group.
187191

188-
Executing `create-service-principal.sh` requires the input yaml to have the following properties, all of them can be overridden by the corresponding environment variables:
192+
Executing `create-service-principal.sh` requires the input yaml to have the following values. All values can be overridden by setting the corresponding environment variables:
189193
```yml
190194
SERVICE_PRINCIPAL:
191195
NAME: "<name>" # env: SERVICE_PRINCIPAL_NAME
@@ -198,7 +202,7 @@ SERVICE_PRINCIPAL:
198202
* `SERVICE_PRINCIPAL.AAD_GROUP_NAME` - The name of the security group.
199203
* `SERVICE_PRINCIPAL.SUBSCRIPTION` - The subscription of the service principal.
200204
* `SERVICE_PRINCIPAL.KV_NAME` - The KeyVault to store the service principal password.
201-
* `SERVICE_PRINCIPAL.KV_ID` - The KeyVault secret where the service principal password is actually stored.
205+
* `SERVICE_PRINCIPAL.KV_ID` - The KeyVault secret where the service principal password is stored.
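
The override rule for these values (an environment variable wins over the input yaml) can be sketched in Bash as follows. `resolve_setting` is a hypothetical helper, not the logic inside `create-service-principal.sh`, and treating an empty environment variable as unset is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints the environment variable's value when it is
# set and non-empty, otherwise the value read from the input yaml.
# $1: environment variable name; $2: value from the yaml file.
resolve_setting() {
  local env_name="$1"
  local yaml_value="$2"
  # ${!env_name} is Bash indirect expansion: the value of the variable
  # whose name is stored in env_name.
  local env_value="${!env_name:-}"
  if [ -n "$env_value" ]; then
    printf '%s\n' "$env_value"
  else
    printf '%s\n' "$yaml_value"
  fi
}
```

For example, with `SERVICE_PRINCIPAL_NAME` unset, `resolve_setting SERVICE_PRINCIPAL_NAME "from-yaml"` prints the yaml value. After the variable is set, the same call prints the variable's value.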
202206

203207
> **_NOTE:_** Please ensure that you have already created a KeyVault (KV_NAME) and/or a Secret (KV_ID) with a dummy value prior to executing `create-service-principal.sh`.
204208
> The `az login` user (person executing IRT) should also be granted access to this KeyVault so secrets can be pulled at runtime.
@@ -227,7 +231,7 @@ KV_ID: "<provided-key-valut-secret>" # If SP already exists please fill it in to
227231
<details>
228232
<summary>Expand to see details for using a custom role </summary>
229233

230-
If you have an existing service principal and would like the convenience of only having to assign one role for IRT execution, you can follow the steps below.
234+
If you have an existing service principal and would like the convenience of only having to assign one role for IRT execution, you can follow the directions in this section.
231235

232236
##### Prerequisites
233237

@@ -236,9 +240,9 @@ If you have an existing service principal and would like the convenience of only
236240

237241
##### Steps
238242

239-
1. Prepare Your Environment
243+
1. Prepare Your Environment:
240244
- Open a Bash Shell:
241-
- You can use any terminal that supports Bash.
245+
- You can use any terminal that supports Bash.
242246

243247
1. Sign in to Azure:
244248
- Execute the following command to sign in to your Azure account:
@@ -270,9 +274,9 @@ If you have an existing service principal and would like the convenience of only
270274
--parameters roleName="$roleName"
271275
```
272276

273-
1. Assign Role to Application Service Principal used for testing
277+
1. Assign Role to Application Service Principal used for testing:
274278

275-
Weather created via the all-in-one setup or using your own, assign the newly created role to your identity, this single role provides all the necessary authorizations to run Instance Readiness Testing.
279+
Whether the service principal was created via the all-in-one setup or is your own, assign the newly created role to your identity. This single role provides all the necessary authorizations to run Instance Readiness Testing.
276280

277281
```bash
278282
# The Application ID of your Service Principal for your application
@@ -297,7 +301,7 @@ If you have an existing service principal and would like the convenience of only
297301
<details>
298302
<summary>Expand to see how to create l3 isolation. </summary>
299303
300-
The testing framework does not create, destroy, or manipulate isolation domains. Therefore, existing isolation domains can be used for execution. Each isolation domain requires at least one external network. The supplemental script, `create-l3-isolation-domains.sh`. Internal networks e.g. L3, trunked, etc. are created, manipulated, and destroyed through the course of testing.
304+
The testing framework doesn't create, destroy, or manipulate isolation domains. Therefore, existing isolation domains can be used for execution. Each isolation domain requires at least one external network. The supplemental script, `create-l3-isolation-domains.sh`, is provided for creating them. Internal networks (for example, L3 and trunked) are created, manipulated, and destroyed through the course of testing.
301305

302306
Executing `create-l3-isolation-domains.sh` requires one **parameter**, a path to a file containing the networks requirements. You can choose either the standalone network-blueprint.yml or the input.yml based on your workflow, either can contain the information needed.
303307

@@ -324,11 +328,10 @@ Executing `create-l3-isolation-domains.sh` requires one **parameter**, a path to
324328

325329
## How to Read the IRT Summary Results
326330

327-
The IRT summary page is a html page that can be downloaded after the
328-
execution of the IRT and can be viewed from any browser.
331+
The IRT summary page is an HTML page that is generated after the
332+
execution of IRT and can be viewed from any browser.
329333

330-
IRT Summary Page comprises three major sections, which drills further to
331-
provide more details.
334+
The IRT summary page comprises three major sections, which expand to provide more details.
332335

333336
- Test Results
334337

@@ -345,47 +348,25 @@ executed, different prerequisite test commands, so totals may not always be the
345348

346349
![irt summary header success](./media/irt/irt-header-success.png)
347350

348-
In case of any failures in the tests, the values represent accordingly.
351+
If any tests fail, the values reflect the failures accordingly.
349352

350353
![irt summary header failure](./media/irt/irt-header-failure.png)
351354

352355
### Test Results
353356

354357
The Test Results section provides all the tests (assertions) that IRT
355358
executes. The Asserters section expands to view the list of tests
356-
(assertions) that are run and available. Each asserter can be further
357-
expanded that loads an accordion pane which provides more details of the
358-
asserter, including the description of the test and any thresholds to be
359-
measured and asserted against.
360-
361-
IRT asserters include the tests related to the following:
362-
363-
- Connectivity between resources -- NS (nslookup) and EW
364-
(pings/iperfs)
365-
366-
- Dual Stack support
367-
368-
- Iperf tests on IPv4 and IPv6
369-
370-
- MTUs 1500 and 9000
371-
372-
- DPDK throughputs measuring using PMDs
373-
374-
- Nexus Kubernetes cluster status
375-
376-
- Volume tests
377-
378-
- PVCs in block and access mode
379-
380-
- PVS write and read tests
359+
(assertions) that are run and available. Each asserter can be further expanded
360+
to load an accordion pane, which provides more details of the asserter,
361+
including the description of the test and any thresholds to be measured and asserted against.
381362

382363
### Display of Test Results
383364

384365
Display of Test results section with all successful tests:
385366

386367
![test results success](./media/irt/irt-test-success.png)
387368

388-
In case of any failures, the asserts are highlighted in red.
369+
If there are any failures, the assertions are highlighted in red.
389370

390371
![test results failure](./media/irt/irt-test-failure.png)
391372

@@ -397,11 +378,11 @@ description of it under standard log.
397378
An example of an Asserter:
398379

399380
*Asserters \[It\] res-test-dpdk-naks-84f5b - network: \'l3network-704\'
400-
(**PMD) average of Rx-pps \[17668558\] should be greater than 8000000*
381+
(**PMD) average of Rx-pps \[17668558\] should be greater than 8000000*.
401382

402383
The above example of an assert reads as the Rx (receive)-pps (packets
403-
per seconds) for l3network-704 is 17668558 which is greater than the
404-
expected 8000000
384+
per second) for l3network-704 is 17668558, which is greater than the
385+
expected 8000000.
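
The comparison behind this asserter is a strict greater-than check of a measured value against a threshold. A minimal Bash sketch, using the numbers from the example, follows. `exceeds_threshold` is a hypothetical helper and isn't how IRT itself evaluates assertions.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds when the measured integer value is
# strictly greater than the threshold.
exceeds_threshold() {
  local measured="$1"
  local threshold="$2"
  [ "$measured" -gt "$threshold" ]
}

# Values taken from the example asserter above.
if exceeds_threshold 17668558 8000000; then
  echo "PASS: Rx-pps 17668558 is greater than 8000000"
fi
```

A measured value equal to the threshold fails the check, because the asserter wording is "greater than".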
405386

406387
![irt success details](./media/irt/irt-detail-success.png)
407388

@@ -418,49 +399,47 @@ for debugging purposes. It consists of four test suites and every suite
418399
consists of the suite relevant tests that expand to provide details.
419400
Failures of any specific tests are highlighted in red.
420401

421-
- Setup Suite - A common suite to deploy Nexus Resources as defined in
422-
the Arm Template required for the framework and tests.
402+
- Setup Suite:
403+
- Sets up test execution and deploys Nexus Resources as defined in
404+
the ARM Template required for the framework and tests.
423405

424-
- Inject Suite -- Injects the required environment variables and test
406+
- Inject Suite:
407+
- Injects the required environment variables and test
425408
data to support the testing of NAKS resources.
426409

427-
- Collect Suite -- The collect suite collects the data published by
410+
- Collect Suite:
411+
- The collect suite collects the data published by
428412
the Setup Suite.
429413

430-
- Cleanup Suite -- Deletes the Nexus resources created for the tests
414+
- Cleanup Suite:
415+
- Deletes the Nexus resources created for the tests
431416
after the data is collected.
432417

433418
![irt debug section](./media/irt/irt-debug-section.png)
434419

435420
## Extras Section
436421

437-
This is an information only section that provides additional details of
438-
the Nexus platform. There are no assertions/tests that represent this
422+
This section is informational only and provides additional details about
423+
the Nexus instance. There are no assertions/tests that represent this
439424
section. It helps operators to check the status of underlying cluster
440425
resources and tenant resources running on the cluster after IRT is
441426
executed.
442427

443-
Extras section consists of results displayed by running two different
444-
text files separately.
428+
The Extras section consists of results displayed by running two different
429+
script files separately.
445430

446-
- Platform Validation Results -- Displays the Nexus under cloud
447-
deployed resources details and their current statuses, including
448-
Cluster Manager details and its extensions, Fabric related details,
449-
Nexus cluster and its extensions, BareMetal Machines, Arc related
450-
and Storage appliances.
431+
- Platform Validation Results:
432+
- Displays the Nexus under cloud deployed resources details and their current statuses, including Cluster Manager details and its extensions, Fabric related details, Nexus cluster and its extensions, BareMetal Machines, Arc related, and Storage appliances.
451433

452-
- Tenant workloads Validation Results -- Displays the Nexus tenant
453-
resources details and their current statuses running on the Nexus
454-
cluster, including displaying of L2 and L3 Isolation Domains, Cloud
455-
Service networks, default cni networks, L2 and L3 networks, trunked
456-
networks, available list of VMs and Nexus Kubernetes clusters.
434+
- Tenant workloads Validation Results:
435+
- Displays the Nexus tenant resources details and their current statuses running on the Nexus cluster, including displaying of L2 and L3 Isolation Domains, Cloud Service networks, default cni networks, L2 and L3 networks, trunked networks, available list of VMs…
457436

458437
## Troubleshooting
459438

460439
Asserters and debug sections with failures are effective troubleshooting
461440
methods to address failures and technical problems.
462441

463-
If you still have questions, please [contact
442+
If you still have questions, [contact
464443
support](https://portal.azure.com/?#blade/Microsoft_Azure_Support/HelpAndSupportBlade).
465444
For more information about Support plans, see [Azure Support
466445
plans](https://azure.microsoft.com/support/plans/response/)
