Commit a9b43fa

Merge pull request #296427 from pjw711/pjw/second-pure
Add support for multiple storage appliances
2 parents b9f6122 + 8486502 commit a9b43fa

16 files changed: 345 additions and 52 deletions

articles/operator-nexus/TOC.yml

Lines changed: 8 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -16,6 +16,8 @@
1616
items:
1717
- name: Storage overview
1818
href: concepts-storage.md
19+
- name: Multiple storage appliances
20+
href: concepts-storage-multiple-appliances.md
1921
- name: Storage for Nexus Kubernetes
2022
href: concepts-storage-kubernetes.md
2123
- name: Cluster deployment and upgrades
@@ -405,6 +407,12 @@
405407
items:
406408
- name: Due To Bare Metal Machine Power Failure
407409
href: troubleshoot-kubernetes-cluster-stuck-workloads-due-to-power-failure.md
410+
- name: Storage Appliance
411+
expanded: false
412+
items:
413+
- name: Troubleshoot Multiple Storage appliances
414+
href: troubleshoot-multiple-storage-appliances.md
415+
408416
- name: FAQ
409417
href: azure-operator-nexus-faq.md
410418
- name: Reference

articles/operator-nexus/azure-operator-nexus-faq.md

Lines changed: 27 additions & 27 deletions
@@ -16,9 +16,9 @@ The following sections cover some of the frequently asked questions for Azure Op
1616

1717
### What services does Azure Operator Nexus provide?
1818
Azure Operator Nexus is a managed hybrid cloud platform that supports carrier-grade network workloads. Here, the management plane lives in Azure and the control plane and user plane gets deployed on operators' premises or in Azure. It simplifies provisioning of new network services and optimizes deployment of network functions and applications on premises. The end customer can deploy containerized applications (on an on premises Nexus AKS cluster) or a virtualized workload to run these network functions. Customer gets out of the box integration with many Azure services such as Azure Monitor, Azure Container Registry, and Azure Kubernetes Services.
19-
19+
2020
### How do I interact with Operator Nexus instance?
21-
You can interact with Operator Nexus like any other Azure services using AZ CLI, API, ARM template, or portal. You can alternatively use BICEP templates.
21+
You can interact with Operator Nexus like any other Azure services using AZ CLI, API, ARM template, or portal. You can alternatively use BICEP templates.
2222

2323
### Does customer need to deploy any resources in their subscription to deploy Azure Operator Nexus instances?
2424
Yes, there are some resources that customer needs to create in the respective region under their Azure subscriptions. Some of these include creation of a pair of Network Fabric Controller and Cluster Manager resource, Log Analytics Workspace, a storage account. For more details, please refer to [Azure Operator Nexus documentation](howto-azure-operator-nexus-prerequisites.md).
@@ -33,32 +33,32 @@ Yes, to ensure carrier-grade performance and high degrees of automation, you nee
3333
Customers should design their services with Intra-rack redundancy, Inter-rack redundancy, and globally load balancing across multiple instances. Also, for high availability, plan to spread your instances across multiple Azure regions.
3434

3535
### How do updates work to on-premises and to Azure components?
36-
Upgrades to Operator Nexus are made in two phases - Management bundle upgrades and Runtime bundle upgrades. Management bundle upgrades deals with the upgrades of Controllers in Azure, Cluster Managers in customer subscription and on-premises instances. In on-premises instances, it includes the Kubernetes controllers responsible for maintaining the state of infra resources.
37-
36+
Upgrades to Operator Nexus are made in two phases - Management bundle upgrades and Runtime bundle upgrades. Management bundle upgrades deals with the upgrades of Controllers in Azure, Cluster Managers in customer subscription and on-premises instances. In on-premises instances, it includes the Kubernetes controllers responsible for maintaining the state of infra resources.
37+
3838
Updates of Management bundle may cause interruptions to provisioning activities but it doesn't impacts the customers running workloads. Customers don't control or drive these upgrades, but these upgrades are essential to provide customers with the options to update to new runtime-based upgrades within their on-premises instances.
39-
39+
4040
On the other hand, Runtime bundle upgrades deals with the components that require updates to the OS (Operating System) and/or workload supporting components. The update of the runtime bundle is entirely under the control of the customer and APIs can be used to perform these updates. You might observe some workload impacts during this upgrade.
4141

4242
### Is the storage appliance a must required device?
43-
For near-edge SKUs, Storage appliance is a part of the Hardware infrastructure that Operator needs to procure.
43+
For near-edge SKUs, Storage appliance is a part of the Hardware infrastructure that Operator needs to procure. You can optionally choose to deploy a second storage appliance in an Azure Operator Nexus instance. For more information, see [Multiple storage appliances](./concepts-storage-multiple-appliances.md).
4444

4545
### Does Operator Nexus provide best practices or blueprints for deploying network functions on Operator Nexus?
46-
Yes, Operator Nexus comes with Nexus Ready program. With this program, Microsoft is working with industry leading Network Function partners to validate that their network functions to ensure they can run on Nexus platform. We validate these network functions on regular intervals to ensure that they stay compliant with newer versions of Nexus. Operators can now get consistent and scalable deployment of multi-vendor network functions with the Nexus Ready program.
46+
Yes, Operator Nexus comes with Nexus Ready program. With this program, Microsoft is working with industry leading Network Function partners to validate that their network functions to ensure they can run on Nexus platform. We validate these network functions on regular intervals to ensure that they stay compliant with newer versions of Nexus. Operators can now get consistent and scalable deployment of multi-vendor network functions with the Nexus Ready program.
4747

4848
### What data stays on premises and what is available in Azure?
4949
From Infrastructure perspective, the data is managed via Azure APIs. The telemetry from these layers gets collected and is visible under customer subscription. Customers can use Log Analytics Workspace, storage accounts or other Analytics services in Azure to look into the telemetry from Infra layers.
50-
50+
5151
For tenant workloads, the images get stored in ACRs (Azure Container Registry) and once deployed. Microsoft provides an option to collect the telemetry from tenant workloads into Azure but Customers can choose alternative tooling they wish to collect telemetry data or to analyze it.
5252

5353
### If an Azure region doesn't exist in my country/region, can I still use Operator Nexus?
54-
Yes, all you need is ExpressRoute connectivity to an Azure region. ExpressRoute connectivity is available at many locations. For more information, see the [Geo-locations](../expressroute/expressroute-locations.md#locations) and [connectivity providers](../expressroute/expressroute-locations.md#partners).
54+
Yes, all you need is ExpressRoute connectivity to an Azure region. ExpressRoute connectivity is available at many locations. For more information, see the [Geo-locations](../expressroute/expressroute-locations.md#locations) and [connectivity providers](../expressroute/expressroute-locations.md#partners).
5555

5656
### Can I move my resources from one subscription to another?
5757
Currently, we don't support resource moves. If you need to move resources, you can consider deleting the existing controllers and using the ARM template to create another one in another location.
5858

59-
### How many instances can be associated to a cluster manager/fabric controller pair?
60-
The number of Azure Operator Nexus instances, a single pair of Network Fabric Controller and Cluster Manager can manage depends on multiple factors. It can be influenced by factors like size of Operator Nexus instances, ExpressRoute circuit bandwidth, number, and frequency of optional metrics collection, number of workloads running in Instance, destination for workload telemetry data collection and other factors.
61-
59+
### How many instances can be associated to a cluster manager/fabric controller pair?
60+
The number of Azure Operator Nexus instances, a single pair of Network Fabric Controller and Cluster Manager can manage depends on multiple factors. It can be influenced by factors like size of Operator Nexus instances, ExpressRoute circuit bandwidth, number, and frequency of optional metrics collection, number of workloads running in Instance, destination for workload telemetry data collection and other factors.
61+
6262
For more information, see [limits & quotas](reference-limits-and-quotas.md).
6363

6464
### Is it viable to redeploy a cluster that is currently running? If so, what safeguards are in place to prevent accidental deployments?
@@ -80,49 +80,49 @@ Yes, Azure Operator Nexus provides the ability to create customized VMs for host
8080

8181
### How do I update a bare metal server?
8282
To update a bare metal server within an Azure Operator Nexus instance, you can use the Azure APIs. Any update for bare metal server is part of runtime bundle upgrade. This upgrade requires the Nexus instance connectivity back to Azure.
83-
83+
8484
### What OS runs on the bare metal server? Can I bring my own?
8585
Bare metal servers are deployed with Microsoft Azure Linux (previously called CBL Mariner OS), which is thoroughly tested and is compatible with required Azure agents. There are no plans to support any other OS offering for Bare metal servers.
8686

8787
### How many servers does Operator Nexus support? How many racks?
88-
An Operator Nexus instance can have up to eight compute racks and each rack hosting upto 16 servers. These compute servers are used for running actual tenant workloads.
88+
An Operator Nexus instance can have up to eight compute racks and each rack hosting up to 16 servers. These compute servers are used for running actual tenant workloads.
8989

9090
## Networks
9191

9292
### Does Operator Nexus support IPv6?
93-
Yes, Azure Operator Nexus provides support for both IPV4 and IPV6 configuration across all layers of the stack.
94-
93+
Yes, Azure Operator Nexus provides support for both IPV4 and IPV6 configuration across all layers of the stack.
94+
9595
### What are the networking requirements for Azure Operator Nexus?
9696
Here are some of the network requirements for Azure Operator Nexus:
97-
* Customers need to work with Microsoft partners for setting up ExpressRoute connections,
97+
* Customers need to work with Microsoft partners for setting up ExpressRoute connections,
9898
* PE (Provider Edge) device supports 400G or 100G connections to CE (Customer Edge) device in Operator Nexus instance
9999
* PE must have routes to ExpressRoutes
100-
* IP address blocks defined for various services, VLANs for iDrac, PXE, Storage, OAM etc.
101-
100+
* IP address blocks defined for various services, VLANs for iDrac, PXE, Storage, OAM etc.
101+
102102
For more information, see [Network fabric controller](howto-configure-network-fabric-controller.md) and [Network fabric](howto-configure-network-fabric.md).
103-
103+
104104
### What is an isolation domain?
105105
Isolation domains enable Layer 2 or Layer 3 connectivity between workloads hosted across the Azure Operator Nexus instance and external networks. These constructs segment a network into authentication domains and enforces communication within required boundaries.
106-
106+
107107
### Does Operator Nexus support a single ToR (Top of Rack) device?
108-
For near-edge, the fabric is designed based on high availability model and the reason you can't have just one ToR switch.
108+
For near-edge, the fabric is designed based on high availability model and the reason you can't have just one ToR switch.
109109

110110
### Is Network packet Broker (NPB) a hard requirement?
111-
For near-edge SKUs, NPBs will be part of the BOM.
111+
For near-edge SKUs, NPBs will be part of the BOM.
112112

113113
### How do I configure the load balancing service in a Nexus Kubernetes cluster?
114114
The load balancer allows external services to access the services running within the cluster. You can refer to the [load balancer](./howto-kubernetes-service-load-balancer.md) article for more information.
115-
115+
116116
## Tenant workloads
117117

118-
### Can I bring my own K8S cluster?
118+
### Can I bring my own K8S cluster?
119119
Nexus platform offers customers an option to either create Nexus AKS clusters with multiple Kubernetes versions or Virtual Machines for running their workloads.
120120

121121
### What VNFs (Virtualized Network Functions) and CNFs (Containerized Network Functions) are certified on the platform?
122122
Validation of VNF and CNF functions is an ongoing activity. We'll publish the list of certified VNFs and CNFs soon.
123123

124124
### If a VNF or CNF isn't certified, can I still use it?
125-
Indeed, you can collaborate with the Nexus team to ensure there are no limitations that would prevent you from deploying these workloads.
125+
Indeed, you can collaborate with the Nexus team to ensure there are no limitations that would prevent you from deploying these workloads.
126126

127127
### How do I deploy a VM/Container?
128128
Customers can use APIs to deploy VM and Nexus AKS clusters in an Operator Nexus instance. For a richer experience, Microsoft offers another service AOSM (Azure Operator Service Manager) which allows you to automate deployment of your Containerized (CNF) and Virtualized (VNF) Network Functions.
@@ -131,7 +131,7 @@ Customers can use APIs to deploy VM and Nexus AKS clusters in an Operator Nexus
131131
Nexus AKS clusters have a support for Read Write Many (RWX) & Read Write Once (RWO) storage classes.
132132

133133
### How can Operator Nexus VMs be made highly available?
134-
The Operators can choose to deploy VMs across multiple bare metal machines and across racks to achieve high availability.
134+
The Operators can choose to deploy VMs across multiple bare metal machines and across racks to achieve high availability.
135135

136136
## Observability
137137

articles/operator-nexus/concepts-cluster-deployment-overview.md

Lines changed: 1 addition & 1 deletion
@@ -37,7 +37,7 @@ The ephemeral bootstrap node sequentially provisions each KCP node, and if a KCP
3737

3838
After successful provisioning of KCP nodes, the deployment action proceeds to provision NMP nodes in parallel. If an NMP node fails to provision, the cluster deployment action fails, resulting in the cluster status being marked as failed.
3939

40-
Upon successful provisioning of NMP nodes, a storage appliance is created before the deployment action proceeds with provisioning the compute nodes. Compute nodes are provisioned in parallel, and once the defined compute node threshold is met, the cluster status transitions from Deploying to Running. However, the remaining nodes continue undergoing the provisioning process until they too are successfully provisioned.
40+
Upon successful provisioning of NMP nodes, up to two storage appliances are created before the deployment action proceeds with provisioning the compute nodes. Compute nodes are provisioned in parallel, and once the defined compute node threshold is met, the cluster status transitions from Deploying to Running. However, the remaining nodes continue undergoing the provisioning process until they too are successfully provisioned.
4141

4242

4343
## Cluster operations

articles/operator-nexus/concepts-nexus-availability.md

Lines changed: 1 addition & 1 deletion
@@ -59,7 +59,7 @@ Go through the following steps to help plan an Operator Nexus deployment.
5959

6060
6. Operator Nexus supports between 1 and 8 racks per site inclusive, with each rack containing 4, 8, 12 or 16 servers. All racks must be identical in terms of number of servers. See [here](./reference-near-edge-compute.md) for specifics of the resource available for workloads. See the following diagram, and also [this article](./reference-limits-and-quotas.md) for other limits and quotas that might have an impact.
6161

62-
7. Operator Nexus supports one or two storage appliances. Currently, these arrays are available to workload NFs running as Kubernetes nodes. Workloads running as VMs use local storage from the server they're instantiated on.
62+
7. Operator Nexus supports one or two Pure storage appliances. Storage appliances typically reach high levels of availability and data durability; precise numbers vary per vendor and per product. Azure Operator Nexus has redundant connectivity to the storage appliances on the management and data networks. Storage appliances provide storage to workload NFs running on Kubernetes. There is no support for automatic or manual migration of volumes between storage appliances, and both storage appliances are located in the aggregator rack. This means, from a storage appliance perspective, Nexus deployments with two storage appliances have the same availability characteristics as deployments with one storage appliance: a volume is only ever on one storage appliance.
6363

6464
8. Other factors to consider are the number of available physical sites, and any per-site limitations such as bandwidth or power.
6565

articles/operator-nexus/concepts-storage-kubernetes.md

Lines changed: 3 additions & 1 deletion
@@ -11,7 +11,7 @@ ms.custom: template-concept
1111

1212
# Azure Operator Nexus storage for Kubernetes
1313

14-
Each Azure Operator Nexus provides two types of persistent storage to Nexus Kubernetes cluster tenant workloads: *nexus-volume* and *nexus-shared*. Operators select the type of storage they need by creating Persistent Volume Claims (PVCs) using the *nexus-volume* or *nexus-shared* storage class. All data stored in persistent volumes is stored on a storage appliance deployed on-premises as part of the Azure Operator Nexus instance.
14+
Each Azure Operator Nexus provides two types of persistent storage to Nexus Kubernetes cluster tenant workloads: *nexus-volume* and *nexus-shared*. Operators select the type of storage they need by creating Persistent Volume Claims (PVCs) using the *nexus-volume* or *nexus-shared* storage class. All data stored in persistent volumes is stored on a storage appliance deployed on-premises as part of the Azure Operator Nexus instance. Azure Operator Nexus requires one storage appliance and supports up to two storage appliances per Azure Operator Nexus instance.
1515

1616
> [!IMPORTANT]
1717
> Azure Operator Nexus doesn't support ephemeral volumes. Nexus recommends that the persistent volume storage mechanisms described in this document are used for all Nexus Kubernetes cluster workload volumes as these mechanisms provide the highest levels of performance and availability. All storage in Azure Operator Nexus is provided by the storage appliance. There's no support for storage provided by baremetal machine disks.
@@ -45,6 +45,8 @@ status:
4545
phase: Bound
4646
```
4747
48+
Some Azure Operator Nexus deployments may have two storage appliances installed. Persistent volume claims (PVCs) using the *nexus-volume* storage class can place the associated persistent volumes onto a specific storage appliance by using the *storageApplianceName* annotation. More information is available in [this document](./concepts-storage-multiple-appliances.md).
49+
4850
### StorageClass: nexus-shared
4951
5052
In situations where a shared file system is required, the *nexus-shared* storage class is available. This storage class provides a highly available shared storage solution by enabling multiple pods in the same Nexus Kubernetes cluster to concurrently access and share the same volume. The *nexus-shared* storage class is backed by a highly available NFS storage service. This NFS storage service (storage pool currently limited to a maximum size of 1 TiB) is available per Cloud Service Network (CSN). The NFS storage service is deployed automatically on creation of a CSN resource. Any Nexus Kubernetes cluster attached to the CSN can provision persistent volumes from this shared storage pool. Nexus-shared supports both Read Write Once (RWO) and Read Write Many (RWX) access modes. What that means is that the workload applications can make use of either of these access modes to access the shared storage.
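
As a rough illustration of the *storageApplianceName* annotation described in the concepts-storage-kubernetes.md change above, a PVC that pins a *nexus-volume* persistent volume to one particular storage appliance might look like the sketch below. The annotation key is taken literally from the text. Its placement under `metadata.annotations`, the claim name, the appliance name, the access mode, and the requested size are all assumptions made for illustration, so the linked multiple storage appliances article remains the authoritative source for the exact form.

```yaml
# Hypothetical PVC sketch; names and sizes are placeholders, not values from the commit.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-pvc                 # hypothetical claim name
  namespace: default
  annotations:
    # Assumed placement of the annotation named in the text.
    storageApplianceName: example-storage-appliance   # hypothetical appliance name
spec:
  storageClassName: nexus-volume    # storage class named in the source
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```

Because the changed availability text states that volumes cannot be migrated between storage appliances, a choice like this would presumably need to be made when the claim is first created.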
