Improve resiliency for cloud native network functions with Azure Operator Service Manager.
* First version, with HA for the NF Kubernetes extension: 2.0.2810-144
## Introduction
Azure Operator Service Manager (AOSM) cluster registry (CR) enables a local copy of container images in the Nexus K8s cluster. When a containerized network function (CNF) is installed with cluster registry enabled, the container images are pulled from the remote AOSM artifact store and saved to this local cluster registry. Using a mutating webhook, cluster registry automatically intercepts image requests and substitutes the local registry path, so no publisher packaging changes are required. With cluster registry, CNF access to container images survives loss of connectivity to the remote artifact store.
### Key use cases and benefits
Cloud native network functions (CNFs) need access to container images, not only during the initial deployment from the AOSM artifact store, but also to keep the network function operational. Some of these scenarios include:
* Pod restarts: Stopping and starting a pod can result in a cluster node pulling container images from the registry.
* Kubernetes scheduler operations: During pod-to-node assignments, according to scheduler profile rules, if the new node doesn't have the container images locally cached, the node pulls container images from the registry.
Benefits of using AOSM cluster registry:
* Provides the necessary local images to prevent CNF disruption when connectivity to the AOSM artifact store is lost.
* Decreases the number of image pulls on AOSM artifact store, since each cluster node now pulls images only from the local registry.
* Overcomes issues with malformed registry URLs, by using a mutating webhook to substitute the proper local registry URL path.
## How cluster registry works
AOSM cluster registry is enabled using the Network Function Operator (NFO) Arc K8s extension. The following CLI shows how cluster registry is enabled on a Nexus K8s cluster.
```bash
az k8s-extension create --cluster-name
--cluster-type {connectedClusters}
```
> [!NOTE]
> If the user doesn't provide any input, a default persistent volume of 100 GB is used.
### Cluster registry components
The cluster registry feature deploys helper pods on the target edge cluster to assist the NFO extension.
#### Component reconciler
* This main pod reconciles component Custom Resource Objects (CROs) created by K8sBridge, with the help of the Microsoft.Kubernetes resource provider (RP), Hybrid Relay, and the Arc agent running on the cluster.
#### Pod mutating webhook
* These pods implement Kubernetes mutating admission webhooks, serving an instance of the mutate API. The mutate API does two things:
* It modifies the image registry path to the local registry IP, substituting out the AOSM artifact store Azure container registry (ACR).
* It creates an Artifact CR on the edge cluster.
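As an illustrative sketch of the mutation's effect (the registry address, image name, and tag here are hypothetical, not values used by the extension), the webhook rewrites a pod's container image reference from the artifact store ACR to the local cluster registry:

```yaml
# Before mutation: image points at the AOSM artifact store ACR (hypothetical names)
containers:
  - name: example-cnf
    image: myartifactstore.azurecr.io/example-cnf:1.0.0

# After mutation: the webhook substitutes the local cluster registry path
containers:
  - name: example-cnf
    image: 10.0.0.5:5000/example-cnf:1.0.0
```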
#### Artifact reconciler
* This pod reconciles Artifact CROs created by the mutating webhook.
#### Registry
* This pod stores and retrieves container images for the CNF.
## High availability and resiliency considerations
The AOSM NF extension relies on a mutating webhook and an edge registry to support key features:
* Onboarding Helm charts without requiring customization of image paths.
With HA, cluster registry and webhook pods now support a replicaset with a minimum of three replicas.
#### DeploymentStrategy
* A rollingUpdate strategy is used to help achieve zero-downtime upgrades and support gradual rollout of applications. The default maxUnavailable configuration allows only one pod to be taken down at a time, until enough pods are created to satisfy the redundancy policy.
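This behavior corresponds to a Kubernetes Deployment strategy along these lines (a sketch mirroring the defaults described above, not the extension's actual manifest; the maxSurge value is an assumption):

```yaml
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxUnavailable: 1   # only one pod taken down at a time
    maxSurge: 1         # assumption: one extra pod allowed during rollout
```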
#### Pod Disruption Budget
* A pod disruption budget (PDB) protects pods from voluntary disruption and is deployed alongside Deployment, ReplicaSet, or StatefulSet objects. For AOSM operator pods, a PDB with a minAvailable parameter of 2 is used.
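A PDB with minAvailable of 2, as described above, would look roughly like this (a sketch; the object name and label selector are hypothetical):

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: aosm-operator-pdb    # hypothetical name
spec:
  minAvailable: 2            # at least two operator pods must stay available
  selector:
    matchLabels:
      app: aosm-operator     # hypothetical label
```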
#### Pod anti-affinity
* Pod anti-affinity controls distribution of application pods across multiple nodes in your cluster. With HA, AOSM pod anti-affinity uses the following parameters:
* A scheduling mode is used to define how strictly the rule is enforced.
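These parameters map onto a Kubernetes podAntiAffinity stanza roughly like the following (a sketch; the label, weight, and "preferred" mode are illustrative assumptions, not the extension's actual manifest):

```yaml
affinity:
  podAntiAffinity:
    # "preferred" enforces the rule softly; "required" would enforce it strictly
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: aosm-operator              # hypothetical label
          topologyKey: kubernetes.io/hostname  # spread pods across nodes
```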
#### Horizontal scaling
* In Kubernetes, a HorizontalPodAutoscaler (HPA) automatically updates a workload resource, scaling it to match demand. AOSM operator pods have the following HPA policy parameters configured:
* A minimum replica count of three.
* A maximum replica count of five.
* A targetAverageUtilization for CPU and memory of 80%.
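The HPA parameters above could be expressed as a manifest along these lines (a sketch; the object and target names are hypothetical):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: aosm-operator-hpa    # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: aosm-operator      # hypothetical target workload
  minReplicas: 3
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
```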
#### Resource limits
All AOSM operator containers are configured with appropriate request and limit values for CPU and memory.
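A container resource stanza of this shape would apply (a sketch; these values are illustrative assumptions, not the extension's actual settings):

```yaml
resources:
  requests:
    cpu: "250m"       # hypothetical value
    memory: "256Mi"   # hypothetical value
  limits:
    cpu: "500m"       # hypothetical value
    memory: "512Mi"   # hypothetical value
```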
#### Known HA Limitations
* Nexus AKS (NAKS) clusters with a single active node in the system agent pool are not suitable for high availability. A Nexus production topology must use at least three active nodes in the system agent pool.
* The nexus-shared storage class is a network file system (NFS) storage service. This NFS storage service is available per Cloud Service Network (CSN). Any Nexus Kubernetes cluster attached to the CSN can provision a persistent volume from this shared storage pool. The storage pool is currently limited to a maximum size of 1 TiB as of Network Cloud (NC) 3.10, whereas NC 3.12 has a 16-TiB option.
* Pod anti-affinity only deals with the initial placement of pods; subsequent pod scaling and repair follow standard K8s scheduling logic.
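The nexus-shared storage class mentioned above could be consumed with a persistent volume claim along these lines (a sketch; the claim name is hypothetical and the 100Gi size mirrors the default persistent volume noted earlier):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cluster-registry-pvc     # hypothetical name
spec:
  storageClassName: nexus-shared
  accessModes:
    - ReadWriteMany              # NFS-backed class supports shared access
  resources:
    requests:
      storage: 100Gi             # mirrors the 100 GB default persistent volume
```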