Improve resiliency for cloud native network functions with Azure Operator Service Manager.
* First version, with HA for the NF Kubernetes extension: 2.0.2810-144
## Introduction
Azure Operator Service Manager (AOSM) cluster registry (CR) enables a local copy of container images in the Nexus K8s cluster. When a containerized network function (CNF) is installed with cluster registry enabled, the container images are pulled from the remote AOSM artifact store and saved to this local cluster registry. Using a mutating webhook, cluster registry automatically intercepts image requests and substitutes the local registry path, so no publisher packaging changes are required. With cluster registry, CNF access to container images survives loss of connectivity to the remote artifact store.
### Key use cases and benefits
Cloud native network functions (CNFs) need access to container images, not only during the initial deployment from the AOSM artifact store, but also to keep the network function operational. Some of these scenarios include:
* Pod restarts: Stopping and starting a pod can result in a cluster node pulling container images from the registry.
* Kubernetes scheduler operations: During pod-to-node assignments, according to scheduler profile rules, if the new node doesn't have the container images locally cached, the node pulls container images from the registry.
Benefits of using AOSM cluster registry:
* Provides the necessary local images to prevent CNF disruption when connectivity to the AOSM artifact store is lost.
* Decreases the number of image pulls on AOSM artifact store, since each cluster node now pulls images only from the local registry.
* Overcomes issues with malformed registry URLs, by using a mutating webhook to substitute the proper local registry URL path.
## How cluster registry works
AOSM cluster registry is enabled using the Network Function Operator (NFO) Arc K8s extension. The following CLI shows how cluster registry is enabled on a Nexus K8s cluster.
```bash
az k8s-extension create --cluster-name
--cluster-type {connectedClusters}
```
> [!NOTE]
> If the user doesn't provide any input, a default persistent volume of 100 GB is used.
### Cluster registry components
The cluster registry feature deploys helper pods on the target edge cluster to assist the NFO extension.
#### Component reconciler
* This main pod reconciles component Custom Resource Objects (CROs) created by K8sBridge, with the help of the Microsoft.Kubernetes resource provider (RP), Hybrid Relay, and the Arc agent running on the cluster.
#### Pod mutating webhook
* These pods implement Kubernetes mutating admission webhooks, serving an instance of the mutate API. The mutate API does two things:
* It modifies the image registry path to the local registry IP, substituting out the AOSM artifact store Azure container registry (ACR).
* It creates an Artifact CR on the edge cluster.
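As an illustrative sketch of the mutation's effect (the registry address, image name, and tag here are hypothetical, not values used by the extension), the webhook rewrites a pod's container image reference from the artifact store ACR to the local cluster registry:

```yaml
# Before mutation: image points at the AOSM artifact store ACR (hypothetical names)
containers:
  - name: example-cnf
    image: myartifactstore.azurecr.io/example-cnf:1.0.0

# After mutation: the webhook substitutes the local cluster registry path
containers:
  - name: example-cnf
    image: 10.0.0.5:5000/example-cnf:1.0.0
```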
#### Artifact reconciler
* This pod reconciles Artifact CROs created by the mutating webhook.
#### Registry
* This pod stores and retrieves container images for the CNF.
## High availability and resiliency considerations
The AOSM NF extension relies on a mutating webhook and an edge registry to support key features:
* Onboarding Helm charts without requiring customization of image paths.
With HA, cluster registry and webhook pods now support a replicaset with a minimum of three replicas.
#### DeploymentStrategy
* A rollingUpdate strategy is used to help achieve zero-downtime upgrades and support gradual rollout of applications. The default maxUnavailable configuration allows only one pod to be taken down at a time, until enough pods are created to satisfy the redundancy policy.
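This behavior corresponds to a Kubernetes Deployment strategy along these lines (a sketch mirroring the defaults described above, not the extension's actual manifest; the maxSurge value is an assumption):

```yaml
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxUnavailable: 1   # only one pod taken down at a time
    maxSurge: 1         # assumption: one extra pod allowed during rollout
```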
#### Pod Disruption Budget
* A pod disruption budget (PDB) protects pods from voluntary disruption and is deployed alongside Deployment, ReplicaSet, or StatefulSet objects. For AOSM operator pods, a PDB with a minAvailable parameter of 2 is used.
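A PDB with minAvailable of 2, as described above, would look roughly like this (a sketch; the object name and label selector are hypothetical):

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: aosm-operator-pdb    # hypothetical name
spec:
  minAvailable: 2            # at least two operator pods must stay available
  selector:
    matchLabels:
      app: aosm-operator     # hypothetical label
```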
#### Pod anti-affinity
* Pod anti-affinity controls distribution of application pods across multiple nodes in your cluster. With HA, AOSM pod anti-affinity uses the following parameters:
* A scheduling mode is used to define how strictly the rule is enforced.
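These parameters map onto a Kubernetes podAntiAffinity stanza roughly like the following (a sketch; the label, weight, and "preferred" mode are illustrative assumptions, not the extension's actual manifest):

```yaml
affinity:
  podAntiAffinity:
    # "preferred" enforces the rule softly; "required" would enforce it strictly
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: aosm-operator              # hypothetical label
          topologyKey: kubernetes.io/hostname  # spread pods across nodes
```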
#### Horizontal scaling
* In Kubernetes, a HorizontalPodAutoscaler (HPA) automatically updates a workload resource, scaling it to match demand. AOSM operator pods have the following HPA policy parameters configured:
* A minimum replica count of three.
* A maximum replica count of five.
* A targetAverageUtilization for CPU and memory of 80%.
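The HPA parameters above could be expressed as a manifest along these lines (a sketch; the object and target names are hypothetical):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: aosm-operator-hpa    # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: aosm-operator      # hypothetical target workload
  minReplicas: 3
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
```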
#### Resource limits
All AOSM operator containers are configured with appropriate request and limit values for CPU and memory.
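A container resource stanza of this shape would apply (a sketch; these values are illustrative assumptions, not the extension's actual settings):

```yaml
resources:
  requests:
    cpu: "250m"       # hypothetical value
    memory: "256Mi"   # hypothetical value
  limits:
    cpu: "500m"       # hypothetical value
    memory: "512Mi"   # hypothetical value
```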
#### Known HA Limitations
* Nexus AKS (NAKS) clusters with a single active node in the system agent pool are not suitable for high availability. A Nexus production topology must use at least three active nodes in the system agent pool.
* The nexus-shared storage class is a network file system (NFS) storage service. This NFS storage service is available per Cloud Service Network (CSN). Any Nexus Kubernetes cluster attached to the CSN can provision a persistent volume from this shared storage pool. The storage pool is currently limited to a maximum size of 1 TiB as of Network Cloud (NC) 3.10, whereas NC 3.12 has a 16-TiB option.
* Pod anti-affinity only deals with the initial placement of pods; subsequent pod scaling and repair follow standard K8s scheduling logic.
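The nexus-shared storage class mentioned above could be consumed with a persistent volume claim along these lines (a sketch; the claim name is hypothetical and the 100Gi size mirrors the default persistent volume noted earlier):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cluster-registry-pvc     # hypothetical name
spec:
  storageClassName: nexus-shared
  accessModes:
    - ReadWriteMany              # NFS-backed class supports shared access
  resources:
    requests:
      storage: 100Gi             # mirrors the 100 GB default persistent volume
```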