The AOSM NF extension relies on a mutating webhook and an edge registry to support key features:

* Onboarding Helm charts without requiring customization of image paths.
* A local cluster registry to accelerate pod operations and enable disconnected-mode support.

These essential components need to be highly available and resilient.

### Summary of changes for HA

With HA, the cluster registry and webhook pods now run as a replicaset with a minimum of three replicas and a maximum of five. The key replicaset configuration is as follows:

* A gradual rollout upgrade strategy is used.
* PodDisruptionBudgets (PDB) are used for availability during voluntary disruptions.
* Pod anti-affinity is used to spread pods evenly across nodes.
* Pods scale horizontally under CPU and memory load.

#### Replicas

* A cluster running multiple copies, or replicas, of an application provides the first level of redundancy. Both the cluster registry and webhook are defined as 'kind: Deployment' with a minimum of three replicas.

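For illustration, here's a minimal sketch of a three-replica Deployment; the name, labels, and image are hypothetical stand-ins, not the actual AOSM manifests.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-registry        # hypothetical name for illustration
spec:
  replicas: 3                   # minimum of three copies for redundancy
  selector:
    matchLabels:
      app: cluster-registry
  template:
    metadata:
      labels:
        app: cluster-registry
    spec:
      containers:
        - name: registry
          image: registry:2     # placeholder image
```
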
#### DeploymentStrategy

* A rollingUpdate strategy is used to help achieve zero-downtime upgrades and support gradual rollout of applications. The default maxUnavailable configuration allows only one pod to be taken down at a time, until enough pods are created to satisfy the redundancy policy.

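As a sketch, the corresponding strategy stanza inside a Deployment spec might look like the following; the maxSurge value is an assumption, not confirmed by the AOSM configuration.

```yaml
# Fragment of a Deployment spec; values are illustrative.
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxUnavailable: 1   # only one pod may be taken down at a time
    maxSurge: 1         # assumed: one extra pod may be created during rollout
```
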
#### Pod Disruption Budget

* A pod disruption budget (PDB) protects pods from voluntary disruption and is deployed alongside Deployment, ReplicaSet, or StatefulSet objects. For AOSM operator pods, a PDB with a minAvailable parameter of 2 is used.

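A minimal sketch of such a PDB follows; the name and label selector are hypothetical.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: cluster-registry-pdb     # hypothetical name
spec:
  minAvailable: 2                # keep at least two pods up during voluntary disruptions
  selector:
    matchLabels:
      app: cluster-registry      # hypothetical label
```
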
#### Pod anti-affinity

* Pod anti-affinity controls the distribution of application pods across multiple nodes in your cluster. With HA, AOSM pod anti-affinity uses the following parameters, illustrated in the sketch after this list:
* A scheduling mode defines how strictly the rule is enforced.
  * requiredDuringSchedulingIgnoredDuringExecution (hard): Pods must be scheduled in a way that satisfies the defined rule. If no topologies that meet the rule's requirements are available, the pod is not scheduled.
  * preferredDuringSchedulingIgnoredDuringExecution (soft): This rule type expresses a preference for scheduling pods but doesn't enforce a strict requirement. If topologies that meet the preference criteria are available, Kubernetes schedules the pod there. If no such topologies are available, the pod can still be scheduled on other nodes that don't meet the preference.
* A label selector targets the specific pods to which the affinity is applied.
* A topology key defines the domain, such as a node hostname or zone, across which the rule applies.
* Nexus node placement is spread evenly across zones by design, so spreading the pods across nodes also gives zonal redundancy.
* AOSM operator pods use a soft anti-affinity with a weight of 100 and a topology key based on node hostnames.

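Putting these parameters together, the soft rule with weight 100 and a hostname topology key can be sketched as follows; the label selector is hypothetical.

```yaml
# Fragment of a pod spec; the app label is hypothetical.
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: cluster-registry
          topologyKey: kubernetes.io/hostname   # spread across node hostnames
```
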
#### Storage

* Because the AOSM edge registry has multiple replicas spread across nodes, the persistent volume must support the ReadWriteMany (RWX) access mode. The “nexus-shared” volume is available on Nexus clusters and supports the RWX access mode.

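For example, a claim against the nexus-shared storage class with RWX access might be sketched as follows; the name and requested size are illustrative.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: registry-storage         # hypothetical name
spec:
  accessModes:
    - ReadWriteMany              # RWX: shared by replicas on different nodes
  storageClassName: nexus-shared
  resources:
    requests:
      storage: 100Gi             # illustrative size
```
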
#### Monitoring via Readiness Probes

* AOSM uses HTTP readiness probes to know when a container is ready to start accepting traffic. A pod is considered ready when all of its containers are ready. When a pod isn't ready, it's removed from the service load balancers.

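A container-level HTTP readiness probe can be sketched like this; the path, port, and timings are hypothetical, not the actual AOSM settings.

```yaml
# Fragment of a container spec; endpoint and timings are hypothetical.
readinessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 10
```
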
#### System node pool

* All AOSM operator pods are assigned to the system node pool. This assignment prevents misconfigured or rogue application pods from impacting system pods.

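One common way to express such an assignment is a node selector in the pod spec; the label below is an assumption standing in for whatever label identifies the system agent pool on your cluster.

```yaml
# Fragment of a pod spec; the node label is an assumption, not a confirmed AOSM value.
nodeSelector:
  kubernetes.azure.com/mode: system
```
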
#### Horizontal scaling

* In Kubernetes, a HorizontalPodAutoscaler (HPA) automatically updates a workload resource to scale it to match demand. AOSM operator pods have the following HPA policy parameters configured, shown in the sketch after this list:
* A minimum of three replicas.
* A maximum of five replicas.
* A targetAverageUtilization of 80% for CPU and memory.

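These parameters map onto an autoscaling/v2 HorizontalPodAutoscaler roughly as follows (in the v2 API, targetAverageUtilization is expressed as target.averageUtilization); the name and scale target are hypothetical.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: cluster-registry-hpa     # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: cluster-registry       # hypothetical target
  minReplicas: 3
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
```
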
#### Resource limits

* Resource limits are used to prevent resource overload on the nodes where AOSM pods are running. AOSM uses two resource parameters to limit both CPU and memory consumption.
* **Resource request** - The minimum amount that should be reserved for a pod. Set this value to your application's resource usage under normal load.
* **Resource limit** - The maximum amount that a pod should ever use. If usage reaches the limit, the container is throttled (CPU) or terminated (memory).

All AOSM operator containers are configured with appropriate requests and limits for CPU and memory.

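As a sketch, a container's requests and limits stanza looks like the following; the quantities are illustrative, not the actual AOSM values.

```yaml
# Fragment of a container spec; quantities are illustrative.
resources:
  requests:
    cpu: 250m          # reserved for the pod under normal load
    memory: 256Mi
  limits:
    cpu: "1"           # CPU usage beyond the limit is throttled
    memory: 512Mi      # exceeding the memory limit terminates the container
```
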
#### Known HA Limitations

* Nexus AKS (NAKS) clusters with a single active node in the system agent pool are not suitable for high availability. A Nexus production topology must use at least three active nodes in the system agent pool.
* The nexus-shared storage class is a network file system (NFS) storage service. This NFS storage service is available per Cloud Service Network (CSN). Any Nexus Kubernetes cluster attached to the CSN can provision a persistent volume from this shared storage pool. The storage pool is currently limited to a maximum size of 1 TiB as of Network Cloud (NC) 3.10, whereas NC 3.12 has a 16-TiB option.
* Pod anti-affinity only deals with the initial placement of pods; subsequent pod scaling and repair follow standard Kubernetes scheduling logic.

## Frequently Asked Questions

* Can I use AOSM cluster registry with a CNF application previously deployed?