Commit 1748bef

Introduce Distributed Zone with 3rd Party Storage DT
This DT is the same as bgp-l3-xl but it also has three zones configured with topology CRDs used to either spread pods across zones or keep them within a zone. There is a separate cinder-volume and manila-share service per zone. It uses a NetApp as an iSCSI backend for Cinder and an NFS backend for Manila. Glance uses Cinder as its backend and is configured with multiple stores.

Signed-off-by: John Fulton <[email protected]>
Co-authored-by: Claude (AI Assistant) [email protected]
1 parent ef5d477 commit 1748bef

53 files changed, +6690 -0 lines

automation/mocks/dz-storage.yaml

Lines changed: 1 addition & 0 deletions

bgp-l3-xl.yaml

automation/vars/dz-storage.yaml

Lines changed: 223 additions & 0 deletions

---
vas:
  dz-storage:
    stages:
      - pre_stage_run:
          - name: Apply taint on worker-9
            type: cr
            definition:
              spec:
                taints:
                  - effect: NoSchedule
                    key: testOperator
                    value: 'true'
                  - effect: NoExecute
                    key: testOperator
                    value: 'true'
            kind: Node
            resource_name: worker-9
            state: patched
          - name: Disable rp_filters on OCP nodes
            type: cr
            definition:
              spec:
                profile:
                  - data: |
                      [main]
                      summary=Optimize systems running OpenShift (provider specific parent profile)
                      include=-provider-${f:exec:cat:/var/lib/ocp-tuned/provider},openshift

                      [sysctl]
                      net.ipv4.conf.enp8s0.rp_filter=0
                      net.ipv4.conf.enp9s0.rp_filter=0
                    name: openshift-no-reapply-sysctl
                recommend:
                  - match:
                      # applied to all nodes except worker-9, because worker-9 has no enp8s0
                      - label: kubernetes.io/hostname
                        value: worker-0
                      - label: kubernetes.io/hostname
                        value: worker-1
                      - label: kubernetes.io/hostname
                        value: worker-2
                      - label: kubernetes.io/hostname
                        value: worker-3
                      - label: kubernetes.io/hostname
                        value: worker-4
                      - label: kubernetes.io/hostname
                        value: worker-5
                      - label: kubernetes.io/hostname
                        value: worker-6
                      - label: kubernetes.io/hostname
                        value: worker-7
                      - label: kubernetes.io/hostname
                        value: worker-8
                      - label: node-role.kubernetes.io/master
                    operand:
                      tunedConfig:
                        reapply_sysctl: false
                    priority: 15
                    profile: openshift-no-reapply-sysctl
            api_version: tuned.openshift.io/v1
            kind: Tuned
            resource_name: openshift-no-reapply-sysctl
            namespace: openshift-cluster-node-tuning-operator
            state: present
        name: nncp-configuration
        path: examples/dt/dz-storage/control-plane/networking/nncp
        wait_conditions:
          - >-
            oc -n openstack wait nncp
            -l osp/nncm-config-type=standard
            --for jsonpath='{.status.conditions[0].reason}'=SuccessfullyConfigured
            --timeout=600s
        values:
          - name: network-values
            src_file: values.yaml
        build_output: nncp.yaml

      - name: networking
        path: examples/dt/dz-storage/control-plane/networking
        wait_conditions:
          - >-
            oc -n metallb-system wait pod
            -l app=metallb -l component=speaker
            --for condition=Ready
        values:
          - name: network-values
            src_file: nncp/values.yaml
        build_output: networking.yaml

      - path: examples/dt/dz-storage/topology
        wait_conditions:
          - >-
            oc -n openstack wait
            --for=jsonpath='{.metadata.name}'=azone-node-affinity
            topology/azone-node-affinity --timeout=60s
        values:
          - name: node-zone-labels
            src_file: node-zone-labels.yaml
        build_output: topology.yaml

      # allow 60m (not 30m) for larger control plane on more nodes
      - name: control-plane
        path: examples/dt/dz-storage/control-plane
        wait_conditions:
          - >-
            oc -n openstack wait openstackcontrolplane
            controlplane
            --for condition=Ready
            --timeout=60m
        values:
          - name: network-values
            src_file: networking/nncp/values.yaml
          - name: service-values
            src_file: service-values.yaml
        build_output: control-plane.yaml
        post_stage_run:
          - name: Create BGPConfiguration after controlplane is deployed
            type: cr
            definition:
              spec: {}
            api_version: network.openstack.org/v1beta1
            kind: BGPConfiguration
            resource_name: bgpconfiguration
            namespace: openstack
            state: present

      - name: edpm-computes-r0-nodeset
        path: examples/dt/dz-storage/edpm/computes/r0
        wait_conditions:
          - >-
            oc -n openstack wait openstackdataplanenodeset
            r0-compute-nodes
            --for condition=SetupReady
            --timeout=600s
        values:
          - name: edpm-r0-compute-nodeset-values
            src_file: values.yaml
        build_output: edpm-r0-compute-nodeset.yaml

      - name: edpm-computes-r1-nodeset
        path: examples/dt/dz-storage/edpm/computes/r1
        wait_conditions:
          - >-
            oc -n openstack wait openstackdataplanenodeset
            r1-compute-nodes
            --for condition=SetupReady
            --timeout=600s
        values:
          - name: edpm-r1-compute-nodeset-values
            src_file: values.yaml
        build_output: edpm-r1-compute-nodeset.yaml

      - name: edpm-computes-r2-nodeset
        path: examples/dt/dz-storage/edpm/computes/r2
        wait_conditions:
          - >-
            oc -n openstack wait openstackdataplanenodeset
            r2-compute-nodes
            --for condition=SetupReady
            --timeout=600s
        values:
          - name: edpm-r2-compute-nodeset-values
            src_file: values.yaml
        build_output: edpm-r2-compute-nodeset.yaml

      - name: edpm-networkers-r0-nodeset
        path: examples/dt/dz-storage/edpm/networkers/r0
        wait_conditions:
          - >-
            oc -n openstack wait openstackdataplanenodeset
            r0-networker-nodes
            --for condition=SetupReady
            --timeout=600s
        values:
          - name: edpm-r0-networker-nodeset-values
            src_file: values.yaml
        build_output: edpm-r0-networker-nodeset.yaml

      - name: edpm-networkers-r1-nodeset
        path: examples/dt/dz-storage/edpm/networkers/r1
        wait_conditions:
          - >-
            oc -n openstack wait openstackdataplanenodeset
            r1-networker-nodes
            --for condition=SetupReady
            --timeout=600s
        values:
          - name: edpm-r1-networker-nodeset-values
            src_file: values.yaml
        build_output: edpm-r1-networker-nodeset.yaml

      - name: edpm-networkers-r2-nodeset
        path: examples/dt/dz-storage/edpm/networkers/r2
        wait_conditions:
          - >-
            oc -n openstack wait openstackdataplanenodeset
            r2-networker-nodes
            --for condition=SetupReady
            --timeout=600s
        values:
          - name: edpm-r2-networker-nodeset-values
            src_file: values.yaml
        build_output: edpm-r2-networker-nodeset.yaml

      - name: edpm-deployment
        path: examples/dt/dz-storage/edpm/deployment
        wait_conditions:
          - >-
            oc -n openstack wait openstackdataplanedeployment
            edpm-deployment
            --for condition=Ready
            --timeout=120m
        values:
          - name: edpm-deployment-values
            src_file: values.yaml
        build_output: edpm-deployment.yaml
        post_stage_run:
          - name: Wait until computes are ready
            type: playbook
            source: "../../playbooks/bgp-l3-computes-ready.yml"
            extra_vars:
              num_computes: 6

examples/dt/dz-storage/README.md

Lines changed: 86 additions & 0 deletions

# Distributed Zones with BGP and third party storage

This Deployed Topology (DT) is the same as [bgp-l3-xl](../bgp-l3-xl)
but it also has the following:

- Three zones:
  - zone A CoreOS: ocp-worker-0, ocp-worker-1, ocp-worker-2, ocp-master-0
  - zone B CoreOS: ocp-worker-3, ocp-worker-4, ocp-worker-5, ocp-master-1
  - zone C CoreOS: ocp-worker-6, ocp-worker-7, ocp-worker-8, ocp-master-2
  - zone A RHEL: r0-compute-0, r0-compute-1, r0-networker-0, leaf-0, leaf-1
  - zone B RHEL: r1-compute-0, r1-compute-1, r1-networker-0, leaf-2, leaf-3
  - zone C RHEL: r2-compute-0, r2-compute-1, r2-networker-0, leaf-4, leaf-5
- [Topology CRDs](https://github.com/openstack-k8s-operators/infra-operator/pull/325) are
  used to either spread pods across zones or keep them within a zone
  (see the sketch after this list).
- Self Node Remediation and Node Health Checks.
- It is assumed that a storage array is physically located in each of the zones.
  This example uses a NetApp as an iSCSI backend for Cinder and an NFS backend for Manila.
- Glance uses Cinder as its backend and is configured with multiple stores.
- There is a separate cinder-volume and manila-share service per zone.
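
As a rough illustration of what such Topology CRs look like, the sketch below shows one resource that spreads a service's pods across the zones and one that keeps pods within a single zone. The schema is the one proposed in the infra-operator PR linked above; the names, the zone label key, and all values here are illustrative assumptions and are not the CRs shipped under `topology/` in this DT.

```
# Illustrative sketch only -- see the topology/ directory in this DT for
# the real CRs; names, label keys, and values below are assumptions.
apiVersion: topology.openstack.org/v1beta1
kind: Topology
metadata:
  name: spread-across-zones
  namespace: openstack
spec:
  # Standard Kubernetes topologySpreadConstraints applied to the pods of
  # any service that references this Topology.
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: topology.kubernetes.io/zone
      whenUnsatisfiable: ScheduleAnyway
---
apiVersion: topology.openstack.org/v1beta1
kind: Topology
metadata:
  name: azone-node-affinity
  namespace: openstack
spec:
  # Node affinity keeping pods on nodes labeled for zone A only.
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: topology.kubernetes.io/zone
              operator: In
              values:
                - az0
```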

The CRs included within this DT should be applied on an environment
where EDPM and OCP nodes are connected through a spine/leaf
network. The BGP protocol should be enabled on those spine and leaf
routers. See [bgp-l3-xl](../bgp-l3-xl) for information about
the BGP Dynamic Routing configuration.

worker-9 is tainted so that regular OpenStack workloads are not
scheduled on it. It is not included in any rack or zone, and it is
not connected to any leaves, but to a router connected to the
spines. It exists for running tests from outside.

## Prerequisites

- Chapters 2 and 6 from
  [Self Node Remediation and Node Health Checks](https://docs.redhat.com/en/documentation/workload_availability_for_red_hat_openshift/24.4/html-single/remediation_fencing_and_maintenance)
  have been completed, so that when the Node Health Check (NHC) Operator
  detects an unhealthy node it creates a Self Node Remediation (SNR) CR
  with the `Automatic` strategy (which taints the unhealthy node so
  that its pods are rescheduled). A sketch of such a configuration
  follows this list.

- A storage class has been created. Note that if [LVMS](https://docs.redhat.com/en/documentation/openshift_container_platform/4.16/html/storage/configuring-persistent-storage#persistent-storage-using-lvms)
  is used and a node fails, the attached volume cannot be mounted on a
  new node because it is still assigned to the failed node, which
  prevents SNR from rescheduling pods with PVCs.
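
The following is a minimal sketch of the kind of NodeHealthCheck resource the first prerequisite produces, assuming the medik8s APIs described in the Workload Availability documentation. The selector, thresholds, and the remediation template name and namespace are illustrative; take the supported values from the documentation for your release.

```
# Illustrative sketch only -- follow the Workload Availability docs for
# the values supported in your release.
apiVersion: remediation.medik8s.io/v1alpha1
kind: NodeHealthCheck
metadata:
  name: nhc-worker-default
spec:
  # Only consider worker nodes for remediation.
  selector:
    matchExpressions:
      - key: node-role.kubernetes.io/worker
        operator: Exists
  # Stop remediating if fewer than 51% of the selected nodes are healthy.
  minHealthy: 51%
  # A node is unhealthy once Ready has been False for 300s.
  unhealthyConditions:
    - type: Ready
      status: "False"
      duration: 300s
  # Remediate with the SNR Automatic strategy template (name and namespace
  # depend on how the SNR operator was installed).
  remediationTemplate:
    apiVersion: self-node-remediation.medik8s.io/v1alpha1
    kind: SelfNodeRemediationTemplate
    namespace: openshift-workload-availability
    name: self-node-remediation-automatic-strategy-template
```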

## Considerations

### Networking

See the "Considerations/Constraints" section of [bgp-l3-xl](../bgp-l3-xl).

### Block Storage Access

#### Local Access

In this example there is a storage array in each availability zone and a cinder-volume service pod is deployed on worker nodes in the same zone, i.e. local to that array. For example, in AZ1 the storage array with IP address 10.1.0.6 is on the local storage network 10.1.0.0/24 and a cinder-volume pod for AZ1 is configured on a worker node with access to that storage network. Compute nodes in AZ1 should be on the same network and have access to the same array.
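
To make the per-zone layout concrete, here is a rough sketch of how one zone's cinder-volume backend might be expressed in the control plane service values, assuming the openstack-operator `cinderVolumes` structure. The backend name, array address, SVM, and availability zone below are illustrative and do not reproduce the actual `service-values.yaml` in this DT, which also handles credentials through secrets.

```
# Illustrative sketch only -- see service-values.yaml in this DT for the
# real backend definitions and secret handling.
cinder:
  template:
    cinderVolumes:
      az0:
        customServiceConfig: |
          [netapp-az0]
          volume_backend_name = netapp-az0
          volume_driver = cinder.volume.drivers.netapp.common.NetAppDriver
          netapp_storage_family = ontap_cluster
          netapp_storage_protocol = iscsi
          netapp_server_hostname = 10.1.0.6
          netapp_vserver = svm-az0
          # Volumes from this backend are scheduled into availability zone az0.
          backend_availability_zone = az0
```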

#### Remote Access

It is also necessary for the storage array in each zone to be accessible by worker nodes in remote zones. For example, the glance pod in AZ1 is configured with multiple stores, including the cinder-volume service local to AZ1 and the cinder-volume service in remote AZ2. This access is necessary so that an image may be uploaded to the glance store in AZ1 and then copied to the glance store in AZ2. The same access is also required to retype volumes between zones.

The example here uses iSCSI, and IP routing can be configured to support the remote access described above. If FC is used in place of iSCSI, then the switches need to be zoned to support the same types of remote and local access.
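
As a rough illustration of the multi-store and retype operations described above, the sketch below shows a Glance configuration with one Cinder-backed store per zone, followed by the kind of commands an operator might run to copy an image between stores and to retype a volume across backends. The store names, volume types, and image and volume names are illustrative assumptions, not values taken from this DT.

```
# Illustrative sketch only -- one Cinder-backed Glance store per zone.
glance:
  template:
    customServiceConfig: |
      [DEFAULT]
      enabled_backends = az0:cinder,az1:cinder
      [glance_store]
      default_backend = az0
      [az0]
      cinder_volume_type = az0-netapp
      cinder_catalog_info = volumev3::internalURL
      [az1]
      cinder_volume_type = az1-netapp
      cinder_catalog_info = volumev3::internalURL
```

```
# Copy an existing image from its current store into the az1 store
# (requires the glance CLI; the image ID is a placeholder).
glance image-import <image-id> --import-method copy-image --stores az1

# Retype (and migrate) a volume from the az0 backend to the az1 backend.
openstack volume set --type az1-netapp --retype-policy on-demand <volume-id>
```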

### File Storage Access

#### Local Access

In this example there is a storage array in each availability zone and a manila-share service pod is deployed on worker nodes local to that array. For example, in AZ1 the storage array with IP address 10.1.0.6 is on the local storage network 10.1.0.0/24 and a manila-share pod for AZ1 is configured on a worker node with access to that storage network. Compute nodes in AZ1 should be on the same network and have access to the same array if they will use the shares hosted on that array.
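
For symmetry with the Cinder example above, here is a rough sketch of how one zone's manila-share backend might look in the control plane service values, again assuming the openstack-operator `manilaShares` structure. The backend name, array address, and SVM are illustrative and are not taken from this DT's actual values.

```
# Illustrative sketch only -- see service-values.yaml in this DT for the
# real backend definitions and secret handling.
manila:
  template:
    manilaShares:
      az0:
        customServiceConfig: |
          [netapp-az0]
          share_backend_name = netapp-az0
          share_driver = manila.share.drivers.netapp.common.NetAppDriver
          driver_handles_share_servers = false
          netapp_storage_family = ontap_cluster
          netapp_server_hostname = 10.1.0.6
          netapp_vserver = svm-az0
          # Shares from this backend are reported in availability zone az0.
          backend_availability_zone = az0
```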

#### Remote Access

It is also possible for the storage array in each zone to be accessible by worker nodes in remote zones. For example, access could be granted to a share hosted in AZ1 for Nova instances hosted in AZ2. Network latency between availability zones might affect storage performance, so it is up to the administrator to decide whether to grant only local access to shares or to also allow remote access.
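
As an example of that access decision, granting clients on an AZ2 tenant subnet access to a share hosted in AZ1 might look like the following, assuming the manila plugin for the openstack client is installed; the share name and CIDR are illustrative.

```
# Allow clients on the (illustrative) AZ2 subnet to mount the AZ1-hosted share.
openstack share access create my-az1-share ip 10.2.0.0/24 --access-level rw
```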

## Stages

All stages must be executed in the order listed below. Everything is required unless otherwise indicated.

1. [Configure taints on the OCP tester node](configure-taints.md)
2. [Disable RP filters on OCP nodes](disable-rp-filters.md)
3. [Install the OpenStack K8S operators and their dependencies](../../common/)
4. [Apply the metallb customization required to run a speaker pod on the OCP tester node](metallb/)
5. [Define Zones and Topologies](topology/)
6. [Configure networking and deploy the OpenStack control plane with storage](control-plane.md)
7. [Create the BGPConfiguration after the controlplane is deployed](bgp-configuration.md)
8. [Configure and deploy the dataplane - networker and compute nodes](data-plane.md)
9. [Validate Distributed Zone Storage](validate.md)

examples/dt/dz-storage/bgp-configuration.md

Lines changed: 16 additions & 0 deletions

# Create BGPConfiguration after the controlplane is deployed

An empty BGPConfiguration OpenShift resource needs to be created.
The infra-operator detects that this resource has been created and automatically
applies the required OpenShift BGP configuration.
The OCP 4.18 release is necessary for this.

The following CR needs to be applied:
```
apiVersion: network.openstack.org/v1beta1
kind: BGPConfiguration
metadata:
  name: bgpconfiguration
  namespace: openstack
spec: {}
```
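
If the CR is applied by hand rather than through the automation in this repository, the usual pattern would be something like the following; the file name is illustrative, and the resource can then be inspected with `oc get`.

```
oc apply -f bgpconfiguration.yaml
oc -n openstack get bgpconfiguration bgpconfiguration
```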

examples/dt/dz-storage/configure-taints.md

Lines changed: 21 additions & 0 deletions

# Apply taints on the OCP tester node

This OCP worker node should not run any OpenStack service apart from those
created by the test-operator.
It should, however, run a MetalLB speaker pod in order to obtain the proper
network configuration.
For this reason, taints should be configured on this worker.

Execute the following command:
```
oc patch node/worker-9 --type merge --patch '
spec:
  taints:
  - effect: NoSchedule
    key: testOperator
    value: "true"
  - effect: NoExecute
    key: testOperator
    value: "true"
'
```
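
One way to confirm that the taints were applied to the node:

```
oc get node worker-9 -o jsonpath='{.spec.taints}'
```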
