Set metrics storage for HCI #674
When deploying FR2 with HCI using the CI framework:
./deploy-architecture.sh -e zuul_log_collection=true -e cifmw_nolog=false -e cifmw_run_tests=false
`kustomize_deploy/control-plane.yaml` is deployed first. Afterwards,
`kustomize_deploy/nodeset-post-ceph.yaml` updates the control plane,
including the telemetry section, where it sets:
~~~
persistentVolumeClaim:
  resources:
    requests:
      storage: 20G
~~~
~~~
$ oc get osctlplane -n openstack -w
NAME STATUS MESSAGE
controlplane True Setup complete
controlplane True Setup complete
controlplane False OpenStackControlPlane RabbitMQ in progress
controlplane False OpenStackControlPlane RabbitMQ error occured rabbitmq-notifications(Secret "cert-rabbitmq-notifications-svc" not found)
controlplane False OpenStackControlPlane Telemetry in progress
controlplane False OpenStackControlPlane Telemetry in progress
controlplane False OpenStackControlPlane Telemetry in progress
controlplane False OpenStackControlPlane Telemetry in progress
controlplane False OpenStackControlPlane Telemetry in progress
~~~
Since the initial deploy used 10G and the update switches to 20G (there
usually are no 20G local storage volumes), the control plane stays stuck
in the `Telemetry in progress` state.
This updates the HCI example to use 10G with the local-storage storage
class to align with the initial setup.
Jira: OSPRH-23687
Signed-off-by: Martin Schuppert <[email protected]>
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: stuggi
Needs approval from an approver in each of these files.
         requests:
-          storage: 20G
+          storage: 10G
+        storageClassName: local-storage
Instead of hardcoding this, we can use what the user specifies as storageClass from the values.yaml referenced in [1]. So then we can add a new replacement to [2] to inject the value the user set for storageClass in values.yaml.
[1] `- control-plane/networking/nncp/values.yaml`
[2] https://github.com/openstack-k8s-operators/architecture/blob/main/va/hci/kustomization.yaml
Something like this [1], but obviously with a different path in fieldPaths. Or we could perhaps leave the storageClassName out altogether. Not sure whether the Telemetry operator would auto-populate it during reconcile.
[1] architecture/lib/control-plane/storage/kustomization.yaml, lines 5 to 15 in 851e915:

    replacements:
      # Storage class configuration
      - source:
          kind: ConfigMap
          name: network-values
          fieldPath: data.storageClass
        targets:
          - select:
              kind: OpenStackControlPlane
            fieldPaths:
              - spec.storageClass
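A minimal sketch of what such a replacement might look like, adapted from the snippet above. The source ConfigMap and key are copied from that snippet, and the target field path is the one reported in the kustomize error later in this thread. Where the entry would live (for example va/hci/kustomization.yaml) and the commented `options` block are assumptions, not something this PR contains.

```yaml
replacements:
  # Hypothetical entry: inject the user's storageClass into the metrics PVC
  - source:
      kind: ConfigMap
      name: network-values
      fieldPath: data.storageClass
    targets:
      - select:
          kind: OpenStackControlPlane
        fieldPaths:
          - spec.telemetry.template.metricStorage.customMonitoringStack.prometheusConfig.persistentVolumeClaim.storageClassName
        # Assumption: kustomize replacement targets accept `create: true`,
        # which adds the field when it is missing from the target. Whether
        # the kustomize version bundled with `oc` honors it here is untested.
        # options:
        #   create: true
```

Without `create: true`, the replacement only succeeds if the target resource already contains the full path at the point where the replacement runs.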
In earlier releases the global storage class is not populated here; that was added later in openstack-k8s-operators/openstack-operator@7920e8b
I did it like this because it was done the same way in fed27da
IIUC what you meant, I tried stuggi@e677b10, but it fails with:
TASK [kustomize_deploy : Build kustomized content for examples/va/hci/control-plane chdir={{ _chdir }}, _raw_params=oc kustomize] ***********************************************************
Friday 09 January 2026 10:15:57 +0000 (0:00:00.068) 0:14:21.673 ********
Friday 09 January 2026 10:15:57 +0000 (0:00:00.068) 0:14:21.671 ********
task path: /home/zuul/src/github.com/openstack-k8s-operators/ci-framework/roles/kustomize_deploy/tasks/execute_step.yml:229
fatal: [localhost]: FAILED! =>
changed: true
cmd:
- oc
- kustomize
delta: '0:00:00.133827'
end: '2026-01-09 10:15:57.623324'
msg: non-zero return code
rc: 1
start: '2026-01-09 10:15:57.489497'
stderr: 'error: accumulating components: accumulateDirectory: "recursed accumulation
of path ''/home/zuul/src/github.com/openstack-k8s-operators/architecture/va/hci'':
accumulating components: accumulateDirectory: \"recursed accumulation of path ''/home/zuul/src/github.com/openstack-k8s-operators/architecture/lib/control-plane/storage'':
unable to find field \\\"spec.telemetry.template.metricStorage.customMonitoringStack.prometheusConfig.persistentVolumeClaim.storageClassName\\\"
in replacement target\""'
stderr_lines:
- 'error: accumulating components: accumulateDirectory: "recursed accumulation of
path ''/home/zuul/src/github.com/openstack-k8s-operators/architecture/va/hci'':
accumulating components: accumulateDirectory: \"recursed accumulation of path ''/home/zuul/src/github.com/openstack-k8s-operators/architecture/lib/control-plane/storage'':
unable to find field \\\"spec.telemetry.template.metricStorage.customMonitoringStack.prometheusConfig.persistentVolumeClaim.storageClassName\\\"
in replacement target\""'
stdout: ''
stdout_lines: []
I think that to make the kustomize replacement work, we'd have to add the customMonitoringStack structure to the base OpenStackControlPlane resource at architecture/lib/control-plane/base/openstackcontrolplane.yaml. Not sure if that is something we want or can do, since it might affect other jobs.
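For illustration, the structure in question could look roughly like the fragment below. The nesting is derived from the field path in the error output above, and the resource name comes from the `oc get osctlplane` output. The `_replaced_` placeholder value is an assumption, and the fragment omits every other field of the real base resource.

```yaml
# Hypothetical fragment of lib/control-plane/base/openstackcontrolplane.yaml
apiVersion: core.openstack.org/v1beta1
kind: OpenStackControlPlane
metadata:
  name: controlplane
spec:
  telemetry:
    template:
      metricStorage:
        customMonitoringStack:
          prometheusConfig:
            persistentVolumeClaim:
              # Placeholder (assumed value) for the replacement to overwrite
              storageClassName: _replaced_
```

Because this would sit in the shared base, every architecture built on it would render a customMonitoringStack section, which is the side effect on other jobs raised above.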