Description
What steps did you take and what happened:
We experienced issues with the OpenStack infrastructure that led to frequent live migrations and caused several VMs to enter the ERROR state.
Once the OpenStack infrastructure was restored to a healthy state, the affected VMs returned to the ACTIVE state, and the corresponding Kubernetes nodes became Ready again.
However, the associated OpenStackServer resources remained in the ERROR state and did not recover.
Here is the event log from one affected VM:
openstack server event list test-md-0-vgh6x-52bdl-s4n9v --long
+------------------------------------------+--------------------------------------+----------------+----------------------------+---------+----------------------------------+----------------------------------+
| Request ID | Server ID | Action | Start Time | Message | Project ID | User ID |
+------------------------------------------+--------------------------------------+----------------+----------------------------+---------+----------------------------------+----------------------------------+
| req-242cf8b4-c47e-462e-9e7a-2f08a80f7f2d | b4b4f1e1-3583-4adc-88c6-b7b523cd9478 | live-migration | 2025-07-15T10:29:43.000000 | None | 2e0ab246be0e407c88a45f4755abdd17 | 25774e95e5f34a5aa7d3fd4cff3a86ef |
| req-87634299-cc38-48cd-8796-2aefa4c18fa5 | b4b4f1e1-3583-4adc-88c6-b7b523cd9478 | live-migration | 2025-07-15T10:18:19.000000 | None | 2e0ab246be0e407c88a45f4755abdd17 | 25774e95e5f34a5aa7d3fd4cff3a86ef |
| req-169f86ce-f115-4584-93a7-9edacd9a2555 | b4b4f1e1-3583-4adc-88c6-b7b523cd9478 | live-migration | 2025-07-12T14:38:00.000000 | None | 2e0ab246be0e407c88a45f4755abdd17 | 25774e95e5f34a5aa7d3fd4cff3a86ef |
| req-3aca2c17-11a0-4309-852c-acd1735413b8 | b4b4f1e1-3583-4adc-88c6-b7b523cd9478 | reboot | 2025-07-12T14:37:50.000000 | None | 2e0ab246be0e407c88a45f4755abdd17 | 25774e95e5f34a5aa7d3fd4cff3a86ef |
| req-8d6ebe23-368b-4fc0-b0a4-c73c6ecebd2b | b4b4f1e1-3583-4adc-88c6-b7b523cd9478 | live-migration | 2025-07-12T14:00:50.000000 | Error | 2e0ab246be0e407c88a45f4755abdd17 | 25774e95e5f34a5aa7d3fd4cff3a86ef |
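| req-03c7d1e6-9e88-4f2d-9410-d944b230fb6f | b4b4f1e1-3583-4adc-88c6-b7b523cd9478 | live-migration | 2025-07-12T12:52:58.000000 | Error | 2e0ab246be0e407c88a45f4755abdd17 | 25774e95e5f34a5aa7d3fd4cff3a86ef |
+------------------------------------------+--------------------------------------+----------------+----------------------------+---------+----------------------------------+----------------------------------+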
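To make the timeline easier to read, here is a minimal Python sketch that parses the table above and picks out the failed events and the most recent one. The helper names (`parse_event_table`, `failed_events`, `latest_event`) are made up for illustration and are not part of any OpenStack tooling. The sample is an excerpt of three rows copied from the output above.

```python
# Excerpt of the `openstack server event list --long` output shown above.
SAMPLE = """
| Request ID | Server ID | Action | Start Time | Message | Project ID | User ID |
| req-242cf8b4-c47e-462e-9e7a-2f08a80f7f2d | b4b4f1e1-3583-4adc-88c6-b7b523cd9478 | live-migration | 2025-07-15T10:29:43.000000 | None | 2e0ab246be0e407c88a45f4755abdd17 | 25774e95e5f34a5aa7d3fd4cff3a86ef |
| req-8d6ebe23-368b-4fc0-b0a4-c73c6ecebd2b | b4b4f1e1-3583-4adc-88c6-b7b523cd9478 | live-migration | 2025-07-12T14:00:50.000000 | Error | 2e0ab246be0e407c88a45f4755abdd17 | 25774e95e5f34a5aa7d3fd4cff3a86ef |
| req-03c7d1e6-9e88-4f2d-9410-d944b230fb6f | b4b4f1e1-3583-4adc-88c6-b7b523cd9478 | live-migration | 2025-07-12T12:52:58.000000 | Error | 2e0ab246be0e407c88a45f4755abdd17 | 25774e95e5f34a5aa7d3fd4cff3a86ef |
"""


def parse_event_table(text):
    """Turn the ASCII table into a list of dicts keyed by column header."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip border lines (+----+) and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # the first pipe-delimited line is the header
            continue
        rows.append(dict(zip(header, cells)))
    return rows


def failed_events(rows):
    """Return the events whose Message column is 'Error'."""
    return [row for row in rows if row["Message"] == "Error"]


def latest_event(rows):
    """Return the most recent event; ISO timestamps sort lexicographically."""
    return max(rows, key=lambda row: row["Start Time"])


if __name__ == "__main__":
    events = parse_event_table(SAMPLE)
    print("failed:", [e["Start Time"] for e in failed_events(events)])
    print("latest:", latest_event(events)["Action"], latest_event(events)["Message"])
```

In this excerpt both failed live migrations are from 2025-07-12, while the latest live migration on 2025-07-15 has no error message, which matches the VM being back in ACTIVE.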
The CAPO controller logs the following message and does not attempt to reconcile the resource from that state:
Not reconciling server in error state. See openStackServer.status or previously logged error for details
Why does CAPO not attempt to reconcile an OpenStackServer that is in the ERROR state, even after the underlying VM has recovered?
What did you expect to happen:
Once the underlying VM returned to the ACTIVE state, CAPO should have reconciled the OpenStackServer again and picked up the new state instead of leaving it in ERROR.
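To spell out the difference between what we see and what we expected, here is a small Python sketch of the decision. It is not CAPO's code (CAPO is written in Go), and the function names and return values are hypothetical. It only illustrates the behaviour described in this issue.

```python
def observed_decision(resource_in_error: bool, vm_status: str) -> str:
    """Behaviour we observe: an errored resource is never reconciled again."""
    if resource_in_error:
        # Matches the log line "Not reconciling server in error state."
        return "skip"
    return "reconcile"


def expected_decision(resource_in_error: bool, vm_status: str) -> str:
    """Behaviour we expected: recover once the VM reports ACTIVE again."""
    if not resource_in_error:
        return "reconcile"
    if vm_status == "ACTIVE":
        # Assumption: clearing the error is safe once the VM is healthy.
        return "clear-error-and-reconcile"
    return "skip"


if __name__ == "__main__":
    # Our situation: the resource is marked as errored, the VM is ACTIVE.
    print(observed_decision(True, "ACTIVE"))
    print(expected_decision(True, "ACTIVE"))
```

The two functions differ only in the case where the resource is in the error state and the VM is ACTIVE, which is exactly the situation we ended up in.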
Environment:
Cluster API Provider OpenStack version: v0.12.3
Cluster-API version: v1.10.2
OpenStack version:
Minikube/KIND version:
Kubernetes version (use kubectl version): 1.32.5
OS (e.g. from /etc/os-release): Ubuntu 24.04