🐛 Handle SOFT_DELETED and DELETED states in server deletion #2834
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Bharath Nallapeta <[email protected]>
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing `/approve` in a comment.
✅ Deploy Preview for kubernetes-sigs-cluster-api-openstack ready!
mandre left a comment:
While this allows DeleteInstance() to complete when OpenStack is configured with soft-deletion, I'm a bit unclear on what happens to all the resources the server depends on (volumes, trunk ports). Are we leaking them?
Also, what happens when the server is brought back to life?
Were you able to check this change against an OpenStack cloud configured with soft-deletion?
When CAPO creates servers with volumes, it sets DeleteOnTermination=true, so when the server is deleted, OpenStack automatically deletes the volumes. This is not on CAPO (see the sketch below).

AFAIK, once CAPO is done with its cycle, it removes all the labels, finalizers, etc., and thus, from CAPO's perspective, the server becomes an orphaned resource. Even if it is brought back later, it won't be associated with CAPO. (@lentzi90 could you please chime in here?)

No, not yet.
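For context, here is a minimal sketch of the block device mapping described above, assuming gophercloud v1's bootfromvolume extension; the function name and values (`createOptsWithRootVolume`, `example-node`, `example-flavor`) are illustrative, not CAPO's actual code:

```go
package compute

import (
	"github.com/gophercloud/gophercloud/openstack/compute/v2/extensions/bootfromvolume"
	"github.com/gophercloud/gophercloud/openstack/compute/v2/servers"
)

// createOptsWithRootVolume builds server create options whose root volume
// Nova deletes together with the server (illustrative sketch only).
func createOptsWithRootVolume(imageID string, sizeGB int) servers.CreateOptsBuilder {
	base := servers.CreateOpts{
		Name:      "example-node",   // illustrative
		FlavorRef: "example-flavor", // illustrative
	}
	return bootfromvolume.CreateOptsExt{
		CreateOptsBuilder: base,
		BlockDevice: []bootfromvolume.BlockDevice{{
			BootIndex:       0,
			UUID:            imageID,
			SourceType:      bootfromvolume.SourceImage,
			DestinationType: bootfromvolume.DestinationVolume,
			VolumeSize:      sizeGB,
			// Nova deletes the volume when the server is deleted,
			// so CAPO does not have to clean it up itself.
			DeleteOnTermination: true,
		}},
	}
}
```

With DeleteOnTermination set, Nova owns the volume's lifecycle, which is why the reply above treats volume cleanup as out of CAPO's hands.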
However, because the server is still there, I expect the port cleanup to fail, so perhaps we're creating new problems with this change, leading to resource leaks. It would be good to double check against a real environment. Or are we simply considering the server and associated resources as "not CAPO's problem anymore"? In which case, I think it would deserve being stated explicitly in the docs, because it would then be the user's responsibility to clean the resources after the soft-deletion period ended.
@mandre let me test this on a real env and get back.
What this PR does / why we need it:
When OpenStack's soft-delete is enabled, deleted servers enter the SOFT_DELETED state instead of being immediately purged, causing CAPO's deletion poll to time out and stall cluster cleanup.
This PR updates the deletion poll to treat SOFT_DELETED (and DELETED) as success, allowing reconciliation to proceed while respecting OpenStack's reclaim policy.
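As a rough sketch of the state handling this implies, assuming gophercloud v1's servers API; `serverGone` is an illustrative name, not the PR's actual function:

```go
package compute

import (
	"errors"

	"github.com/gophercloud/gophercloud"
	"github.com/gophercloud/gophercloud/openstack/compute/v2/servers"
)

// serverGone reports whether the deletion poll can consider the server
// deleted (illustrative sketch only).
func serverGone(client *gophercloud.ServiceClient, serverID string) (bool, error) {
	server, err := servers.Get(client, serverID).Extract()
	if err != nil {
		var notFound gophercloud.ErrDefault404
		if errors.As(err, &notFound) {
			// Hard-deleted: Nova no longer knows about the server.
			return true, nil
		}
		return false, err
	}
	switch server.Status {
	case "SOFT_DELETED", "DELETED":
		// Soft delete is enabled (or the record lingers): treat the poll
		// as successful and let OpenStack purge the instance per its
		// configured reclaim policy.
		return true, nil
	default:
		return false, nil
	}
}
```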
**Special Notes**
Design Approach
Updated `DeleteInstance` to treat servers in `SOFT_DELETED` or `DELETED` state as successfully deleted. This allows cluster deletion to complete when OpenStack has soft delete enabled, while respecting the cloud admin's recovery policy. CAPO proceeds with cleanup, and OpenStack handles permanent deletion per its configured reclaim interval.

Which issue(s) this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged):

Fixes #2618
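To make the intended semantics concrete, a hypothetical table-driven test of the status check; `isDeletedStatus` is an illustrative helper, not necessarily how the PR structures the code:

```go
package compute

import "testing"

// isDeletedStatus mirrors the check described above (illustrative only).
func isDeletedStatus(status string) bool {
	return status == "SOFT_DELETED" || status == "DELETED"
}

func TestIsDeletedStatus(t *testing.T) {
	for _, tc := range []struct {
		status string
		want   bool
	}{
		{"ACTIVE", false},
		{"ERROR", false},
		{"SOFT_DELETED", true},
		{"DELETED", true},
	} {
		if got := isDeletedStatus(tc.status); got != tc.want {
			t.Errorf("isDeletedStatus(%q) = %v, want %v", tc.status, got, tc.want)
		}
	}
}
```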
TODOs:
/hold