@rabi
Contributor

@rabi rabi commented Oct 31, 2025

This PR implements the service cleanup functionality with a new edpm_cleanup role.

  • Creates a state file that tracks the services deployed on each node
  • Services no longer in the nodeset services list are cleaned up either by adding the cleanup service to that list or by creating a new deployment with cleanup in 'servicesOverride'
  • If a service drops a container, that container is cleaned up automatically
  • Service cleanup removes the containers, startup_config and other config files, and runs the role-specific cleanup tasks provided in the role's cleanup.yaml
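
The comparison the bullets above describe can be sketched roughly as follows. This is only an illustration, not the role's actual implementation: the JSON layout of the state file (service name mapped to a list of container names), the function names, and the sample container names are all assumptions made for the example.

```python
import json
from pathlib import Path


def load_state(path):
    """Read the state file; a missing file is treated as 'nothing deployed yet'."""
    p = Path(path)
    if not p.exists():
        return {}
    return json.loads(p.read_text())


def orphaned_services(state, nodeset_services):
    """Services recorded as deployed that are no longer in the nodeset list."""
    wanted = set(nodeset_services)
    return sorted(name for name in state if name not in wanted)


def dropped_containers(state, service, current_containers):
    """Containers a service deployed previously but no longer defines."""
    previous = set(state.get(service, []))
    return sorted(previous - set(current_containers))


# Hypothetical state: service name -> containers it deployed (assumed layout).
state = {
    "nova": ["nova_compute"],
    "ovn": ["ovn_controller", "ovn_metadata_agent"],
    "telemetry": ["ceilometer_agent_compute", "node_exporter"],
}

# "telemetry" was removed from the nodeset services list, so it is an orphan.
print(orphaned_services(state, ["nova", "ovn"]))

# "ovn" is still listed but no longer ships ovn_metadata_agent.
print(dropped_containers(state, "ovn", ["ovn_controller"]))
```

Keeping the two checks separate mirrors the description: a whole service can be orphaned, or a service can stay in the list while one of its containers goes away.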

jira: https://issues.redhat.com/browse/OSPRH-19243

Assisted-by: claude-4-sonnet
Signed-off-by: rabi [email protected]

@openshift-ci
Contributor

openshift-ci bot commented Oct 31, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: rabi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

rabi force-pushed the service_cleanup branch 17 times, most recently from 101f6c8 to ab712c4 on November 3, 2025 16:08
rabi changed the title from "WIP Add service cleanup automation" to "Add service cleanup automation" on Nov 4, 2025
@rabi
Contributor Author

rabi commented Nov 4, 2025

recheck

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/b2b01c949c734fed82b507e66b194d38

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 28m 27s
podified-multinode-edpm-deployment-crc FAILURE in 1h 36m 55s
cifmw-crc-podified-edpm-baremetal FAILURE in 1h 36m 28s
✔️ cifmw-pod-zuul-files SUCCESS in 4m 44s
cifmw-crc-podified-edpm-baremetal-bootc FAILURE in 1h 41m 28s
✔️ noop SUCCESS in 0s
edpm-ansible-tempest-multinode FAILURE in 1h 39m 24s
✔️ edpm-ansible-molecule-edpm_kernel SUCCESS in 26m 41s
✔️ edpm-ansible-molecule-edpm_nova SUCCESS in 11m 30s
✔️ edpm-ansible-molecule-edpm_frr SUCCESS in 7m 28s
✔️ edpm-ansible-molecule-edpm_ovn_bgp_agent SUCCESS in 7m 43s
✔️ edpm-ansible-molecule-edpm_telemetry_power_monitoring SUCCESS in 9m 00s
adoption-standalone-to-crc-ceph-provider FAILURE in 2h 13m 09s

rabi commented Nov 4, 2025

recheck

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/271f0207e6dc48628c664ab69c5810e1

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 25m 11s
podified-multinode-edpm-deployment-crc FAILURE in 1h 37m 38s
cifmw-crc-podified-edpm-baremetal FAILURE in 1h 44m 30s
✔️ cifmw-pod-zuul-files SUCCESS in 4m 41s
cifmw-crc-podified-edpm-baremetal-bootc FAILURE in 1h 47m 30s
✔️ noop SUCCESS in 0s
edpm-ansible-tempest-multinode FAILURE in 1h 43m 22s
✔️ edpm-ansible-molecule-edpm_kernel SUCCESS in 27m 19s
✔️ edpm-ansible-molecule-edpm_nova SUCCESS in 12m 01s
✔️ edpm-ansible-molecule-edpm_frr SUCCESS in 7m 17s
✔️ edpm-ansible-molecule-edpm_ovn_bgp_agent SUCCESS in 7m 48s
✔️ edpm-ansible-molecule-edpm_telemetry_power_monitoring SUCCESS in 9m 03s
adoption-standalone-to-crc-ceph-provider FAILURE in 2h 10m 56s

rabi commented Nov 4, 2025

recheck

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/3fb9e8fd282b4237aec2cc26e955391e

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 21m 21s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 17m 08s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 31m 13s
✔️ cifmw-pod-zuul-files SUCCESS in 4m 23s
✔️ cifmw-crc-podified-edpm-baremetal-bootc SUCCESS in 1h 30m 44s
✔️ noop SUCCESS in 0s
✔️ edpm-ansible-tempest-multinode SUCCESS in 1h 34m 42s
✔️ edpm-ansible-molecule-edpm_kernel SUCCESS in 25m 40s
✔️ edpm-ansible-molecule-edpm_nova SUCCESS in 10m 50s
✔️ edpm-ansible-molecule-edpm_frr SUCCESS in 6m 58s
✔️ edpm-ansible-molecule-edpm_ovn_bgp_agent SUCCESS in 7m 14s
✔️ edpm-ansible-molecule-edpm_telemetry_power_monitoring SUCCESS in 8m 23s
adoption-standalone-to-crc-ceph-provider FAILURE in 2h 07m 11s

rabi commented Nov 5, 2025

recheck

rabi commented Nov 5, 2025

recheck

rabi commented Nov 5, 2025

recheck

rabi commented Nov 5, 2025

recheck

rabi commented Nov 5, 2025

recheck

rabi commented Nov 6, 2025

recheck

rabi added a commit to rabi/openstack-operator that referenced this pull request Nov 6, 2025
Some services use edpm_container_standalone while others call
edpm_container_manage directly, and the way we generate and consume the
config files is spread across multiple locations, which is confusing and
difficult to troubleshoot. This standardizes the locations.

This is required to identify which services are already deployed versus
still to be deployed, and to remove orphaned containers/services.

jira: https://issues.redhat.com/browse/OSPRH-19243

Signed-off-by: rabi <[email protected]>
rabi force-pushed the service_cleanup branch 4 times, most recently from 4234e41 to 3709924 on November 10, 2025 05:07
rabi commented Nov 11, 2025

recheck

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/262e1cc9d3134ea4a5b73aa0caf3c4fe

Warning:
Dependency cycle detected and project openstack-k8s-operators/ci-framework doesn't allow circular dependencies

rabi commented Nov 12, 2025

recheck

rabi added 2 commits November 13, 2025 19:48
This will ensure that we don't have issues during update

Signed-off-by: rabi <[email protected]>
rabi force-pushed the service_cleanup branch 2 times, most recently from 05b6f47 to 8348188 on November 13, 2025 14:37
@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/3a2df1fba3484160b5bf1530e94fb079

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 47m 23s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 27m 01s
cifmw-crc-podified-edpm-baremetal MERGE_CONFLICT in 3m 40s
✔️ cifmw-pod-zuul-files SUCCESS in 4m 42s
cifmw-crc-podified-edpm-baremetal-bootc MERGE_CONFLICT in 3m 49s
✔️ noop SUCCESS in 0s
edpm-ansible-tempest-multinode MERGE_CONFLICT in 3m 53s
✔️ edpm-ansible-molecule-edpm_kernel SUCCESS in 27m 14s
✔️ edpm-ansible-molecule-edpm_nova SUCCESS in 11m 32s
✔️ edpm-ansible-molecule-edpm_frr SUCCESS in 6m 38s
✔️ edpm-ansible-molecule-edpm_ovn_bgp_agent SUCCESS in 7m 32s
✔️ edpm-ansible-molecule-edpm_telemetry_power_monitoring SUCCESS in 8m 30s
✔️ edpm-ansible-molecule-edpm_update SUCCESS in 6m 55s
adoption-standalone-to-crc-ceph-provider ERROR Failed to update project openstack-k8s-operators/architecture in 1m 22s

This PR implements the service cleanup functionality with an
edpm_cleanup role.

- edpm_container_standalone creates a state file to keep track of the
  services deployed on the nodes
- Containers for services not in the nodeset services list are cleaned
  up either by adding the cleanup service to that list or by creating
  a new deployment with cleanup in 'servicesOverride'
- If a service drops a container, that container is cleaned up
  automatically
- Service cleanup also removes containers, startup_config and other
  config files, and runs the specific cleanup tasks provided in the
  cleanup.yaml of a role

jira: https://issues.redhat.com/browse/OSPRH-19243

Assisted-by: claude-4-sonnet

Signed-off-by: rabi <[email protected]>