@michael-richey
Collaborator

What does this PR do?

This fix deletes items in the correct order. We've always created them in the correct order: for example, if a Dashboard depends on an SLO being there, we create the SLO before creating the Dashboard. The opposite wasn't true, though: we didn't always delete the Dashboard before deleting the SLO.

Description of the Change

Use a topological sort to order deletions, so a resource is deleted before the resources it depends on.
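As a rough illustration of the idea (not the code in this PR), the sketch below orders deletions with Python's standard `graphlib.TopologicalSorter`. The function name `deletion_order` and the `depends_on` mapping are hypothetical, and the use of `graphlib` is an assumption based on the `prepare()`, `done()` and `CycleError` names that appear in the review.

```python
from graphlib import TopologicalSorter


def deletion_order(depends_on):
    """Order resources so each one is deleted before anything it depends on.

    depends_on maps a resource to the resources it needs to exist
    (hypothetical input shape, for illustration only).
    """
    sorter = TopologicalSorter()
    for resource, deps in depends_on.items():
        sorter.add(resource)
        for dep in deps:
            # Reverse of creation order: `dep` may only be deleted
            # after `resource`, which depends on it, is gone.
            sorter.add(dep, resource)
    return list(sorter.static_order())


# A dashboard that depends on a monitor, which depends on an SLO.
chain = {"dashboard": ["monitor"], "monitor": ["slo"], "slo": []}
print(deletion_order(chain))
```

Creation order would pass the same edges the other way around (`sorter.add(resource, dep)`), which is why the two orders are exact reverses of each other for a simple chain.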

@michael-richey michael-richey marked this pull request as ready for review January 16, 2026 15:11
@michael-richey michael-richey requested a review from a team as a code owner January 16, 2026 15:11
Collaborator

@heyronhay heyronhay left a comment

Couple of suggestions!

    sorter.prepare()
    # If prepare() succeeds, no cycles
    return None
except Exception:
Collaborator

This seems like an overly broad catch. Is there a more specific error we can catch here?

Collaborator Author

Good call, fixed to catch CycleError specifically.
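For readers following along, a narrower catch might look like the sketch below. It assumes the sorter is Python's `graphlib.TopologicalSorter`, whose `prepare()` raises `graphlib.CycleError` when the graph has a cycle; the function name `find_cycle` and the graph shape are hypothetical and are not taken from the PR.

```python
from graphlib import CycleError, TopologicalSorter


def find_cycle(graph):
    """Return the nodes of a detected cycle, or None if the graph is acyclic.

    graph maps a node to the nodes that must come before it
    (hypothetical input shape, for illustration only).
    """
    sorter = TopologicalSorter(graph)
    try:
        sorter.prepare()
        # If prepare() succeeds, there are no cycles.
        return None
    except CycleError as e:
        # The second element of args is the list of nodes in the cycle;
        # its first and last entries are the same node.
        return e.args[1]


print(find_cycle({"dashboard": {"slo"}}))
print(find_cycle({"dashboard": {"slo"}, "slo": {"dashboard"}}))
```

Catching only `CycleError` means unrelated failures still propagate instead of being reported as a cycle.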

    await r_class._send_action_metrics("delete", _id, Status.FAILURE.value)
    self.config.logger.error(f"error deleting resource {resource_type} with id {_id}: {str(e)}")
finally:
    # Mark as done in cleanup sorter if it exists
Collaborator

Do we need to test if it was actually deleted successfully first?

Collaborator Author

Ideally yes, but I think that's a bigger restructure. Calling done(q_item) frees up the sorter so it doesn't hang forever. If we have a dashboard->monitor->slo dependency chain and the dashboard fails to get deleted, the calls to delete the monitor and slo will still happen (and will both fail). That's no different from today. To actively skip those other delete calls, I think we'd need to start tracking every dependency chain. It would be better, of course, just a bigger change.
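The behaviour described here can be sketched roughly as follows, assuming the sorter is Python's `graphlib.TopologicalSorter`. This is a simplified synchronous illustration, not the PR's code: `delete_all`, `DeleteError` and `fake_delete` are hypothetical names.

```python
from graphlib import TopologicalSorter


class DeleteError(Exception):
    """Hypothetical error raised when a delete call fails."""


def delete_all(must_delete_first, delete_fn):
    """Delete in dependency order; return the items whose delete failed.

    must_delete_first maps a resource to the resources that have to be
    deleted before it (hypothetical input shape).
    """
    sorter = TopologicalSorter(must_delete_first)
    sorter.prepare()
    failed = []
    while sorter.is_active():
        for item in sorter.get_ready():
            try:
                delete_fn(item)
            except DeleteError:
                failed.append(item)
            finally:
                # Mark the item done even on failure. Otherwise its
                # successors never become ready and the loop never ends.
                sorter.done(item)
    return failed


attempted = []


def fake_delete(resource):
    attempted.append(resource)
    if resource == "dashboard":
        raise DeleteError(resource)


order = {"monitor": {"dashboard"}, "slo": {"monitor"}}
print(delete_all(order, fake_delete), attempted)
```

In this sketch the monitor and SLO deletes are still attempted after the dashboard delete fails, which matches the trade-off described above. Skipping them would require tracking which items sit downstream of a failure.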

@michael-richey michael-richey merged commit 8be4b0b into main Jan 16, 2026
11 checks passed
@michael-richey michael-richey deleted the michael.richey/remove-dependencies-first branch January 16, 2026 16:53