Description
I noticed this issue while testing the PlatformService DNS. The platform service essentially deploys external-dns onto clusters with specific purposes. If the Cluster gets a deletion timestamp, the platform service removes the external-dns deployment.
The problem is that the corresponding ClusterProvider starts removing the cluster as soon as the Cluster resource has a deletion timestamp, which causes a race between the removal of the deployment and the deletion of the cluster.
In my test with ClusterProvider kind, this not only led to the deployment not being properly deleted, but also left a HelmRelease resource stuck in deletion (because the target cluster was no longer reachable), which had to be cleaned up manually.
The solution to this problem is that ClusterProviders must not start any deletion operations until the Cluster resource has only their own finalizer left and no 'foreign' finalizers remain.
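The check described above could be sketched roughly as follows. This is only an illustration of the idea, not the ClusterProviders' actual code: the function names `foreignFinalizers` and `readyForDeletion` are hypothetical, and the finalizer list is passed in as a plain string slice rather than read from a real Cluster object.

```go
package main

// foreignFinalizers returns every finalizer in the list that is not the
// provider's own. (Hypothetical helper, named for this sketch only.)
func foreignFinalizers(finalizers []string, own string) []string {
	var foreign []string
	for _, f := range finalizers {
		if f != own {
			foreign = append(foreign, f)
		}
	}
	return foreign
}

// readyForDeletion reports whether a provider may start tearing down the
// cluster: deletion must have been requested (deletion timestamp set) and
// no finalizer other than the provider's own may remain.
func readyForDeletion(deletionRequested bool, finalizers []string, own string) bool {
	return deletionRequested && len(foreignFinalizers(finalizers, own)) == 0
}
```

In a reconcile loop, a provider would presumably return early and check again later (for example on the next update of the Cluster resource) while `readyForDeletion` is false, and only then delete the cluster and remove its own finalizer.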
Done Criteria
- Implemented for ClusterProvider Gardener
- Implemented for ClusterProvider kind