Race condition when deleting vault resources #117
Description
What happened?
We have the following setup:
- Crossplane v2.0.2
- Vault Provider v3.0.3
We configure a single vault.m.upbound.io/v1beta1 ClusterProviderConfig for creating Vault resources. For multi-tenancy, we create a namespace for each project that needs Vault resources, but they all use the same ClusterProviderConfig. Each namespace has a finalizer, which is the trigger to delete the Vault resources associated with that namespace/tenant. We built a controller that listens for namespace deletion so that the Vault resources are cleaned up when the namespace is deleted.
We seem to be encountering some sort of race condition.
When a namespace is deleted, our controller attempts to delete all the vault resources that had been created for that tenant. Most of the time this works, but sometimes, we encounter a resource that refuses to be deleted.
This is the state of an example resource (as noted above, it can be any Vault resource: sometimes it's a Policy, sometimes a SecretV2, etc.):
apiVersion: vault.vault.m.upbound.io/v1alpha1
kind: Policy
metadata:
  annotations:
    crossplane.io/composition-resource-name: vaultReadPolicy
    crossplane.io/external-create-pending: 2026-01-07T16:08:20Z
    crossplane.io/external-create-succeeded: 2026-01-07T16:08:20Z
    crossplane.io/external-name: test-service-read
  creationTimestamp: 2026-01-07T16:08:20Z
  deletionGracePeriodSeconds: 0
  deletionTimestamp: 2026-01-07T16:14:26Z
  finalizers:
    - finalizer.managedresource.crossplane.io
  generation: 3
  labels:
    app.kubernetes.io/component: comp
    app.kubernetes.io/name: kname
    crossplane.io/composite: service-read-1
  name: test-service-read
  namespace: test-service
  ownerReferences:
    - apiVersion: v1alpha1
      blockOwnerDeletion: true
      controller: true
      kind: XReadCredentials
      name: test-service-read-credentials
      uid: 6cb05f42-e705-42f5-9168-551de66ccd4f
  resourceVersion: "107969487"
  uid: 16e84b10-a4df-4336-b1e5-1187f688e0e0
spec:
  forProvider:
    name: test-service-read
    namespace: test-service
    policy: <REDACTED>
  initProvider: {}
  managementPolicies:
    - "*"
  providerConfigRef:
    kind: ClusterProviderConfig
    name: vault-provider-config
status:
  atProvider:
    id: test-service-read
    name: test-service-read
    namespace: test-service
    policy: ""
  conditions:
    - lastTransitionTime: 2026-01-07T16:14:33Z
      observedGeneration: 3
      reason: Deleting
      status: "False"
      type: Ready
    - lastTransitionTime: 2026-01-07T16:14:49Z
      message: 'connect failed: cannot initialize the Terraform plugin SDK async
        external client: cannot get terraform setup: cannot track ProviderConfig
        usage: cannot apply ProviderConfigUsage: cannot create object:
        providerconfigusages.vault.m.upbound.io
        "16e84b10-a4df-4336-b1e5-1187f688e0e0" is forbidden: unable to create
        new content in namespace test-service because it is being
        terminated'
      observedGeneration: 3
      reason: ReconcileError
      status: "False"
      type: Synced
    - lastTransitionTime: 2026-01-07T16:08:21Z
      reason: Success
      status: "True"
      type: LastAsyncOperation
The only way through is to manually remove the finalizer from the resource, which is obviously not ideal. It's not clear (to us) whether the message points at the cause: the provider tries to create a ProviderConfigUsage in a namespace that is being deleted. To be clear, the namespace deletion IS the event that triggers the deletion of the resource, so there might be a chicken-and-egg issue?
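For anyone trying to spot the same state, here is a rough sketch of how the stuck condition shown above could be detected. It assumes the resource has already been loaded as a dict (for example, from the JSON output of kubectl). The function name is_stuck_on_terminating_namespace and the trimmed example dict are hypothetical and are not part of Crossplane or the provider; the matched strings are copied from the status message above.

```python
# Hypothetical diagnostic helper, not part of Crossplane or the Vault provider.
# It only inspects fields that appear in the resource status shown above.

FINALIZER = "finalizer.managedresource.crossplane.io"


def is_stuck_on_terminating_namespace(resource: dict) -> bool:
    """Return True if a managed resource looks stuck in the state described above."""
    meta = resource.get("metadata", {})
    # The resource must already be marked for deletion...
    if not meta.get("deletionTimestamp"):
        return False
    # ...and still be held by the managed-resource finalizer.
    if FINALIZER not in meta.get("finalizers", []):
        return False
    for cond in resource.get("status", {}).get("conditions", []):
        # Collapse whitespace so a line-wrapped message still matches.
        message = " ".join(cond.get("message", "").split())
        if (
            cond.get("type") == "Synced"
            and cond.get("status") == "False"
            and cond.get("reason") == "ReconcileError"
            and "cannot track ProviderConfig usage" in message
            and "because it is being terminated" in message
        ):
            return True
    return False


# Trimmed-down version of the resource above, for illustration only.
example = {
    "metadata": {
        "deletionTimestamp": "2026-01-07T16:14:26Z",
        "finalizers": [FINALIZER],
    },
    "status": {
        "conditions": [
            {
                "type": "Synced",
                "status": "False",
                "reason": "ReconcileError",
                "message": "connect failed: cannot track ProviderConfig usage: "
                "unable to create new content in namespace test-service "
                "because it is being\nterminated",
            }
        ]
    },
}

print(is_stuck_on_terminating_namespace(example))
```

This only identifies affected resources; it does not remove the finalizer or otherwise work around the problem.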
How can we reproduce it?
Setup is provided above
What environment did it happen in?
Our QA env
- Crossplane v2.0.2
- Vault Provider v3.0.3