Skip to content

VolumeFailedDelete events #1177

@tcldr

Description

@tcldr

TL;DR

CSI controller is generating frequent 'VolumeFailedDelete' warning events with the message 'persistentvolume pvc- is still attached to node '

Expected behavior

Persistent volumes attach/detach smoothly

Observed behavior

When attaching/detaching persistent volumes, particularly for high-churn use cases both the CSI controller and PV dependent pods are producing lots of warnings and adding latency into the cluster.

Minimal working example

Try setting up ARC (Actions Runner Controller for GitHub actions) in K8s mode and triggering a few workflows:

---
apiVersion: v1
kind: Namespace
metadata:
  name: arc-runners
---
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: arc-runner-scale-set
  namespace: arc-runners
spec:
  interval: 30m
  chart:
    spec:
      chart: gha-runner-scale-set
      sourceRef:
        kind: HelmRepository
        name: arc
        namespace: arc-systems
      interval: 12h
      version: 0.13.0
  values:
    maxRunners: 2
    minRunners: 1
    githubConfigUrl: https://github.com/<YOUR REPO>
    githubConfigSecret: <YOUR REPO SECRET>
    runnerScaleSetName: self-hosted-runner
    containerMode:
      type: kubernetes
      kubernetesModeWorkVolumeClaim:
        accessModes: ["ReadWriteOnce"]
        storageClassName: hcloud-volumes
        resources:
          requests:
            storage: 1Gi
    template:
      spec:
        securityContext:
          fsGroup: 1001
        containers:
        - name: runner
          image: ghcr.io/actions/actions-runner:latest
          command: ["/home/runner/run.sh"]

Log output

apiVersion: v1
kind: Event
metadata:
  name: pvc-efa81f1e-869e-49dc-99f4-9390cd00758a.187a66094e8c405a
  namespace: default
  uid: 323dca32-2ebb-4f30-9877-a2ebf07dff8b
  resourceVersion: '55312424'
  creationTimestamp: '2025-11-22T17:59:54Z'
  selfLink: >-
    /api/v1/namespaces/default/events/pvc-efa81f1e-869e-49dc-99f4-9390cd00758a.187a66094e8c405a
spec: {}
involvedObject:
  kind: PersistentVolume
  name: pvc-efa81f1e-869e-49dc-99f4-9390cd00758a
  uid: ad3c9a95-0843-4e7f-a6c6-3a80a2255d4a
  apiVersion: v1
  resourceVersion: '55312391'
reason: VolumeFailedDelete
message: >-
  persistentvolume pvc-efa81f1e-869e-49dc-99f4-9390cd00758a is still attached to
  node cloud-02
source:
  component: >-
    csi.hetzner.cloud_hcloud-csi-controller-567f89966f-4x2w5_2950c7a3-f602-4696-95f9-d014acc08356
firstTimestamp: '2025-11-22T17:59:54Z'
lastTimestamp: '2025-11-22T18:00:01Z'
count: 4
type: Warning
eventTime: null
reportingComponent: >-
  csi.hetzner.cloud_hcloud-csi-controller-567f89966f-4x2w5_2950c7a3-f602-4696-95f9-d014acc08356
reportingInstance: ''
---
apiVersion: v1
kind: Event
metadata:
  name: my-cloud-tp9sg-runner-hhjpx.187a65295d0af25f
  namespace: arc-runners
  uid: e11e7370-37cb-4860-886c-8180d3616367
  resourceVersion: '55308459'
  creationTimestamp: '2025-11-22T17:43:52Z'
  selfLink: >-
    /api/v1/namespaces/arc-runners/events/my-cloud-tp9sg-runner-hhjpx.187a65295d0af25f
spec: {}
involvedObject:
  kind: Pod
  namespace: arc-runners
  name: my-cloud-tp9sg-runner-hhjpx
  uid: 9a871964-3534-4f2f-b9c2-97668b05fdf5
  apiVersion: v1
  resourceVersion: '55308341'
reason: FailedMount
message: >-
  MountVolume.SetUp failed for volume "pvc-efa81f1e-869e-49dc-99f4-9390cd00758a"
  : rpc error: code = Internal desc = failed to publish volume: format of disk
  "/dev/disk/by-id/scsi-0HC_Volume_104028078" failed: type:("ext4")
  target:("/var/lib/kubelet/pods/9a871964-3534-4f2f-b9c2-97668b05fdf5/volumes/kubernetes.io~csi/pvc-efa81f1e-869e-49dc-99f4-9390cd00758a/mount")
  options:("defaults") errcode:(exit status 1) output:(mke2fs 1.47.1
  (20-May-2024)

  The file /dev/disk/by-id/scsi-0HC_Volume_104028078 does not exist and no size
  was specified.

  ) 
source:
  component: kubelet
  host: cloud-02
firstTimestamp: '2025-11-22T17:43:52Z'
lastTimestamp: '2025-11-22T17:43:56Z'
count: 4
type: Warning
eventTime: null
reportingComponent: kubelet
reportingInstance: cloud-02
---
apiVersion: v1
kind: Event
metadata:
  name: my-cloud-tp9sg-runner-hhjpx-work.187a652518b33d76
  namespace: arc-runners
  uid: c05af831-89e3-4380-91cf-4f7ac3718f9f
  resourceVersion: '55308317'
  creationTimestamp: '2025-11-22T17:43:34Z'
  selfLink: >-
    /api/v1/namespaces/arc-runners/events/my-cloud-tp9sg-runner-hhjpx-work.187a652518b33d76
spec: {}
involvedObject:
  kind: PersistentVolumeClaim
  namespace: arc-runners
  name: my-cloud-tp9sg-runner-hhjpx-work
  uid: efa81f1e-869e-49dc-99f4-9390cd00758a
  apiVersion: v1
  resourceVersion: '55308263'
reason: ProvisioningFailed
message: >-
  failed to provision volume with StorageClass "hcloud-volumes": rpc error: code
  = DeadlineExceeded desc = context deadline exceeded
source:
  component: >-
    csi.hetzner.cloud_hcloud-csi-controller-567f89966f-4x2w5_2950c7a3-f602-4696-95f9-d014acc08356
firstTimestamp: '2025-11-22T17:43:34Z'
lastTimestamp: '2025-11-22T17:43:34Z'
count: 1
type: Warning
eventTime: null
reportingComponent: >-
  csi.hetzner.cloud_hcloud-csi-controller-567f89966f-4x2w5_2950c7a3-f602-4696-95f9-d014acc08356
reportingInstance: ''
---
apiVersion: v1
kind: Event
metadata:
  name: my-cloud-tp9sg-runner-r7728.187a65228e26c5c3
  namespace: arc-runners
  uid: 6696bc08-f201-433e-af85-6207aa913412
  resourceVersion: '55308253'
  creationTimestamp: '2025-11-22T17:43:23Z'
  selfLink: >-
    /api/v1/namespaces/arc-runners/events/my-cloud-tp9sg-runner-r7728.187a65228e26c5c3
spec: {}
involvedObject:
  kind: Pod
  namespace: arc-runners
  name: my-cloud-tp9sg-runner-r7728
  uid: 67965bc8-169f-4229-b15d-dd065a687968
  apiVersion: v1
  resourceVersion: '55308196'
reason: FailedScheduling
message: >-
  running PreBind plugin "VolumeBinding": binding volumes: pod does not exist
  any more: pod "my-cloud-tp9sg-runner-r7728" not found
source:
  component: default-scheduler
firstTimestamp: '2025-11-22T17:43:23Z'
lastTimestamp: '2025-11-22T17:43:23Z'
count: 1
type: Warning
eventTime: null
reportingComponent: default-scheduler
reportingInstance: ''
---
apiVersion: v1
kind: Event
metadata:
  name: my-cloud-tp9sg-runner-hhjpx.187a6522647ba8dc
  namespace: arc-runners
  uid: 2b1b99f7-5e0c-4032-80e3-0115f50b1c13
  resourceVersion: '55308215'
  creationTimestamp: '2025-11-22T17:43:22Z'
  selfLink: >-
    /api/v1/namespaces/arc-runners/events/my-cloud-tp9sg-runner-hhjpx.187a6522647ba8dc
spec: {}
involvedObject:
  kind: Pod
  namespace: arc-runners
  name: my-cloud-tp9sg-runner-hhjpx
  uid: 9a871964-3534-4f2f-b9c2-97668b05fdf5
  apiVersion: v1
  resourceVersion: '55308214'
reason: FailedScheduling
message: >-
  0/3 nodes are available: waiting for ephemeral volume controller to create the
  persistentvolumeclaim "my-cloud-tp9sg-runner-hhjpx-work". preemption: 0/3
  nodes are available: 3 Preemption is not helpful for scheduling.
source:
  component: default-scheduler
firstTimestamp: '2025-11-22T17:43:22Z'
lastTimestamp: '2025-11-22T17:43:22Z'
count: 1
type: Warning
eventTime: null
reportingComponent: default-scheduler
reportingInstance: ''
---
apiVersion: v1
kind: Event
metadata:
  name: pvc-a4919e2d-32ac-4aed-9f3d-a9c189e5e60c.187a650de6d88b55
  namespace: default
  uid: 3be65217-10dd-4c8b-aaa0-faf4155c4174
  resourceVersion: '55307693'
  creationTimestamp: '2025-11-22T17:41:54Z'
  selfLink: >-
    /api/v1/namespaces/default/events/pvc-a4919e2d-32ac-4aed-9f3d-a9c189e5e60c.187a650de6d88b55
spec: {}
involvedObject:
  kind: PersistentVolume
  name: pvc-a4919e2d-32ac-4aed-9f3d-a9c189e5e60c
  uid: 5e2fd3f0-f078-4586-98d1-9318c94fda83
  apiVersion: v1
  resourceVersion: '55307658'
reason: VolumeFailedDelete
message: >-
  persistentvolume pvc-a4919e2d-32ac-4aed-9f3d-a9c189e5e60c is still attached to
  node cloud-02
source:
  component: >-
    csi.hetzner.cloud_hcloud-csi-controller-567f89966f-4x2w5_2950c7a3-f602-4696-95f9-d014acc08356
firstTimestamp: '2025-11-22T17:41:54Z'
lastTimestamp: '2025-11-22T17:42:01Z'
count: 4
type: Warning
eventTime: null
reportingComponent: >-
  csi.hetzner.cloud_hcloud-csi-controller-567f89966f-4x2w5_2950c7a3-f602-4696-95f9-d014acc08356
reportingInstance: ''
---
apiVersion: v1
kind: Event
metadata:
  name: my-cloud-jf8z8-runner-dt6kp-work.187a650834cdd775
  namespace: arc-runners
  uid: 23fd02b0-1b22-428a-a57a-ef8468c47ec0
  resourceVersion: '55307476'
  creationTimestamp: '2025-11-22T17:41:29Z'
  selfLink: >-
    /api/v1/namespaces/arc-runners/events/my-cloud-jf8z8-runner-dt6kp-work.187a650834cdd775
spec: {}
involvedObject:
  kind: PersistentVolumeClaim
  namespace: arc-runners
  name: my-cloud-jf8z8-runner-dt6kp-work
  uid: bf39e236-c4aa-452d-873b-7539accc100f
  apiVersion: v1
  resourceVersion: '55307423'
reason: ProvisioningFailed
message: >-
  failed to provision volume with StorageClass "hcloud-volumes": rpc error: code
  = DeadlineExceeded desc = context deadline exceeded
source:
  component: >-
    csi.hetzner.cloud_hcloud-csi-controller-567f89966f-4x2w5_2950c7a3-f602-4696-95f9-d014acc08356
firstTimestamp: '2025-11-22T17:41:29Z'
lastTimestamp: '2025-11-22T17:41:29Z'
count: 1
type: Warning
eventTime: null
reportingComponent: >-
  csi.hetzner.cloud_hcloud-csi-controller-567f89966f-4x2w5_2950c7a3-f602-4696-95f9-d014acc08356
reportingInstance: ''

Additional information

No response

Metadata

Metadata

Assignees

Labels

bugSomething isn't working

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions