
Intermittent csi-driver-spiffe failure: Unable to mount cert #42

@warrior-abhijit


We are seeing intermittent failures where csi-driver-spiffe does not mount the certificate into a pod.
The problem appears to occur more frequently after the csi-driver-spiffe pod restarts.
After a restart, the driver seems to lose track of which pods' certificates need to be renewed.
Manually restarting the affected pod results in new certificates being mounted correctly.

We observed the following error messages in the csi-driver-spiffe log:

csi/manager "msg"="Failed to issue certificate, retrying after applying exponential backoff" "error"="waiting for request: certificaterequest.cert-manager.io \"xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx\" not found" "volume_id"="csi-xxxxxxxx"
.......
csi/driver "msg"="failed processing request" "error"="timed out waiting for the condition" "request"={} "rpc_method"="/csi.v1.Node/NodePublishVolume"
.......

We have already reviewed a previously closed issue (cert-manager/csi-driver#78) and updated the CSI data directory, but this did not resolve the problem.
We are looking for workarounds; one option we are considering is a liveness probe.
We would appreciate guidance on how to diagnose and resolve this issue, and we can provide any additional information that would help.
