Description
Note: Reposted here for better visibility.
Summary
SwarmKit’s CSI implementation currently uses a flat StagingTargetPath of the form /data/staged/<volume-id> when issuing NodeStageVolume requests. While this complies with the CSI spec, it introduces a compatibility issue with CSI plugins like [Ceph-CSI](https://github.com/ceph/ceph-csi), which interpret StagingTargetPath as a parent path and append their own subdirectory (usually based on their internal volume ID) to it.
This mismatch causes plugins like Ceph-CSI to reject requests with errors such as:
rpc error: code = Internal desc = staging path /data/staged/<vol-id> does not exist on node
The CSI spec states:
The CO MUST ensure that the parent directory of staging_target_path exists prior to issuing the call. (spec link)
However, in this context, the plugin treats the provided staging_target_path as a parent, not a final target, so the CO and the plugin each fail to meet the other's expectations.
What Actually Happens
- SwarmKit provides /data/staged/<vol-id> as StagingTargetPath (and enforces /data/staged as the parent path).
- Ceph-CSI reinterprets this StagingTargetPath as a StagingParentPath, appending its own volume identifier as a subdirectory.
- As a result, the plugin expects the CO to have created /data/staged/<vol-id>, which the CO considers to be the full staging target.
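The mismatch above can be reproduced in a few lines of Go. This is a minimal sketch, not code from SwarmKit or Ceph-CSI: the helper names (coStagingTargetPath, pluginStagePath, checkStagingPath) and the sample volume IDs are hypothetical, and the plugin-side check is only modeled on the error message quoted earlier.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// coStagingTargetPath models the flat layout used by the CO:
// <staging root>/<volume-id>. Hypothetical helper, not SwarmKit code.
func coStagingTargetPath(stagingRoot, volumeID string) string {
	return filepath.Join(stagingRoot, volumeID)
}

// pluginStagePath models a plugin that treats the CO-provided path as a
// parent and stages into a subdirectory named after its own volume ID.
func pluginStagePath(stagingTargetPath, pluginVolumeID string) string {
	return filepath.Join(stagingTargetPath, pluginVolumeID)
}

// checkStagingPath models the plugin-side validation: the CO-provided
// path must already exist as a directory on the node.
func checkStagingPath(stagingTargetPath string) error {
	info, err := os.Stat(stagingTargetPath)
	if errors.Is(err, os.ErrNotExist) {
		return fmt.Errorf("staging path %s does not exist on node", stagingTargetPath)
	}
	if err != nil {
		return fmt.Errorf("checking staging path %s: %w", stagingTargetPath, err)
	}
	if !info.IsDir() {
		return fmt.Errorf("staging path %s is not a directory", stagingTargetPath)
	}
	return nil
}

func main() {
	// A temporary directory stands in for /data/staged.
	root, err := os.MkdirTemp("", "staged")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(root)

	target := coStagingTargetPath(root, "vol-1")
	fmt.Println("CO target:     ", target)
	fmt.Println("plugin stages: ", pluginStagePath(target, "plugin-vol-id"))
	// Only the parent (root) exists, so the modeled plugin check fails.
	fmt.Println("plugin check:  ", checkStagingPath(target))
}
```

In this sketch the CO has created only the parent directory, so the modeled check returns an error for the path that the CO regards as the final staging target.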
Why This Is a Problem
This behavior leads to:
- Failure in volume staging due to the plugin rejecting the staging path.
- Confusing debug sessions, because neither side strictly violates the spec; each simply interprets it differently.
- Difficulty packaging existing CSI plugins for SwarmKit (e.g., cephcsi) without patching them to bypass the assumption.
Suggested Fix / Proposal
Following the approach taken by, for example, the kubelet in Kubernetes, SwarmKit could ensure that the full, exact StagingTargetPath it passes to the plugin is created before the call is issued. This would resolve the plugin-side validation issue and increase compatibility with CSI plugins that follow Kubernetes conventions.
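A minimal sketch of that proposal, assuming the CO builds the path as <staging root>/<volume-id>: create the complete directory with os.MkdirAll before sending NodeStageVolume. The function names ensureStagingTarget and isDir and the 0o750 mode are assumptions made for illustration; this is not SwarmKit's actual code.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// ensureStagingTarget creates the full staging target (not only its parent)
// and returns the path that would be sent as StagingTargetPath.
// Hypothetical helper; the 0o750 mode is an assumption.
func ensureStagingTarget(stagingRoot, volumeID string) (string, error) {
	target := filepath.Join(stagingRoot, volumeID)
	// MkdirAll succeeds if the directory already exists, so a retried
	// staging attempt does not fail here.
	if err := os.MkdirAll(target, 0o750); err != nil {
		return "", fmt.Errorf("creating staging target %s: %w", target, err)
	}
	return target, nil
}

// isDir reports whether path exists and is a directory, which is what a
// plugin-side existence check would look for.
func isDir(path string) bool {
	info, err := os.Stat(path)
	return err == nil && info.IsDir()
}

func main() {
	// A temporary directory stands in for /data/staged.
	root, err := os.MkdirTemp("", "staged")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(root)

	target, err := ensureStagingTarget(root, "vol-1")
	if err != nil {
		panic(err)
	}
	fmt.Println("staging target:", target)
	fmt.Println("exists before NodeStageVolume:", isDir(target))
}
```

With the directory in place, a plugin that validates the existence of StagingTargetPath and then appends its own subdirectory would find the path it expects.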
Alternatively, SwarmKit could:
- Add a VolumeContext key that hints the plugin not to append further directory levels.
- Document this behavior explicitly for plugin authors.
Environment
- Docker version: 26.0.2
- Plugin: cephcsi/rbd, configured as a managed plugin
Temporary Workaround
I've tested a patched Ceph-CSI whose ValidateNodeStageVolumeRequest conditionally calls mkdirAll() on the passed StagingTargetPath when not running on Kubernetes. While this solves the issue, it’s a hack that should not be required.
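The idea behind the workaround can be sketched roughly as follows. This is not the actual patch: validateStagingPath is a hypothetical stand-in for the staging-path part of the validation, and detecting Kubernetes through the KUBERNETES_SERVICE_HOST environment variable (which Kubernetes sets inside pods) is an assumed heuristic.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// runningOnKubernetes is an assumed heuristic: Kubernetes injects
// KUBERNETES_SERVICE_HOST into pods, so an empty value suggests another CO.
func runningOnKubernetes() bool {
	return os.Getenv("KUBERNETES_SERVICE_HOST") != ""
}

// validateStagingPath is a hypothetical stand-in for the staging-path check.
// Outside Kubernetes it creates the path instead of rejecting the request.
func validateStagingPath(stagingTargetPath string, onKubernetes bool) error {
	if stagingTargetPath == "" {
		return errors.New("staging target path must not be empty")
	}
	if !onKubernetes {
		// Workaround: the CO did not create the full path, so create it here.
		if err := os.MkdirAll(stagingTargetPath, 0o750); err != nil {
			return fmt.Errorf("creating staging path %s: %w", stagingTargetPath, err)
		}
	}
	info, err := os.Stat(stagingTargetPath)
	if errors.Is(err, os.ErrNotExist) {
		return fmt.Errorf("staging path %s does not exist on node", stagingTargetPath)
	}
	if err != nil {
		return fmt.Errorf("checking staging path %s: %w", stagingTargetPath, err)
	}
	if !info.IsDir() {
		return fmt.Errorf("staging path %s is not a directory", stagingTargetPath)
	}
	return nil
}

func main() {
	// A temporary directory stands in for /data/staged.
	root, err := os.MkdirTemp("", "staged")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(root)

	path := filepath.Join(root, "vol-1")
	fmt.Println("detected Kubernetes:", runningOnKubernetes())
	// Strict mode first: the path is missing, so validation fails.
	fmt.Println("strict check:   ", validateStagingPath(path, true))
	// Relaxed mode creates the path, so validation passes.
	fmt.Println("relaxed check:  ", validateStagingPath(path, false))
}
```

A real node plugin would pass runningOnKubernetes() (or an explicit configuration flag) as the second argument; the boolean parameter is used here only to keep the sketch easy to exercise.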
Context: ceph/cephcsi issues/3696