Volume Condition Reporter

The Volume Condition Reporter uses the NodeGetVolumeStats operation of the Container Storage Interface (CSI) Specification to detect whether a PersistentVolume has an abnormal condition. CSI drivers can return the condition of a volume in the NodeGetVolumeStatsResponse message.
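As an illustration of how the condition in such a response can be interpreted, here is a minimal Python sketch. The two dataclasses are simplified stand-ins for the CSI messages (only the `abnormal` and `message` fields of VolumeCondition are modelled), and `describe_condition` is a hypothetical helper, not part of the CSI-Addons sidecar.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VolumeCondition:
    """Simplified stand-in for the CSI VolumeCondition message."""
    abnormal: bool
    message: str


@dataclass
class NodeGetVolumeStatsResponse:
    """Simplified stand-in for the CSI response; usage statistics are omitted."""
    volume_condition: Optional[VolumeCondition] = None


def describe_condition(response: NodeGetVolumeStatsResponse) -> str:
    """Summarise the volume condition carried in a response (hypothetical helper)."""
    condition = response.volume_condition
    if condition is None:
        # The driver did not report a condition, so nothing can be concluded.
        return "unknown"
    if condition.abnormal:
        return f"abnormal: {condition.message}"
    return f"healthy: {condition.message}"
```

A response without a condition is treated as "unknown" rather than healthy, because a driver that does not implement the feature gives no information either way.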

Usage

The Volume Condition Reporter is disabled by default. Passing the --enable-volume-condition option to the CSI-Addons sidecar starts the Volume Condition Reporter.

Abnormal Volume Condition reporting

Once enabled, both healthy and abnormal volume conditions are reported in the logs of the CSI-Addons sidecar and as an Event for the PersistentVolumeClaim.

Users will see the Event in their Namespace, and also when they describe (with kubectl describe ...) the PersistentVolumeClaim.
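The mapping from a volume condition to an Event on the PersistentVolumeClaim could look roughly like the following Python sketch. It assumes that abnormal conditions become "Warning" events and healthy ones "Normal" events (the two Kubernetes event types). The `PVCEvent` type, the `build_event` helper, and the reason strings are hypothetical and are not taken from the sidecar.

```python
from dataclasses import dataclass


@dataclass
class PVCEvent:
    """Hypothetical, simplified view of an Event attached to a PersistentVolumeClaim."""
    namespace: str
    claim_name: str
    event_type: str  # "Normal" or "Warning"
    reason: str      # hypothetical reason string
    message: str


def build_event(namespace: str, claim_name: str, abnormal: bool, message: str) -> PVCEvent:
    """Translate a reported volume condition into an Event for the claim."""
    if abnormal:
        # Assumption: abnormal conditions are surfaced as warnings.
        return PVCEvent(namespace, claim_name, "Warning", "VolumeConditionAbnormal", message)
    return PVCEvent(namespace, claim_name, "Normal", "VolumeConditionHealthy", message)
```

Because the Event is created in the Namespace of the PersistentVolumeClaim, it is visible to the user who owns the claim without requiring access to cluster-scoped objects.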

Future Enhancements

Additional options for reporting include:

  • include the volume condition in the metrics (similar to KEP-4132)

  • generate an event for one or more of

    1. the PersistentVolume
    2. the Pod that uses the PersistentVolumeClaim
    3. the Node where the volume condition is abnormal
  • annotate one or more of

    1. the PersistentVolume
    2. the PersistentVolumeClaim
    3. the Pod that uses the PersistentVolumeClaim
    4. the Node where the volume condition is abnormal

      (unlikely to be acceptable, because it requires permissions on the Node object)

Potential Consumers of Abnormal Volume Condition check results

More feedback on the reporting and recovery steps is needed, but several existing projects could potentially consume the reported volume condition:

  • Rook is a Kubernetes Operator that is able to Network Fence a worker node where a Ceph volume is unhealthy.

  • Node Problem Detector provides a generic interface for reporting problems on a node. A project like medik8s can remedy node problems once they are reported.

Dependencies

The NodeGetVolumeStats operation in the current CSI Specification (v1.8.0) defines the VolumeCondition as an alpha feature. Very few CSI drivers seem to implement the volume condition at the moment. Drivers that implement the feature are required to expose VOLUME_CONDITION as a NodeServiceCapability; otherwise the Volume Condition Reporter will not be able to check the condition of the volume.
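The capability check described above amounts to looking for VOLUME_CONDITION among the capabilities that the driver advertises. A minimal Python sketch, assuming the capabilities have already been collected as a list of their names (the `supports_volume_condition` helper is hypothetical):

```python
from typing import Iterable

# Name of the capability that a CSI driver must advertise.
VOLUME_CONDITION = "VOLUME_CONDITION"


def supports_volume_condition(capabilities: Iterable[str]) -> bool:
    """Return True if the advertised node capabilities include VOLUME_CONDITION."""
    return VOLUME_CONDITION in set(capabilities)
```

For example, a driver that advertises only `GET_VOLUME_STATS` and `EXPAND_VOLUME` can return usage statistics, but the Volume Condition Reporter cannot check the condition of its volumes.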

Required Permissions (RBAC)

When a Kubernetes cluster enforces Role-Based Access Control (RBAC), as OpenShift does, the CSI-Addons sidecar requires extra permissions to check and report the volume condition.

---
# permissions for csi-addons sidecar to create events.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: csiaddons-events-editor-role
rules:
  - apiGroups:
      - ""
    resources:
      - events
    verbs:
      - create
      - delete
      - get
      - list
      - patch
      - update
      - watch
  - apiGroups:
      - ""
    resources:
      - persistentvolumes
      - persistentvolumeclaims
    verbs:
      - get