Update document about error handling #5462

Open: wants to merge 2 commits into base: master
42 changes: 42 additions & 0 deletions keps/sig-storage/3751-volume-attributes-class/README.md
@@ -35,6 +35,11 @@
- [Delete PVC](#delete-pvc)
- [Modify PVC](#modify-pvc)
- [Implementation & Handling Failure](#implementation--handling-failure)
- [Handling of non-final errors](#handling-of-non-final-errors)
- [Handling of final errors](#handling-of-final-errors)
- [Transition from VAC(A) to VAC(B)](#transition-from-vaca-to-vacb)
- [Transition from nil-VAC to VAC(A)](#transition-from-nil-vac-to-vaca)
- [Handling of infeasible errors](#handling-of-infeasible-errors)
- [Test Plan](#test-plan)
- [Prerequisite testing updates](#prerequisite-testing-updates)
- [Unit tests](#unit-tests)
@@ -690,6 +695,43 @@ ModifyVolume is only allowed on bound PVCs. Under the ModifyVolume call, it will
### Implementation & Handling Failure

VolumeAttributesClass parameters can be considered best-effort parameters: the CSI driver should report bad parameters as `INVALID_ARGUMENT`, and the volume would fall back to a workable default configuration.
The CSI driver is expected not to apply parameters partially if one or more parameters are invalid. We are proposing a CSI spec change to tighten the wording for this: https://github.com/container-storage-interface/spec/pull/597

In general, Kubernetes sidecars classify all CSI errors into three classes:

- Non-final errors (such as `DeadlineExceeded`), which indicate a transient error, such as a timeout or some other temporary failure. The CSI driver may already have a volume modification in progress.
- Final errors (such as `Internal`), which indicate a definitive error from the CSI driver. This typically means the CSI driver is no longer processing the request once the error is returned.
- Infeasible errors (such as `InvalidArgument`), a subset of final errors, which indicate that the request itself is invalid and will never succeed.
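
The three classes above can be sketched as a small lookup. This is only an illustration, not the sidecar's actual code: the names `ErrorClass`, `classify` and `is_final` are hypothetical, the code sets contain only the codes named in this section, and treating every other code as a plain final error is an assumption.

```python
from enum import Enum


class ErrorClass(Enum):
    NON_FINAL = "non-final"
    FINAL = "final"
    INFEASIBLE = "infeasible"


# Only the codes named in the text are listed; a real sidecar covers more.
NON_FINAL_CODES = {"DeadlineExceeded"}
INFEASIBLE_CODES = {"InvalidArgument"}


def classify(code: str) -> ErrorClass:
    """Map a gRPC status code name to one of the three error classes."""
    if code in NON_FINAL_CODES:
        return ErrorClass.NON_FINAL
    if code in INFEASIBLE_CODES:
        return ErrorClass.INFEASIBLE
    # Assumption: any code not listed above is treated as a plain final error.
    return ErrorClass.FINAL


def is_final(code: str) -> bool:
    """Infeasible errors are a subset of final errors, so both count as final."""
    return classify(code) is not ErrorClass.NON_FINAL
```

Keeping `INFEASIBLE` as its own value while `is_final` returns true for it mirrors the "subset of final errors" wording.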

#### Handling of non-final errors

In general, `external-resizer` will not attempt modification to a new VAC if modification to the previously applied VAC is failing with a non-final error.

This policy safeguards against potential quota abuse that can occur if users time their requests strategically.
`external-resizer` will permit a transition to a new VAC only if the transition to the previous VAC has succeeded or failed with a final error. This is one of the main reasons the `targetVolumeAttributesClassName` field is required in the PVC's status.

In other words, for non-final errors `external-resizer` will keep working towards `targetVolumeAttributesClassName` regardless of any user-specified change in `.spec.volumeAttributesClassName`.
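
A minimal sketch of this pinning rule follows, assuming the VAC names are plain strings and that the caller already knows whether the last error was non-final. The function name `effective_target` and its parameters are hypothetical. The sketch models only the non-final rule; the additional rules for final and infeasible errors are not covered here.

```python
from typing import Optional


def effective_target(
    spec_vac: Optional[str],
    target_vac: Optional[str],
    last_error_non_final: bool,
) -> Optional[str]:
    """Return the VAC the resizer works toward under the non-final rule.

    spec_vac:   value the user currently has in the PVC spec
    target_vac: value recorded in the PVC status as the in-flight target
    """
    if target_vac is not None and last_error_non_final:
        # The driver may still be applying target_vac, so ignore spec edits.
        return target_vac
    # No pinned target under this rule: follow the user's spec.
    return spec_vac
```

For example, `effective_target("silver", "gold", True)` keeps returning `"gold"` until the modification to `"gold"` stops failing with a non-final error.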

#### Handling of final errors

##### Transition from VAC(A) to VAC(B)

If volume modification from VAC(A) to VAC(B) is failing with a final error and the user wishes to either cancel or move to a different VAC, they MUST first set the VAC of the PVC back to A. Only after the transition to the original VAC(A) succeeds is the user allowed to move to a different VAC.

##### Transition from nil-VAC to VAC(A)

If volume modification to VAC(A) is failing with a final but non-infeasible error, `external-resizer` will keep trying to reconcile to VAC(A), regardless of any user-initiated changes in `.spec.volumeAttributesClassName`. Only after the transition to VAC(A) succeeds is the user allowed to move the PVC to a different VAC.
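
The two final-error cases can be put side by side in a short sketch. The function name and parameters are hypothetical, VAC names are plain strings, and the last branch (the user picks some third VAC without rolling back first) is an assumption, since the text only says what the user must do in that situation.

```python
from typing import Optional


def target_after_final_error(
    current_vac: Optional[str],
    target_vac: str,
    spec_vac: Optional[str],
) -> Optional[str]:
    """VAC the resizer works toward after a final, non-infeasible error.

    current_vac: VAC last applied successfully (None for a nil-VAC PVC)
    target_vac:  VAC whose application failed
    spec_vac:    VAC the user currently has in the PVC spec
    """
    if current_vac is None:
        # nil-VAC -> VAC(A): keep reconciling to A, ignoring spec edits.
        return target_vac
    if spec_vac == current_vac:
        # VAC(A) -> VAC(B): the user rolled back to A; A must succeed first.
        return current_vac
    # Assumption: any other spec value is not acted on until the rollback.
    return target_vac
```

Only once the returned VAC has been applied successfully would a later spec change to a different VAC be picked up.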

#### Handling of infeasible errors

If volume modification to a VAC is failing with an infeasible error, users can either set the VAC to the value previously recorded in `status.currentVolumeAttributesClassName`, or set it to `nil` if no VAC was specified. In both cases, `external-resizer` will stop trying to reconcile the volume modification.

Please note that if the PVC already has a `currentVolumeAttributesClassName` in its status, setting the VAC to `nil` is not allowed.

Users can also set the VAC to a different VAC if the transition to a VAC fails with an infeasible error. This is allowed on the assumption that the volume was not modified when the previous VAC application failed with an infeasible error.
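
The recovery options after an infeasible error can be summarized in a small decision function. This is a sketch under stated assumptions: the function name and the three outcome constants are hypothetical, VAC names are plain strings, and the new spec value is assumed to differ from the VAC that just failed.

```python
from typing import Optional

REJECTED = "rejected"
STOP_RECONCILING = "stop-reconciling"
RECONCILE_NEW_VAC = "reconcile-new-vac"


def recover_from_infeasible(
    current_vac: Optional[str],
    new_spec_vac: Optional[str],
) -> str:
    """Decide what happens when the user edits the spec after an infeasible error.

    current_vac:  VAC recorded in the PVC status (None if none was ever applied)
    new_spec_vac: VAC the user sets now (None means nil)
    """
    if new_spec_vac is None:
        if current_vac is not None:
            # Setting nil is not allowed once the PVC has a current VAC.
            return REJECTED
        # No VAC was ever applied: nil cancels the modification.
        return STOP_RECONCILING
    if new_spec_vac == current_vac:
        # Rolling back to the current VAC cancels the modification.
        return STOP_RECONCILING
    # A different VAC is allowed because the volume is assumed unmodified.
    return RECONCILE_NEW_VAC
```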

![Error recovery flow](./modify-volume.png)


### Test Plan
