Commit bced55f

Merge pull request #4692 from gnufied/change-rec-milestonre-31
Change milestone to 1.31
2 parents 94569b2 + 408a6e6 commit bced55f

3 files changed: +22 -1 lines changed

keps/sig-storage/1790-recover-resize-failure/README.md

Lines changed: 21 additions & 0 deletions
@@ -9,6 +9,7 @@
 - [Goals](#goals)
 - [Non-Goals](#non-goals)
 - [Proposal](#proposal)
+- [Making allocatedResourceStatus not change unnecessarily for every error in 1.31](#making-allocatedresourcestatus-not-change-unnecessarily-for-every-error-in-131)
 - [Making resizeStatus more general in v1.28](#making-resizestatus-more-general-in-v128)
 - [Implementation](#implementation)
 - [User flow stories](#user-flow-stories)
@@ -108,6 +109,26 @@ As part of this proposal, we are mainly proposing three changes:
 - NodeExpansionFailed // state set when expansion has failed in kubelet with a terminal error. Transient errors don't set NodeExpansionFailed.
 3. Update quota code to use `max(pvc.Spec.Resources, pvc.Status.AllocatedResources)` when evaluating usage for PVC.
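
The quota rule in item 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual quota evaluator: sizes are plain `int64` byte counts rather than Kubernetes quantity values, and `pvcUsage`, `specRequest`, and `allocated` are hypothetical names introduced here.

```go
package main

import "fmt"

// pvcUsage returns the storage that would be charged against quota for a PVC.
// specRequest stands in for pvc.Spec.Resources and allocated for
// pvc.Status.AllocatedResources; both are simplified to byte counts
// (an assumption made for this sketch).
func pvcUsage(specRequest, allocated int64) int64 {
	if allocated > specRequest {
		return allocated
	}
	return specRequest
}

func main() {
	const gi = int64(1) << 30
	// The user asked for 20Gi, expansion failed, and the request was then
	// lowered to 12Gi. Quota is still charged for the larger allocated size.
	fmt.Println(pvcUsage(12*gi, 20*gi) / gi) // prints 20
}
```

Charging the larger of the two values means a user cannot free quota by lowering the request while a bigger allocation may still exist on the backend.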
 
+### Making allocatedResourceStatus not change unnecessarily for every error in 1.31
+
+We are trying to reduce the number of state changes that can happen when volume expansion fails on either the kubelet or the external-resizer.
+
+We consider the following gRPC error codes to be "infeasible":
+- INVALID_ARGUMENT
+- OUT_OF_RANGE
+- NOT_FOUND
+
+In the external-resizer, if `ControllerExpandVolume` fails with any of the error codes above, controller expansion will be marked as failed and resizing will be retried at a slower rate. For all other errors, an event will be generated and a condition will be added to the PVC indicating that expansion has failed, but the state change will not be recorded in `allocatedResourceStatus`.
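
The controller-side classification described above might look roughly like the sketch below. It is not the external-resizer's code: `Code` is a hypothetical string type standing in for a gRPC status code, and `isInfeasible`, `controllerOutcome`, and `onControllerExpandError` are names invented for illustration.

```go
package main

import "fmt"

// Code is a hypothetical stand-in for a gRPC status code.
type Code string

const (
	InvalidArgument Code = "INVALID_ARGUMENT"
	OutOfRange      Code = "OUT_OF_RANGE"
	NotFound        Code = "NOT_FOUND"
	Unavailable     Code = "UNAVAILABLE" // example of a code not listed as infeasible
)

// isInfeasible reports whether code is one of the three codes the
// proposal treats as infeasible.
func isInfeasible(code Code) bool {
	switch code {
	case InvalidArgument, OutOfRange, NotFound:
		return true
	default:
		return false
	}
}

// controllerOutcome summarizes the reaction to a failed ControllerExpandVolume call.
type controllerOutcome struct {
	RecordFailedStatus bool // mark controller expansion failed in allocatedResourceStatus
	SlowRetry          bool // retry resizing at a slower rate
	EmitEventAndCond   bool // generate an event and add a PVC condition only
}

// onControllerExpandError maps an error code to the outcome described in the text.
func onControllerExpandError(code Code) controllerOutcome {
	if isInfeasible(code) {
		return controllerOutcome{RecordFailedStatus: true, SlowRetry: true}
	}
	return controllerOutcome{EmitEventAndCond: true}
}

func main() {
	fmt.Printf("%+v\n", onControllerExpandError(OutOfRange))
	fmt.Printf("%+v\n", onControllerExpandError(Unavailable))
}
```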
+
+
+On the node side, `allocatedResourceStatus` will only be updated to record a failed expansion if:
+- `NodeExpandVolume` failed with one of the `infeasible` error codes listed above.
+- `NodeExpandVolume` failed with a final error and there is a pending PVC size change request from the user.
+
+This will allow the external-resizer to recover safely from node expansion failures as well.
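
The two node-side conditions above can be read as a single predicate. The sketch below assumes the kubelet has already determined three facts about a failed `NodeExpandVolume` call; the `nodeExpandFailure` type, its fields, and `shouldRecordNodeFailure` are hypothetical names, not kubelet APIs.

```go
package main

import "fmt"

// nodeExpandFailure captures what is assumed to be known about a failed
// NodeExpandVolume call (hypothetical type for this sketch).
type nodeExpandFailure struct {
	Infeasible        bool // error code was INVALID_ARGUMENT, OUT_OF_RANGE, or NOT_FOUND
	Final             bool // error is final rather than transient
	PendingUserResize bool // the user has a pending PVC size change request
}

// shouldRecordNodeFailure reports whether allocatedResourceStatus would be
// updated with a failed expansion under the two conditions in the text.
func shouldRecordNodeFailure(f nodeExpandFailure) bool {
	return f.Infeasible || (f.Final && f.PendingUserResize)
}

func main() {
	// A final error with no pending user request leaves the status untouched.
	fmt.Println(shouldRecordNodeFailure(nodeExpandFailure{Final: true})) // prints false
	// The same error with a pending user request is recorded.
	fmt.Println(shouldRecordNodeFailure(nodeExpandFailure{Final: true, PendingUserResize: true})) // prints true
}
```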
+
+![New flow kubelet](./Expanding volume - Kubelet Loop.png)
+
 ### Making resizeStatus more general in v1.28
 
 After [some discussion](https://github.com/kubernetes/kubernetes/pull/116335#issuecomment-1624566731) with sig-storage folks, we are proposing that we rename `pvc.Status.ResizeStatus` to `pvc.Status.AllocatedResourceStatus` and make it a map.

keps/sig-storage/1790-recover-resize-failure/kep.yaml

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@ see-also:
 replaces:
 superseded-by:
 
-latest-milestone: "v1.30"
+latest-milestone: "v1.31"
 stage: "alpha"
 
 milestone:
