You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: keps/sig-storage/1790-recover-resize-failure/README.md
+15-10Lines changed: 15 additions & 10 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -106,12 +106,14 @@ We however do have a problem with quota calculation because if a previously issu
106
106
To solve aforementioned problem - we propose that, a new field will be added to PVC, called `pvc.Status.AllocatedResources`. When a PVC is created - this field defaults to `pvc.Spec.Resources` but when user expands the PVC,
107
107
and when expansion-controller starts volume expansion - it will set `pvc.Status.AllocatedResources` to user requested value in `pvc.Spec.Resources` before performing expansion. The quota calculation will be updated to use `max(pvc.Spec.Resources, pvc.Status.AllocatedResources)` which will ensure that abusing quota will not be possible.
108
108
109
-
When user reduces `pvc.Spec.Resources`, expansion-controller will set `pvc.Status.AllocatedResources` to lower value (thereby giving quota back to the user) - only if current actual size of volume is less than or equal to `pvc.Spec.Resources`. It will do that only after fetching current size of the volume by using `ControllerGetVolume` CSI RPC call.
109
+
When user reduces `pvc.Spec.Resources`, expansion-controller will set `pvc.Status.AllocatedResources` to lower value (thereby giving quota back to the user) - only if current actual size of volume is less than or equal to `pvc.Spec.Resources` after entire control-plane and node side expansion is finished. It will fetch actual size of the volume by using `ControllerGetVolume` CSI RPC call. It is possible to track completion of resizing operation in external-resizer via function - https://github.com/kubernetes-csi/external-resizer/blob/master/pkg/controller/controller.go#L394.
110
110
111
111
If CSI driver does not have `GET_VOLUME` controller capability(or `ControllerGetVolume` does not report volume size) and `pvc.Spec.Resources` < `pvc.Status.AllocatedResources` (i.e user is attempting to reduce size of a volume that expansion controller previously tried to expand) - then although expansion-controller will try volume expansion with value in `pvc.Spec.Resources` - it will not reduce reported value in `pvc.Status.AllocatedResources`, which will result in no quota being restored to the user. In other words - for CSI drivers that don't have `GET_VOLUME` controller capability - `pvc.Status.AllocatedResources` will report highest requested value and reducing `pvc.Spec.Resources` will not result in reduction of used quota.
112
112
113
113
*Note:* This proposal expects that users can not modify `pvc.Status` directly and cheat quota system. In general this should be fine because users should not have access to edit status of almost anything. Link to discussion on slack - https://kubernetes.slack.com/archives/CJUQN3E4T/p1620059624022100
114
114
115
+
*Note:* It is expected that `ControllerGetVolume` RPC call returns disk size not filesystem size of the volume. We will ensure that there are sanity test around this and this distinction is made more clear in the CSI spec.
116
+
115
117
#### User flow stories
116
118
117
119
##### Case 0 (default PVC creation):
@@ -125,12 +127,12 @@ If CSI driver does not have `GET_VOLUME` controller capability(or `ControllerGet
125
127
- Expansion controller starts expanding the volume and sets `pvc.Status.AllocatedResources` to `100Gi`.
126
128
- Expansion to 100Gi fails and hence `pv.Spec.Capacity` and `pvc.Status.Capacity `stays at 10Gi.
127
129
- User requests size to 20Gi.
128
-
- Expansion controller notices that `pvc.Spec.Resources` < `pvc.Status.AllocatedResources` (meaning user tried to reduce size).
130
+
- Expansion controller tries expanding the PVC/PV to `20Gi` size.
131
+
- Expansion succeeds and `pvc.Status.Capacity` and `pv.Spec.Capacity` report new size as `20Gi`.
132
+
- After completion, expansion controller notices that `pvc.Spec.Resources` < `pvc.Status.AllocatedResources` (meaning user tried to reduce size).
129
133
- Expansion controller fetches actual size of the volume using `ControllerGetVolume` CSI RPC call.
130
-
- Expansion controller sees that `actual_volume_size`(10Gi) < `pvc.Spec.Resources` (20Gi), so it sets `pvc.Status.AllocatedResources` to 20Gi.
131
-
- Expansion controller continues with rest of the expansion flow.
134
+
- Expansion controller sees that `actual_volume_size`(20Gi) < `pvc.Status.AllocatedResources` (100Gi), so it sets `pvc.Status.AllocatedResources` to 20Gi.
132
135
- Quota controller sees a reduction in used quota because `max(pvc.Spec.Resources, pvc.Status.AllocatedResources)` is 20Gi.
133
-
- Expansion succeeds and `pvc.Status.Capacity` and `pv.Spec.Capacity` report new size as `20Gi`.
134
136
135
137
##### Case 2 (controller+node expandable with no GET_VOLUME capability):
136
138
- User increases 10Gi PVC to 100Gi by changing - `pvc.spec.resources.requests["storage"] = "100Gi"`
@@ -139,22 +141,25 @@ If CSI driver does not have `GET_VOLUME` controller capability(or `ControllerGet
139
141
- Expansion controller starts expanding the volume and sets `pvc.Status.AllocatedResources` to `100Gi`.
140
142
- Expansion to 100Gi fails and hence `pv.Spec.Capacity` and `pvc.Status.Capacity `stays at 10Gi.
141
143
- User requests size to 20Gi.
144
+
- Expansion controller tries expanding the PVC/PV to `20Gi` size.
145
+
- Expansion succeeds and `pvc.Status.Capacity` and `pv.Spec.Capacity` report new size as `20Gi`.
142
146
- Expansion controller notices that `pvc.Spec.Resources` < `pvc.Status.AllocatedResources` (meaning user tried to reduce size).
143
147
- Expansion controller sees that CSI driver does not have `GET_VOLUME` controller capability.
144
-
- Expansion controller tries expansion to new value in `pvc.Spec.Resources` anyways (`20Gi`).
145
148
-`pvc.Status.AllocatedResources` still reports `100Gi` and hence there is no change in used quota.
146
-
- Expansion succeeds and `pvc.Status.Capacity` and `pv.Spec.Capacity` report new size as `20Gi`.
- User increases 10Gi PVC to 100Gi by changing `pvc.spec.resources.requests["storage"] = "100Gi"`
151
153
-`pvc.Status.AllocatedResources` still reports `10Gi`.
152
154
- Quota controller uses `max(pvc.Status.AllocatedResources, pvc.Spec.Resources)` and adds `90Gi` to used quota.
153
-
- Expansion controller slow starts expanding the volume and sets `pvc.Status.AllocatedResources` to `100Gi` (before expanding).
155
+
- Expansion controller slowly starts expanding the volume and sets `pvc.Status.AllocatedResources` to `100Gi` (before expanding).
154
156
- At this point -`pv.Spec.Capacity` and `pvc.Status.Capacity` stays at 10Gi until the resize is finished.
155
157
- While the storage backend is re-sizing the volume, user requests size 20Gi by changing `pvc.spec.resources.requests["storage"] = "20Gi"`
158
+
- Expansion succeeds previous expansion and `pvc.Status.Capacity` and `pv.Spec.Capacity` report new size as `100Gi`.
159
+
- Expansion controller notices that `pvc.Spec.Resources` < `pvc.Status.AllocatedResources` (meaning user tried to reduce size).
160
+
- Expansion controller fetches actual size of the volume using `ControllerGetVolume` CSI RPC call.
161
+
- Expansion controller sees that `actual_volume_size`(100Gi) >= `pvc.Status.AllocatedResources` (100Gi), so it does not make any modifictions to `pvc.Status.AllocatedResources`.
156
162
- Quota controller sees no change in storage usage by the PVC because `pvc.Status.AllocatedResources` is 100Gi.
157
-
- Expansion succeeds and `pvc.Status.Capacity` and `pv.Spec.Capacity` report new size as `100Gi`, as that's what the volume plugin did.
0 commit comments