keps/sig-storage/1790-recover-resize-failure
1 file changed, +9 -6 lines changed
@@ -306,11 +306,8 @@ after expansion is complete even with older kubelet. No recovery from expansion
 _This section must be completed when targeting beta graduation to a release._

 * **How can an operator determine if the feature is in use by workloads?**
-  For a PVC that has undergone recovery from expansion failure successfully, it is not possible
-  to identify the fact that - PVC used this feature. But for PVCs for which
-  recovery failed even after reducing size, an operator can determine the feature in-use
-  by looking at newly introduced `pvc.Status.ResizeStatus` field.
-
+  Any volume that has been recovered will emit a metric: `operation_operation_volume_recovery_total{state='success', volume_name='pvc-abce'}`.
+
 * **What are the SLIs (Service Level Indicators) an operator can use to
   determine the health of the service?**
   - [ ] Metrics
@@ -340,7 +337,13 @@ _This section must be completed when targeting beta graduation to a release._

 * **Are there any missing metrics that would be useful to have to improve
   observability if this feature?**
-  Not applicable.
+  We are planning to add new counter metrics that will record success and failure of recovery operations.
+  In cases where recovery fails, the counter will forever be increasing until an admin action resolves the error.
+
+  Tentative name of metric is - `operation_operation_volume_recovery_total{state='success', volume_name='pvc-abce'}`
+
+  The reason of using PV name as a label is - we do not expect this feature to be used in a cluster very often
+  and hence it should be okay to use name of PVs that were recovered this way.

 ### Dependencies

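Review note: the added text describes a counter labeled by `state` and `volume_name` that keeps increasing while recovery keeps failing. The sketch below illustrates that behaviour with a standard-library-only stand-in for a labeled counter. It is not the Kubernetes metrics API or the implementation proposed in the KEP. The `RecoveryCounter` type, its methods, the `failure` label value, and the volume name `pvc-1234` are all hypothetical. The metric name is the tentative one from the diff.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// metricName is the tentative name given in the KEP text.
const metricName = "operation_operation_volume_recovery_total"

// labelKey identifies one time series: a (state, volume_name) pair.
type labelKey struct {
	State      string // "success" or "failure" (the latter is an assumed value)
	VolumeName string // PV name, e.g. "pvc-abce"
}

// RecoveryCounter is a hypothetical stand-in for a labeled counter metric.
type RecoveryCounter struct {
	mu     sync.Mutex
	counts map[labelKey]uint64
}

// NewRecoveryCounter returns an empty counter.
func NewRecoveryCounter() *RecoveryCounter {
	return &RecoveryCounter{counts: make(map[labelKey]uint64)}
}

// Inc records the outcome of one recovery attempt for a volume.
func (c *RecoveryCounter) Inc(state, volumeName string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[labelKey{State: state, VolumeName: volumeName}]++
}

// Value returns the current count for one label combination.
func (c *RecoveryCounter) Value(state, volumeName string) uint64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[labelKey{State: state, VolumeName: volumeName}]
}

// Series returns the number of distinct label combinations seen so far.
// Each recovered PV adds at least one series, which is the cardinality
// cost of using the PV name as a label.
func (c *RecoveryCounter) Series() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return len(c.counts)
}

// Expose renders the series, sorted, in a Prometheus-like text form.
func (c *RecoveryCounter) Expose() []string {
	c.mu.Lock()
	defer c.mu.Unlock()
	lines := make([]string, 0, len(c.counts))
	for k, v := range c.counts {
		lines = append(lines, fmt.Sprintf("%s{state=%q,volume_name=%q} %d",
			metricName, k.State, k.VolumeName, v))
	}
	sort.Strings(lines)
	return lines
}

func main() {
	c := NewRecoveryCounter()
	// A failing recovery is retried, so the failure series keeps growing
	// until an admin resolves the underlying error.
	for i := 0; i < 3; i++ {
		c.Inc("failure", "pvc-abce")
	}
	c.Inc("success", "pvc-1234")
	for _, line := range c.Expose() {
		fmt.Println(line)
	}
}
```

The series count grows with the number of distinct PVs that go through recovery. That is why the diff justifies the `volume_name` label by expecting the feature to be used rarely.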