@@ -388,8 +388,12 @@ different feature gates that control various aspects of expansion.
- Describe the mechanism :
- Will enabling / disabling the feature require downtime of the control
plane?
+ Enabling or disabling this feature does not require complete downtime of the control plane,
+ and the feature gates can be enabled progressively on different control-plane nodes.
- Will enabling / disabling the feature require downtime or reprovisioning
of a node? (Do not assume `Dynamic Kubelet Config` feature is enabled).
+ This feature can be enabled progressively on nodes; once expansion is enabled
+ on a node, the kubelet on that node performs volume expansion.

###### Does enabling the feature change any default behavior?

@@ -457,21 +461,13 @@ Having said that if file system requires expansion during mount then it is obvio

- [ ] Metrics
- controller expansion operation duration :
- - Metric name : storage_operation_duration_seconds{operation_name=expand_volume}
+ - Metric name : storage_operation_duration_seconds{operation_name=expand_volume, status=success|fail-unknown}
- [Optional] Aggregation method : percentile
- Components exposing the metric : kube-controller-manager
- - controller expansion operation errors :
- - Metric name : storage_operation_errors_total{operation_name=expand_volume}
- - [Optional] Aggregation method : cumulative counter
- - Components exposing the metric : kube-controller-manager
- node expansion operation duration :
- - Metric name : storage_operation_duration_seconds{operation_name=volume_fs_resize}
+ - Metric name : storage_operation_duration_seconds{operation_name=volume_fs_resize, status=success|fail-unknown}
- [Optional] Aggregation method : percentile
- Components exposing the metric : kubelet
- - node expansion operation errors :
- - Metric name : storage_operation_errors_total{operation_name=volume_fs_resize}
- - [Optional] Aggregation method : cumulative counter
- - Components exposing the metric : kubelet
- CSI operation metrics :
- Metric name : csi_sidecar_operations_seconds
- [Optional] Aggregation method : percentile
@@ -481,6 +477,8 @@ Having said that if file system requires expansion during mount then it is obvio
- Details :
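
With a `status` label on `storage_operation_duration_seconds`, failures can presumably be derived from the histogram's own sample counts instead of a separate `storage_operation_errors_total` counter. The sketch below illustrates that calculation, assuming the per-status sample counts have already been collected (for a Prometheus histogram these would come from the `_count` series). The `failure_ratio` helper and the sample numbers are hypothetical and not part of Kubernetes.

```python
from typing import Mapping, Tuple

# Keys are (operation_name, status) label pairs; values are sample counts.
StatusCounts = Mapping[Tuple[str, str], float]


def failure_ratio(counts: StatusCounts, operation_name: str) -> float:
    """Return the fraction of operations that ended with status=fail-unknown."""
    succeeded = counts.get((operation_name, "success"), 0.0)
    failed = counts.get((operation_name, "fail-unknown"), 0.0)
    total = succeeded + failed
    # With no samples at all there is nothing to report as failed.
    return failed / total if total else 0.0


# Made-up sample counts, for illustration only.
sample_counts = {
    ("expand_volume", "success"): 18.0,
    ("expand_volume", "fail-unknown"): 2.0,
    ("volume_fs_resize", "success"): 5.0,
}

controller_ratio = failure_ratio(sample_counts, "expand_volume")
node_ratio = failure_ratio(sample_counts, "volume_fs_resize")
```

The same function covers both the controller-side (`expand_volume`) and node-side (`volume_fs_resize`) operations, since they differ only in the `operation_name` label.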

###### Are there any missing metrics that would be useful to have to improve observability of this feature?
+ We plan to add an equivalent of the in-tree storage_operation metrics for volume expansion
+ that is performed externally via external-resizer.

### Dependencies

@@ -504,14 +502,18 @@ Yes enabling this feature requires new API calls.
- GET PV
- List PVs
- originating components : kubelet, kube-controller-manager, external-resizer
- - resync duration : 10mins
+ - resync duration : 10mins (also user configurable)
- Update to PVCs :
- API operations
- PATCH PVC
- GET PVC
- List PVC
- originating components : kubelet, kube-controller-manager, external-resizer
- - resync duration : 10mins
+ - resync duration : 10mins (also user configurable)
+
+ If the user enables protection against expanding PVCs that are in use, external-resizer will
+ also watch *all* pods in the cluster. This is an optional flag in external-resizer and is generally
+ needed only when a CSI driver does not want to handle expansion calls for volumes that are potentially in use by a pod.
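
The in-use check behind that pod watch can be sketched roughly as follows. This is an illustration only, not the external-resizer implementation: pods are modeled as plain dictionaries shaped like Pod manifests, the `pvc_in_use` helper is hypothetical, and treating pods in the `Succeeded` or `Failed` phase as no longer using the volume is an assumption of this sketch.

```python
from typing import Any, Dict, Iterable

Pod = Dict[str, Any]


def pvc_in_use(pods: Iterable[Pod], namespace: str, claim_name: str) -> bool:
    """Return True if any non-terminated pod in the namespace mounts the PVC."""
    for pod in pods:
        # A PVC can only be referenced by pods in its own namespace.
        if pod.get("metadata", {}).get("namespace") != namespace:
            continue
        # Assumption: terminated pods no longer hold the volume.
        if pod.get("status", {}).get("phase") in ("Succeeded", "Failed"):
            continue
        for volume in pod.get("spec", {}).get("volumes", []):
            claim = volume.get("persistentVolumeClaim")
            if claim and claim.get("claimName") == claim_name:
                return True
    return False


# Hypothetical pods, for illustration only.
example_pods = [
    {
        "metadata": {"namespace": "default", "name": "web-0"},
        "status": {"phase": "Running"},
        "spec": {"volumes": [{"name": "data",
                              "persistentVolumeClaim": {"claimName": "data-web-0"}}]},
    },
    {
        "metadata": {"namespace": "default", "name": "job-1"},
        "status": {"phase": "Succeeded"},
        "spec": {"volumes": [{"name": "scratch",
                              "persistentVolumeClaim": {"claimName": "scratch-job-1"}}]},
    },
]
```

Because the check has to consider every pod that might reference a claim, the resizer needs a cluster-wide view of pods, which is why the optional protection adds a watch on all pods.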

###### Will enabling / using this feature result in introducing new API types?

@@ -525,9 +527,11 @@ Yes, we expect new calls to modify existing volume objects.

Describe them, providing :
- API type(s) : PVC
- - Estimated increase in size : A PVC with conditions could have its size increased by anywhere between 100 to 250B.
- - Estimated amount of new objects : (e.g., new Object X for every existing Pod)
-
+ - Estimated increase in size : A PVC with conditions could have its size increased by anywhere between 100 to 250B.
+ - Estimated amount of new objects : (e.g., new Object X for every existing Pod)
+ - API type(s) : StorageClass
+ - Estimated increase in size : A StorageClass with `AllowVolumeExpansion` set has its size increased by approximately 26 bytes.
+ - Estimated amount of new objects : (e.g., new Object X for every existing Pod)

###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?

@@ -540,6 +544,9 @@ Enabling this feature should not result in resource usage by significant margin,
### Troubleshooting

###### How does this feature react if the API server and/or etcd is unavailable?
+ Since this feature is user driven, users won't be able to expand PVCs while the API server or etcd is unavailable.
+ If the API server becomes unavailable midway through the expansion process, the expansion controller may not be able
+ to save the updated PVC in the API server, but the control flow is designed to retry and recover from such failures.
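
As a rough illustration of the retry-and-recover behavior described here, the sketch below retries a failed save with exponential backoff. It is a generic pattern under stated assumptions, not the expansion controller's actual code: `ApiUnavailable`, `retry_with_backoff`, and the attempt and delay values are all hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class ApiUnavailable(Exception):
    """Hypothetical error standing in for a failed API server call."""


def retry_with_backoff(
    operation: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying on ApiUnavailable and doubling the delay each time."""
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ApiUnavailable:
            # Give up only after the final attempt has failed.
            if attempt == attempts:
                raise
            sleep(delay)
            delay *= 2
    raise ValueError("attempts must be at least 1")


# Example: a save that fails twice before the API server comes back.
calls = []
recorded_delays = []


def flaky_save() -> str:
    calls.append(1)
    if len(calls) < 3:
        raise ApiUnavailable("API server unreachable")
    return "saved"


result = retry_with_backoff(flaky_save, sleep=recorded_delays.append)
```

Passing `sleep` as a parameter keeps the example testable without real waiting; a retried update is only safe if saving the same PVC state twice has no additional effect.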

###### What are other known failure modes?
