-[Add Node Volume Health Function](#add-node-volume-health-function)
@@ -136,11 +137,62 @@ Two main parts are involved here in the architecture.
 - Kubelet already collects volume stats from CSI node plugin by calling CSI function NodeGetVolumeStats.
 - In addition to existing volume stats collected already, Kubelet will also check volume condition collected from the same CSI function and log events to Pods if volume condition is abnormal.
 - Note that currently we do not have CSI support for local storage. When the support is available, we will implement relevant CSI monitoring interfaces as well.
+- Expose Volume Health information as Kubelet VolumeStats Metrics.

 The volume health monitoring by Kubelet will be controlled by a new feature gate called `VolumeHealth`.

 ## Implementation

+### Kubelet Metrics changes
+
+Add a new field in the [VolumeStats metrics API](https://github.com/kubernetes/kubernetes/blob/v1.22.1/staging/src/k8s.io/kubelet/pkg/apis/stats/v1alpha1/types.go#L263).
+
+```
+// VolumeStats contains data about Volume filesystem usage.
+type VolumeStats struct {
+	// Embedded FsStats
+	FsStats `json:",inline"`
+	// Name is the name given to the Volume
+	// +optional
+	Name string `json:"name,omitempty"`
+	// Reference to the PVC, if one exists
+	// +optional
+	PVCRef *PVCReference `json:"pvcRef,omitempty"`
+
+	// Note: Add the following new field
+	// +optional
+	// VolumeHealthStats contains data about volume health
+// VolumeHealthStats contains data about volume health.
+type VolumeHealthStats struct {
+	// Normal volumes are available for use and operating optimally.
+	// An abnormal volume does not meet these criteria.
+	Abnormal bool `json:"abnormal,omitempty"`
+}
+```
+
+Modify [parsePodVolumeStats](https://github.com/kubernetes/kubernetes/blob/v1.22.1/pkg/kubelet/server/stats/volume_stat_calculator.go#L172) to include the new field in the returned `stats.VolumeStats`.
+
+The newly added Volume Health stats will be stored in [persistentStats](https://github.com/kubernetes/kubernetes/blob/v1.22.1/pkg/kubelet/server/stats/volume_stat_calculator.go#L168).
+
+This is returned in [GetPodVolumeStats](https://github.com/kubernetes/kubernetes/blob/v1.22.1/pkg/kubelet/server/stats/fs_resource_analyzer.go#L99).
+
+Since Prometheus does not store string metrics, `volume_health_status` will be stored as either 1 or 0. The `volume_health_status` label could be `status: abnormal`.
+
+```
+var volumeHealthMetric = metrics.NewGaugeVec(
+	&metrics.GaugeOpts{
+		Subsystem: KubeletSubsystem,
+		Name:      "volume_health_status",
+		Help:      "Volume health status. The count is either 1 or 0.",
Container Storage Interface (CSI) specification will be modified to provide volume health check leveraging existing RPCs and adding new ones.
@@ -695,7 +747,7 @@ _This section must be completed when targeting alpha to a release._
 For Kubelet, enabling/disabling the feature requires downtime of a node.

 * **Does enabling the feature change any default behavior?**
-Enabling the `VolumeHealth` feature gate will allow Kubelet to monitor volume health and
+Enabling the `VolumeHealth` feature gate will allow Kubelet to monitor volume health, emit a new metric, and
 generate events on Pods so it will change the default behavior.
 Enabling the feature from the controller side will allow events to be reported on PVCs when
 abnormal volume conditions are detected.
@@ -704,14 +756,14 @@ _This section must be completed when targeting alpha to a release._
 the enablement)?**
 Yes. Uninstalling the health monitoring controller sidecar will disable the feature from
 the controller side.
-Disabling the feature gate on Kubelet will prevent Kubelet from monitoring volume health.
+Disabling the feature gate on Kubelet will prevent Kubelet from monitoring volume health and emitting the new metric.
 Existing events will not be removed but they will disappear after a period of time.
 Disabling the feature should not break an existing application as these events are for humans
 only, not for automation.

 * **What happens if we reenable the feature if it was previously rolled back?**
 Events will be added to PVCs or Pods when abnormal volume conditions are
-detected again.
+detected again and the new metric will be emitted by Kubelet again.

 * **Are there any tests for feature enablement/disablement?**
 There will be unit tests for the feature `VolumeHealth` enablement/disablement.
@@ -734,23 +786,23 @@ _This section must be completed when targeting beta graduation to a release._
 condition will be reported on PVCs.

 If enabling the `VolumeHealth` feature fails, no event on volume condition will be
-reported on the pod.
+reported on the pod and the new `volume_stats_health_abnormal` metric won't be emitted.

 * **What specific metrics should inform a rollback?**
 An event will be recorded on the PVC when the controller has successfully retrieved an
 abnormal volume condition from the storage system. When other errors occur in the controller,
-the errors will also be recorded as events.
+the errors will also be recorded as events. A rollback on the controller side means the external health monitor controller is uninstalled, after which abnormal volume conditions no longer produce events on the PVC.

-In Kubelet, an event will be recorded on the Pod when Kubelet has successfully retrieved an
+In Kubelet, an event will be recorded on the Pod and a `volume_stats_health_abnormal` metric will be emitted when Kubelet has successfully retrieved an
 abnormal volume condition. If the call to `NodeGetVolumeStats` fails for other reasons,
 an error will be returned and whether this will be logged as an event on the Node is up to
-the existing Kubelet logic and will not be changed.
+the existing Kubelet logic and will not be changed. A rollback on Kubelet means the feature gate is disabled again, after which the new metric is no longer emitted.

 * **Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?**
 Describe manual testing that was done and the outcomes.
 Longer term, we may want to require automated upgrade/rollback tests, but we
 are missing a bunch of machinery and tooling and can't do that now.
-Manual testing will be done.
+Manual testing will be done to upgrade from 1.22 to 1.23 and downgrade from 1.23 back to 1.22.

 * **Is the rollout accompanied by any deprecations and/or removals of features, APIs,
 fields of API types, flags, etc.?**
@@ -775,16 +827,16 @@ _This section must be completed when targeting beta graduation to a release._
 they are aggregated to show metrics for different sidecars.

 In Kubelet, an operator can check whether the feature gate `VolumeHealth`
-is enabled.
+is enabled and whether the new metric `volume_stats_health_abnormal` is emitted.

 * **What are the SLIs (Service Level Indicators) an operator can use to determine
 csi-external-health-monitor-controller exposes the `csi_sidecar_operations_seconds` metric.
-In Kubelet, a call to `NodeGetVolumeStats` is meant to collect volume stats metrics.
+In Kubelet, a call to `NodeGetVolumeStats` is meant to collect volume stats metrics. The new metric name is `volume_stats_health_abnormal`.
 - [ ] Other (treat as last resort)
 - Details:

@@ -804,12 +856,15 @@ the health of the service?**
 can look at the ratio of successful vs non-successful status codes to figure out
 the success/failure ratio.

+In Kubelet, the new metric `volume_stats_health_abnormal` will be emitted. Whether this metric can be retrieved depends on the CSI call `NodeGetVolumeStats`, which is an existing call in Kubelet. As long as the CSI driver has implemented the capability to provide volume health, the volume condition should be included in the response of the `NodeGetVolumeStats` call.
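
One way an operator might confirm that the metric is being emitted is to scan the text returned by a metrics endpoint for a sample line with that name. The sketch below assumes the Prometheus text exposition format; the `metricEmitted` helper, the `kubelet_` prefix, and the label names in the sample are assumptions for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// metricEmitted reports whether the exposition text contains at least one
// sample line for exactly the given metric name. Comment lines (# HELP,
// # TYPE) are skipped, and longer names that merely share the prefix are
// not counted as a match.
func metricEmitted(exposition, name string) bool {
	sc := bufio.NewScanner(strings.NewReader(exposition))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		if !strings.HasPrefix(line, name) {
			continue
		}
		// A sample line is "<name>{labels} value" or "<name> value".
		rest := line[len(name):]
		if strings.HasPrefix(rest, "{") || strings.HasPrefix(rest, " ") {
			return true
		}
	}
	return false
}

func main() {
	// Hypothetical scrape output; metric prefix and labels are assumed.
	sample := "# TYPE kubelet_volume_stats_health_abnormal gauge\n" +
		"kubelet_volume_stats_health_abnormal{namespace=\"default\",persistentvolumeclaim=\"data\"} 1\n"
	fmt.Println(metricEmitted(sample, "kubelet_volume_stats_health_abnormal"))
	fmt.Println(metricEmitted(sample, "kubelet_volume_stats"))
}
```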
+
 * **Are there any missing metrics that would be useful to have to improve observability
 of this feature?**
 <!--
 Describe the metrics themselves and the reasons why they weren't added (e.g., cost,
 implementation difficulties, etc.).
 -->
+No.

 ### Dependencies

@@ -860,12 +915,13 @@ previous answers based on experience in the field._
 call is needed.
 - API calls that may be triggered by changes of some Kubernetes resources
 (e.g. update of object X triggers new updates of object Y)
+We are adding a new `Abnormal` field to the existing Kubelet metrics API. It will be retrieved by the periodic metrics collection call. We are not changing the existing frequency of that call.
 - periodic API calls to reconcile state (e.g. periodic fetching state,
 heartbeats, leader election, etc.)

 * **Will enabling / using this feature result in introducing new API types?**
 Describe them, providing:
-- API type: No
+- API type: Adding `Abnormal` field to Kubelet VolumeStats metrics API
 - Supported number of objects per cluster: No
 - Supported number of objects per namespace (for namespace-scoped objects): No

@@ -876,9 +932,9 @@ provider?**
 * **Will enabling / using this feature result in increasing size or count of
 the existing API objects?**
 Describe them, providing:
-- API type(s): No
+- API type(s): Yes. We are adding a new `Abnormal` field to the Kubelet VolumeStats metrics API.
 - Estimated increase in size: (e.g., new annotation of size 32B):
-No
+New string of max length of 128 bytes; new int of 4 bytes.
 - Estimated amount of new objects: (e.g., new Object X for every existing Pod)
 The controller reports events on PVCs while Kubelet reports events on Pods. They work independently of each other. CSI drivers should avoid reporting duplicate information through the controller and Kubelet. For example, if the controller detects a failure on one volume, it should record just one event on one PVC. If Kubelet detects a failure, it should record an event on every pod that uses the affected PVC.

@@ -888,7 +944,8 @@ the existing API objects?**
 operations covered by [existing SLIs/SLOs]?**
 Think about adding additional work or introducing new steps in between
 (e.g. need to do X to start a container), etc. Please describe the details.
-This feature will periodically query storage systems to get the latest volume conditions. So this will have an impact on the performance of the operations running on the storage systems.
+On the controller side, this feature will periodically query storage systems to get the latest volume conditions, which will affect the performance of operations running on the storage systems.
+In Kubelet, `NodeGetVolumeStats` is an existing call, so it won't have additional performance impact.

 * **Will enabling / using this feature result in non-negligible increase of
 resource usage (CPU, RAM, disk, IO, ...) in any components?**
@@ -921,7 +978,7 @@ _This section must be completed when targeting beta graduation to a release._
 - Diagnostics: What are the useful log messages and their required logging
 levels that could help debug the issue?
 Not required until feature graduated to beta.
-If there are log messages indicating abnormal volume conditions but there are no events reported, we can check the timestamp of the messages to see if events have disappeared based on TTL or if they are never reported. If there are problems on the storage systems but they are not reported in logs or events, we can check the logs of the storage systems to figure out why this has happened.
+If there are log messages indicating abnormal volume conditions but no events are reported or no new metric is emitted, we can check the timestamps of the messages to see whether the events have expired based on TTL or were never reported. If there are problems on the storage systems but they are not reported in logs or events, we can check the logs of the storage systems to figure out why this has happened.
 - Testing: Are there any tests for failure mode? If not, describe why.

 * **What steps should be taken if SLOs are not being met to determine the problem?**
@@ -932,7 +989,8 @@ _This section must be completed when targeting beta graduation to a release._

 ## Implementation History

-- 20210117: Update KEP for Beta
+- 20210902: Update KEP to add volume health to Kubelet metrics.