You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: keps/sig-node/2371-cri-pod-container-stats/README.md
+41-8Lines changed: 41 additions & 8 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -78,7 +78,7 @@ There are two main APIs that consumers use to gather stats about running contain
78
78
The Kubelet is responsible for implementing the summary API, and cadvisor is responsible for fulfilling `/metrics/cadvisor`.
79
79
80
80
The [CRI API](https://github.com/kubernetes/cri-api) currently does not provide enough metrics to fully supply all the fields for either endpoint, however is used to fill some fields of the summary API.
81
-
This results in an unclear origin of metrics, duplication of work done between cAdvisor and CRI, and performance implications.
81
+
This results in an unclear origin of metrics, duplication of work done between cAdvisor and CRI, and performance implications.
82
82
83
83
This KEP aims to enhance CRI implementations to be able to fulfill all the stats needs of Kubernetes.
84
84
At a high level, there are two pieces of this:
@@ -176,6 +176,7 @@ Below is a table describing which stats come from what source now, as well a pro
176
176
||N/A |container_last_seen |N/A |cAdvisor |CRI or N/A |
177
177
|||||cAdvisor |CRI or N/A |
178
178
179
+
179
180
## Motivation
180
181
181
182
We want to avoid using cAdvisor for container & pod level stats and move metric collection to the CRI implementation for the following reasons:
@@ -544,7 +545,7 @@ As a requirement for the Beta stage, cAdvisor must support optionally collecting
544
545
545
546
- Conduct research to find the set of metrics from `/metrics/cadvisor` that compliant CRI implementations must expose.
546
547
- Conformance tests for the fields in `/metrics/cadvisor` should be created.
547
-
-Performance tests for CPU/memory usage between Kubelet/cAdvisor/CRI implementation should be added.
548
+
-Validate performance impact of this feature is within allowable margin (or non-existent, ideally).
548
549
- The CRI stats implementation should perform better than they did with CRI+cAdvisor.
549
550
- cAdvisor stats provider may be marked as deprecated (depending on stability of new CRI based implementations).
550
551
- cAdvisor should be able to optionally not report the metrics needed for both summary API and `/metrics/cadvisor`. This behavior will be toggled by the Kubelet feature gate.
@@ -559,8 +560,8 @@ As a requirement for the Beta stage, cAdvisor must support optionally collecting
559
560
560
561
- There needs to be a way for the Kubelet to verify the CRI provider is capable of providing the correct metrics.
561
562
Upon upgrading to a version that relies on this new behavior (assuming the feature gate is enabled),
562
-
Kubelet should fail early if the CRI implementation won't report the expected metrics.
563
-
- For Beta/GA releases, components that rely on `/metrics/cadvisor` should take the decided action (use `/stats/summary`, or use the CRI provided `/metrics/cadvisor`).
563
+
Kubelet should fallback to use cAdvisor if the CRI implementation won't report the expected metrics.
564
+
- For Beta/GA releases, components that rely on `/metrics/cadvisor` should take the decided action (use `/stats/summary`, or use the CRI provided replacement for `/metrics/cadvisor`).
564
565
565
566
### Version Skew Strategy
566
567
@@ -629,17 +630,35 @@ _This section must be completed when targeting beta graduation to a release._
629
630
Try to be as paranoid as possible - e.g., what if some components will restart
630
631
mid-rollout?
631
632
633
+
If the CRI implementation doesn't support the required metrics, and cAdvisor has container metrics collection turned off,
634
+
it is possible the node comes up with no metrics about pods and containers. This should be mitigated by making sure that
635
+
the kubelet probes the CRI implementation and enables cAdvisor metrics collection even if the feature gate is on.
636
+
632
637
***What specific metrics should inform a rollback?**
633
638
639
+
The lack of any metrics reported for pods and containers is the worst case scenerio here, and would require either a rollback or for the feature gate to be disabled.
640
+
634
641
***Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?**
635
642
Describe manual testing that was done and the outcomes.
636
643
Longer term, we may want to require automated upgrade/rollback tests, but we
637
644
are missing a bunch of machinery and tooling and can't do that now.
638
645
646
+
The source of the metrics is a private matter between the kubelet, CRI implementation and cAdvisor. Since cAdvisor
647
+
in embedded in the kubelet, the two pieces that could move disjointly are kubelet and CRI implementation. The
648
+
quality of the metrics reported by the kubelet/CRI are dependent solely on the Kubelet's configuration at runtime. In other
649
+
words, rolling back and upgrading should have no affect--if the upgrade broke metrics because the CRI didn't support them
650
+
(and measures weren't taken to cause kubelet to fallback to cAdvisor), then a rollback (or toggling of the feature gate)
651
+
would return the metrics from cAdvisor.
652
+
639
653
***Is the rollout accompanied by any deprecations and/or removals of features, APIs,
640
654
fields of API types, flags, etc.?**
641
655
Even if applying deprecation policies, they may still surprise some users.
642
656
657
+
A piece of work for Beta is moving the source of the contents of `/metrics/cadvisor`. If users toggle the feature gate,
658
+
prometheus collectors will have to move the URL. However, it's an expressed intention of the implementation to have the CRI
659
+
report metrics previously reported by cAdvisor, so the contents should not change.
660
+
661
+
643
662
### Monitoring Requirements
644
663
645
664
_This section must be completed when targeting beta graduation to a release._
@@ -649,12 +668,17 @@ _This section must be completed when targeting beta graduation to a release._
649
668
checking if there are objects with field X set) may be a last resort. Avoid
650
669
logs or events for this purpose.
651
670
671
+
The source of the pod and container metrics previously reported to Prometheus by `/metrics/cadvisor` is the CRI implementation, not cAdvisor.
672
+
Further, if the CRI implementation was using the old CRI stats provider, then the memory usage of the cgroup the kubelet and runtime
673
+
were in should go down--as some duplicated work should be unduplicated.
674
+
652
675
***What are the SLIs (Service Level Indicators) an operator can use to determine
653
676
the health of the service?**
654
-
-[] Metrics
677
+
-[x] Metrics
655
678
- Metric name:
656
-
-[Optional] Aggregation method:
679
+
- all pod and container level stats coming from cAdvisor `container_*`
657
680
- Components exposing the metric:
681
+
- Previously cAdvisor, now CRI implementation.
658
682
-[ ] Other (treat as last resort)
659
683
- Details:
660
684
@@ -667,6 +691,9 @@ the health of the service?**
667
691
job creation time) for cron job <= 10%
668
692
- 99,9% of /health requests per day finish with 200 code
669
693
694
+
- Reduction of CPU and memory usage between kubelet and CRI (if previously using CRI stats provider).
695
+
- Minimal (< 2%) of performance hit between CPU and memory between CRI and kubelet (if previously using cAdvisor stats provider).
696
+
670
697
***Are there any missing metrics that would be useful to have to improve observability
671
698
of this feature?**
672
699
Describe the metrics themselves and the reasons why they weren't added (e.g., cost,
@@ -691,6 +718,12 @@ _This section must be completed when targeting beta graduation to a release._
691
718
- Impact of its degraded performance or high-error rates on the feature:
692
719
693
720
721
+
- CRI implementation
722
+
- Usage description:
723
+
- Impact of its outage on the feature: The feature, as well as many other pieces of Kubernetes, would not work, as the CRI implementation is vital to the creation and running of Pods.
724
+
- Impact of its degraded performance or high-error rates on the feature: All Kuberetes operations will slow down if the CRI spends too much energy in getting the stats.
725
+
726
+
694
727
### Scalability
695
728
696
729
_For alpha, this section is encouraged: reviewers should consider these questions
@@ -729,7 +762,7 @@ operations covered by [existing SLIs/SLOs]?**
729
762
- This may come because cAdvisor's performance has been fine-tuned, and changing the location of work may loose some optimizations.
730
763
- However, it is explicitly stated that a requirement for transition from Alpha->Beta is little to no performance degradation.
731
764
- The existence of the feature gate will allow users to mitigate this potential blip in performance (by not opting-in).
732
-
***Will enabling / using this feature result in non-negligible increase of
765
+
***Will enabling / using this feature result in non-negligible increase of
733
766
resource usage (CPU, RAM, disk, IO, ...) in any components?**
734
767
- It most likely will reduce resource utilization. Right now, there is duplicate work being done between CRI and cAdvisor.
735
768
This will not happen anymore.
@@ -746,7 +779,7 @@ _This section must be completed when targeting beta graduation to a release._
746
779
***How does this feature react if the API server and/or etcd is unavailable?**
747
780
- Should not change.
748
781
***What are other known failure modes?**
749
-
- Kubelet should fail early if problems are detected with version skew. Nothing else should be affected.
782
+
- Kubelet should fall back to using cAdvisor if errors are detected with version skew. Nothing else should be affected.
750
783
751
784
***What steps should be taken if SLOs are not being met to determine the problem?**
0 commit comments