Change the container and pod information resolution from Kubelet API to APIServer Watcher. #615
-
Replying to the @SamYuan1990 comment here: #611 (comment)
Note that get operations are not efficient and can introduce scalability issues. But we can use the logic of …
-
While on some dev environments it is possible to …, kube-apiserver access will be needed to track Pod lifecycle and hierarchy (tracked in #301). We can make a more robust kube-apiserver watcher that Kepler can use in conjunction with the Kubelet API.
-
I asked this question, but given the following, it seems this is a general and reasonable approach, provided the performance considerations have been well optimized?
-
I created a PR to fix this: #594. The PR only watches pods and filters the events by the node label, minimizing event updates. Unfortunately, we cannot do the same for Jobs and Deployments since they do not have the node label.
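For illustration only (this is not the actual code from #594; the `NODE_NAME` environment variable and the in-cluster config are assumptions), a client-go pod informer can drop events for pods that are not scheduled on the local node with a `cache.FilteringResourceEventHandler`, roughly like this:

```go
// Rough sketch: client-side filtering of pod events down to the local node.
package main

import (
	"fmt"
	"os"
	"time"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/cache"
)

func main() {
	cfg, err := rest.InClusterConfig() // assumes Kepler runs in-cluster with a service account
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)
	nodeName := os.Getenv("NODE_NAME") // assumed to be injected via the downward API

	factory := informers.NewSharedInformerFactory(client, 30*time.Second)
	podInformer := factory.Core().V1().Pods().Informer()

	podInformer.AddEventHandler(cache.FilteringResourceEventHandler{
		// Drop every event that is not about a pod scheduled on this node.
		FilterFunc: func(obj interface{}) bool {
			pod, ok := obj.(*corev1.Pod)
			return ok && pod.Spec.NodeName == nodeName
		},
		Handler: cache.ResourceEventHandlerFuncs{
			AddFunc:    func(obj interface{}) { fmt.Println("add:", obj.(*corev1.Pod).Name) },
			UpdateFunc: func(_, newObj interface{}) { fmt.Println("update:", newObj.(*corev1.Pod).Name) },
			DeleteFunc: func(obj interface{}) { fmt.Println("delete") },
		},
	})

	stop := make(chan struct{})
	defer close(stop)
	factory.Start(stop)
	factory.WaitForCacheSync(stop)
	select {} // keep running
}
```

Note that this variant still lists/watches all pods and only filters on the client side; the field-selector approach discussed later pushes the filtering to the API server.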
-
Hi everyone, I need to open a discussion with the community, so could you please give some feedback on how to improve the container/pod information resolution?
The current code retrieves the list of pods from the Kubelet API and searches through the entire list to get container and pod details, such as name and ID. This approach can have performance and scalability issues if the node has a large number of pods and containers, and I recently discovered that it also has security vulnerabilities.
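To make the concern concrete, here is a rough sketch of this lookup pattern (not Kepler's actual code; the Kubelet `/pods` endpoint on port 10250, the skipped TLS verification, and the missing bearer token handling are all assumptions for illustration): every resolution fetches the full pod list and scans it linearly.

```go
// Rough illustration of the current style of lookup against the Kubelet API.
package main

import (
	"crypto/tls"
	"encoding/json"
	"fmt"
	"net/http"
	"strings"

	corev1 "k8s.io/api/core/v1"
)

// lookupContainer walks every pod and every container status returned by the
// Kubelet until it finds a matching container ID: O(pods x containers) per lookup.
func lookupContainer(kubeletURL, containerID string) (pod, container string, err error) {
	client := &http.Client{Transport: &http.Transport{
		TLSClientConfig: &tls.Config{InsecureSkipVerify: true}, // illustrative only
	}}
	resp, err := client.Get(kubeletURL + "/pods")
	if err != nil {
		return "", "", err
	}
	defer resp.Body.Close()

	var podList corev1.PodList
	if err := json.NewDecoder(resp.Body).Decode(&podList); err != nil {
		return "", "", err
	}
	for _, p := range podList.Items {
		for _, cs := range p.Status.ContainerStatuses {
			if strings.Contains(cs.ContainerID, containerID) {
				return p.Name, cs.Name, nil
			}
		}
	}
	return "", "", fmt.Errorf("container %s not found", containerID)
}

func main() {
	pod, container, err := lookupContainer("https://localhost:10250", "abc123")
	fmt.Println(pod, container, err)
}
```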
I have proposed an approach using the container runtime in PR #611. However, as pointed out by @rootfs, exposing the container runtime introduces security vulnerabilities; see here.
Thanks to @rootfs's pointer, I've done more investigation into security vulnerabilities and found that enabling access to the Kubelet API (which is our current implementation) can also introduce security vulnerabilities. This article highlights some of the issues: https://faun.pub/attacking-kubernetes-clusters-using-the-kubelet-api-abafc36126ca. As with the container runtime security issue, it might also be possible to execute commands in any container through the Kubelet API with a command like this:
curl -k -XPOST "https://${IP_ADDRESS}:10250/run/<namespace>/<pod>/<container>" -d "cmd=<command-to-run>"
Additionally, and even more critical for Kepler: if the cluster admin disables unauthenticated access to the Kubelet API, Kepler may not be able to resolve pod and container names.
Perhaps it would be worth reconsidering the approach of watching pods filtered by node name, as I previously suggested in this comment: #301 (comment). Although an APIServer watcher can introduce some overhead on the API server, since we can filter the pod list by node name the overhead should not be too high. Note that the feedback from the Kubernetes sig-scalability community is that watching pods filtered by spec.nodeName is specially optimized in K8s and should work fine; the scalability problem is listing Deployments and Jobs, see here.
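To sketch how such a watcher could look (this is my reading of the proposal, not a final design; `NODE_NAME` via the downward API and the in-cluster config are assumptions), a shared informer can be restricted server-side with the spec.nodeName field selector, so the API server only streams events for pods on the local node:

```go
// Sketch of a node-scoped pod watcher using a spec.nodeName field selector.
package main

import (
	"fmt"
	"os"
	"time"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/cache"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)
	nodeName := os.Getenv("NODE_NAME") // assumed to be injected via the downward API

	// Restrict the informer's list/watch to pods scheduled on this node.
	factory := informers.NewSharedInformerFactoryWithOptions(client, 30*time.Second,
		informers.WithTweakListOptions(func(opts *metav1.ListOptions) {
			opts.FieldSelector = "spec.nodeName=" + nodeName
		}))

	podInformer := factory.Core().V1().Pods().Informer()
	podInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc: func(obj interface{}) {
			pod := obj.(*corev1.Pod)
			fmt.Printf("pod %s/%s is running on this node\n", pod.Namespace, pod.Name)
		},
	})

	stop := make(chan struct{})
	defer close(stop)
	factory.Start(stop)
	factory.WaitForCacheSync(stop)
	select {}
}
```

Because the filtering happens in the list/watch request itself, the API server never has to fan out events for pods on other nodes to every Kepler instance, which is what keeps the per-node overhead low.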