Change the container and pod information resolution from Kubelet API to APIServer Watcher. #615
-
Replying to the @SamYuan1990 comment here: #611 (comment)
Note that get operations are not efficient and can introduce scalability issues. But we can use the logic of …
-
While on some dev environments it is possible to …, kube-apiserver access will be needed to track Pod lifecycle and hierarchy (tracked in #301). We can make a more robust kube-apiserver watcher that Kepler can use in conjunction with the Kubelet API.
-
I asked this question, but given the following, it seems this is a general and reasonable approach, provided the performance considerations have been well optimized?
-
I created a PR to fix this: #594. The PR only watches pods and filters the events by the node label, minimizing event updates. Unfortunately, we cannot do the same for Jobs and Deployments since they do not have the node label.
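For illustration only (this is not the actual code from #594; the `NODE_NAME` environment variable and the in-cluster config are assumptions), a client-go pod informer can drop events for pods that are not scheduled on the local node with a `cache.FilteringResourceEventHandler`, roughly like this:

```go
// Rough sketch: client-side filtering of pod events down to the local node.
package main

import (
	"fmt"
	"os"
	"time"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/cache"
)

func main() {
	cfg, err := rest.InClusterConfig() // assumes Kepler runs in-cluster with a service account
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)
	nodeName := os.Getenv("NODE_NAME") // assumed to be injected via the downward API

	factory := informers.NewSharedInformerFactory(client, 30*time.Second)
	podInformer := factory.Core().V1().Pods().Informer()

	podInformer.AddEventHandler(cache.FilteringResourceEventHandler{
		// Drop every event that is not about a pod scheduled on this node.
		FilterFunc: func(obj interface{}) bool {
			pod, ok := obj.(*corev1.Pod)
			return ok && pod.Spec.NodeName == nodeName
		},
		Handler: cache.ResourceEventHandlerFuncs{
			AddFunc:    func(obj interface{}) { fmt.Println("add:", obj.(*corev1.Pod).Name) },
			UpdateFunc: func(_, newObj interface{}) { fmt.Println("update:", newObj.(*corev1.Pod).Name) },
			DeleteFunc: func(obj interface{}) { fmt.Println("delete") },
		},
	})

	stop := make(chan struct{})
	defer close(stop)
	factory.Start(stop)
	factory.WaitForCacheSync(stop)
	select {} // keep running
}
```

Note that this variant still lists/watches all pods and only filters on the client side; the field-selector approach discussed later pushes the filtering to the API server.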
-
Hi everyone, I need to open a discussion with the community, so could you please give some feedback on how to improve the container/pod information resolution?
The current code retrieves the list of pods from the Kubelet API and searches through the entire list to get container and pod details, such as name and ID. This approach can have performance and scalability issues if the node has a large number of pods and containers, and I recently discovered that it also has security vulnerabilities.
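To make the concern concrete, here is a rough sketch of this lookup pattern (not Kepler's actual code; the Kubelet `/pods` endpoint on port 10250, the skipped TLS verification, and the missing bearer token handling are all assumptions for illustration): every resolution fetches the full pod list and scans it linearly.

```go
// Rough illustration of the current style of lookup against the Kubelet API.
package main

import (
	"crypto/tls"
	"encoding/json"
	"fmt"
	"net/http"
	"strings"

	corev1 "k8s.io/api/core/v1"
)

// lookupContainer walks every pod and every container status returned by the
// Kubelet until it finds a matching container ID: O(pods x containers) per lookup.
func lookupContainer(kubeletURL, containerID string) (pod, container string, err error) {
	client := &http.Client{Transport: &http.Transport{
		TLSClientConfig: &tls.Config{InsecureSkipVerify: true}, // illustrative only
	}}
	resp, err := client.Get(kubeletURL + "/pods")
	if err != nil {
		return "", "", err
	}
	defer resp.Body.Close()

	var podList corev1.PodList
	if err := json.NewDecoder(resp.Body).Decode(&podList); err != nil {
		return "", "", err
	}
	for _, p := range podList.Items {
		for _, cs := range p.Status.ContainerStatuses {
			if strings.Contains(cs.ContainerID, containerID) {
				return p.Name, cs.Name, nil
			}
		}
	}
	return "", "", fmt.Errorf("container %s not found", containerID)
}

func main() {
	pod, container, err := lookupContainer("https://localhost:10250", "abc123")
	fmt.Println(pod, container, err)
}
```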
I have proposed an approach using the container runtime in PR #611. However, as pointed out by @rootfs, exposing the container runtime introduces security vulnerabilities; see here.
Thanks to @rootfs's pointer, I've done more investigation into security vulnerabilities and found that enabling access to the Kubelet API (which is our current implementation) can also introduce security vulnerabilities. This article highlights some of the issues: https://faun.pub/attacking-kubernetes-clusters-using-the-kubelet-api-abafc36126ca. As with the container runtime security issue, it might also be possible to execute commands in any container through the Kubelet API with a command like this:
curl -k -XPOST "https://${IP_ADDRESS}:10250/run/<namespace>/<pod>/<container>" -d "cmd=<command-to-run>"
Additionally, and even more critical for Kepler: if the cluster admin disables unauthenticated access to the Kubelet API, Kepler may not be able to resolve pod and container names.
Perhaps it would be worth reconsidering the approach of watching pods filtered by node name, as I previously suggested in this comment: #301 (comment). Although an APIServer watcher can introduce some overhead on the API server, since we can filter the pod list by node name the overhead should not be too high. Note that the feedback from the Kubernetes sig-scalability community is that watching pods filtered by spec.nodeName is specially optimized in K8s and should work fine; the scalability problem is listing Deployments and Jobs, see here.
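To sketch how such a watcher could look (this is my reading of the proposal, not a final design; `NODE_NAME` via the downward API and the in-cluster config are assumptions), a shared informer can be restricted server-side with the spec.nodeName field selector, so the API server only streams events for pods on the local node:

```go
// Sketch of a node-scoped pod watcher using a spec.nodeName field selector.
package main

import (
	"fmt"
	"os"
	"time"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/cache"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)
	nodeName := os.Getenv("NODE_NAME") // assumed to be injected via the downward API

	// Restrict the informer's list/watch to pods scheduled on this node.
	factory := informers.NewSharedInformerFactoryWithOptions(client, 30*time.Second,
		informers.WithTweakListOptions(func(opts *metav1.ListOptions) {
			opts.FieldSelector = "spec.nodeName=" + nodeName
		}))

	podInformer := factory.Core().V1().Pods().Informer()
	podInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc: func(obj interface{}) {
			pod := obj.(*corev1.Pod)
			fmt.Printf("pod %s/%s is running on this node\n", pod.Namespace, pod.Name)
		},
	})

	stop := make(chan struct{})
	defer close(stop)
	factory.Start(stop)
	factory.WaitForCacheSync(stop)
	select {}
}
```

Because the filtering happens in the list/watch request itself, the API server never has to fan out events for pods on other nodes to every Kepler instance, which is what keeps the per-node overhead low.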