kubectl describe suffers from N+1 query problem when falling back to prefix search, causing performance issues #1769

### What happened?

Running `kubectl describe pod <pod-name>` on a pod that does not exist triggers a fallback that can significantly degrade performance on large clusters. Instead of simply reporting that the pod was not found, kubectl proceeds to:

1. List all pods in the namespace.

2. Filter this list in memory for pods whose names start with `<pod-name>`.

3. Issue a separate API call for each matching pod to fetch its associated events.

This behavior results in an N+1 query problem, where 'N' is the number of pods matching the prefix. On a cluster with a large number of pods and events, this can generate a substantial number of requests to the API server, leading to slow response times for the user and increased load on the cluster's control plane.

For instance, in a cluster with tens of thousands of pods, a single kubectl describe command for a non-existent pod could trigger thousands of individual ListEvents calls. This is not only inefficient but also scales poorly as the cluster grows.
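
To make the request count concrete, here is a minimal, self-contained sketch of the three steps above. It does not use kubectl or client-go code: `fakeAPI`, `listPods`, `listEventsFor`, and `describeByPrefix` are hypothetical names that only count simulated requests, and the initial exact-name lookup that fails is left out.

```go
package main

import (
	"fmt"
	"strings"
)

// fakeAPI is a hypothetical stand-in for the API server that counts requests.
type fakeAPI struct {
	pods     []string
	requests int
}

// listPods simulates one LIST pods request for the namespace.
func (a *fakeAPI) listPods() []string {
	a.requests++
	return a.pods
}

// listEventsFor simulates one LIST events request scoped to a single pod.
func (a *fakeAPI) listEventsFor(pod string) []string {
	a.requests++
	return []string{"Scheduled " + pod}
}

// describeByPrefix mirrors the fallback: list all pods, filter by prefix
// in memory, then fetch events once per matching pod.
func describeByPrefix(a *fakeAPI, prefix string) int {
	matched := 0
	for _, p := range a.listPods() {
		if strings.HasPrefix(p, prefix) {
			a.listEventsFor(p)
			matched++
		}
	}
	return matched
}

func main() {
	api := &fakeAPI{}
	for i := 0; i < 10; i++ {
		api.pods = append(api.pods, fmt.Sprintf("nginx-%d", i))
	}
	api.pods = append(api.pods, "redis-0")

	n := describeByPrefix(api, "nginx")
	fmt.Printf("matched=%d requests=%d\n", n, api.requests) // matched=10 requests=11
}
```

With 10 matching pods the simulation makes 11 requests: one pod list plus one events list per match.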

### Root cause

The prefix fallback was introduced in https://github.com/kubernetes/kubernetes/pull/7467

The current implementation is at https://github.com/kubernetes/kubernetes/blob/59526cd4867447956156ae3a602fcbac10a2c335/staging/src/k8s.io/kubectl/pkg/cmd/describe/describe.go#L189

The behavior is even documented in https://kubernetes.io/docs/reference/kubectl/generated/kubectl_describe/
<img width="1857" height="1149" alt="Image" src="https://github.com/user-attachments/assets/21e3eb68-5015-44fb-8e61-3556e3a37a3e" />

### Proposal

Several approaches could mitigate this issue:

* Do nothing: Instruct users of large clusters to avoid this pattern. This places the burden on the user and may not be a scalable solution from a usability perspective.

* Remove the feature: Discontinue the prefix-matching fallback entirely. If a resource is not found by its exact name, kubectl should simply report that. This would be a breaking change but would eliminate the performance problem.

* Add user confirmation: If the exact pod is not found, prompt the user before proceeding with the prefix search, especially in an interactive shell. The prompt could include a warning about the potential for performance issues.
  ```
  No pod 'X' found. Do you want to search for pods with the prefix 'X'? [y/N]
  Warning: This may be inefficient in large clusters.
  ```

* Require an explicit flag: Introduce a new flag, such as `--prefix`, to enable this search behavior, making it an opt-in feature rather than the default fallback.

* [Recommended] Flatten the event queries: Instead of making N separate calls for events, kubectl could identify all pods that match the prefix, then issue a single ListEvents request with a field selector that fetches events for all of them at once, and filter the results locally.

* Skip fetching events when the number of matching pods exceeds some threshold (suggested by @soltysh).
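
The last two options could be combined roughly as in the sketch below. This is not a patch against kubectl: `Event`, `eventLister`, `eventsForPods`, and `maxPods` are hypothetical names, and the lister stands in for one namespace-wide events request. As far as I know, field selectors cannot express a set of names, so the sketch assumes the server-side selector narrows only by object kind and the per-pod filtering happens locally.

```go
package main

import "fmt"

// Event is a simplified, hypothetical stand-in for a Kubernetes Event.
type Event struct {
	InvolvedKind string
	InvolvedName string
	Message      string
}

// eventLister is a hypothetical function type representing ONE list-events
// request for the namespace, narrowed server-side to the given object kind.
type eventLister func(kind string) []Event

// eventsForPods issues a single list call and groups the events by pod name
// locally. If more than maxPods pods matched the prefix, it skips the events
// lookup entirely and reports false.
func eventsForPods(list eventLister, pods []string, maxPods int) (map[string][]Event, bool) {
	if len(pods) > maxPods {
		return nil, false
	}
	wanted := make(map[string]struct{}, len(pods))
	for _, p := range pods {
		wanted[p] = struct{}{}
	}
	grouped := make(map[string][]Event, len(pods))
	for _, ev := range list("Pod") {
		if _, ok := wanted[ev.InvolvedName]; ok && ev.InvolvedKind == "Pod" {
			grouped[ev.InvolvedName] = append(grouped[ev.InvolvedName], ev)
		}
	}
	return grouped, true
}

func main() {
	calls := 0
	list := func(kind string) []Event {
		calls++ // count simulated API requests
		return []Event{
			{InvolvedKind: "Pod", InvolvedName: "nginx-0", Message: "Scheduled"},
			{InvolvedKind: "Pod", InvolvedName: "nginx-1", Message: "Pulled"},
			{InvolvedKind: "Pod", InvolvedName: "redis-0", Message: "Started"},
		}
	}
	grouped, ok := eventsForPods(list, []string{"nginx-0", "nginx-1"}, 50)
	fmt.Println(ok, len(grouped), calls) // true 2 1
}
```

The trade-off is that one request may return many events for pods that did not match, so the response can be large even though the request count stays constant.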

### What did you expect to happen?

A single kubectl invocation should not generate `N` queries to the apiserver.

### How can we reproduce it (as minimally and precisely as possible)?

Noticed it on an existing cluster and confirmed by reading the code. I have not run this, but I expect the following to reproduce it:

Create a deployment `nginx` with 10 pods, then run `kubectl describe pods nginx`. The apiserver logs should show 10 requests for events, one for each pod.


