failed to create pod "llm-xpyd-roleset-xxxxx" already exists, the server was not able to generate a unique name for the object #1481

@Jeffwan

🐛 Describe the bug

E0819 11:25:41.378465       1 controller.go:316] "msg"="Reconciler error" "error"="failed to create pod llm-xpyd-roleset-425c7-prefill-586944488f-2-1: pods \"llm-xpyd-roleset-425c7-prefill-586944488f-2-1\" already exists, the server was not able to generate a unique name for the object" "PodSet"={"name":"llm-xpyd-roleset-425c7-prefill-586944488f-2","namespace":"default"} "controller"="podset-controller" "controllerGroup"="orchestration.aibrix.ai" "controllerKind"="PodSet" "name"="llm-xpyd-roleset-425c7-prefill-586944488f-2" "namespace"="default" "reconcileID"="754a77ff-d0df-40fa-893b-c860060cd10a"
E0819 11:25:46.508799       1 controller.go:316] "msg"="Reconciler error" "error"="failed to create pod llm-xpyd-roleset-425c7-prefill-586944488f-2-1: pods \"llm-xpyd-roleset-425c7-prefill-586944488f-2-1\" already exists, the server was not able to generate a unique name for the object" "PodSet"={"name":"llm-xpyd-roleset-425c7-prefill-586944488f-2","namespace":"default"} "controller"="podset-controller" "controllerGroup"="orchestration.aibrix.ai" "controllerKind"="PodSet" "name"="llm-xpyd-roleset-425c7-prefill-586944488f-2" "namespace"="default" "reconcileID"="872c0dbd-5cb2-451a-9dbb-39a9681c7dfc"

This is a known issue and a blocker that we should fix. The short-term workaround is to also delete the corresponding higher-index pod.

Steps to Reproduce

kubectl get pods
NAME                                            READY   STATUS    RESTARTS   AGE
llm-xpyd-roleset-425c7-decode-76647b8d65-0-0    1/1     Running   0          3s
llm-xpyd-roleset-425c7-decode-76647b8d65-0-1    1/1     Running   0          3s
llm-xpyd-roleset-425c7-prefill-586944488f-0-0   1/1     Running   0          3s
llm-xpyd-roleset-425c7-prefill-586944488f-0-1   1/1     Running   0          3s
llm-xpyd-roleset-425c7-prefill-586944488f-1-0   1/1     Running   0          3s
llm-xpyd-roleset-425c7-prefill-586944488f-1-1   1/1     Running   0          3s
llm-xpyd-roleset-425c7-prefill-586944488f-2-0   1/1     Running   0          3s
llm-xpyd-roleset-425c7-prefill-586944488f-2-1   1/1     Running   0          3s


kubectl delete pod llm-xpyd-roleset-425c7-prefill-586944488f-2-0
pod "llm-xpyd-roleset-425c7-prefill-586944488f-2-0" deleted


kubectl get pods
NAME                                            READY   STATUS    RESTARTS   AGE
llm-xpyd-roleset-425c7-decode-76647b8d65-0-0    1/1     Running   0          6m15s
llm-xpyd-roleset-425c7-decode-76647b8d65-0-1    1/1     Running   0          6m15s
llm-xpyd-roleset-425c7-prefill-586944488f-0-0   1/1     Running   0          6m15s
llm-xpyd-roleset-425c7-prefill-586944488f-0-1   1/1     Running   0          6m15s
llm-xpyd-roleset-425c7-prefill-586944488f-1-0   1/1     Running   0          6m15s
llm-xpyd-roleset-425c7-prefill-586944488f-1-1   1/1     Running   0          6m15s

Expected behavior

The controller should recreate the deleted pod successfully.
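
The error in the logs is consistent with a controller that derives the new pod's index from the number of surviving pods, although the issue does not confirm the root cause. Under that assumption, a minimal Go sketch of choosing the lowest unused index instead could look like the following. `nextFreeIndex` and the `<podset>-<index>` naming scheme are hypothetical illustrations, not taken from the AIBrix code base.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// nextFreeIndex returns the lowest non-negative index that is not used by
// any existing pod named "<podSetName>-<index>".
// Hypothetical helper for illustration; not the AIBrix implementation.
func nextFreeIndex(podSetName string, existing []string) int {
	prefix := podSetName + "-"
	used := make(map[int]bool, len(existing))
	for _, name := range existing {
		if !strings.HasPrefix(name, prefix) {
			continue // pod does not belong to this PodSet
		}
		suffix := strings.TrimPrefix(name, prefix)
		if idx, err := strconv.Atoi(suffix); err == nil && idx >= 0 {
			used[idx] = true
		}
	}
	// Walk upward from 0 until a gap is found.
	for i := 0; ; i++ {
		if !used[i] {
			return i
		}
	}
}

func main() {
	podSet := "llm-xpyd-roleset-425c7-prefill-586944488f-2"
	// Pod "-0" was deleted; only "-1" survives.
	existing := []string{podSet + "-1"}

	// Assumed buggy behavior: index taken from the surviving pod count.
	fmt.Printf("count-based index: %d\n", len(existing))
	// Gap-filling behavior: reuse the freed index.
	fmt.Printf("gap-filling index: %d\n", nextFreeIndex(podSet, existing))
}
```

With the pod list from the reproduction, the count-based choice yields index 1, which collides with the surviving pod, while the gap-filling choice yields index 0, the slot that was actually freed.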

Environment

N/A

Metadata

Labels

kind/bug: Something isn't working
kind/feature: Categorizes issue or PR as related to a new feature.
priority/critical-urgent: Highest priority. Must be actively worked on as someone's top priority right now.
