Component
Helm Chart
Desired use case or feature
When decode and prefill pods are stood up with quickstart/llmd-installer.sh, they lack an appropriate readiness probe. With large models, the pods are reported as "ready" several minutes before they can actually accept requests.
Proposed solution
Add readiness probes that confirm vLLM is ready to accept requests (for example, an httpGet probe against /health), along with probes for any sidecars.
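
As a rough illustration of what this could look like in the pod spec, here is a minimal sketch. The container names, both port numbers, and all timing values are assumptions made for illustration and are not taken from the chart; only the /health path comes from the proposal above.

```yaml
# Sketch only: names, ports, and timings are placeholders, not chart values.
containers:
  - name: vllm                  # hypothetical container name
    readinessProbe:
      httpGet:
        path: /health           # endpoint suggested in the proposal
        port: 8000              # assumed vLLM serving port
      periodSeconds: 10         # check every 10 seconds
      timeoutSeconds: 5         # a slow response counts as a failure
      failureThreshold: 3       # mark unready after 3 consecutive failures
  - name: routing-sidecar       # hypothetical sidecar name
    readinessProbe:
      tcpSocket:
        port: 8080              # hypothetical sidecar port
      periodSeconds: 10
```

With probes like these, the pod would only be added to Service endpoints once every container passes its check, so traffic should not arrive while the model is still loading. Ideally the ports and timings would be exposed as chart values so they can be tuned per model size.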
Alternatives
No response
Additional context or screenshots
No response