JaredforReal
Collaborator

What this PR does / why we need it:

  • Add llm-katan to the Kubernetes deployment
  • Separate the core and llm-katan modes, as we already do in the Docker Compose deployment
  • Update the README and docs
  • Expand the PVC size
  • Fix the inference-pool selector error
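
The PR description does not say what the selector error was. As a general illustration only, assuming the inference pool picks pods by equality-based label matching, a selector matches a pod only when every key/value pair in the selector also appears in the pod's labels, so a single mismatched value leaves the pool with no backends. The label names and values below are hypothetical and are not taken from the changed manifests.

```python
from typing import Mapping


def selector_matches(selector: Mapping[str, str], labels: Mapping[str, str]) -> bool:
    """Return True when every selector key/value pair is present in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


# Hypothetical labels, for illustration only.
pod_labels = {"app": "semantic-router", "mode": "llm-katan"}

print(selector_matches({"app": "semantic-router"}, pod_labels))  # selector is a subset of the labels
print(selector_matches({"app": "semantic_router"}, pod_labels))  # value differs, so no match
```

A check like this can be run against the labels on a pod template before applying a manifest, to catch a selector that would select nothing.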

netlify bot commented Oct 17, 2025

Deploy Preview for vllm-semantic-router ready!

| Name | Link |
|---|---|
| 🔨 Latest commit | 320d9d9 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/vllm-semantic-router/deploys/68f256244403ca00087df27d |
| 😎 Deploy Preview | https://deploy-preview-466--vllm-semantic-router.netlify.app |

👥 vLLM Semantic Team Notification

The following members have been identified for the changed files in this PR and have been automatically assigned:

📁 deploy

Owners: @rootfs, @Xunzhuo
Files changed:

  • deploy/kubernetes/base/kustomization.yaml
  • deploy/kubernetes/deployment.katan.yaml
  • deploy/kubernetes/overlays/core/kustomization.yaml
  • deploy/kubernetes/overlays/llm-katan/kustomization.yaml
  • deploy/kubernetes/README.md
  • deploy/kubernetes/ai-gateway/inference-pool/inference-pool.yaml
  • deploy/kubernetes/config.yaml
  • deploy/kubernetes/deployment.yaml
  • deploy/kubernetes/kustomization.yaml
  • deploy/kubernetes/pvc.yaml

📁 website

Owners: @Xunzhuo, @rootfs, @yuluo-yx
Files changed:

  • website/docs/installation/kubernetes.md

🎉 Thanks for your contributions!

This comment was automatically generated based on the OWNER files in the repository.

@JaredforReal JaredforReal marked this pull request as draft October 17, 2025 14:59
@yossiovadia yossiovadia self-requested a review October 17, 2025 17:20
Collaborator

@yossiovadia yossiovadia left a comment

Looks solid for moving forward. Consider adding validation for the Qwen model download and making the init container more robust in handling download failures, though both can be future enhancements.
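
One possible shape for that future enhancement is sketched below: retry the download with exponential backoff and accept the file only when its SHA-256 digest matches an expected value. This is a minimal sketch, not how the init container in this PR works. The `fetch` callable, the `download_with_retry` name, and the use of a checksum as the validation step are all assumptions made for illustration.

```python
import hashlib
import time
from pathlib import Path
from typing import Callable


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def download_with_retry(
    fetch: Callable[[Path], object],  # hypothetical downloader that writes to dest
    dest: Path,
    expected_sha256: str,
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call fetch(dest) until the file validates or the attempts run out."""
    for attempt in range(1, attempts + 1):
        try:
            fetch(dest)
            if dest.exists() and sha256_of(dest) == expected_sha256:
                return True
            print(f"attempt {attempt}: checksum mismatch or missing file")
        except Exception as exc:  # broad on purpose: any fetch failure triggers a retry
            print(f"attempt {attempt}: download failed: {exc}")
        # Remove a partial or corrupt file so the next attempt starts clean.
        dest.unlink(missing_ok=True)
        if attempt < attempts:
            sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff
    return False
```

An init container entrypoint could exit non-zero when this returns `False`, so that Kubernetes restarts the init container instead of starting the main container with a missing or corrupt model.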

4 participants