How to deploy GPT-OSS 120B on KubeAI with multi-node GPU setup? #589
Unanswered · SupakritCRO asked this question in Q&A
Replies: 0 comments
Hi everyone 👋
I have a Kubernetes cluster with 2 control-plane nodes and 2 worker nodes; each worker has 1× NVIDIA RTX 4090 GPU.
I want to use KubeAI to deploy the GPT-OSS 120B open-weight model as an API or chat GUI for internal use, ideally with distributed inference across the two GPU nodes.
Has anyone tried this setup?
Any tips or real-world experience would be appreciated 🙏
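For context on what I've been looking at: a KubeAI `Model` manifest along these lines seems to be the starting point. This is only a sketch under assumptions — the `url`, `resourceProfile` name, and vLLM `args` below are guesses that would need to be checked against the KubeAI docs, and I'm not sure whether a single vLLM replica managed by KubeAI can actually span two separate nodes (vLLM's multi-node mode normally needs a Ray cluster), as opposed to using multiple GPUs inside one node.

```yaml
# Sketch of a KubeAI Model resource (fields assumed from KubeAI's
# documented examples; verify names/values against your KubeAI version).
apiVersion: kubeai.org/v1
kind: Model
metadata:
  name: gpt-oss-120b
spec:
  features: [TextGeneration]
  owner: openai
  url: hf://openai/gpt-oss-120b   # assumed Hugging Face model path
  engine: VLLM
  args:
    # Splitting the model across 2 GPUs; whether this works across
    # two *nodes* (rather than two GPUs in one node) is the open question.
    - --pipeline-parallel-size=2
  resourceProfile: nvidia-gpu-rtx4090:1  # hypothetical profile name
  minReplicas: 1
```

One concern with this setup: two RTX 4090s give 48 GB of VRAM total, which may not be enough for 120B-class weights even quantized, so any real-world experience on memory footprint would also help.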