Add private OCI OKE cookbook for Llama Nemotron Nano 8B #117
fede-kamel wants to merge 4 commits into NVIDIA-NeMo:main
Signed-off-by: Federico Kamelhar <federico.kamelhar@oracle.com>
@chrisalexiuk-nvidia @anushapant @shashank3959: Friendly follow-up! This PR adds an OCI OKE deployment cookbook for Llama Nemotron Nano 8B. Would love to get a review when you get a chance. Thanks!
Hey team 👋, just checking in on this one. Happy to address any feedback or make adjustments to scope if that helps move things along. Let me know if there's anything I can do on my end! @chrisalexiuk-nvidia @anushapant @shashank3959
✅ Cross-validated with NeMo Agent Toolkit OCI integration
This OKE deployment is now serving as the live inference backend for NVIDIA/NeMo-Agent-Toolkit#1804, which adds first-class OCI Generative AI support to the Agent Toolkit. The full Agent Toolkit OCI test suite (11/11 tests pass) was validated against the
@chrisalexiuk-nvidia @anushapant @shashank3959: Quick update: we just used this exact OKE deployment to validate the full OCI integration for NeMo Agent Toolkit! 🚀 Really exciting to see both pieces come together: this PR provides the reproducible OCI deployment, and the Toolkit PR builds on top of it with a first-class integration. Two repos, one Nemotron story on OCI. Looking forward to your review!
Summary
- `nvidia/Llama-3.1-Nemotron-Nano-8B-v1`
- `us-phoenix-1` with no public control-plane endpoint, no public worker IPs, and no public inference endpoint
- `vLLM` values file for a single `VM.GPU.A10.1` node

Validation
Validated against a live Phoenix OKE deployment of `nvidia/Llama-3.1-Nemotron-Nano-8B-v1` using a private cluster plus OCI Bastion/tunnel access:
- `terraform plan`
- `terraform apply`
- `PHX-AD-2`
- `/health`
- `/v1/models`

Notes
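The `/health` and `/v1/models` checks listed under Validation could be scripted roughly as follows. This is a minimal sketch, not the cookbook's own tooling: it assumes the private vLLM service has already been forwarded to a local port through the Bastion tunnel, and the address `http://localhost:8000`, the function names, and the injectable `get` parameter are all hypothetical choices made for illustration.

```python
import json
from urllib.request import urlopen

# Assumption: the private inference endpoint is forwarded to this local
# address through a tunnel. The host and port are hypothetical.
BASE_URL = "http://localhost:8000"


def http_get(url, timeout=10):
    """Return (status_code, body_text) for a GET request.

    urlopen raises urllib.error.HTTPError on 4xx/5xx responses, so callers
    may want to catch that when the service is still starting up.
    """
    with urlopen(url, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8")


def check_endpoints(base_url=BASE_URL, get=http_get):
    """Probe /health and /v1/models and return a small summary dict."""
    base = base_url.rstrip("/")
    health_status, _ = get(base + "/health")
    models_status, models_body = get(base + "/v1/models")

    # OpenAI-compatible servers list the served models under the "data" key.
    model_ids = []
    if models_status == 200:
        payload = json.loads(models_body)
        model_ids = [entry["id"] for entry in payload.get("data", [])]

    return {"healthy": health_status == 200, "models": model_ids}
```

With the tunnel open, `check_endpoints()` would be expected to report the deployment as healthy and to include `nvidia/Llama-3.1-Nemotron-Nano-8B-v1` among the served models. The `get` parameter is there so the logic can be exercised without a live cluster.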