diff --git a/README.md b/README.md index ebe41ced..de2594d4 100644 --- a/README.md +++ b/README.md @@ -168,6 +168,7 @@ Practical deployment and model usage guides for Nemotron models. |-------|----------|--------------|-----------| | [**Nemotron 3 Super 120B A12B**](https://huggingface.co/nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16) | Production deployments needing strong reasoning | 1M context, in NVFP4 single B200, RAG & tool calling | [Cookbooks](./usage-cookbook/Nemotron-3-Super) | | [**Nemotron 3 Nano 30B A3B**](https://huggingface.co/nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16) | Resource-constrained environments | 1M context, sparse MoE hybrid Mamba-2, controllable reasoning | [Cookbooks](./usage-cookbook/Nemotron-3-Nano) | +| [**Llama-3.1-Nemotron-Nano-8B-v1**](https://huggingface.co/nvidia/Llama-3.1-Nemotron-Nano-8B-v1) | Small-footprint OCI deployments | Validated on private OKE in Phoenix with `vLLM`, OCI Bastion service, tool calling, and OpenAI-compatible `/v1` inference; provides a reproducible OCI path comparable to common AWS GPU/Kubernetes deployment patterns | [Cookbooks](./usage-cookbook/Llama-3.1-Nemotron-Nano-8B-v1) | | [**NVIDIA-Nemotron-Nano-12B-v2-VL**](https://huggingface.co/nvidia/NVIDIA-Nemotron-Nano-12B-v2-VL) | Document intelligence and video understanding | 12B VLM, video reasoning, Efficient Video Sampling | [Cookbooks](./usage-cookbook/Nemotron-Nano2-VL/) | | [**Llama-3.1-Nemotron-Safety-Guard-8B-v3**](https://huggingface.co/nvidia/Llama-3.1-Nemotron-Safety-Guard-8B-v3) | Multilingual content moderation | 9 languages, 23 safety categories | [Cookbooks](./usage-cookbook/Llama-3.1-Nemotron-Safety-Guard-V3/) | | **Nemotron-Parse** | Document parsing for RAG and AI agents | Table extraction, semantic segmentation | [Cookbooks](./usage-cookbook/Nemotron-Parse-v1.1/) | diff --git a/usage-cookbook/Llama-3.1-Nemotron-Nano-8B-v1/README.md b/usage-cookbook/Llama-3.1-Nemotron-Nano-8B-v1/README.md new file mode 100644 index 00000000..866a1b3c 
--- /dev/null +++ b/usage-cookbook/Llama-3.1-Nemotron-Nano-8B-v1/README.md @@ -0,0 +1,288 @@ +# Llama-3.1-Nemotron-Nano-8B-v1 on OCI OKE (Private Phoenix Deployment) + +This cookbook documents a validated private deployment of +`nvidia/Llama-3.1-Nemotron-Nano-8B-v1` on **Oracle Cloud Infrastructure (OCI)** using: + +- `us-phoenix-1` +- a **private** Oracle Kubernetes Engine (OKE) cluster +- a single `VM.GPU.A10.1` worker +- `vLLM` with an OpenAI-compatible `/v1` endpoint + +This guide is intentionally **private-only**: + +- no public Kubernetes API endpoint +- no public worker-node IPs +- no public inference endpoint + +Access is handled through **OCI Bastion** and local port forwarding. + +Note: the Terraform sample in this cookbook provisions the **OCI Bastion +service** for reproducible private access. It does **not** create a public +bastion host VM. + +This gives Nemotron users a reproducible Oracle Cloud deployment path that +leans into OCI's strengths for enterprise workloads: private OKE control +planes, managed Bastion access, and a clean separation between infrastructure +provisioning and model serving. + +## Why this configuration + +This setup keeps the footprint to a single GPU while preserving tool calling, +structured output, and streaming support. + +For teams evaluating cloud options for Nemotron, this sample shows that OCI can +offer a practical and well-contained production shape: private networking, +managed access, and a validated GPU-backed serving path in Phoenix.
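Structured output is one of the capabilities validated below. A minimal client-side sketch of a JSON-mode request body for the OpenAI-compatible `/v1/chat/completions` endpoint; the `response_format` field assumes vLLM's JSON-mode support, so confirm it against the vLLM version you deploy:

```python
import json

# Hypothetical helper: build a JSON-mode chat-completions request body.
# The "response_format" field assumes vLLM's OpenAI-compatible JSON-mode
# support; verify against your deployed vLLM version.
def build_structured_request(prompt: str) -> str:
    payload = {
        "model": "nvidia/Llama-3.1-Nemotron-Nano-8B-v1",
        "messages": [{"role": "user", "content": prompt}],
        "response_format": {"type": "json_object"},
        "max_tokens": 256,
    }
    return json.dumps(payload)

body = build_structured_request('Reply with {"status": "NEMOTRON_OK"} as JSON.')
```

POST this body to the local `/v1/chat/completions` endpoint once the port-forward described later in this guide is active.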
+ +Validated capabilities on this deployment: + +- chat completion +- structured output +- tool calling +- streaming +- async/concurrent requests +- OpenAI-compatible model discovery via `/v1/models` + +## Tested environment + +- Region: `us-phoenix-1` +- Kubernetes: OKE private cluster +- GPU shape: `VM.GPU.A10.1` +- Model: `nvidia/Llama-3.1-Nemotron-Nano-8B-v1` +- Serving stack: `vLLM` +- Inference API: OpenAI-compatible `/v1` + +## Architecture + +1. Create a **private** OKE cluster in Phoenix. +2. Create a CPU node pool and a GPU node pool. +3. Use **OCI Bastion** to reach the cluster API locally. +4. Deploy Nemotron with the checked-in `vLLM` values file. +5. Keep the inference service internal and validate it through a local + port-forward. + +## Prerequisites + +- OCI tenancy with Phoenix capacity for `VM.GPU.A10.1` +- OKE permissions +- OCI Bastion permissions +- `kubectl` +- `helm` +- access to pull the Nemotron model from Hugging Face or an equivalent model + artifact source accepted by your environment + +## Deployment notes + +This cookbook assumes a private OKE cluster. Keep these constraints: + +- disable the public Kubernetes control-plane endpoint +- do not attach public IPs to worker nodes +- do not expose the model through a public load balancer + +The known-good serving values are in +[`vllm_oke_phoenix_private_values.yaml`](./vllm_oke_phoenix_private_values.yaml). + +Terraform for the private Phoenix OKE infrastructure is available in +[`terraform/`](./terraform/). 
+ +That Terraform path was validated end to end in Phoenix through: + +- VCN and private subnets +- private OKE control plane +- OCI Bastion service +- CPU node pool +- GPU node pool on `VM.GPU.A10.1` + +Important settings for this single-A10 deployment: + +- `maxModelLen: 4096` +- `gpuMemoryUtilization: 0.95` +- `enableTool: true` +- `toolCallParser: llama3_json` +- `chatTemplate: /vllm-workspace/examples/tool_chat_template_llama3.1_json.jinja` + +These settings were required to make the model stable on a single A10 while +preserving tool-calling behavior. + +## Example install flow + +Deploy the serving stack with the `vLLM Production Stack` Helm chart using the +checked-in values file: + +```bash +# Assumes the vLLM Production Stack chart repo is registered as `vllm`: +# helm repo add vllm https://vllm-project.github.io/production-stack +helm upgrade --install vllm vllm/vllm-stack \ + -n default \ + -f usage-cookbook/Llama-3.1-Nemotron-Nano-8B-v1/vllm_oke_phoenix_private_values.yaml +``` + +Use Bastion plus the private cluster endpoint for cluster access. Then +port-forward the router service locally: + +```bash +kubectl -n default port-forward svc/vllm-router-service 8080:80 +``` + +At that point, the local validation endpoint is: + +```text +http://127.0.0.1:8080/v1 +``` + +## Validation + +Health check: + +```bash +curl -s http://127.0.0.1:8080/health +``` + +Model discovery: + +```bash +curl -s http://127.0.0.1:8080/v1/models +``` + +Chat completion: + +```bash +curl -s http://127.0.0.1:8080/v1/chat/completions \ + -H 'Content-Type: application/json' \ + -d '{ + "model": "nvidia/Llama-3.1-Nemotron-Nano-8B-v1", + "messages": [{"role": "user", "content": "Reply with NEMOTRON_OK"}] + }' +``` + +## Tool-calling smoke test + +```bash +curl -s http://127.0.0.1:8080/v1/chat/completions \ + -H 'Content-Type: application/json' \ + -d '{ + "model": "nvidia/Llama-3.1-Nemotron-Nano-8B-v1", + "messages": [{"role": "user", "content": "What time is it in UTC?"}], + "tools": [{ + "type": "function", + "function": { + "name": "get_utc_time", + "description": "Return the current UTC time", + "parameters": { + "type": "object", +
"properties": {}, + "required": [] + } + } + }] + }' +``` + +Expected behavior: the model returns a tool call with `finish_reason` set to +`tool_calls`. + +## Query via OCI Bastion + +For this private deployment, query the cluster and model through the **OCI +Bastion service** plus local forwarding. + +Export the Terraform outputs: + +```bash +export BASTION_ID="" +export PRIVATE_API_HOST="" +export REGION="us-phoenix-1" +export OCI_CLI_PROFILE="API_KEY_AUTH" +``` + +Create a Bastion port-forwarding session to the private OKE API: + +```bash +oci bastion session create-port-forwarding \ + --bastion-id "$BASTION_ID" \ + --ssh-public-key-file ~/.ssh/id_ed25519.pub \ + --key-type PUB \ + --target-port 6443 \ + --target-private-ip "$PRIVATE_API_HOST" \ + --display-name nemotron-oke-api \ + --session-ttl 10800 \ + --region "$REGION" \ + --profile "$OCI_CLI_PROFILE" +``` + +Inspect the created session and copy the SSH command OCI returns: + +```bash +oci bastion session get \ + --session-id "" \ + --region "$REGION" \ + --profile "$OCI_CLI_PROFILE" +``` + +Run the returned SSH command so that the private Kubernetes API is reachable on +local port `6443`, then query the cluster: + +```bash +kubectl get nodes +kubectl -n default get pods +``` + +Port-forward the Nemotron router service: + +```bash +kubectl -n default port-forward svc/vllm-router-service 8080:80 +``` + +At that point, the private model is queryable locally without exposing a public +inference endpoint: + +```bash +curl -s http://127.0.0.1:8080/v1/models +``` + +```bash +curl -s http://127.0.0.1:8080/v1/chat/completions \ + -H 'Content-Type: application/json' \ + -d '{ + "model": "nvidia/Llama-3.1-Nemotron-Nano-8B-v1", + "messages": [{"role": "user", "content": "Reply with NEMOTRON_OK"}] + }' +``` + +## Operational notes + +- Phoenix provided a workable path for this deployment when Chicago GPU capacity + was not available. 
+- A single A10 is enough for the validated setup, but it requires conservative + context sizing. +- Private access plus local forwarding keeps the control plane and inference + path off the public internet. + +## Troubleshooting + +### The model pod starts but never becomes ready + +Reduce context pressure and ensure the `vLLM` values include: + +- `maxModelLen: 4096` +- `gpuMemoryUtilization: 0.95` + +### Tool calling does not work + +Make sure all of these are set: + +- `enableTool: true` +- `toolCallParser: llama3_json` +- `chatTemplate: /vllm-workspace/examples/tool_chat_template_llama3.1_json.jinja` + +### `kubectl` cannot reach the cluster + +This guide assumes a **private** OKE cluster. Re-establish the Bastion tunnel +before using `kubectl`. + +### The endpoint is reachable but `/v1/models` is empty or wrong + +Confirm the deployment is serving: + +- `nvidia/Llama-3.1-Nemotron-Nano-8B-v1` + +and that the router service is forwarding to the Nemotron backend pods. diff --git a/usage-cookbook/Llama-3.1-Nemotron-Nano-8B-v1/terraform/.gitignore b/usage-cookbook/Llama-3.1-Nemotron-Nano-8B-v1/terraform/.gitignore new file mode 100644 index 00000000..1a22d40b --- /dev/null +++ b/usage-cookbook/Llama-3.1-Nemotron-Nano-8B-v1/terraform/.gitignore @@ -0,0 +1,5 @@ +.terraform/ +terraform.tfvars +terraform.tfstate +terraform.tfstate.* +tfplan diff --git a/usage-cookbook/Llama-3.1-Nemotron-Nano-8B-v1/terraform/.terraform.lock.hcl b/usage-cookbook/Llama-3.1-Nemotron-Nano-8B-v1/terraform/.terraform.lock.hcl new file mode 100644 index 00000000..a539a586 --- /dev/null +++ b/usage-cookbook/Llama-3.1-Nemotron-Nano-8B-v1/terraform/.terraform.lock.hcl @@ -0,0 +1,145 @@ +# This file is maintained automatically by "terraform init". +# Manual edits may be lost in future updates. 
+ +provider "registry.terraform.io/hashicorp/cloudinit" { + version = "2.3.7" + constraints = ">= 2.2.0" + hashes = [ + "h1:M9TpQxKAE/hyOwytdX9MUNZw30HoD/OXqYIug5fkqH8=", + "zh:06f1c54e919425c3139f8aeb8fcf9bceca7e560d48c9f0c1e3bb0a8ad9d9da1e", + "zh:0e1e4cf6fd98b019e764c28586a386dc136129fef50af8c7165a067e7e4a31d5", + "zh:1871f4337c7c57287d4d67396f633d224b8938708b772abfc664d1f80bd67edd", + "zh:2b9269d91b742a71b2248439d5e9824f0447e6d261bfb86a8a88528609b136d1", + "zh:3d8ae039af21426072c66d6a59a467d51f2d9189b8198616888c1b7fc42addc7", + "zh:3ef4e2db5bcf3e2d915921adced43929214e0946a6fb11793085d9a48995ae01", + "zh:42ae54381147437c83cbb8790cc68935d71b6357728a154109d3220b1beb4dc9", + "zh:4496b362605ae4cbc9ef7995d102351e2fe311897586ffc7a4a262ccca0c782a", + "zh:652a2401257a12706d32842f66dac05a735693abcb3e6517d6b5e2573729ba13", + "zh:7406c30806f5979eaed5f50c548eced2ea18ea121e01801d2f0d4d87a04f6a14", + "zh:7848429fd5a5bcf35f6fee8487df0fb64b09ec071330f3ff240c0343fe2a5224", + "zh:78d5eefdd9e494defcb3c68d282b8f96630502cac21d1ea161f53cfe9bb483b3", + ] +} + +provider "registry.terraform.io/hashicorp/helm" { + version = "3.1.1" + constraints = ">= 3.0.1" + hashes = [ + "h1:47CqNwkxctJtL/N/JuEj+8QMg8mRNI/NWeKO5/ydfZU=", + "zh:1a6d5ce931708aec29d1f3d9e360c2a0c35ba5a54d03eeaff0ce3ca597cd0275", + "zh:3411919ba2a5941801e677f0fea08bdd0ae22ba3c9ce3309f55554699e06524a", + "zh:81b36138b8f2320dc7f877b50f9e38f4bc614affe68de885d322629dd0d16a29", + "zh:95a2a0a497a6082ee06f95b38bd0f0d6924a65722892a856cfd914c0d117f104", + "zh:9d3e78c2d1bb46508b972210ad706dd8c8b106f8b206ecf096cd211c54f46990", + "zh:a79139abf687387a6efdbbb04289a0a8e7eaca2bd91cdc0ce68ea4f3286c2c34", + "zh:aaa8784be125fbd50c48d84d6e171d3fb6ef84a221dbc5165c067ce05faab4c8", + "zh:afecd301f469975c9d8f350cc482fe656e082b6ab0f677d1a816c3c615837cc1", + "zh:c54c22b18d48ff9053d899d178d9ffef7d9d19785d9bf310a07d648b7aac075b", + "zh:db2eefd55aea48e73384a555c72bac3f7d428e24147bedb64e1a039398e5b903", + 
"zh:ee61666a233533fd2be971091cecc01650561f1585783c381b6f6e8a390198a4", + "zh:f569b65999264a9416862bca5cd2a6177d94ccb0424f3a4ef424428912b9cb3c", + ] +} + +provider "registry.terraform.io/hashicorp/http" { + version = "3.5.0" + constraints = ">= 3.2.1" + hashes = [ + "h1:dl73+8wzQR++HFGoJgDqY3mj3pm14HUuH/CekVyOj5s=", + "zh:047c5b4920751b13425efe0d011b3a23a3be97d02d9c0e3c60985521c9c456b7", + "zh:157866f700470207561f6d032d344916b82268ecd0cf8174fb11c0674c8d0736", + "zh:1973eb9383b0d83dd4fd5e662f0f16de837d072b64a6b7cd703410d730499476", + "zh:212f833a4e6d020840672f6f88273d62a564f44acb0c857b5961cdb3bbc14c90", + "zh:2c8034bc039fffaa1d4965ca02a8c6d57301e5fa9fff4773e684b46e3f78e76a", + "zh:5df353fc5b2dd31577def9cc1a4ebf0c9a9c2699d223c6b02087a3089c74a1c6", + "zh:672083810d4185076c81b16ad13d1224b9e6ea7f4850951d2ab8d30fa6e41f08", + "zh:78d5eefdd9e494defcb3c68d282b8f96630502cac21d1ea161f53cfe9bb483b3", + "zh:7b4200f18abdbe39904b03537e1a78f21ebafe60f1c861a44387d314fda69da6", + "zh:843feacacd86baed820f81a6c9f7bd32cf302db3d7a0f39e87976ebc7a7cc2ee", + "zh:a9ea5096ab91aab260b22e4251c05f08dad2ed77e43e5e4fadcdfd87f2c78926", + "zh:d02b288922811739059e90184c7f76d45d07d3a77cc48d0b15fd3db14e928623", + ] +} + +provider "registry.terraform.io/hashicorp/null" { + version = "3.2.4" + constraints = ">= 3.2.1" + hashes = [ + "h1:L5V05xwp/Gto1leRryuesxjMfgZwjb7oool4WS1UEFQ=", + "zh:59f6b52ab4ff35739647f9509ee6d93d7c032985d9f8c6237d1f8a59471bbbe2", + "zh:78d5eefdd9e494defcb3c68d282b8f96630502cac21d1ea161f53cfe9bb483b3", + "zh:795c897119ff082133150121d39ff26cb5f89a730a2c8c26f3a9c1abf81a9c43", + "zh:7b9c7b16f118fbc2b05a983817b8ce2f86df125857966ad356353baf4bff5c0a", + "zh:85e33ab43e0e1726e5f97a874b8e24820b6565ff8076523cc2922ba671492991", + "zh:9d32ac3619cfc93eb3c4f423492a8e0f79db05fec58e449dee9b2d5873d5f69f", + "zh:9e15c3c9dd8e0d1e3731841d44c34571b6c97f5b95e8296a45318b94e5287a6e", + "zh:b4c2ab35d1b7696c30b64bf2c0f3a62329107bd1a9121ce70683dec58af19615", + 
"zh:c43723e8cc65bcdf5e0c92581dcbbdcbdcf18b8d2037406a5f2033b1e22de442", + "zh:ceb5495d9c31bfb299d246ab333f08c7fb0d67a4f82681fbf47f2a21c3e11ab5", + "zh:e171026b3659305c558d9804062762d168f50ba02b88b231d20ec99578a6233f", + "zh:ed0fe2acdb61330b01841fa790be00ec6beaac91d41f311fb8254f74eb6a711f", + ] +} + +provider "registry.terraform.io/hashicorp/random" { + version = "3.8.1" + constraints = ">= 3.4.3" + hashes = [ + "h1:u8AKlWVDTH5r9YLSeswoVEjiY72Rt4/ch7U+61ZDkiQ=", + "zh:08dd03b918c7b55713026037c5400c48af5b9f468f483463321bd18e17b907b4", + "zh:0eee654a5542dc1d41920bbf2419032d6f0d5625b03bd81339e5b33394a3e0ae", + "zh:229665ddf060aa0ed315597908483eee5b818a17d09b6417a0f52fd9405c4f57", + "zh:2469d2e48f28076254a2a3fc327f184914566d9e40c5780b8d96ebf7205f8bc0", + "zh:37d7eb334d9561f335e748280f5535a384a88675af9a9eac439d4cfd663bcb66", + "zh:741101426a2f2c52dee37122f0f4a2f2d6af6d852cb1db634480a86398fa3511", + "zh:78d5eefdd9e494defcb3c68d282b8f96630502cac21d1ea161f53cfe9bb483b3", + "zh:a902473f08ef8df62cfe6116bd6c157070a93f66622384300de235a533e9d4a9", + "zh:b85c511a23e57a2147355932b3b6dce2a11e856b941165793a0c3d7578d94d05", + "zh:c5172226d18eaac95b1daac80172287b69d4ce32750c82ad77fa0768be4ea4b8", + "zh:dab4434dba34aad569b0bc243c2d3f3ff86dd7740def373f2a49816bd2ff819b", + "zh:f49fd62aa8c5525a5c17abd51e27ca5e213881d58882fd42fec4a545b53c9699", + ] +} + +provider "registry.terraform.io/hashicorp/time" { + version = "0.13.1" + constraints = ">= 0.9.1" + hashes = [ + "h1:ZT5ppCNIModqk3iOkVt5my8b8yBHmDpl663JtXAIRqM=", + "zh:02cb9aab1002f0f2a94a4f85acec8893297dc75915f7404c165983f720a54b74", + "zh:04429b2b31a492d19e5ecf999b116d396dac0b24bba0d0fb19ecaefe193fdb8f", + "zh:26f8e51bb7c275c404ba6028c1b530312066009194db721a8427a7bc5cdbc83a", + "zh:772ff8dbdbef968651ab3ae76d04afd355c32f8a868d03244db3f8496e462690", + "zh:78d5eefdd9e494defcb3c68d282b8f96630502cac21d1ea161f53cfe9bb483b3", + "zh:898db5d2b6bd6ca5457dccb52eedbc7c5b1a71e4a4658381bcbb38cedbbda328", + 
"zh:8de913bf09a3fa7bedc29fec18c47c571d0c7a3d0644322c46f3aa648cf30cd8", + "zh:9402102c86a87bdfe7e501ffbb9c685c32bbcefcfcf897fd7d53df414c36877b", + "zh:b18b9bb1726bb8cfbefc0a29cf3657c82578001f514bcf4c079839b6776c47f0", + "zh:b9d31fdc4faecb909d7c5ce41d2479dd0536862a963df434be4b16e8e4edc94d", + "zh:c951e9f39cca3446c060bd63933ebb89cedde9523904813973fbc3d11863ba75", + "zh:e5b773c0d07e962291be0e9b413c7a22c044b8c7b58c76e8aa91d1659990dfb5", + ] +} + +provider "registry.terraform.io/oracle/oci" { + version = "8.5.0" + constraints = ">= 4.67.3, >= 7.30.0" + hashes = [ + "h1:YGSTTLRk0vpD4P0dJFt2lZ2XphT2skF9AxBGCkM04z4=", + "zh:0289ba575d3749068fc12fdbfa3f44b9780b21a23315eb2ca5bcf73065cc4fe7", + "zh:1152fd8451c2b74d87594fda1aa69e6a3f772189b902a592e91fcc57dfe3c48f", + "zh:3e4b1a2e345263e48d6be4d6d01fd5976b09af585e4a9314d318ab216304b8f1", + "zh:6b88ebb0ed7de80e324124511251561072c8a5f1ae222aa588063a1652ff72e8", + "zh:8ef61c735f19e1be9abeeb79debbeacd91e5996b4be5719d61323244e19ebe3d", + "zh:8fcdc6701173b59d78f076f8ce4ce01ef127bf5bf65323340e23c0b14da02f9d", + "zh:9b12af85486a96aedd8d7984b0ff811a4b42e3d88dad1a3fb4c0b580d04fa425", + "zh:a03e6f788876b7408d811eb21056986e15c46876983637e7e5e645fff28d0587", + "zh:b1149065247943c0937359e0f2ed5fdce9c2a588e32e90b9c13be64f709f8121", + "zh:b375612ef300e7f53797552521d3ec10f3d9465ccbe6d96519314e32d6611c93", + "zh:daf49947168641d170f59907b2592f020ab17f5443e8f5a96174219112d51fe2", + "zh:e9649887105493b311cbaf180ba635186e1a4c3b5fe7e26ea9bfd06a52aa76f3", + "zh:f593bb15d46c5c998401fea9cc3fdf7950b81a53632ecb1bea8d2cc41971ccca", + "zh:f7f1f4d0c5922bd0403b989ebed168577164dbfc45181b2e19dcb888e1fc9df7", + "zh:fafce2b47e3227dc8068db4f2bf223c4a4b8fefe39f50aeced467eed1bd901e3", + ] +} diff --git a/usage-cookbook/Llama-3.1-Nemotron-Nano-8B-v1/terraform/README.md b/usage-cookbook/Llama-3.1-Nemotron-Nano-8B-v1/terraform/README.md new file mode 100644 index 00000000..9d0fe219 --- /dev/null +++ b/usage-cookbook/Llama-3.1-Nemotron-Nano-8B-v1/terraform/README.md @@ 
-0,0 +1,97 @@ +# Terraform: Private OCI OKE for Llama-3.1-Nemotron-Nano-8B-v1 + +This Terraform example provisions the **private-only** OCI infrastructure for +the validated Phoenix deployment described in the parent cookbook. + +It is intended to give Nemotron users a reproducible OCI path for NVIDIA model +serving that highlights Oracle Cloud's operational strengths: private OKE, +managed Bastion access, and a clean infrastructure-as-code path for GPU-backed +Nemotron deployments. + +It creates: + +- a VCN +- a **private** OKE cluster +- a private CPU node pool +- a private GPU node pool targeting `VM.GPU.A10.1` +- an **OCI Bastion service** resource for private access + +It does **not** create: + +- a public Kubernetes API endpoint +- public worker-node IPs +- a public bastion host +- a public inference endpoint + +## Bastion note + +This sample provisions the **OCI Bastion service** so that private-cluster +access is reproducible from Terraform. + +That is intentionally different from creating a public bastion VM: + +- no public bastion compute instance is created +- no worker node receives a public IP +- the Kubernetes API remains private + +If your environment already manages private-cluster access through a separate +operator workflow, you can remove the `oci_bastion_bastion` resource and keep +the rest of the sample unchanged. 
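Once `terraform apply` completes, the Bastion session from the parent cookbook can be rendered directly from the Terraform outputs. A convenience sketch; the flags mirror the `oci bastion session create-port-forwarding` invocation documented there, while the key path and TTL are assumptions to adjust:

```python
# Hypothetical helper: assemble the Bastion port-forwarding command from
# Terraform outputs (bastion OCID and private API host). The SSH key path
# and session TTL are assumptions; adjust them for your environment.
def bastion_forward_cmd(bastion_id: str, api_host: str,
                        region: str = "us-phoenix-1",
                        profile: str = "API_KEY_AUTH") -> str:
    parts = [
        "oci bastion session create-port-forwarding",
        f"--bastion-id {bastion_id}",
        "--ssh-public-key-file ~/.ssh/id_ed25519.pub",
        "--key-type PUB",
        "--target-port 6443",
        f"--target-private-ip {api_host}",
        "--display-name nemotron-oke-api",
        "--session-ttl 10800",
        f"--region {region}",
        f"--profile {profile}",
    ]
    # Join with backslash-newline continuations for readable shell output.
    return " \\\n  ".join(parts)

cmd = bastion_forward_cmd("ocid1.bastion.oc1.phx.example", "10.0.0.10")
```

Feed `terraform output`'s `oci_bastion_id` and `apiserver_private_host` values into the two arguments.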
+ +## Module choice + +This wrapper intentionally uses Oracle's official OKE Terraform module: + +- `oracle-terraform-modules/oke/oci` + +The Nemotron-specific layer in this directory adds: + +- the Phoenix defaults +- the no-public-IP constraints +- the A10-focused worker pool defaults +- the OCI Bastion service resource required for private access + +## Files + +- [`main.tf`](./main.tf) - private OKE cluster, worker pools, OCI Bastion +- [`variables.tf`](./variables.tf) - deployment inputs +- [`outputs.tf`](./outputs.tf) - useful IDs and private endpoint information +- [`terraform.tfvars.example`](./terraform.tfvars.example) - starting point + +## Usage + +```bash +cp terraform.tfvars.example terraform.tfvars +terraform init +terraform plan +terraform apply +``` + +The validated live run completed successfully in `us-phoenix-1`, including: + +- private OKE cluster creation +- OCI Bastion service creation +- CPU node pool creation +- GPU node pool creation on `VM.GPU.A10.1` in `PHX-AD-2` + +After the infrastructure is ready: + +1. create an OCI Bastion session to reach the private cluster +2. deploy the model with: + - [`../vllm_oke_phoenix_private_values.yaml`](../vllm_oke_phoenix_private_values.yaml) +3. validate: + - `/health` + - `/v1/models` + - chat completion + - tool calling + - streaming + +## Notes + +- The validated live deployment used `us-phoenix-1`. +- The validated GPU pool used Phoenix `AD-2`, exposed as `gpu_placement_ads`. +- The Bastion resource here is the OCI managed Bastion service, not a public + bastion VM. +- `ssh_public_key_path` must point to an actual OpenSSH public key file; the + wrapper reads the file contents with Terraform's `file()` function before + passing it to OKE. 
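The validation list above includes tool calling; once the model is reachable through the port-forward, responses can be checked client-side. A minimal sketch assuming the standard OpenAI-compatible response shape (the sample payload below is illustrative, not captured output):

```python
# Extract tool calls from an OpenAI-compatible chat-completions response.
# Returns an empty list unless the model actually stopped to call a tool.
def extract_tool_calls(response: dict) -> list:
    choice = response["choices"][0]
    if choice.get("finish_reason") != "tool_calls":
        return []
    return choice["message"].get("tool_calls", [])

# Illustrative response shape only; a real deployment returns this from
# /v1/chat/completions when the tool-calling smoke test succeeds.
sample = {
    "choices": [{
        "finish_reason": "tool_calls",
        "message": {"tool_calls": [{
            "type": "function",
            "function": {"name": "get_utc_time", "arguments": "{}"},
        }]},
    }]
}
calls = extract_tool_calls(sample)
```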
diff --git a/usage-cookbook/Llama-3.1-Nemotron-Nano-8B-v1/terraform/main.tf b/usage-cookbook/Llama-3.1-Nemotron-Nano-8B-v1/terraform/main.tf new file mode 100644 index 00000000..e9b07078 --- /dev/null +++ b/usage-cookbook/Llama-3.1-Nemotron-Nano-8B-v1/terraform/main.tf @@ -0,0 +1,112 @@ +provider "oci" { + config_file_profile = var.config_file_profile + tenancy_ocid = var.tenancy_ocid + region = var.region +} + +locals { + common_tags = merge(var.freeform_tags, { + model = "nvidia/Llama-3.1-Nemotron-Nano-8B-v1" + deployment = "private-oke" + region = var.region + }) +} + +module "oke" { + source = "oracle-terraform-modules/oke/oci" + version = "5.4.1" + + providers = { + oci.home = oci + } + + tenancy_id = var.tenancy_ocid + compartment_id = var.compartment_ocid + region = var.region + + cluster_name = var.cluster_name + kubernetes_version = var.kubernetes_version + cluster_type = "enhanced" + cni_type = "flannel" + pods_cidr = var.pods_cidr + services_cidr = var.services_cidr + vcn_cidrs = var.vcn_cidrs + ssh_public_key = file(var.ssh_public_key_path) + output_detail = true + create_vcn = true + create_bastion = false + create_operator = false + control_plane_is_public = false + assign_public_ip_to_control_plane = false + worker_is_public = false + allow_worker_internet_access = true + allow_pod_internet_access = true + allow_worker_ssh_access = false + preferred_load_balancer = "internal" + load_balancers = "internal" + freeform_tags = { all = local.common_tags } + + subnets = { + cp = { + create = "always" + newbits = 13 + netnum = 2 + } + workers = { + create = "always" + newbits = 2 + netnum = 1 + } + pods = { + create = "always" + newbits = 2 + netnum = 2 + } + int_lb = { + create = "always" + newbits = 11 + netnum = 16 + } + pub_lb = { + create = "never" + } + bastion = { + create = "never" + } + operator = { + create = "never" + } + } + + worker_pool_mode = "node-pool" + worker_pool_size = 1 + worker_pools = { + cpu = { + size = var.cpu_pool_size + shape = 
var.cpu_shape + ocpus = var.cpu_ocpus + memory = var.cpu_memory_gbs + boot_volume_size = 100 + assign_public_ip = false + create = true + } + gpu = { + size = var.gpu_pool_size + shape = var.gpu_shape + boot_volume_size = var.gpu_boot_volume_size + assign_public_ip = false + create = true + placement_ads = var.gpu_placement_ads + } + } +} + +resource "oci_bastion_bastion" "oci_bastion" { + compartment_id = var.compartment_ocid + bastion_type = "STANDARD" + target_subnet_id = module.oke.worker_subnet_id + client_cidr_block_allow_list = var.bastion_client_cidrs + max_session_ttl_in_seconds = 10800 + name = "${var.cluster_name}-bastion" + freeform_tags = local.common_tags +} diff --git a/usage-cookbook/Llama-3.1-Nemotron-Nano-8B-v1/terraform/outputs.tf b/usage-cookbook/Llama-3.1-Nemotron-Nano-8B-v1/terraform/outputs.tf new file mode 100644 index 00000000..c39a82ee --- /dev/null +++ b/usage-cookbook/Llama-3.1-Nemotron-Nano-8B-v1/terraform/outputs.tf @@ -0,0 +1,34 @@ +output "cluster_id" { + description = "OKE cluster OCID." + value = module.oke.cluster_id +} + +output "cluster_endpoints" { + description = "Cluster endpoints; private endpoint should be used." + value = module.oke.cluster_endpoints +} + +output "apiserver_private_host" { + description = "Private control-plane host." + value = module.oke.apiserver_private_host +} + +output "vcn_id" { + description = "VCN used by the Nemotron deployment." + value = module.oke.vcn_id +} + +output "control_plane_subnet_id" { + description = "Private control-plane subnet." + value = module.oke.control_plane_subnet_id +} + +output "worker_subnet_id" { + description = "Private worker subnet." + value = module.oke.worker_subnet_id +} + +output "oci_bastion_id" { + description = "OCI Bastion service OCID for creating private sessions." 
+ value = oci_bastion_bastion.oci_bastion.id +} diff --git a/usage-cookbook/Llama-3.1-Nemotron-Nano-8B-v1/terraform/terraform.tfvars.example b/usage-cookbook/Llama-3.1-Nemotron-Nano-8B-v1/terraform/terraform.tfvars.example new file mode 100644 index 00000000..9a2bab0c --- /dev/null +++ b/usage-cookbook/Llama-3.1-Nemotron-Nano-8B-v1/terraform/terraform.tfvars.example @@ -0,0 +1,12 @@ +tenancy_ocid = "ocid1.tenancy.oc1..exampleuniqueID" +compartment_ocid = "ocid1.compartment.oc1..exampleuniqueID" +config_file_profile = "API_KEY_AUTH" +region = "us-phoenix-1" +cluster_name = "nemotron-phx-private" +ssh_public_key_path = "~/.ssh/id_ed25519.pub" + +# Restrict Bastion session creation to your current client egress CIDR. +bastion_client_cidrs = ["203.0.113.10/32"] + +# The validated deployment used Phoenix AD-2 for the A10 node pool. +gpu_placement_ads = [2] diff --git a/usage-cookbook/Llama-3.1-Nemotron-Nano-8B-v1/terraform/variables.tf b/usage-cookbook/Llama-3.1-Nemotron-Nano-8B-v1/terraform/variables.tf new file mode 100644 index 00000000..165cabf5 --- /dev/null +++ b/usage-cookbook/Llama-3.1-Nemotron-Nano-8B-v1/terraform/variables.tf @@ -0,0 +1,115 @@ +variable "tenancy_ocid" { + description = "OCI tenancy OCID." + type = string +} + +variable "compartment_ocid" { + description = "Compartment where the OKE cluster and Bastion service will be created." + type = string +} + +variable "region" { + description = "OCI region for the deployment." + type = string + default = "us-phoenix-1" +} + +variable "config_file_profile" { + description = "OCI CLI config profile name." + type = string + default = "DEFAULT" +} + +variable "cluster_name" { + description = "Name prefix for the private Nemotron OKE deployment." + type = string + default = "nemotron-oci-phx" +} + +variable "ssh_public_key_path" { + description = "Path to the OpenSSH public key file used for private worker access." 
+ type = string +} + +variable "vcn_cidrs" { + description = "VCN CIDR blocks for the deployment." + type = list(string) + default = ["10.0.0.0/16"] +} + +variable "pods_cidr" { + description = "Kubernetes pods CIDR." + type = string + default = "10.244.0.0/16" +} + +variable "services_cidr" { + description = "Kubernetes services CIDR." + type = string + default = "10.96.0.0/16" +} + +variable "kubernetes_version" { + description = "OKE Kubernetes version." + type = string + default = "v1.33.1" +} + +variable "cpu_pool_size" { + description = "Number of CPU worker nodes." + type = number + default = 1 +} + +variable "cpu_shape" { + description = "Shape for the CPU worker pool." + type = string + default = "VM.Standard.E5.Flex" +} + +variable "cpu_ocpus" { + description = "OCPUs for each CPU worker if using a flex shape." + type = number + default = 2 +} + +variable "cpu_memory_gbs" { + description = "Memory in GB for each CPU worker if using a flex shape." + type = number + default = 16 +} + +variable "gpu_pool_size" { + description = "Number of GPU worker nodes." + type = number + default = 1 +} + +variable "gpu_shape" { + description = "Shape for the GPU worker pool." + type = string + default = "VM.GPU.A10.1" +} + +variable "gpu_boot_volume_size" { + description = "Boot volume size for GPU workers." + type = number + default = 200 +} + +variable "gpu_placement_ads" { + description = "Availability domains to target for the GPU node pool. Phoenix AD-2 is `[2]`." + type = list(number) + default = [2] +} + +variable "bastion_client_cidrs" { + description = "CIDR blocks allowed to create OCI Bastion sessions." + type = list(string) +} + +variable "freeform_tags" { + description = "Optional freeform tags." 
+ type = map(string) + default = {} +} diff --git a/usage-cookbook/Llama-3.1-Nemotron-Nano-8B-v1/terraform/versions.tf b/usage-cookbook/Llama-3.1-Nemotron-Nano-8B-v1/terraform/versions.tf new file mode 100644 index 00000000..1c9c0264 --- /dev/null +++ b/usage-cookbook/Llama-3.1-Nemotron-Nano-8B-v1/terraform/versions.tf @@ -0,0 +1,10 @@ +terraform { + required_version = ">= 1.5.0" + + required_providers { + oci = { + source = "oracle/oci" + version = ">= 7.30.0" + } + } +} diff --git a/usage-cookbook/Llama-3.1-Nemotron-Nano-8B-v1/vllm_oke_phoenix_private_values.yaml b/usage-cookbook/Llama-3.1-Nemotron-Nano-8B-v1/vllm_oke_phoenix_private_values.yaml new file mode 100644 index 00000000..076bcb83 --- /dev/null +++ b/usage-cookbook/Llama-3.1-Nemotron-Nano-8B-v1/vllm_oke_phoenix_private_values.yaml @@ -0,0 +1,30 @@ +# Validated private OCI OKE deployment values for +# nvidia/Llama-3.1-Nemotron-Nano-8B-v1 on a single VM.GPU.A10.1 node. + +servingEngineSpec: + runtimeClassName: "" + modelSpec: + - name: "llama31-nemotron-nano-8b" + repository: "vllm/vllm-openai" + tag: "latest" + modelURL: "nvidia/Llama-3.1-Nemotron-Nano-8B-v1" + enableTool: true + toolCallParser: "llama3_json" + chatTemplate: "/vllm-workspace/examples/tool_chat_template_llama3.1_json.jinja" + replicaCount: 1 + requestCPU: 4 + requestMemory: "24Gi" + requestGPU: 1 + pvcStorage: "120Gi" + pvcAccessMode: + - ReadWriteOnce + storageClass: "oci-block-storage-enc" + nodeSelector: + app: gpu + tolerations: + - key: "nvidia.com/gpu" + operator: "Exists" + effect: "NoSchedule" + vllmConfig: + maxModelLen: 4096 + gpuMemoryUtilization: 0.95 diff --git a/usage-cookbook/README.md b/usage-cookbook/README.md index f7d79b5c..001121f6 100644 --- a/usage-cookbook/README.md +++ b/usage-cookbook/README.md @@ -13,5 +13,4 @@ This directory contains cookbook-style guides showing how to deploy and use the - **SGLang Deployment** - Tutorials on serving and interacting with Nemotron via SGLang - **NIM Microservice** - Guide to 
deploying Nemotron as scalable, production-ready endpoints using NVIDIA Inference Microservices (NIM). - **Hugging Face Transformers** - Direct loading and inference of Nemotron models with Hugging Face Transformers - - +- **OCI OKE Private Deployment** - A private-only deployment guide, validated in `us-phoenix-1`, for `nvidia/Llama-3.1-Nemotron-Nano-8B-v1` using OKE, the OCI Bastion service, and `vLLM`; it provides a reproducible Oracle Cloud counterpart to common AWS GPU/Kubernetes deployment patterns.