22 changes: 19 additions & 3 deletions training/ironwood/deepseek3-671b/4k-bf16-tpu7x-4x4x8/xpk/README.md
@@ -238,15 +238,25 @@ does this for you already):
gcloud container clusters get-credentials ${CLUSTER_NAME} --project ${PROJECT_ID} --zone ${ZONE}
```
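
As a quick sanity check (standard `kubectl`, nothing specific to this recipe), you can confirm that `kubectl` now points at the intended cluster:

```bash
# Should print the context for the cluster you just fetched credentials for.
kubectl config current-context

# Lists the cluster's nodes if your access is working.
kubectl get nodes
```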

## Get the recipe
```bash
cd ~
git clone https://github.com/ai-hypercomputer/tpu-recipes.git
cd tpu-recipes/training/ironwood/deepseek3-671b/4k-bf16-tpu7x-4x4x8/xpk
```

### Run deepseek3-671b Pretraining Workload

The `run_recipe.sh` script contains all the necessary environment variables and
configurations to launch the deepseek3-671b pretraining workload.

To run the benchmark, first make the script executable and then run it:
Before running it, edit the script (for example, with `nano ./run_recipe.sh`) and set the environment variables to match your environment.

To configure and run the benchmark:

```bash
chmod +x run_recipe.sh
nano ./run_recipe.sh
./run_recipe.sh
```
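
As a rough sketch of what typically needs adapting: the names below mirror variables referenced elsewhere in this README, but every value shown is a placeholder, not a default from the recipe:

```bash
# Placeholder values -- substitute your own project, cluster, and image.
export PROJECT_ID="my-gcp-project"        # GCP project that owns the cluster
export ZONE="us-central1-a"               # zone the cluster runs in
export CLUSTER_NAME="my-tpu7x-cluster"    # GKE/XPK cluster name
export WORKLOAD_NAME="deepseek3"          # name used for the JobSet and pods
export WORKLOAD_IMAGE="gcr.io/my-gcp-project/my-training-image:latest"
```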

@@ -282,13 +292,19 @@ Please note that `fsdp_shard_on_exp=true` only works if num of experts is divisible
## Monitor the job

To monitor your job's progress, you can use kubectl to check the Jobset status
and logs:
and stream logs:

```bash
kubectl get jobset -n default ${WORKLOAD_NAME}
kubectl logs -f -n default jobset/${WORKLOAD_NAME}-0-worker-0

# List pods to find the specific name (e.g., deepseek3-0-0-xxxx)
kubectl get pods | grep ${WORKLOAD_NAME}
```
Then, stream the logs from the running pod (replace `<POD_NAME>` with the name you found):

```bash
kubectl logs -f <POD_NAME>
```
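
If you prefer not to copy the pod name by hand, a small sketch like the following (assuming pod names start with `${WORKLOAD_NAME}`, as in the example above) picks the first matching pod and streams its logs:

```bash
# Grab the first pod whose name matches the workload, then follow its logs.
POD_NAME=$(kubectl get pods -n default -o name | grep "${WORKLOAD_NAME}" | head -n 1)
kubectl logs -f -n default "${POD_NAME}"
```
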
You can also monitor your cluster and TPU usage through the Google Cloud
Console.

22 changes: 19 additions & 3 deletions training/ironwood/deepseek3-671b/4k-bf16-tpu7x-4x8x8/xpk/README.md
@@ -238,15 +238,25 @@ does this for you already):
gcloud container clusters get-credentials ${CLUSTER_NAME} --project ${PROJECT_ID} --zone ${ZONE}
```

## Get the recipe
```bash
cd ~
git clone https://github.com/ai-hypercomputer/tpu-recipes.git
cd tpu-recipes/training/ironwood/deepseek3-671b/4k-bf16-tpu7x-4x8x8/xpk
```

### Run deepseek3-671b Pretraining Workload

The `run_recipe.sh` script contains all the necessary environment variables and
configurations to launch the deepseek3-671b pretraining workload.

To run the benchmark, first make the script executable and then run it:
Before running it, edit the script (for example, with `nano ./run_recipe.sh`) and set the environment variables to match your environment.

To configure and run the benchmark:

```bash
chmod +x run_recipe.sh
nano ./run_recipe.sh
./run_recipe.sh
```

@@ -275,13 +285,19 @@ are expected to use the defaults within the specified `WORKLOAD_IMAGE`.
## Monitor the job

To monitor your job's progress, you can use kubectl to check the Jobset status
and logs:
and stream logs:

```bash
kubectl get jobset -n default ${WORKLOAD_NAME}
kubectl logs -f -n default jobset/${WORKLOAD_NAME}-0-worker-0

# List pods to find the specific name (e.g., deepseek3-0-0-xxxx)
kubectl get pods | grep ${WORKLOAD_NAME}
```
Then, stream the logs from the running pod (replace `<POD_NAME>` with the name you found):

```bash
kubectl logs -f <POD_NAME>
```
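
If the JobSet is not making progress, `kubectl describe` (standard kubectl; it works on any resource, including the `jobset` custom resource) surfaces its conditions and recent events, which usually point at scheduling or quota issues:

```bash
# Inspect the JobSet's conditions and recent events.
kubectl describe jobset -n default ${WORKLOAD_NAME}
```
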
You can also monitor your cluster and TPU usage through the Google Cloud
Console.

22 changes: 19 additions & 3 deletions training/ironwood/deepseek3-671b/4k-fp8-tpu7x-4x4x8/xpk/README.md
@@ -239,15 +239,25 @@ does this for you already):
gcloud container clusters get-credentials ${CLUSTER_NAME} --project ${PROJECT_ID} --zone ${ZONE}
```

## Get the recipe
```bash
cd ~
git clone https://github.com/ai-hypercomputer/tpu-recipes.git
cd tpu-recipes/training/ironwood/deepseek3-671b/4k-fp8-tpu7x-4x4x8/xpk
```

### Run deepseek3-671b Pretraining Workload

The `run_recipe.sh` script contains all the necessary environment variables and
configurations to launch the deepseek3-671b pretraining workload.

To run the benchmark, first make the script executable and then run it:
Before running it, edit the script (for example, with `nano ./run_recipe.sh`) and set the environment variables to match your environment.

To configure and run the benchmark:

```bash
chmod +x run_recipe.sh
nano ./run_recipe.sh
./run_recipe.sh
```

@@ -298,13 +308,19 @@ To realize these gains, the recipe employs a w8a8g8 (8-bit weights, activations
## Monitor the job

To monitor your job's progress, you can use kubectl to check the Jobset status
and logs:
and stream logs:

```bash
kubectl get jobset -n default ${WORKLOAD_NAME}
kubectl logs -f -n default jobset/${WORKLOAD_NAME}-0-worker-0

# List pods to find the specific name (e.g., deepseek3-0-0-xxxx)
kubectl get pods | grep ${WORKLOAD_NAME}
```
Then, stream the logs from the running pod (replace `<POD_NAME>` with the name you found):

```bash
kubectl logs -f <POD_NAME>
```
You can also monitor your cluster and TPU usage through the Google Cloud
Console.

22 changes: 19 additions & 3 deletions training/ironwood/deepseek3-671b/4k-fp8-tpu7x-4x8x8/xpk/README.md
@@ -239,15 +239,25 @@ does this for you already):
gcloud container clusters get-credentials ${CLUSTER_NAME} --project ${PROJECT_ID} --zone ${ZONE}
```

## Get the recipe
```bash
cd ~
git clone https://github.com/ai-hypercomputer/tpu-recipes.git
cd tpu-recipes/training/ironwood/deepseek3-671b/4k-fp8-tpu7x-4x8x8/xpk
```

### Run deepseek3-671b Pretraining Workload

The `run_recipe.sh` script contains all the necessary environment variables and
configurations to launch the deepseek3-671b pretraining workload.

To run the benchmark, first make the script executable and then run it:
Before running it, edit the script (for example, with `nano ./run_recipe.sh`) and set the environment variables to match your environment.

To configure and run the benchmark:

```bash
chmod +x run_recipe.sh
nano ./run_recipe.sh
./run_recipe.sh
```

@@ -298,13 +308,19 @@ To realize these gains, the recipe employs a w8a8g8 (8-bit weights, activations
## Monitor the job

To monitor your job's progress, you can use kubectl to check the Jobset status
and logs:
and stream logs:

```bash
kubectl get jobset -n default ${WORKLOAD_NAME}
kubectl logs -f -n default jobset/${WORKLOAD_NAME}-0-worker-0

# List pods to find the specific name (e.g., deepseek3-0-0-xxxx)
kubectl get pods | grep ${WORKLOAD_NAME}
```
Then, stream the logs from the running pod (replace `<POD_NAME>` with the name you found):

```bash
kubectl logs -f <POD_NAME>
```
You can also monitor your cluster and TPU usage through the Google Cloud
Console.

22 changes: 19 additions & 3 deletions training/ironwood/gpt-oss-120b/8k-bf16-tpu7x-4x4x4/xpk/README.md
@@ -238,15 +238,25 @@ does this for you already):
gcloud container clusters get-credentials ${CLUSTER_NAME} --project ${PROJECT_ID} --zone ${ZONE}
```

## Get the recipe
```bash
cd ~
git clone https://github.com/ai-hypercomputer/tpu-recipes.git
cd tpu-recipes/training/ironwood/gpt-oss-120b/8k-bf16-tpu7x-4x4x4/xpk
```

### Run gpt-oss-120b Pretraining Workload

The `run_recipe.sh` script contains all the necessary environment variables and
configurations to launch the gpt-oss-120b pretraining workload.

To run the benchmark, first make the script executable and then run it:
Before running it, edit the script (for example, with `nano ./run_recipe.sh`) and set the environment variables to match your environment.

To configure and run the benchmark:

```bash
chmod +x run_recipe.sh
nano ./run_recipe.sh
./run_recipe.sh
```

@@ -275,13 +285,19 @@ are expected to use the defaults within the specified `WORKLOAD_IMAGE`.
## Monitor the job

To monitor your job's progress, you can use kubectl to check the Jobset status
and logs:
and stream logs:

```bash
kubectl get jobset -n default ${WORKLOAD_NAME}
kubectl logs -f -n default jobset/${WORKLOAD_NAME}-0-worker-0

# List pods to find the specific name (e.g., ${WORKLOAD_NAME}-0-0-xxxx)
kubectl get pods | grep ${WORKLOAD_NAME}
```
Then, stream the logs from the running pod (replace `<POD_NAME>` with the name you found):

```bash
kubectl logs -f <POD_NAME>
```
You can also monitor your cluster and TPU usage through the Google Cloud
Console.

22 changes: 19 additions & 3 deletions training/ironwood/gpt-oss-120b/8k-bf16-tpu7x-4x8x8/xpk/README.md
@@ -238,15 +238,25 @@ does this for you already):
gcloud container clusters get-credentials ${CLUSTER_NAME} --project ${PROJECT_ID} --zone ${ZONE}
```

## Get the recipe
```bash
cd ~
git clone https://github.com/ai-hypercomputer/tpu-recipes.git
cd tpu-recipes/training/ironwood/gpt-oss-120b/8k-bf16-tpu7x-4x8x8/xpk
```

### Run gpt-oss-120b Pretraining Workload

The `run_recipe.sh` script contains all the necessary environment variables and
configurations to launch the gpt-oss-120b pretraining workload.

To run the benchmark, first make the script executable and then run it:
Before running it, edit the script (for example, with `nano ./run_recipe.sh`) and set the environment variables to match your environment.

To configure and run the benchmark:

```bash
chmod +x run_recipe.sh
nano ./run_recipe.sh
./run_recipe.sh
```

@@ -275,13 +285,19 @@ are expected to use the defaults within the specified `WORKLOAD_IMAGE`.
## Monitor the job

To monitor your job's progress, you can use kubectl to check the Jobset status
and logs:
and stream logs:

```bash
kubectl get jobset -n default ${WORKLOAD_NAME}
kubectl logs -f -n default jobset/${WORKLOAD_NAME}-0-worker-0

# List pods to find the specific name (e.g., ${WORKLOAD_NAME}-0-0-xxxx)
kubectl get pods | grep ${WORKLOAD_NAME}
```
Then, stream the logs from the running pod (replace `<POD_NAME>` with the name you found):

```bash
kubectl logs -f <POD_NAME>
```
You can also monitor your cluster and TPU usage through the Google Cloud
Console.

22 changes: 19 additions & 3 deletions training/ironwood/llama3.1-405b/8k-bf16-tpu7x-4x8x8/README.md
@@ -238,15 +238,25 @@ does this for you already):
gcloud container clusters get-credentials ${CLUSTER_NAME} --project ${PROJECT_ID} --zone ${ZONE}
```

## Get the recipe
```bash
cd ~
git clone https://github.com/ai-hypercomputer/tpu-recipes.git
cd tpu-recipes/training/ironwood/llama3.1-405b/8k-bf16-tpu7x-4x8x8
```

### Run llama3.1-405b Pretraining Workload

The `run_recipe.sh` script contains all the necessary environment variables and
configurations to launch the llama3.1-405b pretraining workload.

To run the benchmark, first make the script executable and then run it:
Before running it, edit the script (for example, with `nano ./run_recipe.sh`) and set the environment variables to match your environment.

To configure and run the benchmark:

```bash
chmod +x run_recipe.sh
nano ./run_recipe.sh
./run_recipe.sh
```

@@ -275,13 +285,19 @@ are expected to use the defaults within the specified `WORKLOAD_IMAGE`.
## Monitor the job

To monitor your job's progress, you can use kubectl to check the Jobset status
and logs:
and stream logs:

```bash
kubectl get jobset -n default ${WORKLOAD_NAME}
kubectl logs -f -n default jobset/${WORKLOAD_NAME}-0-worker-0

# List pods to find the specific name (e.g., ${WORKLOAD_NAME}-0-0-xxxx)
kubectl get pods | grep ${WORKLOAD_NAME}
```
Then, stream the logs from the running pod (replace `<POD_NAME>` with the name you found):

```bash
kubectl logs -f <POD_NAME>
```
You can also monitor your cluster and TPU usage through the Google Cloud
Console.

22 changes: 19 additions & 3 deletions training/ironwood/llama3.1-405b/8k-fp8-tpu7x-4x8x8/README.md
@@ -239,15 +239,25 @@ does this for you already):
gcloud container clusters get-credentials ${CLUSTER_NAME} --project ${PROJECT_ID} --zone ${ZONE}
```

## Get the recipe
```bash
cd ~
git clone https://github.com/ai-hypercomputer/tpu-recipes.git
cd tpu-recipes/training/ironwood/llama3.1-405b/8k-fp8-tpu7x-4x8x8
```

### Run llama3.1-405b Pretraining Workload

The `run_recipe.sh` script contains all the necessary environment variables and
configurations to launch the llama3.1-405b pretraining workload.

To run the benchmark, first make the script executable and then run it:
Before running it, edit the script (for example, with `nano ./run_recipe.sh`) and set the environment variables to match your environment.

To configure and run the benchmark:

```bash
chmod +x run_recipe.sh
nano ./run_recipe.sh
./run_recipe.sh
```

@@ -276,13 +286,19 @@ are expected to use the defaults within the specified `WORKLOAD_IMAGE`.
## Monitor the job

To monitor your job's progress, you can use kubectl to check the Jobset status
and logs:
and stream logs:

```bash
kubectl get jobset -n default ${WORKLOAD_NAME}
kubectl logs -f -n default jobset/${WORKLOAD_NAME}-0-worker-0

# List pods to find the specific name (e.g., ${WORKLOAD_NAME}-0-0-xxxx)
kubectl get pods | grep ${WORKLOAD_NAME}
```
Then, stream the logs from the running pod (replace `<POD_NAME>` with the name you found):

```bash
kubectl logs -f <POD_NAME>
```
You can also monitor your cluster and TPU usage through the Google Cloud
Console.
