
Commit 748140e

Merge pull request #66 from AI-Hypercomputer/bvandermoon-tpu-recipes

Bump MaxText recipes to new tpu-recipes version

2 parents 2a33362 + 4541811
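
The updated recipes below pin MaxText to the tpu-recipes-v0.1.2 release tag. As a minimal sketch (assuming you already have a local clone of the maxtext repository), step 1 of each recipe now amounts to:

```
# Sketch only: fetch and check out the MaxText tag referenced by the bumped recipes.
# Assumes an existing local clone of https://github.com/AI-Hypercomputer/maxtext.
cd maxtext                       # path to your local clone (assumption)
git fetch --tags origin          # make sure the tpu-recipes-* release tags are available
git checkout tpu-recipes-v0.1.2  # tag referenced by the updated READMEs
```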

7 files changed, +28 -28 lines changed


training/trillium/GPT3-175B-MaxText/bf16/README.md

Lines changed: 4 additions & 4 deletions

@@ -8,9 +8,9 @@ Please follow this [link](https://github.com/AI-Hypercomputer/tpu-recipes/blob/m
 ### Install MaxText and Build Docker Image
 Please follow this [link](https://github.com/AI-Hypercomputer/tpu-recipes/blob/main/training/trillium/MAXTEXT_README.md) to install maxtext and build the docker image. The following variables should be set:

-In step 1, use the MaxText [tpu-recipes-v0.1.1](https://github.com/AI-Hypercomputer/maxtext/releases/tag/tpu-recipes-v0.1.1) tag to run this recipe:
+In step 1, use the MaxText [tpu-recipes-v0.1.2](https://github.com/AI-Hypercomputer/maxtext/releases/tag/tpu-recipes-v0.1.2) tag to run this recipe:
 ```
-git checkout tpu-recipes-v0.1.1
+git checkout tpu-recipes-v0.1.2
 ```

 In step 2, use the jax-stable-stack image containing JAX 0.5.2:

@@ -43,7 +43,7 @@ completed step: 15, seconds: 17.182, TFLOP/s/device: 384.891, Tokens/s/device: 3

 ### Workload Details

-For reference, here are the `gpt_3_175b_bf16` workload details as found in `MaxText@tpu-recipes-v0.1.0`:
+For reference, here are the `gpt_3_175b_bf16` workload details as found in `MaxText@tpu-recipes-v0.1.2`:

 ```
 MaxTextModel(

@@ -72,4 +72,4 @@ MaxTextModel(
 )
 ```

-This equivalent workload code can be found in the [maxtext_trillium_model_configs.py](https://github.com/AI-Hypercomputer/maxtext/blob/tpu-recipes-v0.1.0/benchmarks/maxtext_trillium_model_configs.py#L287) file within the MaxText repository.
+This equivalent workload code can be found in the [maxtext_trillium_model_configs.py](https://github.com/AI-Hypercomputer/maxtext/blob/tpu-recipes-v0.1.2/benchmarks/maxtext_trillium_model_configs.py) file within the MaxText repository.

training/trillium/Llama2-70B-MaxText/README.md

Lines changed: 4 additions & 4 deletions

@@ -8,9 +8,9 @@ Please follow this [link](https://github.com/AI-Hypercomputer/tpu-recipes/blob/m
 ### Install MaxText and Build Docker Image
 Please follow this [link](https://github.com/AI-Hypercomputer/tpu-recipes/blob/main/training/trillium/MAXTEXT_README.md) to install maxtext and build the docker image. The following variables should be set:

-In step 1, use the MaxText [tpu-recipes-v0.1.1](https://github.com/AI-Hypercomputer/maxtext/releases/tag/tpu-recipes-v0.1.1) tag to run this recipe:
+In step 1, use the MaxText [tpu-recipes-v0.1.2](https://github.com/AI-Hypercomputer/maxtext/releases/tag/tpu-recipes-v0.1.2) tag to run this recipe:
 ```
-git checkout tpu-recipes-v0.1.1
+git checkout tpu-recipes-v0.1.2
 ```

 In step 2, use the jax-stable-stack image containing JAX 0.5.2:

@@ -43,7 +43,7 @@ completed step: 16, seconds: 9.052, TFLOP/s/device: 402.274, Tokens/s/device: 90

 ### Workload Details

-For reference, here are the `llama2_70b_4096_sc` workload details as found in `MaxText@tpu-recipes-v0.1.0`:
+For reference, here are the `llama2_70b_4096_sc` workload details as found in `MaxText@tpu-recipes-v0.1.2`:

 ```
 MaxTextModel(

@@ -76,4 +76,4 @@ MaxTextModel(
 )
 ```

-This equivalent workload code can be found in the [maxtext_trillium_model_configs.py](https://github.com/AI-Hypercomputer/maxtext/blob/tpu-recipes-v0.1.0/benchmarks/maxtext_trillium_model_configs.py#L410) file within the MaxText repository.
+This equivalent workload code can be found in the [maxtext_trillium_model_configs.py](https://github.com/AI-Hypercomputer/maxtext/blob/tpu-recipes-v0.1.2/benchmarks/maxtext_trillium_model_configs.py) file within the MaxText repository.

training/trillium/Llama3-8B-MaxText/v6e-8/README.md

Lines changed: 4 additions & 4 deletions

@@ -8,9 +8,9 @@ Please follow this [link](https://github.com/AI-Hypercomputer/tpu-recipes/blob/m
 ### Install MaxText and Build Docker Image
 Please follow this [link](https://github.com/AI-Hypercomputer/tpu-recipes/blob/main/training/trillium/MAXTEXT_README.md) to install maxtext and build the docker image. The following variables should be set:

-In step 1, use the MaxText [tpu-recipes-v0.1.1](https://github.com/AI-Hypercomputer/maxtext/releases/tag/tpu-recipes-v0.1.1) tag to run this recipe:
+In step 1, use the MaxText [tpu-recipes-v0.1.2](https://github.com/AI-Hypercomputer/maxtext/releases/tag/tpu-recipes-v0.1.2) tag to run this recipe:
 ```
-git checkout tpu-recipes-v0.1.1
+git checkout tpu-recipes-v0.1.2
 ```

 In step 2, use the jax-stable-stack image containing JAX 0.5.2:

@@ -44,7 +44,7 @@ If you would like to run on multiple slices of v6e-8, you may modify the `--num_

 ### Workload Details

-For reference, here are the `llama3_1_8b_8192_no_collective_matmul` workload details as found in `MaxText@tpu-recipes-v0.1.0`:
+For reference, here are the `llama3_1_8b_8192_no_collective_matmul` workload details as found in `MaxText@tpu-recipes-v0.1.2`:

 ```
 MaxTextModel(

@@ -90,4 +90,4 @@ For reference, here are the `llama3_1_8b_8192_no_collective_matmul` workload det
 )
 ```

-This equivalent workload code can be found in the [maxtext_trillium_model_configs.py](https://github.com/AI-Hypercomputer/maxtext/blob/tpu-recipes-v0.1.0/benchmarks/maxtext_trillium_model_configs.py#L858-L901) file within the MaxText repository.
+This equivalent workload code can be found in the [maxtext_trillium_model_configs.py](https://github.com/AI-Hypercomputer/maxtext/blob/tpu-recipes-v0.1.2/benchmarks/maxtext_trillium_model_configs.py) file within the MaxText repository.

training/trillium/Llama3.1-405B-MaxText/README.md

Lines changed: 4 additions & 4 deletions

@@ -8,9 +8,9 @@ Please follow this [link](https://github.com/AI-Hypercomputer/tpu-recipes/blob/m
 ### Install MaxText and Build Docker Image
 Please follow this [link](https://github.com/AI-Hypercomputer/tpu-recipes/blob/main/training/trillium/MAXTEXT_README.md) to install maxtext and build the docker image. The following variables should be set:

-In step 1, use the MaxText [tpu-recipes-v0.1.1](https://github.com/AI-Hypercomputer/maxtext/releases/tag/tpu-recipes-v0.1.1) tag to run this recipe:
+In step 1, use the MaxText [tpu-recipes-v0.1.2](https://github.com/AI-Hypercomputer/maxtext/releases/tag/tpu-recipes-v0.1.2) tag to run this recipe:
 ```
-git checkout tpu-recipes-v0.1.1
+git checkout tpu-recipes-v0.1.2
 ```

 In step 2, use the jax-stable-stack image containing JAX 0.5.2:

@@ -43,7 +43,7 @@ completed step: 14, seconds: 54.803, TFLOP/s/device: 392.454, Tokens/s/device: 1

 ### Workload Details

-For reference, here are the `llama3_1_405b_8192_pure_fsdp_ici` workload details as found in `MaxText@tpu-recipes-v0.1.0`:
+For reference, here are the `llama3_1_405b_8192_pure_fsdp_ici` workload details as found in `MaxText@tpu-recipes-v0.1.2`:

 ```
 MaxTextModel(

@@ -76,4 +76,4 @@ MaxTextModel(
 )
 ```

-This equivalent workload code can be found in the [maxtext_trillium_model_configs.py](https://github.com/AI-Hypercomputer/maxtext/blob/tpu-recipes-v0.1.0/benchmarks/maxtext_trillium_model_configs.py#L767) file within the MaxText repository.
+This equivalent workload code can be found in the [maxtext_trillium_model_configs.py](https://github.com/AI-Hypercomputer/maxtext/blob/tpu-recipes-v0.1.2/benchmarks/maxtext_trillium_model_configs.py) file within the MaxText repository.

training/trillium/Llama3.1-70B-MaxText/README.md

Lines changed: 4 additions & 4 deletions

@@ -8,9 +8,9 @@ Please follow this [link](https://github.com/AI-Hypercomputer/tpu-recipes/blob/m
 ### Install MaxText and Build Docker Image
 Please follow this [link](https://github.com/AI-Hypercomputer/tpu-recipes/blob/main/training/trillium/MAXTEXT_README.md) to install maxtext and build the docker image. The following variables should be set:

-In step 1, use the MaxText [tpu-recipes-v0.1.1](https://github.com/AI-Hypercomputer/maxtext/releases/tag/tpu-recipes-v0.1.1) tag to run this recipe:
+In step 1, use the MaxText [tpu-recipes-v0.1.2](https://github.com/AI-Hypercomputer/maxtext/releases/tag/tpu-recipes-v0.1.2) tag to run this recipe:
 ```
-git checkout tpu-recipes-v0.1.1
+git checkout tpu-recipes-v0.1.2
 ```

 In step 2, use the jax-stable-stack image containing JAX 0.5.2:

@@ -44,7 +44,7 @@ If you would like to run on multiple slices of v6e-256, you may modify the `--nu

 ### Workload Details

-For reference, here are the `llama3_1_70b_8192` workload details as found in `MaxText@tpu-recipes-v0.1.0`:
+For reference, here are the `llama3_1_70b_8192` workload details as found in `MaxText@tpu-recipes-v0.1.2`:

 ```
 MaxTextModel(

@@ -87,4 +87,4 @@ For reference, here are the `llama3_1_70b_8192` workload details as found in `Ma
 )
 ```

-This equivalent workload code can be found in the [maxtext_trillium_model_configs.py](https://github.com/AI-Hypercomputer/maxtext/blob/243b25e480f7550a0c389fa95cd3adcc716fe0df/benchmarks/maxtext_trillium_model_configs.py#L932-L972) file within the MaxText repository.
+This equivalent workload code can be found in the [maxtext_trillium_model_configs.py](https://github.com/AI-Hypercomputer/maxtext/blob/243b25e480f7550a0c389fa95cd3adcc716fe0df/benchmarks/maxtext_trillium_model_configs.py) file within the MaxText repository.

training/trillium/Mistral-7B-MaxText/README.md

Lines changed: 4 additions & 4 deletions

@@ -8,9 +8,9 @@ Please follow this [link](https://github.com/AI-Hypercomputer/tpu-recipes/blob/m
 ### Install MaxText and Build Docker Image
 Please follow this [link](https://github.com/AI-Hypercomputer/tpu-recipes/blob/main/training/trillium/MAXTEXT_README.md) to install maxtext and build the docker image. The following variables should be set:

-In step 1, use the MaxText [tpu-recipes-v0.1.1](https://github.com/AI-Hypercomputer/maxtext/releases/tag/tpu-recipes-v0.1.1) tag to run this recipe:
+In step 1, use the MaxText [tpu-recipes-v0.1.2](https://github.com/AI-Hypercomputer/maxtext/releases/tag/tpu-recipes-v0.1.2) tag to run this recipe:
 ```
-git checkout tpu-recipes-v0.1.1
+git checkout tpu-recipes-v0.1.2
 ```

 In step 2, use the jax-stable-stack image containing JAX 0.5.2:

@@ -44,7 +44,7 @@ If you would like to run on multiple slices of v6e-8, you may modify the `--num_

 ### Workload Details

-For reference, here are the `mistral_7b` workload details as found in `MaxText@tpu-recipes-v0.1.0`:
+For reference, here are the `mistral_7b` workload details as found in `MaxText@tpu-recipes-v0.1.2`:

 ```
 MaxTextModel(

@@ -90,4 +90,4 @@ For reference, here are the `mistral_7b` workload details as found in `MaxText@t
 )
 ```

-This equivalent workload code can be found in the [maxtext_trillium_model_configs.py](https://github.com/AI-Hypercomputer/maxtext/blob/tpu-recipes-v0.1.0/benchmarks/maxtext_trillium_model_configs.py#L1217-L1260) file within the MaxText repository.
+This equivalent workload code can be found in the [maxtext_trillium_model_configs.py](https://github.com/AI-Hypercomputer/maxtext/blob/tpu-recipes-v0.1.2/benchmarks/maxtext_trillium_model_configs.py) file within the MaxText repository.

training/trillium/Mixtral-8x7B-MaxText/README.md

Lines changed: 4 additions & 4 deletions

@@ -8,9 +8,9 @@ Please follow this [link](https://github.com/AI-Hypercomputer/tpu-recipes/blob/m
 ### Install MaxText and Build Docker Image
 Please follow this [link](https://github.com/AI-Hypercomputer/tpu-recipes/blob/main/training/trillium/MAXTEXT_README.md) to install maxtext and build the docker image. The following variables should be set:

-In step 1, use the MaxText [tpu-recipes-v0.1.1](https://github.com/AI-Hypercomputer/maxtext/releases/tag/tpu-recipes-v0.1.1) tag to run this recipe:
+In step 1, use the MaxText [tpu-recipes-v0.1.2](https://github.com/AI-Hypercomputer/maxtext/releases/tag/tpu-recipes-v0.1.2) tag to run this recipe:
 ```
-git checkout tpu-recipes-v0.1.1
+git checkout tpu-recipes-v0.1.2
 ```

 In step 2, use the jax-stable-stack image containing JAX 0.5.2:

@@ -44,7 +44,7 @@ completed step: 11, seconds: 13.484, TFLOP/s/device: 302.311, Tokens/s/device: 3

 ### Workload Details

-For reference, here are the `mixtral_8x7b_dropped` workload details as found in `MaxText@tpu-recipes-v0.1.0`:
+For reference, here are the `mixtral_8x7b_dropped` workload details as found in `MaxText@tpu-recipes-v0.1.2`:

 ```
 MaxTextModel(

@@ -84,4 +84,4 @@ MaxTextModel(
 )
 ```

-This equivalent workload code can be found in the [maxtext_trillium_model_configs.py](https://github.com/AI-Hypercomputer/maxtext/blob/tpu-recipes-v0.1.0/benchmarks/maxtext_trillium_model_configs.py#L1296) file within the MaxText repository.
+This equivalent workload code can be found in the [maxtext_trillium_model_configs.py](https://github.com/AI-Hypercomputer/maxtext/blob/tpu-recipes-v0.1.2/benchmarks/maxtext_trillium_model_configs.py) file within the MaxText repository.
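
Since every recipe in this commit moves from the older tpu-recipes-v0.1.0/v0.1.1 references to tpu-recipes-v0.1.2, a quick sanity check is to grep for the old tags. A minimal sketch, assuming it is run from the root of a tpu-recipes checkout that includes this commit:

```
# Sketch only: report any MaxText recipe README that still references an old tag.
grep -rn --include="README.md" -E "tpu-recipes-v0\.1\.[01]" training/trillium/ \
  && echo "stale tag references found" \
  || echo "no stale tpu-recipes-v0.1.0/v0.1.1 references under training/trillium/"
```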
