Commit 12a7e12

update README
1 parent 1e4c964 commit 12a7e12

1 file changed

+4
-4
lines changed
  • inference/trillium/JetStream-Maxtext/Llama-4-Maverick-17B-128E

inference/trillium/JetStream-Maxtext/Llama-4-Maverick-17B-128E/README.md

Lines changed: 4 additions & 4 deletions
@@ -248,8 +248,8 @@ gcloud container clusters get-credentials $CLUSTER_NAME --region $CLUSTER_REGION
 The recipe serves Llama-4-Maverick-17B-128E model using JetStream MaxText Engine on `v6e-32` mulithost slice of TPU v6e Trillium
 
 To start the inference, the recipe launches JetStream MaxText Engine that does the following steps:
-1. Downloads the full Llama-4-Maverick-17B-128E model PyTorch checkpoints from [Hugging Face](https://huggingface.co/meta-llama/Llama-4-Maverick-17B-128E-Original).
-2. Convert the model checkpoints from PyTorch format to JAX Orbax format.
+1. Downloads the full Llama-4-Maverick-17B-128E model Hugging Face checkpoints from [Hugging Face](https://huggingface.co/meta-llama/Llama-4-Maverick-17B-128E).
+2. Convert the model checkpoints from Hugging Face format to JAX Orbax format.
 3. Start the JetStream MaxText Engine server.
 3. Inference is ready to respond to requests and run benchmarks
 
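Step 1 in the hunk above now points at the Hugging Face-format repository rather than the `-Original` PyTorch one. As a rough illustration only, the sketch below shows what an equivalent manual download could look like, assuming the `huggingface-cli` tool from the `huggingface_hub` package is installed and that access to the model repository has already been granted (Llama repositories are typically gated and need an access token). This is not the recipe's own job script: `CKPT_DIR`, `RUN`, and `build_download_cmd` are hypothetical names introduced for the example.

```bash
set -eu

# Repository named in the updated README line; override via the environment if needed.
MODEL_ID="${MODEL_ID:-meta-llama/Llama-4-Maverick-17B-128E}"
# Hypothetical local target directory (the recipe itself writes to a mounted GCS bucket).
CKPT_DIR="${CKPT_DIR:-/tmp/llama4-maverick-hf}"

# Print the download command so it can be reviewed before anything is fetched.
build_download_cmd() {
  echo "huggingface-cli download ${MODEL_ID} --local-dir ${CKPT_DIR}"
}

# Dry run by default; set RUN=1 to actually download the checkpoint.
if [ "${RUN:-0}" = "1" ]; then
  huggingface-cli download "${MODEL_ID}" --local-dir "${CKPT_DIR}"
else
  build_download_cmd
fi
```

Running the script without `RUN=1` only prints the command, which is a cheap way to confirm the repository ID matches the one in the diff before committing to a very large download.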
@@ -267,8 +267,8 @@ The recipe uses the helm chart to run the above steps.
 --dry-run=client -o yaml | kubectl apply -f -
 ```
 
-2. Convert the checkpoint from PyTorch to Orbax
-This job converts the checkpoint from PyTorch format to JAX Orbax format and unscans it for performant serving. This unscanned checkpoint is then stored in the mounted GCS bucket so that it can be used by the TPU nodepool to bring up the JetStream serve in the next step.
+2. Convert the checkpoint from Hugging Face to Orbax
+This job converts the checkpoint from Hugging Face format to JAX Orbax format and unscans it for performant serving. This unscanned checkpoint is then stored in the mounted GCS bucket so that it can be used by the TPU nodepool to bring up the JetStream serve in the next step.
 
 ```bash
 cd $RECIPE_ROOT
