Commit c836743 ("amend")

1 parent 8c32897

4 files changed: 7 additions, 31 deletions


docs/install_maxtext.md

Lines changed: 0 additions & 24 deletions

````diff
@@ -156,27 +156,3 @@ install_maxtext_github_deps
 ```
 
 3. **Run tests:** Run MaxText tests to ensure there are no regressions.
-
-## Appendix: Install XPK for MaxText Multi-host Workloads
-
-> **_NOTE:_** XPK is only required for multi-host TPU configurations (e.g., v5p-128, v6e-256). For single-host training, XPK is not needed and you can run MaxText directly on your TPU VM.
-
-XPK (Accelerated Processing Kit) is a tool designed to simplify the orchestration and management of workloads on Google Kubernetes Engine (GKE) clusters with TPU or GPU accelerators. In MaxText, we use XPK to submit both pre-training and post-training jobs on multi-host TPU configurations.
-
-For your convenience, we provide a minimal installation path below:
-```bash
-# Directly install xpk using pip
-pip install xpk
-
-# Install kubectl
-sudo apt-get update
-sudo apt install snapd
-sudo snap install kubectl --classic
-
-# Install gke-gcloud-auth-plugin
-echo "deb [signed-by=/usr/share/keyrings/cloud.google.gpg] https://packages.cloud.google.com/apt cloud-sdk main" | sudo tee -a /etc/apt/sources.list.d/google-cloud-sdk.list
-curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key --keyring /usr/share/keyrings/cloud.google.gpg add -
-sudo apt update && sudo apt-get install google-cloud-sdk-gke-gcloud-auth-plugin
-```
-
-For detailed setup instructions and advanced features, please refer to the [official XPK documentation](https://github.com/AI-Hypercomputer/xpk).
````
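The removed appendix installs three tools (`xpk`, `kubectl`, and the GKE auth plugin). A quick sanity check that they all landed on PATH can be sketched as follows; this is a minimal illustration that only reports, it does not install anything:

```shell
# Report whether each tool from the appendix is on PATH.
# `command -v` is POSIX-portable; nothing is executed beyond the lookup.
status=""
for tool in xpk kubectl gke-gcloud-auth-plugin; do
  if command -v "$tool" >/dev/null 2>&1; then
    status="${status} ${tool}=found"
  else
    status="${status} ${tool}=missing"
  fi
done
echo "${status}"
```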

docs/tutorials/posttraining/rl.md

Lines changed: 5 additions & 5 deletions

````diff
@@ -78,7 +78,7 @@ export HF_TOKEN=<Hugging Face access token>
 export BASE_OUTPUT_DIRECTORY=<output directory to store run logs> # e.g., gs://my-bucket/my-output-directory
 
 export RUN_NAME=<name for this run> # e.g., $(date +%Y-%m-%d-%H-%M-%S)
-export MAXTEXT_CKPT_PATH=${BASE_OUTPUT_DIRECTORY}/${RUN_NAME}/0/items
+export MAXTEXT_CKPT_PATH=${BASE_OUTPUT_DIRECTORY}/${RUN_NAME}/0/items # Actual checkpoint saved with an extra /0/items path suffix
 ```
 
 ## Get your model checkpoint
@@ -93,7 +93,7 @@ python3 -m pip install torch --index-url https://download.pytorch.org/whl/cpu
 python3 -m MaxText.utils.ckpt_conversion.to_maxtext src/MaxText/configs/base.yml \
 model_name=${HF_MODEL} \
 hf_access_token=${HF_TOKEN} \
-base_output_directory=${MAXTEXT_CKPT_PATH} \
+base_output_directory=${BASE_OUTPUT_DIRECTORY}/${RUN_NAME} \
 scan_layers=True hardware=cpu skip_jax_distributed_system=true
 
 # Example of converting Llama3.1-70B using --lazy_load_tensor=true which uses around 86GB of RAM
@@ -117,7 +117,7 @@ Run the following command for GRPO:
 python3 -m src.MaxText.rl.train_rl src/MaxText/configs/rl.yml \
 model_name=${MODEL} \
 tokenizer_path=${TOKENIZER} \
-load_parameters_path=${MAXTEXT_CKPT_PATH}/0/items \
+load_parameters_path=${MAXTEXT_CKPT_PATH} \
 run_name=${RUN_NAME} \
 base_output_directory=${BASE_OUTPUT_DIRECTORY} \
 hf_access_token=${HF_TOKEN}
@@ -138,7 +138,7 @@ Run the following command for GSPO:
 python3 -m src.MaxText.rl.train_rl src/MaxText/configs/rl.yml \
 model_name=${MODEL} \
 tokenizer_path=${TOKENIZER} \
-load_parameters_path=${MAXTEXT_CKPT_PATH}/0/items \
+load_parameters_path=${MAXTEXT_CKPT_PATH} \
 run_name=${RUN_NAME} \
 base_output_directory=${BASE_OUTPUT_DIRECTORY} \
 hf_access_token=${HF_TOKEN} \
@@ -147,7 +147,7 @@ python3 -m src.MaxText.rl.train_rl src/MaxText/configs/rl.yml \
 
 The overview of what this run will do is as follows:
 
-1. We load a policy model and a reference model. Both are copies of `Llama3.1-8b-Instruct`.
+1. We load a policy model and a reference model. Both are copies of the model checkpoint you specified (e.g., `Llama3.1-8b-Instruct`).
 2. Evaluate the policy model's performance on GSM8K math reasoning benchmark.
 3. Train the policy model using GSPO.
 4. Evaluate the policy model's performance on GSM8K math reasoning benchmark after the post-training with GSPO.
````
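The path fix above follows one convention: the conversion step is pointed at the run directory, while the converted checkpoint itself lands under an extra `/0/items` suffix, so that suffix must appear exactly once in `load_parameters_path`. A minimal sketch (the bucket and run name below are example assumptions):

```shell
# Example values only; real runs set these to your own bucket and run name.
BASE_OUTPUT_DIRECTORY=gs://my-bucket/my-output-directory
RUN_NAME=2024-01-01-00-00-00

# to_maxtext writes into the run directory...
CONVERSION_OUTPUT_DIR=${BASE_OUTPUT_DIRECTORY}/${RUN_NAME}
# ...and the checkpoint itself is saved under an extra /0/items suffix,
# which is why MAXTEXT_CKPT_PATH now bakes the suffix in once.
MAXTEXT_CKPT_PATH=${CONVERSION_OUTPUT_DIR}/0/items

echo "${MAXTEXT_CKPT_PATH}"
```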

docs/tutorials/posttraining/rl_on_multi_host.md

Lines changed: 1 addition & 1 deletion

````diff
@@ -116,7 +116,7 @@ bash dependencies/scripts/docker_upload_runner.sh CLOUD_IMAGE_NAME=${CLOUD_IMAGE
 
 ## Submit your RL workload via Pathways
 
-Please create a pathways ready GKE cluster as described [here](https://docs.cloud.google.com/ai-hypercomputer/docs/workloads/pathways-on-cloud/create-gke-cluster), and you can submit the `train_rl.py` script via [XPK](https://github.com/AI-Hypercomputer/xpk).
+Please create a pathways ready GKE cluster as described [here](https://docs.cloud.google.com/ai-hypercomputer/docs/workloads/pathways-on-cloud/create-gke-cluster), and you can submit the `train_rl.py` script via [XPK](https://github.com/AI-Hypercomputer/xpk). We also provide a quick guide for XPK installation and usage [here](https://maxtext.readthedocs.io/en/latest/run_maxtext/run_maxtext_via_xpk.html).
 
 ### Submit GRPO workload
 ```
````
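A submission along the lines described above can be sketched as follows. This only composes and prints the `xpk workload create` command rather than executing it; the cluster, TPU type, and workload names are placeholder assumptions, and the flag names follow the XPK README (Pathways clusters may need additional Pathways-specific options; see the XPK docs):

```shell
# Placeholder values; substitute your own cluster and image.
CLUSTER=my-pathways-cluster                  # assumption: Pathways-ready GKE cluster
TPU_TYPE=v5p-128                             # assumption: a multi-host TPU topology
DOCKER_IMAGE=${CLOUD_IMAGE_NAME:-my-runner-image}
WORKLOAD=rl-grpo-run

# Build the command so it can be reviewed before running.
CMD="xpk workload create --workload ${WORKLOAD} --cluster ${CLUSTER} \
--tpu-type=${TPU_TYPE} --num-slices=1 --docker-image=${DOCKER_IMAGE} \
--command 'python3 -m src.MaxText.rl.train_rl src/MaxText/configs/rl.yml'"

echo "${CMD}"
```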

docs/tutorials/posttraining/sft_on_multi_host.md

Lines changed: 1 addition & 1 deletion

````diff
@@ -58,7 +58,7 @@ bash dependencies/scripts/docker_upload_runner.sh CLOUD_IMAGE_NAME=$DOCKER_IMAGE
 The `docker_upload_runner.sh` script uploads your Docker image to Artifact Registry.
 
 ## 2. Install XPK
-Install XPK by following the instructions in the [official documentation](https://github.com/AI-Hypercomputer/xpk?tab=readme-ov-file#installation-via-pip).
+Install XPK by following the instructions in the [official documentation](https://github.com/AI-Hypercomputer/xpk?tab=readme-ov-file#installation-via-pip). We also provide a quick guide for XPK installation and usage [here](https://maxtext.readthedocs.io/en/latest/run_maxtext/run_maxtext_via_xpk.html).
 
 ## 3. Create GKE cluster
 Use a pathways ready GKE cluster as described [here](https://docs.cloud.google.com/ai-hypercomputer/docs/workloads/pathways-on-cloud/create-gke-cluster).
````
