diff --git a/docs/guides/run_python_notebook.md b/docs/guides/run_python_notebook.md
index 3d60844be..7b864ee89 100644
--- a/docs/guides/run_python_notebook.md
+++ b/docs/guides/run_python_notebook.md
@@ -69,9 +69,9 @@ You can run Python notebooks on a local JupyterLab environment, giving you full
 ### Step 1: Set Up TPU VM
 
-In Google Cloud Console:
+In Google Cloud Console, create a standalone TPU VM:
 
-1.a. **Compute Engine** → **TPU** → **Create TPU**
+1.a. **Compute Engine** → **TPUs** → **Create TPU**
 1.b. Example config:
    - **Name:** `maxtext-tpu-node`
@@ -118,12 +118,12 @@ jupyter lab --ip=0.0.0.0 --port=8888 --no-browser --allow-root
 ### Supervised Fine-Tuning (SFT)
 
-- **`sft_qwen3_demo.ipynb`** → Qwen3-0.6B SFT training and evaluation on [OpenAI's GSM8K dataset](https://huggingface.co/datasets/openai/gsm8k)
-- **`sft_llama3_demo.ipynb`** → Llama3.1-8B SFT training on [Hugging Face ultrachat_200k dataset](https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k)
+- **`sft_qwen3_demo.ipynb`** → Qwen3-0.6B SFT training and evaluation on [OpenAI's GSM8K dataset](https://huggingface.co/datasets/openai/gsm8k). This notebook is beginner-friendly and runs successfully on Google Colab's free-tier v5e-1 TPU runtime.
+- **`sft_llama3_demo.ipynb`** → Llama3.1-8B SFT training on [Hugging Face ultrachat_200k dataset](https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k). We recommend running this on a v5p-8 TPU VM using the port-forwarding method.
 
 ### Reinforcement Learning (GRPO/GSPO) Training
 
-- **`rl_llama3_demo.ipynb`** → GRPO/GSPO training on [OpenAI's GSM8K dataset](https://huggingface.co/datasets/openai/gsm8k)
+- **`rl_llama3_demo.ipynb`** → GRPO/GSPO training on [OpenAI's GSM8K dataset](https://huggingface.co/datasets/openai/gsm8k). We recommend running this on a v5p-8 TPU VM using the port-forwarding method.
 
 ## Common Pitfalls & Debugging
diff --git a/docs/install_maxtext.md b/docs/install_maxtext.md
index d00b63380..275f0b975 100644
--- a/docs/install_maxtext.md
+++ b/docs/install_maxtext.md
@@ -122,7 +122,7 @@ seed-env \
   --output-dir=generated_gpu_artifacts
 ```
 
-## 4. Update Project Files
+## Step 4: Update Project Files
 
 After generating the new requirements, you need to update the files in the MaxText repository.
@@ -133,7 +133,7 @@ After generating the new requirements, you need to update the files in the MaxTe
 
 2. **Update `extra_deps_from_github.txt` (if necessary):** Currently, MaxText uses a few dependencies, such as `mlperf-logging` and `google-jetstream`, that are installed directly from GitHub source. These are defined in `base_requirements/requirements.txt`, and the `seed-env` tool will carry them over to the generated requirements files.
 
-## 5. Verify the New Dependencies
+## Step 5: Verify the New Dependencies
 
 Finally, test that the new dependencies install correctly and that MaxText runs as expected.
@@ -155,4 +155,4 @@ uv pip install -e .[tpu] --resolution=lowest
 install_maxtext_github_deps
 ```
 
-3. **Run tests:** Run MaxText tests to ensure there are no regressions.
\ No newline at end of file
+3. **Run tests:** Run MaxText tests to ensure there are no regressions.
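+
+   One way to do this, assuming `pytest` is available in your environment, is to run it from the repository root (the test layout may differ in your checkout, so adjust the path or add markers as needed):
+
+   ```bash
+   # Run the MaxText test suite with pytest (illustrative invocation)
+   python3 -m pytest -q tests/
+   ```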
diff --git a/docs/tutorials/posttraining/rl.md b/docs/tutorials/posttraining/rl.md
index 497713164..d28a308cf 100644
--- a/docs/tutorials/posttraining/rl.md
+++ b/docs/tutorials/posttraining/rl.md
@@ -29,7 +29,7 @@ For efficient model inference and response generation during this process, we re
 
 Let's get started!
 
 ## Create virtual environment and Install MaxText dependencies
-If you have already completed the [MaxText installation](https://github.com/AI-Hypercomputer/maxtext/blob/main/docs/guides/install_maxtext.md), you can skip to the next section for post-training dependencies installations. Otherwise, please install `MaxText` using the following commands before proceeding.
+If you have already completed the [MaxText installation](../../install_maxtext.md), you can skip to the next section to install the post-training dependencies. Otherwise, please install `MaxText` using the following commands before proceeding.
 ```bash
 # 1. Clone the repository
 git clone https://github.com/AI-Hypercomputer/maxtext.git
@@ -78,12 +78,20 @@ export HF_TOKEN=
 export BASE_OUTPUT_DIRECTORY= # e.g., gs://my-bucket/my-output-directory
 export RUN_NAME= # e.g., $(date +%Y-%m-%d-%H-%M-%S)
-export MAXTEXT_CKPT_PATH=${BASE_OUTPUT_DIRECTORY}/${RUN_NAME}/0/items
 ```
 
 ## Get your model checkpoint
 
-You can convert a Hugging Face checkpoint to MaxText format using the `src/MaxText/utils/ckpt_conversion/to_maxtext.py` script. This is useful if you have a pre-trained model from Hugging Face that you want to use with MaxText.
+### Option 1: Using an existing MaxText checkpoint
+
+If you already have a MaxText-compatible model checkpoint, simply set the following environment variable and move on to the next section.
+```bash
+export MAXTEXT_CKPT_PATH= # e.g., gs://my-bucket/my-model-checkpoint/0/items
+```
+
+### Option 2: Converting from a Hugging Face checkpoint
+
+Otherwise, you can convert a Hugging Face checkpoint to MaxText format using the `src/MaxText/utils/ckpt_conversion/to_maxtext.py` script. This is useful if you have a pre-trained model from Hugging Face that you want to use with MaxText.
 
 First, ensure you have the necessary dependencies installed. Then, run the conversion script on a CPU machine. For large models, it is recommended to use the `--lazy_load_tensors` flag to reduce memory usage during conversion.
 This command will download the Hugging Face model and convert it to the MaxText format, saving it to the specified GCS bucket.
@@ -93,7 +101,7 @@ python3 -m pip install torch --index-url https://download.pytorch.org/whl/cpu
 python3 -m MaxText.utils.ckpt_conversion.to_maxtext src/MaxText/configs/base.yml \
     model_name=${HF_MODEL} \
     hf_access_token=${HF_TOKEN} \
-    base_output_directory=${MAXTEXT_CKPT_PATH} \
+    base_output_directory=${BASE_OUTPUT_DIRECTORY}/${RUN_NAME} \
     scan_layers=True hardware=cpu skip_jax_distributed_system=true
 
 # Example of converting Llama3.1-70B using --lazy_load_tensor=true which uses around 86GB of RAM
@@ -107,6 +115,11 @@ python3 -m MaxText.utils.ckpt_conversion.to_maxtext MaxText/configs/base.yml \
     --lazy_load_tensors=true
 ```
 
+The converted checkpoint will be saved at the following location. Set this environment variable so it can be used in the GRPO/GSPO training runs below:
+```bash
+export MAXTEXT_CKPT_PATH=${BASE_OUTPUT_DIRECTORY}/${RUN_NAME}/0/items
+```
+
 ## Run GRPO
 
@@ -125,7 +138,7 @@ python3 -m src.MaxText.rl.train_rl src/MaxText/configs/rl.yml \
 
 The overview of what this run will do is as follows:
 
-1. We load a policy model and a reference model. Both are copies of `Llama3.1-8b-Instruct`.
+1. We load a policy model and a reference model. Both are copies of the model checkpoint you specified (e.g., `Llama3.1-8b-Instruct`).
 2. Evaluate the policy model's performance on GSM8K math reasoning benchmark.
 3. Train the policy model using GRPO.
 4. Evaluate the policy model's performance on GSM8K math reasoning benchmark after the post-training with GRPO.
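+
+Once the run finishes, a quick sanity check is to list what was written under your output directory (assuming a GCS bucket as configured above; the exact layout beneath the run directory may vary):
+
+```bash
+# List the artifacts this GRPO run wrote to the output location
+gsutil ls ${BASE_OUTPUT_DIRECTORY}/${RUN_NAME}/
+```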
@@ -136,18 +149,18 @@ Run the following command for GSPO:
 
 ```
 python3 -m src.MaxText.rl.train_rl src/MaxText/configs/rl.yml \
-    model_name=llama3.1-8b \
-    tokenizer_path=meta-llama/Llama-3.1-8B-Instruct \
-    load_parameters_path=gs://path/to/checkpoint/0/items \
-    run_name=$WORKLOAD \
-    base_output_directory=$OUTPUT_PATH \
-    hf_access_token=$HF_TOKEN \
+    model_name=${MODEL} \
+    tokenizer_path=${TOKENIZER} \
+    load_parameters_path=${MAXTEXT_CKPT_PATH} \
+    run_name=${RUN_NAME} \
+    base_output_directory=${BASE_OUTPUT_DIRECTORY} \
+    hf_access_token=${HF_TOKEN} \
     loss_algo=gspo-token
 ```
 
 The overview of what this run will do is as follows:
 
-1. We load a policy model and a reference model. Both are copies of `Llama3.1-8b-Instruct`.
+1. We load a policy model and a reference model. Both are copies of the model checkpoint you specified (e.g., `Llama3.1-8b-Instruct`).
 2. Evaluate the policy model's performance on GSM8K math reasoning benchmark.
 3. Train the policy model using GSPO.
 4. Evaluate the policy model's performance on GSM8K math reasoning benchmark after the post-training with GSPO.
diff --git a/docs/tutorials/posttraining/rl_on_multi_host.md b/docs/tutorials/posttraining/rl_on_multi_host.md
index 54dc717a7..4f1d2bd6e 100644
--- a/docs/tutorials/posttraining/rl_on_multi_host.md
+++ b/docs/tutorials/posttraining/rl_on_multi_host.md
@@ -29,7 +29,7 @@ For efficient model inference and response generation during this process, we re
 
 Let's get started!
 
 ## Create virtual environment and Install MaxText dependencies
-Follow instructions in [Install MaxText](https://github.com/AI-Hypercomputer/maxtext/blob/main/docs/guides/install_maxtext.md), but
+Follow instructions in [Install MaxText](../../install_maxtext.md), but
 recommend creating the virtual environment outside the `maxtext` directory.
 
@@ -93,7 +93,7 @@ You can install the required dependencies using either of the following two opti
 ### Option 1: Installing stable releases of tunix and vllm-tpu
 Run the following bash script to create a docker image with all the dependencies of MaxText, Tunix, vLLM and tpu-inference installed.
-In addition to MaxText dependencies, primarily, it installs `vllm-tpu` which is [vllm](https://github.com/vllm-project/vllm) and [tpu-inference](https://github.com/vllm-project/tpu-inference) and thereby providing TPU inference for vLLM, with unified JAX and PyTorch support.
+In addition to the MaxText dependencies, it primarily installs `vllm-tpu`, which bundles [vllm](https://github.com/vllm-project/vllm) and [tpu-inference](https://github.com/vllm-project/tpu-inference), thereby providing TPU inference for vLLM with unified JAX and PyTorch support. This build process takes approximately 10 to 15 minutes.
 ```
 bash dependencies/scripts/docker_build_dependency_image.sh MODE=post-training
 ```
@@ -109,13 +109,14 @@ bash dependencies/scripts/docker_build_dependency_image.sh MODE=post-training PO
 ```
 
 ### Upload the dependency docker image along with MaxText code
+> **Note:** You will need the [**Artifact Registry Writer**](https://docs.cloud.google.com/artifact-registry/docs/access-control#permissions) role to push Docker images to your project's Artifact Registry and to allow the cluster to pull them during workload execution. If you don't have this permission, ask your project administrator to grant you this role through "Google Cloud Console -> IAM -> Grant access".
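+>
+> If you administer the project yourself, one way to grant this role is with `gcloud` (an illustrative sketch; substitute your own project ID and account):
+>
+> ```bash
+> # Grant the Artifact Registry Writer role to a user account (example values)
+> gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
+>   --member="user:your-name@example.com" \
+>   --role="roles/artifactregistry.writer"
+> ```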
 ```
 bash dependencies/scripts/docker_upload_runner.sh CLOUD_IMAGE_NAME=${CLOUD_IMAGE_NAME}
 ```
 
 ## Submit your RL workload via Pathways
-Please create a pathways ready GKE cluster as described [here](https://docs.cloud.google.com/ai-hypercomputer/docs/workloads/pathways-on-cloud/create-gke-cluster), and you can submit the `train_rl.py` script via [XPK](https://github.com/AI-Hypercomputer/xpk).
+Please create a Pathways-ready GKE cluster as described [here](https://docs.cloud.google.com/ai-hypercomputer/docs/workloads/pathways-on-cloud/create-gke-cluster), then submit the `train_rl.py` script via XPK. You can install XPK by following the instructions in the [official documentation](https://github.com/AI-Hypercomputer/xpk/blob/main/docs/installation.md).
 
 ### Submit GRPO workload
 ```
diff --git a/docs/tutorials/posttraining/sft_on_multi_host.md b/docs/tutorials/posttraining/sft_on_multi_host.md
index 1cad47186..80a008c5c 100644
--- a/docs/tutorials/posttraining/sft_on_multi_host.md
+++ b/docs/tutorials/posttraining/sft_on_multi_host.md
@@ -43,12 +43,13 @@ gcloud auth application-default login
 gcloud auth configure-docker
 docker run hello-world
 ```
-Then run the following command to create a local Docker image named `maxtext_base_image`.
+Then run the following command to create a local Docker image named `maxtext_base_image`. This build process takes approximately 10 to 15 minutes.
 ```bash
 bash dependencies/scripts/docker_build_dependency_image.sh MODE=post-training
 ```
 
 ### 1.3. Upload the Docker image to Artifact Registry
+> **Note:** You will need the [**Artifact Registry Writer**](https://docs.cloud.google.com/artifact-registry/docs/access-control#permissions) role to push Docker images to your project's Artifact Registry and to allow the cluster to pull them during workload execution. If you don't have this permission, ask your project administrator to grant you this role through "Google Cloud Console -> IAM -> Grant access".
 ```bash
 # Replace `$USER_runner` with your desired image name
 export DOCKER_IMAGE_NAME=${USER}_runner
@@ -57,7 +58,7 @@ bash dependencies/scripts/docker_upload_runner.sh CLOUD_IMAGE_NAME=$DOCKER_IMAGE
 The `docker_upload_runner.sh` script uploads your Docker image to Artifact Registry.
 
 ## 2. Install XPK
-Install XPK by following the instructions in the [official documentation](https://github.com/AI-Hypercomputer/xpk?tab=readme-ov-file#installation-via-pip).
+Install XPK by following the instructions in the [official documentation](https://github.com/AI-Hypercomputer/xpk/blob/main/docs/installation.md).
 
 ## 3. Create GKE cluster
 Use a pathways ready GKE cluster as described [here](https://docs.cloud.google.com/ai-hypercomputer/docs/workloads/pathways-on-cloud/create-gke-cluster).
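+
+If you prefer a CLI route, recent XPK releases can also create a Pathways-ready cluster for you. The sketch below is illustrative only: the subcommand and flags vary across XPK versions, and the cluster name, TPU type, zone, and project are placeholders, so verify the exact syntax against the XPK documentation before running it.
+```bash
+# Illustrative: create a Pathways-enabled GKE cluster with XPK (check flags for your XPK version)
+xpk cluster create-pathways \
+  --cluster=my-pathways-cluster \
+  --tpu-type=v5p-8 \
+  --num-slices=1 \
+  --zone=us-east5-a \
+  --project=my-gcp-project
+```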