`docs/tutorials/grpo.md` (16 additions, 2 deletions)

[…] And we use vLLM as the library for efficient model inference and generation.

In this tutorial we use a single-host TPU VM such as `v6e-8/v5p-8`. Let's get started!

## Create virtual environment and install MaxText dependencies

If you have already completed the [MaxText installation](https://github.com/AI-Hypercomputer/maxtext/blob/main/docs/guides/install_maxtext.md), you can skip to the next section for the vLLM and tpu-inference installations. Otherwise, please install MaxText using the following commands before proceeding.
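
The exact commands are elided from this diff; below is a minimal sketch, assuming the layout the original text recommended (virtual environment outside the `maxtext` directory). The venv path and the `setup.sh` entry point are assumptions, so defer to the linked install guide.

```bash
# Sketch only; the linked install guide is authoritative.
# Create the virtual environment outside the maxtext directory (path is an assumption).
python3 -m venv ~/maxtext-venv
source ~/maxtext-venv/bin/activate

# Clone MaxText and install its dependencies.
git clone https://github.com/AI-Hypercomputer/maxtext.git
cd maxtext
bash setup.sh  # assumption: MaxText's dependency setup script
```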

[…]

## Build and Upload MaxText Docker Image with Tunix, vLLM, and tpu-inference dependencies

Before building the Docker image, authenticate to [Google Artifact Registry](https://docs.cloud.google.com/artifact-registry/docs/docker/authentication#gcloud-helper) so that you have permission to push your images, along with other required access.

```bash
# Authenticate your user account for gcloud CLI access
gcloud auth login

# Configure application default credentials for Docker and other tools
gcloud auth application-default login

# Configure Docker credentials and test your access
gcloud auth configure-docker
docker run hello-world
```

You can install the required dependencies using either of the following two options:

### Option 1: Installing stable releases of tunix and vllm-tpu

Run the following bash script to create a docker image with all the dependencies of MaxText, Tunix, vLLM, and tpu-inference installed.

In addition to the MaxText dependencies, it primarily installs `vllm-tpu`, i.e. [vllm](https://github.com/vllm-project/vllm) plus [tpu-inference](https://github.com/vllm-project/tpu-inference), thereby providing TPU inference for vLLM with unified JAX and PyTorch support.
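
The stable build invocation itself is elided between hunks of this diff; reconstructed from the command strings visible elsewhere in it, it is presumably the following. Treat this as a sketch rather than the verbatim script text.

```bash
# Build the dependency image with post-training dependencies (Tunix, vLLM, tpu-inference)
bash dependencies/scripts/docker_build_dependency_image.sh MODE=post-training
```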

[…]

You can also use `bash dependencies/scripts/docker_build_dependency_image.sh MODE=post-training-experimental` to try out new features via experimental dependencies, such as the improved pathwaysutils resharding API.

### Option 2: Install from locally git-cloned repositories

You can also locally git clone [tunix](https://github.com/google/tunix), [tpu-inference](https://github.com/vllm-project/tpu-inference), and [vllm](https://github.com/vllm-project/vllm.git), and then use the following command to build a docker image from them:

```
bash dependencies/scripts/docker_build_dependency_image.sh MODE=post-training PO…
```
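
For the clone step itself, a sketch follows; placing the repositories alongside (not inside) your `maxtext` checkout is an assumption, since the diff does not show the intended directory layout.

```bash
# Clone the three repositories next to the maxtext checkout (layout is an assumption).
git clone https://github.com/google/tunix.git
git clone https://github.com/vllm-project/tpu-inference.git
git clone https://github.com/vllm-project/vllm.git
```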

[…]

Please create a Pathways-ready GKE cluster as described [here](https://docs.cloud.google.com/ai-hypercomputer/docs/workloads/pathways-on-cloud/create-gke-cluster); you can then submit the `train_rl.py` script via [XPK](https://github.com/AI-Hypercomputer/xpk).
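
A hypothetical XPK submission is sketched below. The workload name, cluster, Docker image, TPU type, and script path are all placeholders, and the exact flags should be checked against the XPK documentation for your version.

```bash
# Hypothetical invocation; every value below is a placeholder.
xpk workload create \
  --workload=grpo-llama3-70b \
  --cluster=my-pathways-cluster \
  --docker-image=us-docker.pkg.dev/my-project/my-repo/maxtext-post-training:latest \
  --tpu-type=v5p-128 \
  --num-slices=1 \
  --command="python3 src/MaxText/train_rl.py ..."  # script arguments omitted
```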

The overview of the demo script `~/maxtext/src/MaxText/examples/grpo_llama3_1_70b_demo_pw.py` is as follows:

1. We load a policy model and a reference model. Both are copies of `Llama3.1-70b-Instruct`.
2. Evaluate the policy model's performance on the GSM8K math reasoning benchmark.
3. Train the policy model using GRPO, with potentially different meshes for the trainer and the rollout depending on the parameters `TRAINER_DEVICES_FRACTION` and `SAMPLER_DEVICES_FRACTION`. If we set both of these to `1.0`, the entire (same) mesh is used for both trainer and rollout. If we set, say, `TRAINER_DEVICES_FRACTION=0.5` and `SAMPLER_DEVICES_FRACTION=0.5`, the first half of the devices is used for the trainer and the second half for the rollout (see the sketch after this list).
4. Evaluate the policy model's performance on the GSM8K math reasoning benchmark after post-training with GRPO.
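
To make step 3 concrete, here is a minimal conceptual sketch of how the fractional settings partition one host's chips; MaxText does this internally in Python/JAX, and the values below are illustrative only.

```bash
# Conceptual sketch only; MaxText's real device assignment lives in Python/JAX.
TOTAL=8                       # chips on a v6e-8/v5p-8 host
TRAINER_DEVICES_FRACTION=0.5
SAMPLER_DEVICES_FRACTION=0.5

# Compute device counts (bash lacks float arithmetic, so delegate to python3).
NUM_TRAINER=$(python3 -c "print(int($TOTAL * $TRAINER_DEVICES_FRACTION))")
NUM_SAMPLER=$(python3 -c "print(int($TOTAL * $SAMPLER_DEVICES_FRACTION))")

# With 0.5/0.5 the trainer gets devices 0..3 and the sampler gets 4..7;
# with 1.0/1.0 both ranges cover the entire (shared) mesh.
echo "trainer devices: 0..$((NUM_TRAINER - 1))"
echo "sampler devices: $((TOTAL - NUM_SAMPLER))..$((TOTAL - 1))"
```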

`docs/tutorials/sft_on_multi_host.md` (11 additions, 1 deletion)

[…]

```
cd maxtext
```

### 1.2. Build MaxText Docker image

Before building the Docker image, authenticate to [Google Artifact Registry](https://docs.cloud.google.com/artifact-registry/docs/docker/authentication#gcloud-helper) so that you have permission to push your images, along with other required access.

```bash
# Authenticate your user account for gcloud CLI access
gcloud auth login

# Configure application default credentials for Docker and other tools
gcloud auth application-default login

# Configure Docker credentials and test your access
gcloud auth configure-docker
docker run hello-world
```

Then run the following command to create a local Docker image named `maxtext_base_image`.
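
That command is elided from this diff; judging from the build script used in the GRPO tutorial above, it is presumably along the following lines (the exact invocation and any flags are assumptions).

```bash
# Assumed invocation; expected to produce a local image tagged maxtext_base_image.
bash dependencies/scripts/docker_build_dependency_image.sh
```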