docs/install_maxtext.md (0 additions, 24 deletions)
@@ -156,27 +156,3 @@ install_maxtext_github_deps
```
3. **Run tests:** Run MaxText tests to ensure there are no regressions.

## Appendix: Install XPK for MaxText Multi-host Workloads

> **_NOTE:_** XPK is only required for multi-host TPU configurations (e.g., v5p-128, v6e-256). For single-host training, XPK is not needed and you can run MaxText directly on your TPU VM.

XPK (Accelerated Processing Kit) is a tool designed to simplify the orchestration and management of workloads on Google Kubernetes Engine (GKE) clusters with TPU or GPU accelerators. In MaxText, we use XPK to submit both pre-training and post-training jobs on multi-host TPU configurations.

For your convenience, we provide a minimal installation path below:

```bash
# Directly install xpk using pip
pip install xpk

# Install kubectl
sudo apt-get update
sudo apt install snapd
sudo snap install kubectl --classic

# Install gke-gcloud-auth-plugin
echo "deb [signed-by=/usr/share/keyrings/cloud.google.gpg] https://packages.cloud.google.com/apt cloud-sdk main" | sudo tee -a /etc/apt/sources.list.d/google-cloud-sdk.list
```
Please create a Pathways-ready GKE cluster as described [here](https://docs.cloud.google.com/ai-hypercomputer/docs/workloads/pathways-on-cloud/create-gke-cluster), then submit the `train_rl.py` script via [XPK](https://github.com/AI-Hypercomputer/xpk). We also provide a quick guide to XPK installation and usage [here](https://maxtext.readthedocs.io/en/latest/run_maxtext/run_maxtext_via_xpk.html).
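As a rough illustration of what an XPK submission can look like, here is a sketch of `xpk workload create`; the cluster name, workload name, TPU type, and image below are placeholders, and the exact flag names should be verified against the XPK documentation:

```bash
# All values are hypothetical placeholders; confirm flags with `xpk workload create --help`.
xpk workload create \
  --cluster=my-pathways-cluster \
  --workload=train-rl-demo \
  --tpu-type=v5p-128 \
  --num-slices=1 \
  --docker-image=us-docker.pkg.dev/my-project/my-repo/maxtext-runner:latest \
  --command="python3 train_rl.py <config and overrides>"
```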
The `docker_upload_runner.sh` script uploads your Docker image to Artifact Registry.
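A typical invocation looks something like the following; the `CLOUD_IMAGE_NAME` argument is an assumption based on MaxText's runner scripts and should be checked against the script itself:

```bash
# Hypothetical invocation; check docker_upload_runner.sh for its actual parameters.
bash docker_upload_runner.sh CLOUD_IMAGE_NAME=${USER}_maxtext_runner
```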
## 2. Install XPK
Install XPK by following the instructions in the [official documentation](https://github.com/AI-Hypercomputer/xpk?tab=readme-ov-file#installation-via-pip). We also provide a quick guide for XPK installation and usage [here](https://maxtext.readthedocs.io/en/latest/run_maxtext/run_maxtext_via_xpk.html).
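For reference, the pip route is a minimal sketch consistent with the appendix above:

```bash
# Install the XPK CLI and confirm it is reachable on PATH.
pip install xpk
xpk --help
```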
## 3. Create GKE cluster
Use a Pathways-ready GKE cluster as described [here](https://docs.cloud.google.com/ai-hypercomputer/docs/workloads/pathways-on-cloud/create-gke-cluster).
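If you manage clusters through XPK, creating a Pathways-ready cluster may look roughly like the sketch below; the subcommand and flag names are assumptions, and the linked Google Cloud guide remains the authoritative reference:

```bash
# Hypothetical example; verify the subcommand and flag names with `xpk --help`.
xpk cluster create-pathways \
  --cluster=my-pathways-cluster \
  --tpu-type=v5p-128 \
  --num-slices=1
```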