## Initialization
1. Run the following commands to initialize the project and zone.
```shell
export PROJECT=#<your_project_id>
export ZONE=#<zone>
gcloud config set project $PROJECT
gcloud config set compute/zone $ZONE
```
2. Install [XPK](https://github.com/google/xpk) by following its installation instructions. Also ensure you have the proper [GCP permissions](https://github.com/AI-Hypercomputer/xpk?tab=readme-ov-file#installation).

* In order to run the tpu-recipes as-is, run the `git clone` command from your home directory:
```shell
git clone https://github.com/google/xpk.git
```

3. Run the rest of these commands from the cloned XPK directory:

```shell
cd xpk  # Should be equivalent to cd ~/xpk
```

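Optionally, you can confirm that the project and zone were picked up before moving on. This is just a quick sanity check with standard `gcloud` commands; the values printed should match what you exported above.
```shell
# Print the active project and default compute zone set by the commands above.
gcloud config get-value project
gcloud config get-value compute/zone
```
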
## GKE Cluster Creation
1. Specify your TPU GKE cluster configs.
```shell
export CLUSTER_NAME=v6e-demo  # <your_cluster_name>
export NETWORK_NAME=${CLUSTER_NAME}-only-mtu9k
export NETWORK_FW_NAME=${NETWORK_NAME}-only-fw
export REGION=<compute_region>
```

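The `REGION` value should correspond to the zone chosen earlier (for example, zone `us-east5-b` is in region `us-east5`). If you prefer not to type it by hand, a small shell substitution can derive it from `$ZONE`; this is only a convenience sketch and assumes the usual `<region>-<zone-letter>` naming.
```shell
# Derive the region by stripping the trailing zone letter, e.g. us-east5-b -> us-east5.
export REGION=${ZONE%-*}
```
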
2. Create the network and firewall for this cluster if it doesn’t exist yet.
```shell
NETWORK_NAME_1=${CLUSTER_NAME}-mtu9k-1-${ZONE}
NETWORK_FW_NAME_1=${NETWORK_NAME_1}-fw-1-${ZONE}

# ... (the remaining commands in this step create the network, firewall, and Cloud NAT
# resources, ending with: gcloud compute routers nats create "${NAT_CONFIG}")
```

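Before creating the cluster, you can verify that the network and firewall rule exist. The check below simply describes them with `gcloud`, assuming they were created with the names set at the top of this step.
```shell
# Both commands should succeed and print the resources created above.
gcloud compute networks describe ${NETWORK_NAME_1}
gcloud compute firewall-rules describe ${NETWORK_FW_NAME_1}
```
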
3. Create the GKE cluster with TPU node-pools.
```shell
export CLUSTER_ARGUMENTS="--enable-dataplane-v2 --enable-ip-alias --enable-multi-networking --network=${NETWORK_NAME_1} --subnetwork=${NETWORK_NAME_1}"

export NODE_POOL_ARGUMENTS="--additional-node-network network=${NETWORK_NAME_2},subnetwork=${SUBNET_NAME_2}"

# ... (the cluster itself is then created with: python3 xpk.py cluster create --cluster $CLUSTER_NAME --cluster-cpu-machine-type ...)
```
* You should be able to see your GKE cluster similar to this once it is created successfully: ![image](https://github.com/user-attachments/assets/60743411-5ee5-4391-bb0e-7ffba4d91c1d)

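To inspect the new cluster directly with `kubectl`, you can fetch credentials and list its nodes. These are standard GKE commands; the node count you see depends on your slice topology.
```shell
# Point kubectl at the new cluster and list its nodes, including the TPU node pools.
# If the cluster was created as a regional cluster, use --region=${REGION} instead of --zone.
gcloud container clusters get-credentials ${CLUSTER_NAME} --zone=${ZONE} --project=${PROJECT}
kubectl get nodes
```
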
4. Apply the network-performance daemonset:
```shell
kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/ai-on-gke/9ff340f07f70be0130454f9e7238551587242b75/scripts/network-setup/v6e-network-optimization.yaml
```

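You can confirm the daemonset rolled out before moving on. Its exact name and namespace come from the manifest above, so the simplest check is to list daemonsets across all namespaces and look for the network-optimization entry.
```shell
# The network-optimization daemonset applied above should appear here,
# with one pod scheduled per node.
kubectl get daemonsets --all-namespaces
```
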
5. Test your GKE cluster to make sure it is usable:
```shell
python3 xpk.py workload create \
--cluster ${CLUSTER_NAME} \
--workload hello-world-test \
  ...  # (remaining flags from the original recipe)
```
* You should be able to see results like this: ![image](https://github.com/user-attachments/assets/c33010a6-e109-411e-8fb5-afb4edb3fa72)

6. You can also check your workload status with the following command:
```shell
python3 xpk.py workload list --cluster ${CLUSTER_NAME}
```
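
Once the test has finished, the workload can be cleaned up with the matching `workload delete` subcommand; the flags below mirror the create and list calls above.
```shell
# Remove the finished hello-world test workload from the cluster.
python3 xpk.py workload delete --workload hello-world-test --cluster ${CLUSTER_NAME}
```
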
7. For more information about XPK, please refer to the [XPK repository](https://github.com/google/xpk).

## GKE Cluster Deletion
You can use the following command to delete the GKE cluster:
```shell
export CLUSTER_NAME=v6e-demo  # <your_cluster_name>

python3 xpk.py cluster delete --cluster $CLUSTER_NAME
```
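
Cluster deletion can take several minutes. To confirm the cluster is gone, list the clusters in the project; the deleted cluster should no longer appear.
```shell
# List the remaining GKE clusters in the project.
gcloud container clusters list --project=${PROJECT}
```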