Commit f416009

Add README for shared_pathways_service
Shared Pathways Service is a new feature to accelerate developer iteration by decoupling service creation for Pathways from code development.
1 parent 7fa9b60 commit f416009

File tree

1 file changed: +61 -0 lines changed
  • pathwaysutils/experimental/shared_pathways_service

@@ -0,0 +1,61 @@
# Shared Pathways Service

The Shared Pathways Service accelerates developer iteration by providing a
persistent, multi-tenant TPU environment. This decouples service creation from
the development loop, allowing JAX clients to connect on-demand from a familiar
local environment (like a laptop or cloud VM) to a long-running Pathways
service that manages scheduling and error handling.

## Requirements

Make sure that your GKE cluster is running the Resource Manager and Worker pods.
You can follow the steps
<a href="https://docs.cloud.google.com/ai-hypercomputer/docs/workloads/pathways-on-cloud/troubleshooting-pathways#health_monitoring" target="_blank">here</a>
to confirm the status of these pods. If you haven't started the Pathways pods
yet, you can use [pw-service-example.yaml](yamls/pw-service-example.yaml).
Before deploying, set the following values in the YAML file:

- A unique JobSet name for the cluster's Pathways pods
- GCS bucket path
- TPU type and topology
- Number of slices

These fields are marked in the YAML file with trailing comments.

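As a quick way to confirm that the pods mentioned above are up, the sketch below lists pods with `kubectl get pods -o json` and reports any that are not in the `Running` phase. The helper names and the `default` namespace are assumptions made for illustration, and a `Running` phase does not guarantee that the containers are healthy, so the linked health monitoring steps remain the authoritative check.

```python
import json
import subprocess


def pods_not_running(pod_list):
    """Return names of pods whose phase is not 'Running'.

    `pod_list` is the parsed output of `kubectl get pods -o json`.
    """
    return [
        item["metadata"]["name"]
        for item in pod_list.get("items", [])
        if item.get("status", {}).get("phase") != "Running"
    ]


def fetch_pods(namespace="default"):
    """Fetch the pod list for a namespace (the namespace is an assumption)."""
    result = subprocess.run(
        ["kubectl", "get", "pods", "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


# Usage (requires kubectl configured for your cluster):
#   print(pods_not_running(fetch_pods()))
```

An empty list means every pod in the namespace reports the `Running` phase.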
## Instructions

1. Clone `pathwaysutils`.

    `git clone https://github.com/AI-Hypercomputer/pathways-utils.git`

2. Install `portpicker`.

    `pip install portpicker`

3. Import `isc_pathways` and move your workload under a
    `with isc_pathways.connect()` statement. See
    [run_connect_example.py](run_connect_example.py) for reference. Example code:

    ```
    from pathwaysutils.experimental.shared_pathways_service import isc_pathways

    # Placeholder values: replace them with your own cluster details.
    with isc_pathways.connect(
        cluster="my-cluster",
        project="my-project",
        region="region",
        gcs_bucket="gs://user-bucket",
        pathways_service="pathways-cluster-pathways-head-0-0.pathways-cluster:29001",
        expected_tpu_instances={"tpuv6e:2x2": 2},
    ) as tm:
        import jax.numpy as jnp
        import pathwaysutils
        import pprint

        # The workload runs inside the connect block.
        pathwaysutils.initialize()
        orig_matrix = jnp.zeros(5)
        ...
    ```
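The `expected_tpu_instances` key in the example appears to combine the TPU type and topology (`"tpuv6e:2x2"`), with the value giving the number of instances. The sketch below splits such a mapping into its parts so it can be checked against the values set in the YAML file. This reading of the format is inferred from the example only; the helper names are hypothetical and not part of `isc_pathways`, and the chip count assumes that the topology dimensions multiply out to the chips per instance.

```python
from math import prod


def parse_tpu_instances(expected):
    """Split keys like 'tpuv6e:2x2' into (tpu_type, topology_dims, count).

    The 'type:topology' key format is an assumption based on the README example.
    """
    parsed = []
    for key, count in expected.items():
        tpu_type, sep, topology = key.partition(":")
        if not sep or not tpu_type or not topology:
            raise ValueError(f"expected 'type:topology', got {key!r}")
        dims = tuple(int(d) for d in topology.split("x"))
        parsed.append((tpu_type, dims, count))
    return parsed


def total_chips(expected):
    """Total chips requested, assuming chips per instance = product of dims."""
    return sum(prod(dims) * count for _, dims, count in parse_tpu_instances(expected))


print(parse_tpu_instances({"tpuv6e:2x2": 2}))
```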

The connect block will deploy a proxy pod dedicated to your client and connect
your local runtime environment to the proxy pod via port-forwarding.
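
To illustrate the port-forwarding step, here is a rough sketch of choosing a free local port and building a `kubectl port-forward` command for a pod. It is not the library's implementation: the pod name, remote port, and namespace are hypothetical, and the standard-library `socket` module stands in for `portpicker` so the example has no extra dependencies.

```python
import socket


def pick_free_port():
    """Ask the OS for a currently unused local TCP port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))  # port 0 lets the OS choose
        return sock.getsockname()[1]


def port_forward_command(pod_name, remote_port, local_port=None, namespace="default"):
    """Build a kubectl command forwarding local_port to remote_port on a pod."""
    if local_port is None:
        local_port = pick_free_port()
    return [
        "kubectl",
        "port-forward",
        "-n",
        namespace,
        f"pod/{pod_name}",
        f"{local_port}:{remote_port}",
    ]


# Hypothetical pod name and ports, for illustration only.
print(" ".join(port_forward_command("my-proxy-pod", 29000, local_port=12345)))
```

A client would then reach the pod through `localhost` on the chosen local port for as long as the forwarding process is running.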
