Commit 9edbc0f

update README
1 parent 90f04b7 commit 9edbc0f

1 file changed: +42 -4 lines changed
  • inference/trillium/JetStream-Maxtext/Llama-4-Maverick-17B-128E

inference/trillium/JetStream-Maxtext/Llama-4-Maverick-17B-128E/README.md

Lines changed: 42 additions & 4 deletions
@@ -342,14 +342,52 @@ The recipe uses the helm chart to run the above steps.

The server bring up takes ~20 mins with GCS. You can verify if it is ready by running:
```bash
-HEAD_POD=$(kubectl get pods | grep pathways--pathways-head | awk '{print $1}')
+HEAD_POD=$(kubectl get pods | grep pathways-pathways-head | awk '{print $1}')
kubectl logs -f ${HEAD_POD} -c jetstream

-
-
+WARNING:absl:The transformations API will eventually be replaced by an upgraded design. The current API will not be removed until this point, but it will no longer be actively worked on.
+
+Memstats: After load_params:
+Memstats unavailable, error: INVALID_ARGUMENT: MemoryStats is only supported for addressable PjRt devices.
+
+RAMstats: After load_params:
+Using (GB) 15.49 / 708.23 (2.187143%) --> Available:686.77
+2025-04-28 22:21:36,353 - jetstream.core.server_lib - INFO - Loaded all weights.
+GC tweaked (allocs, gen1, gen2): 60000 20 30
+2025-04-28 22:22:13,177 - jetstream.core.server_lib - INFO - Starting server on port 9000 with 256 threads
+2025-04-28 22:22:14,545 - jetstream.core.server_lib - INFO - Not starting JAX profiler server: False
+INFO: Started server process [1]
+INFO: Waiting for application startup.
+INFO: Application startup complete.
+INFO: Uvicorn running on http://0.0.0.0:9999 (Press CTRL+C to quit)
```

-4. Stop the server and clean up the resources after completion by following the steps in the [Cleanup](#cleanup) section.
+5. Port forward and connect to model server. Replace the `UUID` with your pod `UUID`
+```
+kubectl port-forward pod/pathways-pathways-head-0-0-UUID 8000:8000
+```
+
+6. Make a sample request on a new terminal
+```bash
+curl --request POST \
+--header "Content-type: application/json" \
+-s \
+localhost:8000/generate \
+--data \
+'{
+"prompt": "What are the top 5 programming languages",
+"max_tokens": 200
+}'
+```
+
+You should see a response like this
+```bash
+{
+"response": " that are most widely used and in demand in the industry?\n\n1. **Identify the context**: The question is asking about the most popular programming languages, which implies a need to consider current industry trends and usage statistics.\n2. **Consider the criteria**: To determine the most popular languages, we need to look at factors such as the number of developers using each language, the number of projects and applications built with each language, and the demand for each language in the job market.\n3. **Evaluate the options**: Based on various sources, including industry reports and developer surveys, we can evaluate the popularity of different programming languages.\n4. **Rank the languages**: By analyzing the data and considering the criteria, we can rank the programming languages in order of their popularity.\n5. **Identify the top 5**: Based on the ranking, we can identify the top 5 most popular programming languages.\n\nThe top 5 most popular programming languages are: \n1. JavaScript\n2. Python\n3. Java\n"
+}
+```
+
+7. Stop the server and clean up the resources after completion by following the steps in the [Cleanup](#cleanup) section.


### Cleanup
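
Editorial sketch (not part of the commit): the readiness check in the diff above asks the reader to watch the `jetstream` container logs by hand. A small helper could poll for readiness instead. The function names `is_server_ready` and `head_pod_from_listing` are hypothetical, and the sketch assumes that the `Application startup complete` line from the sample log output is a reliable readiness marker, which the README does not state explicitly.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; names and readiness marker are assumptions,
# not part of the recipe.

# Exit 0 when the log text on stdin contains the line that the sample
# output shows after the HTTP server has finished starting.
is_server_ready() {
  grep -q "Application startup complete"
}

# Print the head pod name from `kubectl get pods` output on stdin,
# using the same grep/awk pipeline as the README. If several pods
# match, every matching name is printed, one per line.
head_pod_from_listing() {
  grep pathways-pathways-head | awk '{print $1}'
}

# Usage against a live cluster (not executed here):
#   HEAD_POD=$(kubectl get pods | head_pod_from_listing)
#   until kubectl logs "${HEAD_POD}" -c jetstream | is_server_ready; do
#     sleep 30
#   done
```

Reading the log and pod listing from stdin keeps both helpers independent of a cluster, so they can be checked against saved output before being pointed at `kubectl`.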
