inference/trillium/JetStream-Maxtext/Llama-4-Maverick-17B-128E/README.md
+42 -4 (42 additions & 4 deletions)
@@ -342,14 +342,52 @@ The recipe uses the helm chart to run the above steps.
 
 The server bring up takes ~20 mins with GCS. You can verify if it is ready by running:
 ```bash
-HEAD_POD=$(kubectl get pods | grep pathways--pathways-head | awk '{print $1}')
+HEAD_POD=$(kubectl get pods | grep pathways-pathways-head | awk '{print $1}')
 kubectl logs -f ${HEAD_POD} -c jetstream
 
-
-
+WARNING:absl:The transformations API will eventually be replaced by an upgraded design. The current API will not be removed until this point, but it will no longer be actively worked on.
+
+Memstats: After load_params:
+Memstats unavailable, error: INVALID_ARGUMENT: MemoryStats is only supported for addressable PjRt devices.
+
+RAMstats: After load_params:
+Using (GB) 15.49 / 708.23 (2.187143%) --> Available:686.77
+2025-04-28 22:21:36,353 - jetstream.core.server_lib - INFO - Loaded all weights.
+GC tweaked (allocs, gen1, gen2): 60000 20 30
+2025-04-28 22:22:13,177 - jetstream.core.server_lib - INFO - Starting server on port 9000 with 256 threads
+2025-04-28 22:22:14,545 - jetstream.core.server_lib - INFO - Not starting JAX profiler server: False
+INFO: Started server process [1]
+INFO: Waiting for application startup.
+INFO: Application startup complete.
+INFO: Uvicorn running on http://0.0.0.0:9999 (Press CTRL+C to quit)
 ```
 
-4. Stop the server and clean up the resources after completion by following the steps in the [Cleanup](#cleanup) section.
+5. Port forward and connect to model server. Replace the `UUID` with your pod `UUID`
+"prompt": "What are the top 5 programming languages",
+"max_tokens": 200
+}'
+```
+
+You should see a response like this
+```bash
+{
+"response": " that are most widely used and in demand in the industry?\n\n1. **Identify the context**: The question is asking about the most popular programming languages, which implies a need to consider current industry trends and usage statistics.\n2. **Consider the criteria**: To determine the most popular languages, we need to look at factors such as the number of developers using each language, the number of projects and applications built with each language, and the demand for each language in the job market.\n3. **Evaluate the options**: Based on various sources, including industry reports and developer surveys, we can evaluate the popularity of different programming languages.\n4. **Rank the languages**: By analyzing the data and considering the criteria, we can rank the programming languages in order of their popularity.\n5. **Identify the top 5**: Based on the ranking, we can identify the top 5 most popular programming languages.\n\nThe top 5 most popular programming languages are: \n1. JavaScript\n2. Python\n3. Java\n"
+}
+```
+
+7. Stop the server and clean up the resources after completion by following the steps in the [Cleanup](#cleanup) section.
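
The readiness check in this change relies on watching `kubectl logs -f` by eye. One way to script it is to pipe the log text through a small filter that succeeds only once the final startup line appears. This is a minimal sketch, assuming the `Application startup complete.` line in the sample log above is a reliable readiness marker. The function name `is_server_ready` is hypothetical and not part of the recipe.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the recipe): reads log text on stdin and
# exits 0 only if the startup-complete marker from the sample log is present.
is_server_ready() {
  # -q: no output, exit status only; -F: match the marker as a fixed string.
  grep -qF "Application startup complete."
}

# Example usage against a live pod (assumes HEAD_POD is set as in the README):
#   kubectl logs "${HEAD_POD}" -c jetstream | is_server_ready && echo "ready"
```

Matching on a fixed string avoids having to escape the trailing period, and the function takes its input on stdin so it can be checked against saved log files as well as live pods.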