
Commit cfd9e3f

Correct inference provider config for K8s deployment in 2025-01-27-intro-to-llama-stack-with-vllm.md
1 parent 01664a2 commit cfd9e3f

File tree: 1 file changed (+3, −3 lines)

_posts/2025-01-27-intro-to-llama-stack-with-vllm.md

````diff
@@ -309,9 +309,9 @@ providers:
   - provider_id: vllm
     provider_type: remote::vllm
     config:
-      url: ${env.VLLM_URL}
-      max_tokens: ${env.VLLM_MAX_TOKENS:4096}
-      api_token: ${env.VLLM_API_TOKEN:fake}
+      url: http://vllm-server.default.svc.cluster.local:8000/v1
+      max_tokens: 4096
+      api_token: fake
 ```
 
 Once we have defined the run configuration for Llama Stack, we can build an image with that configuration and the server source code:
````
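The corrected `url` is a Kubernetes cluster-local DNS name of the form `<service>.<namespace>.svc.cluster.local`, so it only resolves if a Service named `vllm-server` exists in the `default` namespace and forwards port 8000 to the vLLM pods. As a rough sketch of what that Service could look like (the selector label `app: vllm` and the port numbers are assumptions for illustration, not taken from the post):

```yaml
# Hypothetical Service manifest matching the URL in the config above.
# Assumes vLLM pods in the "default" namespace carry the label app: vllm
# and serve their OpenAI-compatible API on container port 8000.
apiVersion: v1
kind: Service
metadata:
  name: vllm-server        # yields vllm-server.default.svc.cluster.local
  namespace: default
spec:
  selector:
    app: vllm              # must match the vLLM Deployment's pod labels
  ports:
  - protocol: TCP
    port: 8000             # port the URL above targets
    targetPort: 8000       # port the vLLM container listens on
```

Hard-coding the URL this way trades the flexibility of the `${env.VLLM_URL}` substitution for a configuration that works inside the cluster without any environment variables set.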
