@@ -128,7 +128,83 @@ sky status --endpoint 30000 sglang
128128
129129</details >
130130
131- ## Method 7: Run on AWS SageMaker
131+ ## Method 7: Using dstack
132+
133+ <details >
134+ <summary >More</summary >
135+
136+ [dstack](https://github.com/dstackai/dstack) simplifies GPU provisioning and workload orchestration across clouds, Kubernetes, and on-prem systems.
137+
138+ Deploying SGLang as a secure, auto-scalable endpoint is straightforward:
139+
140+ 1. Install dstack: see [dstack's documentation](https://dstack.ai/docs/installation/)
141+ 2. Create a dstack [service](https://dstack.ai/docs/concepts/services/):
142+
143+ <details >
144+ <summary >Service configuration: <code >service.yaml</code ></summary >
145+
146+ ```yaml
147+ type: service
148+ name: qwen
149+
150+ image: lmsysorg/sglang:latest
151+ env:
152+   - MODEL_ID=qwen/qwen2.5-0.5b-instruct
153+ commands:
154+   - |
155+     python3 -m sglang.launch_server \
156+       --model-path $MODEL_ID \
157+       --port 8000 \
158+       --trust-remote-code
159+ port: 8000
160+ model: qwen/qwen2.5-0.5b-instruct
161+
162+ resources:
163+   gpu: 8GB..24GB:1
164+ ```
165+ </details>
166+
167+ Apply the configuration:
168+
169+ ```bash
170+ HF_TOKEN=<secret> dstack apply -f service.yaml
171+ ```
172+
173+ 3. To enable auto-scaling, cache-aware routing, or HTTPS, or to use your own custom domain,
174+ create a [gateway](https://dstack.ai/docs/concepts/gateways/):
175+
176+ <details >
177+ <summary >Gateway configuration: <code >gateway.yaml</code ></summary >
178+
179+ ```yaml
180+ type: gateway
181+ name: sglang-gateway
182+
183+ backend: aws
184+ region: eu-west-1
185+
186+ # Specify your domain
187+ domain: example.com
188+
189+ router:
190+   # (Optional) Enable cache-aware routing
191+   type: sglang
192+   policy: cache_aware
193+ ```
194+ </details>
195+
196+ Apply the gateway configuration:
197+
198+ ```bash
199+ dstack apply -f gateway.yaml
200+ ```
201+
202+ Once the gateway is assigned a hostname, go to your domain's DNS settings and add a wildcard DNS record for `*.<gateway domain>` (for example, `*.example.com`) that points to that hostname.
203+
204+ See the [SGLang example](https://dstack.ai/examples/inference/sglang/) for more details.
205+ </details >
206+
207+ ## Method 8: Run on AWS SageMaker
132208
133209<details >
134210<summary >More</summary >