Skip to content

Commit b2f2ce3

Browse files
Bihan  RanaBihan  Rana
authored andcommitted
Add dstack install method in docs
Updated the dstack section
1 parent ecefc79 commit b2f2ce3

File tree

1 file changed

+77
-1
lines changed

1 file changed

+77
-1
lines changed

docs/get_started/install.md

Lines changed: 77 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -128,7 +128,83 @@ sky status --endpoint 30000 sglang
128128

129129
</details>
130130

131-
## Method 7: Run on AWS SageMaker
131+
## Method 7: Using dstack
132+
133+
<details>
134+
<summary>More</summary>
135+
136+
[dstack](https://github.com/dstackai/dstack) simplifies GPU provisioning and workload orchestration across clouds, Kubernetes, and on-prem systems.
137+
138+
Deploying SGLang as a secure, auto-scalable endpoint is straightforward:
139+
140+
1. Install dstack: see [dstack's documentation](https://dstack.ai/docs/installation/)
141+
2. Create a dstack [service](https://dstack.ai/docs/concepts/services/):
142+
143+
<details>
144+
<summary>Service configuration: <code>service.yaml</code></summary>
145+
146+
```yaml
147+
type: service
148+
name: qwen
149+
150+
image: lmsysorg/sglang:latest
151+
env:
152+
- MODEL_ID=qwen/qwen2.5-0.5b-instruct
153+
commands:
154+
- |
155+
python3 -m sglang.launch_server \
156+
--model-path $MODEL_ID
157+
--port 8000 \
158+
--trust-remote-code
159+
port: 8000
160+
model: qwen/qwen2.5-0.5b-instruct
161+
162+
resources:
163+
gpu: 8GB..24GB:1
164+
```
165+
</details>
166+
167+
Apply the configuration:
168+
169+
```bash
170+
HF_TOKEN=<secret> dstack apply -f service.yaml
171+
```
172+
173+
3. If you want to enable auto-scaling, cache-aware routing, HTTPS, or bring your own custom domain,
174+
create a [gateway](https://dstack.ai/docs/concepts/gateways/):
175+
176+
<details>
177+
<summary>Gateway configuration: <code>gateway.yaml</code></summary>
178+
179+
```yaml
180+
type: gateway
181+
name: sglang-gateway
182+
183+
backend: aws
184+
region: eu-west-1
185+
186+
# Specify your domain
187+
domain: example.com
188+
189+
router:
190+
# (Optional) Enable cache-aware routing
191+
type: sglang
192+
policy: cache_aware
193+
```
194+
</details>
195+
196+
Apply the gateway configuration.
197+
198+
```bash
199+
dstack apply -f gateway.yaml
200+
```
201+
202+
Once the gateway is assigned a hostname, go to your domain's DNS settings and add a DNS record for `*.<gateway domain>`.
203+
204+
See the [SGLang example](https://dstack.ai/examples/inference/sglang/) for more details.
205+
</details>
206+
207+
## Method 8: Run on AWS SageMaker
132208

133209
<details>
134210
<summary>More</summary>

0 commit comments

Comments
 (0)