---
title: "Orchestrating workloads on NVIDIA DGX Spark"
date: 2025-11-14
description: "TBA"
slug: nvidia-dgx-spark
image: https://dstack.ai/static-assets/static-assets/images/nvidia-dgx-spark.png
# categories:
#   - Benchmarks
---

# Orchestrating workloads on NVIDIA DGX Spark

With support from [Graphsignal :material-arrow-top-right-thin:{ .external }](https://x.com/GraphsignalAI/status/1986565583593197885){:target="_blank" }, our team gained access to the new [NVIDIA DGX Spark :material-arrow-top-right-thin:{ .external }](https://www.nvidia.com/en-us/products/workstations/dgx-spark/){:target="_blank"} and used it to validate how `dstack` operates on this hardware. This post walks through how to set it up with `dstack` and use it alongside existing on-prem clusters or GPU cloud environments to run workloads.

<img src="https://dstack.ai/static-assets/static-assets/images/nvidia-dgx-spark.png" width="630"/>

<!-- more -->

If DGX Spark is new to you, here is a quick breakdown of the key specs:

* Built on the NVIDIA GB10 Grace Blackwell Superchip with Arm CPUs.
* Capable of up to 1 petaflop of AI compute at FP4 precision, roughly comparable to RTX 5070 performance.
* Features 128GB of unified CPU and GPU memory enabled by the Grace Blackwell architecture.
* Ships with NVIDIA DGX OS (a tuned Ubuntu build) and NVIDIA Container Toolkit.

These characteristics make DGX Spark a fitting extension for local development and smaller-scale model training or inference, including workloads up to the GPT-OSS 120B range.
## Creating an SSH fleet

Because DGX Spark supports SSH and containers, integrating it with `dstack` is straightforward. Start by configuring an [SSH fleet](../../docs/concepts/fleets.md#ssh-fleets). The file needs the hosts and access credentials.

<div editor-title="fleet.dstack.yml">

```yaml
type: fleet
name: spark

ssh_config:
  user: devops
  identity_file: ~/.ssh/id_rsa
  hosts:
    - spark-e3a4
```

</div>

The `user` must have `sudo` privileges.
Apply the configuration:

<div class="termy">

```shell
$ dstack apply -f fleet.dstack.yml

Provisioning...
---> 100%

 FLEET  INSTANCE  GPU     PRICE  STATUS  CREATED
 spark  0         GB10:1  $0     idle    3 mins ago
```

</div>

Once active, the system detects hardware and marks the instance as `idle`. From here, you can run
[dev environments](../../docs/concepts/dev-environments.md), [tasks](../../docs/concepts/tasks.md),
and [services](../../docs/concepts/services.md) on the DGX Spark fleet, the same way you would with other on-prem or cloud GPU backends.
## Running a dev environment

Example configuration:

<div editor-title=".dstack.yml">

```yaml
type: dev-environment
name: cursor

image: lmsysorg/sglang:spark

ide: cursor

resources:
  gpu: GB10

volumes:
  - /root/.cache/huggingface:/root/.cache/huggingface

fleets: [spark]
```

</div>

We use an [instance volume](../../docs/concepts/volumes.md#instance-volumes) to keep model downloads cached across runs. The `lmsysorg/sglang:spark` image is tuned for inference on DGX Spark. Any Arm-compatible image with proper driver support will work if customization is needed.
Run the environment:

<div class="termy">

```shell
$ dstack apply -f .dstack.yml

 BACKEND       GPU     INSTANCE TYPE  PRICE
 ssh (remote)  GB10:1  instance       $0 idle

Submit the run cursor? [y/n]: y

 #  NAME    BACKEND       GPU     PRICE  STATUS   SUBMITTED
 1  cursor  ssh (remote)  GB10:1  $0     running  12:24

Launching `cursor`...
---> 100%

To open in VS Code Desktop, use this link:
  vscode://vscode-remote/ssh-remote+cursor/workflow
```

</div>

Workloads behave exactly like they do on other supported compute targets. You can use DGX Spark for fine-tuning, interactive development, or model serving without changing workflows.
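As one more illustration, the same fleet could back a model-serving endpoint via a `dstack` service. This is a hedged sketch rather than a configuration from this post: the model name and SGLang launch flags are assumptions based on common usage.

```yaml
type: service
name: serve-gpt-oss

# The SGLang image tuned for DGX Spark, as used in the dev environment above
image: lmsysorg/sglang:spark

commands:
  # Assumed launch command and model; adjust to your actual model path
  - python -m sglang.launch_server --model-path openai/gpt-oss-20b --port 8000

port: 8000

resources:
  gpu: GB10

volumes:
  - /root/.cache/huggingface:/root/.cache/huggingface

fleets: [spark]
```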
!!! info "Acknowledgement"
    Thanks to the [Graphsignal :material-arrow-top-right-thin:{ .external }](https://graphsignal.com/){:target="_blank"} team for access to DGX Spark and for supporting testing and validation. Graphsignal provides inference observability tooling used to profile CUDA workloads during both training and inference.

## What's next?

1. Read the [NVIDIA DGX Spark in-depth review :material-arrow-top-right-thin:{ .external }](https://lmsys.org/blog/2025-10-13-nvidia-dgx-spark/){:target="_blank"} by the SGLang team.
2. Check [dev environments](../../docs/concepts/dev-environments.md),
   [tasks](../../docs/concepts/tasks.md), [services](../../docs/concepts/services.md),
   and [fleets](../../docs/concepts/fleets.md).
3. Follow the [Quickstart](../../docs/quickstart.md).
4. Join [Discord :material-arrow-top-right-thin:{ .external }](https://discord.gg/u8SmfwPpMd){:target="_blank"}.
