articles/dev-box/concept-serverless-gpu.md
26 additions, 1 deletion

@@ -65,7 +65,26 @@ Serverless GPU compute in Dev Box uses Azure Container Apps (ACA) to provide GPU
 The following GPU options are currently supported:
 
-- NVIDIA T4 GPUs
+- **NVIDIA T4 GPUs**: Readily available with minimal quota concerns
+- **NVIDIA A100 GPUs**: More powerful but available in limited capacity
+
+### Regional availability
+
+Currently, GPU resources are available in the following Azure regions:
+
+- West US 3
+- Sweden North
+- Australia East
+
+Additional regions may be supported in the future based on demand.
+
+### vNet injection
+
+vNet injection allows customers to integrate their network and security protocols with the serverless GPU environment. While not required for the proof of concept (POC), this feature will be prioritized for public preview and general availability (GA). With vNet injection, customers can achieve tighter control over network and security configurations.
+
+### MOBO architecture model
+
+Serverless GPU compute adopts the MOBO architecture model for ACA integration. In this model, ACA instances are created and managed within the customer's subscription, providing a more controlled and streamlined management experience. This ensures that the Dev Box service can securely manage ACA sessions without introducing additional complexity.
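
To make the vNet injection model above concrete, here is a rough sketch of what the configuration looks like at the Azure Resource Manager level when an ACA managed environment is placed into a customer-owned subnet. This is an illustrative example only, not the payload Dev Box itself uses: the resource names, subnet ID, region, and `api-version` are placeholders, and the call goes through the generic ARM REST API rather than any Dev Box-specific endpoint.

```python
# Illustrative only: creating an ACA managed environment with vNet injection via the
# ARM REST API. All names, the subnet ID, and the api-version are placeholders; the
# payload Dev Box uses internally isn't documented here.
import requests
from azure.identity import DefaultAzureCredential

subscription_id = "<subscription-id>"
resource_group = "<resource-group>"
environment_name = "<aca-environment-name>"
subnet_id = (
    f"/subscriptions/{subscription_id}/resourceGroups/<network-rg>"
    "/providers/Microsoft.Network/virtualNetworks/<vnet>/subnets/<infrastructure-subnet>"
)

# Acquire an ARM token for the caller's identity (CLI login, managed identity, and so on).
token = DefaultAzureCredential().get_token("https://management.azure.com/.default").token

url = (
    f"https://management.azure.com/subscriptions/{subscription_id}"
    f"/resourceGroups/{resource_group}/providers/Microsoft.App"
    f"/managedEnvironments/{environment_name}?api-version=2024-03-01"
)

body = {
    "location": "westus3",  # one of the regions listed above
    "properties": {
        # vNet injection: run the environment's infrastructure inside the customer's subnet
        # so existing network and security controls (NSGs, routes, firewalls) apply.
        "vnetConfiguration": {
            "infrastructureSubnetId": subnet_id,
            "internal": True,  # no public ingress; traffic stays on the private network
        }
    },
}

response = requests.put(url, json=body, headers={"Authorization": f"Bearer {token}"}, timeout=60)
response.raise_for_status()
print(response.json()["properties"].get("provisioningState"))
```

Setting `internal` to `true` keeps the environment off the public internet, which is the kind of tighter network and security control the section refers to; whether Dev Box will expose this exact setting is not stated in the article.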

@@ … @@
 - **Visual Studio**: Access GPU compute from within the Visual Studio environment
 - **VS Code with AI Toolkit**: Use seamless GPU integration for AI development tasks
+
+The goal is to provide a seamless, native experience where GPU resources are accessible without requiring any setup from the developer.
 
 ## Administration and management
 
 Administrators control serverless GPU access at the project level through Dev Center. Key management capabilities include:
@@ -83,6 +104,10 @@ Administrators control serverless GPU access at the project level through Dev Ce
 - **Set concurrent GPU limits**: Specify the maximum number of GPUs that can be used simultaneously across a project
 - **Cost controls**: Manage GPU usage within subscription quotas
 
+Access to serverless GPU resources is managed through project-level properties. When the serverless GPU feature is enabled for a project, all Dev Boxes within that project automatically gain access to GPU compute. This simplifies the access model by removing the need for custom roles or pool-based configurations.
+
+Future iterations of the project policy infrastructure will provide even more granular control over GPU access and usage.
+
 ## Related content
 
 - [Get started with serverless GPU in Dev Box (link to be added)]
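
The project-level access model added above can be pictured as a small policy that each GPU session request is checked against: a single per-project switch plus a concurrent-GPU ceiling. The sketch below is hypothetical; the article does not publish the real Dev Center property schema, so `enable_serverless_gpu`, `max_concurrent_gpus`, and `allowed_gpu_skus` are illustrative stand-ins.

```python
# Hypothetical model of the project-level GPU properties described above.
# Field names are illustrative, not the actual Dev Center schema.
from dataclasses import dataclass, field


@dataclass
class ProjectGpuPolicy:
    enable_serverless_gpu: bool = False   # enabling this gives every Dev Box in the project GPU access
    max_concurrent_gpus: int = 0          # maximum GPU sessions running at once across the project
    allowed_gpu_skus: list = field(default_factory=lambda: ["T4"])  # e.g. ["T4", "A100"]

    def can_start_session(self, active_sessions: int, requested_sku: str) -> bool:
        """Admission check a management service might run before starting a new GPU session."""
        return (
            self.enable_serverless_gpu
            and requested_sku in self.allowed_gpu_skus
            and active_sessions < self.max_concurrent_gpus
        )


# A project that allows up to four concurrent T4 sessions:
policy = ProjectGpuPolicy(enable_serverless_gpu=True, max_concurrent_gpus=4)
print(policy.can_start_session(active_sessions=3, requested_sku="T4"))    # True: under the limit
print(policy.can_start_session(active_sessions=4, requested_sku="T4"))    # False: limit reached
print(policy.can_start_session(active_sessions=0, requested_sku="A100"))  # False: SKU not allowed
```

Keeping the switch at the project level, rather than per user or per pool, is what removes the need for custom roles or pool-based configuration; per-user limits surface later as an open question in the planning notes removed below.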

articles/dev-box/source-serverless-gpu.md
0 additions, 58 deletions

@@ -167,61 +167,3 @@ Instant Access to GPU Compute: Dev Box allows developers to get up and running w
 Centralized Control for Admins: Dev Box integrates seamlessly with Dev Center's project policies, giving administrators granular control over serverless GPU access. Admins can define consumption limits, enable or disable GPU access on a per-project basis, and set permissions for users, all within the familiar Dev Center infrastructure.
 
 Secure Private Network Integration: Dev Box runs within a private, enterprise-managed network. This ensures that sensitive corporate data used for AI workloads—such as proprietary models, internal datasets, or compliance-bound information—remains isolated and secure at the network layer. This added layer of security is crucial for enterprises handling regulated or confidential data.
-
-POC Plan
-
-Stage 1 – ETA 1-2 weeks – Eng: Nick Depinet
-
-Develop a shell (Windows Terminal extension) that communicates with ACA and can be launched from within Dev Box.
-
-AI Toolkit Integration
-
-Checkpoint: Begin collecting internal developer feedback on shell functionality and integration.
-
-Stage 2 – ETA 2-3 weeks – Eng: Sneha
-
-Implement the Agent Management Service (AMS) to handle authentication, session management, and related tasks.
-
-Stage 3 – ETA 3-4 weeks
-
-Introduce admin controls
-
-HOBO provisioning
-
-Begin planning for vNet injection support as a future enhancement.
-
-Stage 4 – ETA 4-5 weeks
-
-Finalize portal experience integration, enabling a seamless user interface for Dev Box users to manage GPU compute access.
-
-Open questions
-
-What is the data persistency story?
-
-What is the user experience around handling GPU limits per user?
-
-How do we think about GPU pooling?
-
-Where does the session pool live in the Dev Center infrastructure?
-
-Rude FAQ
-
-Experience related
-
-Why is the GPU accessible only as an external process? Why can't I use the GPU to accelerate my Dev Box graphics?
-
-Why do I have to request GPU quota separately? Why can't you auto-grant GPU quota to match the size of my Dev Box pool?
-
-As an IT admin for an enterprise customer, why should I procure serverless GPU through Dev Box instead of directly procuring ACA serverless GPU?
-
-Current limitations / Roadmap related
-
-Why can I only access GPUs via the shell? Why isn't there a GUI?
-
-Why aren't you giving me the latest generation GPUs? I really need H100s.
-
-I need multiple GPUs attached to a single Dev Box. Why are you making me create multiple shells that get one GPU each instead of giving me N GPUs in a single shell?
-
-I want to run Windows-only software such as GameMaker on serverless GPUs. Why am I limited to Linux only?
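
The removed notes above describe a shell that drops developers into a session backed by an ACA GPU container. As a generic illustration (not part of the original plan), a developer could confirm which GPU such a session actually received with a quick check like the one below; it assumes `nvidia-smi` is on the PATH and PyTorch is installed, which depends entirely on the container image behind the session.

```python
# Generic GPU sanity check for a Linux container session; assumes nvidia-smi and
# (optionally) PyTorch are present in the image, which isn't guaranteed.
import shutil
import subprocess

if shutil.which("nvidia-smi"):
    # Ask the NVIDIA driver which GPU the session received (for example, a T4 or A100).
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    )
    print(result.stdout.strip())
else:
    print("nvidia-smi not found in this session")

try:
    import torch
    print("CUDA available to PyTorch:", torch.cuda.is_available())
except ImportError:
    print("PyTorch is not installed in this session")
```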