articles/container-apps/gpu-serverless-overview.md
Serverless GPUs are only supported for Consumption workload profiles.
## Benefits
Serverless GPUs accelerate AI development by allowing you to focus on your core AI code and less on managing infrastructure when using GPUs. This feature provides a middle layer option between the [Azure AI model catalog](/azure/machine-learning/concept-model-catalog)'s serverless APIs and hosting models on managed compute.
The Container Apps serverless GPU support provides full data governance as your data never leaves the boundaries of your container while still providing a managed, serverless platform from which to build your applications.
When you use serverless GPUs in Container Apps, your apps get:
- **Scale-to-zero GPUs**: Support for serverless scaling of NVIDIA A100 and T4 GPUs.
- **Per-second billing**: Fine-grained cost calculations, billed down to the second.
- **Built-in data governance**: Your data never leaves the container boundary.
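To make the per-second billing model concrete, a cost estimate is just active GPU-seconds multiplied by a per-second rate. The sketch below is only an illustration: the hourly rate is a hypothetical placeholder, not published Azure pricing.

```python
# Hedged sketch of per-second GPU billing: cost = active seconds x rate.
# HOURLY_RATE is a hypothetical placeholder, not actual Azure pricing.
HOURLY_RATE = 3.60                     # assumed $/hour for one GPU replica
RATE_PER_SECOND = HOURLY_RATE / 3600   # per-second billing granularity

def estimate_cost(active_seconds: float, replicas: int = 1) -> float:
    """Cost for GPU compute actually used; scaled-to-zero idle time costs nothing."""
    return round(active_seconds * replicas * RATE_PER_SECOND, 4)

# 15 minutes of one active replica; idle (scale-to-zero) time is free.
print(estimate_cost(15 * 60))  # 0.9
```

Because billing is per second and apps scale to zero, an app that is idle most of the day only accrues cost for the seconds a replica is actually running.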
The following scenarios, while not comprehensive, describe common use cases for serverless GPUs.
- **Real-time and batch inferencing**: Using custom open-source models with fast startup times, automatic scaling, and a per-second billing model. Serverless GPUs are ideal for dynamic applications. You pay only for the compute you use, and your apps automatically scale in and out to meet demand.
- **Machine learning scenarios**: Significantly speed up applications that implement fine-tuned custom generative AI models, deep learning, neural networks, or large-scale data analysis.
- **High-Performance Computing (HPC)**: Applications that require complex calculations and simulations, such as scientific computing, financial modeling, or weather forecasting, use GPUs as resources for high computational demands.
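The real-time inferencing scenario above amounts to a container that exposes an HTTP endpoint and lets Container Apps add or remove replicas with traffic. The following is a minimal sketch using only the Python standard library; `predict` is a hypothetical stand-in for a real open-source model, and the port is an assumption (Container Apps routes traffic to whatever port your container listens on).

```python
# Hedged sketch of a containerized inference endpoint. predict() is a
# hypothetical placeholder for real GPU model inference (e.g., a fine-tuned
# open-source model); Container Apps scales replicas (including to zero)
# based on incoming HTTP traffic.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def predict(text: str) -> dict:
    # Placeholder logic standing in for real model inference on the GPU.
    return {"input": text, "label": "positive" if "good" in text else "neutral"}

class InferenceHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        body = json.dumps(predict(payload.get("text", ""))).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

def serve(port: int = 8080) -> None:
    # Listen on all interfaces so the platform's ingress can reach the app.
    HTTPServer(("0.0.0.0", port), InferenceHandler).serve_forever()
```

A batch variant of the same container could instead drain a queue and exit, letting the platform scale it back to zero between runs.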