Commit 3681fae

Update gpu-serverless-overview.md
1 parent ef87a99 commit 3681fae

File tree

1 file changed (+5, -8 lines)


articles/container-apps/gpu-serverless-overview.md

Lines changed: 5 additions & 8 deletions
@@ -20,15 +20,15 @@ Serverless GPUs are only supported for Consumption workload profiles. The featur

 ## Benefits

-Serverless GPUs accelerate AI development by allowing you to focus on your core AI code and less on managing infrastructure when using GPUs. This feature provides a middle layer option between the Azure AI model catalog's serverless APIs and hosting models on managed compute.
+Serverless GPUs accelerate AI development by allowing you to focus on your core AI code and less on managing infrastructure when using GPUs. This feature provides a middle layer option between the [Azure AI model catalog](/azure/machine-learning/concept-model-catalog)'s serverless APIs and hosting models on managed compute.

 The Container Apps serverless GPU support provides full data governance as your data never leaves the boundaries of your container while still providing a managed, serverless platform from which to build your applications.

 When you use serverless GPUs in Container Apps, your apps get:

-- **Scale-to zero GPUs**: Support for automatic serverless scaling of NVIDIA A100 and T4 GPUs.
+- **Scale-to zero GPUs**: Support for serverless scaling of NVIDIA A100 and T4 GPUs.

-- **Per-second billing**: Pay only for the GPU compute you use.
+- **Per-second billing**: Fine-grained cost calculations reduced down to the second.

 - **Built-in data governance**: Your data never leaves the container boundary.

@@ -40,12 +40,9 @@ When you use serverless GPUs in Container Apps, your apps get:

 The following scenarios, while not comprehensive, describe common use cases for serverless GPUs.

-- **Artificial Intelligence**: Applications that involve deep learning, neural networks, or large-scale data analysis can use GPUs to significantly accelerate computation speed. Including applications that support:
+- **Real-time and batch inferencing**: Using custom open-source models with fast startup times, automatic scaling, and a per-second billing model. Serverless GPUs are ideal for dynamic applications. You pay only for the compute you use, and your apps automatically scale in and out to meet demand.

-    - Real-time custom model inferencing with open-source models
-    - Batch-inferencing for periodic workloads with automatic scale-to-zero
-    - Custom fine-tuning of generative AI models
-    - Machine learning scenarios
+- **Machine learning scenarios**: Significantly speed up applications that implement fine-tuned custom generative AI models, deep learning, neural networks, or large-scale data analysis.

 - **High-Performance Computing (HPC)**: Applications that require complex calculations and simulations, such as scientific computing, financial modeling, or weather forecasting use GPUs as resources for high computational demands.