
Commit 017cbb8

Merge branch 'main' of https://github.com/MicrosoftDocs/azure-docs-pr into 376105
2 parents: fbdce83 + 3275ceb

49 files changed: +586 additions, -276 deletions


articles/api-center/synchronize-aws-gateway-apis.md

Lines changed: 4 additions & 1 deletion
@@ -6,7 +6,10 @@ ms.service: azure-api-center
 ms.topic: how-to
 ms.date: 02/10/2025
 ms.author: danlep
-ms.custom: devx-track-azurecli
+ms.custom:
+- devx-track-azurecli
+- migration
+- aws-to-azure
 # Customer intent: As an API program manager, I want to integrate my Azure API Management instance with my API center and synchronize API Management APIs to my inventory.
 ---

articles/azure-cache-for-redis/cache-how-to-premium-persistence.md

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ ms.date: 02/21/2025

 > [!IMPORTANT]
 >
-> The data persistence functionality provides resilience for unexpected Redis node failures. Data persistence isn't a data backup or point in time recovery (PITR) feature. If corrupted data is written to the Redis instance, th corrupted data is also persisted. To make backups of your Redis instance, use the [export feature](cache-how-to-import-export-data.md).
+> The data persistence functionality provides resilience for unexpected Redis node failures. Data persistence isn't a data backup or point in time recovery (PITR) feature. If corrupted data is written to the Redis instance, the corrupted data is also persisted. To make backups of your Redis instance, use the [export feature](cache-how-to-import-export-data.md).
 >

 > [!WARNING]
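The export feature referenced in this fix corresponds to the `az redis export` Azure CLI command. A minimal sketch of taking a backup; the resource names and the blob container SAS URL are placeholders, not values from this commit:

```azurecli
# Export the cache contents to a storage container as RDB files.
# All names and the SAS URL below are placeholders.
az redis export \
    --name myRedisCache \
    --resource-group myResourceGroup \
    --prefix backup \
    --container "https://mystorage.blob.core.windows.net/redis-backups?<sas-token>" \
    --file-format RDB
```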

articles/container-apps/TOC.yml

Lines changed: 2 additions & 0 deletions
@@ -276,6 +276,8 @@
 items:
 - name: Serverless GPUs
   href: gpu-serverless-overview.md
+- name: GPU types
+  href: gpu-types.md
 - name: Tutorials
   items:
   - name: Generate images with serverless GPUs

articles/container-apps/gpu-types.md

Lines changed: 91 additions & 0 deletions
@@ -0,0 +1,91 @@
+---
+title: Comparing GPU types in Azure Container Apps
+description: Learn how to select the most appropriate GPU type for your container app.
+services: container-apps
+author: craigshoemaker
+ms.service: azure-container-apps
+ms.topic: how-to
+ms.date: 03/18/2025
+ms.author: cshoe
+ai-usage: ai-generated
+---
+
+# Comparing GPU types in Azure Container Apps
+
+Azure Container Apps supports serverless GPU acceleration (preview), enabling compute-intensive machine learning and AI workloads in containerized environments. This capability allows you to use GPU hardware without managing the underlying infrastructure, following the serverless model that defines Container Apps.
+
+This article compares the NVIDIA T4 and A100 GPU options available in Azure Container Apps. Understanding the technical differences between these GPU types is important as you optimize your containerized applications for performance, cost-efficiency, and workload requirements.
+
+## Key differences
+
+The fundamental differences between the T4 and A100 GPU types involve the amount of compute resources available to each type.
+
+| GPU type | Description |
+|---|---|
+| T4 | Delivers cost-effective acceleration ideal for inference workloads and mainstream AI applications. The GPU is built on the Turing architecture, which provides sufficient computational power for most production inference scenarios. |
+| A100 | Features performance advantages for demanding workloads that require maximum computational power. The [massive memory capacity](#specs) helps you work with large language models, complex computer vision applications, or scientific simulations that wouldn't fit in the T4's more limited memory. |
+
+The following table compares the technical specifications of the NVIDIA T4 and NVIDIA A100 GPUs available in Azure Container Apps. These specifications highlight the key hardware differences, performance capabilities, and optimal use cases for each GPU type.
+
+<a name="specs"></a>
+
+| Specification | NVIDIA T4 | NVIDIA A100 |
+|---------------|-----------|-------------|
+| **Memory** | 16 GB VRAM | 40 GB or 80 GB HBM2/HBM2e |
+| **Architecture** | Turing | Ampere |
+| **Power consumption** | 70 W TDP | Higher (400 W for SXM variant) |
+| **Precision support** | FP32, FP16 | TF32, FP32, FP16, BFLOAT16, INT8, INT4 |
+| **Training performance** | Limited for modern deep learning | Up to 20x faster than T4 for large models |
+| **Inference performance** | Cost-effective for smaller models | Substantially higher, especially for large models |
+| **Special features** | - | MIG technology (up to seven isolated instances), NVLink |
+| **Optimal model size** | Small models (<5 GB) | Medium to large models (>5 GB) |
+| **Best use cases** | Cost-effective inference, mainstream AI applications | Training workloads, large models, complex computer vision, scientific simulations |
+| **Scalability** | Limited multi-GPU scaling | Better multi-GPU scaling with NVLink |
+
+## Differences between GPU types
+
+The GPU type you select depends largely on the purpose of your application. The following sections explore the strengths of each GPU type in the context of inference, training, and mixed workloads.
+
+### Inference workloads
+
+For inference workloads, choosing between the T4 and A100 depends on several factors, including model size, performance requirements, and deployment scale.
+
+The T4 provides the most cost-effective inference acceleration, particularly when deploying smaller models. The A100, however, delivers substantially higher inference performance, especially for large models.
+
+When looking to scale, the T4 often provides a better cost-performance ratio, while the A100 excels in scenarios that require maximum performance. The A100 is especially well suited for large models, or for using MIG to serve multiple inference workloads simultaneously.
+
+### Training workloads
+
+For AI training workloads, the difference between these GPUs becomes even more pronounced. The T4, while capable of handling small model training, faces significant limitations for modern deep learning training.
+
+The A100 is overwhelmingly superior for training workloads, delivering up to 20 times better performance for large models compared to the T4. Its substantially larger memory capacity (40 GB or 80 GB) enables training of larger models, in many cases without the need for complex model parallelism techniques. The A100's higher memory bandwidth also significantly accelerates data loading during training, reducing overall training time.
+
+### Mixed precision and specialized workloads
+
+The capabilities for mixed precision and specialized compute formats differ significantly between these GPUs. The T4 supports FP32 and FP16 precision operations, providing reasonable acceleration for mixed precision workloads. However, its support for specialized formats is limited compared to the A100.
+
+The A100 offers comprehensive support for a wide range of precision formats, including TF32, FP32, FP16, BFLOAT16, INT8, and INT4. With TensorFloat-32 (TF32), the A100 provides the mathematical accuracy of FP32 while delivering higher performance.
+
+For workloads that benefit from mixed precision or require specialized formats, the A100 offers significant advantages in both performance and flexibility.
+
+## Selecting a GPU type
+
+Choosing between the T4 and A100 GPUs requires careful consideration of several key factors. The primary workload type should guide the initial decision: for inference-focused workloads, especially with smaller models, the T4 often provides sufficient performance at a more attractive price point. For training-intensive workloads or inference with large models, the A100's superior performance becomes more valuable and often necessary.
+
+Model size and complexity represent another critical decision factor. For small models (under 5 GB), the T4's 16 GB of memory is typically adequate. For medium-sized models (5-15 GB), consider testing on both GPU types to determine the optimal balance of cost and performance for your situation. Large models (over 15 GB) often require the A100's expanded memory capacity and bandwidth.
+
+Evaluate your performance requirements carefully. For baseline acceleration needs, the T4 provides a good balance of performance and cost. For maximum performance in demanding applications, the A100 delivers superior results, especially for large-scale AI and high-performance computing workloads. Latency-sensitive applications benefit from the A100's higher compute capability and memory bandwidth, which reduce processing time.
+
+## Special considerations
+
+Keep in mind the following considerations when you select a GPU type:
+
+- **Plan for growth**: Even if you start with small models, consider beginning with the A100 if you expect to need more resources, despite its higher initial cost. The continuity in your setup might prove worth any extra costs you incur as you grow. This kind of future-proofing matters most to research organizations and AI-focused companies, where model complexity tends to increase over time.
+
+- **Hybrid deployments**: Using both T4 and A100 workload profiles can help you route work to the most cost-effective destination. For example, you might use A100 GPUs for training and development while deploying inference workloads on T4 GPUs.
+
+## Related content
+
+- [Serverless GPUs](gpu-serverless-overview.md)
+- [Tutorial: Generate images with serverless GPUs](gpu-image-generation.md)
+- [Tutorial: Deploy an NVIDIA Llama3 NIM](serverless-gpu-nim.md)
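As a companion to the hybrid-deployments guidance in this new article: in Container Apps, the T4 and A100 options surface as serverless GPU workload profiles added to an environment. A minimal Azure CLI sketch; the profile type names (`Consumption-GPU-NC8as-T4`, `Consumption-GPU-NC24-A100`) and all resource names are assumptions, so check the `list-supported` output for your region first:

```azurecli
# Discover which GPU workload profile types the region supports.
az containerapp env workload-profile list-supported --location westus3

# Add a T4 profile for inference and an A100 profile for training to one environment.
# Profile type names are assumptions; substitute values from the command above.
az containerapp env workload-profile add \
    --name my-environment \
    --resource-group my-resource-group \
    --workload-profile-name gpu-t4-inference \
    --workload-profile-type Consumption-GPU-NC8as-T4

az containerapp env workload-profile add \
    --name my-environment \
    --resource-group my-resource-group \
    --workload-profile-name gpu-a100-training \
    --workload-profile-type Consumption-GPU-NC24-A100
```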
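Once a profile exists, an app targets a GPU type by referencing the profile name at deployment time. A sketch, again with placeholder names and a placeholder image:

```azurecli
# Run an inference container on the T4 profile added above.
# The image and all names are placeholders for illustration.
az containerapp create \
    --name my-inference-app \
    --resource-group my-resource-group \
    --environment my-environment \
    --image myregistry.azurecr.io/inference-model:latest \
    --workload-profile-name gpu-t4-inference
```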

articles/data-factory/data-migration-guidance-s3-azure-storage.md

Lines changed: 3 additions & 0 deletions
@@ -6,6 +6,9 @@ author: dearandyxu
 ms.subservice: data-movement
 ms.topic: conceptual
 ms.date: 05/15/2024
+ms.custom:
+- migration
+- aws-to-azure
 ---

 # Use Azure Data Factory to migrate data from Amazon S3 to Azure Storage

articles/data-factory/solution-template-migration-s3-azure.md

Lines changed: 3 additions & 0 deletions
@@ -6,6 +6,9 @@ ms.author: yexu
 ms.topic: conceptual
 ms.date: 10/03/2024
 ms.subservice: data-movement
+ms.custom:
+- migration
+- aws-to-azure
 ---

 # Migrate data from Amazon S3 to Azure Data Lake Storage Gen2

articles/iot-operations/troubleshoot/known-issues.md

Lines changed: 0 additions & 2 deletions
@@ -17,8 +17,6 @@ This article lists the known issues for Azure IoT Operations.

 - If you prefer to have no updates made to your cluster without giving explicit consent, disable Arc updates when you enable the cluster. This is because the Arc agent automatically updates some system extensions. To disable updates, include the `--disable-auto-upgrade` flag as part of the `az connectedk8s connect` command.

-- If your deployment fails with the `"code":"LinkedAuthorizationFailed"` error, it means that you don't have **Microsoft.Authorization/roleAssignments/write** permissions on the resource group that contains your cluster.
-
 - Directly editing **SecretProviderClass** and **SecretSync** custom resources in your Kubernetes cluster can break the secrets flow in Azure IoT Operations. For any operations related to secrets, use the operations experience UI.

 - During and after deploying Azure IoT Operations, you might see warnings about `Unable to retrieve some image pull secrets (regcred)` in the logs and Kubernetes events. These warnings are expected and don't affect the deployment and use of Azure IoT Operations.
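The `--disable-auto-upgrade` workaround in the first bullet is applied when connecting the cluster; a sketch with placeholder cluster and resource group names:

```azurecli
# Connect the cluster to Azure Arc with automatic extension upgrades disabled,
# so system extensions aren't updated without explicit consent.
az connectedk8s connect \
    --name my-cluster \
    --resource-group my-resource-group \
    --disable-auto-upgrade
```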

articles/iot-operations/troubleshoot/troubleshoot.md

Lines changed: 6 additions & 1 deletion
@@ -23,6 +23,11 @@ For general deployment and configuration troubleshooting, you can use the Azure

 - Use [az iot ops support create-bundle](/cli/azure/iot/ops/support#az-iot-ops-support-create-bundle) to collect logs and traces to help you diagnose problems. The `support create-bundle` command creates a standard support bundle zip archive you can review or provide to Microsoft Support.

+### You see a `"code":"LinkedAuthorizationFailed"` error message
+If your deployment fails with the `"code":"LinkedAuthorizationFailed"` error, the message indicates that you don't have the required permissions on the resource group containing the cluster.
+
+To resolve this issue, ensure that you have **Microsoft.Authorization/roleAssignments/write** permissions at the resource group level.
+
 ### You see an UnauthorizedNamespaceError error message

 If you see the following error message, you either didn't enable the required Azure-arc custom locations feature, or you enabled the custom locations feature with an incorrect custom locations RP OID.

@@ -31,7 +36,7 @@ If you see the following error message, you either didn't enable the required Az
 Message: Microsoft.ExtendedLocation resource provider does not have the required permissions to create a namespace on the cluster.
 ```

-To resolve, follow [this guidance](/azure-arc/kubernetes/custom-locations#enable-custom-locations-on-your-cluster) for enabling the custom locations feature with the correct OID.
+To resolve, follow [this guidance](/azure/azure-arc/kubernetes/custom-locations#enable-custom-locations-on-your-cluster) for enabling the custom locations feature with the correct OID.

 ### You see a MissingResourceVersionOnHost error message

(Binary file changed: 15.7 KB)

articles/migration/index.yml

Lines changed: 3 additions & 2 deletions
@@ -10,6 +10,7 @@ metadata:
   author: PageWriter-MSFT
   ms.date: 03/11/2025
   adobe-target: true
+  ms.custom: migration

 sections:

@@ -268,9 +269,9 @@ sections:
       Detailed guide on migrating your GCP virtual machines to Azure.
     :::column-end:::
     :::column:::
-      **[AKS for EKS Professionals](/azure/architecture/aws-professional/eks-to-aks/)**
+      **[Migrate EKS web application workloads](/azure/aks/eks-web-overview)**

-      Assists Amazon EKS professionals in understanding Azure Kubernetes Service (AKS) through comparisons and migration strategies.
+      Replicate an Amazon Elastic Kubernetes Service (EKS) web application with AWS Web Application Firewall (WAF) using Azure Web Application Firewall (WAF) and Azure Application Gateway in Azure Kubernetes Service (AKS).
     :::column-end:::
     :::column:::
       **[Migrate physical Machines](/azure/migrate/tutorial-migrate-physical-virtual-machines)**
