
Commit b3d7f47

Merge pull request #51222 from MicrosoftDocs/EW-prevent-azure-machine-learning-data-exfiltration
New Module: prevent azure machine learning data exfiltration
2 parents 8752b31 + 17b3fc1 commit b3d7f47

20 files changed

+475
-0
lines changed
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,13 @@
1+
### YamlMime:ModuleUnit
2+
uid: learn.prevent-data-exfiltration-azure-ai-workloads.prevent-data-exfiltration-azure-ai-workloads
3+
title: Prevent data exfiltration from Azure AI Workloads
4+
metadata:
5+
title: Prevent Data Exfiltration From Azure AI Workloads
6+
description: Understand data exfiltration risks in Azure AI workloads and how to mitigate them.
7+
ms.date: 07/01/2025
8+
author: Orin-Thomas
9+
ms.author: viniap
10+
ms.topic: unit
11+
durationInMinutes: 2
12+
content: |
13+
[!include[](includes/1-prevent-data-exfiltration-azure-ai-workloads.md)]
@@ -0,0 +1,13 @@
1+
### YamlMime:ModuleUnit
2+
uid: learn.prevent-data-exfiltration-azure-ai-workloads.exfiltration-prevention-azure-ai-services
3+
title: Exfiltration prevention for Azure AI services
4+
metadata:
5+
title: Exfiltration Prevention for Azure AI Services
6+
description: Prevent data exfiltration in Azure AI services by implementing best practices and security measures.
7+
ms.date: 07/01/2025
8+
author: Orin-Thomas
9+
ms.author: viniap
10+
ms.topic: unit
11+
durationInMinutes: 5
12+
content: |
13+
[!include[](includes/2-exfiltration-prevention-azure-ai-services.md)]
@@ -0,0 +1,13 @@
1+
### YamlMime:ModuleUnit
2+
uid: learn.prevent-data-exfiltration-azure-ai-workloads.azure-machine-learning-data-exfiltration-prevention
3+
title: Azure Machine Learning data exfiltration prevention
4+
metadata:
5+
title: Azure Machine Learning Data Exfiltration Prevention
6+
description: Prevent data exfiltration in Azure Machine Learning by implementing best practices and security measures.
7+
ms.date: 07/01/2025
8+
author: Orin-Thomas
9+
ms.author: viniap
10+
ms.topic: unit
11+
durationInMinutes: 6
12+
content: |
13+
[!include[](includes/3-azure-machine-learning-data-exfiltration-prevention.md)]
Lines changed: 47 additions & 0 deletions
@@ -0,0 +1,47 @@
1+
### YamlMime:ModuleUnit
2+
uid: learn.prevent-data-exfiltration-azure-ai-workloads.knowledge-check
3+
title: Knowledge check
4+
metadata:
5+
title: Knowledge Check
6+
description: Check your knowledge.
7+
ms.date: 07/01/2025
8+
author: Orin-Thomas
9+
ms.author: viniap
10+
ms.topic: unit
11+
durationInMinutes: 4
12+
content: Choose the best response for each question.
13+
quiz:
14+
questions:
15+
- content: "You're tasked with enabling data exfiltration prevention for an Azure OpenAI service in your organization. You want to restrict outbound traffic to avoid data being sent to unauthorized locations. What do you need to configure?"
16+
choices:
17+
- content: "Disable outbound traffic and allow for in-Azure communication only."
18+
isCorrect: false
19+
explanation: "Disabling outbound traffic will completely shut off the Azure AI service affected."
20+
- content: "Enable the restriction for outbound traffic and configure the list of approved FQDNs."
21+
isCorrect: true
22+
explanation: "Restricting outbound traffic and configuring the list of allowed FQDNs lets the service function properly while limiting the ability of a malicious individual to extract data."
23+
- content: "Configure the list of approved FQDNs so data can only be sent to those."
24+
isCorrect: false
25+
explanation: "The list of allowed FQDNs has no effect unless the outbound traffic restriction is enabled first."
26+
- content: "How can you restrict inbound traffic for compute instances or clusters using a public IP address in Azure Machine Learning?"
27+
choices:
28+
- content: "Restrict traffic using a network security group (NSG) and service tags."
29+
isCorrect: true
30+
explanation: "NSGs can block unauthorized traffic while allowing legitimate requests to go through."
31+
- content: "Block port 44224 for the compute or cluster instance."
32+
isCorrect: false
33+
explanation: "Blocking port 44224 prevents Azure Machine Learning from responding to requests and shuts off the service."
34+
- content: "Allow only HTTPS traffic (port 443) for the compute or cluster instance."
35+
isCorrect: false
36+
explanation: "Azure Machine Learning doesn't, by default, respond to traffic on port 443 for HTTPS requests."
37+
- content: "You want to add another security layer to prevent data exfiltration on an Azure Machine Learning environment. How can you control the outbound traffic to allowed storage accounts only?"
38+
choices:
39+
- content: "Use third-party storage accounts only."
40+
isCorrect: false
41+
explanation: "Using third-party storage accounts doesn't, by itself, prevent data exfiltration."
42+
- content: "Use storage accounts on different Azure subscriptions."
43+
isCorrect: false
44+
explanation: "Storage accounts on different Azure subscriptions still allow unchecked outbound traffic."
45+
- content: "Use service endpoint policies."
46+
isCorrect: true
47+
explanation: "Service endpoint policies let you filter outbound network traffic to specific Azure Storage accounts, limiting data exfiltration."
Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,13 @@
1+
### YamlMime:ModuleUnit
2+
uid: learn.prevent-data-exfiltration-azure-ai-workloads.summary
3+
title: Summary
4+
metadata:
5+
title: Summary
6+
description: Module summary.
7+
ms.date: 07/01/2025
8+
author: Orin-Thomas
9+
ms.author: viniap
10+
ms.topic: unit
11+
durationInMinutes: 1
12+
content: |
13+
[!include[](includes/5-summary.md)]
Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,7 @@
1+
Exfiltration is a specific form of data loss where data is deliberately transferred to an external destination by a malicious actor. AI workloads, like any other workload running in the cloud, are potential avenues of data loss through exfiltration.
2+
3+
Exfiltration poses significant risks to organizations, including potential breaches of privacy, financial losses, and damage to reputation. Implementing robust exfiltration prevention measures is essential to protect sensitive data from leaving the secure environment.
4+
5+
A strategy to prevent data exfiltration involves applying security controls to all resources in an AI workload. This module focuses specifically on the security controls and configuration you can apply to Azure AI services and Azure Machine Learning to address exfiltration attempts.
6+
7+
[![Diagram of a high security tenant transferring data to a low security tenant that then has access to output data to untrusted data sources.](../media/exfiltration-inbound-outbound.svg)](../media/exfiltration-inbound-outbound-big.png#lightbox)
Lines changed: 64 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,64 @@
1+
The exfiltration prevention capabilities of Azure AI services let you configure a list of outbound URLs that your Azure AI services resources are permitted to access. By limiting outbound traffic to authorized URLs only, you reduce the chance that a malicious actor transmits data outside of your organization.
2+
3+
The following services support data loss prevention configuration:
4+
5+
- Azure OpenAI
6+
- Azure AI Vision
7+
- Content Moderator
8+
- Custom Vision
9+
- Face
10+
- Document Intelligence
11+
- Speech Service
12+
- QnA Maker
13+
14+
To enable exfiltration prevention for an AI service, you need to complete two steps. First, set the `restrictOutboundNetworkAccess` property on the AI service resource to `true`. Then provide the list of approved URLs that the AI service is allowed to access by adding those URLs to the `allowedFqdnList` property. This property supports up to 1,000 URLs, including both IPv4 addresses and fully qualified domain names.
15+
16+
You can use Cloud Shell to configure exfiltration protection for Azure AI services by performing the following steps.
17+
18+
1. In the Azure portal, select the Cloud Shell icon on the top-right corner of the portal to start a session.
19+
1. Select Bash.
20+
1. List all Cognitive Services accounts using the following command:
21+
22+
```azurecli
23+
az cognitiveservices account list --output table
24+
```
25+
26+
1. Check whether outbound network access is restricted on the account by using the following command:
27+
28+
```azurecli
29+
az cognitiveservices account show -g "myResourceGroup" -n "accountName" | grep NetworkAccess
30+
```
31+
32+
[![Screenshot that displays output of command checking status of cognitive services.](../media/show-exfiltration-configuration.svg)](../media/show-exfiltration-configuration-big.png#lightbox)
33+
34+
1. The result of this command informs you if public network access is enabled for the service and if any outbound restrictions are set.
35+
1. Check whether the account has a list of allowed fully qualified domain names (FQDNs):
36+
37+
```azurecli
38+
az cognitiveservices account show -g "myResourceGroup" -n "accountName" | grep Fqdn
39+
```
40+
41+
1. The next command uses `az rest` to send a PATCH request to the Azure OpenAI instance so that outbound network access is restricted and the allowed FQDN list is set to "microsoft.com".
42+
43+
```azurecli
44+
az rest -m patch -u "/subscriptions/{subscription ID}/resourceGroups/{resource group}/providers/Microsoft.CognitiveServices/accounts/{account name}?api-version=2024-10-01" -b '{"properties": { "restrictOutboundNetworkAccess": true, "allowedFqdnList": [ "microsoft.com" ] }}'
45+
```
46+
47+
1. After issuing the command, wait up to 15 minutes for settings to take effect.
48+
1. Check whether outbound access is restricted by running the same command you used previously.
49+
50+
```azurecli
51+
az cognitiveservices account show -g "myResourceGroup" -n "accountName" | grep NetworkAccess
52+
```
53+
54+
1. The output shows that `restrictOutboundNetworkAccess` is now set to true.
55+
1. Next, send the output of the command to a text file so that you can open the file in nano.
56+
57+
```azurecli
58+
az cognitiveservices account show -g "myResourceGroup" -n "accountName" > myfile.txt
59+
nano myfile.txt
60+
```
61+
62+
1. The output shows microsoft.com in the allowed FQDN list.
63+
64+
[![Screenshot showing the contents of the output text file in the editor.](../media/editor-list.svg)](../media/editor-list-big.png#lightbox)
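
When you need to allow more than one destination, writing the PATCH body by hand becomes error prone. The following is a minimal sketch, not a definitive implementation, of a shell helper that builds the request body from a list of FQDNs. The function name `build_dlp_body` and the `contoso.com` entry are hypothetical and used only for illustration. The 1,000-entry check mirrors the limit described earlier in this unit.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints a JSON PATCH body that sets
# restrictOutboundNetworkAccess and allowedFqdnList from its arguments.
build_dlp_body() {
  # allowedFqdnList supports up to 1,000 entries.
  if [ "$#" -gt 1000 ]; then
    echo "allowedFqdnList supports at most 1,000 entries" >&2
    return 1
  fi
  fqdn_list=""
  for fqdn in "$@"; do
    # Add a comma separator before every entry except the first.
    fqdn_list="${fqdn_list}${fqdn_list:+, }\"${fqdn}\""
  done
  printf '{"properties": {"restrictOutboundNetworkAccess": true, "allowedFqdnList": [%s]}}\n' "$fqdn_list"
}

# Example values (assumptions): replace with the destinations your workload needs.
body=$(build_dlp_body "microsoft.com" "contoso.com")
echo "$body"

# To apply the body, pass it to the same az rest call used in the steps above:
#   az rest -m patch -u "<resource URL>?api-version=2024-10-01" -b "$body"
```

The script only prints the body, so you can review it before sending it to the resource with `az rest`.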
Lines changed: 69 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,69 @@
1+
Azure Machine Learning relies on multiple inbound and outbound dependencies. Some of these dependencies can expose you to a data exfiltration risk from malicious agents within your organization.
2+
3+
If your compute instance or cluster uses a public IP address, it requires inbound traffic from the _azuremachinelearning_ service tag on port 44224. You can control this inbound traffic by using a network security group (NSG) and service tags.
4+
5+
Outbound traffic is the most common route for data exfiltration. When storage outbound traffic and Azure Front Door outbound traffic aren't configured properly, they can lead to exfiltration. However, storage outbound traffic is a requirement for compute instances and compute clusters in an Azure Machine Learning deployment.
6+
7+
- A malicious agent can abuse the storage outbound rule by provisioning their own storage account and saving data to it. You can mitigate this risk by using an Azure service endpoint policy and Azure Batch’s simplified node communication architecture.
8+
- Azure Front Door is used by the Azure Machine Learning studio UI and AutoML. Instead of allowing outbound traffic to the service tag (AzureFrontDoor.frontend), allow only the following fully qualified domain names (FQDNs):
9+
10+
- ml.azure.com
11+
- automlresources-prod-d0eaehh7g8andvav.b02.azurefd.net
12+
13+
Switching to these FQDNs removes unnecessary outbound traffic.
14+
15+
## Service endpoint policies
16+
17+
Service endpoint policies let you filter virtual network traffic to specific Azure Storage accounts, limiting data exfiltration. Azure Machine Learning compute instances and clusters need access to Microsoft-managed storage accounts for provisioning. The Azure Machine Learning alias in service endpoint policies includes these accounts, so a policy can control the destination storage accounts while still allowing provisioning traffic. To configure service endpoint policies:
18+
19+
1. From the Azure portal, search for **Service Endpoint Policy** and select **+ Create** to start.
20+
1. On the **Basics** tab, provide the required fields and then select **Next**.
21+
1. On the **Policy definitions** tab, select **+ Add a resource** and then provide the following information:
22+
- **Service**: Microsoft.Storage
23+
- **Scope**: Select **Single Account** to limit the network traffic to one storage account.
24+
- **Subscription**: The Azure subscription that contains the storage account.
25+
- **Resource Group**: The resource group that contains the storage account.
26+
- **Resource**: The default storage account of the workspace.
27+
1. Select **Add** to add the resource information.
28+
1. Select **+Add an alias** and then select _/services/Azure/MachineLearning_ as the Server Alias value. Select **Add** to add the alias.
29+
30+
[![A screenshot showing the configuration of a service endpoint policy in the Azure portal.](../media/service-endpoint-policy.svg)](../media/service-endpoint-policy-big.png#lightbox)
31+
32+
1. Select **Review + Create**, and then select **Create**.
33+
34+
## Inbound and outbound network traffic
35+
36+
When you use an Azure Machine Learning **compute instance** _with a public IP address_, allow inbound traffic from Azure Batch management (service tag BatchNodeManagement.\<region>). A compute instance _with no public IP_ **doesn't** require this inbound communication.
37+
38+
For outbound traffic, you can use one of two options:
39+
40+
- Service tag/NSG: Allow outbound traffic to the following **service tags**. Replace \<region> with the Azure region that contains your compute cluster or instance:
41+
42+
| **Service tag** | **Protocol** | **Port** |
43+
|---|---|---|
44+
| **BatchNodeManagement.\<region>** | ANY | 443 |
45+
| **AzureMachineLearning** | TCP | 443 |
46+
| **Storage.\<region>** | TCP | 443 |
47+
48+
- Firewall: Allow outbound traffic over **any protocol on port 443** to the following FQDNs. Replace instances of \<region> with the Azure region that contains your compute cluster or instance:
49+
50+
- *.\<region>.batch.azure.com
51+
- *.\<region>.service.batch.azure.com
52+
53+
> [!NOTE]
54+
> If you enable the service endpoint on the subnet used by your firewall, you must open outbound traffic to the following hosts over **TCP port 443**:
55+
>
56+
> - *.blob.core.windows.net
57+
> - *.queue.core.windows.net
58+
> - *.table.core.windows.net
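
The rows in the service tag table above can also be expressed as Azure CLI commands. The following is a minimal sketch under stated assumptions, not a definitive configuration: the resource group, NSG name, region, rule names, and priorities are all placeholders that don't come from this module. The script only prints the `az network nsg rule create` commands so that you can review them before running anything.

```bash
#!/usr/bin/env bash
# Assumed placeholder values; replace them with your own.
RESOURCE_GROUP="myResourceGroup"
NSG_NAME="myNsg"
REGION="westus2"

# Prints (doesn't run) an outbound allow rule for one service tag on port 443.
# Arguments: rule name, priority, protocol, service tag.
print_outbound_rule() {
  printf 'az network nsg rule create --resource-group %s --nsg-name %s --name %s --priority %s --direction Outbound --access Allow --protocol "%s" --destination-address-prefixes %s --destination-port-ranges 443\n' \
    "$RESOURCE_GROUP" "$NSG_NAME" "$1" "$2" "$3" "$4"
}

# One rule per row of the service tag table; '*' means any protocol.
print_outbound_rule AllowBatchNodeManagement 100 '*' "BatchNodeManagement.${REGION}"
print_outbound_rule AllowAzureMachineLearning 110 Tcp AzureMachineLearning
print_outbound_rule AllowStorage 120 Tcp "Storage.${REGION}"
```

Printing the commands first keeps the sketch safe to run anywhere. Copy a printed line into Cloud Shell once the names and priorities match your environment.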
59+
60+
## Enable storage endpoint for the subnet
61+
62+
Use the following steps to enable a storage endpoint for the subnet that contains your Azure Machine Learning compute clusters and compute instances:
63+
64+
1. From the Azure portal, select the **Azure Virtual Network** for your Azure Machine Learning workspace.
65+
1. From the left of the page, select **Subnets** and then select the subnet that contains your compute cluster and compute instance.
66+
1. In the form that appears, expand the **Services** dropdown and then enable **Microsoft.Storage**. Select **Save** to save these changes.
67+
1. Apply the service endpoint policy to your workspace subnet.
68+
69+
[![A screenshot showing the edit subnet option in the Azure Portal.](../media/edit-subnet.svg)](../media/edit-subnet-big.png#lightbox)
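
The portal steps above have a rough Azure CLI equivalent. The following is a minimal sketch that assumes a service endpoint policy already exists. The resource group, virtual network, subnet, and policy names are placeholders, and the `run` wrapper is a hypothetical helper that prints the command unless you set `DRY_RUN=0`.

```bash
#!/usr/bin/env bash
# Assumed placeholder names; replace them with your own resources.
RESOURCE_GROUP="myResourceGroup"
VNET_NAME="myVnet"
SUBNET_NAME="training-subnet"
POLICY_NAME="myServiceEndpointPolicy"

# Dry run by default: print the command instead of calling Azure.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Enable the Microsoft.Storage service endpoint on the subnet and
# attach the service endpoint policy to it.
run az network vnet subnet update \
  --resource-group "$RESOURCE_GROUP" \
  --vnet-name "$VNET_NAME" \
  --name "$SUBNET_NAME" \
  --service-endpoints Microsoft.Storage \
  --service-endpoint-policy "$POLICY_NAME"
```

Run the script once as is to inspect the command, then rerun it with `DRY_RUN=0` after signing in with `az login` to apply the change.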
Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,8 @@
1+
In this module, you learned about mechanisms to prevent data exfiltration for AI workloads on Azure and how to configure exfiltration prevention for Azure AI services and Azure Machine Learning workloads.
2+
3+
## Learn more
4+
5+
For more information about how to prevent data loss and exfiltration for AI workloads on Azure, see the following articles:
6+
7+
- [Configure data loss prevention for Azure AI services](/azure/ai-services/cognitive-services-data-loss-prevention)
8+
- [Azure Machine Learning data exfiltration prevention](/azure/machine-learning/how-to-prevent-data-loss-exfiltration)
Lines changed: 37 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,37 @@
1+
### YamlMime:Module
2+
uid: learn.prevent-data-exfiltration-azure-ai-workloads
3+
metadata:
4+
ms.author: viniap
5+
author: Orin-Thomas
6+
ms.date: 07/01/2025
7+
title: Prevent data exfiltration from Azure AI workloads
8+
description: Overview of how to prevent data exfiltration for AI workloads running on Microsoft Azure.
9+
ms.topic: module
10+
ms.service: azure-machine-learning
11+
ms.collection: ce-advocates-ai-copilot
12+
title: Prevent data exfiltration from Azure AI Workloads
13+
summary: Learn how to configure Azure Machine Learning and Azure AI services workloads to prevent data exfiltration.
14+
abstract: |
15+
After completing this module, you will be able to:
16+
- Understand mechanisms to prevent data exfiltration for AI workloads on Azure
17+
- Configure exfiltration prevention for Azure AI services and Azure Machine Learning
18+
prerequisites: |
19+
To get the most out of this module, you should be familiar with:
20+
- Fundamental security concepts
21+
- Fundamental AI concepts
22+
- Fundamental Azure Machine Learning concepts
23+
iconUrl: /learn/achievements/generic-badge.svg
24+
levels:
25+
- beginner
26+
roles:
27+
- developer
28+
products:
29+
- azure
30+
units:
31+
- learn.prevent-data-exfiltration-azure-ai-workloads.prevent-data-exfiltration-azure-ai-workloads
32+
- learn.prevent-data-exfiltration-azure-ai-workloads.exfiltration-prevention-azure-ai-services
33+
- learn.prevent-data-exfiltration-azure-ai-workloads.azure-machine-learning-data-exfiltration-prevention
34+
- learn.prevent-data-exfiltration-azure-ai-workloads.knowledge-check
35+
- learn.prevent-data-exfiltration-azure-ai-workloads.summary
36+
badge:
37+
uid: learn.prevent-data-exfiltration-azure-ai-workloads-badge
