You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: articles/machine-learning/concept-enterprise-security.md
+11-7Lines changed: 11 additions & 7 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -15,7 +15,7 @@ ms.date: 08/26/2022
15
15
16
16
# Enterprise security and governance for Azure Machine Learning
17
17
18
-
In this article, you'll learn about security and governance features available for Azure Machine Learning. These features are useful for administrators, DevOps, and MLOps who want to create a secure configuration that is compliant with your companies policies. With Azure Machine Learning and the Azure platform, you can:
18
+
In this article, you learn about security and governance features available for Azure Machine Learning. These features are useful for administrators, DevOps, and MLOps who want to create a secure configuration that is compliant with your companies policies. With Azure Machine Learning and the Azure platform, you can:
19
19
20
20
* Restrict access to resources and operations by user account or groups
21
21
* Restrict incoming and outgoing network communications
@@ -50,7 +50,7 @@ Each workspace has an associated system-assigned [managed identity](../active-di
50
50
| Azure Container Registry | Contributor |
51
51
| Resource group that contains the workspace | Contributor |
52
52
53
-
The system-assigned managed identity is used for internal service-to-service authentication between Azure Machine Learning and other Azure resources. The identity token is not accessible to users and cannot be used by them to gain access to these resources. Users can only access the resources through [Azure Machine Learning control and data plane APIs](how-to-assign-roles.md), if they have sufficient RBAC permissions.
53
+
The system-assigned managed identity is used for internal service-to-service authentication between Azure Machine Learning and other Azure resources. The identity token isn't accessible to users and they can't use it to gain access to these resources. Users can only access the resources through [Azure Machine Learning control and data plane APIs](how-to-assign-roles.md), if they have sufficient RBAC permissions.
54
54
55
55
We don't recommend that admins revoke the access of the managed identity to the resources mentioned in the preceding table. You can restore access by using the [resync keys operation](how-to-change-storage-access-key.md).
56
56
@@ -77,15 +77,19 @@ For more information, see the following articles:
77
77
78
78
## Network security and isolation
79
79
80
-
To restrict network access to Azure Machine Learning resources, you can use [Azure Virtual Network (VNet)](../virtual-network/virtual-networks-overview.md)and [Azure Machine Learning managed virtual network (preview)](how-to-managed-network.md). Using a virtual network reduces the attack surface for your solution, as well as the chances of data exfiltration.
80
+
To restrict network access to Azure Machine Learning resources, you can use an [Azure Machine Learning managed virtual network](how-to-managed-network.md)(preview) or [Azure Virtual Network (VNet)](../virtual-network/virtual-networks-overview.md). Using a virtual network reduces the attack surface for your solution, and the chances of data exfiltration.
81
81
82
82
You don't have to pick one or the other. For example, you can use a managed virtual network to secure managed compute resources and an Azure Virtual Network for your unmanaged resources or to secure client access to the workspace.
83
83
84
-
*__Azure Machine Learning managed virtual network__ (preview) provides a fully managed solution that enables network isolation for your workspace and managed compute resources. You can use private endpoints to secure communication with other Azure services, and can restrict outbound communications.
84
+
*__Azure Machine Learning managed virtual network__ (preview) provides a fully managed solution that enables network isolation for your workspace and managed compute resources. You can use private endpoints to secure communication with other Azure services, and can restrict outbound communications. The following managed compute resources are secured with a managed network:
For more information, see [Azure Machine Learning managed virtual network (preview)](how-to-managed-network.md).
92
+
For more information, see [Azure Machine Learning managed virtual network](how-to-managed-network.md) (preview).
89
93
90
94
*__Azure Virtual Networks__ provides a more customizable virtual network offering. However, you're responsible for configuration and management. You may need to use network security groups, user-defined routing, or a firewall to restrict outbound communication.
91
95
@@ -103,7 +107,7 @@ You don't have to pick one or the other. For example, you can use a managed virt
103
107
104
108
## Data encryption
105
109
106
-
Azure Machine Learning uses a variety of compute resources and data stores on the Azure platform. To learn more about how each of these supports data encryption at rest and in transit, see [Data encryption with Azure Machine Learning](concept-data-encryption.md).
110
+
Azure Machine Learning uses various compute resources and data stores on the Azure platform. To learn more about how each of these resources supports data encryption at rest and in transit, see [Data encryption with Azure Machine Learning](concept-data-encryption.md).
Azure Machine Learning requires access to servers and services on the public internet. When implementing network isolation, you need to understand what access is required and how to enable it.
21
21
22
22
> [!NOTE]
23
-
> The information in this article applies to Azure Machine Learning workspace configured with a private endpoint.
23
+
> The information in this article applies to Azure Machine Learning workspace configured to use an _Azure Virtual Network_. When using a _managed virtual network_, the required inbound and outbound configuration for the workspace is automatically applied. For more information, see [Azure Machine Learning managed virtual network](how-to-managed-network.md).
24
24
25
25
## Common terms and information
26
26
@@ -100,16 +100,7 @@ __Outbound traffic__
100
100
101
101
__To allow installation of Python packages for training and deployment__, allow __outbound__ traffic to the following host names:
102
102
103
-
> [!NOTE]
104
-
> This is not a complete list of the hosts required for all Python resources on the internet, only the most commonly used. For example, if you need access to a GitHub repository or other host, you must identify and add the required hosts for that scenario.
105
-
106
-
|__Host name__|__Purpose__|
107
-
| ---- | ---- |
108
-
|`anaconda.com`<br>`*.anaconda.com`| Used to install default packages. |
109
-
|`*.anaconda.org`| Used to get repo data. |
110
-
|`pypi.org`| Used to list dependencies from the default index, if any, and the index isn't overwritten by user settings. If the index is overwritten, you must also allow `*.pythonhosted.org`. |
111
-
|`*pytorch.org`| Used by some examples based on PyTorch. |
112
-
|`*.tensorflow.org`| Used by some examples based on Tensorflow. |
In this article, you learn how to use Azure Machine Learning studio in a virtual network. The studio includes features like AutoML, the designer, and data labeling.
20
22
21
23
Some of the studio's features are disabled by default in a virtual network. To re-enable these features, you must enable managed identity for storage accounts you intend to use in the studio.
@@ -92,9 +94,9 @@ In this article, you learn how to:
92
94
93
95
### Designer sample pipeline
94
96
95
-
There's a known issue where user cannot run sample pipeline in Designer homepage. This is the sample dataset used in the sample pipeline is Azure Global dataset, and it cannot satisfy all virtual network environment.
97
+
There's a known issue where user can't run sample pipeline in Designer homepage. This problem occurs because the sample dataset used in the sample pipeline is an Azure Global dataset. It can't be accessed from a virtual network environment.
96
98
97
-
To resolve this issue, you can use a public workspace to run sample pipeline to get to know how to use the designer and then replace the sample dataset with your own dataset in the workspace within virtual network.
99
+
To resolve this issue, use a public workspace to run the sample pipeline. Or replace the sample dataset with your own dataset in the workspace within a virtual network.
98
100
99
101
## Datastore: Azure Storage Account
100
102
@@ -103,7 +105,7 @@ Use the following steps to enable access to data stored in Azure Blob and File s
103
105
> [!TIP]
104
106
> The first step is not required for the default storage account for the workspace. All other steps are required for *any* storage account behind the VNet and used by the workspace, including the default storage account.
105
107
106
-
1.**If the storage account is the *default* storage for your workspace, skip this step**. If it is not the default, __Grant the workspace managed identity the 'Storage Blob Data Reader' role__ for the Azure storage account so that it can read data from blob storage.
108
+
1.**If the storage account is the *default* storage for your workspace, skip this step**. If it isn't the default, __Grant the workspace managed identity the 'Storage Blob Data Reader' role__ for the Azure storage account so that it can read data from blob storage.
107
109
108
110
For more information, see the [Blob Data Reader](../role-based-access-control/built-in-roles.md#storage-blob-data-reader) built-in role.
109
111
@@ -115,15 +117,15 @@ Use the following steps to enable access to data stored in Azure Blob and File s
115
117
For more information, see the [Reader](../role-based-access-control/built-in-roles.md#reader) built-in role.
116
118
117
119
<aid='enable-managed-identity'></a>
118
-
1.__Enable managed identity authentication for default storage accounts__. Each Azure Machine Learning workspace has two default storage accounts, a default blob storage account and a default file store account, which are defined when you create your workspace. You can also set new defaults in the __Datastore__ management page.
120
+
1.__Enable managed identity authentication for default storage accounts__. Each Azure Machine Learning workspace has two default storage accounts, a default blob storage account and a default file store account. Both are defined when you create your workspace. You can also set new defaults in the __Datastore__ management page.
119
121
120
122

121
123
122
124
The following table describes why managed identity authentication is used for your workspace default storage accounts.
123
125
124
126
|Storage account | Notes |
125
127
|---------|---------|
126
-
|Workspace default blob storage| Stores model assets from the designer. Enable managed identity authentication on this storage account to deploy models in the designer. If managed identity authentication is disabled, the user's identity is used to access data stored in the blob. <br> <br> You can visualize and run a designer pipeline if it uses a non-default datastore that has been configured to use managed identity. However, if you try to deploy a trained model without managed identity enabled on the default datastore, deployment will fail regardless of any other datastores in use.|
128
+
|Workspace default blob storage| Stores model assets from the designer. Enable managed identity authentication on this storage account to deploy models in the designer. If managed identity authentication is disabled, the user's identity is used to access data stored in the blob. <br> <br> You can visualize and run a designer pipeline if it uses a non-default datastore that has been configured to use managed identity. However, if you try to deploy a trained model without managed identity enabled on the default datastore, deployment fails regardless of any other datastores in use.|
127
129
|Workspace default file store| Stores AutoML experiment assets. Enable managed identity authentication on this storage account to submit AutoML experiments. |
128
130
129
131
1.__Configure datastores to use managed identity authentication__. After you add an Azure storage account to your virtual network with either a [service endpoint](how-to-secure-workspace-vnet.md?tabs=se#secure-azure-storage-accounts) or [private endpoint](how-to-secure-workspace-vnet.md?tabs=pe#secure-azure-storage-accounts), you must configure your datastore to use [managed identity](../active-directory/managed-identities-azure-resources/overview.md) authentication. Doing so lets the studio access data in your storage account.
@@ -164,24 +166,24 @@ After you create a SQL contained user, grant permissions to it by using the [GRA
164
166
165
167
## Intermediate component output
166
168
167
-
When using the Azure Machine Learning designer intermediate component output, you can specify the output location for any component in the designer. Use this to store intermediate datasets in separate location for security, logging, or auditing purposes. To specify output, use the following steps:
169
+
When using the Azure Machine Learning designer intermediate component output, you can specify the output location for any component in the designer. Use this output to store intermediate datasets in separate location for security, logging, or auditing purposes. To specify output, use the following steps:
168
170
169
171
1. Select the component whose output you'd like to specify.
170
172
1. In the component settings pane that appears to the right, select __Output settings__.
171
173
1. Specify the datastore you want to use for each component output.
172
174
173
-
Make sure that you have access to the intermediate storage accounts in your virtual network. Otherwise, the pipeline will fail.
175
+
Make sure that you have access to the intermediate storage accounts in your virtual network. Otherwise, the pipeline fails.
174
176
175
177
[Enable managed identity authentication](#enable-managed-identity) for intermediate storage accounts to visualize output data.
176
178
## Access the studio from a resource inside the VNet
177
179
178
-
If you are accessing the studio from a resource inside of a virtual network (for example, a compute instance or virtual machine), you must allow outbound traffic from the virtual network to the studio.
180
+
If you're accessing the studio from a resource inside of a virtual network (for example, a compute instance or virtual machine), you must allow outbound traffic from the virtual network to the studio.
179
181
180
-
For example, if you are using network security groups (NSG) to restrict outbound traffic, add a rule to a __service tag__ destination of __AzureFrontDoor.Frontend__.
182
+
For example, if you're using network security groups (NSG) to restrict outbound traffic, add a rule to a __service tag__ destination of __AzureFrontDoor.Frontend__.
181
183
182
184
## Firewall settings
183
185
184
-
Some storage services, such as Azure Storage Account, have firewall settings that apply to the public endpoint for that specific service instance. Usually this setting allows you to allow/disallow access from specific IP addresses from the public internet. __This is not supported__ when using Azure Machine Learning studio. It is supported when using the Azure Machine Learning SDK or CLI.
186
+
Some storage services, such as Azure Storage Account, have firewall settings that apply to the public endpoint for that specific service instance. Usually this setting allows you to allow/disallow access from specific IP addresses from the public internet. __This is not supported__ when using Azure Machine Learning studio. It's supported when using the Azure Machine Learning SDK or CLI.
185
187
186
188
> [!TIP]
187
189
> Azure Machine Learning studio is supported when using the Azure Firewall service. For more information, see [Use your workspace behind a firewall](how-to-access-azureml-behind-firewall.md).
0 commit comments