Commit ce439b3

Merge pull request #2531 from Blackmist/360209-cmk-ga
initial writing
2 parents cea7450 + 95bb935 commit ce439b3

1 file changed: +47 -47 lines changed

articles/machine-learning/concept-customer-managed-keys.md

Lines changed: 47 additions & 47 deletions
@@ -9,13 +9,13 @@ ms.topic: conceptual
99
ms.author: larryfr
1010
author: Blackmist
1111
ms.reviewer: deeikele
12-
ms.date: 05/21/2024
12+
ms.date: 01/28/2025
1313
ms.custom: engagement-fy23, build-2024
1414
monikerRange: 'azureml-api-2 || azureml-api-1'
1515
---
1616
# Customer-managed keys for Azure Machine Learning
1717

18-
Azure Machine Learning is built on top of multiple Azure services. Although the stored data is encrypted through encryption keys that Microsoft provides, you can enhance security by also providing your own (customer-managed) keys. The keys that you provide are stored in Azure Key Vault. Your data is can be stored on a set of other resources that you manage in your Azure subscription, or [(preview) server-side on Microsoft managed resources](#preview-service-side-encryption-of-metadata).
18+
Azure Machine Learning is built on top of multiple Azure services. Although the stored data is encrypted through encryption keys that Microsoft provides, you can enhance security by also providing your own (customer-managed) keys. The keys that you provide are stored in Azure Key Vault. Your data can be stored on a set of other resources that you manage in your Azure subscription, or [service-side on Microsoft managed resources](#service-side-encryption-of-metadata).
1919

2020
In addition to customer-managed keys (CMK), Azure Machine Learning provides a [high business impact configuration](/python/api/azure-ai-ml/azure.ai.ml.entities.workspace) for highly sensitive data workloads. Enabling this configuration reduces the amount of data that Microsoft collects for diagnostic purposes and enables [extra encryption in Microsoft-managed environments](/azure/security/fundamentals/encryption-atrest).
2121

@@ -35,8 +35,8 @@ For example, the managed identity for Azure Cosmos DB would need to have those p
3535
## Limitations
3636

3737
* After workspace creation, the customer-managed encryption key for resources that the workspace depends on can only be updated to another key in the original Azure Key Vault resource.
38-
* Unless you are using the [server-side preview](#preview-service-side-encryption-of-metadata), the encrypted data is stored on resources in a Microsoft-managed resource group in your subscription. You can't create these resources up front or transfer ownership of them to you. The data lifecycle is managed indirectly via the Azure Machine Learning APIs as you create objects in the Azure Machine Learning service.
39-
* If you are using the [server-side preview](#preview-service-side-encryption-of-metadata), Azure charges will continue to accrue during the soft delete retention period.
38+
* Unless you are using [service-side encryption](#service-side-encryption-of-metadata), the encrypted data is stored on resources in a Microsoft-managed resource group in your subscription. You can't create these resources up front or transfer ownership of them to you. The data lifecycle is managed indirectly via the Azure Machine Learning APIs as you create objects in the Azure Machine Learning service.
39+
* If you are using [service-side encryption](#service-side-encryption-of-metadata), Azure charges will continue to accrue during the soft delete retention period.
4040
* You can't delete Microsoft-managed resources that you use for customer-managed keys without also deleting your workspace.
4141
* You can't encrypt the compute cluster's OS disk by using your customer-managed keys. You must use Microsoft-managed keys.
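
The first limitation above (a replacement key must live in the original Azure Key Vault resource) can be checked before you attempt a key update. The following is a minimal sketch, not part of the Azure Machine Learning SDK. It assumes key identifiers use the usual `https://<vault-name>.vault.azure.net/keys/<key-name>/<version>` URL form, and the helper names and the vault and key names are hypothetical.

```python
from urllib.parse import urlparse


def vault_host(key_identifier: str) -> str:
    """Return the Key Vault host name from a key identifier URL."""
    host = urlparse(key_identifier).hostname
    if host is None:
        raise ValueError(f"not a valid key identifier URL: {key_identifier!r}")
    return host.lower()


def is_same_vault(current_key_id: str, new_key_id: str) -> bool:
    """True when both key identifiers point at the same Key Vault host."""
    return vault_host(current_key_id) == vault_host(new_key_id)


# Hypothetical vault and key names, for illustration only.
current_key = "https://contoso-vault.vault.azure.net/keys/workspace-key/0123456789abcdef"
rotated_key = "https://contoso-vault.vault.azure.net/keys/workspace-key-2/fedcba9876543210"
other_vault_key = "https://fabrikam-vault.vault.azure.net/keys/workspace-key/0123456789abcdef"

print(is_same_vault(current_key, rotated_key))
print(is_same_vault(current_key, other_vault_key))
```

A check like this only compares host names. It doesn't confirm that the key exists or that the workspace identity has access to it.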
4242

@@ -47,20 +47,59 @@ For example, the managed identity for Azure Cosmos DB would need to have those p
4747

4848
When you *don't* use a customer-managed key, Microsoft creates and manages resources in a Microsoft-owned Azure subscription and uses a Microsoft-managed key to encrypt the data.
4949

50-
When you use a customer-managed key, the resources are in your Azure subscription and encrypted with your key. While these resources exist in your subscription, Microsoft manages them. These resources are automatically created and configured when you create your Azure Machine Learning workspace.
50+
When you use a customer-managed key, there are two possible configurations:
5151

52-
These Microsoft-managed resources are located in a new Azure resource group created in your subscription. This resource group is separate from the resource group for your workspace. It contains the Microsoft-managed resources that your key is used with. The formula for naming the resource group is: `<Azure Machine Learning workspace resource group name><GUID>`.
52+
- [Service-side encryption](#service-side-encryption-of-metadata): The resources are stored service-side on Microsoft-managed resources. This configuration reduces costs and also reduces the chance of conflict with policies you may have set for your Azure subscription.
53+
- [Subscription-side encryption (classic)](#subscription-side-encryption-of-metadata-classic): The resources are hosted in your Azure subscription and encrypted with your key. While these resources exist in your subscription, Microsoft manages them. These resources are automatically created and configured when you create your Azure Machine Learning workspace.
5354

54-
> [!TIP]
55-
> The [Request Units](/azure/cosmos-db/request-units) for Azure Cosmos DB automatically scale as needed.
55+
## Service-side encryption of metadata
56+
57+
A new architecture for the customer-managed key encryption workspace is available in preview. It reduces cost compared to the current architecture and lowers the likelihood of Azure policy conflicts. In this configuration, encrypted data is stored service-side on Microsoft-managed resources instead of in your subscription.
58+
59+
Data that was previously stored in Azure Cosmos DB in your subscription is stored in multitenant Microsoft-managed resources with document-level encryption using your encryption key. Search indices that were previously stored in Azure AI Search in your subscription are stored on Microsoft-managed resources that are provisioned and dedicated to you per workspace. The cost of the Azure AI Search instance is charged under your Azure Machine Learning workspace in Microsoft Cost Management.
60+
61+
Pipeline metadata that was previously stored in a storage account in a managed resource group is now stored on the storage account in your subscription that is associated with the Azure Machine Learning workspace. Because this Azure Storage resource is managed separately in your subscription, you're responsible for configuring encryption settings on it.
62+
63+
To opt in to this preview, set the `enableServiceSideCMKEncryption` property in a REST API request or in your Bicep or Resource Manager template. You can also use the Azure portal.
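
As a rough illustration of the opt-in, the following Python sketch builds a request body that sets `enableServiceSideCMKEncryption`. The placement of the property under `properties`, the Boolean value `True`, and the shape of the `encryption` block are assumptions made for this sketch, so check them against the REST API reference or the Bicep templates linked in this article. The function name and all resource IDs are hypothetical placeholders.

```python
import json


def build_workspace_body(key_vault_arm_id: str, key_identifier: str) -> dict:
    """Build a workspace request body with service-side CMK encryption enabled.

    The property layout is an assumption for illustration, not a confirmed schema.
    """
    return {
        "properties": {
            "encryption": {
                "status": "Enabled",
                "keyVaultProperties": {
                    "keyVaultArmId": key_vault_arm_id,
                    "keyIdentifier": key_identifier,
                },
            },
            # The opt-in setting named in this article.
            "enableServiceSideCMKEncryption": True,
        }
    }


# Hypothetical placeholder values.
body = build_workspace_body(
    key_vault_arm_id="/subscriptions/<subscription-id>/resourceGroups/<resource-group>"
    "/providers/Microsoft.KeyVault/vaults/<vault-name>",
    key_identifier="https://<vault-name>.vault.azure.net/keys/<key-name>/<version>",
)

print(json.dumps(body, indent=2))
```

The sketch only builds the JSON payload. Sending it requires an authenticated call to the workspace resource endpoint, which is out of scope here.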
64+
65+
:::image type="content" source="./media/concept-customer-managed-keys/cmk-service-side-encryption.png" alt-text="Screenshot of the encryption tab with the option for server side encryption selected." lightbox="./media/concept-customer-managed-keys/cmk-service-side-encryption.png":::
66+
67+
> [!NOTE]
68+
> - When you use service-side encryption, Azure charges will continue to accrue during the soft delete retention period.
69+
70+
For templates that create a workspace with service-side encryption of metadata, see:
71+
72+
- [Bicep template for creating default workspace](https://github.com/azure/azure-quickstart-templates/tree/master/quickstarts/microsoft.machinelearningservices/machine-learning-workspace-cmk-service-side-encryption).
73+
- [Bicep template for creating hub workspace](https://github.com/Azure/azure-quickstart-templates/tree/master/quickstarts/microsoft.machinelearningservices/aistudio-cmk-service-side-encryption).
74+
75+
## Subscription-side encryption of metadata (classic)
76+
77+
When you bring your own encryption key, service metadata is stored on dedicated resources in your Azure subscription. Microsoft creates a separate resource group in your subscription for this purpose: *azureml-rg-workspacename_GUID*. Only Microsoft can modify the resources in this managed resource group.
5678

5779
If your Azure Machine Learning workspace uses a private endpoint, this resource group also contains a Microsoft-managed Azure virtual network. This virtual network helps secure communication between the managed services and the workspace. You *can't provide your own virtual network* for use with the Microsoft-managed resources. You also *can't modify the virtual network*. For example, you can't change the IP address range that it uses.
5880

81+
Microsoft creates the following resources to store metadata for your workspace:
82+
83+
| Service | Usage | Example data |
84+
| ----- | ----- | ----- |
85+
| Azure Cosmos DB | Stores job history data, compute metadata, and asset metadata. | Data can include job name, status, and sequence number; compute cluster name, number of cores, and number of nodes; datastore names and tags, and descriptions on assets like models; and data label names. |
86+
| Azure AI Search | Stores indexes that help with querying your machine learning content. | These indexes are built on top of the data stored in Azure Cosmos DB. |
87+
| Azure Storage | Stores metadata related to Azure Machine Learning pipeline data. | Data can include designer pipeline names, pipeline layout, and execution properties. |
88+
89+
> [!TIP]
90+
> The [Request Units](/azure/cosmos-db/request-units) for Azure Cosmos DB automatically scale as needed.
91+
5992
> [!IMPORTANT]
6093
> If your subscription doesn't have enough quota for these services, a failure will occur.
6194
>
6295
> When you use a customer-managed key, the costs for your subscription are higher because these resources are in your subscription. To estimate the cost, use the [Azure pricing calculator](https://azure.microsoft.com/pricing/calculator/).
6396
97+
From the perspective of data lifecycle management, data in the preceding resources is created and deleted as you create and delete corresponding objects in Azure Machine Learning.
98+
99+
Your Azure Machine Learning workspace reads and writes data by using its managed identity. This identity is granted access to the resources through a role assignment (Azure role-based access control) on the data resources. The encryption key that you provide is used to encrypt data that is stored on Microsoft-managed resources. At runtime, the key is also used to create indexes for Azure AI Search.
100+
101+
Extra networking controls are configured when you create a private link endpoint on your workspace to allow for inbound connectivity. This configuration includes the creation of a private link endpoint connection to the Azure Cosmos DB instance. Network access is restricted to only trusted Microsoft services.
102+
64103
## Encryption of data on compute resources
65104

66105
Azure Machine Learning uses compute resources to train and deploy machine learning models. The following table describes the compute options and how each one encrypts data:
@@ -92,45 +131,6 @@ Azure Disk Encryption isn't supported for the OS disk. Each virtual machine also
92131
### Compute instance
93132

94133
The OS disk for a compute instance is encrypted with Microsoft-managed keys in Azure Machine Learning storage accounts. If you create the workspace with the `hbi_workspace` parameter set to `TRUE`, the local temporary disk on the compute instance is encrypted with Microsoft-managed keys. Customer-managed key encryption isn't supported for OS and temporary disks.
95-
96-
## Storage of encrypted workspace metadata
97-
98-
When you bring your own encryption key, service metadata is stored on dedicated resources in your Azure subscription. Microsoft creates a separate resource group in your subscription for this purpose: *azureml-rg-workspacename_GUID*. Only Microsoft can modify the resources in this managed resource group.
99-
100-
Microsoft creates the following resources to store metadata for your workspace:
101-
102-
| Service | Usage | Example data |
103-
| ----- | ----- | ----- |
104-
| Azure Cosmos DB | Stores job history data, compute metadata, and asset metadata. | Data can include job name, status, sequence number, and status; compute cluster name, number of cores, and number of nodes; datastore names and tags, and descriptions on assets like models; and data label names. |
105-
| Azure AI Search | Stores indexes that help with querying your machine learning content. | These indexes are built on top of the data stored in Azure Cosmos DB. |
106-
| Azure Storage | Stores metadata related to Azure Machine Learning pipeline data. | Data can include designer pipeline names, pipeline layout, and execution properties. |
107-
108-
From the perspective of data lifecycle management, data in the preceding resources is created and deleted as you create and delete corresponding objects in Azure Machine Learning.
109-
110-
Your Azure Machine Learning workspace reads and writes data by using its managed identity. This identity is granted access to the resources through a role assignment (Azure role-based access control) on the data resources. The encryption key that you provide is used to encrypt data that stored on Microsoft-managed resources. At runtime, the key is also used to create indexes for Azure AI Search.
111-
112-
Extra networking controls are configured when you create a private link endpoint on your workspace to allow for inbound connectivity. This configuration includes the creation of a private link endpoint connection to the Azure Cosmos DB instance. Network access is restricted to only trusted Microsoft services.
113-
114-
## (Preview) Service-side encryption of metadata
115-
116-
A new architecture for the customer-managed key encryption workspace is available in preview, reducing cost compared to the current architecture and mitigating likelihood of Azure policy conflicts. In this new model, encrypted data is stored service-side on Microsoft-managed resources instead of in your subscription.
117-
118-
Data that previously was stored in Azure Cosmos DB in your subscription, is stored in multitenant Microsoft-managed resources with document-level encryption using your encryption key. Search indices that were previously stored in Azure AI Search in your subscription, are stored on Microsoft-managed resources that are provisioned dedicated for you per workspace. The cost of the Azure AI search instance is charged under your Azure Machine Learning workspace in Microsoft Cost Management.
119-
120-
Pipelines metadata that previously was stored in a storage account in a managed resource group, is now stored on the storage account in your subscription that is associated to the Azure Machine Learning workspace. Since this Azure Storage resource is managed separately in your subscription, you're responsible to configure encryption settings on it.
121-
122-
To opt in for this preview, set the `enableServiceSideCMKEncryption` on a REST API or in your Bicep or Resource Manager template. You can also use Azure portal.
123-
124-
:::image type="content" source="./media/concept-customer-managed-keys/cmk-service-side-encryption.png" alt-text="Screenshot of the encryption tab with the option for server side encryption selected." lightbox="./media/concept-customer-managed-keys/cmk-service-side-encryption.png":::
125-
126-
> [!NOTE]
127-
> - During this preview key rotation and data labeling capabilities are not supported. Server-side encryption is currently not supported in reference to an Azure Key Vault for storing your encryption key that has public network access disabled.
128-
> - If you are using the preview server-side storage, Azure charges will continue to accrue during the soft delete retention period.
129-
130-
For templates that create a workspace with service-side encryption of metadata, see
131-
132-
- [Bicep template for creating default workspace](https://github.com/azure/azure-quickstart-templates/tree/master/quickstarts/microsoft.machinelearningservices/machine-learning-workspace-cmk-service-side-encryption).
133-
- [Bicep template for creating hub workspace](https://github.com/Azure/azure-quickstart-templates/tree/master/quickstarts/microsoft.machinelearningservices/aistudio-cmk-service-side-encryption).
134134

135135
## High business impact (HBI) configuration
136136
