
Commit e9690f5

Merge pull request #688 from MicrosoftDocs/main
10/07/2024 PM Publish
2 parents 59f251b + eb04bbc commit e9690f5

30 files changed: +836 −487 lines

articles/ai-services/openai/how-to/batch.md

Lines changed: 1 addition & 1 deletion

@@ -85,7 +85,7 @@ In the Studio UI the deployment type will appear as `Global-Batch`.
 > [!TIP]
 > Each line of your input file for batch processing has a `model` attribute that requires a global batch **deployment name**. For a given input file, all names must be the same deployment name. This is different from OpenAI where the concept of model deployments does not exist.
 >
-> For the best performance we recommend submitting large files for patch processing, rather than a large number of small files with only a few lines in each file.
+> For the best performance we recommend submitting large files for batch processing, rather than a large number of small files with only a few lines in each file.

 ::: zone pivot="programming-language-ai-studio"
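The tip corrected in this hunk can be illustrated with a short sketch that builds a batch input file (JSONL) where every line's `model` attribute carries the same global batch deployment name. The deployment name and request contents below are hypothetical, and the request schema is a common batch-input shape, not taken from this diff:

```python
import json

def build_batch_input(requests, deployment_name):
    """Build JSONL lines for batch processing; every line's `model`
    must carry the same global batch deployment name."""
    lines = []
    for i, messages in enumerate(requests):
        lines.append(json.dumps({
            "custom_id": f"task-{i}",          # caller-chosen request ID
            "method": "POST",
            "url": "/chat/completions",
            "body": {"model": deployment_name, "messages": messages},
        }))
    return "\n".join(lines)

# Hypothetical deployment name; per the tip, prefer one large file over
# many small files with only a few lines each.
jsonl = build_batch_input(
    [[{"role": "user", "content": "Hello"}],
     [{"role": "user", "content": "Ping"}]],
    deployment_name="my-global-batch-deployment",
)
```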

articles/ai-services/translator/service-limits.md

Lines changed: 3 additions & 3 deletions

@@ -7,7 +7,7 @@ author: laujan
 manager: nitinme
 ms.service: azure-ai-translator
 ms.topic: conceptual
-ms.date: 09/26/2024
+ms.date: 10/07/2024
 ms.author: lajanuar
 ---

@@ -78,7 +78,7 @@ The Translator has a maximum latency of 15 seconds using standard models and 120
 |Total number of files.|≤ 1000 |
 |Total content size in a batch | ≤ 250 MB|
 |Number of target languages in a batch| ≤ 10 |
-|Size of Translation memory file| ≤ 10 MB|
+|Size of glossary file| ≤ 10 MB|

 ##### Synchronous operation limits

@@ -87,7 +87,7 @@ The Translator has a maximum latency of 15 seconds using standard models and 120
 |Document size| ≤ 10 MB |
 |Total number of files.|1 |
 |Total number of target languages | 1|
-|Size of Translation memory file| ≤ 1 MB|
+|Size of glossary file| ≤ 1 MB|
 |Translated character limit|6 million characters per minute (cpm).|

 ## Next steps
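The glossary limits corrected in these hunks (≤ 10 MB for batch operations, ≤ 1 MB for synchronous operations) can be checked client-side before submission. A minimal sketch; the helper name and the limit table are written here for illustration, not part of any Translator SDK:

```python
# Documented Translator glossary limits, in bytes (assumes 1 MB = 1024 * 1024).
GLOSSARY_LIMIT_BYTES = {
    "batch": 10 * 1024 * 1024,       # asynchronous (batch) operations
    "synchronous": 1 * 1024 * 1024,  # synchronous operations
}

def glossary_within_limit(size_bytes: int, operation: str) -> bool:
    """Return True if a glossary file of `size_bytes` fits the limit
    for the given operation type."""
    return size_bytes <= GLOSSARY_LIMIT_BYTES[operation]

print(glossary_within_limit(5 * 1024 * 1024, "batch"))        # within 10 MB
print(glossary_within_limit(5 * 1024 * 1024, "synchronous"))  # exceeds 1 MB
```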

articles/ai-studio/concepts/architecture.md

Lines changed: 1 addition & 1 deletion

@@ -74,7 +74,7 @@ While most of the resources used by Azure AI Studio live in your Azure subscript
 - **Metadata storage**: Provided by Azure Storage resources in the Microsoft subscription.

 > [!NOTE]
-> If you use customer-managed keys, the metadata storage resources are created in your subscription. For more information, see [Customer-managed keys](../../ai-services/encryption/cognitive-services-encryption-keys-portal.md?context=/azure/ai-studio/context/context).
+> If you use customer-managed keys, the metadata storage resources are created in your subscription. For more information, see [Customer-managed keys](encryption-keys-portal.md).

 Managed compute resources and managed virtual networks exist in the Microsoft subscription, but you manage them. For example, you control which VM sizes are used for compute resources, and which outbound rules are configured for the managed virtual network.

articles/ai-studio/concepts/encryption-keys-portal.md (new file)

Lines changed: 103 additions & 0 deletions

@@ -0,0 +1,103 @@
---
title: Customer-Managed Keys for Azure AI Studio
titleSuffix: Azure AI Studio
description: Learn about using customer-managed keys for encryption to improve data security with Azure AI Studio.
author: Blackmist
ms.author: larryfr
ms.service: azure-ai-services
ms.custom:
  - ignite-2023
ms.topic: concept-article
ms.date: 10/7/2024
ms.reviewer: deeikele
# Customer intent: As an admin, I want to understand how I can use my own encryption keys with Azure AI Studio.
---

# Customer-managed keys for encryption with Azure AI Studio

Customer-managed keys (CMKs) in Azure AI Studio provide enhanced control over the encryption of your data. By using CMKs, you can manage your own encryption keys to add an extra layer of protection and meet compliance requirements more effectively.

## About encryption in Azure AI Studio

Azure AI Studio layers on top of Azure Machine Learning and Azure AI services. By default, these services use Microsoft-managed encryption keys.

Hub and project resources are implementations of the Azure Machine Learning workspace and encrypt data in transit and at rest. For details, see [Data encryption with Azure Machine Learning](../../machine-learning/concept-data-encryption.md).

Azure AI services data is encrypted and decrypted using [FIPS 140-2](https://en.wikipedia.org/wiki/FIPS_140-2) compliant [256-bit AES](https://en.wikipedia.org/wiki/Advanced_Encryption_Standard) encryption. Encryption and decryption are transparent, meaning encryption and access are managed for you. Your data is secure by default, and you don't need to modify your code or applications to take advantage of encryption.

## Data storage in your subscription when using customer-managed keys

Hub resources store metadata in your Azure subscription when using customer-managed keys. Data is stored in a Microsoft-managed resource group that includes an Azure Storage account, an Azure Cosmos DB resource, and an Azure AI Search resource.

> [!IMPORTANT]
> When using a customer-managed key, the costs for your subscription are higher because encrypted data is stored in your subscription. To estimate the cost, use the [Azure pricing calculator](https://azure.microsoft.com/pricing/calculator/).

The encryption key you provide when creating a hub is used to encrypt data that is stored on Microsoft-managed resources. All projects using the same hub store data on the resources in a managed resource group identified by the name `azureml-rg-hubworkspacename_GUID`. Projects use Microsoft Entra ID authentication when interacting with these resources. If your hub has a private link endpoint, network access to the managed resources is restricted. The managed resource group is deleted when the hub is deleted.
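The managed resource group naming pattern documented above (`azureml-rg-hubworkspacename_GUID`) can be sketched as follows. This is purely illustrative; the helper function and the example hub name are not part of any Azure SDK:

```python
import re
import uuid

def managed_rg_name(hub_workspace_name: str, guid: str) -> str:
    """Compose a managed resource group name following the documented
    pattern azureml-rg-hubworkspacename_GUID."""
    return f"azureml-rg-{hub_workspace_name}_{guid}"

# Hypothetical hub name; the GUID portion is generated per hub.
name = managed_rg_name("contoso-hub", str(uuid.uuid4()))
print(name)  # e.g. azureml-rg-contoso-hub_<guid> (GUID varies)
```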
The following data is stored on the managed resources.

|Service|What it's used for|Example|
|-----|-----|-----|
|Azure Cosmos DB|Stores metadata for your Azure AI projects and tools|Index names and tags; flow creation timestamps; deployment tags; evaluation metrics|
|Azure AI Search|Stores indices that are used to help query your AI Studio content.|An index based on your model deployment names|
|Azure Storage Account|Stores instructions for how customization tasks are orchestrated|JSON representation of flows you create in AI Studio|

> [!IMPORTANT]
> Azure AI Studio uses Azure compute that is managed in the Microsoft subscription, for example when you fine-tune models or build flows. Its disks are encrypted with Microsoft-managed keys. Compute is ephemeral, meaning after a task is completed the virtual machine is deprovisioned and the OS disk is deleted. Compute instance machines used for 'Code' experiences are persistent. Azure Disk Encryption isn't supported for the OS disk.

## (Preview) Service-side storage of encrypted data when using customer-managed keys

A new architecture for customer-managed key encryption with hubs is available in preview, which removes the dependency on the managed resource group. In this new model, encrypted data is stored service-side on Microsoft-managed resources instead of in managed resources in your subscription. Metadata is stored in multitenant resources using document-level CMK encryption. An Azure AI Search instance is hosted on the Microsoft side per customer, for each hub. Because of its dedicated resource model, its Azure cost is charged in your subscription via the hub resource.

> [!NOTE]
> During this preview, key rotation and user-assigned identity capabilities aren't supported. Service-side encryption currently doesn't support storing your encryption key in an Azure Key Vault that has public network access disabled.

## Use customer-managed keys with Azure Key Vault

You must use Azure Key Vault to store your customer-managed keys. You can either create your own keys and store them in a key vault, or you can use the Azure Key Vault APIs to generate keys. The Azure AI services resource and the key vault must be in the same region and in the same Microsoft Entra tenant, but they can be in different subscriptions. For more information about Azure Key Vault, see [What is Azure Key Vault?](/azure/key-vault/general/overview).

To enable customer-managed keys, the key vault containing your keys must meet these requirements:

- You must enable both the **Soft Delete** and **Do Not Purge** properties on the key vault.
- If you use the [Key Vault firewall](/azure/key-vault/general/access-behind-firewall), you must allow trusted Microsoft services to access the key vault.
- You must grant your hub's and Azure AI services resource's system-assigned managed identity the following permissions on your key vault: *get key*, *wrap key*, *unwrap key*.

The following limitations hold for Azure AI services:

- Only Azure Key Vault with [legacy access policies](/azure/key-vault/general/assign-access-policy) is supported.
- Only RSA and RSA-HSM keys of size 2048 are supported with Azure AI services encryption. For more information about keys, see **Key Vault keys** in [About Azure Key Vault keys, secrets, and certificates](/azure/key-vault/general/about-keys-secrets-certificates).
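The key constraints listed above (RSA or RSA-HSM, 2048 bits) lend themselves to a simple pre-flight check before you configure encryption. A sketch; the function is illustrative and not an Azure SDK API:

```python
def key_meets_ai_services_requirements(key_type: str, key_size: int) -> bool:
    """Check a Key Vault key against the documented Azure AI services
    constraints: key type RSA or RSA-HSM, key size 2048 bits."""
    return key_type in {"RSA", "RSA-HSM"} and key_size == 2048

print(key_meets_ai_services_requirements("RSA", 2048))      # supported
print(key_meets_ai_services_requirements("RSA-HSM", 2048))  # supported
print(key_meets_ai_services_requirements("EC", 256))        # not supported
```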
### Enable your Azure AI services resource's managed identity

If you connect with Azure AI services, or variants of Azure AI services such as Azure OpenAI, you need to enable a managed identity as a prerequisite for using customer-managed keys.

1. Go to your Azure AI services resource.
1. On the left, under **Resource Management**, select **Identity**.
1. Switch the system-assigned managed identity status to **On**.
1. Save your changes, and confirm that you want to enable the system-assigned managed identity.

## Enable customer-managed keys

Azure AI Studio builds on the hub as an implementation of the Azure Machine Learning workspace and Azure AI services, and lets you connect with other resources in Azure. You must set encryption specifically on each resource.

Customer-managed key encryption is configured via the Azure portal in a similar way for each Azure resource:

1. Create a new Azure resource in the Azure portal.
1. Under the encryption tab, select your encryption key.

:::image type="content" source="../../machine-learning/media/concept-customer-managed-keys/cmk-service-side-encryption.png" alt-text="Screenshot of the encryption tab with the option for server side encryption selected." lightbox="../../machine-learning/media/concept-customer-managed-keys/cmk-service-side-encryption.png":::

Alternatively, use infrastructure-as-code options for automation. Example Bicep templates for Azure AI Studio are available in the Azure Quickstart repo:

1. [CMK encryption for hub](https://github.com/Azure/azure-quickstart-templates/tree/master/quickstarts/microsoft.machinelearningservices/aistudio-cmk)
1. [Service-side CMK encryption preview for hub](https://github.com/azure/azure-quickstart-templates/tree/master/quickstarts/microsoft.machinelearningservices/machine-learning-workspace-cmk-service-side-encryption)

## Limitations

* The customer-managed key for encryption can only be updated to keys in the same Azure Key Vault instance.
* After deployment, hubs can't switch from Microsoft-managed keys to customer-managed keys or vice versa.
* The [Azure AI services Customer-Managed Key Request Form](https://aka.ms/cogsvc-cmk) is required to use customer-managed keys in combination with Azure Speech and Content Moderator capabilities.
* At the time of creation, you can't provide or modify resources that are created in the Microsoft-managed Azure resource group in your subscription.
* You can't delete Microsoft-managed resources used for customer-managed keys without also deleting your hub.

## Related content

* [What is Azure Key Vault?](/azure/key-vault/general/overview)

articles/ai-studio/toc.yml

Lines changed: 2 additions & 2 deletions

@@ -324,10 +324,10 @@ items:
     href: how-to/troubleshoot-secure-connection-project.md
 - name: Data protection & encryption
   items:
+  - name: Configure customer-managed keys
+    href: concepts/encryption-keys-portal.md
   - name: Rotate keys
     href: ../ai-services/rotate-keys.md?context=/azure/ai-studio/context/context
-  - name: Configure customer-managed keys
-    href: ../ai-services/encryption/cognitive-services-encryption-keys-portal.md?context=/azure/ai-studio/context/context
 - name: Vulnerability management
   href: concepts/vulnerability-management.md
 - name: Disaster recovery

articles/machine-learning/concept-automl-forecasting-at-scale.md

Lines changed: 2 additions & 2 deletions

@@ -28,7 +28,7 @@ The many models [components](concept-component.md) in AutoML enable you to train
 The many models training component applies AutoML's [model sweeping and selection](concept-automl-forecasting-sweeping.md) independently to each store in this example. This model independence aids scalability and can benefit model accuracy especially when the stores have diverging sales dynamics. However, a single model approach may yield more accurate forecasts when there are common sales dynamics. See the [distributed DNN training](#distributed-dnn-training-preview) section for more details on that case.

-You can configure the data partitioning, the [AutoML settings](how-to-auto-train-forecast.md#configure-experiment) for the models, and the degree of parallelism for many models training jobs. For examples, see our guide section on [many models components](how-to-auto-train-forecast.md#forecasting-at-scale-many-models).
+You can configure the data partitioning, the [AutoML settings](how-to-auto-train-forecast.md#configure-experiment) for the models, and the degree of parallelism for many models training jobs. For examples, see our guide section on [many models components](how-to-auto-train-forecast.md#forecast-at-scale-many-models).

 ## Hierarchical time series forecasting

@@ -49,7 +49,7 @@ AutoML supports the following features for hierarchical time series (HTS):
 * **Retrieving quantile/probabilistic forecasts for levels at or "below" the training level**. Current modeling capabilities support disaggregation of probabilistic forecasts.

 HTS components in AutoML are built on top of [many models](#many-models), so HTS shares the scalable properties of many models.
-For examples, see our guide section on [HTS components](how-to-auto-train-forecast.md#forecasting-at-scale-hierarchical-time-series).
+For examples, see our guide section on [HTS components](how-to-auto-train-forecast.md#forecast-at-scale-hierarchical-time-series).

 ## Distributed DNN training (preview)
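The link edits in this file and the next track renamed headings in how-to-auto-train-forecast.md: an in-page anchor is derived mechanically from its heading text, so renaming a heading changes the anchor. A small sketch of the common GitHub-style slug convention (the exact rules used by the docs platform are an assumption here):

```python
import re

def heading_to_anchor(heading: str) -> str:
    """Approximate GitHub-style anchor slug: lowercase the heading,
    drop punctuation, and replace spaces with hyphens."""
    slug = heading.lower()
    slug = re.sub(r"[^\w\- ]", "", slug)  # drop punctuation such as ':' or '('
    return slug.replace(" ", "-")

print(heading_to_anchor("Forecast at scale: many models"))
# forecast-at-scale-many-models
```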

articles/machine-learning/concept-automl-forecasting-deep-learning.md

Lines changed: 4 additions & 4 deletions

@@ -78,7 +78,7 @@ In the table, $n_{\text{input}} = n_{\text{features}} + 1$, the number of predic
 ## TCNForecaster in AutoML

-TCNForecaster is an optional model in AutoML. To learn how to use it, see [enable deep learning](./how-to-auto-train-forecast.md#enable-deep-learning).
+TCNForecaster is an optional model in AutoML. To learn how to use it, see [enable deep learning](./how-to-auto-train-forecast.md#enable-learning-for-deep-neural-networks).

 In this section, we describe how AutoML builds TCNForecaster models with your data, including explanations of data preprocessing, training, and model search.

@@ -90,9 +90,9 @@ AutoML executes several preprocessing steps on your data to prepare for model tr
 |--|--|
 |Fill missing data|[Impute missing values and observation gaps](./concept-automl-forecasting-methods.md#missing-data-handling) and optionally [pad or drop short time series](./how-to-auto-train-forecast.md#short-series-handling)|
 |Create calendar features|Augment the input data with [features derived from the calendar](./concept-automl-forecasting-calendar-features.md) like day of the week and, optionally, holidays for a specific country/region.|
-|Encode categorical data|[Label encode](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.LabelEncoder.html) strings and other categorical types; this includes all [time series ID columns](./how-to-auto-train-forecast.md#forecasting-job-settings).|
+|Encode categorical data|[Label encode](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.LabelEncoder.html) strings and other categorical types; this includes all [time series ID columns](./how-to-auto-train-forecast.md#forecast-job-settings).|
 |Target transform|Optionally apply the natural logarithm function to the target depending on the results of certain statistical tests.|
-|Normalization|[Z-score normalize](https://en.wikipedia.org/wiki/Standard_score) all numeric data; normalization is performed per feature and per time series group, as defined by the [time series ID columns](./how-to-auto-train-forecast.md#forecasting-job-settings).|
+|Normalization|[Z-score normalize](https://en.wikipedia.org/wiki/Standard_score) all numeric data; normalization is performed per feature and per time series group, as defined by the [time series ID columns](./how-to-auto-train-forecast.md#forecast-job-settings).|

 These steps are included in AutoML's transform pipelines, so they're automatically applied when needed at inference time. In some cases, the inverse operation to a step is included in the inference pipeline. For example, if AutoML applied a $\log$ transform to the target during training, the raw forecasts are exponentiated in the inference pipeline.
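The per-group z-score normalization described in the preprocessing table above can be sketched in plain Python. Illustrative only, not AutoML's actual implementation; groups stand in for the time series ID columns:

```python
from statistics import mean, pstdev

def zscore_by_group(values, group_ids):
    """Z-score normalize `values` separately within each time series group,
    mirroring per-group normalization keyed by time series ID columns."""
    groups = {}
    for v, g in zip(values, group_ids):
        groups.setdefault(g, []).append(v)
    # Per-group mean and (population) standard deviation.
    stats = {g: (mean(vs), pstdev(vs)) for g, vs in groups.items()}
    return [(v - stats[g][0]) / stats[g][1] for v, g in zip(values, group_ids)]

# Two series on very different scales normalize to the same range per group.
normalized = zscore_by_group([1.0, 3.0, 10.0, 30.0], ["A", "A", "B", "B"])
print(normalized)  # [-1.0, 1.0, -1.0, 1.0]
```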

@@ -104,7 +104,7 @@ The following table lists and describes input settings and parameters for TCNFor
 |Training input|Description|Value|
 |--|--|--|
-|Validation data|A portion of data that is held out from training to guide the network optimization and mitigate over fitting.| [Provided by the user](./how-to-auto-train-forecast.md#training-and-validation-data) or automatically created from training data if not provided.|
+|Validation data|A portion of data that is held out from training to guide the network optimization and mitigate over fitting.| [Provided by the user](./how-to-auto-train-forecast.md#prepare-training-and-validation-data) or automatically created from training data if not provided.|
 |Primary metric|Metric computed from median-value forecasts on the validation data at the end of each training epoch; used for early stopping and model selection.|[Chosen by the user](./how-to-auto-train-forecast.md#configure-experiment); normalized root mean squared error or normalized mean absolute error.|
 |Training epochs|Maximum number of epochs to run for network weight optimization.|100; automated early stopping logic may terminate training at a smaller number of epochs.|
 |Early stopping patience|Number of epochs to wait for primary metric improvement before training is stopped.|20|
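The early stopping behavior described by the last two table rows (at most 100 epochs, patience of 20 epochs without improvement) can be sketched as a simple loop. Illustrative, not the actual AutoML trainer:

```python
def epochs_run(metric_per_epoch, max_epochs=100, patience=20):
    """Simulate early stopping: stop once `patience` epochs pass without
    improvement in the primary metric (lower is better, e.g. normalized RMSE)."""
    best = float("inf")
    since_improvement = 0
    for epoch, metric in enumerate(metric_per_epoch[:max_epochs], start=1):
        if metric < best:
            best, since_improvement = metric, 0
        else:
            since_improvement += 1
        if since_improvement >= patience:
            return epoch  # stopped early at this epoch
    return min(len(metric_per_epoch), max_epochs)

# Metric improves for 5 epochs, then plateaus: training stops at epoch 25
# (5 improving epochs + 20 patience epochs).
history = [1.0 - 0.1 * i for i in range(5)] + [0.6] * 95
print(epochs_run(history))  # 25
```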
