Azure Machine Learning **Data collector** provides real-time logging of input and output data from models that are deployed to managed online endpoints or Kubernetes online endpoints. Azure Machine Learning stores the logged inference data in Azure Blob Storage. This data can then be used for model monitoring, debugging, or auditing, thereby providing observability into the performance of your deployed models.
Data collector provides:
@@ -43,9 +41,9 @@ Data collector can be configured at the deployment level, and the configuration
Data collector has the following limitations:
- Data collector only supports logging for online (or real-time) Azure Machine Learning endpoints (Managed or Kubernetes).
-- The Data collector Python SDK only supports logging tabular data via `pandas DataFrames`.
+- The Data collector Python SDK only supports logging tabular data via pandas DataFrames.
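Because the SDK route logs pandas DataFrames, collectors are typically created in the scoring script's `init()` and invoked in `run()`. The following is a minimal sketch based on the `azureml-ai-monitoring` package; the request payload shape and the `predict` helper are hypothetical stand-ins, and the collector names must match the collections enabled on the deployment:

```python
import json

import pandas as pd
from azureml.ai.monitoring import Collector


def init():
    global inputs_collector, outputs_collector
    # Collector names must match the collection names configured on the deployment.
    inputs_collector = Collector(name="model_inputs")
    outputs_collector = Collector(name="model_outputs")


def run(raw_data):
    # Hypothetical request payload: {"data": {"col1": [1, 2], "col2": [3, 4]}}
    input_df = pd.DataFrame(json.loads(raw_data)["data"])

    # Log the inputs and keep the returned context so that inputs and
    # outputs can be correlated later.
    context = inputs_collector.collect(input_df)

    output_df = predict(input_df)  # predict() is a hypothetical scoring helper

    # Log the outputs against the same correlation context.
    outputs_collector.collect(output_df, context)
    return output_df.to_dict()
```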
## Next steps
-- [How to collect data from models in production (preview)](how-to-collect-production-data.md)
+- [How to collect data from models in production](how-to-collect-production-data.md)
- [What are Azure Machine Learning endpoints?](concept-endpoints.md)

articles/machine-learning/how-to-collect-production-data.md
In this article, you learn how to use Azure Machine Learning **Data collector** to collect production inference data from a model that is deployed to an Azure Machine Learning managed online endpoint or a Kubernetes online endpoint.
You can enable data collection for new or existing online endpoint deployments. Azure Machine Learning data collector logs inference data in Azure Blob Storage. Data collected with the Python SDK is automatically registered as a data asset in your Azure Machine Learning workspace. This data asset can be used for model monitoring.
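Data collection is switched on in the deployment definition. As a minimal sketch using the `azure-ai-ml` (Python SDK v2) package, with placeholder workspace, endpoint, and model names, and omitting other deployment settings (such as code and environment configuration) that you'd normally supply:

```python
from azure.ai.ml import MLClient
from azure.ai.ml.entities import (
    DataCollector,
    DeploymentCollection,
    ManagedOnlineDeployment,
)
from azure.identity import DefaultAzureCredential

# Placeholder workspace coordinates -- substitute your own.
ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="<SUBSCRIPTION_ID>",
    resource_group_name="<RESOURCE_GROUP>",
    workspace_name="<WORKSPACE>",
)

# Enable collection for model inputs and outputs. The collection names
# here must match the Collector names used in the scoring script.
deployment = ManagedOnlineDeployment(
    name="blue",
    endpoint_name="my-endpoint",   # hypothetical endpoint name
    model="azureml:my-model:1",    # hypothetical registered model
    instance_type="Standard_DS3_v2",
    instance_count=1,
    data_collector=DataCollector(
        collections={
            "model_inputs": DeploymentCollection(enabled="true"),
            "model_outputs": DeploymentCollection(enabled="true"),
        }
    ),
)
ml_client.online_deployments.begin_create_or_update(deployment).result()
```

The same collections can be declared under a `data_collector` section of the deployment YAML if you deploy with the CLI.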
If you're interested in collecting production inference data for an MLflow model that is deployed to a real-time endpoint, see [Data collection for MLflow models](#collect-data-for-mlflow-models).
@@ -482,7 +480,7 @@ To view the collected data in Blob Storage from the studio UI:
If you're deploying an MLflow model to an Azure Machine Learning online endpoint, you can enable production inference data collection with a single toggle in the studio UI. If data collection is toggled on, Azure Machine Learning auto-instruments your scoring script with custom logging code to ensure that the production data is logged to your workspace Blob Storage. Your model monitors can then use the data to monitor the performance of your MLflow model in production.
-While you're configuring the deployment of your model, you can enable production data collection. Under the **Deployment** tab, select **Enabled** for **Data collection (preview)**.
+While you're configuring the deployment of your model, you can enable production data collection. Under the **Deployment** tab, select **Enabled** for **Data collection**.
After you've enabled data collection, production inference data will be logged to your Azure Machine Learning workspace Blob Storage and two data assets will be created with names `<endpoint_name>-<deployment_name>-model_inputs` and `<endpoint_name>-<deployment_name>-model_outputs`. These data assets are updated in real time as you use your deployment in production. Your model monitors can then use the data assets to monitor the performance of your model in production.
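Because the collected data is registered as data assets with the names described above, one way to locate the files is to look an asset up by its generated name. A brief sketch, again with hypothetical endpoint (`my-endpoint`) and deployment (`blue`) names:

```python
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="<SUBSCRIPTION_ID>",
    resource_group_name="<RESOURCE_GROUP>",
    workspace_name="<WORKSPACE>",
)

# Assets follow the pattern <endpoint_name>-<deployment_name>-model_inputs
# (and -model_outputs).
inputs_asset = ml_client.data.get(name="my-endpoint-blue-model_inputs", label="latest")
print(inputs_asset.path)  # Blob Storage location of the collected data
```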
articles/machine-learning/how-to-deploy-models-llama.md (1 addition & 1 deletion)
@@ -401,7 +401,7 @@ Follow these steps to deploy a model such as `Llama-2-7b-chat` to a real-time en
1. Select the **Virtual machine** and the **Instance count** that you want to assign to the deployment.
1. Select if you want to create this deployment as part of a new endpoint or an existing one. Endpoints can host multiple deployments while keeping resource configuration exclusive for each of them. Deployments under the same endpoint share the endpoint URI and its access keys.
-1. Indicate if you want to enable **Inferencing data collection (preview)**.
+1. Indicate if you want to enable **Inferencing data collection**.
1. Indicate if you want to enable **Package Model (preview)**.
1. Select **Deploy**. After a few moments, the endpoint's **Details** page opens up.
1. Wait for the endpoint creation and deployment to finish. This step can take a few minutes.
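Once the deployment is live, it can be exercised like any other managed online endpoint. As a rough sketch with the Python SDK v2, where the endpoint and deployment names and the request file are hypothetical and the payload must match the schema the deployed model expects:

```python
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="<SUBSCRIPTION_ID>",
    resource_group_name="<RESOURCE_GROUP>",
    workspace_name="<WORKSPACE>",
)

# sample_chat_request.json is a hypothetical file holding a JSON payload
# in the format the deployed model's scoring script expects.
response = ml_client.online_endpoints.invoke(
    endpoint_name="my-llama-endpoint",
    deployment_name="default",
    request_file="sample_chat_request.json",
)
print(response)
```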
articles/machine-learning/prompt-flow/how-to-deploy-for-real-time-inference.md (1 addition & 1 deletion)
@@ -99,7 +99,7 @@ This step allows you to configure the basic settings of the deployment.
|Deployment name| - Within the same endpoint, deployment name should be unique. <br> - If you select an existing endpoint, and input an existing deployment name, then that deployment will be overwritten with the new configurations. |
|Virtual machine| The VM size to use for the deployment. For the list of supported sizes, see [Managed online endpoints SKU list](../reference-managed-online-endpoints-vm-sku-list.md).|
|Instance count| The number of instances to use for the deployment. Specify the value based on the workload you expect. For high availability, we recommend that you set the value to at least 3. We reserve an extra 20% for performing upgrades. For more information, see [managed online endpoints quotas](../how-to-manage-quotas.md#azure-machine-learning-online-endpoints-and-batch-endpoints).|
-|Inference data collection (preview)| If you enable this, the flow inputs and outputs will be auto collected in an Azure Machine Learning data asset, and can be used for later monitoring. To learn more, see [how to monitor generative ai applications](how-to-monitor-generative-ai-applications.md).|
+|Inference data collection| If you enable this, the flow inputs and outputs will be auto collected in an Azure Machine Learning data asset, and can be used for later monitoring. To learn more, see [how to monitor generative ai applications](how-to-monitor-generative-ai-applications.md).|
|Application Insights diagnostics| If you enable this, system metrics during inference time (such as token count, flow latency, and flow requests) will be collected into the workspace's default Application Insights. To learn more, see [prompt flow serving metrics](#view-prompt-flow-endpoints-specific-metrics-optional).|