Remove additional data collection preview references

msakande · msakande · commit 6166a305b080 · 2024-04-15T15:40:53.000-05:00
diff --git a/articles/machine-learning/how-to-deploy-models-llama.md b/articles/machine-learning/how-to-deploy-models-llama.md
@@ -401,7 +401,7 @@ Follow these steps to deploy a model such as `Llama-2-7b-chat` to a real-time en
 
 1. Select the **Virtual machine** and the **Instance count** that you want to assign to the deployment.
 1. Select if you want to create this deployment as part of a new endpoint or an existing one. Endpoints can host multiple deployments while keeping resource configuration exclusive for each of them. Deployments under the same endpoint share the endpoint URI and its access keys.
-1. Indicate if you want to enable **Inferencing data collection (preview)**.
+1. Indicate if you want to enable **Inferencing data collection**.
 1. Indicate if you want to enable **Package Model (preview)**.
 1. Select **Deploy**. After a few moments, the endpoint's **Details** page opens up.
 1. Wait for the endpoint creation and deployment to finish. This step can take a few minutes.
diff --git a/articles/machine-learning/prompt-flow/how-to-deploy-for-real-time-inference.md b/articles/machine-learning/prompt-flow/how-to-deploy-for-real-time-inference.md
@@ -99,7 +99,7 @@ This step allows you to configure the basic settings of the deployment.
 |Deployment name| - Within the same endpoint, deployment name should be unique. <br> - If you select an existing endpoint, and input an existing deployment name, then that deployment will be overwritten with the new configurations. |
 |Virtual machine| The VM size to use for the deployment. For the list of supported sizes, see [Managed online endpoints SKU list](../reference-managed-online-endpoints-vm-sku-list.md).|
 |Instance count| The number of instances to use for the deployment. Specify the value on the workload you expect. For high availability, we recommend that you set the value to at least 3. We reserve an extra 20% for performing upgrades. For more information, see [managed online endpoints quotas](../how-to-manage-quotas.md#azure-machine-learning-online-endpoints-and-batch-endpoints)|
-|Inference data collection (preview)| If you enable this, the flow inputs and outputs will be auto collected in an Azure Machine Learning data asset, and can be used for later monitoring. To learn more, see [how to monitor generative ai applications.](how-to-monitor-generative-ai-applications.md)|
+|Inference data collection| If you enable this, the flow inputs and outputs will be auto collected in an Azure Machine Learning data asset, and can be used for later monitoring. To learn more, see [how to monitor generative ai applications.](how-to-monitor-generative-ai-applications.md)|
 |Application Insights diagnostics| If you enable this, system metrics during inference time (such as token count, flow latency, flow request, and etc.) will be collected into workspace default Application Insights. To learn more, see [prompt flow serving metrics](#view-prompt-flow-endpoints-specific-metrics-optional).|