edit pass: monitor-azure-machine-learning

ShawnJackson · ShawnJackson · commit bf86c3219ad1 · 2025-03-28T17:33:31.000-05:00
diff --git a/learn-pr/advocates/monitor-azure-machine-learning/7-azure-machine-learning-model-monitoring.yml b/learn-pr/advocates/monitor-azure-machine-learning/7-azure-machine-learning-model-monitoring.yml
@@ -1,8 +1,8 @@
 ### YamlMime:ModuleUnit
 uid: learn.introduction-to-azure-machine-learning-monitoring.azure-machine-learning-model-monitoring
-title: Monitoring Azure Machine Learning models
+title: Azure Machine Learning model monitoring
 metadata:
-  title: Monitoring Azure Machine Learning Models
+  title: Azure Machine Learning Model Monitoring
   description: Learn how to monitor Azure Machine Learning models.
   ms.date: 03/15/2025
   author: Orin-Thomas
diff --git a/learn-pr/advocates/monitor-azure-machine-learning/9-knowledge-check.yml b/learn-pr/advocates/monitor-azure-machine-learning/9-knowledge-check.yml
@@ -17,17 +17,17 @@ quiz:
         - content: "Monitor Azure Machine Learning performance."
           isCorrect: true
           explanation: "Administrators are primarily responsible for the Azure infrastructure on which an Azure Machine Learning workspace and models run."
-        - content: "Monitor workflow issues."
+        - content: "Monitor workflow problems."
           isCorrect: false
           explanation: "Both administrators and machine learning professionals are responsible for monitoring problems and performance related to Azure Machine Learning workspaces."
         - content: "Monitor Azure Machine Learning models."
           isCorrect: false
           explanation: "Machine learning professionals and data scientists are primarily responsible for monitoring model performance."
-    - content: "You can use Kusto Query Language (KQL) to query the Log Analytics workspace that's associated with your Azure Machine Learning workspace. What is a necessary first step for accomplishing this goal?"
+    - content: "You can use KQL to query the Log Analytics workspace that's associated with your Azure Machine Learning workspace. What is a necessary first step for accomplishing this goal?"
       choices:
         - content: "Log Analytics is configured by default."
           isCorrect: false
-          explanation: "Only metrics are enabled by default on Azure Machine Learning. Logs can be exported to Log Analytics by setting up diagnostic settings."
+          explanation: "Only metrics are enabled by default on Azure Machine Learning. You can export logs to Log Analytics by setting up diagnostic settings."
         - content: "Configure diagnostic settings."
           isCorrect: true
           explanation: "By configuring diagnostic settings, you can enable logs to be exported to a Log Analytics workspace."
diff --git a/learn-pr/advocates/monitor-azure-machine-learning/includes/1-introduction.md b/learn-pr/advocates/monitor-azure-machine-learning/includes/1-introduction.md
@@ -1,10 +1,10 @@
-Azure Machine Learning is a cloud service for managing the life cycles of machine learning projects. Machine Learning professionals, data scientists, and engineers can use Azure Machine Learning to train and deploy models and manage machine learning operations.
+Azure Machine Learning is a cloud service for managing the life cycles of machine learning projects. Machine learning professionals, data scientists, and engineers can use Azure Machine Learning to train and deploy models and manage machine learning operations.
 
 When anyone is monitoring an Azure Machine Learning environment, it's important to have visibility into all resources that might affect performance and AI model quality. Monitoring of Azure Machine Learning consists of the following areas:
 
 - **Azure Machine Learning performance**: Compute resources provide the infrastructure for running a machine learning workflow. They can affect Azure Machine Learning runs, experiments, and overall performance. This area is traditionally for operators and administrators.
-- **Workflow problems**: Throughout the machine learning life cycle, problems and errors might occur during deployment of new models, during the running of a job, or in other circumstances. Both administrators and machine learning professionals might be interested in this area.
-- **Machine learning models**: Data drift, model prediction drift, data quality, and feature attribution drift can lead to outdated models and cause your AI systems to become obsolete. Machine learning professionals and data scientists are the traditional owners of this monitoring.
+- **Workflow problems**: Throughout the life cycle of machine learning, problems and errors might occur during deployment of new models, during the running of a job, or in other circumstances. Both administrators and machine learning professionals might be interested in this area.
+- **Machine learning models**: Data drift, model prediction drift, poor data quality, and feature attribution drift can lead to outdated models and cause AI systems to become obsolete. Machine learning professionals and data scientists are the traditional owners of this monitoring.
 
 Azure Monitor is the primary tool for managing an Azure Machine Learning environment. Azure Monitor provides built-in capabilities to monitor performance and workflow problems in Azure Machine Learning. You can also expand these capabilities for your own needs.
 
diff --git a/learn-pr/advocates/monitor-azure-machine-learning/includes/10-summary.md b/learn-pr/advocates/monitor-azure-machine-learning/includes/10-summary.md
@@ -1,7 +1,6 @@
-This module gave you an overview of monitoring in Azure Machine Learning. You learned that Azure Machine Learning consists of resources in Azure, such as compute and workspace, and how to check metrics for these resources. You also learned how to enable diagnostic settings for an Azure Machine Learning workspace, so that you can query logs from the Azure Machine Learning resources. Finally, you learned why it's important to monitor the Machine Learning models to prevent them from becoming obsolete and stale.
+This module gave you an overview of monitoring in Azure Machine Learning. You learned that Azure Machine Learning consists of resources in Azure, such as compute and workspace, and how to check metrics for these resources. You also learned how to enable diagnostic settings for an Azure Machine Learning workspace, so that you can query logs from the Azure Machine Learning resources. Finally, you learned why it's important to monitor the machine learning models to prevent them from becoming obsolete and stale.
 
 ## Further reading/learning
 
-- [Monitor Azure Machine Learning](/azure/machine-learning/monitor-azure-machine-learning-reference)
 - [Azure Machine Learning model monitoring](/azure/machine-learning/concept-model-monitoring)
 - [Azure Machine Learning monitoring data reference](/azure/machine-learning/monitor-azure-machine-learning-reference)
diff --git a/learn-pr/advocates/monitor-azure-machine-learning/includes/2-monitoring-azure-machine-learning-workspace-compute.md b/learn-pr/advocates/monitor-azure-machine-learning/includes/2-monitoring-azure-machine-learning-workspace-compute.md
@@ -20,7 +20,7 @@ A compute is a designated resource where you run your job or host your endpoint.
 - **Compute cluster**: This type of compute is a managed infrastructure where you can create a cluster of CPU or GPU compute nodes.
 - **Serverless compute**: When you use serverless compute, you don't need to create your own cluster. All compute life-cycle management is offloaded to Azure Machine Learning.
 - **Kubernetes cluster**: This type of compute is a used to deploy trained machine learning models to Azure Kubernetes Service (AKS). You can create an AKS cluster from your Azure Machine Learning workspace or attach an existing AKS cluster.
-- **Attached (Unmanaged) compute**: You can attach your own compute resources to your workspace and use them for training and inference. Virtual machines (VM) are supported, along with services such as Azure Databricks and Azure HDInsight. These are unmanaged compute resources. As such, they can require extra steps for you to maintain or improve performance for machine learning workloads.
+- **Attached (unmanaged) compute**: You can attach your own compute resources to your workspace and use them for training and inference. Virtual machines are supported, along with services such as Azure Databricks and Azure HDInsight. These are unmanaged compute resources. As such, they can require extra steps for you to maintain or improve performance for machine learning workloads.
 
 For managed compute resources, you can get insights into the performance of the nodes, quota availability, and resilience of the environment directly from Azure Machine Learning.
 
diff --git a/learn-pr/advocates/monitor-azure-machine-learning/includes/3-azure-monitor-platform-metrics.md b/learn-pr/advocates/monitor-azure-machine-learning/includes/3-azure-monitor-platform-metrics.md
@@ -1,8 +1,8 @@
 Azure Monitor collects and aggregates metrics from every component of Azure Machine Learning by default. Azure Monitor platform metrics provide a view of availability, performance, and resilience.
 
-Azure Monitor uses the concept of *resource types* to identify Azure resources. Resource types are also part of the resource IDs for every resource running in Azure. For example, one resource type for Azure Machine Learning is **Microsoft.MachineLearningServices/workspaces**.
+Azure Monitor uses the concept of *resource types* to identify Azure resources. Resource types are also part of the resource ID for every resource running in Azure. For example, one resource type for Azure Machine Learning is **Microsoft.MachineLearningServices/workspaces**.
 
-Azure Monitor organizes core monitoring data into metrics on resource types, also called *namespaces*. Metrics and logs are available for various resource types. The metric categories in the **Microsoft.MachineLearningServices/workspaces** resource are **Model**, **Quota**, **Resource**, **Run**, and **Traffic**. Quota information is for Machine Learning compute only. **Run** provides information on training runs for the workspace.
+Azure Monitor organizes core monitoring data into metrics on resource types, also called *namespaces*. Metrics and logs are available for various resource types. The metric categories in the **Microsoft.MachineLearningServices/workspaces** resource are **Model**, **Quota**, **Resource**, **Run**, and **Traffic**. Quota information is for Machine Learning compute only. The **Run** category provides information on training runs for the workspace.
 
 You can use that data to analyze the performance of your Azure Machine Learning environment. For example, if you want to check how many cores a workspace is consuming:
 
diff --git a/learn-pr/advocates/monitor-azure-machine-learning/includes/4-azure-monitor-resource-logs.md b/learn-pr/advocates/monitor-azure-machine-learning/includes/4-azure-monitor-resource-logs.md
@@ -1,4 +1,4 @@
-Resource logs provide insight into operations that an Azure resource completed, such as Azure Machine Learning. Logs are generated automatically, but you must route them to Log Analytics or a different service to save or query them.
+Resource logs provide insight into operations that an Azure resource (such as Azure Machine Learning) completed. Logs are generated automatically, but you must route them to Log Analytics or a different service to save or query them.
 
 To route Azure Machine Learning logs to Log Analytics, perform the following steps:
 
@@ -34,7 +34,7 @@ After you configure the diagnostic setting, you can query the logs in Log Analyt
 
 1. On the **New Query 1*** tab, select the dropdown menu on the right side to change **Simple mode** to **KQL**. (KQL stands for Kusto Query Language.)
 
-1. In the KQL query editor, you can enter the query that you want to perform. For the example in this module, we check for records for a specific job name:
+1. In the KQL query editor, you can enter the query that you want to perform. The following example checks for records for a specific job name:
 
    ```kusto
    _AmlComputeJobEvent_
@@ -46,4 +46,4 @@ After you configure the diagnostic setting, you can query the logs in Log Analyt
 
 ![Screenshot of the KQL code in a log query in the Azure portal.](../media/log-query.png)
 
-After you run the query, you can analyze the results on the **Results** pane.
+After you run the query, you can analyze the results on the **Results** tab.
diff --git a/learn-pr/advocates/monitor-azure-machine-learning/includes/5-azure-monitor-alerts.md b/learn-pr/advocates/monitor-azure-machine-learning/includes/5-azure-monitor-alerts.md
@@ -4,7 +4,7 @@ There are various types of alerts:
 
 - *Metric alerts* evaluate resource metrics at regular intervals. Metrics can be platform metrics, custom metrics, logs from Azure Monitor converted to metrics, or Application Insights metrics. Metric alerts can also apply multiple conditions and dynamic thresholds.
 - *Log alerts* enable users to use a Log Analytics query to evaluate resource logs at a predefined frequency.
-- *Activity log alerts* trigger when a new activity log event matches defined conditions. For example, activity log alerts can report on your service health and resource health.
+- *Activity log alerts* are triggered when a new activity log event matches defined conditions. For example, activity log alerts can report on service health and resource health.
 
 When you're monitoring an Azure Machine Learning workspace, you might want to get an alert when a model deployment fails, when quota utilization exceeds a threshold, or when nodes are unusable.
 
@@ -26,14 +26,14 @@ To create an alert:
 
 1. For **Alert logic**, you can provide thresholds for when the condition is met to trigger the alert. For **Threshold**, enter **5**. This value means that more than five failed deployments trigger this alert. You can change the variables to meet your needs. When you finish, select **Next**.
 
-1. On the **Actions** tab, make sure that **Use quick actions** is selected and provide the details on the right pane. Then select **Save** > **Next**.
+1. On the **Actions** tab, make sure that **Use quick actions** is selected, and provide the details on the right pane. Then select **Save** > **Next**.
 
 1. On the **Details** tab, confirm the subscription and resource group to use. Select the appropriate severity level. For **Alert rule name**, provide the name of the alert and a description. Then select **Review + Create**.
 
 The **Alerts** dashboard shows information if the alert is triggered.
 
 ![Screenshot of the Alert dashboard in an Azure Machine Learning workspace in the Azure portal.](../media/machine-learning-alerts.png)
 
-If you configure an alert to send an email, you should receive a notification.
+If you configure an alert to send an email, you should receive a notification like the following example.
 
 ![Screenshot that shows the output of an Azure Monitor alert in the Azure portal.](../media/azure-monitor-alert.png)
diff --git a/learn-pr/advocates/monitor-azure-machine-learning/includes/7-azure-machine-learning-model-monitoring.md b/learn-pr/advocates/monitor-azure-machine-learning/includes/7-azure-machine-learning-model-monitoring.md
@@ -12,7 +12,7 @@ Azure Machine Learning provides the following capabilities for continuous model
 - **Ability to define custom monitoring signals**. If the built-in monitoring signals aren't suitable for your business scenario, you can define your own monitoring signal with a custom component.
 - **Flexibility to use production inference data from any source**. If you deploy models outside Azure Machine Learning or you deploy models to batch endpoints, you can still collect production inference data yourself to use in Azure Machine Learning model monitoring.
 
-Each machine learning model and its use cases are unique. Therefore, model monitoring is unique for each situation. Here are recommended best practices for model monitoring:
+Each machine learning model and its use cases are unique. Therefore, model monitoring is unique for each situation. Here are best practices for model monitoring:
 
 - **Start model monitoring immediately after you deploy a model to production**.
 - **Work with data scientists who are familiar with the model to set up monitoring**. Data scientists who have insight into the model and its use cases can recommend monitoring signals and metrics. They can set the right alert thresholds for each metric to avoid alert fatigue.