Skip to content

Commit 5927913

Browse files
committed
edit pass: monitor-azure-machine-learning
1 parent 8406689 commit 5927913

File tree

7 files changed

+97
-66
lines changed

7 files changed

+97
-66
lines changed

learn-pr/advocates/monitor-azure-machine-learning/includes/10-summary.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
This module provides an overview of monitoring Azure Machine Learning. You learned that Azure Machine Learning is composed by different resources in Azure, such as Compute and Workspace, and how to check metrics for these resources. You also learned how to enable Diagnostic Settings for an Azure Machine Learning workspace, allowing you to query logs from the Azure Machine Learning resources. Finally, you learned why it's important to monitor the Machine Learning models to avoid them becoming obsolete and stale.
1+
This module gave you an overview of monitoring Azure Machine Learning. You learned that Azure Machine Learning consists of resources in Azure, such as compute and workspace, and how to check metrics for these resources. You also learned how to enable diagnostic settings for an Azure Machine Learning workspace, so that you can query logs from the Azure Machine Learning resources. Finally, you learned why it's important to monitor the Machine Learning models to prevent them from becoming obsolete and stale.
22

33
## Further reading/learning
44

learn-pr/advocates/monitor-azure-machine-learning/includes/3-azure-monitor-platform-metrics.md

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -7,8 +7,11 @@ Azure Monitor organizes core monitoring data into metrics on resource types, als
77
You can use that data to analyze the performance of your Azure Machine Learning environment. For example, if you want to check how many cores a workspace is consuming:
88

99
1. In the Azure portal, open the Azure Machine Learning resource.
10+
1011
1. On the left menu, expand **Monitoring** and select **Metrics**.
12+
1113
1. On the chart, make sure that **Scope** is set to the Azure Machine Learning resource. Make sure that **Metric Namespace** is set to the namespace of the resource. (You might need to select **Add metric** if no options appear in the graph.)
14+
1215
1. Under **Metric**, scroll down to **Quota** > **Total Cores**.
1316

1417
![Screenshot of the metrics dashboard in the Azure portal.](../media/metrics-dashboard.png)
Lines changed: 33 additions & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -1,34 +1,49 @@
1-
Resource logs provide insight into operations that were done by an Azure resource, such as Azure Machine Learning. Logs are generated automatically, but you must route them to Log Analytics or different service to save or query them. To route Azure Machine Learning logs to Log Analytics, perform the following steps:
1+
Resource logs provide insight into operations that an Azure resource completed, such as Azure Machine Learning. Logs are generated automatically, but you must route them to Log Analytics or a different service to save or query them.
2+
3+
To route Azure Machine Learning logs to Log Analytics, perform the following steps:
24

35
1. In the Azure portal, open the Azure Machine Learning resource.
4-
1. On the left-hand side menu, expand Monitoring and select Diagnostic settings.
5-
1. Select + Add diagnostic setting.
6-
1. On the Diagnostic setting page, provide a name for the Diagnostic setting name.
7-
1. Under Log, select the log categories you want to export. You can also select Metrics to be exported.
8-
1. Under Destination details, select Send to Log Analytics workspace.
9-
1. Select the Subscription and Log Analytics workspace you want to export this data to.
106

11-
![A screenshot of the diagnostic settings in the Azure portal.](../media/diagnostic-setting.png)
7+
1. On the left menu, expand **Monitoring** and select **Diagnostic settings**.
8+
9+
1. Select **Add diagnostic setting**.
10+
11+
1. On the **Diagnostic setting** pane, provide a name for the diagnostic setting.
12+
13+
1. Under **Logs**, select the log categories that you want to export. You can also select metrics to be exported.
14+
15+
1. Under **Destination details**, select **Send to Log Analytics workspace**.
16+
17+
1. Select the subscription and Log Analytics workspace that you want to export this data to.
18+
19+
![Screenshot of the diagnostic settings in the Azure portal.](../media/diagnostic-setting.png)
1220

13-
1. Select Save to save and close the export configuration.
14-
1. Back on the Diagnostic settings page, you should now be able to see the configuration. If needed, you can change the settings for this export by clicking Edit setting.
21+
1. Select **Save** to save and close the export configuration.
1522

16-
Once you configure the Diagnostic setting, you can query the logs in Logs Analytics:
23+
1. Back on the **Diagnostic settings** page, you should now be able to see the configuration. If necessary, you can change the settings for this export by selecting **Edit setting**.
24+
25+
After you configure the diagnostic setting, you can query the logs in Log Analytics:
1726

1827
1. In the Azure portal, open the Azure Machine Learning resource.
19-
1. On the left-hand side menu, expand Monitoring and select Logs.
20-
1. If the Queries hub opens, you can close it. (Queries Hub provides a sample of queries you can use. They have the context of the resource type you're looking into and it's an easy way to get started with Logs Analytics.)
21-
1. On the New Query 1* tab, select the drop-down menu on the right-hand side to change Simple mode to KQL. (KQL stands for Justo Query Language)
22-
1. In the KQL query editor, you can type the query you want to perform. For our example, we check for records for a specific job name:
28+
29+
1. On the left menu, expand **Monitoring** and select **Logs**.
30+
31+
1. If **Queries Hub** opens, you can close it.
32+
33+
**Queries Hub** provides a sample of queries that you can use. The queries have the context of the resource type that you're looking into. It's an easy way to get started with Log Analytics.
34+
35+
1. On the **New Query 1*** tab, select the dropdown menu on the right side to change **Simple mode** to **KQL**. (KQL stands for Kusto Query Language.)
36+
37+
1. In the KQL query editor, you can enter the query that you want to perform. For the example in this module, we check for records for a specific job name:
2338

2439
```kusto
2540
_AmlComputeJobEvent_
2641
*| where JobName == "musing_date_yg60v862b2"*
2742
_| project TimeGenerated , ClusterId , EventType , ExecutionState , ToolType_
2843
```
2944

30-
1. Select Run to run the query.
45+
1. Select **Run** to run the query.
3146

32-
![A screenshot of the KQL code in a log query in the Azure portal.](../media/log-query.png)
47+
![Screenshot of the KQL code in a log query in the Azure portal.](../media/log-query.png)
3348

34-
Once you run the query, you can analyze the results in the Results pane.
49+
After you run the query, you can analyze the results on the **Results** pane.
Lines changed: 32 additions & 19 deletions
Original file line numberDiff line numberDiff line change
@@ -1,26 +1,39 @@
1-
Azure Monitor can proactively notify you when specific conditions are found in your monitoring data. You can create an alert based on any metric or log data source in the Azure Monitor data platform. There are different types of alerts:
1+
Azure Monitor can proactively notify you when it finds specific conditions in your monitoring data. You can create an alert based on any metric or log data source in the Azure Monitor data platform.
22

3-
- Metric alerts evaluate resource metrics at regular intervals. Metrics can be platform metrics, custom metrics, logs from Azure Monitor converted to metrics, or Application Insights metrics. Metric alerts can also apply multiple conditions and dynamic thresholds.
4-
- Log alerts allow users to use a Log Analytics query to evaluate resource logs at a predefined frequency.
5-
- Activity log alerts trigger when a new activity log event occurs that matches defined conditions. Resource Health alerts and Service Health alerts are activity log alerts that report on your service and resource health.
3+
There are various types of alerts:
64

7-
When monitoring an Azure Machine Learning workspace, you might want to get an alert when a Model deploy fails, when quota utilization exceeds a threshold, or when there are one or more unusable nodes. To create an alert:
5+
- *Metric alerts* evaluate resource metrics at regular intervals. Metrics can be platform metrics, custom metrics, logs from Azure Monitor converted to metrics, or Application Insights metrics. Metric alerts can also apply multiple conditions and dynamic thresholds.
6+
- *Log alerts* enable users to use a Log Analytics query to evaluate resource logs at a predefined frequency.
7+
- *Activity log alerts* trigger when a new activity log event matches defined conditions. For example, activity log alerts can report on your service health and resource health.
88

9-
1. On the Azure portal, open the Azure Machine Learning resource.
10-
1. On the left-hand side menu, expand Monitoring and select Alerts.
11-
1. Select the + Create drop down menu and select Alert rule.
12-
1. Notice that the wizard starts on the second tab, Condition. This is because the Scope has already been set to Azure Machine Learning resource.
13-
1. On the Signal name, select See all signals.
14-
1. Notice that you can look for a range of signals including Custom log search, Metrics, and Activity log.
15-
1. Under Metrics, select Failed Runs.
16-
1. On the Alert logic, you can provide thresholds for when the condition is met to trigger the alert. On the Threshold enter 5. This means more than 5 failed deployments trigger this alert. You can change the variables to meet your needs. Select Next.
17-
1. In the Actions tab, make sure Use quick actions is selected and provide the details in the right-hand side pane. Select Save and select Next.
18-
1. On the Details tab, confirm the subscription and Resource group to use. Select the appropriate Severity level. In Alert rule name, provide the name of the alert and a description next. Select Review + Create.
9+
When you're monitoring an Azure Machine Learning workspace, you might want to get an alert when a model deployment fails, when quota utilization exceeds a threshold, or when nodes are unusable.
1910

20-
The Alerts dashboard shows information if the alert has been triggered:
11+
To create an alert:
2112

22-
![A screenshot of the Alerts blade of an Azure Machine Learning workspace in the Azure portal.](../media/machine-learning-alerts.png)
13+
1. In the Azure portal, open the Azure Machine Learning resource.
2314

24-
If you configure an Alert to send an e-mail, you should receive a notification:
15+
1. On the left menu, expand **Monitoring** and select **Alerts**.
2516

26-
![A screenshot showing the output of an Azure Monitor alert in the Azure portal.](../media/azure-monitor-alert.png)
17+
1. On the **Create** dropdown menu, select **Alert rule**.
18+
19+
1. Notice that the wizard starts on the second tab, **Condition**. The reason that the scope is already set to an Azure Machine Learning resource.
20+
21+
1. For **Signal name**, select **See all signals**.
22+
23+
Notice that you can look for a range of signals, including **Custom log search**, **Metrics**, and **Activity log**.
24+
25+
1. Under **Metrics**, select **Failed Runs**.
26+
27+
1. For **Alert logic**, you can provide thresholds for when the condition is met to trigger the alert. For **Threshold**, enter **5**. This value means that more than five failed deployments trigger this alert. You can change the variables to meet your needs. When you finish, select **Next**.
28+
29+
1. On the **Actions** tab, make sure that **Use quick actions** is selected and provide the details on the right pane. Then select **Save** > **Next**.
30+
31+
1. On the **Details** tab, confirm the subscription and resource group to use. Select the appropriate severity level. For **Alert rule name**, provide the name of the alert and a description. Then select **Review + Create**.
32+
33+
The **Alerts** dashboard shows information if the alert is triggered.
34+
35+
![Screenshot of the Alert dashboard in an Azure Machine Learning workspace in the Azure portal.](../media/machine-learning-alerts.png)
36+
37+
If you configure an alert to send an email, you should receive a notification.
38+
39+
![Screenshot that shows the output of an Azure Monitor alert in the Azure portal.](../media/azure-monitor-alert.png)
Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
1-
In Azure Machine Learning, inferencing is performed by using endpoints. An endpoint is a stable and durable URL that can be used to request or invoke a model. Azure Machine Learning uses integration with Azure Monitor to track and monitor metrics and logs for online endpoints.
1+
Azure Machine Learning uses endpoints to perform inferencing. An endpoint is a stable and durable URL for requesting or invoking a model. Azure Machine Learning uses integration with Azure Monitor to track and monitor metrics and logs for online endpoints.
22

3-
Like workspace monitoring, you can view Metrics and Logs for online endpoints directly from the Azure portal. Since online endpoints are separate resources in Azure, you need to open those resources directly from the Azure portal within the Resource Group on which they're contained.
3+
Like workspace monitoring, you can view metrics and logs for online endpoints directly from the Azure portal. Because online endpoints are separate resources in Azure, you need to open those resources directly from the Azure portal within the resource group that contains them.
44

5-
The image displays Disk Utilization and Memory Utilization of the endpoint.
5+
The following example shows disk utilization and memory utilization of an endpoint.
66

7-
![A screenshot of a monitoring dashboard for a Azure Machine Learning endpoint.](../media/online-endpoint.png)
7+
![Screenshot of a monitoring dashboard for an Azure Machine Learning endpoint.](../media/online-endpoint.png)

0 commit comments

Comments
 (0)