articles/machine-learning/how-to-debug-pipeline-failure.md
Lines changed: 43 additions & 55 deletions
Original file line number
Diff line number
Diff line change
@@ -1,107 +1,95 @@
1
1
---
2
-
title: 'How to use studio UI to debug pipeline failure'
2
+
title: Use Azure Machine Learning studio to debug pipeline failures
3
3
titleSuffix: Azure Machine Learning
4
-
description: Learn how to debug compare pipeline failure with pipeline UI in studio.
4
+
description: Learn how to debug pipeline failures and compare pipelines by using the Azure Machine Learning studio UI.
5
5
ms.reviewer: lagayhar
6
6
author: likebupt
7
7
ms.author: keli19
8
8
services: machine-learning
9
9
ms.service: machine-learning
10
10
ms.subservice: core
11
11
ms.topic: how-to
12
-
ms.date: 05/27/2023
12
+
ms.date: 05/23/2024
13
13
ms.custom: designer
14
14
---
15
15
16
-
# How to use pipeline UI to debug Azure Machine Learning pipeline failures
16
+
# Use Designer in Azure Machine Learning studio to debug pipeline failures
17
17
18
-
After submitting a pipeline, you'll see a link to the pipeline job in your Azure Machine Learning workspace. The link lands you in the pipeline job page in Azure Machine Learning studio, in which you can check result and debug your pipeline job.
19
-
20
-
This article introduces how to use the pipeline job page to debug machine learning pipeline failures.
18
+
After you submit a pipeline job, you can select a link to the job in your workspace in Azure Machine Learning studio. The link opens the pipeline job detail page, where you can check results and debug your pipeline job. This article explains how to use the pipeline job detail page to debug machine learning pipeline failures.
21
19
22
20
> [!IMPORTANT]
23
-
> Items marked (preview) in this article are currently in public preview.
24
-
> The preview version is provided without a service level agreement, and it's not recommended for production workloads. Certain features might not be supported or might have constrained capabilities.
25
-
> For more information, see [Supplemental Terms of Use for Microsoft Azure Previews](https://azure.microsoft.com/support/legal/preview-supplemental-terms/).
26
-
21
+
> Items marked (preview) in this article are currently in public preview. The preview version is provided without a service level agreement, and it's not recommended for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see [Supplemental Terms of Use for Microsoft Azure Previews](https://azure.microsoft.com/support/legal/preview-supplemental-terms/).
27
22
28
-
## Using outline to quickly find a node
23
+
## Use outline to quickly find a node
29
24
30
-
In pipeline job detail page, there's an outline left to the canvas, which shows the overall structure of your pipeline job. Hovering on any row, you can select the "Locate" button to locate that node in the canvas.
25
+
On the pipeline job detail page, the **Outline** pane on the left shows the overall structure of your pipeline job. Hover on any row and select the **Locate in canvas** icon to highlight that node on the canvas and open an information pane for the node on the right.
31
26
32
-
:::image type="content" source="./media/how-to-debug-pipeline-failure/outline.png" alt-text="Screenshot showing outline and locate in the canvas." lightbox= "./media/how-to-debug-pipeline-failure/outline.png":::
27
+
:::image type="content" source="./media/how-to-debug-pipeline-failure/outline-detail.png" alt-text="Screenshot showing outline and locate in the canvas." lightbox= "./media/how-to-debug-pipeline-failure/outline.png":::
33
28
34
-
You can filter failed or completed nodes, and filter by only components or dataset for further search. The left pane shows the matched nodes with more information including status, duration, and created time.
29
+
In the **Outline** pane, you can select the **Filter** icon to quickly filter the view to **Completed nodes only**, **Component only**, or **Dataset only**. You can also filter the list by entering node names or component names in the Search box, or by selecting **Add filter** and choosing from a list of filters.
35
30
36
-
:::image type="content" source="./media/how-to-debug-pipeline-failure/quick-filter.png" alt-text="Screenshot showing the quick filter by in outline > search." lightbox= "./media/how-to-debug-pipeline-failure/quick-filter.png":::
31
+
:::image type="content" source="./media/how-to-debug-pipeline-failure/quick-filter-detail.png" alt-text="Screenshot showing quick filter and search in the Outline pane." lightbox= "./media/how-to-debug-pipeline-failure/quick-filter.png":::
37
32
38
-
You can also sort the filtered nodes.
33
+
The left pane shows the matched nodes with more information including status, duration, and run time and date. You can sort the filtered nodes.
39
34
40
-
:::image type="content" source="./media/how-to-debug-pipeline-failure/sort.png" alt-text="Screenshot of sorting search result in outline > search." lightbox= "./media/how-to-debug-pipeline-failure/sort.png":::
35
+
:::image type="content" source="./media/how-to-debug-pipeline-failure/sort-detail.png" alt-text="Screenshot of sorting search results in the Outline pane." lightbox= "./media/how-to-debug-pipeline-failure/sort.png":::
41
36
42
-
## Check logs and outputs of component
37
+
## Check component logs and outputs
43
38
44
39
If your pipeline fails or gets stuck on a node, first view the logs.
45
40
46
-
1. You can select the specific node and open the right pane.
47
-
48
-
1. Select **Outputs+logs** tab and you can explore all the outputs and logs of this node.
41
+

49
42
50
-
The **user_logs folder** contains information about user code generated logs. This folder is open by default, and the **std_log.txt** log is selected. The **std_log.txt** is where your code's logs (for example, print statements) show up.
43
+
1. Select the node to open the information pane on the right.
51
44
52
-
The **system_logs folder** contains logs generated by Azure Machine Learning. Learn more about [View and download diagnostic logs](how-to-log-view-metrics.md#view-and-download-diagnostic-logs).
45
+
1. Select the **Outputs + logs** tab to view all the outputs and logs of this node.
53
46
54
-

47
+
:::image type="content" source="./media/how-to-debug-pipeline-failure/log-detail.png" alt-text="Screenshot of the user_logs in the node information pane." lightbox= "./media/how-to-debug-pipeline-failure/log-detail.png":::
48
+
49
+
- The *user_logs* folder contains the logs generated by your code. This folder is open by default, and the *std_log.txt* log is selected. The *std_log.txt* file is where your code's logs (for example, print statements) show up.
55
50
56
-
If you don't see those folders, this is due to the compute run time update isn't released to the compute cluster yet, and you can look at **70_driver_log.txt** under **azureml-logs** folder first.
51
+
- The *system_logs* folder contains logs generated by Azure Machine Learning. To learn more, see [View and download diagnostic logs](how-to-log-view-metrics.md#view-and-download-diagnostic-logs).
57
52
58
-
## Compare different pipelines to debug failure or other unexpected issues (preview)
53
+
If you don't see those folders, the compute runtime update might not be released to the compute cluster yet. In that case, look at *70_driver_log.txt* in the *azureml-logs* folder first.
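As an illustration of the kind of output that ends up in *std_log.txt*, here's a minimal Python sketch of a component step that writes to standard output. The function name `run_step`, the logger name `component`, and the sample rows are hypothetical and not part of any Azure Machine Learning API. The sketch assumes only what the article states: that output from your code, such as print statements, shows up in that log.

```python
import logging
import sys


def run_step(rows):
    """Hypothetical component step: drop empty rows and report progress."""
    # A plain print statement writes to standard output.
    print(f"Received {len(rows)} rows")

    # Send logging output to standard output too, so these messages
    # appear next to the print output. The handler is attached only once.
    logger = logging.getLogger("component")
    if not logger.handlers:
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)

    kept = [row for row in rows if row]
    logger.info("Kept %d rows after dropping empty ones", len(kept))
    return kept
```

Calling `run_step(["a", "", "b"])` writes one line from `print` and one line from the logger. Messages like these, placed around the steps most likely to fail, can make the log easier to read when you debug a node.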
59
54
60
-
Pipeline comparison identifies the differences (including topology, component properties, and job properties) between multiple jobs. For example you can compare a successful pipeline and a failed pipeline, which helps you find what modifications make your pipeline fail.
55
+
## Compare pipeline jobs (preview)
61
56
62
-
Two major scenarios where you can use pipeline comparison to help with debugging:
57
+
You can compare different pipeline jobs to debug failure or other unexpected issues (preview). Pipeline comparison identifies the differences, such as topology, component properties, and job properties, between pipeline jobs.
63
58
64
-
- Debug your failed pipeline job by comparing it to a completed one.
65
-
- Debug your failed node in a pipeline by comparing it to a similar completed one.
59
+
For example, you can compare successful and failed pipeline jobs to find differences that might have made one pipeline job fail. You can debug a failed pipeline job by comparing it to a completed job, or debug a failed node in a pipeline by comparing it to a similar completed node.
66
60
67
-
To enable this feature:
61
+
To enable this feature in Azure Machine Learning studio, select the megaphone icon at top right to manage preview features. In the **Managed preview feature** panel, make sure **Compare pipeline jobs to debug failures or unexpected issues** is set to **Enabled**.
68
62
69
-
1. Navigate to Azure Machine Learning studio UI.
70
-
2. Select **Manage preview features** (megaphone icon) among the icons on the top right side of the screen.
71
-
3. In **Managed preview feature** panel, toggle on **Compare pipeline jobs to debug failures or unexpected issues** feature.
63
+
:::image type="content" source="./media/how-to-debug-pipeline-failure/enable-preview.png" alt-text="Screenshot of the preview feature toggled on." lightbox= "./media/how-to-debug-pipeline-failure/enable-preview.png":::
72
64
73
-
:::image type="content" source="./media/how-to-debug-pipeline-failure/enable-preview.png" alt-text="Screenshot of manage preview features toggled on." lightbox= "./media/how-to-debug-pipeline-failure/enable-preview.png":::
65
+
### Debug a failed pipeline job by comparing it to a completed job
74
66
75
-
### How to debug your failed pipeline job by comparing it to a completed one
67
+
During iterative model development, you might clone and modify a successful baseline pipeline by changing a parameter, dataset, compute resource, or other setting. If the new pipeline fails, you can use pipeline comparison to help figure out the failure by identifying the changes from the parent pipeline.
76
68
77
-
During iterative model development, you may have a baseline pipeline, and then do some modifications such as changing a parameter, dataset or compute resource, etc. If your new pipeline failed, you can use pipeline comparison to identify what has changed by comparing it to the baseline pipeline, which could help with figuring out why it failed.
69
+
For example, if you get an error message that your new pipeline failed due to an out-of-memory issue, you can use pipeline comparison to see what changed from a completed parent pipeline.
78
70
79
71
#### Compare a pipeline with its parent
80
72
81
-
The first thing you should check when debugging is to locate the failed node and check the logs.
82
-
83
-
For example, you may get an error message showing that your pipeline failed due to out-of-memory. If your pipeline is cloned from a completed parent pipeline, you can use pipeline comparison to see what has changed.
84
-
85
-
1. Select **Show lineage**.
86
-
1. Select the link under "Cloned From". This will open a new browser tab with the parent pipeline.
73
+
1. On the failed pipeline job page, select **Show lineage**.
74
+
1. Select the link in the **Cloned from** popup to open the parent pipeline job page in a new browser tab.
87
75
88
-
:::image type="content" source="./media/how-to-debug-pipeline-failure/cloned-from.png" alt-text="Screenshot showing the cloned from link, with the previous step, the lineage button highlighted." lightbox= "./media/how-to-debug-pipeline-failure/cloned-from.png":::
76
+
:::image type="content" source="./media/how-to-debug-pipeline-failure/cloned-from.png" alt-text="Screenshot showing the cloned from link, with the previous step, the lineage button highlighted." lightbox= "./media/how-to-debug-pipeline-failure/cloned-from.png":::
89
77
90
-
1. Select **Add to compare** on the failed pipeline and the parent pipeline. This adds them in the comparison candidate list.
78
+
1. On both pages, select **Add to compare** on the top menu bar to add both jobs to the **Compare** list.
91
79
92
-
:::image type="content" source="./media/how-to-debug-pipeline-failure/comparison-list.png" alt-text="Screenshot showing the comparison list with a parent and child pipeline added." lightbox= "./media/how-to-debug-pipeline-failure/comparison-list.png":::
80
+
:::image type="content" source="./media/how-to-debug-pipeline-failure/comparison-list-detail.png" alt-text="Screenshot showing the comparison list with a parent and child pipeline added." lightbox= "./media/how-to-debug-pipeline-failure/comparison-list.png":::
93
81
94
-
### Compare topology
82
+
Once you add both pipelines to the comparison list, select **Compare detail** or **Compare graph**.
95
83
96
-
Once the two pipelines are added to the comparison list, you have two options: **Compare detail** and **Compare graph**. **Compare graph** allows you to compare pipeline topology.
84
+
#### Compare graph
97
85
98
-
**Compare graph** shows you the graph topology changes between pipeline A and B. The special nodes in pipeline A are highlighted in red and marked with "A only". The special nodes in pipeline B are in green and marked with "B only". The shared nodes are in gray. If there are differences on the shared nodes, what has been changed is shown on the top of node.
86
+
**Compare graph** shows the topology changes between pipelines **A** and **B**. Nodes specific to pipeline A are highlighted in red and marked with **A**, and nodes specific to pipeline B are highlighted in green and marked with **B**. A description of changes is shown at the tops of the nodes.
99
87
100
-
There are three categories of changes with summaries viewable in the detail page, parameter change, input source, pipeline component. When the pipeline component is changed this means that there's a topology change inside or an inner node parameter change, you can select the folder icon on the pipeline component node to dig down into the details. Other changes can be detected by viewing the colored nodes in the compare graph.
88
+
Select a node to open the **Component information** pane, where depending on the node selected you can see **Dataset properties** or **Component properties** like **parameters**, **runSettings**, and **outputSettings**.
101
89
102
-
:::image type="content" source="./media/how-to-debug-pipeline-failure/parameter-changed.png" alt-text="Screenshot showing the parameter changed and the component information tab." lightbox= "./media/how-to-debug-pipeline-failure/parameter-changed.png":::
90
+
:::image type="content" source="./media/how-to-debug-pipeline-failure/parameter-changed.png" alt-text="Screenshot showing the parameter changed and the component information tab." lightbox= "./media/how-to-debug-pipeline-failure/parameter-changed.png":::
103
91
104
-
### Compare pipeline meta info and properties
92
+
#### Compare pipeline metadata and properties
105
93
106
94
If you investigate the dataset difference and find that data or topology doesn't seem to be the root cause of the failure, you can also check pipeline details such as pipeline parameters, outputs, or run settings.
107
95
@@ -122,11 +110,11 @@ To quickly check the topology comparison, select the pipeline name and select **
122
110
123
111
:::image type="content" source="./media/how-to-debug-pipeline-failure/compare-graph.png" alt-text="Screenshot of detail comparison with compare graph highlighted." lightbox= "./media/how-to-debug-pipeline-failure/compare-graph.png":::
124
112
125
-
### How to debug your failed node in a pipeline by comparing to similar completed node
113
+
### Debug a failed node in a pipeline by comparing to a similar completed node
126
114
127
115
If you only updated node properties and changed nothing in the pipeline, then you can debug the node by comparing it with the jobs that are submitted from the same component.
128
116
129
-
#### Find the job to compare with
117
+
To find the job to compare with:
130
118
131
119
1. Find a successful job to compare with by viewing all runs submitted from the same component.
132
120
1. Right-click the failed node and select *View Jobs*. This gives you a list of all the jobs.