In this article, you learn how to troubleshoot when you get errors using the [ParallelRunStep](/python/api/azureml-pipeline-steps/azureml.pipeline.steps.parallel_run_step.parallelrunstep) class from the [Azure Machine Learning SDK](/python/api/overview/azure/ml/intro).
In this article, you learn how to troubleshoot when you get errors running a [machine learning pipeline](concept-ml-pipelines.md) in the [Azure Machine Learning SDK](/python/api/overview/azure/ml/intro) and [Azure Machine Learning designer](./concept-designer.md).
The following table contains common problems during pipeline development, with potential solutions.

| Problem | Possible solution |
|--|--|
| Pipeline not reusing steps | Step reuse is enabled by default, but ensure you haven't disabled it in a pipeline step. If reuse is disabled, the `allow_reuse` parameter in the step will be set to `False`. |
| Pipeline is rerunning unnecessarily | To ensure that steps only rerun when their underlying data or scripts change, decouple your source-code directories for each step. If you use the same source directory for multiple steps, you may experience unnecessary reruns. Use the `source_directory` parameter on a pipeline step object to point to your isolated directory for that step, and ensure you aren't using the same `source_directory` path for multiple steps. |
| Step slowing down over training epochs or other looping behavior | Try switching any file writes, including logging, from `as_mount()` to `as_upload()`. The **mount** mode uses a remote virtualized filesystem and uploads the entire file each time it is appended to. |
| Compute target takes a long time to start | Docker images for compute targets are loaded from Azure Container Registry (ACR). By default, Azure Machine Learning creates an ACR that uses the *basic* service tier. Changing the ACR for your workspace to standard or premium tier may reduce the time it takes to build and load images. For more information, see [Azure Container Registry service tiers](/azure/container-registry/container-registry-skus). |
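As a concrete illustration of the first two rows, here is a minimal sketch of how `allow_reuse` and `source_directory` appear on a pipeline step (the step name, script name, directory layout, and `compute_target` are assumptions):

```python
from azureml.pipeline.steps import PythonScriptStep

# Give each step its own source directory so that editing one step's
# scripts doesn't invalidate the cached results of the other steps.
train_step = PythonScriptStep(
    name="train",
    script_name="train.py",
    source_directory="./train_step",  # isolated directory for this step only
    compute_target=compute_target,    # assumed to be defined earlier
    allow_reuse=True,                 # the default; set to False to force reruns
)
```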
### Authentication errors
### Enable logging for real-time endpoints
To troubleshoot and debug real-time endpoints in the designer, you must enable Application Insights logging using the SDK. Logging lets you troubleshoot and debug model deployment and usage issues. For more information, see [Logging for deployed models](how-to-enable-app-insights.md).
This article shows you how to share a machine learning pipeline with your colleagues or customers.
Machine learning pipelines are reusable workflows for machine learning tasks. One benefit of pipelines is increased collaboration. You can also version pipelines, allowing customers to use the current model while you're working on a new version.
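For example, publishing a pipeline gives colleagues a named, versioned endpoint they can submit without needing your source code. A minimal sketch, assuming `pipeline` is a `Pipeline` object you've already built:

```python
# `pipeline` is assumed to be an existing azureml.pipeline.core.Pipeline
published_pipeline = pipeline.publish(
    name="My_Published_Pipeline",
    description="Batch scoring pipeline shared with the team",
    version="1.0",
)
print(published_pipeline.endpoint)  # REST endpoint others can call
```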
## Prerequisites
* Create an [Azure Machine Learning workspace](../quickstart-create-resources.md) to hold all your pipeline resources
* [Configure your development environment](how-to-configure-environment.md) to install the Azure Machine Learning SDK, or use an [Azure Machine Learning compute instance](concept-compute-instance.md) with the SDK already installed
The [OpenCensus](https://opencensus.io/quickstart/python/) Python library can be used to route logs to Application Insights from your scripts. Aggregating logs from pipeline runs in one place allows you to build queries and diagnose issues. Using Application Insights will allow you to track logs over time and compare pipeline logs across runs.
Having your logs in one place provides a history of exceptions and error messages. Since Application Insights integrates with Azure Alerts, you can also create alerts based on Application Insights queries.
## Prerequisites
* Follow the steps to create an [Azure Machine Learning workspace](../quickstart-create-resources.md) and [create your first pipeline](./how-to-create-machine-learning-pipelines.md)
* [Configure your development environment](./how-to-configure-environment.md) to install the Azure Machine Learning SDK.
* Install the [OpenCensus Azure Monitor Exporter](https://pypi.org/project/opencensus-ext-azure/) package locally:
```bash
pip install opencensus-ext-azure
```
* Create an [Application Insights instance](/azure/azure-monitor/app/opencensus-python) (this article also explains how to get the connection string for the resource)
## Getting Started
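A minimal sketch of wiring the exporter into standard Python logging (the connection string is a placeholder; copy the real value from your Application Insights resource):

```python
import logging

from opencensus.ext.azure.log_exporter import AzureLogHandler

logger = logging.getLogger(__name__)
# Route log records from this logger to Application Insights
logger.addHandler(AzureLogHandler(
    connection_string="InstrumentationKey=<your-instrumentation-key>"
))
logger.setLevel(logging.INFO)

# Values in custom_dimensions become queryable fields in Application Insights
logger.info("Pipeline step started", extra={"custom_dimensions": {"step": "prep"}})
logger.warning("Row count lower than expected")
```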
## Next steps
Once you have logs in your Application Insights instance, they can be used to set [Azure Monitor alerts](/azure/azure-monitor/alerts/alerts-overview) based on query results.
You can also add results from queries to an [Azure Dashboard](/azure/azure-monitor/app/tutorial-app-dashboards#add-logs-query) for additional insights.
This article provides code for importing, transforming, and moving data between steps in an Azure Machine Learning pipeline. For an overview of how data works in Azure Machine Learning, see [Access data in Azure storage services](how-to-access-data.md). For the benefits and structure of Azure Machine Learning pipelines, see [What are Azure Machine Learning pipelines?](concept-ml-pipelines.md)
You'll need:

- A reference to your Azure Machine Learning workspace in your code:

  ```python
  from azureml.core import Workspace

  ws = Workspace.from_config()
  ```
- Some pre-existing data. This article briefly shows the use of an [Azure blob container](/azure/storage/blobs/storage-blobs-overview).
- Optional: An existing machine learning pipeline, such as the one described in [Create and run machine learning pipelines with Azure Machine Learning SDK](./how-to-create-machine-learning-pipelines.md).
For more about creating datasets from different sources, registering and reviewing them in the Azure Machine Learning UI, understanding how data size interacts with compute capacity, and versioning them, see [Create Azure Machine Learning datasets](how-to-create-register-datasets.md).
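As a brief example (the datastore name and path are assumptions), creating a `FileDataset` over files already sitting in the blob container looks like this:

```python
from azureml.core import Dataset, Datastore

# "my_blob_datastore" is an assumed name for a datastore registered in the workspace
datastore = Datastore.get(ws, "my_blob_datastore")

# Reference all CSV files under a folder in the blob container
my_dataset = Dataset.File.from_files(path=(datastore, "input-data/*.csv"))
```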
### Pass datasets to your script
To pass the dataset's path to your script, use the `Dataset` object's `as_named_input()` method. You can either pass the resulting `DatasetConsumptionConfig` object to your script as an argument or, by using the `inputs` argument to your pipeline script, you can retrieve the dataset using `Run.get_context().input_datasets[]`.
Once you've created a named input, you can choose its access mode: `as_mount()` or `as_download()`. If your script processes all the files in your dataset and the disk on your compute resource is large enough for the dataset, the download access mode is the better choice. The download access mode avoids the overhead of streaming the data at runtime. If your script accesses a subset of the dataset or it's too large for your compute, use the mount access mode. For more information, read [Mount vs. Download](how-to-train-with-datasets.md#mount-vs-download).
To pass a dataset to your pipeline step:
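A minimal sketch, assuming the `my_dataset` and `compute_target` objects from earlier (the step and input names are illustrative):

```python
from azureml.pipeline.steps import PythonScriptStep

# Expose the dataset to the script under the name "trainfiles";
# as_download() copies it to local disk, as_mount() streams it on demand
train_step = PythonScriptStep(
    name="train",
    script_name="train.py",
    compute_target=compute_target,  # assumed to be defined earlier
    inputs=[my_dataset.as_named_input("trainfiles").as_download()],
)
```

Inside `train.py`, the dataset's local path is then available from `Run.get_context().input_datasets["trainfiles"]`.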
In the following code:
* `step1_output_data` indicates that the output of the `PythonScriptStep` `step1` is written to the ADLS Gen 2 datastore `my_adlsgen2` in upload access mode. Learn more about how to [set up role permissions](how-to-access-data.md) in order to write data back to ADLS Gen 2 datastores.
* After `step1` completes and the output is written to the destination indicated by `step1_output_data`, `step2` is ready to use it as an input.
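A minimal sketch matching that description, where `my_adlsgen2` is assumed to be a datastore registered in the workspace, and `compute_target` and the script names are placeholders:

```python
from azureml.core import Datastore
from azureml.data import OutputFileDatasetConfig
from azureml.pipeline.steps import PythonScriptStep

my_adlsgen2 = Datastore.get(ws, "my_adlsgen2")

# step1 writes its output to the ADLS Gen 2 datastore in upload access mode
step1_output_data = OutputFileDatasetConfig(
    name="processed_data",
    destination=(my_adlsgen2, "working_dir/{run-id}/{output-name}"),
).as_upload()

step1 = PythonScriptStep(
    name="step1",
    script_name="prep.py",
    compute_target=compute_target,
    arguments=["--output_path", step1_output_data],
)

# step2 consumes step1's output as a downloaded input
step2 = PythonScriptStep(
    name="step2",
    script_name="train.py",
    compute_target=compute_target,
    arguments=["--input_path", step1_output_data.as_input().as_download()],
)
```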
Azure does not automatically delete intermediate data written with `OutputFileDatasetConfig`. To avoid storage charges for large amounts of unneeded data, you should do one of the following:
* Programmatically delete intermediate data at the end of a pipeline job, when it is no longer needed
* Use blob storage with a short-term storage policy for intermediate data (see [Optimize costs by automating Azure Blob Storage access tiers](/azure/storage/blobs/lifecycle-management-overview))
* Regularly review and delete no-longer-needed data
For more information, see [Plan and manage costs for Azure Machine Learning](concept-plan-manage-cost.md).
## Next steps
* [Create an Azure Machine Learning dataset](how-to-create-register-datasets.md)
* [Create and run machine learning pipelines with Azure Machine Learning SDK](how-to-create-machine-learning-pipelines.md)
In this article, you'll learn how to programmatically schedule a pipeline to run on Azure. You can create a schedule based on elapsed time or on file-system changes. Time-based schedules can be used to take care of routine tasks, such as monitoring for data drift. Change-based schedules can be used to react to irregular or unpredictable changes, such as new data being uploaded or old data being edited. After learning how to create schedules, you'll learn how to retrieve and deactivate them. Finally, you'll learn how to use other Azure services (Azure Logic Apps and Azure Data Factory) to run pipelines. An Azure Logic App allows for more complex triggering logic or behavior. Azure Data Factory pipelines allow you to call a machine learning pipeline as part of a larger data orchestration pipeline.
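As a hedged sketch (the schedule name, experiment name, and published pipeline are assumptions), creating a time-based schedule and later deactivating every schedule in the workspace looks roughly like this:

```python
from azureml.core import Workspace
from azureml.pipeline.core import Schedule, ScheduleRecurrence

ws = Workspace.from_config()

# Time-based schedule: run the published pipeline daily at 00:30
recurrence = ScheduleRecurrence(frequency="Day", interval=1, hours=[0], minutes=[30])
schedule = Schedule.create(
    ws,
    name="daily-retraining",
    pipeline_id=published_pipeline.id,  # assumed: a PublishedPipeline from earlier
    experiment_name="scheduled-runs",
    recurrence=recurrence,
)

# Later: retrieve every schedule in the workspace and deactivate it
for s in Schedule.list(ws):
    s.disable(wait_for_provisioning=True)
```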
If you then run `Schedule.list(ws)` again, you should get an empty list.
## Use Azure Logic Apps for complex triggers
More complex trigger rules or behavior can be created using an [Azure Logic App](/azure/logic-apps/logic-apps-overview).
To use an Azure Logic App to trigger a Machine Learning pipeline, you'll need the REST endpoint for a published Machine Learning pipeline. [Create and publish your pipeline](./how-to-create-machine-learning-pipelines.md). Then find the REST endpoint of your `PublishedPipeline` by using the pipeline ID:
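A minimal sketch (the pipeline ID is a placeholder you'd copy from the studio UI):

```python
from azureml.core import Workspace
from azureml.pipeline.core import PublishedPipeline

ws = Workspace.from_config()

# Look up the published pipeline by its ID and read its REST endpoint
published_pipeline = PublishedPipeline.get(ws, id="<pipeline-id>")
print(published_pipeline.endpoint)
```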
## Create a Logic App
Now create an [Azure Logic App](/azure/logic-apps/logic-apps-overview) instance. If you wish, [use an integration service environment (ISE)](/azure/logic-apps/connect-virtual-network-vnet-isolated-environment) and [set up a customer-managed key](/azure/logic-apps/customer-managed-keys-integration-service-environment) for use by your Logic App.
Once your Logic App has been provisioned, use these steps to configure a trigger for your pipeline:
1. [Create a system-assigned managed identity](/azure/logic-apps/create-managed-service-identity) to give the app access to your Azure Machine Learning workspace.
1. Navigate to the Logic App Designer view and select the Blank Logic App template.