MicrosoftDocs
diff --git a/.openpublishing.redirection.json b/.openpublishing.redirection.json
Lines changed: 10 additions & 0 deletions
diff --git a/articles/machine-learning/azure-machine-learning-release-notes.md b/articles/machine-learning/azure-machine-learning-release-notes.md
Lines changed: 1 addition & 1 deletion
diff --git a/articles/machine-learning/concept-enterprise-security.md b/articles/machine-learning/concept-enterprise-security.md
Lines changed: 1 addition & 1 deletion
diff --git a/articles/machine-learning/concept-model-management-and-deployment.md b/articles/machine-learning/concept-model-management-and-deployment.md
Lines changed: 1 addition & 1 deletion
diff --git a/articles/machine-learning/how-to-access-data.md b/articles/machine-learning/how-to-access-data.md
Lines changed: 1 addition & 1 deletion
diff --git a/articles/machine-learning/how-to-debug-batch-predictions.md b/articles/machine-learning/how-to-debug-batch-predictions.md
Lines changed: 0 additions & 187 deletions
diff --git a/articles/machine-learning/how-to-debug-parallel-run-step.md b/articles/machine-learning/how-to-debug-parallel-run-step.md
Lines changed: 95 additions & 0 deletions
diff --git a/articles/machine-learning/how-to-deploy-and-where.md b/articles/machine-learning/how-to-deploy-and-where.md
Lines changed: 1 addition & 1 deletion
diff --git a/articles/machine-learning/how-to-deploy-app-service.md b/articles/machine-learning/how-to-deploy-app-service.md
Lines changed: 1 addition & 1 deletion
@@ -165,6 +165,16 @@
       "redirect_url": "/azure/machine-learning/service/how-to-deploy-fpga-web-service",
       "redirect_document_id": false
     },
+    {
+      "source_path": "articles/machine-learning/how-to-debug-batch-predictions.md",
+      "redirect_url": "/azure/machine-learning/how-to-debug-parallel-run-step",
+      "redirect_document_id": false
+    },
+    {
+      "source_path": "articles/machine-learning/how-to-run-batch-predictions.md",
+      "redirect_url": "/azure/machine-learning/how-to-use-parallel-run-step",
+      "redirect_document_id": false
+    },
     {
       "source_path": "articles/machine-learning/service/quickstart-run-local-notebook.md",
       "redirect_url": "/azure/machine-learning/service/how-to-configure-environment#local",
 
@@ -1486,7 +1486,7 @@ Azure Machine Learning Compute can be created in Python, using Azure portal, or
 + ML Pipelines
   + New and updated notebooks for getting started with pipelines, batch scoring, and style transfer examples: https://aka.ms/aml-pipeline-notebooks
   + Learn how to [create your first pipeline](how-to-create-your-first-pipeline.md)
-  + Learn how to [run batch predictions using pipelines](how-to-run-batch-predictions.md)
+  + Learn how to [run batch predictions using pipelines](how-to-use-parallel-run-step.md)
 + Azure Machine Learning compute target
   + [Sample notebooks](https://aka.ms/aml-notebooks) are now updated to use the new managed compute.
   + [Learn about this compute](how-to-set-up-training-targets.md#amlcompute)
 
@@ -330,7 +330,7 @@ Here are the details:
 
 * [Secure Azure Machine Learning web services with SSL](how-to-secure-web-service.md)
 * [Consume a Machine Learning model deployed as a web service](how-to-consume-web-service.md)
-* [How to run batch predictions](how-to-run-batch-predictions.md)
+* [How to run batch predictions](how-to-use-parallel-run-step.md)
 * [Monitor your Azure Machine Learning models with Application Insights](how-to-enable-app-insights.md)
 * [Collect data for models in production](how-to-enable-data-collection.md)
 * [Azure Machine Learning SDK](https://docs.microsoft.com/python/api/overview/azure/ml/intro?view=azure-ml-py)
 
@@ -85,7 +85,7 @@ You also provide the configuration of the target deployment platform. For exampl
 When the image is created, components required by Azure Machine Learning are also added. For example, assets needed to run the web service and interact with IoT Edge.
 
 #### Batch scoring
-Batch scoring is supported through ML pipelines. For more information, see [Batch predictions on big data](how-to-run-batch-predictions.md).
+Batch scoring is supported through ML pipelines. For more information, see [Batch predictions on big data](how-to-use-parallel-run-step.md).
 
 #### Real-time web services
 
 
@@ -259,7 +259,7 @@ Azure Machine Learning provides several ways to use your models for scoring. Som
 
 | Method | Datastore access | Description |
 | ----- | :-----: | ----- |
-| [Batch prediction](how-to-run-batch-predictions.md) | ✔ | Make predictions on large quantities of data asynchronously. |
+| [Batch prediction](how-to-use-parallel-run-step.md) | ✔ | Make predictions on large quantities of data asynchronously. |
 | [Web service](how-to-deploy-and-where.md) | &nbsp; | Deploy models as a web service. |
 | [Azure IoT Edge module](how-to-deploy-and-where.md) | &nbsp; | Deploy models to IoT Edge devices. |
 
 
@@ -0,0 +1,95 @@
+---
+title: Debug and troubleshoot ParallelRunStep
+titleSuffix: Azure Machine Learning
+description: Debug and troubleshoot ParallelRunStep in machine learning pipelines in the Azure Machine Learning SDK for Python. Learn common pitfalls for developing with pipelines, and tips to help you debug your scripts before and during remote execution.
+services: machine-learning
+ms.service: machine-learning
+ms.subservice: core
+ms.topic: conceptual
+ms.reviewer: trbye, jmartens, larryfr, vaidyas
+ms.author: trmccorm
+author: tmccrmck
+ms.date: 01/15/2020
+---
+
+# Debug and troubleshoot ParallelRunStep
+[!INCLUDE [applies-to-skus](../../includes/aml-applies-to-basic-enterprise-sku.md)]
+
+In this article, you learn how to debug and troubleshoot the [ParallelRunStep](https://docs.microsoft.com/python/api/azureml-contrib-pipeline-steps/azureml.contrib.pipeline.steps.parallel_run_step.parallelrunstep?view=azure-ml-py) class from the [Azure Machine Learning SDK](https://docs.microsoft.com/python/api/overview/azure/ml/intro?view=azure-ml-py).
+
+## Testing scripts locally
+
+See the [Testing scripts locally section](how-to-debug-pipelines.md#testing-scripts-locally) for machine learning pipelines. Because ParallelRunStep runs as a step in an ML pipeline, the same guidance applies to it.
+
+## Debugging scripts from remote context
+
+The transition from debugging a scoring script locally to debugging a scoring script in an actual pipeline can be a difficult leap. For information on finding your logs in the portal, see the [machine learning pipelines section on debugging scripts from a remote context](how-to-debug-pipelines.md#debugging-scripts-from-remote-context). The information in that section also applies to a parallel run step.
+
+For example, the log file `70_driver_log.txt` contains information from the controller that launches parallel run step code.
+
+Because of the distributed nature of parallel run jobs, there are logs from several different sources. However, two consolidated files are created that provide high-level information:
+
+- `~/logs/overview.txt`: This file provides high-level information about the number of mini-batches (also known as tasks) created so far and the number of mini-batches processed so far. At the end, it shows the result of the job. If the job failed, it shows the error message and where to start troubleshooting.
+
+- `~/logs/sys/master.txt`: This file provides the master node (also known as the orchestrator) view of the running job. It includes task creation, progress monitoring, and the run result.
+
+Logs generated from the entry script by using `EntryScript.logger` and print statements are found in the following files:
+
+- `~/logs/user/<ip_address>/Process-*.txt`: These files contain logs written from entry_script by using `EntryScript.logger`. They also contain the output of print statements (stdout) from entry_script.
+
+When you need a full understanding of how each node executed the score script, look at the individual process logs for each node. The process logs can be found in the `sys/worker` folder, grouped by worker nodes:
+
+- `~/logs/sys/worker/<ip_address>/Process-*.txt`: This file provides detailed info about each mini-batch as it is picked up or completed by a worker. For each mini-batch, this file includes:
+
+    - The IP address and the PID of the worker process. 
+    - The total number of items, successfully processed items count, and failed item count.
+    - The start time, duration, process time and run method time.
+
+You can also find information on the resource usage of the processes for each worker. This information is in CSV format and is located at `~/logs/sys/perf/<ip_address>/`. For a single-node job, the files are available under `~/logs/sys/perf`. For example, when checking for resource utilization, look at the following files:
+
+- `Process-*.csv`: Per worker process resource usage. 
+- `sys.csv`: Per node log.
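
The log layout described above can be sketched as a small helper that groups files by category. This is a minimal sketch, not part of the SDK: it assumes the `logs` folder has already been downloaded to a local directory with its layout unchanged, and the function name `group_parallel_run_logs` and the `logs_root` argument are hypothetical.

```python
from pathlib import Path


def group_parallel_run_logs(logs_root):
    """Group files under a local copy of the `logs` folder by category.

    Assumption: `logs_root` is a local directory that mirrors the layout
    described in this article (overview.txt, sys/, user/).
    """
    root = Path(logs_root)
    # Glob patterns taken from the paths listed in this article.
    # `**` also matches zero directories, which covers single-node jobs
    # whose perf files sit directly under sys/perf.
    patterns = {
        "overview": "overview.txt",
        "master": "sys/master.txt",
        "user": "user/*/Process-*.txt",
        "worker": "sys/worker/*/Process-*.txt",
        "perf": "sys/perf/**/*.csv",
    }
    grouped = {}
    for category, pattern in patterns.items():
        grouped[category] = sorted(
            p.relative_to(root).as_posix() for p in root.glob(pattern)
        )
    return grouped
```

Calling `group_parallel_run_logs("path/to/logs")` returns a dictionary keyed by category, which makes it easier to see at a glance which nodes produced user, worker, or performance logs.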
+
+### How do I log from my user script from a remote context?
+You can get a logger from EntryScript, as shown in the following sample code, to make the logs show up in the **logs/user** folder in the portal.
+
+**A sample entry script using the logger:**
+```python
+from entry_script import EntryScript
+
+def init():
+    """ Initialize the node."""
+    entry_script = EntryScript()
+    logger = entry_script.logger
+    logger.debug("This will show up in files under logs/user on the Azure portal.")
+
+
+def run(mini_batch):
+    """ Accept and return the list back."""
+    # EntryScript is a singleton, so this returns the same instance as the one in init().
+    entry_script = EntryScript()
+    logger = entry_script.logger
+    logger.debug(f"{__file__}: {mini_batch}.")
+    ...
+
+    return mini_batch
+```
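
To exercise an entry script like the one above on a development machine, one option is a local stand-in for `EntryScript`. The following is a sketch under assumptions: it supposes the `entry_script` module is not importable locally, and the stub class mimics only the singleton behavior and the `logger` attribute used in the sample. It is not the SDK's implementation, and the logger name `entry_script_stub` is hypothetical.

```python
import logging


class EntryScript:
    """Local stand-in for the remote EntryScript helper (an assumption,
    not the SDK class). Every call returns the same instance."""

    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            logger = logging.getLogger("entry_script_stub")
            logger.setLevel(logging.DEBUG)
            # Attach a stderr handler once so debug messages are visible.
            if not logger.handlers:
                logger.addHandler(logging.StreamHandler())
            cls._instance.logger = logger
        return cls._instance


def init():
    """Initialize the worker; here it only logs a message."""
    EntryScript().logger.debug("init() called locally.")


def run(mini_batch):
    """Log the mini-batch size and return the mini-batch unchanged."""
    EntryScript().logger.debug("Processing %d item(s).", len(mini_batch))
    return mini_batch
```

Calling `init()` followed by `run(["a.png", "b.png"])` then behaves like a single worker processing one mini-batch, with log output going to the console instead of the **logs/user** folder.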
+
+### How can I pass a side input, such as one or more files containing a lookup table, to all my workers?
+
+Construct a [Dataset](https://docs.microsoft.com/python/api/azureml-core/azureml.core.dataset.dataset?view=azure-ml-py) object containing the side input and register it with your workspace. After that, you can access it in your inference script (for example, in your init() method) as follows:
+
+```python
+from azureml.core.run import Run
+from azureml.core.dataset import Dataset
+
+ws = Run.get_context().experiment.workspace
+lookup_ds = Dataset.get_by_name(ws, "<registered-name>")
+lookup_ds.download(target_path='.', overwrite=True)
+```
+
+## Next steps
+
+* See the SDK reference for help with the [azureml-contrib-pipeline-steps](https://docs.microsoft.com/python/api/azureml-contrib-pipeline-steps/azureml.contrib.pipeline.steps?view=azure-ml-py) package and the [documentation](https://docs.microsoft.com/python/api/azureml-contrib-pipeline-steps/azureml.contrib.pipeline.steps.parallelrunstep?view=azure-ml-py) for the ParallelRunStep class.
+
+* Follow the [advanced tutorial](tutorial-pipeline-batch-scoring-classification.md) on using pipelines with parallel run step.
@@ -181,7 +181,7 @@ To deploy the model, you need the following items:
     >
     > * The Azure Machine Learning SDK doesn't provide a way for web services or IoT Edge deployments to access your data store or datasets. If your deployed model needs to access data stored outside the deployment, like data in an Azure storage account, you must develop a custom code solution by using the relevant SDK. For example, the [Azure Storage SDK for Python](https://github.com/Azure/azure-storage-python).
     >
-    >   An alternative that might work for your scenario is [batch prediction](how-to-run-batch-predictions.md), which does provide access to data stores during scoring.
+    >   An alternative that might work for your scenario is [batch prediction](how-to-use-parallel-run-step.md), which does provide access to data stores during scoring.
 
 * **Dependencies**, like helper scripts or Python/Conda packages required to run the entry script or model.
 
 
@@ -62,7 +62,7 @@ Before deploying, you must define what is needed to run the model as a web servi
     > [!IMPORTANT]
     > The Azure Machine Learning SDK does not provide a way for the web service to access your datastore or datasets. If you need the deployed model to access data stored outside the deployment, such as in an Azure Storage account, you must develop a custom code solution using the relevant SDK. For example, the [Azure Storage SDK for Python](https://github.com/Azure/azure-storage-python).
     >
-    > Another alternative that may work for your scenario is [batch predictions](how-to-run-batch-predictions.md), which does provide access to datastores when scoring.
+    > Another alternative that may work for your scenario is [batch predictions](how-to-use-parallel-run-step.md), which does provide access to datastores when scoring.
 
     For more information on entry scripts, see [Deploy models with Azure Machine Learning](how-to-deploy-and-where.md).