
Commit 9aa7a28

Links to SDK and change user to you.

Marc Gelormino committed
1 parent d668dbc

1 file changed: +8 -9 lines changed

articles/machine-learning/how-to-use-parallel-run-step.md

Lines changed: 8 additions & 9 deletions
@@ -30,7 +30,7 @@ In this article, you learn the following tasks:
 
 ## Prerequisites
 
-* If you dont have an Azure subscription, create a free account before you begin. Try the [free or paid version of the Azure Machine Learning](https://aka.ms/AMLFree).
+* If you don't have an Azure subscription, create a free account before you begin. Try the [free or paid version of the Azure Machine Learning](https://aka.ms/AMLFree).
 
 * For a guided quickstart, complete the [setup tutorial](tutorial-1st-experiment-sdk-setup.md) if you don't already have an Azure Machine Learning workspace or notebook virtual machine.
 

@@ -66,7 +66,7 @@ mnist_blob = Datastore.register_azure_blob_container(ws,
 
 Next, specify the workspace default datastore as the output datastore. You'll use it for inference output.
 
-When you create your workspace, [Azure Files](https://docs.microsoft.com/azure/storage/files/storage-files-introduction) and [Blob storage](https://docs.microsoft.com/azure/storage/blobs/storage-blobs-introduction) are attached to the workspace by default. Azure Files is the default datastore for a workspace, but you can also use Blob storage as a datastore. For more information, see [Azure storage options](https://docs.microsoft.com/azure/storage/common/storage-decide-blobs-files-disks).
+When you create your workspace, [Azure Files](https://docs.microsoft.com/azure/storage/files/storage-files-introduction) and [Blob storage](https://docs.microsoft.com/azure/storage/blobs/storage-blobs-introduction) are attached to the workspace by default. Azure Files is the default datastore for a workspace, but you can also use Blob storage as a datastore. For more information, see [Azure Storage options](https://docs.microsoft.com/azure/storage/common/storage-decide-blobs-files-disks).
 
 ```python
 def_data_store = ws.get_default_datastore()
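
For orientation, a minimal sketch of the datastore setup this hunk sits in, assuming the sample's public `pipelinedata` account and `sampledata` container and a local `config.json` for the workspace; the names are illustrative, not prescriptive:

```python
from azureml.core import Workspace
from azureml.core.datastore import Datastore

# Load the workspace from a local config.json.
ws = Workspace.from_config()

# Register the blob container that holds the MNIST sample data.
mnist_blob = Datastore.register_azure_blob_container(ws,
                      datastore_name="mnist_datastore",
                      container_name="sampledata",
                      account_name="pipelinedata",
                      overwrite=True)

# The workspace default datastore (Azure Files) will hold inference output.
def_data_store = ws.get_default_datastore()
```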
@@ -81,7 +81,7 @@ Now you need to configure data inputs and outputs, including:
 - The directory that contains the labels.
 - The directory for output.
 
-`Dataset` is a class for exploring, transforming, and managing data in Azure Machine Learning. This class has two types: `TabularDataset` and `FileDataset`. In this example, you'll use `FileDataset` as the inputs to the batch inference pipeline step.
+[`Dataset`](https://docs.microsoft.com/python/api/azureml-core/azureml.core.dataset.dataset?view=azure-ml-py) is a class for exploring, transforming, and managing data in Azure Machine Learning. This class has two types: [`TabularDataset`](https://docs.microsoft.com/python/api/azureml-core/azureml.data.tabulardataset?view=azure-ml-py) and [`FileDataset`](https://docs.microsoft.com/python/api/azureml-core/azureml.data.filedataset?view=azure-ml-py). In this example, you'll use `FileDataset` as the inputs to the batch inference pipeline step.
 
 > [!NOTE]
 > `FileDataset` support in batch inference is restricted to Azure Blob storage for now.
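
A minimal sketch of creating the `FileDataset` input from the registered blob datastore; the blob path and the input name are illustrative:

```python
from azureml.core.dataset import Dataset

# Point a FileDataset at the image files in the registered datastore.
mnist_ds = Dataset.File.from_files(path=(mnist_blob, "mnist/*.png"))

# Name the input so the pipeline step and entry script can reference it.
named_mnist_ds = mnist_ds.as_named_input("mnist_ds")
```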
@@ -90,7 +90,7 @@ You can also reference other datasets in your custom inference script. For examp
 
 For more information about Azure Machine Learning datasets, see [Create and access datasets (preview)](https://docs.microsoft.com/azure/machine-learning/how-to-create-register-datasets).
 
-`PipelineData` objects are used for transferring intermediate data between pipeline steps. In this example, you use it for inference outputs.
+[`PipelineData`](https://docs.microsoft.com/python/api/azureml-pipeline-core/azureml.pipeline.core.pipelinedata?view=azure-ml-py) objects are used for transferring intermediate data between pipeline steps. In this example, you use it for inference outputs.
 
 ```python
 from azureml.core.dataset import Dataset
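
And a matching sketch for the output side, assuming `def_data_store` from the earlier snippet; the name `inferences` is illustrative:

```python
from azureml.pipeline.core import PipelineData

# Intermediate output location; the batch inference step writes its
# results (for example, the append_row output file) here.
output_dir = PipelineData(name="inferences", datastore=def_data_store)
```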
@@ -180,13 +180,13 @@ model = Model.register(model_path="models/",
 ## Write your inference script
 
 >[!Warning]
->The following code is only a sample that the [sample notebook](https://aka.ms/batch-inference-notebooks) uses. Youll need to create your own script for your scenario.
+>The following code is only a sample that the [sample notebook](https://aka.ms/batch-inference-notebooks) uses. You'll need to create your own script for your scenario.
 
 The script *must contain* two functions:
 - `init()`: Use this function for any costly or common preparation for later inference. For example, use it to load the model into a global object. This function will be called only once at beginning of process.
 - `run(mini_batch)`: The function will run for each `mini_batch` instance.
 - `mini_batch`: Parallel run step will invoke run method and pass either a list or Pandas DataFrame as an argument to the method. Each entry in min_batch will be - a file path if input is a FileDataset, a Pandas DataFrame if input is a TabularDataset.
-- `response`: run() method should return a Pandas DataFrame or an array. For append_row output_action, these returned elements are appended into the common output file. For summary_only, the contents of the elements are ignored. For all output actions, each returned output element indicates one successful run of input element in the input mini-batch. User should make sure that enough data is included in run result to map input to run result. Run output will be written in output file and not guaranteed to be in order, user should use some key in the output to map it to input.
+- `response`: run() method should return a Pandas DataFrame or an array. For append_row output_action, these returned elements are appended into the common output file. For summary_only, the contents of the elements are ignored. For all output actions, each returned output element indicates one successful run of input element in the input mini-batch. You should make sure that enough data is included in run result to map input to run result. Run output will be written in output file and not guaranteed to be in order, you should use some key in the output to map it to input.
 
 ```python
 # Snippets from a sample script.
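
A minimal skeleton of the two required functions for a `FileDataset` input; `load_my_model()` and `model.predict()` are hypothetical stand-ins for your own loading and scoring code:

```python
import os
from azureml.core.model import Model

def init():
    # Called once per worker process: do costly setup here, such as
    # loading the registered model into a global for run() to reuse.
    global model
    model_path = Model.get_model_path("mnist")  # illustrative model name
    model = load_my_model(model_path)           # hypothetical loader

def run(mini_batch):
    # Called once per mini-batch. For a FileDataset input, mini_batch is
    # a list of file paths. Return one element per successfully processed
    # input, and include a key (here, the file name) so the unordered
    # output rows can be mapped back to their inputs.
    results = []
    for file_path in mini_batch:
        prediction = model.predict(file_path)   # hypothetical scoring call
        results.append("{}: {}".format(os.path.basename(file_path), prediction))
    return results
```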
@@ -316,8 +316,7 @@ parallelrun_step = ParallelRunStep(
     models=[model],
     parallel_run_config=parallel_run_config,
     inputs=[named_mnist_ds],
-    output=output_dir,
-    arguments=[],
+    output=output_dir, arguments=[],
     allow_reuse=True
 )
 ```
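
For context, a hedged sketch of the configuration feeding this step. In the SDK release this article targets, `ParallelRunConfig` and `ParallelRunStep` live in `azureml.contrib.pipeline.steps` (they later moved to `azureml.pipeline.steps`); `batch_env`, `compute_target`, `model`, `named_mnist_ds`, and `output_dir` are assumed from earlier sections, and the literal values are illustrative:

```python
from azureml.contrib.pipeline.steps import ParallelRunConfig, ParallelRunStep

parallel_run_config = ParallelRunConfig(
    source_directory=".",              # folder containing the entry script
    entry_script="batch_scoring.py",   # illustrative script name
    mini_batch_size="5",               # files handed to each run() call
    error_threshold=10,                # failed items tolerated before aborting
    output_action="append_row",        # concatenate run() results into one file
    environment=batch_env,             # assumed: Environment defined earlier
    compute_target=compute_target,     # assumed: AmlCompute cluster
    node_count=2)

parallelrun_step = ParallelRunStep(
    name="batch-scoring",              # illustrative step name
    models=[model],
    parallel_run_config=parallel_run_config,
    inputs=[named_mnist_ds],
    output=output_dir, arguments=[],
    allow_reuse=True
)
```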
@@ -327,7 +326,7 @@ parallelrun_step = ParallelRunStep(
 
 ### Run the pipeline
 
-Now, run the pipeline. First, create a `Pipeline` object by using your workspace reference and the pipeline step that you created. The `steps` parameter is an array of steps. In this case, there's only one step for batch scoring. To build pipelines that have multiple steps, place the steps in order in this array.
+Now, run the pipeline. First, create a [`Pipeline`](https://docs.microsoft.com/python/api/azureml-pipeline-core/azureml.pipeline.core.pipeline%28class%29?view=azure-ml-py) object by using your workspace reference and the pipeline step that you created. The `steps` parameter is an array of steps. In this case, there's only one step for batch scoring. To build pipelines that have multiple steps, place the steps in order in this array.
 
 Next, use the `Experiment.submit()` function to submit the pipeline for execution.
 
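
A minimal sketch of those two calls, assuming `ws` and `parallelrun_step` from the snippets above; the experiment name is illustrative:

```python
from azureml.core import Experiment
from azureml.pipeline.core import Pipeline

# One-step pipeline: the steps array holds only the batch scoring step.
pipeline = Pipeline(workspace=ws, steps=[parallelrun_step])

# Submit under an experiment and stream status until the run finishes.
pipeline_run = Experiment(ws, "batch_scoring").submit(pipeline)
pipeline_run.wait_for_completion(show_output=True)
```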
