
Commit 6c85c58

Update side_input documentation
1 parent 9ac4b59 commit 6c85c58

File tree

1 file changed: +21 -8 lines changed


articles/machine-learning/how-to-debug-parallel-run-step.md

Lines changed: 21 additions & 8 deletions
@@ -35,7 +35,7 @@ Because of the distributed nature of ParallelRunStep jobs, there are logs from s

Logs generated from entry script using EntryScript helper and print statements will be found in following files:

- - `~/logs/user/logs/`: This folder contains logs written from entry_script using EntryScript helper. Also contains print statement (stdout) from entry_script.
+ - `~/logs/user/<node_name>.log.txt`: These are the logs written from entry_script using EntryScript helper. Also contains print statement (stdout) from entry_script.

For a concise understanding of errors in your script there is:
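To illustrate the log-location change above: messages written through the `EntryScript` helper's logger, as well as plain `print` output from the entry script, land in the per-node file `~/logs/user/<node_name>.log.txt`. A minimal entry-script sketch, assuming the helper is imported from `azureml_user.parallel_run` as in the ParallelRunStep samples (not part of this commit):

```python
from azureml_user.parallel_run import EntryScript


def init():
    # Messages written through the EntryScript logger end up in
    # ~/logs/user/<node_name>.log.txt on each compute node.
    entry_script = EntryScript()
    entry_script.logger.info("init() finished")


def run(mini_batch):
    entry_script = EntryScript()
    entry_script.logger.info("run() received %d items", len(mini_batch))
    print("stdout from the entry script lands in the same per-node log file")
    return mini_batch
```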

@@ -82,19 +82,32 @@ def run(mini_batch):

### How could I pass a side input such as, a file or file(s) containing a lookup table, to all my workers?

- Construct a [Dataset](https://docs.microsoft.com/python/api/azureml-core/azureml.core.dataset.dataset?view=azure-ml-py) object containing the side input and register with your workspace. After that you can access it in your inference script (for example, in your init() method) as follows:
+ Construct a [Dataset](https://docs.microsoft.com/python/api/azureml-core/azureml.core.dataset.dataset?view=azure-ml-py) containing the side input and register it with your workspace. Pass it to the `side_inputs` parameter of your `ParallelRunStep`. Additionally, you can add its path to the `arguments` list to easily access its mounted path:

```python
- from azureml.core.run import Run
- from azureml.core.dataset import Dataset
+ label_config = label_ds.as_named_input("labels_input")
+ batch_score_step = ParallelRunStep(
+     name=parallel_step_name,
+     inputs=[input_images.as_named_input("input_images")],
+     output=output_dir,
+     arguments=["--labels_dir", label_config],
+     side_inputs=[label_config],
+     parallel_run_config=parallel_run_config,
+ )
+ ```
+
+ After that you can access it in your inference script (for example, in your init() method) as follows:
+
+ ```python
+ parser = argparse.ArgumentParser()
+ parser.add_argument('--labels_dir', dest="labels_dir", required=True)
+ args, _ = parser.parse_known_args()

- ws = Run.get_context().experiment.workspace
- lookup_ds = Dataset.get_by_name(ws, "<registered-name>")
- lookup_ds.download(target_path='.', overwrite=True)
+ labels_path = args.labels_dir
```

## Next steps

* See the SDK reference for help with the [azureml-contrib-pipeline-step](https://docs.microsoft.com/python/api/azureml-contrib-pipeline-steps/azureml.contrib.pipeline.steps?view=azure-ml-py) package and the [documentation](https://docs.microsoft.com/python/api/azureml-contrib-pipeline-steps/azureml.contrib.pipeline.steps.parallelrunstep?view=azure-ml-py) for ParallelRunStep class.

- * Follow the [advanced tutorial](tutorial-pipeline-batch-scoring-classification.md) on using pipelines with parallel run step.
+ * Follow the [advanced tutorial](tutorial-pipeline-batch-scoring-classification.md) on using pipelines with ParallelRunStep and for an example of passing another file as a side input.
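
For readers following the new side-input example: it assumes `label_ds`, `input_images`, `parallel_run_config`, and the other pipeline variables already exist. Below is a minimal sketch of how such a side-input dataset might be created and registered, and how the mounted path could then be consumed inside `init()`. Names such as `workspaceblobstore`, `labels/`, and `lookup.csv` are illustrative assumptions, not part of the commit:

```python
import argparse
import os

from azureml.core import Dataset, Datastore, Workspace

# --- Pipeline-authoring side: build and register the side-input dataset ---
ws = Workspace.from_config()
datastore = Datastore.get(ws, "workspaceblobstore")

# A file dataset pointing at the folder that holds the lookup table.
label_ds = Dataset.File.from_files(path=(datastore, "labels/"))
label_ds = label_ds.register(workspace=ws, name="labels", create_new_version=True)


# --- Entry-script side: read the lookup table from the mounted side-input path ---
LOOKUP = {}


def init():
    # The side input is mounted on the compute node; its local path
    # arrives through the --labels_dir argument wired up in the pipeline.
    parser = argparse.ArgumentParser()
    parser.add_argument("--labels_dir", dest="labels_dir", required=True)
    args, _ = parser.parse_known_args()

    lookup_file = os.path.join(args.labels_dir, "lookup.csv")  # assumed file name
    with open(lookup_file) as f:
        for line in f:
            key, value = line.rstrip("\n").split(",", 1)
            LOOKUP[key] = value
```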
