The method receives a list of file paths as a parameter (`mini_batch`).
> Batch deployments distribute work at the file level, which means that a folder containing 100 files with mini-batches of 10 files generates 10 batches of 10 files each. Notice that this happens regardless of the size of the files involved. If your files are too big to be processed in large mini-batches, we suggest either splitting the files into smaller ones to achieve a higher level of parallelism, or decreasing the number of files per mini-batch. At this moment, batch deployment can't account for skews in the file size distribution.
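The file-level split described in the note above is simple counting arithmetic; a minimal sketch (the helper name `mini_batch_count` is illustrative, not part of any Azure ML API):

```python
import math

def mini_batch_count(total_files: int, files_per_batch: int) -> int:
    # Batch deployments split work by file count only, never by file size,
    # so a folder of 100 files with mini-batches of 10 yields 10 batches.
    return math.ceil(total_files / files_per_batch)
```

For example, 100 files with mini-batches of 10 produce 10 batches, and 101 files produce 11 (the last batch holds a single file).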
The `run()` method should return a Pandas `DataFrame` or an array/list. Each returned output element represents one successful run of an input element in the `mini_batch`. For file or folder data assets, each row/element returned represents a single file processed. For a tabular data asset, each row/element returned represents a row in a processed file.
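As a hedged sketch of this contract for a file data asset, a `run()` method might look like the following. The `score_file` helper is a placeholder standing in for your model's real inference logic, not part of any Azure ML API; here it just returns the file size so the sketch runs end to end:

```python
import os
import pandas as pd

def score_file(path: str):
    # Placeholder for real model inference; replace with your own
    # prediction code (e.g., model.predict on the file's contents).
    return os.path.getsize(path)

def run(mini_batch):
    # mini_batch is a list of file paths; return one row per file
    # processed so each output element maps to one input element.
    rows = []
    for file_path in mini_batch:
        rows.append({
            "file": os.path.basename(file_path),
            "prediction": score_file(file_path),
        })
    return pd.DataFrame(rows)
```

Returning a `DataFrame` (rather than raw objects) keeps the appended output file readable, which is also what the warning below recommends.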
> [!IMPORTANT]
> __How to write predictions?__
> If you need to write predictions in a different way, you can [customize outputs in batch deployments](how-to-deploy-model-custom-output.md).
> [!WARNING]
> Do not output complex data types (or lists of complex data types) other than `pandas.DataFrame` in the `run` function. Those outputs are transformed to strings and become hard to read.
The resulting DataFrame or array is appended to the indicated output file. There's no requirement on the cardinality of the results (one file can generate one or many rows/elements in the output). All elements in the result DataFrame or array are written to the output file as-is (as long as the `output_action` isn't `summary_only`).
By default, the batch deployment writes the model's predictions in a single file as indicated in the deployment. However, in some cases you need to write the predictions in multiple files. For instance, if the input data is partitioned, you'd typically want to generate partitioned output too. In those cases, you can [customize outputs in batch deployments](how-to-deploy-model-custom-output.md) to indicate:
> [!div class="checklist"]
> * The file format (CSV, parquet, JSON, etc.) used to write predictions.
> * The way data is partitioned in the output.
Read the article [Customize outputs in batch deployments](how-to-deploy-model-custom-output.md) for an example about how to achieve it.
### Using models that are folders
The environment variable `AZUREML_MODEL_DIR` contains the path to where the selected model is located, and it is typically used in the `init()` function to load the model into memory. However, some models keep their files inside a subfolder, and you may need to account for that when loading them. You can identify the folder structure of your model as follows:
1. Go to [Azure Machine Learning portal](https://ml.azure.com).