Commit b9f746f

Merge pull request #267451 from santiagxf/santiagxf-patch-1
Update how-to-mlflow-batch.md
2 parents f370ba9 + 8ab9fdd commit b9f746f

1 file changed

+12
-14
lines changed

articles/machine-learning/how-to-mlflow-batch.md

Lines changed: 12 additions & 14 deletions
@@ -223,13 +223,13 @@ Output predictions are generated in the `predictions.csv` file as indicated in t
 
 The file is structured as follows:
 
-* There is one row per each data point that was sent to the model. For tabular data, this means that one row is generated for each row in the input files and hence the number of rows in the generated file (`predictions.csv`) equals the sum of all the rows in all the processed files. For other data types, there is one row per each processed file.
+* There is one row per each data point that was sent to the model. For tabular data, it means that the file (`predictions.csv`) contains one row for every row present in each of the processed files. For other data types (e.g. images, audio, text), there is one row per each processed file.
 
-* Two columns are indicated:
-
-* The file name where the data was read from. In tabular data, use this field to know which prediction belongs to which input data. For any given file, predictions are returned in the same order they appear in the input file so you can rely on the row number to match the corresponding prediction.
-* The prediction associated with the input data. This value is returned "as-is" it was provided by the model's `predict().` function.
+* The following columns are in the file (in order):
 
+* `row` (optional), the corresponding row index in the input data file. This only applies if the input data is tabular. Predictions are returned in the same order they appear in the input file so you can rely on the row number to match the corresponding prediction.
+* `prediction`, the prediction associated with the input data. This value is returned "as-is" it was provided by the model's `predict().` function.
+* `file_name`, the file name where the data was read from. In tabular data, use this field to know which prediction belongs to which input data.
 
 You can download the results of the job by using the job name:
 
@@ -248,17 +248,15 @@ Once the file is downloaded, you can open it using your favorite tool. The follo
 
 [!notebook-python[] (~/azureml-examples-main/sdk/python/endpoints/batch/deploy-models/heart-classifier-mlflow/mlflow-for-batch-tabular.ipynb?name=read_outputs)]
 
-> [!WARNING]
-> The file `predictions.csv` may not be a regular CSV file and can't be read correctly using `pandas.read_csv()` method.
-
 The output looks as follows:
 
-| file | prediction |
-| -------------------------- | ----------- |
-| heart-unlabeled-0.csv | 0 |
-| heart-unlabeled-0.csv | 1 |
-| ... | 1 |
-| heart-unlabeled-3.csv | 0 |
+|row | prediction | file |
+|-----| ----------- | -------------------------- |
+| 0 | 0 | heart-unlabeled-0.csv |
+| 1 | 1 | heart-unlabeled-0.csv |
+| 2 | 0 | heart-unlabeled-0.csv |
+| ... | ... | ... |
+| 307 | 0 | heart-unlabeled-3.csv |
 
 > [!TIP]
 > Notice that in this example the input data was tabular data in `CSV` format and there were 4 different input files (heart-unlabeled-0.csv, heart-unlabeled-1.csv, heart-unlabeled-2.csv and heart-unlabeled-3.csv).
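
The column layout introduced by this change (`row`, `prediction`, `file_name`, in that order) can be illustrated with a minimal sketch that groups predictions by their input file. This is not the notebook cell referenced above. It assumes the file has no header row, so the column names are supplied by the reader, and that every record carries the optional `row` column (tabular input). The helper name `group_predictions_by_file` and the sample rows are hypothetical and exist only for illustration.

```python
import csv
import io
from collections import defaultdict

# Column order as described in the diff; assumed, since the file is
# presumed to carry no header row of its own.
COLUMNS = ["row", "prediction", "file_name"]


def group_predictions_by_file(csv_text):
    """Return {file_name: [prediction, ...]} ordered by row index.

    Hypothetical helper: assumes tabular input, so `row` is present
    on every record.
    """
    reader = csv.DictReader(io.StringIO(csv_text), fieldnames=COLUMNS)
    grouped = defaultdict(list)
    for record in reader:
        grouped[record["file_name"]].append(
            (int(record["row"]), record["prediction"])
        )
    # Sort by row index so each prediction lines up with its input row.
    return {
        name: [pred for _, pred in sorted(rows, key=lambda pair: pair[0])]
        for name, rows in grouped.items()
    }


# Illustrative sample only; these are not real job outputs.
sample = (
    "0,0,heart-unlabeled-0.csv\n"
    "1,1,heart-unlabeled-0.csv\n"
    "0,1,heart-unlabeled-1.csv\n"
)
result = group_predictions_by_file(sample)
print(result)
```

Predictions are kept as strings because the documentation states that values are returned as-is from the model's `predict()` function, so their type depends on the model.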
