Commit 55cf011

Merge pull request #248840 from SturgeonMi/SturgeonMi-patch-mltable
Update mltable save api
2 parents c17f901 + 88648fb commit 55cf011

1 file changed: +16 -7 lines changed

articles/machine-learning/how-to-mltable.md

Lines changed: 16 additions & 7 deletions
Original file line number | Diff line number | Diff line change
@@ -203,20 +203,29 @@ You can optionally choose to load the MLTable object into Pandas, using:
203 203
```
204 204

205 205
#### Save the data loading steps
206-
Next, save all your data loading steps into an MLTable file. If you save your data loading steps, you can reproduce your Pandas data frame at a later point in time, and you don't need to redefine the data loading steps in your code.
206+
Next, save all your data loading steps into an MLTable file. Saving your data loading steps in an MLTable file lets you reproduce your Pandas data frame at a later point in time, without needing to redefine the data loading steps in code each time.
207 207

208+
You can save the MLTable YAML file to cloud storage or to a local path.
208 209
```python
209-
# serialize the data loading steps into an MLTable file
210-
tbl.save("./nyc_taxi")
210+
# save the data loading steps in an MLTable file to cloud storage
211+
# NOTE: the tbl object was defined in the previous snippet.
212+
tbl.save(save_path_dirc="azureml://subscriptions/<subid>/resourcegroups/<rgname>/workspaces/<wsname>/datastores/<name>/paths/titanic", collocated=True, show_progress=True, allow_copy_errors=False, overwrite=True)
211 213
```
212 214

213-
You can optionally view the contents of the MLTable file, to understand how the data loading steps are serialized into a file:
214-
215 215
```python
216-
with open("./nyc_taxi/MLTable", "r") as f:
217-
print(f.read())
216+
# save the data loading steps in an MLTable file to a local path
217+
# NOTE: the tbl object was defined in the previous snippet.
218+
tbl.save("./titanic")
218 219
```
219 220

221+
> [!IMPORTANT]
222+
> - If `collocated == True`, the data is copied into the same folder as the MLTable YAML file when it isn't already there, and the MLTable YAML file uses relative paths.
223+
> - If `collocated == False`, the data isn't moved. The MLTable YAML file uses absolute paths for cloud data and relative paths for local data.
224+
> - This parameter combination isn't supported: the data is local, `collocated == False`, and `save_path_dirc` is a cloud directory. Instead, upload your local data to the cloud and use the cloud data paths for the MLTable.
225+
> - The parameters `show_progress` (default `True`), `allow_copy_errors` (default `False`), and `overwrite` (default `True`) are optional.
226+
>
227+
228+
220 229
### Reproduce data loading steps
221 230
Now that the data loading steps have been serialized into a file, you can reproduce them at any point in time, with the load() method. This way, you don't need to redefine your data loading steps in code, and you can more easily share the file.
222 231

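The `collocated` rules in the `[!IMPORTANT]` note added by this commit can be read as a small decision table. The sketch below models that table in plain Python so the three cases are easy to compare. It is not part of the `mltable` package: `plan_save`, `SavePlan`, `is_cloud`, and the list of cloud URI prefixes are hypothetical names and assumptions introduced only for illustration, and the real library may decide these cases differently.

```python
from dataclasses import dataclass

# Assumption: these URI schemes mark a path as "cloud". The commit only shows azureml://.
CLOUD_PREFIXES = ("azureml://", "https://", "abfss://", "wasbs://")


def is_cloud(path: str) -> bool:
    """Return True when the path looks like a cloud URI (hypothetical heuristic)."""
    return path.startswith(CLOUD_PREFIXES)


@dataclass(frozen=True)
class SavePlan:
    copies_data: bool  # whether the data would be copied next to the MLTable file
    path_style: str    # "relative" or "absolute" paths written into the MLTable file


def plan_save(data_path: str, save_dir: str, collocated: bool,
              already_collocated: bool = False) -> SavePlan:
    """Model the note's rules; this does not call or imitate the mltable API."""
    if collocated:
        # Data ends up beside the MLTable file, so relative paths are enough.
        return SavePlan(copies_data=not already_collocated, path_style="relative")
    if not is_cloud(data_path) and is_cloud(save_dir):
        # Local data, collocated == False, cloud save directory: unsupported per the note.
        raise ValueError("upload the local data to the cloud and use its cloud paths")
    # Data is not moved: absolute paths for cloud data, relative paths for local data.
    style = "absolute" if is_cloud(data_path) else "relative"
    return SavePlan(copies_data=False, path_style=style)
```

For example, `plan_save("./data/titanic.csv", "./titanic", collocated=False)` describes a save that leaves the data in place and records relative paths, while the same call with an `azureml://` save directory raises `ValueError`, mirroring the unsupported combination in the note.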