Commit 11b6555

Minor grammar changes to how-to-mltable.md
1 parent 1078007 commit 11b6555

1 file changed, with 18 additions and 18 deletions

articles/machine-learning/how-to-mltable.md

Lines changed: 18 additions & 18 deletions
@@ -24,11 +24,11 @@ ms.custom: contperf-fy21q1, devx-track-python, data4ml
 Azure Machine Learning supports a Table type (`mltable`). This allows for the creation of a *blueprint* that defines how to load data files into memory as a Pandas or Spark data frame. In this article you learn:
 
 > [!div class="checklist"]
-> - When to use Tables instead of Files or Folders.
-> - How to install the MLTable SDK.
-> - How to define a data loading blueprint using an `MLTable` file.
-> - Examples that show use of Tables in Azure ML.
-> - How to use Tables during interactive development (for example, in a notebook).
+> - When to use Azure ML Tables instead of Files or Folders.
+> - How to install the `mltable` SDK.
+> - How to define a data loading blueprint using an `mltable` file.
+> - Examples that show how `mltable` is used in Azure ML.
+> - How to use the `mltable` during interactive development (for example, in a notebook).
 
 ## Prerequisites

@@ -55,7 +55,7 @@ git clone --depth 1 https://github.com/Azure/azureml-examples
 > [!TIP]
 > Use `--depth 1` to clone only the latest commit to the repository. This reduces the time needed to complete the operation.
 
-The examples relevant to Azure Machine Learning Tables can be found in the following folder of the clone repo:
+The examples relevant to Azure Machine Learning Tables can be found in the following folder of the cloned repo:
 
 ```bash
 cd azureml-examples/sdk/python/using-mltable
@@ -65,7 +65,7 @@ cd azureml-examples/sdk/python/using-mltable

 Azure Machine Learning Tables (`mltable`) allow you to define how you want to *load* your data files into memory, as a Pandas and/or Spark data frame. Tables have two key features:

-1. **An `MLTable` file.** A YAML-based file that defines the data loading *blueprint*. In the MLTable file, you can specify:
+1. **An MLTable file.** A YAML-based file that defines the data loading *blueprint*. In the MLTable file, you can specify:
 - The storage location(s) of the data - local, in the cloud, or on a public http(s) server.
 - *Globbing* patterns over cloud storage. These locations can specify sets of filenames, with wildcard characters (`*`).
 - *read transformation* - for example, the file format type (delimited text, Parquet, Delta, json), delimiters, headers, etc.
@@ -86,9 +86,9 @@ Azure Machine Learning Tables are useful in the following scenarios:
 - You want to train ML models using Azure Machine Learning AutoML.
 
 > [!TIP]
-> Azure ML *doesn't* require use of Azure ML Tables (`mltable`) for your tabular data. You can use Azure ML File (`uri_file`) and Folder (`uri_folder`) types, and your own parsing logic loads the data into a Pandas or Spark data frame.
+> Azure ML *doesn't require* use of Azure ML Tables (`mltable`) for your tabular data. You can use Azure ML File (`uri_file`) and Folder (`uri_folder`) types, and your own parsing logic loads the data into a Pandas or Spark data frame.
 >
-> If you have a simple CSV file or Parquet folder, it's **easier** to use Azure ML Files/Folders instead of than Tables.
+> If you have a simple CSV file or Parquet folder, it's **easier** to use Azure ML Files/Folders instead of Tables.
 
 ## Azure Machine Learning Tables Quickstart

@@ -126,8 +126,8 @@ With this data, you want to load into a Pandas data frame:

 Pandas code handles this. However, achieving *reproducibility* would become difficult because you must either:

-- share code, which means that if the schema changes (for example, a column name change) then all users must update their code, or
-- write an ETL pipeline, which has heavy overhead.
+- Share code, which means that if the schema changes (for example, a column name change) then all users must update their code, or
+- Write an ETL pipeline, which has heavy overhead.

 Azure Machine Learning Tables provide a light-weight mechanism to serialize (save) the data loading steps in an `MLTable` file, so that you and members of your team can *reproduce* the Pandas data frame. If the schema changes, you only update the `MLTable` file, instead of updates in many places that involve Python data loading code.

@@ -508,9 +508,9 @@ Azure Machine Learning Tables support reading from:

 The `mltable` flexibility allows for data loading into a single dataframe from a combination of:

-- local and cloud storage
-- different cloud storage locations (for example: different blob containers)
-- files, folder and glob patterns.
+- Local and cloud storage
+- Different cloud storage locations (for example: different blob containers)
+- Files, folder and glob patterns.

 For example:

@@ -699,7 +699,7 @@ You can create a table containing the paths on cloud storage. This example has s
 1.jpeg
 ```

-MLTable can construct a table that contains the storage paths of these images and their folder names (labels), which can be used to stream the images. The following code shows how to create the MLTable:
+The `mltable` can construct a table that contains the storage paths of these images and their folder names (labels), which can be used to stream the images. The following code shows how to create the MLTable:

 ```python
 import mltable
@@ -722,7 +722,7 @@ print(df.head())
 tbl.save("./pets")
 ```

-The following code shows how to open the storage location in the Pandas data frame, and plot the images in a grid:
+The following code shows how to open the storage location in the Pandas data frame, and plot the images:

 ```python
 # plot images on a grid. Note this takes ~1min to execute.
@@ -742,7 +742,7 @@ for i in range(1, columns*rows +1):

 #### Create a data asset to aid sharing and reproducibility

-You have your MLTable file currently saved on disk, which makes it hard to share with Team members. When you create a data asset in Azure Machine Learning, your MLTable is uploaded to cloud storage and "bookmarked", which allows your Team members to access the MLTable using a friendly name. Also, the data asset is versioned.
+You have your `mltable` file currently saved on disk, which makes it hard to share with Team members. When you create a data asset in Azure Machine Learning, the `mltable` is uploaded to cloud storage and "bookmarked", which allows your Team members to access the `mltable` using a friendly name. Also, the data asset is versioned.

 ```python
 import time
@@ -771,7 +771,7 @@ my_data = Data(
 ml_client.data.create_or_update(my_data)
 ```

-Now that you have your MLTable stored in the cloud, you and Team members can access it with a friendly name in an interactive session (for example, a notebook):
+Now that the `mltable` is stored in the cloud, you and your Team members can access it with a friendly name in an interactive session (for example, a notebook):

 ```python
 import mltable
