Commit b1b6165

Merge pull request #223739 from SturgeonMi/patch-20
Update how-to-mltable.md
2 parents 674bade + f85c8f4 commit b1b6165

2 files changed, with 5 additions and 5 deletions

articles/machine-learning/how-to-mltable.md

Lines changed: 3 additions & 3 deletions
@@ -406,7 +406,7 @@ If you author files locally (or in a DSVM), you can use `azcopy` or the [Azure S
 |---------|---------|
 |`read_delimited` | `infer_column_types`: Boolean to infer column data types. Defaults to True. Type inference requires that the data source is accessible from the current compute. Currently type inference will only pull the first 200 rows.<br><br>`encoding`: Specify the file encoding. Supported encodings: `utf8`, `iso88591`, `latin1`, `ascii`, `utf16`, `utf32`, `utf8bom` and `windows1252`. Default encoding: `utf8`.<br><br>`header`: user can choose one of the following options: `no_header`, `from_first_file`, `all_files_different_headers`, `all_files_same_headers`. Defaults to `all_files_same_headers`.<br><br>`delimiter`: The separator used to split columns.<br><br>`empty_as_string`: Specify if empty field values should be loaded as empty strings. The default (False) will read empty field values as nulls. Passing this setting as *True* will read empty field values as empty strings. If the values are converted to numeric or datetime, then this setting has no effect, as empty values will be converted to nulls.<br><Br>`include_path_column`: Boolean to keep path information as column in the table. Defaults to False. This setting is useful when reading multiple files, and you want to know which file a particular record originated from. Also, you can keep useful information in file path.<br><br>`support_multi_line`: By default (`support_multi_line=False`), all line breaks, including line breaks in quoted field values, will be interpreted as a record break. Reading data this way is faster and more optimized for parallel execution on multiple CPU cores. However, it may result in silent production of more records with misaligned field values. This setting should be set to True when the delimited files are known to contain quoted line breaks. |
 | `read_parquet` | `include_path_column`: Boolean to keep path information as column in the table. Defaults to False. This setting is useful when you're reading multiple files, and you want to know which file a particular record originated from. Also, you can keep useful information in file path. |
-| `read_delta_lake` | `timestamp_as_of`: Timestamp to be specified for time-travel on the specific Delta Lake data.<br><br>`version_as_of`: Version to be specified for time-travel on the specific Delta Lake data.
+| `read_delta_lake` (preview) | `timestamp_as_of`: Timestamp to be specified for time-travel on the specific Delta Lake data.<br><br>`version_as_of`: Version to be specified for time-travel on the specific Delta Lake data.
 | `read_json_lines` | `include_path_column`: Boolean to keep path information as column in the MLTable. Defaults to False. This setting is useful when you're reading multiple files, and you want to know which file a particular record originated from. Also, you can keep useful information in file path.<br><br>`invalid_lines`: How to handle lines that are invalid JSON. Supported values are `error` and `drop`. Defaults to `error`.<br><br>`encoding`: Specify the file encoding. Supported encodings are `utf8`, `iso88591`, `latin1`, `ascii`, `utf16`, `utf32`, `utf8bom` and `windows1252`. Default is `utf8`.

 ##### Other transformations
@@ -857,7 +857,7 @@ tbl = mltable.load(uri)
 df = tbl.to_pandas_dataframe()
 ```

-### Delta Lake
+### Delta Lake (preview)

 In this example, we assume you have a data in Delta format on Azure Data Lake:

@@ -1250,4 +1250,4 @@ df.head()

 - [Read data in a job](how-to-read-write-data-v2.md#read-data-in-a-job)
 - [Create data assets](how-to-create-data-assets.md#create-data-assets)
-- [Data administration](how-to-administrate-data-authentication.md#data-administration)
+- [Data administration](how-to-administrate-data-authentication.md#data-administration)

articles/machine-learning/reference-yaml-mltable.md

Lines changed: 2 additions & 2 deletions
@@ -161,7 +161,7 @@ If the user doesn't define options for `read_parquet` transformation, default op

 - `include_path_column`: Boolean to keep path information as column in the table. Defaults to False. This setting is useful when you're reading multiple files, and want to know which file a particular record originated from. And you can also keep useful information in file path.

-## MLTable transformations: read_delta_lake
+## MLTable transformations: read_delta_lake (preview)
 ```yaml
 type: mltable
@@ -173,7 +173,7 @@ transformations:
 timestamp_as_of: '2022-08-26T00:00:00Z'
 ```

-### Delta lake transformations
+### Delta lake transformations (preview)

 - `timestamp_as_of`: Datetime string in RFC-3339/ISO-8601 format to be specified for time-travel on the specific Delta Lake data.
 - `version_as_of`: Version to be specified for time-travel on the specific Delta Lake data.
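
Reviewer note on the last hunk: `timestamp_as_of` is documented as a datetime string in RFC-3339/ISO-8601 format, as in the `'2022-08-26T00:00:00Z'` value shown in the YAML. A minimal sketch for sanity-checking such a value before it goes into an MLTable file follows. It is not part of this commit, it uses only the Python standard library rather than the `mltable` package, and the helper name `parse_timestamp_as_of` is hypothetical.

```python
from datetime import datetime, timezone


def parse_timestamp_as_of(value: str) -> datetime:
    """Parse an RFC-3339 style string such as '2022-08-26T00:00:00Z'.

    Hypothetical helper for illustration only; mltable does its own parsing.
    """
    # datetime.fromisoformat() accepts a trailing 'Z' only from Python 3.11,
    # so normalise it to an explicit UTC offset first.
    if value.endswith(("Z", "z")):
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)  # raises ValueError if malformed
    if parsed.tzinfo is None:
        # RFC 3339 timestamps always carry an offset.
        raise ValueError("timestamp_as_of should carry a UTC offset or 'Z'")
    return parsed.astimezone(timezone.utc)


ts = parse_timestamp_as_of("2022-08-26T00:00:00Z")
print(ts.isoformat())
```

Normalising to UTC makes two spellings of the same instant compare equal, which is convenient when checking a value against a known table commit time.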