Commit b44d46f

Update reference-yaml-mltable.md
1 parent 83b15cb commit b44d46f

1 file changed: +5 -2 lines changed

articles/machine-learning/reference-yaml-mltable.md

Lines changed: 5 additions & 2 deletions
@@ -84,6 +84,9 @@ These transformations apply to all mltable-artifact files:
 - `convert_column_types`
     - `columns`: The column name you want to convert type of.
     - `column_type`: The type you want to convert the column to. For example: string, float, int, or datetime with specified formats.
+- `extract_partition_format_into_columns`: Specify the partition format of path. Defaults to None. The partition information of each path will be extracted into columns based on the specified format. Format part '{column_name}' creates string column, and '{column_name:yyyy/MM/dd/HH/mm/ss}' creates datetime column, where 'yyyy', 'MM', 'dd', 'HH', 'mm' and 'ss' are used to extract year, month, day, hour, minute and second for the datetime type.
+
+The format should start from the position of first partition key until the end of file path. For example, given the path '../Accounts/2022/01/01/data.csv' where the partition is by department name and time, partition_format='/{Department}/{PartitionDate:yyyy/MM/dd}/data.csv' creates a string column 'Department' with the value 'Accounts' and a datetime column 'PartitionDate' with the value '2022-01-01'. Our principle here is to support transforms specific to data delivery and not to get into wider feature engineering transforms.
 
 ## MLTable transformations: read_delimited
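The partition-format rule added in this hunk can be illustrated with a small Python sketch. It is not the MLTable implementation: `extract_partitions`, `_TOKENS`, and the regex-based approach are assumptions made for illustration, and the sketch only emulates the documented behaviour on a single path string.

```python
import re
from datetime import datetime

# Datetime tokens named in the doc text, mapped to (regex, strptime directive).
_TOKENS = {
    "yyyy": (r"\d{4}", "%Y"),
    "MM": (r"\d{2}", "%m"),
    "dd": (r"\d{2}", "%d"),
    "HH": (r"\d{2}", "%H"),
    "mm": (r"\d{2}", "%M"),
    "ss": (r"\d{2}", "%S"),
}
_TOKEN_RE = re.compile("|".join(_TOKENS))
# Matches '{name}' or '{name:date-spec}'.
_FIELD_RE = re.compile(r"\{(\w+)(?::([^}]+))?\}")


def extract_partitions(path, partition_format):
    """Hypothetical helper: pull partition columns out of one path."""
    pattern = ""
    date_formats = {}
    last = 0
    for field in _FIELD_RE.finditer(partition_format):
        # Literal text between fields must match exactly.
        pattern += re.escape(partition_format[last:field.start()])
        name, spec = field.group(1), field.group(2)
        if spec is None:
            # '{name}' captures one path segment as a string column.
            pattern += f"(?P<{name}>[^/]+)"
        else:
            # '{name:yyyy/MM/dd}' captures digits laid out as the spec says.
            body = _TOKEN_RE.sub(lambda m: _TOKENS[m.group()][0], re.escape(spec))
            pattern += f"(?P<{name}>{body})"
            date_formats[name] = _TOKEN_RE.sub(lambda m: _TOKENS[m.group()][1], spec)
        last = field.end()
    # The format runs to the end of the file path, so anchor the match there.
    pattern += re.escape(partition_format[last:]) + "$"

    match = re.search(pattern, path)
    if match is None:
        raise ValueError(f"{path!r} does not match {partition_format!r}")

    columns = match.groupdict()
    for name, fmt in date_formats.items():
        columns[name] = datetime.strptime(columns[name], fmt)
    return columns


# The example path and format from the added documentation text.
columns = extract_partitions(
    "../Accounts/2022/01/01/data.csv",
    "/{Department}/{PartitionDate:yyyy/MM/dd}/data.csv",
)
```

Anchoring the pattern at the end of the path mirrors the documented rule that the format starts at the first partition key and continues to the end of the file path.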

@@ -165,14 +168,14 @@ type: mltable
 paths:
 - folder: abfss://my_delta_files
 
-transforms:
+transformations:
 - read_delta_lake:
     timestamp_as_of: '2022-08-26T00:00:00Z'
 ```
 
 ### Delta lake transformations
 
-- `timestamp_as_of`: Timestamp to be specified for time-travel on the specific Delta Lake data.
+- `timestamp_as_of`: Datetime string in RFC-3339/ISO-8601 format to be specified for time-travel on the specific Delta Lake data.
 - `version_as_of`: Version to be specified for time-travel on the specific Delta Lake data.
 
 ## Next steps
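For the reworded `timestamp_as_of` entry, one way to produce a string in the expected shape is sketched below in Python. `to_rfc3339_utc` is a hypothetical helper, not part of MLTable; it only formats the value that would then be placed in the YAML.

```python
from datetime import datetime, timezone


def to_rfc3339_utc(moment):
    """Format a timezone-aware datetime as an RFC 3339 UTC string ('Z' suffix)."""
    if moment.tzinfo is None:
        # A naive datetime is ambiguous, so refuse it rather than guess a zone.
        raise ValueError("expected a timezone-aware datetime")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# Same instant as the value used in the YAML example in this diff.
as_of = to_rfc3339_utc(datetime(2022, 8, 26, tzinfo=timezone.utc))
```

Converting to UTC before formatting keeps the `Z` suffix truthful for inputs that carry a different offset.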
