Commit dd4adfe

Fixes per Ornat's comments
1 parent 4f54d62 commit dd4adfe

2 files changed: +14 -22 lines changed

articles/data-explorer/ingestion-properties.md

Lines changed: 9 additions & 16 deletions
@@ -11,33 +11,26 @@ ms.date: 03/19/2020
 
 # Azure Data Explorer data ingestion properties
 
-Data ingestion is the process by which data is added to a table and is made available for query in Azure Data Explorer. The following table describes the properties supported by Azure Data Explorer. You add properties to the Ingestion command after the `with` keyword.
+Data ingestion is the process by which data is added to a table and is made available for query in Azure Data Explorer. The following table describes the properties supported by Azure Data Explorer. You add properties to the ingestion command after the `with` keyword.
 
 |Property |Description |Example |
 |----------------------|---------------------------------------------------------|----------------------------------------------------|
-|`ingestionMapping` |A string value that indicates how to map data from the source file to the actual columns in the table. This property requires defining the `format` value with the relevant mapping type. See [data mappings](https://docs.microsoft.com/azure/kusto/management/mappings).|`with (format="json", ingestionMapping = "[{\"column\":\"rownumber\", \"Properties\":{\"Path\":\"$.RowNumber\"}}, {\"column\":\"rowguid\", \"Properties\":{\"Path\":\"$.RowGuid\"}}]")`<br>(deprecated: `avroMapping`, `csvMapping`, `jsonMapping`) |
-|`ingestionMappingReference`|A string value that indicates how to map data from the source file to the actual columns in the table using a named mapping policy object. This property requires defining the `format` value with the relevant mapping type. See [data mappings](https://docs.microsoft.com/azure/kusto/management/mappings).|`with (format="csv", ingestionMappingReference = "Mapping1")`<br>(deprecated: `avroMappingReference`, `csvMappingReference`, `jsonMappingReference`)|
-|`creationTime` |The datetime value (formatted as an ISO8601 string) to use at the creation time of the ingested data extents. If unspecified, the current value (`now()`) will be used. Overriding the default is useful when ingesting older data, so that retention policy will be applied correctly.|`with (creationTime="2017-02-13T11:09:36.7992775Z")`|
+|`ingestionMapping` |A string value that indicates how to map data from the source file to the actual columns in the table. Define the `format` value with the relevant mapping type. See [data mappings](https://docs.microsoft.com/azure/kusto/management/mappings).|`with (format="json", ingestionMapping = "[{\"column\":\"rownumber\", \"Properties\":{\"Path\":\"$.RowNumber\"}}, {\"column\":\"rowguid\", \"Properties\":{\"Path\":\"$.RowGuid\"}}]")`<br>(deprecated: `avroMapping`, `csvMapping`, `jsonMapping`) |
+|`ingestionMappingReference`|A string value that indicates how to map data from the source file to the actual columns in the table using a named mapping policy object. Define the `format` value with the relevant mapping type. See [data mappings](https://docs.microsoft.com/azure/kusto/management/mappings).|`with (format="csv", ingestionMappingReference = "Mapping1")`<br>(deprecated: `avroMappingReference`, `csvMappingReference`, `jsonMappingReference`)|
+|`creationTime` |The datetime value (formatted as an ISO8601 string) to use at the creation time of the ingested data extents. If unspecified, the current value (`now()`) will be used. Overriding the default is useful when ingesting older data, so that the retention policy will be applied correctly.|`with (creationTime="2017-02-13T11:09:36.7992775Z")`|
 |`extend_schema`|A Boolean value that, if specified, instructs the command to extend the schema of the table (defaults to `false`). This option applies only to `.append` and `.set-or-append` commands. The only allowed schema extensions have additional columns added to the table at the end.|If the original table schema is `(a:string, b:int)`, a valid schema extension would be `(a:string, b:int, c:datetime, d:string)`, but `(a:string, c:datetime)` wouldn't be valid|
-|`folder` |For [ingest-from-query](https://docs.microsoft.com/azure/kusto/management/data-ingestion/ingest-from-query) commands, the folder to assign to the table (if the table already exists, this property will override the table's folder)|`with (folder="Tables/Temporary")`|
-|`format` |The data format (see [supported data formats](ingestion-supported-formats.md)|`with (format="csv")`|
+|`folder` |For [ingest-from-query](https://docs.microsoft.com/azure/kusto/management/data-ingestion/ingest-from-query) commands, the folder to assign to the table. If the table already exists, this property will override the table's folder.|`with (folder="Tables/Temporary")`|
+|`format` |The data format (see [supported data formats](ingestion-supported-formats.md)).|`with (format="csv")`|
 |`ingestIfNotExists`|A string value that, if specified, prevents ingestion from succeeding if the table already has data tagged with an `ingest-by:` tag with the same value. This ensures idempotent data ingestion. For more information, see [ingest-by: tags](https://docs.microsoft.com/azure/kusto/management/extents-overview#ingest-by-extent-tags).|The properties `with (ingestIfNotExists='["Part0001"]', tags='["ingest-by:Part0001"]')` indicate that if data with the tag `ingest-by:Part0001` already exists, then don't complete the current ingestion. If it doesn't already exist, this new ingestion should have this tag set (in case a future ingestion attempts to ingest the same data again.)|
-|`ignoreFirstRecord` |A Boolean value that, if set to `true`, indicates that ingestion should ignore the first record of every file. This property is useful for files in `CSV`and similar formats if the first record in the file is a header record specifying the column names. By default, `false` is assumed.|`with (ignoreFirstRecord=false)`|
+|`ignoreFirstRecord` |A Boolean value that, if set to `true`, indicates that ingestion should ignore the first record of every file. This property is useful for files in `CSV`and similar formats, if the first record in the file are the column names. By default, `false` is assumed.|`with (ignoreFirstRecord=false)`|
 |`persistDetails` |A Boolean value that, if specified, indicates that the command should persist the detailed results (even if successful) so that the [.show operation details](https://docs.microsoft.com/azure/kusto/management/operations#show-operation-details) command could retrieve them. Defaults to `false`.|`with (persistDetails=true)`|
-|`policy_ingestiontime`|A Boolean value that, if specified, describes whether to enable the [Ingestion Time Policy](https://docs.microsoft.com/azure/kusto/management/ingestiontimepolicy) on a table that is created by this command. (The default is `true`.)|`with (policy_ingestiontime=false)`|
+|`policy_ingestiontime`|A Boolean value that, if specified, describes whether to enable the [Ingestion Time Policy](https://docs.microsoft.com/azure/kusto/management/ingestiontimepolicy) on a table that is created by this command. The default is `true`.|`with (policy_ingestiontime=false)`|
 |`recreate_schema` |A Boolean value that, if specified, describes whether the command may recreate the schema of the table. This property applies only to the `.set-or-replace` command. This property takes precedence over the `extend_schema` property if both are set.|`with (recreate_schema=true)`|
 |`tags`|A list of [tags](https://docs.microsoft.com/azure/kusto/management/extents-overview#extent-tagging) to associate with the ingested data, formatted as a JSON string |`with (tags="['Tag1', 'Tag2']")`|
 |`validationPolicy`|A JSON string that indicates which validations to run during ingestion. See [Data ingestion](https://docs.microsoft.com/azure/kusto/management/data-ingestion/) for an explanation of the different options.| `with (validationPolicy='{"ValidationOptions":1, "ValidationImplications":1}')` (this is actually the default policy)|
 |`zipPattern`|Use this property when ingesting data from storage that has a ZIP archive. This is a string value indicating the regular expression to use when selecting which files in the ZIP archive to ingest. All other files in the archive will be ignored.|`with (zipPattern="*.csv")`|
 
-<!-- TODO: Fill-in the following
-The following table shows which property applies to each method of ingestion.
-
-|Property|.set|.append|.set-or-append|.set-or-replace|.ingest inline|.ingest (pull)|
-
--->
-
 ## Next steps
 
-* Learn more about [data ingestion](https://docs.microsoft.com/azure/kusto/management/data-ingestion/)
+* Learn more about [data ingestion](https://docs.microsoft.com/azure/data-explorer/ingest-data-overview)
 * Learn more about [supported data formats](ingestion-supported-formats.md)
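
Reviewer note: the changed sentence says properties go after the `with` keyword, and the Example column shows the resulting clauses. As a rough illustration only, the Python sketch below renders a dictionary of properties into that shape. The helper name `format_ingestion_properties` is hypothetical and not part of any Azure Data Explorer SDK. The quoting rules are an assumption inferred from the table examples: Booleans are written bare, strings are double-quoted, and embedded double quotes are backslash-escaped as in the `ingestionMapping` row.

```python
def format_ingestion_properties(props):
    """Render a dict of ingestion properties as a `with (...)` clause.

    Hypothetical helper for illustration; quoting rules are assumed from
    the examples in the properties table, not taken from a specification.
    """
    parts = []
    for name, value in props.items():
        if isinstance(value, bool):
            # Booleans appear unquoted and lowercase in the table examples.
            rendered = "true" if value else "false"
        else:
            # Escape backslashes first, then double quotes, then wrap in quotes.
            text = str(value).replace("\\", "\\\\").replace('"', '\\"')
            rendered = f'"{text}"'
        parts.append(f"{name}={rendered}")
    return "with (" + ", ".join(parts) + ")"


if __name__ == "__main__":
    print(format_ingestion_properties({"format": "csv", "ignoreFirstRecord": True}))
```

With the inputs above, the helper produces `with (format="csv", ignoreFirstRecord=true)`, which matches the shape of the `format` and `ignoreFirstRecord` examples in the table.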

articles/data-explorer/ingestion-supported-formats.md

Lines changed: 5 additions & 6 deletions
@@ -18,7 +18,7 @@ Data ingestion is the process by which data is added to a table and is made avai
 |avro |`.avro` |An [Avro container file](https://avro.apache.org/docs/current/). The following codes are supported: `null`, `deflate` (`snappy` is currently not supported).|
 |CSV |`.csv` |A text file with comma-separated values (`,`). See [RFC 4180: _Common Format and MIME Type for Comma-Separated Values (CSV) Files_](https://www.ietf.org/rfc/rfc4180.txt).|
 |JSON |`.json` |A text file with JSON objects delimited by `\n` or `\r\n`. See [JSON Lines (JSONL)](http://jsonlines.org/).|
-|multijson|`.multijson`|A text file with a JSON array of property bags (each representing a record), or any number of property bags delimited by whitespace, `\n` or `\r\n`. Each property bag can be spread on multiple lines. (This format is preferred over `JSON`, unless the data is non-property bags.)|
+|multijson|`.multijson`|A text file with a JSON array of property bags (each representing a record), or any number of property bags delimited by whitespace, `\n` or `\r\n`. Each property bag can be spread on multiple lines. This format is preferred over `JSON`, unless the data is non-property bags.|
 |orc |`.orc` |An [Orc file](https://en.wikipedia.org/wiki/Apache_ORC).|
 |parquet |`.parquet` |A [Parquet file](https://en.wikipedia.org/wiki/Apache_Parquet).|
 |psv |`.psv` |A text file with pipe-separated values (<code>&#124;</code>).|
@@ -39,13 +39,12 @@ Blobs and files can be compressed through any of the following compression algor
 |Zip |.zip |
 
 Indicate compression by appending the extension to the name of the blob or file.
-
+
 For example:
-* `MyData.csv.zip` indicates a blob or a file formatted as CSV, compressed with ZIP (either as an archive or as a single file)
+* `MyData.csv.zip` indicates a blob or a file formatted as CSV, compressed with ZIP (archive or a single file)
 * `MyData.csv.gz` indicates a blob or a file formatted as CSV, compressed with GZip
 
-Blob or file names that don't include the format extensions but just compression
-(for example, `MyData.zip`) are also supported. In this case, the file format
+Blob or file names that don't include the format extensions but just compression (for example, ) is also supported. In this case, the file format
 must be specified as an ingestion property because it cannot be inferred.
 
 > [!NOTE]
@@ -56,5 +55,5 @@ must be specified as an ingestion property because it cannot be inferred.
 
 ## Next steps
 
-* Learn more about [data ingestion](https://docs.microsoft.com/azure/kusto/management/data-ingestion/)
+* Learn more about [data ingestion](https://docs.microsoft.com/azure/data-explorer/ingest-data-overview)
 * Learn more about [Azure Data Explorer data ingestion properties](ingestion-properties.md)
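
Reviewer note: the naming convention described in the second hunk (compression extension appended after the format extension) can be sketched as a small Python function. This is only an illustration of the rule as the article states it, not how the service parses names. The function name `split_blob_name` is hypothetical, and the extension sets contain only the values visible in this diff, so the full article may list more.

```python
import os

# Format extensions visible in the first hunk of this file (assumed incomplete).
FORMAT_EXTENSIONS = {".avro", ".csv", ".json", ".multijson", ".orc", ".parquet", ".psv"}
# Compression extensions mentioned in the second hunk (assumed incomplete).
COMPRESSION_EXTENSIONS = {".zip", ".gz"}


def split_blob_name(name):
    """Return (data_format, compression) inferred from a blob or file name.

    Either element is None when it cannot be inferred from the name.
    """
    stem, ext = os.path.splitext(name.lower())
    compression = None
    if ext in COMPRESSION_EXTENSIONS:
        # The last extension is the compression; look one level deeper for the format.
        compression = ext[1:]
        stem, ext = os.path.splitext(stem)
    data_format = ext[1:] if ext in FORMAT_EXTENSIONS else None
    return data_format, compression


if __name__ == "__main__":
    for blob in ("MyData.csv.zip", "MyData.csv.gz", "MyData.zip"):
        print(blob, split_blob_name(blob))
```

For `MyData.zip` the format comes back as `None`, which corresponds to the case where the article says the format must be supplied as an ingestion property because it cannot be inferred.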
