* WinRAR - download it from [www.win-rar.com/download.html](http://www.win-rar.com/download.html)
## Install LightIngest
@@ -44,11 +47,15 @@ The utility can pull source data from a local folder or from an Azure blob stora
For example:
```
-LightIngest "Data Source=https://{Cluster name and region}.kusto.windows.net;AAD Federated Security=True" -db:{Database} -table:Trips -source:"https://{Account}.blob.core.windows.net/{ROOT_CONTAINER};{StorageAccountKey}" -pattern:"*.csv.gz" -format:csv -limit:2 -ignoreFirst:true -cr:10.0 -dontWait:true
+LightIngest "Data Source=https://ingest-{Cluster name and region}.kusto.windows.net;AAD Federated Security=True" -db:{Database} -table:Trips -source:"https://{Account}.blob.core.windows.net/{ROOT_CONTAINER};{StorageAccountKey}" -pattern:"*.csv.gz" -format:csv -limit:2 -ignoreFirst:true -cr:10.0 -dontWait:true
```
* The recommended method is for `LightIngest` to work with the ingestion endpoint at `https://ingest-{yourClusterNameAndRegion}.kusto.windows.net`. This way, the Azure Data Explorer service can manage the ingestion load, and you can easily recover from transient errors. However, you can also configure `LightIngest` to work directly with the engine endpoint (`https://{yourClusterNameAndRegion}.kusto.windows.net`).
-* For optimal ingestion performance, it is important for LightIngest to know the raw data size and so `LightIngest` will estimate the uncompressed size of local files. However, `LightIngest` might not be able to correctly estimate the raw size of compressed blobs without first downloading them. Therefore, when ingesting compressed blobs, set the `rawSizeBytes` property on the blob metadata to uncompressed data size in bytes.
+
+> [!Note]
+> If you ingest directly with the engine endpoint, omit the `ingest-` prefix. In that case, the Data Management (DM) service isn't there to protect the engine and improve the ingestion success rate.
+
+* For optimal ingestion performance, `LightIngest` needs to know the raw data size, so it estimates the uncompressed size of local files. However, `LightIngest` might not be able to correctly estimate the raw size of compressed blobs without first downloading them. Therefore, when ingesting compressed blobs, set the `rawSizeBytes` property on the blob metadata to the uncompressed data size in bytes.
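One way to obtain the value for `rawSizeBytes` is to measure the uncompressed size before uploading a compressed file. The following Python sketch streams a local `.gz` file and builds a metadata dictionary. The function names `uncompressed_size` and `raw_size_metadata` are hypothetical helpers for illustration and aren't part of LightIngest. Attaching the dictionary to a blob is left out here; it's assumed you would do that with your storage client of choice.

```python
import gzip


def uncompressed_size(gz_path, chunk_size=1024 * 1024):
    """Return the uncompressed size in bytes of a gzip file, read in chunks."""
    total = 0
    with gzip.open(gz_path, "rb") as stream:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total


def raw_size_metadata(gz_path):
    """Build blob metadata carrying the uncompressed size (hypothetical helper).

    The value is converted to a string because blob metadata values are strings.
    """
    return {"rawSizeBytes": str(uncompressed_size(gz_path))}
```

Streaming in chunks keeps memory use flat for large files, at the cost of decompressing the data once.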
## General command-line arguments
@@ -67,16 +74,21 @@ The utility can pull source data from a local folder or from an Azure blob stora
|-creationTimePattern | |string |Optional |When set, is used to extract the CreationTime property from the file or blob path. See [Using CreationTimePattern argument](#using-creationtimepattern-argument) |
|-ignoreFirstRow |-ignoreFirst |bool |Optional |If set, the first record of each file/blob is ignored (for example, if the source data has headers) |
|-tag | |string |Optional |[Tags](https://docs.microsoft.com/azure/kusto/management/extents-overview#extent-tagging) to associate with the ingested data. Multiple occurrences are permitted |
-|-dontWait | |bool |Optional |If set to 'true', does not wait for ingestion completion. Useful when ingesting large amounts of files/blobs |
+|-dontWait | |bool |Optional |If set to 'true', doesn't wait for ingestion completion. Useful when ingesting large numbers of files/blobs |
### Using CreationTimePattern argument
-The `-creationTimePattern` argument extracts the CreationTime property from the file or blob path. The pattern does not need to reflect the entire item path, just the section enclosing the timestamp you want to use.
-The value of the argument must contain of three sections:
+The `-creationTimePattern` argument extracts the CreationTime property from the file or blob path. The pattern doesn't need to reflect the entire item path, just the section enclosing the timestamp you want to use.
+The value of the argument must include the following:
* Constant text immediately preceding the timestamp, enclosed in single quotes
* The timestamp format, in standard [.NET DateTime notation](https://docs.microsoft.com/dotnet/standard/base-types/custom-date-and-time-format-strings)
* Constant text immediately following the timestamp
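To illustrate how the three parts of the pattern fit together, here is a minimal Python sketch that splits a pattern into prefix, timestamp format, and suffix, and then pulls a timestamp out of a path. It is only an approximation for illustration, not how LightIngest implements the argument. The function `extract_creation_time` is hypothetical, it assumes the suffix is also enclosed in single quotes, and it supports only a small fixed-width subset of .NET specifiers (`yyyy`, `MM`, `dd`, `HH`, `mm`, `ss`).

```python
import re
from datetime import datetime

# Assumed subset of .NET custom format specifiers, each mapped to the
# equivalent strptime directive and a regex for its fixed digit width.
_SPECIFIERS = {
    "yyyy": ("%Y", r"\d{4}"),
    "MM": ("%m", r"\d{2}"),
    "dd": ("%d", r"\d{2}"),
    "HH": ("%H", r"\d{2}"),
    "mm": ("%M", r"\d{2}"),
    "ss": ("%S", r"\d{2}"),
}
_TOKEN = re.compile("|".join(_SPECIFIERS) + "|.")


def extract_creation_time(path, pattern):
    """Return the datetime found in `path`, or None if the pattern doesn't match."""
    parts = re.fullmatch(r"'([^']*)'([^']+)'([^']*)'", pattern)
    if parts is None:
        raise ValueError("expected a pattern shaped like 'prefix'format'suffix'")
    prefix, net_format, suffix = parts.groups()

    strptime_format = ""
    timestamp_regex = ""
    for token in _TOKEN.findall(net_format):
        if token in _SPECIFIERS:
            directive, digits = _SPECIFIERS[token]
            strptime_format += directive
            timestamp_regex += digits
        else:
            # Any other character is treated as a literal separator.
            strptime_format += token.replace("%", "%%")
            timestamp_regex += re.escape(token)

    found = re.search(
        re.escape(prefix) + "(" + timestamp_regex + ")" + re.escape(suffix), path
    )
    if found is None:
        return None
    return datetime.strptime(found.group(1), strptime_format)


print(extract_creation_time(
    "container/historicalvalues19840101.parquet",
    "'historicalvalues'yyyyMMdd'.parquet'",
))  # prints 1984-01-01 00:00:00
```

Because the constant text on both sides anchors the match, the rest of the path can vary freely, which mirrors the point that the pattern only needs to cover the section enclosing the timestamp.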
-For example, if blob names end with 'historicalvalues19840101.parquet' (the timestamp is four digits for the year, two digits for the month and two digits for the day of month), the corresponding value for the `-creationTimePattern` argument is 'historicalvalues'yyyyMMdd'.parquet'.
+For example, if blob names end with 'historicalvalues19840101.parquet' (the timestamp is four digits for the year, two digits for the month and two digits for the day of month), the corresponding value for the `-creationTimePattern` argument is:
+
+```
+LightIngest "Data Source=https://ingest-{Cluster name and region}.kusto.windows.net;AAD Federated Security=True" -db:{Database} -table:Trips -source:"https://{Account}.blob.core.windows.net/{ROOT_CONTAINER};{StorageAccountKey}" -creationTimePattern:"'historicalvalues'yyyyMMdd'.parquet'"