Commit a3bc6dc

Updated content, added an image
1 parent 1a9da76 commit a3bc6dc

2 files changed: +20 -8 lines changed

articles/data-explorer/lightingest.md

Lines changed: 20 additions & 8 deletions
@@ -6,17 +6,20 @@ ms.author: orspodek
 ms.reviewer: tzgitlin
 ms.service: data-explorer
 ms.topic: conceptual
-ms.date: 03/17/2020
+ms.date: 04/01/2020
 ---

 # Install and use LightIngest

-LightIngest is a command-line utility for ad-hoc data ingestion into Azure Data Explorer.
+LightIngest is a command-line utility for ad-hoc data ingestion into Azure Data Explorer.
 The utility can pull source data from a local folder or from an Azure blob storage container.

 ## Prerequisites

 * LightIngest - download it as part of the [Microsoft.Azure.Kusto.Tools NuGet package](https://www.nuget.org/packages/Microsoft.Azure.Kusto.Tools/)
+
+![Lightingest download](media/lightingest/lightingest-download-area.png)
+
 * WinRAR - download it from [www.win-rar.com/download.html](http://www.win-rar.com/download.html)

 ## Install LightIngest
@@ -44,11 +47,15 @@ The utility can pull source data from a local folder or from an Azure blob stora

 For example:
 ```
-LightIngest "Data Source=https://{Cluster name and region}.kusto.windows.net;AAD Federated Security=True" -db:{Database} -table:Trips -source:"https://{Account}.blob.core.windows.net/{ROOT_CONTAINER};{StorageAccountKey}" -pattern:"*.csv.gz" -format:csv -limit:2 -ignoreFirst:true -cr:10.0 -dontWait:true
+ingest-{Cluster name and region}.kusto.windows.net;AAD Federated Security=True -db:{Database} -table:Trips -source:"https://{Account}.blob.core.windows.net/{ROOT_CONTAINER};{StorageAccountKey}" -pattern:"*.csv.gz" -format:csv -limit:2 -ignoreFirst:true -cr:10.0 -dontWait:true
 ```

 * The recommended method is for `LightIngest` to work with the ingestion endpoint at `https://ingest-{yourClusterNameAndRegion}.kusto.windows.net`. This way, the Azure Data Explorer service can manage the ingestion load, and you can easily recover from transient errors. However, you can also configure `LightIngest` to work directly with the engine endpoint (`https://{yourClusterNameAndRegion}.kusto.windows.net`).
-* For optimal ingestion performance, it is important for LightIngest to know the raw data size and so `LightIngest` will estimate the uncompressed size of local files. However, `LightIngest` might not be able to correctly estimate the raw size of compressed blobs without first downloading them. Therefore, when ingesting compressed blobs, set the `rawSizeBytes` property on the blob metadata to uncompressed data size in bytes.
+
+> [!Note]
+> If you ingest directly with the engine endpoint, you don't need to include `ingest-` but there won't be a DM feature to protect the engine and improve the ingestion success rate.
+
+* For optimal ingestion performance, it's important for LightIngest to know the raw data size and so `LightIngest` will estimate the uncompressed size of local files. However, `LightIngest` might not be able to correctly estimate the raw size of compressed blobs without first downloading them. Therefore, when ingesting compressed blobs, set the `rawSizeBytes` property on the blob metadata to uncompressed data size in bytes.

 ## General command-line arguments

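The hunk above asks for a `rawSizeBytes` property on compressed blobs. The following is a minimal Python sketch of one way to obtain that value for a local gzip file before uploading it. The helper names (`uncompressed_size`, `raw_size_metadata`) and the sample file are hypothetical and are not part of LightIngest or of the changed article.

```python
import gzip
import os
import tempfile


def uncompressed_size(path: str, chunk_size: int = 1 << 20) -> int:
    """Return the decompressed size of a gzip file by streaming through it."""
    total = 0
    with gzip.open(path, "rb") as stream:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total


def raw_size_metadata(path: str) -> dict:
    """Build a metadata dictionary with the rawSizeBytes key (hypothetical helper).

    Blob metadata values are strings, so the byte count is converted.
    """
    return {"rawSizeBytes": str(uncompressed_size(path))}


# Self-contained demonstration using a throwaway gzip file.
payload = b"vendor,fare\n" + b"1,10.5\n" * 1000
with tempfile.TemporaryDirectory() as folder:
    sample = os.path.join(folder, "trips.csv.gz")
    with gzip.open(sample, "wb") as out:
        out.write(payload)
    metadata = raw_size_metadata(sample)

print(metadata)
```

The resulting dictionary could then be attached to the blob with whichever storage tool or SDK is in use, for example the `set_blob_metadata` method of `BlobClient` in the `azure-storage-blob` package. That upload step is omitted here because it needs real credentials.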
@@ -67,16 +74,21 @@ The utility can pull source data from a local folder or from an Azure blob stora
 |-creationTimePattern | |string |Optional |When set, is used to extract the CreationTime property from the file or blob path. See [Using CreationTimePattern argument](#using-creationtimepattern-argument) |
 |-ignoreFirstRow |-ignoreFirst |bool |Optional |If set, the first record of each file/blob is ignored (for example, if the source data has headers) |
 |-tag | |string |Optional |[Tags](https://docs.microsoft.com/azure/kusto/management/extents-overview#extent-tagging) to associate with the ingested data. Multiple occurrences are permitted |
-|-dontWait | |bool |Optional |If set to 'true', does not wait for ingestion completion. Useful when ingesting large amounts of files/blobs |
+|-dontWait | |bool |Optional |If set to 'true', doesn't wait for ingestion completion. Useful when ingesting large amounts of files/blobs |

 ### Using CreationTimePattern argument

-The `-creationTimePattern` argument extracts the CreationTime property from the file or blob path. The pattern does not need to reflect the entire item path, just the section enclosing the timestamp you want to use.
-The value of the argument must contain of three sections:
+The `-creationTimePattern` argument extracts the CreationTime property from the file or blob path. The pattern doesn't need to reflect the entire item path, just the section enclosing the timestamp you want to use.
+The value of the argument must include the following:
 * Constant test immediately preceding the timestamp, enclosed in single quotes
 * The timestamp format, in standard [.NET DateTime notation](https://docs.microsoft.com/dotnet/standard/base-types/custom-date-and-time-format-strings)
 * Constant text immediately following the timestamp
-For example, if blob names end with 'historicalvalues19840101.parquet' (the timestamp is four digits for the year, two digits for the month and two digits for the day of month), the corresponding value for the `-creationTimePattern` argument is 'historicalvalues'yyyyMMdd'.parquet'.
+For example, if blob names end with 'historicalvalues19840101.parquet' (the timestamp is four digits for the year, two digits for the month and two digits for the day of month), the corresponding value for the `-creationTimePattern` argument is:
+
+```
+ingest-{Cluster name and region}.kusto.windows.net;AAD Federated Security=True -db:{Database} -table:Trips -source:"https://{Account}.blob.core.windows.net/{ROOT_CONTAINER};{StorageAccountKey}" -creationTimePattern:"'historicalvalues'yyyyMMdd'.parquet'"
+-pattern:"*.csv.gz" -format:csv -limit:2 -ignoreFirst:true -cr:10.0 -dontWait:true
+```

 ### Command-line arguments for advanced scenarios

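To illustrate the three-part `-creationTimePattern` value described in the hunk above, here is a rough Python sketch that imitates the extraction for the article's own example. It is not LightIngest's implementation. The function names, the quoted-suffix assumption, the small .NET-to-`strptime` token table, and the sample blob URL are all assumptions made for illustration.

```python
import re
from datetime import datetime

# Only a handful of .NET custom format tokens are mapped here (assumption);
# the real notation is much richer.
NET_TO_STRPTIME = [
    ("yyyy", "%Y"), ("MM", "%m"), ("dd", "%d"),
    ("HH", "%H"), ("mm", "%M"), ("ss", "%S"),
]


def split_pattern(pattern: str) -> tuple:
    """Split "'prefix'format'suffix'" into its three sections."""
    match = re.fullmatch(r"'([^']+)'([^']+)'([^']+)'", pattern)
    if match is None:
        raise ValueError(f"unsupported pattern: {pattern!r}")
    return match.groups()


def to_strptime(net_format: str) -> str:
    """Translate the supported .NET tokens into strptime directives."""
    result = net_format
    for net_token, py_token in NET_TO_STRPTIME:
        result = result.replace(net_token, py_token)
    return result


def extract_creation_time(path: str, pattern: str) -> datetime:
    """Find the timestamp between the constant prefix and suffix and parse it."""
    prefix, net_format, suffix = split_pattern(pattern)
    found = re.search(re.escape(prefix) + r"(.+?)" + re.escape(suffix), path)
    if found is None:
        raise ValueError("pattern does not occur in path")
    return datetime.strptime(found.group(1), to_strptime(net_format))


# Hypothetical blob path ending with the name used in the article's example.
blob = "https://account.blob.core.windows.net/container/2020/historicalvalues19840101.parquet"
created = extract_creation_time(blob, "'historicalvalues'yyyyMMdd'.parquet'")
print(created.date())
```

The pattern only needs to match the section of the path around the timestamp, which is why the sketch searches for the prefix anywhere in the path instead of anchoring at the start.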
30.7 KB