
Commit 860929b

Learn Build Service GitHub App authored and committed
Merging changes synced from https://github.com/MicrosoftDocs/dataexplorer-docs-pr (branch live)
2 parents 9b88bd2 + 34e89a5 commit 860929b

6 files changed (+54, -47 lines)

data-explorer/data-factory-integration.md

Lines changed: 29 additions & 28 deletions
@@ -3,7 +3,7 @@ title: 'Azure Data Explorer integration with Azure Data Factory'
 description: 'In this article, integrate Azure Data Explorer with Azure Data Factory to use the copy, lookup, and command activities.'
 ms.reviewer: tomersh26
 ms.topic: how-to
-ms.date: 08/30/2023
+ms.date: 09/02/2025
 
 #Customer intent: I want to use Azure Data Factory to integrate with Azure Data Explorer.
 ---
@@ -18,7 +18,7 @@ Various integrations with Azure Data Factory are available for Azure Data Explor
 
 ### Copy activity
 
-Azure Data Factory Copy activity is used to transfer data between data stores. Azure Data Explorer is supported as a source, where data is copied from Azure Data Explorer to any supported data store, and a sink, where data is copied from any supported data store to Azure Data Explorer. For more information, see [copy data to or from Azure Data Explorer using Azure Data Factory](/azure/data-factory/connector-azure-data-explorer). For a detailed walk-through see [load data from Azure Data Factory into Azure Data Explorer](data-factory-load-data.md).
+Azure Data Factory Copy activity is used to transfer data between data stores. Azure Data Explorer is supported as a source, where data is copied from Azure Data Explorer to any supported data store, and a sink, where data is copied from any supported data store to Azure Data Explorer. For more information, see [copy data to or from Azure Data Explorer using Azure Data Factory](/azure/data-factory/connector-azure-data-explorer). For a detailed walk-through, see [load data from Azure Data Factory into Azure Data Explorer](data-factory-load-data.md).
 Azure Data Explorer is supported by Azure IR (Integration Runtime), used when data is copied within Azure, and self-hosted IR, used when copying data from/to data stores located on-premises or in a network with access control, such as an Azure Virtual Network. For more information, see [which IR to use.](/azure/data-factory/concepts-integration-runtime#determining-which-ir-to-use)
 
 > [!TIP]
@@ -33,7 +33,7 @@ In addition to the response size limit of 5,000 rows and 2 MB, the activity also
 ### Command activity
 
 The Command activity allows the execution of Azure Data Explorer [management commands](/kusto/query/index?view=azure-data-explorer&preserve-view=true#management-commands). Unlike queries, the management commands can potentially modify data or metadata. Some of the management commands are targeted to ingest data into Azure Data Explorer, using commands such as `.ingest` or `.set-or-append`, or to copy data from Azure Data Explorer to external data stores using commands such as `.export`.
-For a detailed walk-through of the command activity, see [use Azure Data Factory command activity to run Azure Data Explorer management commands](data-factory-command-activity.md). Using a management command to copy data can, at times, be a faster and cheaper option than the Copy activity. To determine when to use the Command activity versus the Copy activity, see [select between Copy and Command activities when copying data](#select-between-copy-and-azure-data-explorer-command-activities-when-copy-data).
+For a detailed walk-through of the command activity, see [use Azure Data Factory command activity to run Azure Data Explorer management commands](data-factory-command-activity.md). Using a management command to copy data can, at times, be a faster and cheaper option than the Copy activity. To determine when to use the Command activity versus the Copy activity, see [select between Copy and Command activities when copying data](#select-between-copy-and-azure-data-explorer-command-activities-when-copy-data).

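To make the commands above concrete, here's a minimal ingest-from-query sketch; the table names `MyDestTable` and `SourceTable` and the column names are hypothetical:

```kusto
// Append the results of a query to a destination table, creating the table if it doesn't exist.
// This is the kind of management command the ADF Command activity can run.
.set-or-append MyDestTable <|
    SourceTable
    | where Timestamp > ago(1d)
    | project Timestamp, DeviceId, Reading
```

An `.export` command follows the same pattern in the other direction, writing query results to external storage.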
 ### Copy in bulk from a database template
 
@@ -51,7 +51,7 @@ The [Copy in bulk from a database to Azure Data Explorer by using the Azure Data
 
 This section assists you in selecting the correct activity for your data copying needs.
 
-When copying data from or to Azure Data Explorer, there are two available options in Azure Data Factory:
+When you copy data from or to Azure Data Explorer, there are two available options in Azure Data Factory:
 * Copy activity.
 * Azure Data Explorer Command activity, which executes one of the management commands that transfer data in Azure Data Explorer.
 
@@ -80,13 +80,13 @@ See the following table for a comparison of the Copy activity, and ingestion com
 | | Copy activity | Ingest from query<br> `.set-or-append` / `.set-or-replace` / `.set` / `.replace` | Ingest from storage <br> `.ingest` |
 |---|---|---|---|
 | **Flow description** | ADF gets the data from the source data store, converts it into a tabular format, and does the required schema-mapping changes. ADF then uploads the data to Azure blobs, splits it into chunks, then downloads the blobs to ingest them into the Azure Data Explorer table. <br> (**Source data store > ADF > Azure blobs > Azure Data Explorer**) | These commands can execute a query or a `.show` command, and ingest the results of the query into a table (**Azure Data Explorer > Azure Data Explorer**). | This command ingests data into a table by "pulling" the data from one or more cloud storage artifacts. |
-| **Supported source data stores** | [variety of options](/azure/data-factory/copy-activity-overview#supported-data-stores-and-formats) | ADLS Gen 2, Azure Blob, SQL (using the [sql_request() plugin](/kusto/query/sql-request-plugin?view=azure-data-explorer&preserve-view=true)), Azure Cosmos DB (using the [cosmosdb_sql_request plugin](/kusto/query/mysql-request-plugin?view=azure-data-explorer&preserve-view=true)), and any other data store that provides HTTP or Python APIs. | Filesystem, Azure Blob Storage, ADLS Gen 1, ADLS Gen 2 |
+| **Supported source data stores** | [variety of options](/azure/data-factory/copy-activity-overview#supported-data-stores-and-formats) | Azure Data Lake Storage (ADLS) Gen 2, Azure Blob, SQL (using the [sql_request() plugin](/kusto/query/sql-request-plugin?view=azure-data-explorer&preserve-view=true)), Azure Cosmos DB (using the [cosmosdb_sql_request plugin](/kusto/query/mysql-request-plugin?view=azure-data-explorer&preserve-view=true)), and any other data store that provides HTTP or Python APIs. | Filesystem, Azure Blob Storage, ADLS Gen 1, ADLS Gen 2 |
 | **Performance** | Ingestions are queued and managed, which ensures small-size ingestions and assures high availability by providing load balancing, retries and error handling. | <ul><li>Those commands weren't designed for high volume data importing.</li><li>Works as expected and cheaper. But for production scenarios and when traffic rates and data sizes are large, use the Copy activity.</li></ul> |
 | **Server Limits** | <ul><li>No size limit.</li><li>Max timeout limit: One hour per ingested blob.</li></ul> |<ul><li>There's only a size limit on the query part, which can be skipped by specifying `noTruncation=true`.</li><li>Max timeout limit: One hour.</li></ul> | <ul><li>No size limit.</li><li>Max timeout limit: One hour.</li></ul>|
 
 > [!TIP]
 >
-> * When copying data from ADF to Azure Data Explorer use the `ingest from query` commands.
+> * To copy data from Azure Data Factory to Azure Data Explorer, use the `ingest from query` commands.
 > * For large datasets (>1GB), use the Copy activity.
 
 ## Required permissions
@@ -115,8 +115,8 @@ This section addresses the use of copy activity where Azure Data Explorer is the
 | Parameter | Notes |
 |---|---|
 | **Components geographical proximity** | Place all components in the same region:<ul><li>source and sink data stores.</li><li>ADF integration runtime.</li><li>Your Azure Data Explorer cluster.</li></ul>Make sure that at least your integration runtime is in the same region as your Azure Data Explorer cluster. |
-| **Number of DIUs** | One VM for every four DIUs used by ADF. <br>Increasing the DIUs helps only if your source is a file-based store with multiple files. Each VM will then process a different file in parallel. Therefore, copying a single large file has a higher latency than copying multiple smaller files.|
-|**Amount and SKU of your Azure Data Explorer cluster** | High number of Azure Data Explorer nodes boosts ingestion processing time. Use of dev SKUs will severely limit performance|
+| **Number of DIUs** | One virtual machine (VM) for every four DIUs used by ADF. <br>Increasing the DIUs helps only if your source is a file-based store with multiple files. Each VM then processes a different file in parallel. Therefore, copying a single large file has a higher latency than copying multiple smaller files.|
+|**Amount and SKU of your Azure Data Explorer cluster** | High number of Azure Data Explorer nodes boosts ingestion processing time. Use of dev SKUs severely limits performance. |
 | **Parallelism** | To copy a large amount of data from a database, partition your data and then use a ForEach loop that copies each partition in parallel or use the [Bulk Copy from Database to Azure Data Explorer Template](data-factory-template.md). Note: **Settings** > **Degree of Parallelism** in the Copy activity isn't relevant to Azure Data Explorer. |
 | **Data processing complexity** | Latency varies according to source file format, column mapping, and compression.|
 | **The VM running your integration runtime** | <ul><li>For Azure copy, ADF VMs and machine SKUs can't be changed.</li><li> For on-premises to Azure copy, determine that the VM hosting your self-hosted IR is strong enough.</li></ul>|
@@ -125,47 +125,48 @@ This section addresses the use of copy activity where Azure Data Explorer is the
 
 ### Monitor activity progress
 
-* When monitoring the activity progress, the *Data written* property may be larger than the *Data read* property
+* When you monitor the activity progress, the *Data written* property can be larger than the *Data read* property
 because *Data read* is calculated according to the binary file size, while *Data written* is calculated according to the in-memory size, after data is deserialized and decompressed.
 
-* When monitoring the activity progress, you can see that data is written to the Azure Data Explorer sink. When querying the Azure Data Explorer table, you see that data hasn't arrived. This is because there are two stages when copying to Azure Data Explorer.
+* When you monitor the activity progress, you can see that data is written to the Azure Data Explorer sink. When querying the Azure Data Explorer table, you see that data hasn't arrived. This is because there are two stages when copying to Azure Data Explorer.
   * First stage reads the source data, splits it to 900-MB chunks, and uploads each chunk to an Azure Blob. The first stage is seen by the ADF activity progress view.
   * The second stage begins once all the data is uploaded to Azure Blobs. The nodes of your cluster download the blobs and ingest the data into the sink table. The data is then seen in your Azure Data Explorer table.
 
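If the second stage seems delayed, a quick check on the cluster side can confirm whether data landed; this is an optional sketch and `MyDestTable` is a hypothetical table name:

```kusto
// Run each statement separately.
// 1) Confirm that rows have arrived in the target table.
MyDestTable
| count

// 2) Check for recent ingestion failures reported by the cluster.
.show ingestion failures
| where FailedOn > ago(1h)
```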
 ### Failure to ingest CSV files due to improper escaping
 
 Azure Data Explorer expects CSV files to align with [RFC 4180](https://www.ietf.org/rfc/rfc4180.txt).
 It expects:
-* Fields that contain characters that require escaping (such as " and new lines) should start and end with a **"** character, without whitespace. All **"** characters *inside* the field are escaped by using a double **"** character (**""**). For example, _"Hello, ""World"""_ is a valid CSV file with a single record having a single column or field with the content _Hello, "World"_.
+* Fields that contain characters that require escaping (such as " and new lines) should start and end with a **"** character, without whitespace. All **"** characters *inside* the field are escaped by using a double **"** character (**""**). For example, `"Hello, ""World"""` is a valid CSV file with a single record having a single column or field with the content `Hello, "World"`.
 * All records in the file must have the same number of columns and fields.
 
-Azure Data Factory allows the backslash (escape) character. If you generate a CSV file with a backslash character using Azure Data Factory, ingestion of the file to Azure Data Explorer will fail.
+Azure Data Factory allows the backslash (escape) character. If you generate a CSV file with a backslash character using Azure Data Factory, ingestion of the file to Azure Data Explorer fails.
 
 #### Example
 
 The following text values:
-Hello, "World"<br/>
-ABC DEF<br/>
-"ABC\D"EF<br/>
-"ABC DEF<br/>
+`Hello, "World"`<br/>
+`ABC DEF`<br/>
+`"ABC\D"EF`<br/>
+`"ABC DEF`<br/>
 
 Should appear in a proper CSV file as follows:
-"Hello, ""World"""<br/>
-"ABC DEF"<br/>
-"""ABC\D""EF"<br/>
-"""ABC DEF"<br/>
+`"Hello, ""World"""`<br/>
+`"ABC DEF"`<br/>
+`"""ABC\D""EF"`<br/>
+`"""ABC DEF"`<br/>
 
-By using the default escape character (backslash), the following CSV won't work with Azure Data Explorer:
-"Hello, \"World\""<br/>
-"ABC DEF"<br/>
-"\"ABC\D\"EF"<br/>
-"\"ABC DEF"<br/>
+When you use the default escape character (backslash), the following CSV doesn't work with Azure Data Explorer:
+`"Hello, \"World\""`<br/>
+`"ABC DEF"`<br/>
+`"\"ABC\D\"EF"`<br/>
+`"\"ABC DEF"`<br/>
 
 ### Nested JSON objects
 
-When copying a JSON file to Azure Data Explorer, note that:
+When copying a JSON file to Azure Data Explorer, note the following points:
+
 * Arrays aren't supported.
-* If your JSON structure contains object data types, Azure Data Factory will flatten the object's child items, and try to map each child item to a different column in your Azure Data Explorer table. If you want the entire object item to be mapped to a single column in Azure Data Explorer:
+* If your JSON structure contains object data types, Azure Data Factory flattens the object's child items, and tries to map each child item to a different column in your Azure Data Explorer table. If you want the entire object item to be mapped to a single column in Azure Data Explorer (see the sketch after this list):
   * Ingest the entire JSON row into a single dynamic column in Azure Data Explorer.
   * Manually edit the pipeline definition by using Azure Data Factory's JSON editor. In **Mappings**:
     * Remove the multiple mappings that were created for each child item, and add a single mapping that maps your object type to your table column.
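As a companion to the first option in the list above, here's an illustrative sketch of landing whole JSON documents in a single dynamic column; the table name `RawEvents`, the column name `Event`, and the mapping name `RawEventMapping` are hypothetical:

```kusto
// Create a table with a single dynamic column to hold each entire JSON document.
.create table RawEvents (Event: dynamic)

// Map the JSON root ($) to that column so nothing is flattened during ingestion.
.create table RawEvents ingestion json mapping "RawEventMapping"
    '[{"column": "Event", "path": "$", "datatype": "dynamic"}]'
```

At query time, individual fields can then be read from the dynamic column with dot notation (for example, `Event.propertyName`).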
@@ -184,7 +185,7 @@ You can add additional [ingestion properties](/kusto/ingestion-properties?view=a
 1. In the **Activities** canvas, select the **Copy data** activity.
 1. In the activity details, select **Sink**, and then expand **Additional properties**.
 1. Select **New**, select either **Add node** or **Add array** as required, and then specify the ingestion property name and value. Repeat this step to add more properties.
-1. Once complete save and publish your pipeline.
+1. Once completed, save and publish your pipeline.
 
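To give a sense of the name/value pairs this step refers to, here's an illustrative sketch of two common ingestion properties (`creationTime` and `tags`) written in command form; the table names and values are hypothetical, and which properties apply in your pipeline depends on the ingestion properties reference linked above:

```kusto
// Illustration only: example ingestion properties, shown on an ingest-from-query command.
.set-or-append MyDestTable with (
    creationTime = "2025-01-01T00:00:00Z",
    tags = '["drop-by:adf-demo"]'
) <|
    SourceTable
```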
 ## Next step
 
Image files changed: -2.72 KB, 393 Bytes, -19.8 KB

data-explorer/pricing-calculator.md

Lines changed: 10 additions & 10 deletions
@@ -2,7 +2,7 @@
 title: Azure Data Explorer Pricing Calculator
 description: Explore different pricing options based on your specific cluster needs with Azure Data Explorer pricing calculator.
 ms.topic: how-to
-ms.date: 11/21/2022
+ms.date: 09/02/2025
 ---
 
 # Azure Data Explorer pricing calculator
@@ -29,7 +29,7 @@ At the bottom of the form, the individual component estimates are added together
 1. Scroll down the page until you see a tab titled **Your Estimate**.
 1. Verify that **Azure Data Explorer** appears in the tab. If it doesn't, do the following:
    1. Scroll back to the top of the page.
-   1. In the search box, type Azure Data Explorer.
+   1. In the search box, type **Azure Data Explorer**.
    1. Select the **Azure Data Explorer** widget.
 1. Start the configuration.
 
@@ -41,19 +41,19 @@ The sections of this article correspond to the components in the calculator and
 
 The region and environment you choose for your cluster will affect the cost of each component. This is because the different regions and environments don't provide exactly the same services or capacity.
 
+1. Choose the **Environment** for your cluster.
+
+    * **Production** clusters contain two or more nodes for engine and data management and operate under the Azure Data Explorer SLA.
+
+    * **Dev/test** clusters are the lowest cost option, which makes them great for service evaluation, conducting PoCs, and scenario validations. They're limited in size and can't grow beyond a single node. There's no Azure Data Explorer markup charge or product SLA for these clusters.
 1. Select the desired **Region** for your cluster.
 
-    Use the [regions decision guide](/azure/cloud-adoption-framework/migrate/azure-best-practices/multiple-regions) to find the right region for you. Your choice may depend on requirements such as:
+    Use the [regions decision guide](/azure/cloud-adoption-framework/migrate/azure-best-practices/multiple-regions) to find the right region for you. Your choice might depend on requirements such as:
 
     * [Availability zone support](/azure/reliability/availability-zones-service-support#azure-regions-with-availability-zone-support)
     * [Disaster recovery](/azure/reliability/cross-region-replication-azure)
     * [Data residency and protection](https://azure.microsoft.com/resources/achieving-compliant-data-residency-and-security-with-azure/)
 
-1. Choose the **Environment** for your cluster.
-
-    * **Production** clusters contain two or more nodes for engine and data management and operate under the Azure Data Explorer SLA.
-
-    * **Dev/test** clusters are the lowest cost option, which makes them great for service evaluation, conducting PoCs, and scenario validations. They're limited in size and can't grow beyond a single node. There's no Azure Data Explorer markup charge or product SLA for these clusters.
 
 ## Estimated data ingestion
 
@@ -71,7 +71,7 @@ In the calculator, enter estimates for the following fields:
 
 ### Auto-select engine instances
 
-If you want to individually configure the remaining components, turn off **AUTO-SELECT ENGINE INSTANCES**. When turned on, the calculator selects the most optimal SKU based on the ingestion inputs.
+If you want to individually configure the remaining components, turn off **AUTO-SELECT ENGINE INSTANCES**. When turned on, the calculator selects the most optimal Stock Keeping Unit (SKU) based on the ingestion inputs.
 
 :::image type="content" source="media/pricing/auto-select-engine-instances.png" alt-text="Screenshot of the auto select engine instances toggle.":::
 
@@ -102,7 +102,7 @@ To get an estimate for **Engine instances**:
 The **Premium Managed Disk** component is based on the SKU selected.
 
 > [!NOTE]
-> Not all **VM Series** are offered in each region. If you are looking for a SKU that is not listed in the selected region, choose a different region.
+> Not all **VM Series** are offered in each region. If you're looking for a SKU that isn't listed in the selected region, choose a different region.
 
### Data management instances
108108
