
Commit 1dea1aa

Merge pull request #267239 from dominicbetts/aio-processor-feb-2

AIO Data Processor February updates

2 parents: 5144f2d + 3c526bc

28 files changed: +690 −141 lines

Lines changed: 75 additions & 0 deletions
@@ -0,0 +1,75 @@
---
title: Send data to Azure Blob Storage from a pipeline
description: Configure a pipeline destination stage to send the pipeline output to Azure Blob Storage for storage and analysis.
author: dominicbetts
ms.author: dobett
ms.subservice: data-processor
ms.topic: how-to
ms.date: 02/16/2024

#CustomerIntent: As an operator, I want to send data from a pipeline to Azure Blob Storage so that I can store and analyze my data in the cloud.
---

# Send data to Azure Blob Storage from a Data Processor pipeline

[!INCLUDE [public-preview-note](../includes/public-preview-note.md)]

Use the _Azure Blob Storage_ destination to write unstructured data to Azure Blob Storage for storage and analysis.

## Prerequisites

To configure and use this Azure Blob Storage destination pipeline stage, you need:

- A deployed instance of Data Processor.
- An Azure Blob Storage account.

## Configure the destination stage

The _Azure Blob Storage_ destination stage JSON configuration defines the details of the stage. To author the stage, you can either interact with the form-based UI, or provide the JSON configuration on the **Advanced** tab:

| Field | Type | Description | Required? | Default | Example |
|--|--|--|--|--|--|
| `accountName` | string | The name of the Azure Blob Storage account. | Yes | | `myBlobStorageAccount` |
| `containerName` | string | The name of the container in the storage account where the blobs are stored. | Yes | | `mycontainer` |
| `authentication` | string | The authentication information to connect to the storage account. One of `servicePrincipal`, `systemAssignedManagedIdentity`, or `accessKey`. | Yes | | See the [sample configuration](#sample-configuration). |
| `format` | Object | Formatting information for the data. All format types are supported. | Yes | | `{"type": "json"}` |
| `blobPath` | [Template](../process-data/concept-configuration-patterns.md#templates) | Template string that identifies the path to write files to. All the template components shown in the default are required. | No | `{{{instanceId}}}/{{{pipelineId}}}/{{{partitionId}}}/{{{YYYY}}}/{{{MM}}}/{{{DD}}}/{{{HH}}}/{{{mm}}}/{{{fileNumber}}}` | `{{{instanceId}}}/{{{pipelineId}}}/{{{partitionId}}}/{{{YYYY}}}/{{{MM}}}/{{{DD}}}/{{{HH}}}/{{{mm}}}/{{{fileNumber}}}.xyz` |
| `batch` | [Batch](../process-data/concept-configuration-patterns.md#batch) | How to batch data before writing it to Blob Storage. | No | `{"time": "60s"}` | `{"time": "60s"}` |
| `retry` | [Retry](../process-data/concept-configuration-patterns.md#retry) | The retry mechanism to use when a Blob Storage operation fails. | No | (empty) | `{"type": "fixed"}` |
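To illustrate how the `blobPath` template components combine into a blob path, the following Python sketch expands the default template with sample values. The `render_blob_path` helper and the sample values are hypothetical, for illustration only; Data Processor performs this substitution internally.

```python
from datetime import datetime, timezone

def render_blob_path(template: str, values: dict) -> str:
    """Expand {{{placeholder}}} components in a path template string."""
    path = template
    for key, value in values.items():
        path = path.replace("{{{" + key + "}}}", str(value))
    return path

# Illustrative values for each template component.
now = datetime(2024, 2, 16, 9, 5, tzinfo=timezone.utc)
values = {
    "instanceId": "instance1",
    "pipelineId": "pipeline1",
    "partitionId": "0",
    "YYYY": f"{now.year:04d}",
    "MM": f"{now.month:02d}",
    "DD": f"{now.day:02d}",
    "HH": f"{now.hour:02d}",
    "mm": f"{now.minute:02d}",
    "fileNumber": "1",
}

template = ("{{{instanceId}}}/{{{pipelineId}}}/{{{partitionId}}}/"
            "{{{YYYY}}}/{{{MM}}}/{{{DD}}}/{{{HH}}}/{{{mm}}}/{{{fileNumber}}}")
print(render_blob_path(template, values))
# instance1/pipeline1/0/2024/02/16/09/05/1
```

Because every component in the default template is required, each written blob lands in a unique, time-partitioned folder per pipeline partition.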
## Sample configuration

The following JSON shows a sample configuration for the _Azure Blob Storage_ destination stage:

```json
{
  "displayName": "Sample blobstorage output",
  "description": "An example blobstorage output stage",
  "type": "output/blobstorage@v1",
  "accountName": "myStorageAccount",
  "containerName": "mycontainer",
  "blobPath": "{{{instanceId}}}/{{{pipelineId}}}/{{{partitionId}}}/{{{YYYY}}}/{{{MM}}}/{{{DD}}}/{{{HH}}}/{{{mm}}}/{{{fileNumber}}}",
  "authentication": {
    "type": "systemAssignedManagedIdentity"
  },
  "format": {
    "type": "json"
  },
  "batch": {
    "time": "60s",
    "path": ".payload"
  },
  "retry": {
    "type": "fixed"
  }
}
```
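As a quick sanity check before deploying a stage like the sample above, you can verify that a configuration supplies every field the table marks as required. This is a hypothetical helper, not part of Data Processor; the abridged configuration is taken from this article.

```python
import json

# Abridged version of the sample destination stage configuration above.
config = json.loads("""
{
    "displayName": "Sample blobstorage output",
    "type": "output/blobstorage@v1",
    "accountName": "myStorageAccount",
    "containerName": "mycontainer",
    "authentication": {"type": "systemAssignedManagedIdentity"},
    "format": {"type": "json"}
}
""")

# Fields the configuration table above marks as required.
REQUIRED_FIELDS = ["accountName", "containerName", "authentication", "format"]
missing = [field for field in REQUIRED_FIELDS if field not in config]
print("missing:", missing)
# missing: []
```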
## Related content

- [Send data to Azure Data Explorer](howto-configure-destination-data-explorer.md)
- [Send data to Microsoft Fabric](howto-configure-destination-fabric.md)
- [Send data to a gRPC endpoint](../process-data/howto-configure-destination-grpc.md)
- [Send data to an HTTP endpoint](../process-data/howto-configure-destination-http.md)
- [Publish data to an MQTT broker](../process-data/howto-configure-destination-mq-broker.md)
- [Send data to the reference data store](../process-data/howto-configure-destination-reference-store.md)

articles/iot-operations/connect-to-cloud/howto-configure-destination-data-explorer.md

Lines changed: 43 additions & 17 deletions
@@ -28,7 +28,11 @@ To configure and use an Azure Data Explorer destination pipeline stage, you need
 
 ## Set up Azure Data Explorer
 
-Before you can write to Azure Data Explorer from a data pipeline, enable [service principal authentication](/azure/data-explorer/provision-azure-ad-app) in your database. To create a service principal with a client secret:
+Before you can write to Azure Data Explorer from a data pipeline, you need to grant the pipeline access to the database. You can use either a service principal or a managed identity to authenticate the pipeline to the database. The advantage of using a managed identity is that you don't need to manage the lifecycle of the service principal: Azure manages the identity automatically and ties it to the lifecycle of the resource it's assigned to.
+
+# [Service principal](#tab/serviceprincipal)
+
+To create a service principal with a client secret:
 
 [!INCLUDE [data-processor-create-service-principal](../includes/data-processor-create-service-principal.md)]
 
@@ -38,12 +42,36 @@ To grant admin access to your Azure Data Explorer database, run the following co
 .add database <DatabaseName> admins (<ApplicationId>) <Notes>
 ```
 
+For the destination stage to connect to Azure Data Explorer, it needs access to a secret that contains the authentication details. To create a secret:
+
+1. Use the following command to add a secret to your Azure Key Vault that contains the client secret you made a note of when you created the service principal:
+
+    ```azurecli
+    az keyvault secret set --vault-name <your-key-vault-name> --name AccessADXSecret --value <client-secret>
+    ```
+
+1. Add the secret reference to your Kubernetes cluster by following the steps in [Manage secrets for your Azure IoT Operations deployment](../deploy-iot-ops/howto-manage-secrets.md).
+
+# [Managed identity](#tab/managedidentity)
+
+[!INCLUDE [get-managed-identity](../includes/get-managed-identity.md)]
+
+To add the managed identity to the database, navigate to the Azure Data Explorer portal and run the following query on your database. Replace the placeholders with the values you made a note of in the previous step:
+
+```kusto
+.add database ['<your-database-name>'] admins ('aadapp=<your-app-ID>;<your-tenant-ID>');
+```
+
+---
+
+### Batching
+
 Data Processor writes to Azure Data Explorer in batches. While you batch data in Data Processor before sending it, Azure Data Explorer has its own default [ingestion batching policy](/azure/data-explorer/kusto/management/batchingpolicy). Therefore, you might not see your data in Azure Data Explorer immediately after Data Processor writes it to the Azure Data Explorer destination.
 
 To view data in Azure Data Explorer as soon as the pipeline sends it, you can set the ingestion batching policy count to `1`. To edit the ingestion batching policy, run the following command in your database query tab:
 
 ````kusto
-.alter database <YourDatabaseName> policy ingestionbatching
+.alter database <your-database-name> policy ingestionbatching
 ```
 {
 "MaximumBatchingTimeSpan" : "00:00:30",
@@ -53,18 +81,6 @@ To view data in Azure Data Explorer as soon as the pipeline sends it, you can se
 ```
 ````
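The `.alter` command above can also be assembled programmatically. The following Python sketch builds the command string in the form shown, with the policy JSON wrapped in the ``` multi-line string delimiters that Kusto expects. The helper is hypothetical; `MaximumNumberOfItems` is set to `1` so batches flush per item, as suggested for seeing data immediately, while the time span matches the example and the size limit is illustrative.

```python
import json

def ingestion_batching_command(database: str, policy: dict) -> str:
    """Build a Kusto management command that sets a database
    ingestion batching policy, with the JSON in ``` delimiters."""
    return (
        f".alter database {database} policy ingestionbatching\n"
        "```\n"
        + json.dumps(policy, indent=4)
        + "\n```"
    )

policy = {
    "MaximumBatchingTimeSpan": "00:00:30",
    "MaximumNumberOfItems": 1,       # flush every item
    "MaximumRawDataSizeMB": 1024,    # illustrative size limit
}
print(ingestion_batching_command("<your-database-name>", policy))
```

Run the resulting command in the database query tab, as described above.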
 
-## Configure your secret
-
-For the destination stage to connect to Azure Data Explorer, it needs access to a secret that contains the authentication details. To create a secret:
-
-1. Use the following command to add a secret to your Azure Key Vault that contains the client secret you made a note of when you created the service principal:
-
-    ```azurecli
-    az keyvault secret set --vault-name <your-key-vault-name> --name AccessADXSecret --value <client-secret>
-    ```
-
-1. Add the secret reference to your Kubernetes cluster by following the steps in [Manage secrets for your Azure IoT Operations deployment](../deploy-iot-ops/howto-manage-secrets.md).
-
 ## Configure the destination stage
 
 The Azure Data Explorer destination stage JSON configuration defines the details of the stage. To author the stage, you can either interact with the form-based UI, or provide the JSON configuration on the **Advanced** tab:
@@ -77,11 +93,14 @@ The Azure Data Explorer destination stage JSON configuration defines the details
 | Database | String | The database name. | Yes | - | |
 | Table | String | The name of the table to write to. | Yes | - | |
 | Batch | [Batch](../process-data/concept-configuration-patterns.md#batch) | How to [batch](../process-data/concept-configuration-patterns.md#batch) data. | No | `60s` | `10s` |
-| Authentication<sup>1</sup> | The authentication details to connect to Azure Data Explorer. | Service principal | Yes | - |
+| Retry | [Retry](../process-data/concept-configuration-patterns.md#retry) | The retry policy to use. | No | `default` | `fixed` |
+| Authentication<sup>1</sup> | String | The authentication details to connect to Azure Data Explorer: `Service principal` or `Managed identity`. | Yes | `Service principal` | |
 | Columns&nbsp;>&nbsp;Name | string | The name of the column. | Yes | | `temperature` |
 | Columns&nbsp;>&nbsp;Path | [Path](../process-data/concept-configuration-patterns.md#path) | The location within each record of the data where the value of the column should be read from. | No | `.{{name}}` | `.temperature` |
 
-Authentication<sup>1</sup>: Currently, the destination stage supports service principal based authentication when it connects to Azure Data Explorer. In your Azure Data Explorer destination, provide the following values to authenticate. You made a note of these values when you created the service principal and added the secret reference to your cluster.
+<sup>1</sup>Authentication: Currently, the destination stage supports service principal based authentication or a managed identity when it connects to Azure Data Explorer.
+
+To configure service principal based authentication, provide the following values. You made a note of these values when you created the service principal and added the secret reference to your cluster.
 
 | Field | Description | Required |
 | --- | --- | --- |
@@ -149,7 +168,12 @@ The following JSON example shows a complete Azure Data Explorer destination stag
     "name": "IsSpare",
     "path": ".IsSpare"
   }
-]
+],
+"retry": {
+  "type": "fixed",
+  "interval": "20s",
+  "maxRetries": 4
+}
 }
 ```
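To reason about how long the stage above keeps retrying, the following Python sketch computes the cumulative wait before each attempt. This assumes a fixed retry policy sleeps the same interval between attempts; the exact timing semantics aren't spelled out here, so treat this as an approximation.

```python
def fixed_retry_waits(interval_s: int, max_retries: int) -> list[int]:
    """Cumulative seconds spent waiting before each retry attempt,
    assuming a fixed interval between attempts."""
    return [interval_s * attempt for attempt in range(1, max_retries + 1)]

# For the example stage above ("interval": "20s", "maxRetries": 4):
print(fixed_retry_waits(20, 4))
# [20, 40, 60, 80]
```

Under this assumption, the stage gives up after roughly 80 seconds of accumulated retry delay.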

@@ -189,6 +213,8 @@ The following example shows a sample input message to the Azure Data Explorer de
 ## Related content
 
 - [Send data to Microsoft Fabric](howto-configure-destination-fabric.md)
+- [Send data to Azure Blob Storage](howto-configure-destination-blob.md)
 - [Send data to a gRPC endpoint](../process-data/howto-configure-destination-grpc.md)
+- [Send data to an HTTP endpoint](../process-data/howto-configure-destination-http.md)
 - [Publish data to an MQTT broker](../process-data/howto-configure-destination-mq-broker.md)
 - [Send data to the reference data store](../process-data/howto-configure-destination-reference-store.md)
