Commit 7ec1401

Merge pull request #274044 from PatAltimore/patricka-mq-0.4.0-release-aio-april-updates
IoT MQ data lake 0.4.0 changes
2 parents ce9e2b7 + d3e7377 commit 7ec1401

File tree

1 file changed: +56 −55 lines changed


articles/iot-operations/connect-to-cloud/howto-configure-data-lake.md

Lines changed: 56 additions & 55 deletions
@@ -7,7 +7,7 @@ ms.author: patricka
 ms.topic: how-to
 ms.custom:
   - ignite-2023
-ms.date: 04/23/2024
+ms.date: 05/01/2024
 
 #CustomerIntent: As an operator, I want to understand how to configure Azure IoT MQ so that I can send data from Azure IoT MQ to Data Lake Storage.
 ---
@@ -53,6 +53,8 @@ Configure a data lake connector to connect to Microsoft Fabric OneLake using man
 
 1. Ensure that IoT MQ Arc extension is installed and configured with managed identity.
 
+1. In Azure portal, go to the Arc-connected Kubernetes cluster and select **Settings** > **Extensions**. In the extension list, look for your IoT MQ extension name. The name begins with `mq-` followed by five random characters. For example, *mq-4jgjs*.
+
 1. Get the *app ID* associated to the IoT MQ Arc extension managed identity, and note down the GUID value. The *app ID* is different than the object or principal ID. You can use the Azure CLI by finding the object ID of the managed identity and then querying the app ID of the service principal associated to the managed identity. For example:
 
     ```bash
@@ -223,51 +225,44 @@ authentication:
 
 Configure the data lake connector to send data to an Azure Data Explorer endpoint using managed identity.
 
-### Deploy an Azure Data Explorer cluster
-
-1. To deploy an Azure Data Explorer cluster, follow the **Full cluster** steps in the [Quickstart: Create an Azure Data Explorer cluster and database](/azure/data-explorer/create-cluster-database-portal).
+1. To deploy an Azure Data Explorer cluster, follow the **Full cluster** steps in the [Quickstart: Create an Azure Data Explorer cluster and database](/azure/data-explorer/create-cluster-and-database?tabs=full).
 
 1. After the cluster is created, create a database to store your data.
 
 1. You can create a table for given data via the Azure portal and create columns manually, or you can use [KQL](/azure/data-explorer/kusto/management/create-table-command) in the query tab.
 
     For example:
 
     ```kql
     .create table thermostat (
         externalAssetId: string,
         assetName: string,
-        currentTemperature: real,
-        pressure: real,
-        mqttTopic: string,
-        timestamp: datetime
+        CurrentTemperature: real,
+        Pressure: real,
+        MqttTopic: string,
+        Timestamp: datetime
     )
     ```
 
 ### Enable streaming ingestion
 
-Enable streaming ingestion on your table and database. In the query tab, run the following command, substituting `<TABLE_NAME>` and `<DATABASE_NAME>` with your table and database names:
+Enable streaming ingestion on your table and database. In the query tab, run the following command, substituting `<DATABASE_NAME>` with your database name:
 
 ```kql
-.alter table <TABLE_NAME> policy streamingingestion enable
 .alter database <DATABASE_NAME> policy streamingingestion enable
 ```
 
 For example:
 
 ```kql
-.alter table thermostat policy streamingingestion enable
 .alter database TestDatabase policy streamingingestion enable
 ```
-
-### Deploy ARC extension
-
-Deploy the broker as an ARC extension so that you get managed identity support. Follow the steps outlined in [Quickstart: Deploy Azure IoT Operations Preview to an Arc-enabled Kubernetes cluster](../get-started/quickstart-deploy.md).
 
 ### Add the managed identity to the Azure Data Explorer cluster
 
 In order for the connector to authenticate to Azure Data Explorer, you must add the managed identity to the Azure Data Explorer cluster.
 
-1. In Azure portal, go to the Arc-connected Kubernetes cluster and select **Settings** > **Extensions**. In the extension list, look for the name of your MQ extension. The name begins with `mq-` followed by five random characters. For example, *mq-4jgjs*. The MQ extension name is the same as the MQ managed identity name.
+1. In Azure portal, go to the Arc-connected Kubernetes cluster and select **Settings** > **Extensions**. In the extension list, look for the name of your IoT MQ extension. The name begins with `mq-` followed by five random characters. For example, *mq-4jgjs*. The IoT MQ extension name is the same as the MQ managed identity name.
 1. In your Azure Data Explorer database, select **Permissions** > **Add** > **Ingestor**. Search for the MQ managed identity name and add it.
 
 For more information on adding permissions, see [Manage Azure Data Explorer cluster permissions](/azure/data-explorer/manage-cluster-permissions).
@@ -280,59 +275,73 @@ Example deployment file for the Azure Data Explorer connector. Comments that beg
 
 ```yaml
 apiVersion: mq.iotoperations.azure.com/v1beta1
 kind: DataLakeConnector
 metadata:
-  name: my-datalake-connector
-  namespace: mq
+  name: my-adx-connector
+  namespace: azure-iot-operations
 spec:
   protocol: v5
   image:
-    repository: edgebuilds.azurecr.io/datalake
-    tag: edge
+    repository: mcr.microsoft.com/azureiotoperations/datalake
+    tag: 0.4.0-preview
     pullPolicy: Always
   instances: 1
-  logLevel: "debug"
   databaseFormat: "adx"
   target:
     adx:
-      # TODO: insert the ADX cluster endpoint formatted as <cluster>.<region>.kusto.windows.net
-      endpoint: "<endpoint>"
+      endpoint: https://<cluster>.<region>.kusto.windows.net
       authentication:
         systemAssignedManagedIdentity:
           audience: "https://api.kusto.windows.net"
+  localBrokerConnection:
+    endpoint: aio-mq-dmqtt-frontend:8883
+    tls:
+      tlsEnabled: true
+      trustedCaCertificateConfigMap: aio-ca-trust-bundle-test-only
+    authentication:
+      kubernetes: {}
 ---
 apiVersion: mq.iotoperations.azure.com/v1beta1
 kind: DataLakeConnectorTopicMap
 metadata:
-  name: datalake-topicmap
+  name: adx-topicmap
   namespace: azure-iot-operations
 spec:
-  dataLakeConnectorRef: "my-datalake-connector"
+  dataLakeConnectorRef: my-adx-connector
   mapping:
     allowedLatencySecs: 1
     messagePayloadType: "json"
     maxMessagesPerBatch: 10
     clientId: id
-    mqttSourceTopic: "dlc"
+    mqttSourceTopic: "azure-iot-operations/data/thermostat"
     qos: 1
     table:
-      # TODO: add db and table name
-      tablePath: "<db>"
       tableName: "thermostat"
       schema:
-      - name: "externalAssetId"
+      - name: externalAssetId
         format: utf8
         optional: false
-        mapping: "data.externalAssetId"
-      - name: "assetName"
+        mapping: $property.externalAssetId
+      - name: assetName
         format: utf8
         optional: false
-        mapping: "data.assetName"
-      - name: "currentTemperature"
+        mapping: DataSetWriterName
+      - name: CurrentTemperature
         format: float32
         optional: false
-        mapping: "$data.currentTemperature"
-      - name: "pressure"
-        format: float32
-        optional: false
-        mapping: "$data.pressure"
+        mapping: Payload.temperature.Value
+      - name: Pressure
+        format: float32
+        optional: true
+        mapping: "Payload.Tag 10.Value"
+      - name: MqttTopic
+        format: utf8
+        optional: false
+        mapping: $topic
+      - name: Timestamp
+        format: timestamp
+        optional: false
+        mapping: $received_time
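To make the new topic-map schema above concrete, here is a minimal sketch (hypothetical helper `resolve`, not the connector's implementation) of how each payload-path column is extracted from a sample ua-deltaframe message like the one shown later in this article:

```python
import json

# Sample message shape taken from this article's ua-deltaframe example.
message = {
    "SequenceNumber": 4697,
    "Timestamp": "2024-04-02T22:36:03.1827681Z",
    "DataSetWriterName": "thermostat-de",
    "MessageType": "ua-deltaframe",
    "Payload": {
        "temperature": {"SourceTimestamp": "2024-04-02T22:36:02.6949717Z", "Value": 5506},
        "Tag 10": {"SourceTimestamp": "2024-04-02T22:36:02.6949888Z", "Value": 5506},
    },
}

def resolve(payload: dict, path: str):
    """Walk a dotted path such as 'Payload.temperature.Value'.

    Keys containing spaces ('Payload.Tag 10.Value') work because we split
    on '.' only, which is why that mapping is quoted in the YAML above.
    """
    node = payload
    for key in path.split("."):
        node = node[key]
    return node

# Columns whose mapping is a path into the message body:
row = {
    "assetName": resolve(message, "DataSetWriterName"),
    "CurrentTemperature": resolve(message, "Payload.temperature.Value"),
    "Pressure": resolve(message, "Payload.Tag 10.Value"),
}
print(json.dumps(row))
```

The remaining columns (`externalAssetId`, `MqttTopic`, `Timestamp`) come from built-in mappings rather than the payload, so they aren't modeled here.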
@@ -361,16 +370,7 @@ This example accepts data from the `dlc` topic with messages in JSON format such
 }
 ```
 
-Example command to send data to Azure Data Explorer:
 
-```bash
-mosquitto_pub -t dlc -q 1 -r -V 5 -d -i "orderClient" \
-  -h 10.0.0.4 \
-  -m '{"data": {"externalAssetID": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx", "assetName": "thermostat-de", "currentTemperature": 5506, "pressure": 5506}}' \
-  --repeat 10 \
-  --repeat-delay 0 \
-  -c
-```
 
 ## DataLakeConnector
 
@@ -429,9 +429,8 @@ The specification field of a DataLakeConnectorTopicMap resource contains the fol
 - `name`: The name of the column in the Delta table.
 - `format`: The data type of the column in the Delta table. It can be one of `boolean`, `int8`, `int16`, `int32`, `int64`, `uInt8`, `uInt16`, `uInt32`, `uInt64`, `float16`, `float32`, `float64`, `date32`, `timestamp`, `binary`, or `utf8`. Unsigned types, like `uInt8`, aren't fully supported, and are treated as signed types if specified here.
 - `optional`: A boolean value that indicates whether the column is optional or required. This field is optional and defaults to false.
-- `mapping`: JSON path expression that defines how to extract the value of the column from the MQTT message payload. Built-in mappings `$client_id`, `$topic`, `$properties`, `$topicSegment`, and `$received_time` are available to use as columns to enrich the JSON in MQTT message body. This field is required.
+- `mapping`: JSON path expression that defines how to extract the value of the column from the MQTT message payload. Built-in mappings `$client_id`, `$topic`, `$properties`, and `$received_time` are available to use as columns to enrich the JSON in MQTT message body. This field is required.
   Use $properties for MQTT user properties. For example, $properties.assetId represents the value of the assetId property from the MQTT message.
-  Use $topicSegment to identify the segment of the topic that you want to extract. For example, if the topic is `devices/+/messages/events`, you can do $topicSegment.1 to get the first segment. The segment index is 1-based.
 
 Here's an example of a *DataLakeConnectorTopicMap* resource:
 
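The built-in mappings retained above can be sketched as follows; `MqttMessage` and `resolve_builtin` are names invented for this illustration, not the connector's actual API:

```python
from datetime import datetime, timezone

# Hypothetical container for the message metadata the built-in mappings read.
class MqttMessage:
    def __init__(self, client_id, topic, user_properties, received_time):
        self.client_id = client_id
        self.topic = topic
        self.user_properties = user_properties  # MQTT v5 user properties
        self.received_time = received_time

def resolve_builtin(msg: MqttMessage, mapping: str):
    """Resolve one of the built-in mapping expressions to a column value."""
    if mapping == "$client_id":
        return msg.client_id
    if mapping == "$topic":
        return msg.topic
    if mapping == "$received_time":
        return msg.received_time
    if mapping.startswith("$properties."):
        # e.g. $properties.assetId -> value of the assetId user property
        return msg.user_properties.get(mapping.split(".", 1)[1])
    raise ValueError(f"not a built-in mapping: {mapping}")

msg = MqttMessage(
    client_id="thermostat-client",
    topic="azure-iot-operations/data/thermostat",
    user_properties={"assetId": "asset-001"},
    received_time=datetime.now(timezone.utc),
)
print(resolve_builtin(msg, "$properties.assetId"))  # asset-001
```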
@@ -475,7 +474,9 @@ spec:
         mapping: $received_time
 ```
 
-Stringified JSON like `"{\"SequenceNumber\": 4697, \"Timestamp\": \"2024-04-02T22:36:03.1827681Z\", \"DataSetWriterName\": \"thermostat-de\", \"MessageType\": \"ua-deltaframe\", \"Payload\": {\"temperature\": {\"SourceTimestamp\": \"2024-04-02T22:36:02.6949717Z\", \"Value\": 5506}, \"Tag 10\": {\"SourceTimestamp\": \"2024-04-02T22:36:02.6949888Z\", \"Value\": 5506}}}"` isn't supported and causes the connector to throw a *convertor found a null value* error. An example message for the `dlc` topic that works with this schema:
+Stringified JSON like `"{\"SequenceNumber\": 4697, \"Timestamp\": \"2024-04-02T22:36:03.1827681Z\", \"DataSetWriterName\": \"thermostat-de\", \"MessageType\": \"ua-deltaframe\", \"Payload\": {\"temperature\": {\"SourceTimestamp\": \"2024-04-02T22:36:02.6949717Z\", \"Value\": 5506}, \"Tag 10\": {\"SourceTimestamp\": \"2024-04-02T22:36:02.6949888Z\", \"Value\": 5506}}}"` isn't supported and causes the connector to throw a *convertor found a null value* error.
+
+An example message for the `dlc` topic that works with this schema:
 
 ```json
 {
