
Commit 290f04d

committed
Important dataflow configuration
1 parent 11f37d5 commit 290f04d

File tree

3 files changed: +176 -17 lines changed

articles/iot-operations/connect-to-cloud/howto-configure-dataflow-endpoint.md

Lines changed: 18 additions & 0 deletions
@@ -27,6 +27,24 @@ Use the following table to choose the endpoint type to configure:
| [Microsoft Fabric OneLake](howto-configure-fabric-endpoint.md) | For uploading data to Microsoft Fabric OneLake lakehouses. | No | Yes |
| [Local storage](howto-configure-local-storage-endpoint.md) | For sending data to a locally available persistent volume, through which you can upload data via Azure Container Storage enabled by Azure Arc edge volumes. | No | Yes |

## Dataflows must use local MQTT broker endpoint

When you create a dataflow, you specify the source and destination endpoints. The dataflow moves data from the source endpoint to the destination endpoint. You can use the same endpoint for multiple dataflows, and you can use the same endpoint as both the source and destination in a dataflow.

However, using custom endpoints as both the source and destination in a dataflow isn't supported. This means the built-in MQTT broker in Azure IoT Operations must be either the source or destination for every dataflow. To avoid dataflow deployment failures, use the [default MQTT dataflow endpoint](./howto-configure-mqtt-endpoint.md#default-endpoint) as either the source or destination for every dataflow.

Specifically, each dataflow must have either its source or destination configured with an MQTT endpoint whose host is `aio-broker`. Using the default endpoint isn't strictly required; you can create other dataflow endpoints that point to the local MQTT broker, as long as the host is `aio-broker`. However, to avoid confusion and manageability issues, use the default endpoint.

The following table shows the supported scenarios:
| Scenario | Supported |
|----------|-----------|
| Default endpoint as source | Yes |
| Default endpoint as destination | Yes |
| Custom endpoint as source | Yes, if the destination is the default endpoint or an MQTT endpoint with host `aio-broker` |
| Custom endpoint as destination | Yes, if the source is the default endpoint or an MQTT endpoint with host `aio-broker` |
| Custom endpoint as source and destination | No, unless one of them is an MQTT endpoint with host `aio-broker` |
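For example, a dataflow with the default endpoint as the source and a custom endpoint as the destination is a supported combination. The following minimal sketch illustrates the shape of such a resource; the endpoint name `my-cloud-endpoint` and the topic names are hypothetical placeholders:

```yaml
apiVersion: connectivity.iotoperations.azure.com/v1beta1
kind: Dataflow
metadata:
  name: supported-dataflow-example
  namespace: azure-iot-operations
spec:
  profileRef: default
  operations:
    - operationType: Source
      sourceSettings:
        # Default endpoint (host aio-broker) as the source satisfies the local broker requirement
        endpointRef: default
        dataSources:
          - sensors/telemetry
    - operationType: Destination
      destinationSettings:
        # A custom endpoint as the destination is allowed because the source is the local broker
        endpointRef: my-cloud-endpoint
        dataDestination: telemetry-topic
```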
## Reuse endpoints

Think of each dataflow endpoint as a bundle of configuration settings that contains where the data should come from or go to (the `host` value), how to authenticate with the endpoint, and other settings like TLS configuration or batching preference. You only need to create an endpoint once, and you can then reuse it in multiple dataflows where these settings are the same.
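For instance, two dataflows can reference the same endpoint by name. These abbreviated fragments (resource and endpoint names are hypothetical) show the reuse pattern:

```yaml
# Both dataflows reference the same endpoint definition by name
kind: Dataflow
metadata:
  name: dataflow-a
spec:
  operations:
    - operationType: Destination
      destinationSettings:
        endpointRef: my-event-hub-endpoint   # endpoint created once
---
kind: Dataflow
metadata:
  name: dataflow-b
spec:
  operations:
    - operationType: Destination
      destinationSettings:
        endpointRef: my-event-hub-endpoint   # same endpoint reused here
```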

articles/iot-operations/connect-to-cloud/howto-configure-mqtt-endpoint.md

Lines changed: 12 additions & 5 deletions
@@ -6,7 +6,7 @@ ms.author: patricka
ms.service: azure-iot-operations
ms.subservice: azure-data-flows
ms.topic: how-to
-ms.date: 10/30/2024
+ms.date: 11/01/2024
ai-usage: ai-assisted

#CustomerIntent: As an operator, I want to understand how to configure dataflow endpoints for MQTT sources and destinations in Azure IoT Operations so that I can send data to and from MQTT brokers.
@@ -25,17 +25,24 @@ MQTT dataflow endpoints are used for MQTT sources and destinations. You can conf
## Azure IoT Operations local MQTT broker

Azure IoT Operations provides a [built-in local MQTT broker](../manage-mqtt-broker/overview-iot-mq.md) that you can use with dataflows. You can use the MQTT broker as a source to receive messages from other systems or as a destination to send messages to other systems.

### Default endpoint

-Azure IoT Operations provides a built-in MQTT broker that you can use with dataflows. When you deploy Azure IoT Operations, an MQTT broker dataflow endpoint named "default" is created with default settings. You can use this endpoint as a source or destination for dataflows. The default endpoint uses the following settings:
+When you deploy Azure IoT Operations, an MQTT broker dataflow endpoint named "default" is created with default settings. You can use this endpoint as a source or destination for dataflows.

> [!IMPORTANT]
> The default endpoint **must always be used as either the source or destination in every dataflow**. To learn more, see [Dataflows must use local MQTT broker endpoint](./howto-configure-dataflow-endpoint.md#dataflows-must-use-local-mqtt-broker-endpoint).

The default endpoint uses the following settings:

- Host: `aio-broker:18883` through the [default MQTT broker listener](../manage-mqtt-broker/howto-configure-brokerlistener.md#default-brokerlistener)
- Authentication: service account token (SAT) through the [default BrokerAuthentication resource](../manage-mqtt-broker/howto-configure-authentication.md#default-brokerauthentication-resource)
- TLS: Enabled
- Trusted CA certificate: The default CA certificate `azure-iot-operations-aio-ca-trust-bundle` from the [default root CA](../deploy-iot-ops/concept-default-root-ca.md)

-> [!IMPORTANT]
-> If any of these default MQTT broker settings change, the dataflow endpoint must be updated to reflect the new settings. For example, if the default MQTT broker listener changes to use a different service name `my-mqtt-broker` and port 8885, you must update the endpoint to use the new host `host: my-mqtt-broker:8885`. The same applies to other settings like authentication and TLS.
+> [!CAUTION]
+> Don't delete the default endpoint. If you delete the default endpoint, you must recreate it with the same settings.

To view or edit the default MQTT broker endpoint settings:

@@ -104,7 +111,7 @@ kubectl get dataflowendpoint default -n azure-iot-operations -o yaml
### Create new endpoint

-You can also create new local MQTT broker endpoints with custom settings. For example, you can create a new MQTT broker endpoint using a different port, authentication, or other settings.
+You can also create new local MQTT broker endpoints with custom settings. For example, you can create a new MQTT broker endpoint using a different port, authentication, or authorization settings. However, you must still always use the default endpoint as either the source or destination in every dataflow, even if you create new endpoints.
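A custom local broker endpoint should keep the `aio-broker` host so dataflows that use it still satisfy the local broker requirement. Here is a sketch of such an endpoint, assuming the `DataflowEndpoint` Kubernetes API shape used elsewhere in these docs (the name `custom-local-mqtt` is a placeholder, and field names may differ across API versions):

```yaml
apiVersion: connectivity.iotoperations.azure.com/v1beta1
kind: DataflowEndpoint
metadata:
  name: custom-local-mqtt
  namespace: azure-iot-operations
spec:
  endpointType: Mqtt
  mqttSettings:
    # Keep the aio-broker host so the local broker requirement is met
    host: aio-broker:18883
    authentication:
      method: ServiceAccountToken
      serviceAccountTokenSettings:
        audience: aio-internal
    tls:
      mode: Enabled
      trustedCaCertificateConfigMapRef: azure-iot-operations-aio-ca-trust-bundle
```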
# [Portal](#tab/portal)

articles/iot-operations/connect-to-cloud/howto-create-dataflow.md

Lines changed: 146 additions & 12 deletions
@@ -6,7 +6,7 @@ ms.author: patricka
ms.service: azure-iot-operations
ms.subservice: azure-data-flows
ms.topic: how-to
-ms.date: 10/30/2024
+ms.date: 11/01/2024
ai-usage: ai-assisted

#CustomerIntent: As an operator, I want to understand how to create a dataflow to connect data sources.
@@ -39,7 +39,10 @@ flowchart LR
:::image type="content" source="media/howto-create-dataflow/dataflow.svg" alt-text="Diagram of a dataflow showing flow from source to transform then destination.":::

To define the source and destination, you need to configure the dataflow endpoints. The transformation is optional and can include operations like enriching the data, filtering the data, and mapping the data to another field.

> [!IMPORTANT]
> Each dataflow must have the Azure IoT Operations local MQTT broker default endpoint [as *either* the source or destination](#proper-dataflow-configuration).

You can use the operations experience in Azure IoT Operations to create a dataflow. The operations experience provides a visual interface to configure the dataflow. You can also use Bicep to create a dataflow using a Bicep template file, or use Kubernetes to create a dataflow using a YAML file.

@@ -168,7 +171,7 @@ Review the following sections to learn how to configure the operation types of t
To configure a source for the dataflow, specify the endpoint reference and a list of data sources for the endpoint.

-### Use Asset as source
+### Use asset as source

# [Portal](#tab/portal)

@@ -194,6 +197,10 @@ Configuring an asset as a source is only available in the operations experience.
---

When using an asset as the source, the asset definition is used to infer the schema for the dataflow. The asset definition includes the schema for the asset's datapoints. To learn more, see [Manage asset configurations remotely](../discover-manage-assets/howto-manage-assets-remotely.md).

Once configured, the data from the asset reaches the dataflow via the local MQTT broker. So when you use an asset as the source, the dataflow actually uses the local MQTT broker default endpoint as its source.

### Use default MQTT endpoint as source

# [Portal](#tab/portal)
@@ -243,7 +250,7 @@ Because `dataSources` allows you to specify MQTT or Kafka topics without modifyi
---

-For more information about the default MQTT endpoint and creating an MQTT endpoint as a dataflow source, see [MQTT Endpoint](howto-configure-mqtt-endpoint.md).
+If the default endpoint isn't used as the source, it must be used as the [destination](#destination). To learn more, see [Dataflows must use local MQTT broker endpoint](./howto-configure-dataflow-endpoint.md#dataflows-must-use-local-mqtt-broker-endpoint).

### Use custom MQTT or Kafka dataflow endpoint as source

@@ -748,7 +755,7 @@ For more information about schema registry, see [Understand message schemas](con
To configure a destination for the dataflow, specify the endpoint reference and data destination. You can specify a list of data destinations for the endpoint.

-To send data to a destination other than the local MQTT broker, create a dataflow endpoint. To learn how, see [Configure dataflow endpoints](howto-configure-dataflow-endpoint.md).
+To send data to a destination other than the local MQTT broker, create a dataflow endpoint. To learn how, see [Configure dataflow endpoints](howto-configure-dataflow-endpoint.md). If the destination isn't the local MQTT broker, the local MQTT broker must be used as the source. To learn more, see [Dataflows must use local MQTT broker endpoint](./howto-configure-dataflow-endpoint.md#dataflows-must-use-local-mqtt-broker-endpoint).

> [!IMPORTANT]
> Storage endpoints require a schema reference. If you've created storage destination endpoints for Microsoft Fabric OneLake, ADLS Gen 2, Azure Data Explorer, or local storage, you must specify a schema reference.
@@ -878,7 +885,113 @@ destinationSettings:
## Example

-The following example is a dataflow configuration that uses the MQTT endpoint for the source and destination. The source filters the data from the MQTT topics `thermostats/+/telemetry/temperature/#` and `humidifiers/+/telemetry/humidity/#`. The transformation converts the temperature to Fahrenheit and filters the data where the temperature is less than 100000. The destination sends the data to the MQTT topic `factory`.
+The following example is a dataflow configuration that uses the MQTT endpoint for the source and destination. The source filters the data from the MQTT topic `azure-iot-operations/data/thermostat`. The transformation converts the temperature to Fahrenheit and filters the data where the temperature multiplied by the humidity is less than 100000. The destination sends the data to the MQTT topic `factory`.
# [Bicep](#tab/bicep)

```bicep
param aioInstanceName string = '<AIO_INSTANCE_NAME>'
param customLocationName string = '<CUSTOM_LOCATION_NAME>'
param dataflowName string = '<DATAFLOW_NAME>'

resource aioInstance 'Microsoft.IoTOperations/instances@2024-09-15-preview' existing = {
  name: aioInstanceName
}

resource customLocation 'Microsoft.ExtendedLocation/customLocations@2021-08-31-preview' existing = {
  name: customLocationName
}

// Pointer to the default dataflow endpoint
resource defaultDataflowEndpoint 'Microsoft.IoTOperations/instances/dataflowEndpoints@2024-09-15-preview' existing = {
  parent: aioInstance
  name: 'default'
}

// Pointer to the default dataflow profile
resource defaultDataflowProfile 'Microsoft.IoTOperations/instances/dataflowProfiles@2024-09-15-preview' existing = {
  parent: aioInstance
  name: 'default'
}

resource dataflow 'Microsoft.IoTOperations/instances/dataflowProfiles/dataflows@2024-09-15-preview' = {
  // Reference to the parent dataflow profile, the default profile in this case
  // Same usage as profileRef in Kubernetes YAML
  parent: defaultDataflowProfile
  name: dataflowName
  extendedLocation: {
    name: customLocation.id
    type: 'CustomLocation'
  }
  properties: {
    mode: 'Enabled'
    operations: [
      {
        operationType: 'Source'
        sourceSettings: {
          // Use the default MQTT endpoint as the source
          endpointRef: defaultDataflowEndpoint.name
          // Filter the data from the MQTT topic azure-iot-operations/data/thermostat
          dataSources: [
            'azure-iot-operations/data/thermostat'
          ]
        }
      }
      // Transformation optional
      {
        operationType: 'BuiltInTransformation'
        builtInTransformationSettings: {
          // Filter the data where temperature * "Tag 10" < 100000
          filter: [
            {
              inputs: [
                'temperature.Value'
                '"Tag 10".Value'
              ]
              expression: '$1 * $2 < 100000'
            }
          ]
          map: [
            // Passthrough all values by default
            {
              inputs: [
                '*'
              ]
              output: '*'
            }
            // Convert temperature to Fahrenheit and output it to TemperatureF
            {
              inputs: [
                'temperature.Value'
              ]
              output: 'TemperatureF'
              expression: 'cToF($1)'
            }
            // Extract the "Tag 10" value and output it to Humidity
            {
              inputs: [
                '"Tag 10".Value'
              ]
              output: 'Humidity'
            }
          ]
        }
      }
      {
        operationType: 'Destination'
        destinationSettings: {
          // Use the default MQTT endpoint as the destination
          endpointRef: defaultDataflowEndpoint.name
          // Send the data to the MQTT topic factory
          dataDestination: 'factory'
        }
      }
    ]
  }
}
```

# [Kubernetes](#tab/kubernetes)

```yaml
apiVersion: connectivity.iotoperations.azure.com/v1beta1
@@ -887,43 +1000,52 @@ metadata:
  name: my-dataflow
  namespace: azure-iot-operations
spec:
  # Reference to the default dataflow profile
  profileRef: default
  mode: Enabled
  operations:
    - operationType: Source
      sourceSettings:
        # Use the default MQTT endpoint as the source
        endpointRef: default
        # Filter the data from the MQTT topic azure-iot-operations/data/thermostat
        dataSources:
-         - thermostats/+/telemetry/temperature/#
-         - humidifiers/+/telemetry/humidity/#
          - azure-iot-operations/data/thermostat
    # Transformation optional
    - operationType: builtInTransformation
      builtInTransformationSettings:
        # Filter the data where temperature * "Tag 10" < 100000
        filter:
          - inputs:
              - 'temperature.Value'
              - '"Tag 10".Value'
-           expression: "$1*$2<100000"
            expression: '$1 * $2 < 100000'
        map:
          # Passthrough all values by default
          - inputs:
              - '*'
            output: '*'
          # Convert temperature to Fahrenheit and output it to TemperatureF
          - inputs:
              - temperature.Value
            output: TemperatureF
            expression: cToF($1)
          # Extract the "Tag 10" value and output it to Humidity
          - inputs:
              - '"Tag 10".Value'
-           output: 'Tag 10'
            output: 'Humidity'
    - operationType: Destination
      destinationSettings:
        # Use the default MQTT endpoint as the destination
        endpointRef: default
        # Send the data to the MQTT topic factory
        dataDestination: factory
```

-<!-- TODO: add links to examples in the reference docs -->

---

To see more examples of dataflow configurations, see [Azure REST API - Dataflow](/rest/api/iotoperations/dataflow/create-or-update#examples) and the [quickstart Bicep](https://github.com/Azure-Samples/explore-iot-operations/blob/main/samples/quickstarts/quickstart.bicep).
## Verify a dataflow is working
Follow [Tutorial: Bi-directional MQTT bridge to Azure Event Grid](tutorial-mqtt-bridge.md) to verify the dataflow is working.
@@ -950,6 +1072,18 @@ kubectl get dataflow my-dataflow -o yaml > my-dataflow.yaml
---

## Proper dataflow configuration

To ensure the dataflow is working as expected, verify the following:

- The default MQTT dataflow endpoint [must be used as *either* the source or destination](./howto-configure-dataflow-endpoint.md#dataflows-must-use-local-mqtt-broker-endpoint).
- The dataflow profile exists and is referenced in the dataflow configuration.
- The source is either an MQTT endpoint, a Kafka endpoint, or an asset. [Storage type endpoints can't be used as a source](./howto-configure-dataflow-endpoint.md).
- When using Event Grid as the source, the [dataflow profile instance count](./howto-configure-dataflow-profile.md#scaling) is set to 1, because the Event Grid MQTT broker doesn't support shared subscriptions.
- When using Event Hubs as the source, each event hub in the namespace is a separate Kafka topic and must be specified as the data source.
- Transformation, if used, is configured with proper syntax, including proper [escaping of special characters](./concept-dataflow-mapping.md#escaping).
- When using storage type endpoints as the destination, a [schema reference is specified](#serialize-data-according-to-a-schema).
## Next steps

- [Map data by using dataflows](concept-dataflow-mapping.md)
