articles/iot-operations/connect-to-cloud/howto-create-dataflow.md (86 additions, 78 deletions)
@@ -169,39 +169,11 @@ Review the following sections to learn how to configure the operation types of t
## Source

- To configure a source for the dataflow, specify the endpoint reference and a list of data sources for the endpoint.
+ To configure a source for the dataflow, specify the endpoint reference and a list of data sources for the endpoint. Choose one of the following options as the source for the dataflow.
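
For reference, those two settings appear together in a source configuration. The following is a minimal sketch in the Kubernetes (preview) style used later in this article; the endpoint name and topic filter are placeholder values, so adjust them to your deployment:

```yaml
sourceSettings:
  # Reference to the dataflow endpoint to read from
  endpointRef: default
  # List of data sources: MQTT topic filters (or Kafka topics) on that endpoint
  dataSources:
    - thermostats/+/telemetry/#
```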

- ### Use asset as source

- # [Portal](#tab/portal)

- You can use an [asset](../discover-manage-assets/overview-manage-assets.md) as the source for the dataflow. Using an asset as a source is only available in the operations experience.

- 1. Under **Source details**, select **Asset**.
- 1. Select the asset you want to use as the source endpoint.
- 1. Select **Proceed**.

-    A list of datapoints for the selected asset is displayed.

-    :::image type="content" source="media/howto-create-dataflow/dataflow-source-asset.png" alt-text="Screenshot using operations experience to select an asset as the source endpoint.":::

- 1. Select **Apply** to use the asset as the source endpoint.

- # [Bicep](#tab/bicep)

- Configuring an asset as a source is only available in the operations experience.

- # [Kubernetes (preview)](#tab/kubernetes)

- Configuring an asset as a source is only available in the operations experience.

- ---

- When using an asset as the source, the asset definition is used to infer the schema for the dataflow. The asset definition includes the schema for the asset's datapoints. To learn more, see [Manage asset configurations remotely](../discover-manage-assets/howto-manage-assets-remotely.md).

- Once configured, the data from the asset reached the dataflow via the local MQTT broker. So, when using an asset as the source, the dataflow uses the local MQTT broker default endpoint as the source in actuality.

+ If the default endpoint isn't used as the source, it must be used as the [destination](#destination). To learn more, see [Dataflows must use local MQTT broker endpoint](./howto-configure-dataflow-endpoint.md#dataflows-must-use-local-mqtt-broker-endpoint).
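
As a sketch of that rule, a dataflow that reads from a custom cloud endpoint sends its output through the default local MQTT broker endpoint. The operation layout and field names below are assumptions based on the Bicep and Kubernetes examples elsewhere in this article, and the endpoint and topic names are placeholders:

```yaml
operations:
  - operationType: Source
    sourceSettings:
      endpointRef: my-event-grid-endpoint   # custom MQTT source endpoint (placeholder name)
      dataSources:
        - telemetry/#
  - operationType: Destination
    destinationSettings:
      endpointRef: default                  # local MQTT broker default endpoint
      dataDestination: factory/telemetry
```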

- ### Use default MQTT endpoint as source
+ ### Option 1: Use default MQTT endpoint as source

# [Portal](#tab/portal)
@@ -250,9 +222,37 @@ Because `dataSources` allows you to specify MQTT or Kafka topics without modifyi
---

- If the default endpoint isn't used as the source, it must be used as the [destination](#destination). To learn more about, see [Dataflows must use local MQTT broker endpoint](./howto-configure-dataflow-endpoint.md#dataflows-must-use-local-mqtt-broker-endpoint).

+ ### Option 2: Use asset as source

- ### Use custom MQTT or Kafka dataflow endpoint as source

+ # [Portal](#tab/portal)

+ You can use an [asset](../discover-manage-assets/overview-manage-assets.md) as the source for the dataflow. Using an asset as a source is only available in the operations experience.

+ 1. Under **Source details**, select **Asset**.
+ 1. Select the asset you want to use as the source endpoint.
+ 1. Select **Proceed**.

+    A list of datapoints for the selected asset is displayed.

+    :::image type="content" source="media/howto-create-dataflow/dataflow-source-asset.png" alt-text="Screenshot using operations experience to select an asset as the source endpoint.":::

+ 1. Select **Apply** to use the asset as the source endpoint.

+ # [Bicep](#tab/bicep)

+ Configuring an asset as a source is only available in the operations experience.

+ # [Kubernetes (preview)](#tab/kubernetes)

+ Configuring an asset as a source is only available in the operations experience.

+ ---

+ When using an asset as the source, the asset definition is used to infer the schema for the dataflow. The asset definition includes the schema for the asset's datapoints. To learn more, see [Manage asset configurations remotely](../discover-manage-assets/howto-manage-assets-remotely.md).

+ Once configured, the data from the asset reaches the dataflow via the local MQTT broker. So, when using an asset as the source, the dataflow actually uses the local MQTT broker default endpoint as the source.

+ ### Option 3: Use custom MQTT or Kafka dataflow endpoint as source

If you created a custom MQTT or Kafka dataflow endpoint (for example, to use with Event Grid or Event Hubs), you can use it as the source for the dataflow. Remember that storage type endpoints, like Data Lake or Fabric OneLake, can't be used as source.
@@ -384,14 +384,16 @@ sourceSettings:
---

- If the instance count in the [dataflow profile](howto-configure-dataflow-profile.md) is greater than 1, shared subscription is automatically enabled for all dataflows that use MQTT source. In this case, the `$shared` prefix is added and the shared subscription group name automatically generated. For example, if you have a dataflow profile with an instance count of 3, and your dataflow uses an MQTT endpoint as source configured with topics `topic1` and `topic2`, they are automatically converted to shared subscriptions as `$shared/<GENERATED_GROUP_NAME>/topic1` and `$shared/<GENERATED_GROUP_NAME>/topic2`. If you want to use a different shared subscription group ID, you can override it in the topic, like `$shared/mygroup/topic1`.
+ If the instance count in the [dataflow profile](howto-configure-dataflow-profile.md) is greater than one, shared subscription is automatically enabled for all dataflows that use an MQTT source. In this case, the `$shared` prefix is added and the shared subscription group name is automatically generated. For example, if you have a dataflow profile with an instance count of 3, and your dataflow uses an MQTT endpoint as source configured with topics `topic1` and `topic2`, they are automatically converted to shared subscriptions as `$shared/<GENERATED_GROUP_NAME>/topic1` and `$shared/<GENERATED_GROUP_NAME>/topic2`.

+ You can explicitly create a topic named `$shared/mygroup/topic` in your configuration. However, adding the `$shared` topic explicitly isn't recommended because the `$shared` prefix is added automatically when needed, and dataflows can optimize with the group name when it isn't set. For example, when `$shared` isn't set, dataflows only have to operate on the topic name.
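
For example, a source configuration that overrides the generated group ID for one topic while letting the other be converted automatically might look like the following sketch (field names follow the Kubernetes format used elsewhere in this article):

```yaml
sourceSettings:
  endpointRef: default
  dataSources:
    - $shared/mygroup/topic1   # explicit shared subscription group ID
    - topic2                   # converted to $shared/<GENERATED_GROUP_NAME>/topic2 automatically
```
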
> [!IMPORTANT]
- > Dataflows requireing shared subscription when instance count is greater than 1 is important when using Event Grid MQTT broker as a source since it [doesn't support shared subscriptions](../../event-grid/mqtt-support.md#mqttv5-current-limitations). To avoid missing messages, set the dataflow profile instance count to 1 when using Event Grid MQTT broker as the source. That is when the dataflow is the subscriber and receiving messages from the cloud.
+ > The requirement for shared subscriptions when the instance count is greater than one matters when you use the Event Grid MQTT broker as a source, because it [doesn't support shared subscriptions](../../event-grid/mqtt-support.md#mqttv5-current-limitations). To avoid missing messages, set the dataflow profile instance count to one when using the Event Grid MQTT broker as the source. That is, when the dataflow is the subscriber and receives messages from the cloud.

#### Kafka topics

- When the source is a Kafka (Event Hubs included) endpoint, specify the individual kafka topics to subscribe to for incoming messages. Wildcards are not supported, so you must specify each topic statically.
+ When the source is a Kafka (Event Hubs included) endpoint, specify the individual Kafka topics to subscribe to for incoming messages. Wildcards are not supported, so you must specify each topic statically.

> [!NOTE]
> When using Event Hubs via the Kafka endpoint, each individual event hub within the namespace is the Kafka topic. For example, if you have an Event Hubs namespace with two event hubs, `thermostats` and `humidifiers`, you can specify each event hub as a Kafka topic.
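
For instance, that Event Hubs example expressed as Kafka data sources might look like the following sketch (the endpoint name is a placeholder, and the field names follow the Kubernetes format used elsewhere in this article):

```yaml
sourceSettings:
  endpointRef: my-event-hubs-endpoint
  dataSources:
    # Each event hub in the namespace is addressed as a Kafka topic; wildcards aren't supported.
    - thermostats
    - humidifiers
```
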
@@ -430,7 +432,7 @@ sourceSettings:
### Specify source schema

- When using MQTT or Kafka as the source, you can specify a schema to display the list of data points in the operations experience portal. Note that using a schema to deserialize and validate incoming messages [isn't currently supported](../troubleshoot/known-issues.md#dataflows).
+ When using MQTT or Kafka as the source, you can specify a [schema](concept-schema-registry.md) to display the list of data points in the operations experience portal. Note that using a schema to deserialize and validate incoming messages [isn't currently supported](../troubleshoot/known-issues.md#dataflows).

If the source is an asset, the schema is automatically inferred from the asset definition.
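
As a rough sketch only, referencing a registered schema from the source settings might look like the following in the Kubernetes format. The `serializationFormat` and `schemaRef` field names and the schema registry URI are assumptions here, so check the schema registry article linked above for the exact syntax:

```yaml
sourceSettings:
  endpointRef: default
  serializationFormat: Json                                 # assumed field name for the message format
  schemaRef: aio-sr://exampleNamespace/thermostatSchema:1   # assumed reference to a schema registry entry
  dataSources:
    - thermostats/+/telemetry/#
```
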
@@ -469,9 +471,11 @@ To learn more, see [Understand message schemas](concept-schema-registry.md).
The transformation operation is where you can transform the data from the source before you send it to the destination. Transformations are optional. If you don't need to make changes to the data, don't include the transformation operation in the dataflow configuration. Multiple transformations are chained together in stages regardless of the order in which they're specified in the configuration. The order of the stages is always:

- 1. **Enrich**, **Rename**, or add a **New property**: Add additional data to the source data given a dataset and condition to match.
+ 1. **Enrich**: Add additional data to the source data given a dataset and condition to match.
1. **Filter**: Filter the data based on a condition.
- 1. **Map** or **Compute**: Move data from one field to another with an optional conversion.
+ 1. **Map**, **Compute**, **Rename**, or add a **New property**: Move data from one field to another with an optional conversion.

+ This section is an introduction to dataflow transforms. For more detailed information, see [Map data by using dataflows](concept-dataflow-mapping.md), [Convert data by using dataflow conversions](concept-dataflow-conversions.md), and [Enrich data by using dataflows](concept-dataflow-enrich.md).
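
As an illustration of how chained stages appear in a configuration, the following sketch shows a filter followed by a map in the Kubernetes format. The `builtInTransformationSettings` layout, the expression syntax, and the `$1` input placeholder are assumptions based on the mapping and conversion articles linked above:

```yaml
builtInTransformationSettings:
  # Stages run in the fixed order described above, regardless of how they're listed here.
  filter:
    - inputs:
        - temperature
      expression: '$1 > 20'            # Filter stage: keep only readings above 20
  map:
    - inputs:
        - temperature
      output: temperatureCelsius
      expression: '($1 - 32) * 5 / 9'  # Map stage: convert and store in a new field
```
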
# [Portal](#tab/portal)
@@ -517,43 +521,7 @@ You can load sample data into the state store by using the [state store CLI](htt
# [Portal](#tab/portal)

- In the operations experience, the *Enrich* stage is currently supported using the **Rename** and **New property** transforms.

- #### Rename

- You can rename a datapoint using the **Rename** transform. This operation is used to rename a datapoint in the source data to a new name. The new name can be used in the subsequent stages of the dataflow.

- 1. Under **Transform (optional)**, select **Rename** > **Add**.

-    :::image type="content" source="media/howto-create-dataflow/dataflow-rename.png" alt-text="Screenshot using operations experience to rename a datapoint.":::

-    | Setting | Description |
-    | ------- | ----------- |
-    | Datapoint | Select a datapoint from the dropdown or enter a $metadata header using the format `$metadata.<header>.` |
-    | New datapoint name | Enter the new name for the datapoint. |
-    | Description | Provide a description for the transformation. |

- 1. Select **Apply**.

- #### New property

- You can add a new property to the source data using the **New property** transform. This operation is used to add a new property to the source data. The new property can be used in the subsequent stages of the dataflow.

- 1. Under **Transform (optional)**, select **New property** > **Add**.

-    :::image type="content" source="media/howto-create-dataflow/dataflow-new-property.png" alt-text="Screenshot using operations experience to add a new property.":::

-    | Setting | Description |
-    | ------- | ----------- |
-    | Property key | Enter the key for the new property. |
-    | Property value | Enter the value for the new property. |
-    | Description | Provide a description for the new property. |

- 1. Select **Apply**.

+ Currently, the *Enrich* stage isn't supported in the operations experience.

# [Bicep](#tab/bicep)
@@ -668,7 +636,11 @@ To map the data to another field with optional conversion, you can use the `map`
# [Portal](#tab/portal)

- In the operations experience, mapping is currently supported using **Compute** transforms.
+ In the operations experience, mapping is currently supported using **Compute**, **Rename**, and **New property** transforms.

+ #### Compute

+ You can use the **Compute** transform to apply a formula to the source data and store the result in a new field.

1. Under **Transform (optional)**, select **Compute** > **Add**.
@@ -691,6 +663,42 @@ In the operations experience, mapping is currently supported using **Compute** t
1. Select **Apply**.

+ #### Rename

+ You can rename a datapoint in the source data using the **Rename** transform. The new name can be used in the subsequent stages of the dataflow.

+ 1. Under **Transform (optional)**, select **Rename** > **Add**.

+    :::image type="content" source="media/howto-create-dataflow/dataflow-rename.png" alt-text="Screenshot using operations experience to rename a datapoint.":::

+    | Setting | Description |
+    | ------- | ----------- |
+    | Datapoint | Select a datapoint from the dropdown or enter a $metadata header using the format `$metadata.<header>`. |
+    | New datapoint name | Enter the new name for the datapoint. |
+    | Description | Provide a description for the transformation. |

+ 1. Select **Apply**.

+ #### New property

+ You can add a new property to the source data using the **New property** transform. The new property can be used in the subsequent stages of the dataflow.

+ 1. Under **Transform (optional)**, select **New property** > **Add**.

+    :::image type="content" source="media/howto-create-dataflow/dataflow-new-property.png" alt-text="Screenshot using operations experience to add a new property.":::

+    | Setting | Description |
+    | ------- | ----------- |
+    | Property key | Enter the key for the new property. |
+    | Property value | Enter the value for the new property. |
+    | Description | Provide a description for the new property. |

+ 1. Select **Apply**.

# [Bicep](#tab/bicep)
For example, you could use the `temperature` field in the source data to convert the temperature to Celsius and store it in the `temperatureCelsius` field. You could also enrich the source data with the `location` field from the contextualization dataset:
@@ -1148,4 +1156,4 @@ To ensure the dataflow is working as expected, verify the following:
- [Convert data by using dataflows](concept-dataflow-conversions.md)
- [Enrich data by using dataflows](concept-dataflow-enrich.md)