Commit 95fbdbb

Add feedback
1 parent e4d5134 commit 95fbdbb


articles/iot-operations/connect-to-cloud/howto-create-dataflow.md

Lines changed: 41 additions & 39 deletions
@@ -169,39 +169,11 @@ Review the following sections to learn how to configure the operation types of t

## Source

-To configure a source for the dataflow, specify the endpoint reference and a list of data sources for the endpoint.
+To configure a source for the dataflow, specify the endpoint reference and a list of data sources for the endpoint. Choose one of the following options as the source for the dataflow.

-### Use asset as source
-
-# [Portal](#tab/portal)
-
-You can use an [asset](../discover-manage-assets/overview-manage-assets.md) as the source for the dataflow. Using an asset as a source is only available in the operations experience.
-
-1. Under **Source details**, select **Asset**.
-1. Select the asset you want to use as the source endpoint.
-1. Select **Proceed**.
-
-A list of datapoints for the selected asset is displayed.
-
-:::image type="content" source="media/howto-create-dataflow/dataflow-source-asset.png" alt-text="Screenshot using operations experience to select an asset as the source endpoint.":::
-
-1. Select **Apply** to use the asset as the source endpoint.
-
-# [Bicep](#tab/bicep)
-
-Configuring an asset as a source is only available in the operations experience.
-
-# [Kubernetes (preview)](#tab/kubernetes)
-
-Configuring an asset as a source is only available in the operations experience.
-
----
-
-When using an asset as the source, the asset definition is used to infer the schema for the dataflow. The asset definition includes the schema for the asset's datapoints. To learn more, see [Manage asset configurations remotely](../discover-manage-assets/howto-manage-assets-remotely.md).
-
-Once configured, the data from the asset reached the dataflow via the local MQTT broker. So, when using an asset as the source, the dataflow uses the local MQTT broker default endpoint as the source in actuality.
+If the default endpoint isn't used as the source, it must be used as the [destination](#destination). To learn more, see [Dataflows must use local MQTT broker endpoint](./howto-configure-dataflow-endpoint.md#dataflows-must-use-local-mqtt-broker-endpoint).

-### Use default MQTT endpoint as source
+### Option 1: Use default MQTT endpoint as source

# [Portal](#tab/portal)

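For context on the new Option 1, here's a minimal sketch of how a source using the default MQTT broker endpoint might look in the Kubernetes `sourceSettings` block that the later hunks reference. The topic filters and the `default` endpoint name are illustrative assumptions, not part of this commit.

```yaml
sourceSettings:
  # Reference the built-in local MQTT broker endpoint (name assumed to be `default`)
  endpointRef: default
  # MQTT topic filters to subscribe to; MQTT sources can use wildcards
  dataSources:
    - thermostats/+/telemetry/temperature/#
    - humidifiers/+/telemetry/humidity/#
```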
@@ -250,9 +222,37 @@ Because `dataSources` allows you to specify MQTT or Kafka topics without modifyi

---

-If the default endpoint isn't used as the source, it must be used as the [destination](#destination). To learn more about, see [Dataflows must use local MQTT broker endpoint](./howto-configure-dataflow-endpoint.md#dataflows-must-use-local-mqtt-broker-endpoint).
+### Option 2: Use asset as source

-### Use custom MQTT or Kafka dataflow endpoint as source
+# [Portal](#tab/portal)
+
+You can use an [asset](../discover-manage-assets/overview-manage-assets.md) as the source for the dataflow. Using an asset as a source is only available in the operations experience.
+
+1. Under **Source details**, select **Asset**.
+1. Select the asset you want to use as the source endpoint.
+1. Select **Proceed**.
+
+A list of datapoints for the selected asset is displayed.
+
+:::image type="content" source="media/howto-create-dataflow/dataflow-source-asset.png" alt-text="Screenshot using operations experience to select an asset as the source endpoint.":::
+
+1. Select **Apply** to use the asset as the source endpoint.
+
+# [Bicep](#tab/bicep)
+
+Configuring an asset as a source is only available in the operations experience.
+
+# [Kubernetes (preview)](#tab/kubernetes)
+
+Configuring an asset as a source is only available in the operations experience.
+
+---
+
+When using an asset as the source, the asset definition is used to infer the schema for the dataflow. The asset definition includes the schema for the asset's datapoints. To learn more, see [Manage asset configurations remotely](../discover-manage-assets/howto-manage-assets-remotely.md).
+
+Once configured, the data from the asset reaches the dataflow via the local MQTT broker. So, when using an asset as the source, the dataflow actually uses the local MQTT broker default endpoint as the source.
+
+### Option 3: Use custom MQTT or Kafka dataflow endpoint as source

If you created a custom MQTT or Kafka dataflow endpoint (for example, to use with Event Grid or Event Hubs), you can use it as the source for the dataflow. Remember that storage type endpoints, like Data Lake or Fabric OneLake, can't be used as source.

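For Option 3, only the endpoint reference changes. A rough sketch, assuming a custom endpoint named `my-event-grid-endpoint` (the endpoint name and topic are hypothetical):

```yaml
sourceSettings:
  # Reference a previously created custom MQTT or Kafka dataflow endpoint
  endpointRef: my-event-grid-endpoint
  dataSources:
    - telemetry/aio
```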
@@ -384,14 +384,16 @@ sourceSettings:
---


-If the instance count in the [dataflow profile](howto-configure-dataflow-profile.md) is greater than 1, shared subscription is automatically enabled for all dataflows that use MQTT source. In this case, the `$shared` prefix is added and the shared subscription group name automatically generated. For example, if you have a dataflow profile with an instance count of 3, and your dataflow uses an MQTT endpoint as source configured with topics `topic1` and `topic2`, they are automatically converted to shared subscriptions as `$shared/<GENERATED_GROUP_NAME>/topic1` and `$shared/<GENERATED_GROUP_NAME>/topic2`. If you want to use a different shared subscription group ID, you can override it in the topic, like `$shared/mygroup/topic1`.
+If the instance count in the [dataflow profile](howto-configure-dataflow-profile.md) is greater than one, shared subscription is automatically enabled for all dataflows that use an MQTT source. In this case, the `$shared` prefix is added and the shared subscription group name is automatically generated. For example, if you have a dataflow profile with an instance count of 3, and your dataflow uses an MQTT endpoint as the source configured with topics `topic1` and `topic2`, they are automatically converted to shared subscriptions as `$shared/<GENERATED_GROUP_NAME>/topic1` and `$shared/<GENERATED_GROUP_NAME>/topic2`.
+
+You can explicitly create a topic named `$shared/mygroup/topic` in your configuration. However, adding the `$shared` prefix explicitly isn't recommended because it's added automatically when needed. Dataflows can make optimizations when the group name isn't set; for example, if `$shared` isn't specified, dataflows only have to operate on the topic name.

> [!IMPORTANT]
-> Dataflows requireing shared subscription when instance count is greater than 1 is important when using Event Grid MQTT broker as a source since it [doesn't support shared subscriptions](../../event-grid/mqtt-support.md#mqttv5-current-limitations). To avoid missing messages, set the dataflow profile instance count to 1 when using Event Grid MQTT broker as the source. That is when the dataflow is the subscriber and receiving messages from the cloud.
+> The requirement for shared subscriptions when the instance count is greater than one matters when using Event Grid MQTT broker as a source, since it [doesn't support shared subscriptions](../../event-grid/mqtt-support.md#mqttv5-current-limitations). To avoid missing messages, set the dataflow profile instance count to one when using Event Grid MQTT broker as the source. That's the case when the dataflow is the subscriber and receives messages from the cloud.

#### Kafka topics

-When the source is a Kafka (Event Hubs included) endpoint, specify the individual kafka topics to subscribe to for incoming messages. Wildcards are not supported, so you must specify each topic statically.
+When the source is a Kafka (Event Hubs included) endpoint, specify the individual Kafka topics to subscribe to for incoming messages. Wildcards are not supported, so you must specify each topic statically.

> [!NOTE]
> When using Event Hubs via the Kafka endpoint, each individual event hub within the namespace is the Kafka topic. For example, if you have an Event Hubs namespace with two event hubs, `thermostats` and `humidifiers`, you can specify each event hub as a Kafka topic.
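Two small sketches tying this hunk together. First, overriding the shared subscription group ID directly in a topic (topic names are illustrative):

```yaml
sourceSettings:
  endpointRef: default
  dataSources:
    # Explicit group ID; if omitted and the profile instance count is greater
    # than one, $shared/<GENERATED_GROUP_NAME>/ is prepended automatically
    - $shared/mygroup/topic1
    - topic2
```

Second, Kafka topics for an Event Hubs namespace, using the event hub names from the note above (the endpoint name is hypothetical):

```yaml
sourceSettings:
  endpointRef: my-event-hubs-endpoint
  dataSources:
    # Each event hub in the namespace is a Kafka topic; no wildcards
    - thermostats
    - humidifiers
```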
@@ -430,7 +432,7 @@ sourceSettings:

### Specify schema to deserialize data

-If the source data has optional fields or fields with different types, specify a deserialization schema to ensure consistency. For example, the data might have fields that aren't present in all messages. Without the schema, the transformation can't handle these fields as they would have empty values. With the schema, you can specify default values or ignore the fields.
+If the source data has optional fields or fields with different types, specify a [deserialization schema](concept-schema-registry.md) to ensure consistency. For example, the data might have fields that aren't present in all messages. Without the schema, the transformation can't handle these fields because they would have empty values. With the schema, you can specify default values or ignore the fields.

Specifying the schema is only relevant when using the MQTT or Kafka source. If the source is an asset, the schema is automatically inferred from the asset definition.

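A hedged sketch of specifying a deserialization schema on the source. The `serializationFormat` and `schemaRef` field names and the `aio-sr://` reference format are assumptions based on the schema registry article, not shown in this diff.

```yaml
sourceSettings:
  endpointRef: default
  dataSources:
    - thermostats/+/telemetry/temperature/#
  # Assumed fields: the payload format and a reference to a schema in the
  # schema registry used to deserialize incoming messages
  serializationFormat: Json
  schemaRef: aio-sr://exampleNamespace/exampleSchema:1.0.0
```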
@@ -511,7 +513,7 @@ builtInTransformationSettings:

To enrich the data, you can use the reference dataset in the Azure IoT Operations [state store](../create-edge-apps/concept-about-state-store-protocol.md). The dataset is used to add extra data to the source data based on a condition. The condition is specified as a field in the source data that matches a field in the dataset.

-You can load sample data into the state store by using the [DSS set tool sample](https://github.com/Azure-Samples/explore-iot-operations/tree/main/samples/dss_set). Key names in the state store correspond to a dataset in the dataflow configuration.
+You can load sample data into the state store by using the [DSS set tool sample](https://github.com/Azure-Samples/explore-iot-operations/tree/main/samples/dss_set) (Linux/x86 only). Key names in the state store correspond to a dataset in the dataflow configuration.

# [Portal](#tab/portal)

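A minimal sketch of how a state store dataset might be wired into the `builtInTransformationSettings` block named in the hunk header. The `equipment` key, input fields, and join expression are illustrative; the key needs to match the key used when loading the reference data into the state store.

```yaml
builtInTransformationSettings:
  datasets:
    # Key of the reference dataset in the state store (illustrative)
    - key: equipment
      inputs:
        - $source.deviceId          # field from the source data ------ $1
        - $context(equipment).asset # field from the reference dataset - $2
      expression: $1 == $2          # condition that matches source to dataset
```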
@@ -1146,4 +1148,4 @@ To ensure the dataflow is working as expected, verify the following:
- [Convert data by using dataflows](concept-dataflow-conversions.md)
- [Enrich data by using dataflows](concept-dataflow-enrich.md)
- [Understand message schemas](concept-schema-registry.md)
-- [Manage dataflow profiles](howto-configure-dataflow-profile.md)
+- [Manage dataflow profiles](howto-configure-dataflow-profile.md)
