Commit e4d5134

Add feedback
1 parent ecc329a commit e4d5134


articles/iot-operations/connect-to-cloud/howto-create-dataflow.md

Lines changed: 38 additions & 40 deletions
@@ -169,11 +169,39 @@ Review the following sections to learn how to configure the operation types of t

## Source

-To configure a source for the dataflow, specify the endpoint reference and a list of data sources for the endpoint. Choose one of the following options as the source for the dataflow.
+To configure a source for the dataflow, specify the endpoint reference and a list of data sources for the endpoint.

-If the default endpoint isn't used as the source, it must be used as the [destination](#destination). To learn more about, see [Dataflows must use local MQTT broker endpoint](./howto-configure-dataflow-endpoint.md#dataflows-must-use-local-mqtt-broker-endpoint).
+### Use asset as source
+
+# [Portal](#tab/portal)
+
+You can use an [asset](../discover-manage-assets/overview-manage-assets.md) as the source for the dataflow. Using an asset as a source is only available in the operations experience.
+
+1. Under **Source details**, select **Asset**.
+1. Select the asset you want to use as the source endpoint.
+1. Select **Proceed**.
+
+A list of datapoints for the selected asset is displayed.
+
+:::image type="content" source="media/howto-create-dataflow/dataflow-source-asset.png" alt-text="Screenshot using operations experience to select an asset as the source endpoint.":::
+
+1. Select **Apply** to use the asset as the source endpoint.

-### Option 1: Use default MQTT endpoint as source
+# [Bicep](#tab/bicep)
+
+Configuring an asset as a source is only available in the operations experience.
+
+# [Kubernetes (preview)](#tab/kubernetes)
+
+Configuring an asset as a source is only available in the operations experience.
+
+---
+
+When using an asset as the source, the asset definition is used to infer the schema for the dataflow. The asset definition includes the schema for the asset's datapoints. To learn more, see [Manage asset configurations remotely](../discover-manage-assets/howto-manage-assets-remotely.md).
+
+Once configured, the data from the asset reaches the dataflow via the local MQTT broker. So, when using an asset as the source, the dataflow actually uses the local MQTT broker default endpoint as the source.
+
+### Use default MQTT endpoint as source

# [Portal](#tab/portal)
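
The added paragraph notes that an asset source ultimately rides on the local MQTT broker default endpoint. As a rough sketch of what that effective source looks like in the Kubernetes (preview) format used elsewhere in this article — the `default` endpoint name and the asset telemetry topic are illustrative assumptions, not part of this commit:

```yaml
# Illustrative sketch only: the effective source when an asset is selected in
# the operations experience. The endpoint name "default" and the topic path
# are assumptions for illustration.
sourceSettings:
  endpointRef: default                    # local MQTT broker default endpoint
  dataSources:
    - azure-iot-operations/data/my-asset  # hypothetical topic carrying the asset's datapoints
```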

@@ -222,37 +250,9 @@ Because `dataSources` allows you to specify MQTT or Kafka topics without modifyi

---

-### Option 2: Use asset as source
-
-# [Portal](#tab/portal)
-
-You can use an [asset](../discover-manage-assets/overview-manage-assets.md) as the source for the dataflow. Using an asset as a source is only available in the operations experience.
-
-1. Under **Source details**, select **Asset**.
-1. Select the asset you want to use as the source endpoint.
-1. Select **Proceed**.
-
-A list of datapoints for the selected asset is displayed.
-
-:::image type="content" source="media/howto-create-dataflow/dataflow-source-asset.png" alt-text="Screenshot using operations experience to select an asset as the source endpoint.":::
-
-1. Select **Apply** to use the asset as the source endpoint.
-
-# [Bicep](#tab/bicep)
-
-Configuring an asset as a source is only available in the operations experience.
-
-# [Kubernetes (preview)](#tab/kubernetes)
-
-Configuring an asset as a source is only available in the operations experience.
-
----
-
-When using an asset as the source, the asset definition is used to infer the schema for the dataflow. The asset definition includes the schema for the asset's datapoints. To learn more, see [Manage asset configurations remotely](../discover-manage-assets/howto-manage-assets-remotely.md).
-
-Once configured, the data from the asset reached the dataflow via the local MQTT broker. So, when using an asset as the source, the dataflow uses the local MQTT broker default endpoint as the source in actuality.
+If the default endpoint isn't used as the source, it must be used as the [destination](#destination). To learn more, see [Dataflows must use local MQTT broker endpoint](./howto-configure-dataflow-endpoint.md#dataflows-must-use-local-mqtt-broker-endpoint).

-### Option 3: Use custom MQTT or Kafka dataflow endpoint as source
+### Use custom MQTT or Kafka dataflow endpoint as source

If you created a custom MQTT or Kafka dataflow endpoint (for example, to use with Event Grid or Event Hubs), you can use it as the source for the dataflow. Remember that storage type endpoints, like Data Lake or Fabric OneLake, can't be used as source.
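
To make the retained rule concrete: a dataflow that reads from a custom Kafka endpoint must then write to the default MQTT broker endpoint. A minimal Kubernetes-style sketch under assumed property names (`operations`, `endpointRef`, `destinationSettings`, `dataDestination`) and hypothetical endpoint and topic names:

```yaml
# Illustrative sketch only: custom Kafka endpoint as source, so the default
# MQTT broker endpoint takes the destination role. Endpoint and topic names
# are hypothetical.
operations:
  - operationType: Source
    sourceSettings:
      endpointRef: my-event-hubs-endpoint   # custom Kafka dataflow endpoint
      dataSources:
        - thermostats                       # event hub (Kafka topic) to read from
  - operationType: Destination
    destinationSettings:
      endpointRef: default                  # default MQTT broker endpoint as destination
      dataDestination: factory/telemetry    # MQTT topic to publish to
```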

@@ -384,16 +384,14 @@ sourceSettings:
---


-If the instance count in the [dataflow profile](howto-configure-dataflow-profile.md) is greater than one, shared subscription is automatically enabled for all dataflows that use MQTT source. In this case, the `$shared` prefix is added and the shared subscription group name automatically generated. For example, if you have a dataflow profile with an instance count of 3, and your dataflow uses an MQTT endpoint as source configured with topics `topic1` and `topic2`, they are automatically converted to shared subscriptions as `$shared/<GENERATED_GROUP_NAME>/topic1` and `$shared/<GENERATED_GROUP_NAME>/topic2`.
-
-You can explicitly create a topic named `$shared/mygroup/topic` in your configuration. However, adding the `$shared` topic explicitly isn't recommended since the `$shared` prefix is automatically added when needed. Dataflows can make optimizations with the group name if it isn't set. For example, `$share` isn't set and dataflows only has to operate over the topic name.
+If the instance count in the [dataflow profile](howto-configure-dataflow-profile.md) is greater than 1, shared subscription is automatically enabled for all dataflows that use an MQTT source. In this case, the `$shared` prefix is added and the shared subscription group name is automatically generated. For example, if you have a dataflow profile with an instance count of 3, and your dataflow uses an MQTT endpoint as source configured with topics `topic1` and `topic2`, they are automatically converted to shared subscriptions as `$shared/<GENERATED_GROUP_NAME>/topic1` and `$shared/<GENERATED_GROUP_NAME>/topic2`. If you want to use a different shared subscription group ID, you can override it in the topic, like `$shared/mygroup/topic1`.

> [!IMPORTANT]
-> Dataflows requiring shared subscription when instance count is greater than one is important when using Event Grid MQTT broker as a source since it [doesn't support shared subscriptions](../../event-grid/mqtt-support.md#mqttv5-current-limitations). To avoid missing messages, set the dataflow profile instance count to one when using Event Grid MQTT broker as the source. That is when the dataflow is the subscriber and receiving messages from the cloud.
+> The requirement for shared subscriptions when the instance count is greater than 1 is important when using Event Grid MQTT broker as a source, since it [doesn't support shared subscriptions](../../event-grid/mqtt-support.md#mqttv5-current-limitations). To avoid missing messages, set the dataflow profile instance count to 1 when using Event Grid MQTT broker as the source. That is, when the dataflow is the subscriber and receives messages from the cloud.

#### Kafka topics

-When the source is a Kafka (Event Hubs included) endpoint, specify the individual Kafka topics to subscribe to for incoming messages. Wildcards are not supported, so you must specify each topic statically.
+When the source is a Kafka (Event Hubs included) endpoint, specify the individual kafka topics to subscribe to for incoming messages. Wildcards are not supported, so you must specify each topic statically.

> [!NOTE]
> When using Event Hubs via the Kafka endpoint, each individual event hub within the namespace is the Kafka topic. For example, if you have an Event Hubs namespace with two event hubs, `thermostats` and `humidifiers`, you can specify each event hub as a Kafka topic.
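
As a sketch of the shared subscription behavior the updated paragraph describes, assuming a dataflow profile property named `instanceCount` (illustrative) and the topics from the example:

```yaml
# Illustrative sketch only: with a dataflow profile instance count greater
# than 1 (property name instanceCount assumed), MQTT source topics become
# shared subscriptions automatically.
sourceSettings:
  endpointRef: default
  dataSources:
    - topic1                    # consumed as $shared/<GENERATED_GROUP_NAME>/topic1
    - topic2                    # consumed as $shared/<GENERATED_GROUP_NAME>/topic2
    - $shared/mygroup/topic3    # explicit override of the shared subscription group ID
```
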
@@ -432,7 +430,7 @@ sourceSettings:

### Specify schema to deserialize data

-If the source data has optional fields or fields with different types, specify a [deserialization schema](concept-schema-registry.md) to ensure consistency. For example, the data might have fields that aren't present in all messages. Without the schema, the transformation can't handle these fields as they would have empty values. With the schema, you can specify default values or ignore the fields.
+If the source data has optional fields or fields with different types, specify a deserialization schema to ensure consistency. For example, the data might have fields that aren't present in all messages. Without the schema, the transformation can't handle these fields as they would have empty values. With the schema, you can specify default values or ignore the fields.

Specifying the schema is only relevant when using the MQTT or Kafka source. If the source is an asset, the schema is automatically inferred from the asset definition.
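
A rough sketch of how a deserialization schema might be referenced in the Kubernetes (preview) source settings; the `serializationFormat` and `schemaRef` property names and the schema reference format are assumptions to illustrate the idea, not taken from this commit:

```yaml
# Illustrative sketch only: point the source at a registered schema so that
# optional fields deserialize consistently. Property names and the schema
# reference format are assumptions.
sourceSettings:
  endpointRef: default
  serializationFormat: Json
  schemaRef: aio-sr://exampleNamespace/exampleSchema:1.0.0
  dataSources:
    - thermostats/telemetry
```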

@@ -513,7 +511,7 @@ builtInTransformationSettings:

To enrich the data, you can use the reference dataset in the Azure IoT Operations [state store](../create-edge-apps/concept-about-state-store-protocol.md). The dataset is used to add extra data to the source data based on a condition. The condition is specified as a field in the source data that matches a field in the dataset.

-You can load sample data into the state store by using the [DSS set tool sample](https://github.com/Azure-Samples/explore-iot-operations/tree/main/samples/dss_set) (Linux/x86 only). Key names in the state store correspond to a dataset in the dataflow configuration.
+You can load sample data into the state store by using the [DSS set tool sample](https://github.com/Azure-Samples/explore-iot-operations/tree/main/samples/dss_set). Key names in the state store correspond to a dataset in the dataflow configuration.

# [Portal](#tab/portal)
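
To picture how a state store key lines up with a dataset in the dataflow configuration, here's a hedged sketch; the `datasets`, `key`, `inputs`, and `expression` property names and the `$source`/`$context` syntax are assumptions for illustration:

```yaml
# Illustrative sketch only: a dataset whose key matches an entry loaded into
# the state store with the DSS set tool, joined to the source data on a
# matching field. All property names and the expression syntax are assumptions.
builtInTransformationSettings:
  datasets:
    - key: equipment                      # state store key holding the reference data
      inputs:
        - $source.assetId                 # field from the incoming message
        - $context(equipment).assetId     # field from the reference dataset
      expression: $1 == $2                # condition that matches source data to the dataset
```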
