
Commit 2255506

Merge pull request #283508 from PatAltimore/patricka-dataflows-release-aio-july-updates
Dataflow updates
2 parents 9cb5f6b + a319d77 commit 2255506

13 files changed: +63 -261 lines changed

articles/iot-operations/connect-to-cloud/concept-dataflow-conversions.md

Lines changed: 6 additions & 4 deletions
@@ -5,16 +5,18 @@ author: PatAltimore
 ms.author: patricka
 ms.subservice: azure-data-flows
 ms.topic: concept-article
-ms.date: 08/01/2024
+ms.date: 08/03/2024

 #CustomerIntent: As an operator, I want to understand how to use dataflow conversions to transform data.
 ---

 # Convert data using dataflow conversions

+[!INCLUDE [public-preview-note](../includes/public-preview-note.md)]
+
 You can use dataflow conversions to transform data in Azure IoT Operations. The *conversion* element in a dataflow is used to compute values for output fields. You can use input fields, available operations, data types, and type conversions in dataflow conversions.

-Dataflow *conversion* element is used to compute values for output fields:
+The dataflow *conversion* element is used to compute values for output fields:

 ```yaml
 - inputs:
@@ -167,7 +169,7 @@ This mapping creates an array containing the minimum, maximum, average, and mean
 * Handling missing fields in the input by providing an alternative value.
 * Conditionally removing a field based on its presence.

-Example mapping using *missing value*`:
+Example mapping using *missing value*:

 ```json
 {
@@ -217,7 +219,7 @@ Functions can be used in the conversion formula to perform various operations.

 ## Available operations

-Dataflow offers a wide range of out-of-the-box (OOTB) conversion functions that allow users to easily perform unit conversions without the need for complex calculations. These predefined functions cover common conversions such as temperature, pressure, length, weight, and volume. The following is a list of the available conversion functions, along with their corresponding formulas and function names:
+Dataflows offer a wide range of out-of-the-box (OOTB) conversion functions that allow users to easily perform unit conversions without the need for complex calculations. These predefined functions cover common conversions such as temperature, pressure, length, weight, and volume. The following is a list of the available conversion functions, along with their corresponding formulas and function names:

 | Conversion | Formula | Function Name |
 | --- | --- | --- |
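
For context on the *conversion* element this file describes, a minimal, hypothetical mapping entry could look like the following. The field paths, the positional `$1` input reference, and the formula are illustrative assumptions, not taken from the committed files:

```yaml
# Hypothetical sketch only: field paths and the positional $1 reference are assumptions.
- inputs:
    - TemperatureCelsius          # referenced as $1 in the conversion formula below
  output: TemperatureFahrenheit
  conversion: ($1 * 9 / 5) + 32   # compute the output value from the input
```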

articles/iot-operations/connect-to-cloud/concept-dataflow-enrich.md

Lines changed: 3 additions & 1 deletion
@@ -5,13 +5,15 @@ author: PatAltimore
 ms.author: patricka
 ms.subservice: azure-data-flows
 ms.topic: concept-article
-ms.date: 08/01/2024
+ms.date: 08/03/2024

 #CustomerIntent: As an operator, I want to understand how to create a dataflow to enrich data sent to endpoints.
 ---

 # Enrich data using dataflows

+[!INCLUDE [public-preview-note](../includes/public-preview-note.md)]
+
 You can enrich data using the *contextualization datasets* function. When processing incoming records, these datasets can be queried based on conditions that relate to the fields of the incoming record. This capability allows for dynamic interactions where data from these datasets can be used to supplement information in the output fields and participate in complex calculations during the mapping process.

 For example, consider the following dataset with a few records, represented as JSON records:
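
The records themselves fall outside this hunk. As a purely hypothetical illustration, a small contextualization dataset of this kind could look like the following; the field names and values are invented for the example:

```json
[
  { "asset": "example-asset-01", "location": "Seattle", "installDate": "20230105" },
  { "asset": "example-asset-02", "location": "Tacoma", "installDate": "20230107" }
]
```

An incoming record could then be matched on a field such as `asset` (hypothetical name) so that `location` can be copied into the output during mapping.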

articles/iot-operations/connect-to-cloud/concept-dataflow-mapping.md

Lines changed: 7 additions & 7 deletions
@@ -5,7 +5,7 @@ author: PatAltimore
 ms.author: patricka
 ms.subservice: azure-data-flows
 ms.topic: concept-article
-ms.date: 08/02/2024
+ms.date: 08/03/2024

 #CustomerIntent: As an operator, I want to understand how to use the dataflow mapping language to transform data.
 ---
@@ -20,7 +20,7 @@ Mapping allows you to transform data from one format to another. Consider the fo

 ```json
 {
-  "Name": "John Doe",
+  "Name": "Grace Owens",
   "Place of birth": "London, TX",
   "Birth Date": "19840202",
   "Start Date": "20180812",
@@ -34,7 +34,7 @@ Compare it with the output record:
 ```json
 {
   "Employee": {
-    "Name": "John Doe",
+    "Name": "Grace Owens",
     "Date of Birth": "19840202"
   },
   "Employment": {
@@ -104,7 +104,7 @@ Dot-notation is widely used in computer science to reference fields, even recurs
 - Person.Address.Street.Number
 ```

-However, in a dataflow a path described by dot-notation might include strings and some special characters without needing escaping:
+However, in a dataflow, a path described by dot-notation might include strings and some special characters without needing escaping:

 ```yaml
 - inputs:
@@ -120,7 +120,7 @@ However, in other cases, escaping is necessary:

 The previous example, among other special characters, contains dots within the field name, which, without escaping, would serve as a separator in the dot-notation itself.

-While dataflow parses a path, it treats only two characters as special:
+While a dataflow parses a path, it treats only two characters as special:

 * Dots ('.') act as field separators.
 * Quotes, when placed at the beginning or the end of a segment, start an escaped section where dots aren't treated as field separators.
@@ -450,12 +450,12 @@ Consider a special case for the same fields to help deciding the right action:

 An empty `output` field in the second definition implies not writing the fields in the output record (effectively removing `Opacity`). This setup is more of a `Specialization` than a `Second Rule`.

-Resolution of overlapping mappings by dataflow:
+Resolution of overlapping mappings by dataflows:

 * The evaluation progresses from the top rule in the mapping definition.
 * If a new mapping resolves to the same fields as a previous rule, the following applies:
   * A `Rank` is calculated for each resolved input based on the number of segments the wildcard captures. For instance, if the `Captured Segments` are `Properties.Opacity`, the `Rank` is 2. If only `Opacity`, the `Rank` is 1. A mapping without wildcards has a `Rank` of 0.
-  * If the `Rank` of the latter rule is equal to or higher than the previous rule, dataflow treats it as a `Second Rule`.
+  * If the `Rank` of the latter rule is equal to or higher than the previous rule, a dataflow treats it as a `Second Rule`.
   * Otherwise, it treats the configuration as a `Specialization`.

 For example, the mapping that directs `Opacity.Max` and `Opacity.Min` to an empty output has a `Rank` of zero. Since the second rule has a lower `Rank` than the previous, it's considered a specialization and overrides the previous rule, which would calculate a value for `Opacity`
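
To make the escaping rules in the dot-notation hunks above concrete, here is a minimal sketch. The path segment `Tag.10` and the output field name are hypothetical examples, not taken from the committed files:

```yaml
# Hypothetical sketch of the escaping rules: quotes around a segment keep its
# inner dot from acting as a field separator.
- inputs:
    - Payload."Tag.10".Value   # three segments: Payload, Tag.10, Value
  output: TagTenValue          # hypothetical output field name
```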

articles/iot-operations/connect-to-cloud/howto-configure-dataflow-endpoint.md

Lines changed: 15 additions & 91 deletions
@@ -1,11 +1,11 @@
 ---
-title: Configure dataflow endpoints
+title: Configure dataflow endpoints in Azure IoT Operations
 description: Configure dataflow endpoints to create connection points for data sources.
 author: PatAltimore
 ms.author: patricka
 ms.subservice: azure-data-flows
-ms.topic: conceptual
-ms.date: 07/26/2024
+ms.topic: how-to
+ms.date: 08/03/2024

 #CustomerIntent: As an operator, I want to understand how to configure source and destination endpoints so that I can create a dataflow.
 ---
@@ -23,19 +23,22 @@ kind: DataflowEndpoint
 metadata:
   name: <endpoint-name>
 spec:
-  endpointType: <endpointType> # mqtt, kafka, dataExplorer, dataLakeStorage, fabricOneLake, or localStorage
+  endpointType: <endpointType> # mqtt, kafka, or localStorage
   authentication:
     method: <method> # systemAssignedManagedIdentity, x509Credentials, userAssignedManagedIdentity, or serviceAccountToken
     systemAssignedManagedIdentitySettings: # Required if method is systemAssignedManagedIdentity
       audience: https://eventgrid.azure.net
-    x509CredentialsSettings: # Required if method is x509Credentials
-      certificateSecretName: x509-certificate
-    userAssignedManagedIdentitySettings: # Required if method is userAssignedManagedIdentity
-      clientId: <id>
-      tenantId: <id>
-      audience: https://eventgrid.azure.net
-    serviceAccountTokenSettings: # Required if method is serviceAccountToken
-      audience: my-audience
+    ### OR
+    # x509CredentialsSettings: # Required if method is x509Credentials
+    #   certificateSecretName: x509-certificate
+    ### OR
+    # userAssignedManagedIdentitySettings: # Required if method is userAssignedManagedIdentity
+    #   clientId: <id>
+    #   tenantId: <id>
+    #   audience: https://eventgrid.azure.net
+    ### OR
+    # serviceAccountTokenSettings: # Required if method is serviceAccountToken
+    #   audience: my-audience
   mqttSettings: # Required if endpoint type is mqtt
     host: example.westeurope-1.ts.eventgrid.azure.net:8883
     tls: # Omit for no TLS or MQTT.
@@ -304,85 +307,6 @@ spec:

 ## Endpoint types for destinations only

-### Azure Data Lake (ADLSv2)
-
-Azure Data Lake endpoints are used for Azure Data Lake destinations. You can configure the endpoint, authentication, table, and other settings.
-
-```yaml
-apiVersion: connectivity.iotoperations.azure.com/v1beta1
-kind: DataflowEndpoint
-metadata:
-  name: adls
-spec:
-  endpointType: dataLakeStorage
-  authentication:
-    method: systemAssignedManagedIdentity
-    systemAssignedManagedIdentitySettings: {}
-  datalakeStorageSettings:
-    host: example.blob.core.windows.net
-```
-
-Other supported authentication method is SAS tokens or user-assigned managed identity.
-
-```yaml
-spec:
-  authentication:
-    method: accessToken
-    accessTokenSecretRef: <your access token secret name>
-    # OR
-    userAssignedManagedIdentitySettings:
-      clientId: <id>
-      tenantId: <id>
-```
-
-You can also configure batching latency, max bytes, and max messages.
-
-```yaml
-spec:
-  endpointType: dataLakeStorage
-  datalakeStorageSettings:
-    batching:
-      latencyMs: 100
-      maxBytes: 1000000
-      maxMessages: 1000
-```
-
-### Azure Data Explorer (ADX)
-
-Azure Data Explorer endpoints are used for Azure Data Explorer destinations. You can configure the endpoint, authentication, and other settings.
-
-```yaml
-apiVersion: connectivity.iotoperations.azure.com/v1beta1
-kind: DataflowEndpoint
-metadata:
-  name: adx
-spec:
-  endpointType: dataExplorer
-  authentication:
-    method: systemAssignedManagedIdentity
-    systemAssignedManagedIdentitySettings: {}
-    # OR
-    method: userAssignedManagedIdentity
-    userAssignedManagedIdentitySettings:
-      clientId: <id>
-      tenantId: <id>
-  dataExplorerSettings:
-    host: example.westeurope.kusto.windows.net
-    database: example-database
-```
-
-Again, you can configure batching latency, max bytes, and max messages.
-
-```yaml
-spec:
-  endpointType: dataExplorer
-  dataExplorerSettings:
-    batching:
-      latencyMs: 100
-      maxBytes: 1000000
-      maxMessages: 1000
-```
-
 ### Local storage and Edge Storage Accelerator

 Use the local storage option to send data to a locally available persistent volume, through which you can upload data via Edge Storage Accelerator (ESA) edge volumes. In this case, the format must be parquet.
articles/iot-operations/connect-to-cloud/howto-configure-dataflow-profile.md

Lines changed: 6 additions & 6 deletions
@@ -1,13 +1,13 @@
 ---
 title: Configure dataflow profile in Azure IoT Operations
-description: How to configure dataflow profile in Azure IoT Operations to change dataflow behavior.
+description: How to configure a dataflow profile in Azure IoT Operations to change a dataflow behavior.
 author: PatAltimore
 ms.author: patricka
 ms.subservice: azure-data-flows
-ms.topic: conceptual
-ms.date: 07/25/2024
+ms.topic: how-to
+ms.date: 08/03/2024

-#CustomerIntent: As an operator, I want to understand how to I can configure a dataflow profile to control dataflow behavior.
+#CustomerIntent: As an operator, I want to understand how to I can configure a a dataflow profile to control a dataflow behavior.
 ---

 # Configure dataflow profile
@@ -22,7 +22,7 @@ kind: DataflowProfile
 metadata:
   name: my-dataflow-profile
 spec:
-  maxInstances: 4
+  instanceCount: 1
   tolerations:
     ...
   diagnostics:
@@ -47,7 +47,7 @@ spec:

 | Field Name | Description |
 |-------------------------------------------------|-----------------------------------------------------------------------------|
-| `maxInstances` | Number of instances to spread the dataflows across. Optional; automatically determined if not set. |
+| `instanceCount` | Number of instances to spread the dataflow across. Optional; automatically determined if not set. Currently in the preview release, set the value to `1`. |
 | `tolerations` | Node tolerations. Optional; see [Kubernetes Taints and Tolerations](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/). |
 | `diagnostics` | Diagnostics settings. |
 | `diagnostics.logFormat` | Format of the logs. For example, `text`. |
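
Putting the renamed field in context, a minimal profile sketch might look like the following. The `apiVersion` is an assumption based on the DataflowEndpoint examples earlier in this commit, and the preview guidance from the table (`instanceCount: 1`) is applied:

```yaml
# Minimal sketch of a DataflowProfile using instanceCount.
# apiVersion is assumed to match the DataflowEndpoint examples above.
apiVersion: connectivity.iotoperations.azure.com/v1beta1
kind: DataflowProfile
metadata:
  name: my-dataflow-profile
spec:
  instanceCount: 1      # preview release: set to 1
  diagnostics:
    logFormat: text     # example value from the field table
```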
