articles/iot-operations/connect-to-cloud/concept-dataflow-conversions.md (+18 −50)
@@ -5,7 +5,7 @@ author: PatAltimore
 ms.author: patricka
 ms.subservice: azure-data-flows
 ms.topic: concept-article
-ms.date: 10/30/2024
+ms.date: 11/11/2024

 #CustomerIntent: As an operator, I want to understand how to use dataflow conversions to transform data.
 ms.service: azure-iot-operations
@@ -84,7 +84,7 @@ In this example, the conversion results in an array containing the values of `[M

 ## Data types

-Different serialization formats support various data types. For instance, JSON offers a few primitive types: string, number, Boolean, and null. Also included are arrays of these primitive types. In contrast, other serialization formats like Avro have a more complex type system, including integers with multiple bit field lengths and timestamps with different resolutions. Examples are milliseconds and microseconds.
+Different serialization formats support various data types. For instance, JSON offers a few primitive types: string, number, Boolean, and null. It also includes arrays of these primitive types.

 When the mapper reads an input property, it converts it into an internal type. This conversion is necessary for holding the data in memory until it's written out into an output field. The conversion to an internal type happens regardless of whether the input and output serialization formats are the same.
@@ -105,11 +105,7 @@ The internal representation utilizes the following data types:

 ### Input record fields

-When an input record field is read, its underlying type is converted into one of these internal type variants. The internal representation is versatile enough to handle most input types with minimal or no conversion. However, some input types require conversion or are unsupported. Some examples:
-
-* **Avro** `UUID` **type**: It's converted to a `string` because there's no specific `UUID` type in the internal representation.
-* **Avro** `decimal` **type**: It isn't supported by the mapper, so fields of this type can't be included in mappings.
-* **Avro** `duration` **type**: Conversion can vary. If the `months` field is set, it's unsupported. If only `days` and `milliseconds` are set, it's converted to the internal `duration` representation.
+When an input record field is read, its underlying type is converted into one of these internal type variants. The internal representation is versatile enough to handle most input types with minimal or no conversion.

 For some formats, surrogate types are used. For example, JSON doesn't have a `datetime` type and instead stores `datetime` values as strings formatted according to ISO8601. When the mapper reads such a field, the internal representation remains a string.
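For example, a JSON payload might carry a timestamp only as an ISO8601-formatted string; the field names in this sketch are illustrative:

```json
{
  "assetId": "Sensor01",
  "timestamp": "2024-11-11T10:30:00Z"
}
```

The mapper reads `timestamp` as a string, and the internal representation remains a string.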
@@ -126,10 +122,6 @@ The mapper is designed to be flexible by converting internal types into output t
 * Converted to `0`/`1` if the output field is numerical.
 * Converted to `true`/`false` if the output field is string.

-### Explicit type conversions
-
-Although the automatic conversions operate as you might expect based on common implementation practices, there are instances where the right conversion can't be determined automatically and results in an *unsupported* error. To address these situations, several conversion functions are available to explicitly define how data should be transformed. These functions provide more control over how data is converted and help maintain data integrity even when automatic methods fall short.
-
 ### Use a conversion formula with types

 In mappings, an optional formula can specify how data from the input is processed before being written to the output field. If no formula is specified, the mapper copies the input field to the output by using the internal type and conversion rules.
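As a minimal sketch of the formula syntax in the Kubernetes (preview) format, assuming a Celsius input field (the `Temperature` and `TemperatureF` names are illustrative):

```yaml
- inputs:
  - Temperature # - $1
  output: TemperatureF
  expression: ($1 * 9 / 5) + 32 # convert Celsius to Fahrenheit
```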
@@ -186,29 +178,6 @@ expression: 'min($1)'

 This configuration selects the smallest value from the `Measurements` array for the output field.

-It's also possible to use functions that result in a new array:
-
-# [Bicep](#tab/bicep)
-
-```bicep
-inputs: [
-  'Measurements' // - $1
-]
-output: 'Measurements'
-expression: 'take($1, 10)' // taking at max 10 items
-```
-
-# [Kubernetes (preview)](#tab/kubernetes)
-
-```yaml
-- inputs:
-  - Measurements # - $1
-  output: Measurements
-  expression: take($1, 10) # taking at max 10 items
-```
-
----
-

 Arrays can also be created from multiple single values:

 # [Bicep](#tab/bicep)
@@ -302,17 +271,22 @@ The `conversion` uses the `if` function that has three parameters:

 ## Available functions

-Functions can be used in the conversion formula to perform various operations:
+Dataflows provide a set of built-in functions that can be used in conversion formulas. These functions perform common operations like arithmetic, comparison, and string manipulation. The available functions are:

-* `min` to select a single item from an array
-* `if` to select between values
-* String manipulation (for example, `uppercase()`)
-* Explicit conversion (for example, `ISO8601_datetime`)
-* Aggregation (for example, `avg()`)
+| Function | Description | Examples |
+|----------|-------------|----------|
+| `min` | Return the minimum value from an array. | `min(2, 3, 1)` returns `1`, `min($1)` returns the minimum value from the array `$1` |
+| `max` | Return the maximum value from an array. | `max(2, 3, 1)` returns `3`, `max($1)` returns the maximum value from the array `$1` |
+| `if` | Return one of two values based on a condition. | `if($1 > 10, 'High', 'Low')` returns `'High'` if `$1` is greater than `10`, otherwise `'Low'` |
+| `len` | Return the character length of a string or the number of elements in a tuple. | `len("Azure")` returns `5`, `len(1, 2, 3)` returns `3`, `len($1)` returns the number of elements in the array `$1` |
+| `floor` | Return the largest integer less than or equal to a number. | `floor(2.9)` returns `2` |
+| `round` | Return the nearest integer to a number, rounding half-way cases away from 0.0. | `round(2.5)` returns `3` |
+| `ceil` | Return the smallest integer greater than or equal to a number. | `ceil(2.1)` returns `3` |
+| `scale` | Scale a value from one range to another. | `scale($1, 0, 10, 0, 100)` scales the input value from the range 0 to 10 to the range 0 to 100 |

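For instance, a mapping might use `len` to report the number of elements in the `Measurements` array from the earlier examples, sketched here in the Kubernetes (preview) format (the `MeasurementCount` output name is illustrative):

```yaml
- inputs:
  - Measurements # - $1
  output: MeasurementCount
  expression: len($1) # number of elements in the array
```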
-## Available operations
+### Conversion functions

-Dataflows offer a wide range of out-of-the-box conversion functions that allow users to easily perform unit conversions without the need for complex calculations. These predefined functions cover common conversions such as temperature, pressure, length, weight, and volume. The following list shows the available conversion functions, along with their corresponding formulas and function names:
+Dataflows provide several built-in conversion functions for common unit conversions like temperature, pressure, length, weight, and volume. Here are some examples:

 | Conversion | Formula | Function name |
 | --- | --- | --- |
@@ -323,7 +297,7 @@ Dataflows offer a wide range of out-of-the-box conversion functions that allow u
 These functions are designed to simplify the conversion process. They allow users to input values in one unit and receive the corresponding value in another unit effortlessly.

-We also provide a scaling function to scale the range of value to the user-defined range. For the example `scale($1,0,10,0,100)`, the input value is scaled from the range 0 to 10 to the range 0 to 100.
-
-Moreover, users have the flexibility to define their own conversion functions by using simple mathematical formulas. Our system supports basic operators such as addition (`+`), subtraction (`-`), multiplication (`*`), and division (`/`). These operators follow standard rules of precedence. For example, multiplication and division are performed before addition and subtraction. Precedence can be adjusted by using parentheses to ensure the correct order of operations. This capability empowers users to customize their unit conversions to meet specific needs or preferences, enhancing the overall utility and versatility of the system.
-
-For more complex calculations, functions like `sqrt` (which finds the square root of a number) are also available.
+Additionally, you can define your own conversion functions using basic mathematical formulas. The system supports operators like addition (`+`), subtraction (`-`), multiplication (`*`), and division (`/`). These operators follow standard rules of precedence, which can be adjusted using parentheses to ensure the correct order of operations. This allows you to customize unit conversions to meet specific needs.
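As an illustration, a custom formula might average two wildcard-captured fields, using parentheses to force the addition before the division, sketched in the Kubernetes (preview) format (the `ColorProperties.*.Avg` output path is illustrative):

```yaml
- inputs:
  - '*.Min' # - $1
  - '*.Max' # - $2
  output: 'ColorProperties.*.Avg'
  expression: ($1 + $2) / 2 # parentheses force the addition first
```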
articles/iot-operations/connect-to-cloud/concept-dataflow-mapping.md (+87 −11)
@@ -5,7 +5,7 @@ author: PatAltimore
 ms.author: patricka
 ms.subservice: azure-data-flows
 ms.topic: concept-article
-ms.date: 10/30/2024
+ms.date: 11/11/2024
 ai-usage: ai-assisted

 #CustomerIntent: As an operator, I want to understand how to use the dataflow mapping language to transform data.
@@ -118,11 +118,22 @@ The example maps:

 Field references show how to specify paths in the input and output by using dot notation like `Employee.DateOfBirth` or accessing data from a contextual dataset via `$context(position)`.

-### MQTT user properties
+### MQTT and Kafka metadata properties

-When you use MQTT as a source or destination, you can access MQTT user properties in the mapping language. User properties can be mapped in the input or output.
+When you use MQTT or Kafka as a source or destination, you can access various metadata properties in the mapping language. These properties can be mapped in the input or output.

-In the following example, the MQTT `topic` property is mapped to the `origin_topic` field in the output.
+#### Metadata properties
+
+* **Topic**: Works for both MQTT and Kafka. It contains the string where the message was published. Example: `$metadata.topic`.
+* **User property**: In MQTT, this refers to the free-form key/value pairs an MQTT message can carry. For example, if the MQTT message was published with a user property with key "priority" and value "high", then the `$metadata.user_property.priority` reference holds the value "high". User property keys can be arbitrary strings and may require escaping: `$metadata.user_property."weird key"` uses the key "weird key" (with a space), as shown in the sketch after this list.
+* **System property**: This term is used for every property that is not a user property. Currently, only a single system property is supported: `$metadata.system_property.content_type`, which reads the content type property of the MQTT message (if set).
+* **Header**: This is the Kafka equivalent of the MQTT user property. Kafka can use any binary value for a key, but dataflows support only UTF-8 string keys. Example: `$metadata.header.priority`. This functionality is similar to user properties.
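A minimal sketch of that escaped-key case in the Kubernetes (preview) format (the "weird key" user property and the `weirdKey` output field are illustrative):

```yaml
inputs:
  - '$metadata.user_property."weird key"' # key containing a space, so it's quoted
output: weirdKey
```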
+
+#### Mapping metadata properties
+
+##### Input mapping
+
+In the following example, the MQTT `topic` property is mapped to the `origin_topic` field in the output:

 # [Bicep](#tab/bicep)

@@ -143,7 +154,30 @@ output: origin_topic

 ---

-You can also map MQTT properties to an output header. In the following example, the MQTT `topic` is mapped to the `origin_topic` field in the output's user property:
+If the user property `priority` is present in the MQTT message, the following example demonstrates how to map it to an output field:
+
+# [Bicep](#tab/bicep)
+
+```bicep
+inputs: [
+  '$metadata.user_property.priority'
+]
+output: 'priority'
+```
+
+# [Kubernetes (preview)](#tab/kubernetes)
+
+```yaml
+inputs:
+  - $metadata.user_property.priority
+output: priority
+```
+
+---
+
+##### Output mapping
+
+You can also map metadata properties to an output header or user property. In the following example, the MQTT `topic` is mapped to the `origin_topic` field in the output's user property:

+If the incoming payload contains a `priority` field, the following example demonstrates how to map it to an MQTT user property:
+
+# [Bicep](#tab/bicep)
+
+```bicep
+inputs: [
+  'priority'
+]
+output: '$metadata.user_property.priority'
+```
+
+# [Kubernetes (preview)](#tab/kubernetes)
+
+```yaml
+inputs:
+  - priority
+output: $metadata.user_property.priority
+```
+
+---
+
+The same example for Kafka:
+
+# [Bicep](#tab/bicep)
+
+```bicep
+inputs: [
+  'priority'
+]
+output: '$metadata.header.priority'
+```
+
+# [Kubernetes (preview)](#tab/kubernetes)
+
+```yaml
+inputs:
+  - priority
+output: $metadata.header.priority
+```
+
+---

 ## Contextualization dataset selectors

 These selectors allow mappings to integrate extra data from external databases, which are referred to as *contextualization datasets*.
@@ -371,6 +447,7 @@ output: '*'
 ```

 ---
+This configuration shows a basic mapping where every field in the input is directly mapped to the same field in the output without any changes. The asterisk (`*`) serves as a wildcard that matches any field in the input record.

 Here's how the asterisk (`*`) operates in this context:

@@ -379,8 +456,6 @@ Here's how the asterisk (`*`) operates in this context:
 * **Captured segment**: The portion of the path that the asterisk matches is referred to as the `captured segment`.
 * **Output mapping**: In the output configuration, the `captured segment` is placed where the asterisk appears. This means that the structure of the input is preserved in the output, with the `captured segment` filling the placeholder provided by the asterisk.

-This configuration demonstrates the most generic form of mapping, where every field in the input is directly mapped to a corresponding field in the output without modification.
-
 Another example illustrates how wildcards can be used to match subsections and move them together. This example effectively flattens nested structures within a JSON object.

 Original JSON:
@@ -454,9 +529,9 @@ Resulting JSON:

 When you place a wildcard, you must follow these rules:

-* **Single asterisk per dataDestination:** Only one asterisk (`*`) is allowed within a single path.
+* **Single asterisk per data reference:** Only one asterisk (`*`) is allowed within a single data reference.
 * **Full segment matching:** The asterisk must always match an entire segment of the path. It can't be used to match only a part of a segment, such as `path1.partial*.path3`.
-* **Positioning:** The asterisk can be positioned in various parts of `dataDestination`:
+* **Positioning:** The asterisk can be positioned in various parts of a data reference:
   * **At the beginning:** `*.path2.path3` - Here, the asterisk matches any segment that leads up to `path2.path3`.
   * **In the middle:** `path1.*.path3` - In this configuration, the asterisk matches any segment between `path1` and `path3`.
   * **At the end:** `path1.path2.*` - The asterisk at the end matches any segment that follows after `path1.path2`.
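For instance, a trailing asterisk can flatten everything under a parent path, sketched here in the Kubernetes (preview) format with the `ColorProperties` path from the surrounding examples:

```yaml
- inputs:
  - 'ColorProperties.*' # the captured segment is whatever follows ColorProperties
  output: '*'           # the captured segment becomes a top-level field
```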
@@ -643,7 +718,7 @@ When you use the previous example from multi-input wildcards, consider the follo
   '*.Min' // - $2
 ]
 output: 'ColorProperties.*.Diff'
-expression: 'abs($1 - $2)'
+expression: '$1 - $2'
 }
 ```

@@ -660,7 +735,7 @@ When you use the previous example from multi-input wildcards, consider the follo
   - '*.Max' # - $1
   - '*.Min' # - $2
   output: 'ColorProperties.*.Diff'
-  expression: abs($1 - $2)
+  expression: $1 - $2
 ```
 ---

@@ -752,6 +827,7 @@ Consider a special case for the same fields to help decide the right action:
articles/iot-operations/connect-to-cloud/howto-configure-dataflow-profile.md (+6 −8)
@@ -6,7 +6,7 @@ ms.author: patricka
 ms.service: azure-iot-operations
 ms.subservice: azure-data-flows
 ms.topic: how-to
-ms.date: 10/30/2024
+ms.date: 11/11/2024

 #CustomerIntent: As an operator, I want to understand how I can configure a dataflow profile to control dataflow behavior.
 ---
@@ -23,6 +23,8 @@ The most important setting is the instance count, which determines the number of

 By default, a dataflow profile named "default" is created when Azure IoT Operations is deployed. This dataflow profile has a single instance count. You can use this dataflow profile to get started with Azure IoT Operations.

+Currently, when using the [operations experience portal](https://iotoperations.azure.com/), the default dataflow profile is used for all dataflows.
+
 # [Bicep](#tab/bicep)

 ```bicep
@@ -105,17 +107,13 @@ You can scale the dataflow profile to adjust the number of instances that run th

 Scaling can also improve the resiliency of the dataflows by providing redundancy in case of failures.

-To manually scale the dataflow profile, specify the maximum number of instances you want to run. For example, to set the instance count to 3:
+To manually scale the dataflow profile, specify the number of instances you want to run. For example, to set the instance count to 3:
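A minimal sketch of the relevant Bicep fragment, assuming the profile's `properties` block carries the setting (the surrounding resource declaration is omitted):

```bicep
properties: {
  // run three instances of the dataflow profile
  instanceCount: 3
}
```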
articles/iot-operations/connect-to-cloud/howto-configure-fabric-endpoint.md (+3 −3)
@@ -6,7 +6,7 @@ ms.author: patricka
 ms.service: azure-iot-operations
 ms.subservice: azure-data-flows
 ms.topic: how-to
-ms.date: 11/04/2024
+ms.date: 11/11/2024
 ai-usage: ai-assisted

 #CustomerIntent: As an operator, I want to understand how to configure dataflow endpoints for Microsoft Fabric OneLake in Azure IoT Operations so that I can send data to Microsoft Fabric OneLake.