Commit bf64c70

Merge pull request #286663 from kgremban/m2-schema
schema registry init
2 parents df54adf + f92e371 commit bf64c70

4 files changed (+118 −0 lines changed)
articles/iot-operations/connect-to-cloud/concept-schema-registry.md

Lines changed: 108 additions & 0 deletions
@@ -0,0 +1,108 @@
---
title: Understand message schemas
description: Learn how schema registry handles message schemas to work with Azure IoT Operations components including dataflows.
author: kgremban
ms.author: kgremban
ms.topic: conceptual
ms.date: 09/23/2024

#CustomerIntent: As an operator, I want to understand how I can use message schemas to filter and transform messages.
---

# Understand message schemas

Schema registry, a feature provided by Azure Device Registry Preview, is a synchronized repository in the cloud and at the edge. The schema registry stores the definitions of messages coming from edge assets, and then exposes an API to access those schemas at the edge.

The connector for OPC UA can create message schemas and add them to the schema registry, or customers can upload schemas in the operations experience web UI or by using ARM/Bicep templates.

Edge services use message schemas to filter and transform messages as they're routed across your industrial edge scenario.

*Schemas* are documents that describe the format of a message and its contents to enable processing and contextualization.
## Message schema definitions

Schema registry expects the following required fields in a message schema:

| Required field | Definition |
| -------------- | ---------- |
| `$schema` | Either `http://json-schema.org/draft-07/schema#` or `Delta/1.0`. In dataflows, JSON schemas are used for source endpoints and Delta schemas are used for destination endpoints. |
| `type` | `object` |
| `properties` | The message definition. |
### Sample schemas

The following sample schemas provide examples for defining message schemas in each format.

JSON:

```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "name": "foobarbaz",
  "description": "A representation of an event",
  "type": "object",
  "required": [ "dtstart", "summary" ],
  "properties": {
    "summary": {
      "type": "string"
    },
    "location": {
      "type": "string"
    },
    "url": {
      "type": "string"
    },
    "duration": {
      "type": "string",
      "description": "Event duration"
    }
  }
}
```
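To illustrate how a schema like this drives message processing, here's a minimal Python sketch that checks a message against the sample schema's `required` and `properties` keywords. It's illustrative only: it isn't part of Azure IoT Operations, and a full JSON Schema validator library would normally do this work.

```python
import json

# The sample event schema from above, reduced to the parts this sketch checks.
schema = json.loads("""
{
  "type": "object",
  "required": ["dtstart", "summary"],
  "properties": {
    "summary": {"type": "string"},
    "location": {"type": "string"},
    "url": {"type": "string"},
    "duration": {"type": "string"}
  }
}
""")

# Simplified mapping from JSON Schema type names to Python types.
TYPE_MAP = {"string": str, "number": (int, float), "boolean": bool, "object": dict}

def validate(message, schema):
    """Return a list of validation errors; an empty list means the message conforms."""
    errors = []
    for field in schema.get("required", []):
        if field not in message:
            errors.append(f"missing required field: {field}")
    for name, spec in schema.get("properties", {}).items():
        if name in message and not isinstance(message[name], TYPE_MAP[spec["type"]]):
            errors.append(f"field '{name}' is not of type {spec['type']}")
    return errors

print(validate({"dtstart": "2024-09-23T09:00", "summary": "Maintenance window"}, schema))
# prints []
```

A conforming message yields no errors; a message with a missing required field or a mistyped property yields one error per violation.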
Delta:

```delta
{
  "$schema": "Delta/1.0",
  "type": "object",
  "properties": {
    "type": "struct",
    "fields": [
      { "name": "asset_id", "type": "string", "nullable": false, "metadata": {} },
      { "name": "asset_name", "type": "string", "nullable": false, "metadata": {} },
      { "name": "location", "type": "string", "nullable": false, "metadata": {} },
      { "name": "manufacturer", "type": "string", "nullable": false, "metadata": {} },
      { "name": "production_date", "type": "string", "nullable": false, "metadata": {} },
      { "name": "serial_number", "type": "string", "nullable": false, "metadata": {} },
      { "name": "temperature", "type": "double", "nullable": false, "metadata": {} }
    ]
  }
}
```
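Although the Delta format differs from JSON Schema, the document itself is still JSON, so ordinary tooling can read it directly. A small Python sketch (illustrative only, using an abbreviated copy of the sample above) showing that each entry in `fields` describes one column of the destination table:

```python
import json

# Abbreviated copy of the Delta sample schema, loaded as plain JSON.
delta_schema = json.loads("""
{
  "$schema": "Delta/1.0",
  "type": "object",
  "properties": {
    "type": "struct",
    "fields": [
      { "name": "asset_id", "type": "string", "nullable": false, "metadata": {} },
      { "name": "temperature", "type": "double", "nullable": false, "metadata": {} }
    ]
  }
}
""")

# Each entry in "fields" maps a column name to its column type.
columns = {f["name"]: f["type"] for f in delta_schema["properties"]["fields"]}
print(columns)  # prints {'asset_id': 'string', 'temperature': 'double'}
```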
## How dataflows use message schemas

Message schemas are used in all three phases of a dataflow: defining the source input, applying data transformations, and creating the destination output.

### Input schema

Each dataflow source can optionally specify a message schema. If a schema is defined for a dataflow source, any incoming messages that don't match the schema are dropped.
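That drop behavior amounts to filtering the incoming stream through the schema. A simplified Python sketch of the idea (illustrative only; the real filtering happens inside the dataflow runtime, and the required fields shown are hypothetical):

```python
# Hypothetical required fields taken from a source message schema.
REQUIRED_FIELDS = {"asset_id", "temperature"}

def matching_messages(messages):
    """Yield messages that satisfy the schema; nonconforming messages are dropped."""
    for msg in messages:
        if isinstance(msg, dict) and REQUIRED_FIELDS <= msg.keys():
            yield msg  # conforms: forward into the dataflow

incoming = [
    {"asset_id": "mixer-01", "temperature": 72.5},
    {"asset_id": "mixer-02"},  # missing temperature: dropped
    {"asset_id": "oven-01", "temperature": 210.0, "location": "line-2"},
]
kept = list(matching_messages(incoming))
print(len(kept))  # prints 2
```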
Asset sources have a predefined message schema that was created by the connector for OPC UA.

Schemas can be uploaded for MQTT sources. Currently, Azure IoT Operations supports JSON for source schemas, also known as input schemas. In the operations experience, you can select an existing schema or upload one while defining an MQTT source:

:::image type="content" source="./media/concept-schema-registry/upload-schema.png" alt-text="Screenshot that shows uploading a message schema in the operations experience portal.":::
### Transformation

The operations experience uses the input schema as a starting point for your data, making it easier to select transformations based on the known input message format.
### Output schema

Output schemas are associated with dataflow destinations and are only used for dataflows that select local storage, Fabric, Azure Storage (ADLS Gen2), or Azure Data Explorer as the destination endpoint. Currently, the Azure IoT Operations experience only supports Parquet output for output schemas.

> [!NOTE]
> The Delta schema format is used for both Parquet and Delta output.

For these dataflows, the operations experience applies any transformations to the input schema and then creates a new schema in Delta format. When the dataflow custom resource (CR) is created, it includes a `schemaRef` value that points to the generated schema stored in the schema registry.
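The relationship between the two formats can be pictured as a conversion from a JSON input schema's `properties` to Delta `fields`. The following Python sketch only illustrates the shape of the output, not the service's actual algorithm, and the type mapping is an assumption:

```python
# Assumed mapping from JSON Schema types to Delta types (illustrative only).
JSON_TO_DELTA_TYPES = {"string": "string", "number": "double", "integer": "long", "boolean": "boolean"}

def to_delta_schema(json_schema):
    """Build a Delta-format schema document from a JSON source schema (simplified)."""
    fields = [
        {
            "name": name,
            "type": JSON_TO_DELTA_TYPES.get(spec.get("type"), "string"),
            "nullable": name not in json_schema.get("required", []),
            "metadata": {},
        }
        for name, spec in json_schema.get("properties", {}).items()
    ]
    return {
        "$schema": "Delta/1.0",
        "type": "object",
        "properties": {"type": "struct", "fields": fields},
    }

source = {
    "type": "object",
    "required": ["summary"],
    "properties": {"summary": {"type": "string"}, "duration": {"type": "string"}},
}
delta = to_delta_schema(source)
print(delta["properties"]["fields"][0])
# prints {'name': 'summary', 'type': 'string', 'nullable': False, 'metadata': {}}
```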
articles/iot-operations/connect-to-cloud/media/concept-schema-registry/upload-schema.png (98.9 KB, binary file added)

articles/iot-operations/connect-to-cloud/overview-dataflow.md

Lines changed: 8 additions & 0 deletions
@@ -59,6 +59,14 @@ The configuration is specified by using Kubernetes CRDs. Based on this configura

By using dataflows, you can efficiently manage your data paths. You can ensure that data is accurately sent, transformed, and enriched to meet your operational needs.

## Schema registry

Schema registry, a feature provided by Azure Device Registry Preview, is a synchronized repository in the cloud and at the edge. The schema registry stores the definitions of messages coming from edge assets, and then exposes an API to access those schemas at the edge. Southbound connectors like the connector for OPC UA can create message schemas and add them to the schema registry, or customers can upload schemas in the operations experience web UI.

Dataflows use message schemas at both the source and destination points. For sources, message schemas can work as filters to identify the specific messages that you want to capture for a dataflow. For destinations, message schemas help to transform the message into the format expected by the destination endpoint.

For more information, see [Understand message schemas](./concept-schema-registry.md).

## Related content

- [Quickstart: Send asset telemetry to the cloud by using a dataflow](../get-started-end-to-end-sample/quickstart-upload-telemetry-to-cloud.md)

articles/iot-operations/toc.yml

Lines changed: 2 additions & 0 deletions
@@ -101,6 +101,8 @@ items:
      href: connect-to-cloud/concept-dataflow-conversions.md
    - name: Enrich data
      href: connect-to-cloud/concept-dataflow-enrich.md
    - name: Use message schemas
      href: connect-to-cloud/concept-schema-registry.md
    - name: Manage dataflow profile
      href: connect-to-cloud/howto-configure-dataflow-profile.md
    - name: Manage layered network
