| 1 | +--- |
| 2 | +title: Understand message schemas |
| 3 | +description: Learn how schema registry handles message schemas to work with Azure IoT Operations components including dataflows. |
| 4 | +author: kgremban |
| 5 | +ms.author: kgremban |
| 6 | +ms.topic: conceptual |
| 7 | +ms.date: 09/23/2024 |
| 8 | + |
| 9 | +#CustomerIntent: As an operator, I want to understand how I can use message schemas to filter and transform messages. |
| 10 | +--- |
| 11 | + |
| 12 | +# Understand message schemas |
| 13 | + |
| 14 | +Schema registry, a feature provided by Azure Device Registry Preview, is a synchronized repository in the cloud and at the edge. The schema registry stores the definitions of messages coming from edge assets, and then exposes an API to access those schemas at the edge. |
| 15 | + |
| 16 | +The connector for OPC UA can create message schemas and add them to the schema registry, or customers can upload schemas through the operations experience web UI or by using ARM/Bicep templates. |
| 17 | + |
| 18 | +Edge services use message schemas to filter and transform messages as they're routed across your industrial edge scenario. |
| 19 | + |
| 20 | +*Schemas* are documents that describe the format of a message and its contents to enable processing and contextualization. |
| 21 | + |
| 22 | +## Message schema definitions |
| 23 | + |
| 24 | +Schema registry expects the following required fields in a message schema: |
| 25 | + |
| 26 | +| Required field | Definition | |
| 27 | +| -------------- | ---------- | |
| 28 | +| `$schema` | Either `http://json-schema.org/draft-07/schema#` or `Delta/1.0`. In dataflows, JSON schemas are used for source endpoints and Delta schemas are used for destination endpoints. | |
| 29 | +| `type` | `object` | |
| 30 | +| `properties` | The message definition. | |
| 31 | + |
| 32 | +### Sample schemas |
| 33 | + |
| 34 | +The following sample schemas provide examples for defining message schemas in each format. |
| 35 | + |
| 36 | +JSON: |
| 37 | + |
| 38 | +```json |
| 39 | +{ |
| 40 | + "$schema": "http://json-schema.org/draft-07/schema#", |
| 41 | + "name": "foobarbaz", |
| 42 | + "description": "A representation of an event", |
| 43 | + "type": "object", |
| 44 | + "required": [ "dtstart", "summary" ], |
| 45 | + "properties": { |
| 46 | + "summary": { |
| 47 | + "type": "string" |
| 48 | + }, |
| 49 | + "location": { |
| 50 | + "type": "string" |
| 51 | + }, |
| 52 | + "url": { |
| 53 | + "type": "string" |
| 54 | + }, |
| 55 | + "duration": { |
| 56 | + "type": "string", |
| 57 | + "description": "Event duration" |
| 58 | + } |
| 59 | + } |
| 60 | +} |
| 61 | +``` |
| 62 | + |
| 63 | +Delta: |
| 64 | + |
| 65 | +```delta |
| 66 | +{ |
| 67 | + "$schema": "Delta/1.0", |
| 68 | + "type": "object", |
| 69 | + "properties": { |
| 70 | + "type": "struct", |
| 71 | + "fields": [ |
| 72 | + { "name": "asset_id", "type": "string", "nullable": false, "metadata": {} }, |
| 73 | + { "name": "asset_name", "type": "string", "nullable": false, "metadata": {} }, |
| 74 | + { "name": "location", "type": "string", "nullable": false, "metadata": {} }, |
| 75 | + { "name": "manufacturer", "type": "string", "nullable": false, "metadata": {} }, |
| 76 | + { "name": "production_date", "type": "string", "nullable": false, "metadata": {} }, |
| 77 | + { "name": "serial_number", "type": "string", "nullable": false, "metadata": {} }, |
| 78 | + { "name": "temperature", "type": "double", "nullable": false, "metadata": {} } |
| 79 | + ] |
| 80 | + } |
| 81 | +} |
| 82 | +``` |
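
As a quick illustration of the required fields listed earlier, the following Python sketch checks a schema document for them before upload. It's a minimal local check under stated assumptions, not the validation that schema registry itself performs; the function name `check_required_fields` is hypothetical.

```python
# Allowed values for "$schema", taken from the required-fields table.
ALLOWED_SCHEMA_IDS = {
    "http://json-schema.org/draft-07/schema#",
    "Delta/1.0",
}


def check_required_fields(schema: dict) -> list[str]:
    """Return a list of problems; an empty list means all required fields are present."""
    problems = []
    if schema.get("$schema") not in ALLOWED_SCHEMA_IDS:
        problems.append("$schema must be the JSON draft-07 URI or Delta/1.0")
    if schema.get("type") != "object":
        problems.append("type must be 'object'")
    # Assumption: "properties" is a JSON object in both formats, as in the samples.
    if not isinstance(schema.get("properties"), dict):
        problems.append("properties must be present and be an object")
    return problems


delta_sample = {
    "$schema": "Delta/1.0",
    "type": "object",
    "properties": {
        "type": "struct",
        "fields": [
            {"name": "asset_id", "type": "string", "nullable": False, "metadata": {}},
        ],
    },
}

print(check_required_fields(delta_sample))
```

A check like this only confirms the top-level shape. It doesn't verify that the contents of `properties` are valid for the chosen format.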
| 83 | + |
| 84 | +## How dataflows use message schemas |
| 85 | + |
| 86 | +Message schemas are used in all three phases of a dataflow: defining the source input, applying data transformations, and creating the destination output. |
| 87 | + |
| 88 | +### Input schema |
| 89 | + |
| 90 | +Each dataflow source can optionally specify a message schema. If a schema is defined for a dataflow source, any incoming messages that don't match the schema are dropped. |
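
To make the drop behavior concrete, the following Python sketch filters a batch of messages against a JSON schema. It's a simplified illustration, not the matching logic that dataflows actually use: it only checks `required` keys and the primitive `type` of each listed property, and the helper names `matches` and `filter_messages` are hypothetical.

```python
# Map JSON Schema primitive type names to Python types.
JSON_TYPES = {
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
    "object": dict,
    "array": list,
}


def matches(message, schema: dict) -> bool:
    """Return True if the message has every required key and the declared property types."""
    if not isinstance(message, dict):
        return False
    for key in schema.get("required", []):
        if key not in message:
            return False
    for key, spec in schema.get("properties", {}).items():
        if key not in message:
            continue  # Optional property that isn't present.
        type_name = spec.get("type")
        expected = JSON_TYPES.get(type_name)
        if expected is None:
            continue  # Type not covered by this sketch.
        value = message[key]
        # bool is a subclass of int in Python, so reject it for non-boolean types.
        if isinstance(value, bool) and type_name != "boolean":
            return False
        if not isinstance(value, expected):
            return False
    return True


def filter_messages(messages: list, schema: dict) -> list:
    """Keep only the messages that match the schema; the rest are dropped."""
    return [m for m in messages if matches(m, schema)]


schema = {
    "type": "object",
    "required": ["summary"],
    "properties": {"summary": {"type": "string"}, "location": {"type": "string"}},
}
incoming = [
    {"summary": "Line 1 started"},
    {"summary": 42},
    {"location": "Plant A"},
]
kept = filter_messages(incoming, schema)
print(kept)
```

In this example, the second message has the wrong type for `summary` and the third is missing it, so only the first message is kept.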
| 91 | + |
| 92 | +Asset sources have a predefined message schema that was created by the connector for OPC UA. |
| 93 | + |
| 94 | +Schemas can be uploaded for MQTT sources. Currently, Azure IoT Operations supports JSON for source schemas, also known as input schemas. In the operations experience, you can select an existing schema or upload one while defining an MQTT source: |
| 95 | + |
| 96 | +:::image type="content" source="./media/concept-schema-registry/upload-schema.png" alt-text="Screenshot that shows uploading a message schema in the operations experience portal."::: |
| 97 | + |
| 98 | +### Transformation |
| 99 | + |
| 100 | +The operations experience uses the input schema as a starting point for your data, making it easier to select transformations based on the known input message format. |
| 101 | + |
| 102 | +### Output schema |
| 103 | + |
| 104 | +Output schemas are associated with dataflow destinations and are only used for dataflows that select local storage, Fabric, Azure Storage (ADLS Gen2), or Azure Data Explorer as the destination endpoint. Currently, the Azure IoT Operations experience only supports Parquet output for output schemas. |
| 105 | + |
| 106 | +Note: The Delta schema format is used for both Parquet and Delta output. |
| 107 | + |
| 108 | +For these dataflows, the operations experience applies any transformations to the input schema, then creates a new schema in Delta format. When the dataflow custom resource (CR) is created, it includes a `schemaRef` value that points to the generated schema stored in the schema registry. |
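
The following Python sketch shows roughly what deriving a Delta-format schema from a JSON input schema could look like. It's an illustration only: the type mapping in `TYPE_MAP`, the rule that non-required properties become nullable, and the function name `to_delta_schema` are all assumptions, not the documented behavior of the operations experience.

```python
# Hypothetical JSON-to-Delta type mapping; the actual mapping may differ.
TYPE_MAP = {
    "string": "string",
    "number": "double",
    "integer": "long",
    "boolean": "boolean",
}


def to_delta_schema(json_schema: dict) -> dict:
    """Build a Delta/1.0 schema document from the properties of a JSON schema."""
    required = set(json_schema.get("required", []))
    fields = []
    for name, spec in json_schema.get("properties", {}).items():
        fields.append({
            "name": name,
            # Assumption: unmapped types fall back to "string".
            "type": TYPE_MAP.get(spec.get("type"), "string"),
            # Assumption: a property is nullable unless the input schema requires it.
            "nullable": name not in required,
            "metadata": {},
        })
    return {
        "$schema": "Delta/1.0",
        "type": "object",
        "properties": {"type": "struct", "fields": fields},
    }


input_schema = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "object",
    "required": ["temperature"],
    "properties": {
        "asset_id": {"type": "string"},
        "temperature": {"type": "number"},
    },
}
delta_schema = to_delta_schema(input_schema)
print(delta_schema["properties"]["fields"])
```

The output follows the same `struct` and `fields` layout as the Delta sample schema shown earlier in this article.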