modules/components/pages/processors/schema_registry_decode.adoc
Lines changed: 56 additions & 15 deletions
@@ -22,9 +22,12 @@ Common::
22
22
--
23
23
24
24
```yml
25
-
# Common config fields, showing default values
25
+
# Common configuration fields, showing default values
26
26
label: ""
27
27
schema_registry_decode:
28
+
avro:
29
+
raw_unions: false # No default (optional)
30
+
preserve_logical_types: false
28
31
url: "" # No default (required)
29
32
```
30
33
@@ -34,10 +37,12 @@ Advanced::
34
37
--
35
38
36
39
```yml
37
-
# All config fields, showing default values
40
+
# All configuration fields, showing default values
38
41
label: ""
39
42
schema_registry_decode:
40
-
avro_raw_json: false
43
+
avro:
44
+
raw_unions: false
45
+
preserve_logical_types: false
41
46
url: "" # No default (required)
42
47
oauth:
43
48
enabled: false
@@ -66,36 +71,72 @@ schema_registry_decode:
66
71
--
67
72
======
68
73
69
-
Decodes messages automatically from a schema stored within a https://docs.confluent.io/platform/current/schema-registry/index.html[Confluent Schema Registry service^] by extracting a schema ID from the message and obtaining the associated schema from the registry. If a message fails to match against the schema then it will remain unchanged and the error can be caught using xref:configuration:error_handling.adoc[errorhandling methods].
74
+
Decodes messages automatically from a schema stored within a https://docs.confluent.io/platform/current/schema-registry/index.html[Confluent Schema Registry service^] by extracting a schema ID from the message and obtaining the associated schema from the registry. If a message fails to match against the schema then it will remain unchanged and the error can be caught using xref:configuration:error_handling.adoc[error-handling methods].
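As a rough sketch of catching such a failure, the following assumes a registry at `http://localhost:8081` (a placeholder address, not part of this change) and uses the `catch` and `log` processors to report messages that could not be decoded:

```yml
pipeline:
  processors:
    - schema_registry_decode:
        url: http://localhost:8081 # Placeholder registry address (assumption)
    # Runs only for messages that failed a previous processor
    - catch:
        - log:
            level: ERROR
            message: "Schema decode failed: ${! error() }"
```

Messages that decode successfully skip the `catch` block and continue through the pipeline unchanged.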
70
75
71
76
Avro, Protobuf, and JSON schemas are supported, and all are capable of expanding from schema references as of v4.22.0.
72
77
73
78
== Avro JSON format
74
79
75
-
This processor creates documents formatted as https://avro.apache.org/docs/current/specification/_print/#json-encoding[Avro JSON^] when decoding with Avro schemas. In this format the value of a union is encoded in JSON as follows:
80
+
By default, this processor creates documents formatted as https://avro.apache.org/docs/current/specification/[Avro JSON^] when decoding with Avro schemas. In this format, the value of a union is encoded in JSON as follows:
76
81
77
-
- if its type is `null`, then it is encoded as a JSON `null`;
78
-
- otherwise it is encoded as a JSON object with one name/value pair whose name is the type's name and whose value is the recursivelyencoded value. For Avro's named types (record, fixed or enum) the user-specified name is used, for other types the type name is used.
82
+
- If the union's type is `null`, it is encoded as a JSON `null`.
83
+
- Otherwise, the union is encoded as a JSON object with one name/value pair. The name is the type's name, and the value is the recursively-encoded value. The user-specified name is used for Avro's named types (record, fixed, or enum). For other types, the type name is used.
79
84
80
-
For example, the union schema `["null","string","Foo"]`, where `Foo` is a record name, would encode:
85
+
For example, the union schema `["null","string","Transaction"]`, where `Transaction` is a record name, would encode:
81
86
82
-
- `null` as `null`;
83
-
- the string `"a"` as `\{"string": "a"}`; and
84
-
- a `Foo` instance as `\{"Foo": {...}}`, where `{...}` indicates the JSON encoding of a `Foo` instance.
87
+
- `null` as a JSON `null`
88
+
- The string `"a"` as `{"string": "a"}`
89
+
- A `Transaction` instance as `{"Transaction": {...}}`, where `{...}` indicates the JSON encoding of a `Transaction` instance
85
90
86
-
However, it is possible to instead create documents in https://pkg.go.dev/github.com/linkedin/goavro/v2#NewCodecForStandardJSONFull[standard/raw JSON format^] by setting the field <<avro_raw_json, `avro_raw_json`>> to `true`.
91
+
Alternatively, you can create documents in https://pkg.go.dev/github.com/linkedin/goavro/v2#NewCodecForStandardJSONFull[standard/raw JSON format^] by setting the field <<avro-raw_unions,`avro.raw_unions`>> to `true`.
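A minimal configuration sketch for that alternative might look like the following. The registry URL is a placeholder and an assumption, not a value taken from this change:

```yml
schema_registry_decode:
  url: http://localhost:8081 # Placeholder registry address (assumption)
  avro:
    raw_unions: true # Emit union values as plain JSON, without the type-name wrapper
```

With this setting, the string `"a"` from the union example would be emitted as `"a"` rather than `{"string": "a"}`.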
87
92
88
93
== Protobuf format
89
94
90
-
This processor decodes protobuf messages to JSON documents, you can read more about JSON mapping of protobuf messages here: https://developers.google.com/protocol-buffers/docs/proto3#json
95
+
This processor decodes Protobuf messages to JSON documents. For more information about the JSON mapping of Protobuf messages, see the https://developers.google.com/protocol-buffers/docs/proto3#json[Protocol Buffers documentation^].
91
96
97
+
== Metadata
98
+
99
+
This processor adds the following metadata to processed messages:
100
+
101
+
- `schema_id`: The ID of the schema in the schema registry associated with the message.
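One way to use this metadata, sketched here on the assumption that a Bloblang `mapping` processor follows the decode step, is to copy the ID into the document body. The target field name `schema_id` and the registry URL are illustrative choices:

```yml
pipeline:
  processors:
    - schema_registry_decode:
        url: http://localhost:8081 # Placeholder registry address (assumption)
    - mapping: |
        root = this
        # @schema_id reads the metadata value added by the decode processor
        root.schema_id = @schema_id
```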
92
102
93
103
== Fields
94
104
95
-
=== `avro_raw_json`
105
+
=== `avro.raw_unions`
106
+
107
+
Whether Avro messages should be decoded into normal JSON (JSON that meets the expectations of regular internet JSON) rather than https://avro.apache.org/docs/current/specification/[Avro JSON^].
108
+
109
+
If set to `false`, Avro messages are decoded as https://pkg.go.dev/github.com/linkedin/goavro/v2#NewCodec[Avro JSON^].
110
+
111
+
For example, the union schema `["null","string","Transaction"]`, where `Transaction` is a record name, would be decoded as:
112
+
113
+
- A `null` as a JSON `null`
114
+
- The string `"a"` as `{"string": "a"}`
115
+
- A `Transaction` instance as `{"Transaction": {...}}`, where `{...}` indicates the JSON encoding of a `Transaction` instance.
116
+
117
+
If set to `true`, Avro messages are decoded as https://pkg.go.dev/github.com/linkedin/goavro/v2#NewCodecForStandardJSONFull[standard JSON^].
118
+
119
+
For example, the same union schema `["null","string","Transaction"]` is decoded as:
120
+
121
+
- A `null` as a JSON `null`
122
+
- The string `"a"` as `"a"`
123
+
- A `Transaction` instance as `{...}`, where `{...}` indicates the JSON encoding of a `Transaction` instance.
124
+
125
+
For more details on the difference between standard JSON and Avro JSON, see the https://github.com/linkedin/goavro/blob/5ec5a5ee7ec82e16e6e2b438d610e1cab2588393/union.go#L224-L249[comment in Goavro^] and the https://github.com/linkedin/goavro[underlying library used for Avro serialization^].
126
+
127
+
128
+
*Type*: `bool`
129
+
130
+
*Default*: `false`
131
+
132
+
=== `avro.preserve_logical_types`
133
+
134
+
Choose whether to:
96
135
97
-
Whether Avro messages should be decoded into normal JSON ("json that meets the expectations of regular internet json") rather than https://avro.apache.org/docs/current/specification/_print/#json-encoding[Avro JSON^]. If `true` the schema returned from the subject should be decoded as https://pkg.go.dev/github.com/linkedin/goavro/v2#NewCodecForStandardJSONFull[standard json^] instead of as https://pkg.go.dev/github.com/linkedin/goavro/v2#NewCodec[avro json^]. There is a https://github.com/linkedin/goavro/blob/5ec5a5ee7ec82e16e6e2b438d610e1cab2588393/union.go#L224-L249[comment in goavro^], the https://github.com/linkedin/goavro[underlining library used for avro serialization^], that explains in more detail the difference between the standard json and avro json.
136
+
- Transform logical types into their primitive type (default). For example, decimals become raw bytes and timestamps become plain integers.
modules/components/pages/processors/schema_registry_encode.adoc
Lines changed: 32 additions & 15 deletions
@@ -25,7 +25,7 @@ Common::
25
25
--
26
26
27
27
```yml
28
-
# Common config fields, showing default values
28
+
# Common configuration fields, showing default values
29
29
label: ""
30
30
schema_registry_encode:
31
31
url: "" # No default (required)
@@ -39,7 +39,7 @@ Advanced::
39
39
--
40
40
41
41
```yml
42
-
# All config fields, showing default values
42
+
# All configuration fields, showing default values
43
43
label: ""
44
44
schema_registry_encode:
45
45
url: "" # No default (required)
@@ -75,36 +75,36 @@ schema_registry_encode:
75
75
76
76
Encodes messages automatically from schemas obtained from a https://docs.confluent.io/platform/current/schema-registry/index.html[Confluent Schema Registry service^] by polling the service for the latest schema version for target subjects.
77
77
78
-
If a message fails to encode under the schema then it will remain unchanged and the error can be caught using xref:configuration:error_handling.adoc[errorhandling methods].
78
+
If a message fails to encode under the schema then it will remain unchanged and the error can be caught using xref:configuration:error_handling.adoc[error-handling methods].
79
79
80
-
Avro, Protobuf and Json schemas are supported, all are capable of expanding from schema references as of v4.22.0.
80
+
Avro, Protobuf, and JSON schemas are supported, and all are capable of expanding from schema references as of v4.22.0.
81
81
82
82
== Avro JSON format
83
83
84
-
By default this processor expects documents formatted as https://avro.apache.org/docs/current/specification/_print/#json-encoding[Avro JSON^] when encoding with Avro schemas. In this format the value of a union is encoded in JSON as follows:
84
+
By default, this processor expects documents formatted as https://avro.apache.org/docs/current/specification/[Avro JSON^] when encoding with Avro schemas. In this format, the value of a union is encoded in JSON as follows:
85
85
86
-
- if its type is `null`, then it is encoded as a JSON `null`;
87
-
- otherwise it is encoded as a JSON object with one name/value pair whose name is the type's name and whose value is the recursivelyencoded value. For Avro's named types (record, fixed or enum) the user-specified name is used, for other types the type name is used.
86
+
- If the union's type is `null`, it is encoded as a JSON `null`.
87
+
- Otherwise, the union is encoded as a JSON object with one name/value pair. The name is the type's name, and the value is the recursively-encoded value. The user-specified name is used for Avro's named types (record, fixed, or enum). For other types, the type name is used.
88
88
89
-
For example, the union schema `["null","string","Foo"]`, where `Foo` is a record name, would encode:
89
+
For example, the union schema `["null","string","Transaction"]`, where `Transaction` is a record name, would encode:
90
90
91
-
- `null` as `null`;
92
-
- the string `"a"` as `\{"string": "a"}`; and
93
-
- a `Foo` instance as `\{"Foo": {...}}`, where `{...}` indicates the JSON encoding of a `Foo` instance.
91
+
- A `null` as a JSON `null`
92
+
- The string `"a"` as `{"string": "a"}`
93
+
- A `Transaction` instance as `{"Transaction": {...}}`, where `{...}` indicates the JSON encoding of a `Transaction` instance
94
94
95
-
However, it is possible to instead consume documents in https://pkg.go.dev/github.com/linkedin/goavro/v2#NewCodecForStandardJSONFull[standard/raw JSON format^] by setting the field <<avro_raw_json,`avro_raw_json`>> to `true`.
95
+
Alternatively, you can consume documents in https://pkg.go.dev/github.com/linkedin/goavro/v2#NewCodecForStandardJSONFull[standard/raw JSON format^] by setting the field <<avro_raw_json,`avro_raw_json`>> to `true`.
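A minimal configuration sketch for that alternative is shown below. The registry URL and subject name are placeholders, and the `subject` field is an assumption here because it does not appear in the lines of this diff:

```yml
schema_registry_encode:
  url: http://localhost:8081 # Placeholder registry address (assumption)
  subject: example_subject # Hypothetical subject name; field not shown in this diff
  avro_raw_json: true # Accept plain JSON union values instead of Avro JSON wrappers
```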
96
96
97
97
=== Known issues
98
98
99
99
Important! There is an outstanding issue in the https://github.com/linkedin/goavro[Avro serialization library^] that Redpanda Connect uses, which means it https://github.com/linkedin/goavro/issues/252[doesn't encode logical types correctly^]. It's still possible to encode logical types in line with the spec if `avro_raw_json` is set to `true`, although non-logical types will then not be in line with the spec.
100
100
101
101
== Protobuf format
102
102
103
-
This processor encodes protobuf messages either from any format parsed within Redpanda Connect (encoded as JSON by default), or from raw JSON documents, you can read more about JSON mapping of protobuf messages here: https://developers.google.com/protocol-buffers/docs/proto3#json
103
+
This processor encodes Protobuf messages either from any format parsed within Redpanda Connect (encoded as JSON by default), or from raw JSON documents. For more information about the JSON mapping of Protobuf messages, see the https://developers.google.com/protocol-buffers/docs/proto3#json[Protocol Buffers documentation^].
104
104
105
105
=== Multiple message support
106
106
107
-
When a target subject presents a protobuf schema that contains multiple messages it becomes ambiguous which message definition a given input data should be encoded against. In such scenarios Redpanda Connect will attempt to encode the data against each of them and select the first to successfully match against the data, this process currently *ignores all nested message definitions*. In order to speed up this exhaustive search the last known successful message will be attempted first for each subsequent input.
107
+
When a target subject presents a Protobuf schema that contains multiple messages, it becomes ambiguous which message definition a given input should be encoded against. In such scenarios, Redpanda Connect attempts to encode the data against each definition and selects the first that successfully matches. This process currently *ignores all nested message definitions*. To speed up this exhaustive search, the last known successful message is attempted first for each subsequent input.
108
108
109
109
We will be considering alternative approaches in the future, so please https://redpanda.com/slack[get in touch^] with thoughts and feedback.
110
110
@@ -155,8 +155,25 @@ refresh_period: 1h
155
155
156
156
=== `avro_raw_json`
157
157
158
-
Whether messages encoded in Avro format should be parsed as normal JSON ("json that meets the expectations of regular internet json") rather than https://avro.apache.org/docs/current/specification/_print/#json-encoding[Avro JSON^]. If `true` the schema returned from the subject should be parsed as https://pkg.go.dev/github.com/linkedin/goavro/v2#NewCodecForStandardJSONFull[standard json^] instead of as https://pkg.go.dev/github.com/linkedin/goavro/v2#NewCodec[avro json^]. There is a https://github.com/linkedin/goavro/blob/5ec5a5ee7ec82e16e6e2b438d610e1cab2588393/union.go#L224-L249[comment in goavro^], the https://github.com/linkedin/goavro[underlining library used for avro serialization^], that explains in more detail the difference between standard json and avro json.
158
+
Whether Avro messages should be parsed as normal JSON (JSON that meets the expectations of regular internet JSON) rather than https://avro.apache.org/docs/current/specification/[Avro JSON^].
159
159
160
+
If set to `false`, the schema returned from the subject is parsed as https://pkg.go.dev/github.com/linkedin/goavro/v2#NewCodec[Avro JSON^].
161
+
162
+
For example, with the union schema `["null","string","Transaction"]`, where `Transaction` is a record name, the processor expects:
163
+
164
+
- A `null` as a JSON `null`
165
+
- The string `"a"` as `{"string": "a"}`
166
+
- A `Transaction` instance as `{"Transaction": {...}}`, where `{...}` indicates the JSON encoding of a `Transaction` instance.
167
+
168
+
If set to `true`, the schema returned from the subject is parsed as https://pkg.go.dev/github.com/linkedin/goavro/v2#NewCodecForStandardJSONFull[standard JSON^].
169
+
170
+
For example, with the same union schema `["null","string","Transaction"]`, the processor expects:
171
+
172
+
- A `null` as a JSON `null`
173
+
- The string `"a"` as `"a"`
174
+
- A `Transaction` instance as `{...}`, where `{...}` indicates the JSON encoding of a `Transaction` instance.
175
+
176
+
For more details on the difference between standard JSON and Avro JSON, see the https://github.com/linkedin/goavro/blob/5ec5a5ee7ec82e16e6e2b438d610e1cab2588393/union.go#L224-L249[comment in Goavro^] and the https://github.com/linkedin/goavro[underlying library used for Avro serialization^].