<Image img={cp_step1} alt="Select data source type" size="md"/>
### Configure the data source {#3-configure-data-source}
Fill out the form by providing your ClickPipe with a name, a description (optional), your credentials, and other connection details.
<Image img={cp_step2} alt="Fill out connection details" size="md"/>
### Configure a schema registry (optional) {#4-configure-your-schema-registry}
A valid schema is required for Avro streams. See [Schema registries](#schema-registries) for more details on how to configure a schema registry.
### Configure a reverse private endpoint (optional) {#5-configure-reverse-private-endpoint}
Configure a Reverse Private Endpoint to allow ClickPipes to connect to your Kafka cluster using AWS PrivateLink.
Clicking on "Create ClickPipe" will create and run your ClickPipe.
</VerticalStepper>
## Schema registries {#schema-registries}
ClickPipes supports schema registries for Avro data streams.
### Supported registries {#supported-registries}
Schema registries that use the [Confluent Schema Registry API](https://docs.confluent.io/platform/current/schema-registry/develop/api.html) are supported. This includes:
- Confluent Kafka and Cloud
- Redpanda
- AWS MSK
- Upstash
ClickPipes is not currently compatible with the AWS Glue Schema Registry or the Azure Schema Registry.
### Configuration {#schema-registry-configuration}
A schema registry can be configured when setting up a ClickPipe, in one of three ways:
1. Providing the root schema registry URL (e.g. `https://registry.example.com`). **This is the preferred method.**
2. Providing a complete path to the schema id (e.g. `https://registry.example.com/schemas/ids/1000`)
3. Providing a complete path to the schema subject (e.g. `https://registry.example.com/subjects/events`)
   - Optionally, a specific version can be referenced by appending `/versions/[version]` to the URL (otherwise ClickPipes will retrieve the latest version).
### How it works {#how-it-works}
ClickPipes dynamically retrieves and applies the Avro schema from the configured Schema Registry.
- If there's a schema id embedded in the message, it will use that to retrieve the schema.
- If there's no schema id embedded in the message, it will use the schema id or subject name specified in the ClickPipe configuration to retrieve the schema.
- If the message is written without an embedded schema id, and no schema id or subject name is specified in the ClickPipe configuration, then the schema will not be retrieved and the message will be skipped with a `SOURCE_SCHEMA_ERROR` logged in the ClickPipes errors table.
- If the message does not conform to the schema, then the message will be skipped with a `DATA_PARSING_ERROR` logged in the ClickPipes errors table.
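The fallback order described above can be sketched as follows (illustrative only; `resolve_schema` and raising the error names as exceptions are not a real ClickPipes API):

```python
def resolve_schema(embedded_id, configured_ref, fetch):
    """Illustrative schema resolution order: embedded schema ID wins,
    then the ClickPipe's configured schema ID or subject name."""
    if embedded_id is not None:
        return fetch(embedded_id)        # embedded schema ID takes priority
    if configured_ref is not None:
        return fetch(configured_ref)     # configured schema ID or subject name
    # Neither available: the message is skipped and a SOURCE_SCHEMA_ERROR
    # is logged in the ClickPipes errors table
    raise LookupError("SOURCE_SCHEMA_ERROR")

registry = {1000: "schema-1000", "events": "schema-events"}
print(resolve_schema(1000, "events", registry.get))   # schema-1000
print(resolve_schema(None, "events", registry.get))   # schema-events
```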
### Schema mapping {#schema-mapping}
The following rules are applied to the mapping between the retrieved Avro schema and the ClickHouse destination table:
- If the Avro schema contains a field that is not included in the ClickHouse destination mapping, that field is ignored.
- If the Avro schema is missing a field defined in the ClickHouse destination mapping, the ClickHouse column will be populated with a "zero" value, such as 0 or an empty string. Note that [DEFAULT](/sql-reference/statements/create/table#default) expressions are not currently evaluated for ClickPipes inserts (this is a temporary limitation pending updates to the ClickHouse server default processing).
- If the Avro schema field and the ClickHouse column are incompatible, inserts of that row/message will fail, and the failure will be recorded in the ClickPipes errors table. Note that several implicit conversions are supported (like between numeric types), but not all (for example, an Avro `record` field cannot be inserted into an `Int32` ClickHouse column).
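A minimal sketch of the first two mapping rules (the `map_row` helper and the zero values shown are illustrative assumptions, not ClickPipes internals):

```python
# Illustrative "zero" values per destination column type
ZERO_VALUES = {"Int32": 0, "Int64": 0, "Float64": 0.0, "String": ""}

def map_row(avro_record: dict, destination: dict) -> dict:
    """Map an Avro record onto destination columns: Avro fields absent from
    the destination mapping are ignored; destination columns missing from
    the record are filled with the column type's zero value."""
    return {col: avro_record.get(col, ZERO_VALUES[typ])
            for col, typ in destination.items()}

row = map_row({"id": 7, "debug": "dropped"},       # "debug" is not mapped
              {"id": "Int32", "name": "String"})   # "name" is missing
# row == {"id": 7, "name": ""}
```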
## Supported data sources {#supported-data-sources}
| Name | Logo | Type | Status | Description |
|------|------|------|--------|-------------|
Nullable types in Avro are defined by using a Union schema of `(T, null)` or `(null, T)`.
ClickPipes does not currently support schemas that contain other Avro Unions (this may change in the future with the maturity of the new ClickHouse Variant and JSON data types). If the Avro schema contains a "non-null" union, ClickPipes will generate an error when attempting to calculate a mapping between the Avro schema and ClickHouse column types.
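The nullable-union rule can be sketched as follows (`union_to_clickhouse` is a hypothetical helper, and the translation of Avro type names to ClickHouse type names is elided):

```python
def union_to_clickhouse(union: list) -> str:
    """(T, null) or (null, T) maps to Nullable(T); any other union,
    including a "non-null" union, is an error."""
    branches = [t for t in union if t != "null"]
    if len(branches) != 1 or len(union) > 2:
        raise ValueError("unsupported non-null Avro union")
    return f"Nullable({branches[0]})"

print(union_to_clickhouse(["null", "string"]))   # Nullable(string)
print(union_to_clickhouse(["long", "null"]))     # Nullable(long)
```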
## Kafka virtual columns {#kafka-virtual-columns}
The following virtual columns are supported for Kafka-compatible streaming data sources. When creating a new destination table, virtual columns can be added using the `Add Column` button.