Skip to content

Commit 8421122

Browse files
authored
[FLINK-38886] Introduce GenericRecordData and Internal / External converters in flink-cdc-common (#4218)
1 parent a80edd1 commit 8421122

File tree

16 files changed

+2687
-18
lines changed

16 files changed

+2687
-18
lines changed
Lines changed: 62 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,62 @@
1+
---
2+
title: "Type Mappings"
3+
weight: 8
4+
type: docs
5+
aliases:
6+
- /core-concept/type-mappings/
7+
---
8+
<!--
9+
Licensed to the Apache Software Foundation (ASF) under one
10+
or more contributor license agreements. See the NOTICE file
11+
distributed with this work for additional information
12+
regarding copyright ownership. The ASF licenses this file
13+
to you under the Apache License, Version 2.0 (the
14+
"License"); you may not use this file except in compliance
15+
with the License. You may obtain a copy of the License at
16+
17+
http://www.apache.org/licenses/LICENSE-2.0
18+
19+
Unless required by applicable law or agreed to in writing,
20+
software distributed under the License is distributed on an
21+
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
22+
KIND, either express or implied. See the License for the
23+
specific language governing permissions and limitations
24+
under the License.
25+
-->
26+
27+
# 类型映射
28+
29+
对于每个 CDC 数据类型(`org.apache.flink.cdc.common.types.DataType` 的子类),我们规定了用于内部序列化和反序列化的 CDC 内部类型,以及用于类型合并、类型转换和 UDF 求值的 Java 外部类型。
30+
31+
## 内部类型和外部类型
32+
33+
一些基本类型对于内部类型和 Java 类可能具有相同的表示形式(例如,`DataTypes.INT()` 使用 `java.lang.Integer` 同时作为内部和外部类型)。
34+
其他类型可能使用不同的表示形式,例如,`DataTypes.TIMESTAMP` 在内部表示中使用 `org.apache.flink.cdc.common.data.TimestampData`,在外部操作中使用 `java.time.LocalDateTime`
35+
36+
如果您正在编写 YAML Pipeline 连接器,`DataChangeEvent` 应当携带内部类型 `RecordData`,并且其所有字段都是内部类型的实例。
37+
38+
如果您正在编写 Transform UDF,则其参数和返回值类型应定义为其外部 Java 类型。
39+
40+
## 完整类型列表
41+
42+
| CDC 数据类型 | CDC 内部类型 | Java 外部类型 |
43+
|--------------------------------|------------------------------------------------------------|-----------------------------------------------------|
44+
| BOOLEAN | `java.lang.Boolean` | `java.lang.Boolean` |
45+
| TINYINT | `java.lang.Byte` | `java.lang.Byte` |
46+
| SMALLINT | `java.lang.Short` | `java.lang.Short` |
47+
| INTEGER | `java.lang.Integer` | `java.lang.Integer` |
48+
| BIGINT | `java.lang.Long` | `java.lang.Long` |
49+
| FLOAT | `java.lang.Float` | `java.lang.Float` |
50+
| DOUBLE | `java.lang.Double` | `java.lang.Double` |
51+
| DECIMAL | `org.apache.flink.cdc.common.data.DecimalData` | `java.math.BigDecimal` |
52+
| DATE | `org.apache.flink.cdc.common.data.DateData` | `java.time.LocalDate` |
53+
| TIME | `org.apache.flink.cdc.common.data.TimeData` | `java.time.LocalTime` |
54+
| TIMESTAMP | `org.apache.flink.cdc.common.data.TimestampData` | `java.time.LocalDateTime` |
55+
| TIMESTAMP_TZ | `org.apache.flink.cdc.common.data.ZonedTimestampData` | `java.time.ZonedDateTime` |
56+
| TIMESTAMP_LTZ | `org.apache.flink.cdc.common.data.LocalZonedTimestampData` | `java.time.Instant` |
57+
| CHAR<br/>VARCHAR<br/>STRING | `org.apache.flink.cdc.common.data.StringData` | `java.lang.String` |
58+
| BINARY<br/>VARBINARY<br/>BYTES | `byte[]` | `byte[]` |
59+
| ARRAY | `org.apache.flink.cdc.common.data.ArrayData` | `java.util.List<T>` |
60+
| MAP | `org.apache.flink.cdc.common.data.MapData` | `java.util.Map<K, V>` |
61+
| ROW | `org.apache.flink.cdc.common.data.RecordData` | `java.util.List<Object>` |
62+
| VARIANT | `org.apache.flink.cdc.common.types.variant.Variant` | `org.apache.flink.cdc.common.types.variant.Variant` |
Lines changed: 62 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,62 @@
1+
---
2+
title: "Type Mappings"
3+
weight: 8
4+
type: docs
5+
aliases:
6+
- /core-concept/type-mappings/
7+
---
8+
<!--
9+
Licensed to the Apache Software Foundation (ASF) under one
10+
or more contributor license agreements. See the NOTICE file
11+
distributed with this work for additional information
12+
regarding copyright ownership. The ASF licenses this file
13+
to you under the Apache License, Version 2.0 (the
14+
"License"); you may not use this file except in compliance
15+
with the License. You may obtain a copy of the License at
16+
17+
http://www.apache.org/licenses/LICENSE-2.0
18+
19+
Unless required by applicable law or agreed to in writing,
20+
software distributed under the License is distributed on an
21+
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
22+
KIND, either express or implied. See the License for the
23+
specific language governing permissions and limitations
24+
under the License.
25+
-->
26+
27+
# Type Mappings
28+
29+
For each CDC DataType (subclasses of `org.apache.flink.cdc.common.types.DataType`), we define CDC Internal types for internal serialization & deserialization and Java classes (for type merging, casting, and UDF evaluations).
30+
31+
## Internal and External Types
32+
33+
Primitive types may have the same representation for Internal type and Java classes (`DataTypes.INT()` only uses `java.lang.Integer`).
34+
Other types use different representations, like `DataTypes.TIMESTAMP` uses `org.apache.flink.cdc.common.data.TimestampData` for internal representation and `java.time.LocalDateTime` for external operations.
35+
36+
If you're writing a pipeline source / sink connector, `DataChangeEvent` carries internal type `RecordData`, and all its fields are internal type instances.
37+
38+
If you're writing a UDF, its arguments and return value types should be defined as its external Java type.
39+
40+
## Full Types List
41+
42+
| CDC Data Type | CDC Internal Type | External Java Class |
43+
|--------------------------------|------------------------------------------------------------|-----------------------------------------------------|
44+
| BOOLEAN | `java.lang.Boolean` | `java.lang.Boolean` |
45+
| TINYINT | `java.lang.Byte` | `java.lang.Byte` |
46+
| SMALLINT | `java.lang.Short` | `java.lang.Short` |
47+
| INTEGER | `java.lang.Integer` | `java.lang.Integer` |
48+
| BIGINT | `java.lang.Long` | `java.lang.Long` |
49+
| FLOAT | `java.lang.Float` | `java.lang.Float` |
50+
| DOUBLE | `java.lang.Double` | `java.lang.Double` |
51+
| DECIMAL | `org.apache.flink.cdc.common.data.DecimalData` | `java.math.BigDecimal` |
52+
| DATE | `org.apache.flink.cdc.common.data.DateData` | `java.time.LocalDate` |
53+
| TIME | `org.apache.flink.cdc.common.data.TimeData` | `java.time.LocalTime` |
54+
| TIMESTAMP | `org.apache.flink.cdc.common.data.TimestampData` | `java.time.LocalDateTime` |
55+
| TIMESTAMP_TZ | `org.apache.flink.cdc.common.data.ZonedTimestampData` | `java.time.ZonedDateTime` |
56+
| TIMESTAMP_LTZ | `org.apache.flink.cdc.common.data.LocalZonedTimestampData` | `java.time.Instant` |
57+
| CHAR<br/>VARCHAR<br/>STRING | `org.apache.flink.cdc.common.data.StringData` | `java.lang.String` |
58+
| BINARY<br/>VARBINARY<br/>BYTES | `byte[]` | `byte[]` |
59+
| ARRAY | `org.apache.flink.cdc.common.data.ArrayData` | `java.util.List<T>` |
60+
| MAP | `org.apache.flink.cdc.common.data.MapData` | `java.util.Map<K, V>` |
61+
| ROW | `org.apache.flink.cdc.common.data.RecordData` | `java.util.List<Object>` |
62+
| VARIANT | `org.apache.flink.cdc.common.types.variant.Variant` | `org.apache.flink.cdc.common.types.variant.Variant` |

0 commit comments

Comments
 (0)