-
Notifications
You must be signed in to change notification settings - Fork 476
[common] Add field_id for Nested Row. #2322
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
c9fa23d to
2e25e3b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This pull request adds field_id support for nested Row types to enable proper schema evolution in Fluss. The changes ensure that nested fields within Row, Array, and Map types maintain globally unique identifiers, allowing the system to correctly track and handle schema changes over time.
Key Changes
- Added field_id tracking to DataField class and extended support to nested structures (RowType, ArrayType, MapType)
- Implemented ReassignFieldId visitor pattern to automatically assign unique field IDs to nested types during schema creation
- Enhanced JSON serialization/deserialization with backward compatibility for schemas without field IDs
- Simplified Projection class to work with field positions instead of IDs, reducing complexity
Reviewed changes
Copilot reviewed 28 out of 28 changed files in this pull request and generated 11 comments.
Show a summary per file
| File | Description |
|---|---|
| fluss-common/src/main/java/org/apache/fluss/types/DataField.java | Added fieldId parameter to constructors with default value -1 |
| fluss-common/src/main/java/org/apache/fluss/types/DataTypes.java | Added overloaded FIELD methods accepting fieldId parameter |
| fluss-common/src/main/java/org/apache/fluss/types/RowType.java | Added field ID validation, auto-assignment for -1 values, and equalsIgnoreFieldId method |
| fluss-common/src/main/java/org/apache/fluss/types/ReassignFieldId.java | New visitor to recursively reassign field IDs in nested structures |
| fluss-common/src/main/java/org/apache/fluss/metadata/Schema.java | Enhanced validation to check field ID uniqueness across nested fields and updated column() method to reassign nested field IDs |
| fluss-server/src/main/java/org/apache/fluss/server/coordinator/SchemaUpdate.java | Updated addColumn to reassign field IDs for nested types |
| fluss-common/src/main/java/org/apache/fluss/utils/json/DataTypeJsonSerde.java | Added field_id serialization/deserialization with backward compatibility |
| fluss-common/src/main/java/org/apache/fluss/utils/Projection.java | Simplified to work with positions instead of IDs, removing complexity |
| fluss-common/src/main/java/org/apache/fluss/record/FileLogProjection.java | Updated to use field positions for projection |
| fluss-flink/fluss-flink-common/src/main/java/org/apache/fluss/flink/utils/FlinkConversions.java | Modified to use column() method which now assigns field IDs |
| fluss-flink/fluss-flink-common/src/main/java/org/apache/fluss/flink/source/reader/FlinkSourceSplitReader.java | Added recursive comparison methods to ignore field IDs when comparing schemas |
| fluss-server/src/test/java/org/apache/fluss/server/coordinator/TableManagerITCase.java | Added test for nested row field IDs in schema evolution |
| fluss-flink/fluss-flink-common/src/test/java/org/apache/fluss/flink/utils/FlinkConversionsTest.java | Added comprehensive tests for nested row field ID assignment |
| fluss-flink/fluss-flink-common/src/test/java/org/apache/fluss/flink/sink/FlinkComplexTypeITCase.java | Added integration test for projection and adding columns with nested rows |
| fluss-common/src/test/java/org/apache/fluss/utils/json/DataTypeJsonSerdeTest.java | Added backward compatibility tests and field ID validation tests |
| fluss-common/src/test/java/org/apache/fluss/metadata/TableSchemaTest.java | Added tests for field ID reassignment behavior |
| fluss-common/src/test/java/org/apache/fluss/record/FileLogProjectionTest.java | Updated tests for new projection error messages |
| fluss-client/src/test/java/org/apache/fluss/client/admin/FlussAdminITCase.java | Updated to test nested row column addition with field IDs |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
fluss-common/src/test/java/org/apache/fluss/utils/json/DataTypeJsonSerdeTest.java
Outdated
Show resolved
Hide resolved
fluss-common/src/test/java/org/apache/fluss/utils/json/DataTypeJsonSerdeTest.java
Outdated
Show resolved
Hide resolved
fluss-common/src/main/java/org/apache/fluss/types/DataTypes.java
Outdated
Show resolved
Hide resolved
fluss-common/src/test/java/org/apache/fluss/record/FileLogProjectionTest.java
Show resolved
Hide resolved
...ink/fluss-flink-common/src/test/java/org/apache/fluss/flink/sink/FlinkComplexTypeITCase.java
Outdated
Show resolved
Hide resolved
fluss-common/src/test/java/org/apache/fluss/utils/json/DataTypeJsonSerdeTest.java
Outdated
Show resolved
Hide resolved
fluss-common/src/test/java/org/apache/fluss/utils/json/DataTypeJsonSerdeTest.java
Outdated
Show resolved
Hide resolved
fluss-common/src/main/java/org/apache/fluss/metadata/Schema.java
Outdated
Show resolved
Hide resolved
fluss-common/src/main/java/org/apache/fluss/types/ReassignFieldId.java
Outdated
Show resolved
Hide resolved
f3506ae to
45ee54e
Compare
...ink/fluss-flink-common/src/test/java/org/apache/fluss/flink/sink/FlinkComplexTypeITCase.java
Outdated
Show resolved
Hide resolved
fluss-common/src/main/java/org/apache/fluss/utils/json/DataTypeJsonSerde.java
Outdated
Show resolved
Hide resolved
fluss-common/src/main/java/org/apache/fluss/utils/json/DataTypeJsonSerde.java
Outdated
Show resolved
Hide resolved
...-flink-common/src/main/java/org/apache/fluss/flink/source/reader/FlinkSourceSplitReader.java
Outdated
Show resolved
Hide resolved
26285b4 to
55eb34e
Compare
|
@wuchong , I have modified based on comment. Please help check again.Especially, currently, |
Purpose
Linked issue: close #2310 , add field id to nested rows.
Brief change log
All fields are assigned unique, sequential IDs in a flattened order, regardless of nesting level. For example:
Then the field Id for each field is:
Why Flatten Numerical Order?
API and Format
Add field_id to org.apache.fluss.types.DataField.
Documentation