[FLINK-38844][pipeline-connector][postgres]Add metadata column support #4202
This commit adds metadata column support for the PostgreSQL Pipeline Connector, enabling users to access metadata information in their data pipelines.
Changes:
- Add OpTsMetadataColumn for operation timestamp
- Add DatabaseNameMetadataColumn for database name
- Add SchemaNameMetadataColumn for schema name
- Add TableNameMetadataColumn for table name
- Update PostgresDataSource to support metadata columns
- Add comprehensive E2E test testAllMetadataColumns()
- Update documentation (English and Chinese)
yuxiqian
left a comment
Thanks for @tchivs' contribution.
I wonder if we need individual metadata columns for database, schema, and table, since they're always available in Transform expressions (only after FLINK-38840 got closed).
Thanks for the review @yuxiqian! You raise an important point about the overlap with Transform metadata fields. You're right that namespace_name, schema_name, and table_name are already available in Transform expressions. Let me clarify the design rationale:
I see two perspectives here: Argument for keeping them:
Argument for removing them:
My suggestion:
What's your preference? I'm happy to adjust the PR based on the team's direction.
I think it's OK to polish the documentation in this PR, leaving the metadata definitions as they are.
…relationship with Transform expressions
Thanks @yuxiqian for the feedback! I've polished the documentation to clarify the relationship between metadata columns and Transform expressions. Changes made:
The metadata definitions remain unchanged as you suggested.
What is the purpose of the pull request
This PR adds metadata column support for the PostgreSQL Pipeline Connector, enabling users to access metadata information such as operation timestamp, database name, schema name, and table name in their data pipelines.
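As a rough illustration of the idea, the sketch below shows how a metadata column could expose a name and read its value from a per-record metadata map. It is not the code from this PR: the MetadataColumn interface, the "op_ts" and "table_name" map keys, and the MetadataColumnSketch class are hypothetical stand-ins chosen so the example runs on its own, and the static supportedMetadataColumns() helper only mirrors the name of the method mentioned in the change log.

```java
import java.util.HashMap;
import java.util.Map;

public class MetadataColumnSketch {

    /** Hypothetical stand-in for the connector's metadata column contract. */
    public interface MetadataColumn {
        String getName();

        /** Extracts this column's value from the per-record metadata map. */
        Object read(Map<String, String> metadata);
    }

    /** Operation timestamp, assumed to arrive as epoch milliseconds under the key "op_ts". */
    public static final class OpTsMetadataColumn implements MetadataColumn {
        @Override
        public String getName() {
            return "op_ts";
        }

        @Override
        public Object read(Map<String, String> metadata) {
            String value = metadata.get("op_ts");
            if (value == null) {
                throw new IllegalArgumentException("op_ts metadata is missing");
            }
            return Long.parseLong(value);
        }
    }

    /** Table name, assumed to arrive under the key "table_name". */
    public static final class TableNameMetadataColumn implements MetadataColumn {
        @Override
        public String getName() {
            return "table_name";
        }

        @Override
        public Object read(Map<String, String> metadata) {
            return metadata.get("table_name");
        }
    }

    /** Illustrative analogue of a data source advertising the columns it can provide. */
    public static MetadataColumn[] supportedMetadataColumns() {
        return new MetadataColumn[] {new OpTsMetadataColumn(), new TableNameMetadataColumn()};
    }

    public static void main(String[] args) {
        // Metadata that might accompany a single change event (values are made up).
        Map<String, String> metadata = new HashMap<>();
        metadata.put("op_ts", "1700000000000");
        metadata.put("table_name", "orders");

        for (MetadataColumn column : supportedMetadataColumns()) {
            System.out.println(column.getName() + " = " + column.read(metadata));
        }
    }
}
```

Keeping each column as its own small class means a source only has to list the columns it supports, and each column owns the parsing of its own value.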
Brief change log
- OpTsMetadataColumn: Operation timestamp metadata
- DatabaseNameMetadataColumn: Database name metadata
- SchemaNameMetadataColumn: Schema name metadata
- TableNameMetadataColumn: Table name metadata
- Update PostgresDataSource to support metadata columns via the supportedMetadataColumns() method
- Add testAllMetadataColumns() in PostgresFullTypesITCase
Verifying this change
This change added tests and can be verified as follows:
- testAllMetadataColumns() E2E test in PostgresFullTypesITCase
Does this pull request potentially affect one of the following parts:
- @Public(Evolving): no
Documentation