feat: defaults should be None to not rewrite already present topic sc…#25967
schulzrol wants to merge 1 commit into open-metadata:main
Hi there 👋 Thanks for your contribution! The OpenMetadata team will review the PR shortly! Once it has been labeled as Let us know if you need any help! |
🔍 CI failure analysis for 2abc658: All CI failures are due to a missing 'safe to test' label on the PR. This is a security gate that blocks CI execution before any code is built or tested.
Issue
All CI jobs are failing at the label verification step before any actual testing or building occurs.
Root Cause
The OpenMetadata CI pipeline requires PRs to have a "safe to test" label as a security measure to prevent running untrusted code. All failing jobs show:
Details
The label check runs before any code execution:
None of these jobs proceeded to actual code validation because the label gate blocked them.
Relationship to Code Changes
These failures are completely unrelated to the PR changes. The PR modifies one line in
Code Review
- topic.messageSchema = Topic(
-     schemaText="", schemaType=SchemaType.Other, schemaFields=[]
- )
+ topic.messageSchema = Topic()
⚠️ Bug: Topic() still overwrites existing schema; doesn't achieve PR goal
The PR's stated goal is to preserve already-present topic schema metadata when the connector doesn't find a schema in the registry. However, topic.messageSchema = Topic() still creates a non-null Topic object with default values (schemaText=None, schemaType="None", schemaFields=[]).
Looking at the server-side TopicUpdater.entitySpecificUpdate() in TopicRepository.java (line 607), the update logic checks if (updated.getMessageSchema() != null) — and since Topic() is not null, the server will proceed to compare and record changes:
- schemaText: null vs existing → records a change, clears existing schema text
- schemaType: "None" vs existing (e.g., "Avro") → records a change, overwrites type
- schemaFields: [] vs existing fields → records a change, clears existing fields
This means existing enriched schema metadata will still be overwritten on subsequent connector runs, just with slightly different default values than before.
To truly preserve existing metadata when no schema is found, messageSchema should not be set at all (leaving it as None on the CreateTopicRequest), so that the server-side null check skips the schema update entirely:
else:
    pass  # No schema found; leave messageSchema unset to preserve existing metadata
Or equivalently, simply remove the entire else block so topic.messageSchema remains at its default None value from the CreateTopicRequest initialization.
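The difference between sending a default-constructed schema and sending nothing can be illustrated with a minimal sketch. The `MessageSchema` class and `apply_update` function below are hypothetical stand-ins, not the actual OpenMetadata classes, and the update is simplified to a wholesale replace rather than the field-by-field comparison the server performs. The only behaviour taken from the comment above is the null check.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MessageSchema:
    """Hypothetical stand-in for the schema object attached to a topic."""
    schemaText: Optional[str] = None
    schemaType: str = "None"
    schemaFields: List[str] = field(default_factory=list)


def apply_update(existing: MessageSchema,
                 updated: Optional[MessageSchema]) -> MessageSchema:
    """Simplified model of the null check: only a non-None schema
    replaces what is already stored."""
    if updated is not None:
        return updated
    return existing


# Schema that was enriched after an earlier connector run.
stored = MessageSchema(schemaText='{"type": "record"}',
                       schemaType="Avro",
                       schemaFields=["id"])

# A default-constructed object is not None, so it replaces the stored schema.
after_default = apply_update(stored, MessageSchema())

# Leaving the schema unset keeps the stored schema untouched.
after_none = apply_update(stored, None)
```

Under these assumptions, `after_default` carries the empty defaults while `after_none` is the original enriched schema, which is the distinction the review comment draws.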
Describe your changes:
I work on software that enriches the schemas of topics whose metadata has already been ingested by the provided OpenMetadata Kafka messaging service connector.
The issue I ran into is that the versioning of such registered-then-enriched topics did not correspond to actual changes to the schema or topic. Instead of setting the topic schema to empty strings/lists, the connector should leave already present metadata alone and only supply what it knows. This could be made configurable: either always rewrite with empty values, or just patch with the Kafka metadata and omit the schema.
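The configurable behaviour suggested above could look roughly like the following sketch. The function `build_topic_payload` and the flag `overwrite_missing_schema` are hypothetical names invented for illustration; they are not part of the connector, and the payload is a plain dictionary rather than the real request model.

```python
from typing import Any, Dict, Optional


def build_topic_payload(
    name: str,
    registry_schema: Optional[Dict[str, Any]],
    overwrite_missing_schema: bool = False,  # hypothetical option
) -> Dict[str, Any]:
    """Build a topic payload, attaching a schema only when appropriate."""
    payload: Dict[str, Any] = {"name": name}
    if registry_schema is not None:
        # A schema was found in the registry: always send it.
        payload["messageSchema"] = registry_schema
    elif overwrite_missing_schema:
        # Opt-in: send empty values, as the code before this PR did.
        payload["messageSchema"] = {
            "schemaText": "",
            "schemaType": "Other",
            "schemaFields": [],
        }
    # Otherwise the key is omitted so existing metadata is left alone.
    return payload


patched = build_topic_payload("orders", None)
rewritten = build_topic_payload("orders", None, overwrite_missing_schema=True)
```

Defaulting the flag to `False` is a design choice made here for the sketch, so that preserving existing metadata is the behaviour unless a user asks otherwise.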
Type of change:
Checklist:
I have read the CONTRIBUTING document.
My PR title is Fixes <issue-number>: <short explanation>
I have commented on my code, particularly in hard-to-understand areas.
For JSON Schema changes: I updated the migration scripts or explained why it is not needed.
I have added tests around the new logic.
For connector/ingestion changes: I updated the documentation.