fix(source-mysql): handle null GTID in saved CDC state and improve error message #75530

Draft
devin-ai-integration[bot] wants to merge 3 commits into master from devin/1774615530-fix-mysql-cdc-gtid-npe-error-message

@devin-ai-integration
Contributor

@devin-ai-integration devin-ai-integration bot commented Mar 27, 2026

What

Resolves https://github.com/airbytehq/airbyte-internal-issues/issues/16114

When a MySQL CDC connection has a saved state where the GTID field is a JSON null, Jackson's NullNode.asText() returns the literal string "null" rather than an actual null. That string propagates into Debezium's MySqlGtidSet, which then throws a NullPointerException during GTID comparison. The NPE bypasses connector-level error handling and surfaces to the user as:

Incumbent CDC state is invalid, reason: java.lang.NullPointerException: Cannot invoke "io.debezium.connector.mysql.gtid.MySqlGtidSet$UUIDSet.getUUID()" because "other" is null

How

Two-layer fix in MySqlSourceDebeziumOperations.kt:

  1. Root cause fix in parseSavedOffset(): Filter out the literal "null" string (and blank strings) from the GTID value using takeUnless { it == "null" || it.isBlank() }, so it becomes actual null before any Debezium API call.

  2. Defensive catch in validate(): Wrap the GTID comparison block in a try-catch(NullPointerException) that routes through abortCdcSync() with a descriptive message. This ensures that even if other malformed GTID edge cases exist, the error is handled gracefully and respects the user's invalidCdcCursorPositionBehavior setting.
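The root cause fix in step 1 can be sketched in isolation as follows. This is a minimal illustration, not the connector's actual code: `normalizeGtid` is a hypothetical helper name, and the real filtering happens inline in parseSavedOffset(), whose surrounding code is not shown in this PR description. Only the `takeUnless` predicate is taken from the description above.

```kotlin
// Hypothetical helper (assumption): in the connector this logic sits inside
// parseSavedOffset() rather than in a standalone function.
//
// `raw` stands in for the result of calling asText() on the saved GTID node,
// which yields the string "null" when the node is a JSON null.
fun normalizeGtid(raw: String?): String? =
    // takeUnless returns the receiver when the predicate is false,
    // and null when the predicate is true.
    raw?.takeUnless { it == "null" || it.isBlank() }
```

With this shape, `normalizeGtid("null")`, `normalizeGtid("   ")`, and `normalizeGtid(null)` all evaluate to null, while a well-formed GTID set string is returned unchanged, so only real values reach the Debezium API.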

Note: An initial version of this PR also updated the CDK-level error template in CdcPartitionsCreator.kt, but this was reverted because the repo enforces that a single PR cannot modify both the bulk CDK and a connector. A follow-up CDK PR could improve the "Incumbent CDC state is invalid, reason: ..." template to remove jargon.

Review guide

  1. MySqlSourceDebeziumOperations.kt — the two substantive changes (root cause fix in parseSavedOffset + defensive catch in validate)
  2. metadata.yaml — version bump 3.51.5 → 3.51.6
  3. docs/integrations/sources/mysql.md — changelog entry (the table whitespace reformatting is from the automated bump_version_in_repo tool, no content changes to old rows)

Recommended reviewer checks:

  • Is catching NullPointerException too broad? It's scoped to just the GTID comparison block, but could mask a real NPE in queryPurgedIds() or MySqlGtidSet.subtract().
  • Verify that Jackson's NullNode.asText() does return "null" (string literal) — this is the assumed root cause.
  • No new tests added — the error path requires mocking Debezium internals. Acceptable risk?

User Impact

Users who encounter a null/malformed GTID in their saved CDC state will now see:

Incumbent CDC state is invalid, reason: Saved CDC replication state contains a malformed GTID set that is incompatible with the current MySQL server

This replaces the raw Java NPE stack trace. The outer "Incumbent CDC state is invalid, reason: ..." prefix comes from the CDK template, which this PR leaves unchanged; a follow-up CDK PR could clean that up.

The fix also respects invalidCdcCursorPositionBehavior, so if the user has configured the connector to reset on invalid state, it will do so automatically rather than failing.
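The way the defensive catch feeds into the user's setting can be sketched roughly as below. Everything here is a simplified stand-in: the enum values, the `ValidationResult` types, and the signatures of `abortCdcSync` and `validateGtid` are assumptions made for illustration and do not mirror the connector's or the CDK's real declarations. Only the error message text is taken from this description.

```kotlin
// All names and shapes below are hypothetical stand-ins (assumptions).
enum class InvalidCdcCursorPositionBehavior { FAIL_SYNC, RESET_SYNC }

sealed interface ValidationResult
object Valid : ValidationResult
data class Abort(val reason: String) : ValidationResult
data class Reset(val reason: String) : ValidationResult

const val MALFORMED_GTID_MESSAGE =
    "Saved CDC replication state contains a malformed GTID set " +
        "that is incompatible with the current MySQL server"

// Maps an invalid-state reason onto the outcome the user configured.
fun abortCdcSync(
    reason: String,
    behavior: InvalidCdcCursorPositionBehavior,
): ValidationResult =
    when (behavior) {
        InvalidCdcCursorPositionBehavior.FAIL_SYNC -> Abort(reason)
        InvalidCdcCursorPositionBehavior.RESET_SYNC -> Reset(reason)
    }

// `compareGtids` stands in for the Debezium GTID comparison block.
// Only a NullPointerException raised inside it is translated; any other
// exception still propagates to the caller.
fun validateGtid(
    behavior: InvalidCdcCursorPositionBehavior,
    compareGtids: () -> Unit,
): ValidationResult =
    try {
        compareGtids()
        Valid
    } catch (e: NullPointerException) {
        abortCdcSync(MALFORMED_GTID_MESSAGE, behavior)
    }
```

In this sketch, a comparison that throws an NPE yields `Reset(...)` when the behavior is RESET_SYNC and `Abort(...)` when it is FAIL_SYNC, which mirrors the reset-versus-fail choice described above. Keeping the try block limited to the comparison call is what bounds how much an over-broad NPE catch can hide.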

Can this PR be safely reverted and rolled back?

  • YES 💚

Link to Devin session: https://app.devin.ai/sessions/e0b3f31174ef43b79ef5f10473488203

…ror message

- Fix parseSavedOffset() to treat literal "null" string from NullNode.asText() as absent
- Add try-catch around GTID comparison in validate() to handle NPEs from Debezium gracefully
- Route GTID NPE through abortCdcSync() so invalidCdcCursorPositionBehavior setting is respected
- Improve user-facing error message in CdcPartitionsCreator to remove implementation leakage
- Bump source-mysql version 3.51.5 -> 3.51.6

Co-Authored-By: bot_apk <apk@cognition.ai>
@devin-ai-integration
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

The airbyte monorepo enforces that a single PR cannot modify both
the bulk CDK and a connector. Reverting CdcPartitionsCreator.kt change;
the connector-level fix in MySqlSourceDebeziumOperations.kt is sufficient.

Co-Authored-By: bot_apk <apk@cognition.ai>
@github-actions
Contributor

github-actions bot commented Mar 27, 2026

source-mysql Connector Test Results

0 tests: 0 passed ✅, 0 skipped 💤, 0 failed ❌
0 suites, 0 files, 0s ⏱️

Results for commit f9458cb.

♻️ This comment has been updated with latest results.

@github-actions
Contributor

github-actions bot commented Mar 27, 2026

Deploy preview for airbyte-docs ready!

✅ Preview
https://airbyte-docs-8hx23c1n1-airbyte-growth.vercel.app

Built with commit f9458cb.
This pull request is being automatically deployed with vercel-action

@github-actions
Contributor

Deploy preview for airbyte-kotlin-cdk ready!

✅ Preview
https://airbyte-kotlin-ic3zi09bi-airbyte-growth.vercel.app

Built with commit 7bf7886.
This pull request is being automatically deployed with vercel-action
