Skip to content

fix(connectors): Add unwrap_envelope transform and envelope detection to fix source-sink format mismatch#3197

Open
atharvalade wants to merge 2 commits into
apache:masterfrom
atharvalade:fix/shared-source-sink-contract
Open

fix(connectors): Add unwrap_envelope transform and envelope detection to fix source-sink format mismatch#3197
atharvalade wants to merge 2 commits into
apache:masterfrom
atharvalade:fix/shared-source-sink-contract

Conversation

@atharvalade
Copy link
Copy Markdown
Contributor

Which issue does this PR close?

Closes #3174

Rationale

Sources (e.g. Postgres) wrap row data in a DatabaseRecord envelope while sinks (e.g. Iceberg) expect flat JSON matching the target table schema — no shared contract exists, producing silent null failures.

What changed?

The Postgres source emits {table_name, operation_type, timestamp, data: {...}, old_data} envelopes, but the Iceberg sink's Arrow JSON reader maps these nested structures to top-level fields as null, silently violating non-nullable constraints.

This adds a reusable unwrap_envelope transform to the connector SDK that extracts a nested field (e.g. data) and promotes it as the top-level payload, plus explicit envelope detection in the Iceberg sink that errors with an actionable message instead of failing silently.

Local Execution

  • Passed
  • Pre-commit hooks ran (fmt, clippy, license-headers, trailing-whitespace, trailing-newline all clean; 119 tests pass across SDK + integration suites)

AI Usage

  1. Opus 4.6
  2. used for codebase exploration and following existing transform patterns
  3. All 8 new unit tests pass locally, clippy/fmt clean, existing 111 tests unaffected
  4. Yes, all code can be explained

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 29, 2026

Codecov Report

❌ Patch coverage is 91.27907% with 15 lines in your changes missing coverage. Please review.
✅ Project coverage is 58.37%. Comparing base (611fca0) to head (8defa26).
⚠️ Report is 36 commits behind head on master.

Files with missing lines Patch % Lines
...nectors/sdk/src/transforms/json/unwrap_envelope.rs 96.52% 5 Missing ⚠️
...re/connectors/sinks/iceberg_sink/src/router/mod.rs 60.00% 3 Missing and 1 partial ⚠️
core/connectors/sdk/src/transforms/mod.rs 0.00% 3 Missing ⚠️
...e/connectors/sdk/src/transforms/unwrap_envelope.rs 80.00% 3 Missing ⚠️
Additional details and impacted files
@@              Coverage Diff              @@
##             master    #3197       +/-   ##
=============================================
- Coverage     74.10%   58.37%   -15.73%     
  Complexity      943      943               
=============================================
  Files          1159     1160        +1     
  Lines        102033    92080     -9953     
  Branches      79083    69130     -9953     
=============================================
- Hits          75607    53755    -21852     
- Misses        23765    35741    +11976     
+ Partials       2661     2584       -77     
Components Coverage Δ
Rust Core 54.28% <91.27%> (-21.05%) ⬇️
Java SDK 60.14% <ø> (ø)
C# SDK 69.38% <ø> (ø)
Python SDK 81.43% <ø> (ø)
Node SDK 91.53% <ø> (ø)
Go SDK 39.43% <ø> (ø)
Files with missing lines Coverage Δ
core/connectors/sdk/src/transforms/json/mod.rs 53.84% <ø> (ø)
core/connectors/sdk/src/transforms/mod.rs 29.62% <0.00%> (-3.71%) ⬇️
...e/connectors/sdk/src/transforms/unwrap_envelope.rs 80.00% <80.00%> (ø)
...re/connectors/sinks/iceberg_sink/src/router/mod.rs 40.71% <60.00%> (+1.48%) ⬆️
...nectors/sdk/src/transforms/json/unwrap_envelope.rs 96.52% <96.52%> (ø)

... and 248 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@atharvalade atharvalade changed the title Add unwrap_envelope transform and envelope detection to fix source-sink format mismatch fix(connectors): Add unwrap_envelope transform and envelope detection to fix source-sink format mismatch Apr 29, 2026
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 7, 2026

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs.

If you need a review, please ensure CI is green and the PR is rebased on the latest master. Don't hesitate to ping the maintainers - either @core on Discord or by mentioning them directly here on the PR.

Thank you for your contribution!

@github-actions github-actions Bot added stale Inactive issue or pull request and removed stale Inactive issue or pull request labels May 7, 2026
@hubcio
Copy link
Copy Markdown
Contributor

hubcio commented May 14, 2026

@atharvalade please rebase this PR

/author

@github-actions github-actions Bot added the S-waiting-on-author PR is waiting on author response label May 14, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

S-waiting-on-author PR is waiting on author response

Projects

None yet

Development

Successfully merging this pull request may close these issues.

No shared contract between source output and sink input format

2 participants