[Feature] Support branch merge for append-only tables #7863

@JunRuiLee

Description

Search before asking

  • I searched in the issues and found nothing similar.

Motivation

Paimon branch management supports create, delete, rename, and fast-forward, but cannot incrementally merge one branch into another. Users need a non-destructive way to bring new data from a source branch into a target branch while keeping both branches intact.

Solution

Add merge_branch for append-only tables. It merges data files that exist only in the source branch into the target branch and skips files already present in the target.
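The merge rule above amounts to a set difference over data files. Here is a minimal sketch of that idea, assuming a file is identified by its name and a branch can be modeled as a plain list of files. `DataFile` and this `merge_branch` function are hypothetical illustrations, not Paimon's actual classes or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFile:
    """Hypothetical stand-in for a data file entry; not Paimon's real class."""
    name: str
    row_count: int


def merge_branch(source: list[DataFile], target: list[DataFile]) -> list[DataFile]:
    """Return the target's file list after the merge; both inputs stay untouched."""
    # Assumption: a file name uniquely identifies a file's content.
    present = {f.name for f in target}
    # Keep only the files that exist in the source but not in the target.
    new_files = [f for f in source if f.name not in present]
    return target + new_files


shared = DataFile("data-0.parquet", 100)
source = [shared, DataFile("data-2.parquet", 30)]
target = [shared, DataFile("data-1.parquet", 50)]

merged = merge_branch(source, target)
print([f.name for f in merged])
# ['data-0.parquet', 'data-1.parquet', 'data-2.parquet']
```

Merging the same source a second time adds nothing, because every source file is then already present in the target.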

Correctness depends on the source and target branches sharing a pure-append history. If compaction or INSERT OVERWRITE is allowed, old files may be rewritten or removed from the target branch, and a file-level diff can no longer reliably tell whether a source file represents new data or data that has already been rewritten in the target. To make this invariant explicit, branch merge is guarded by the immutable table option 'branch-merge.enabled' = 'true'. Tables created with this option reject compaction and INSERT OVERWRITE from the beginning, and do not support deletion vectors.
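The failure mode can be shown with a toy model. This sketch assumes each branch is a mapping from file name to the rows the file holds, and that compaction rewrites several files into one file with a new name. All names here are hypothetical and do not reflect Paimon's internals.

```python
Branch = dict[str, list[int]]  # file name -> rows stored in that file


def files_to_merge(source: Branch, target: Branch) -> list[str]:
    """File-level diff: names present in the source but absent from the target."""
    return sorted(name for name in source if name not in target)


def merge(source: Branch, target: Branch) -> Branch:
    """Copy the diffed files into a new branch state; inputs stay untouched."""
    merged = dict(target)
    for name in files_to_merge(source, target):
        merged[name] = source[name]
    return merged


def rows(branch: Branch) -> list[int]:
    """All rows visible in a branch, sorted for easy comparison."""
    return sorted(r for file_rows in branch.values() for r in file_rows)


base = {"f0": [1, 2], "f1": [3]}
source = {**base, "f2": [4]}

# Pure-append target: the shared files f0 and f1 keep their names.
target_append = {**base, "f3": [5]}
# Compacted target: f0 and f1 were rewritten into c0, same rows under a new name.
target_compacted = {"c0": [1, 2, 3], "f3": [5]}

print(rows(merge(source, target_append)))     # [1, 2, 3, 4, 5]
print(rows(merge(source, target_compacted)))  # [1, 1, 2, 2, 3, 3, 4, 5]
```

In the compacted case the diff treats `f0` and `f1` as new, so rows 1 to 3 are merged twice. Rejecting compaction and INSERT OVERWRITE on these tables rules that case out.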

PR Plan

  • PR1: Core APIs, merge implementation, REST endpoint, and unit tests.
  • PR2: Flink/Spark procedures, Flink Action, integration tests, and docs.

Anything else?

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!

Labels

enhancement (New feature or request)
