[Bug] Partial-update merge-function keeps stale row when the last message is a lone UPDATE_BEFORE (-U) #6862

@sandyfog

Description

Search before asking

  • I searched in the issues and found nothing similar.

Paimon version

1.2.0

Compute Engine

Flink 1.20

Minimal reproduce step

Create the source changelog table

CREATE TABLE testlog(
  id STRING PRIMARY KEY NOT ENFORCED,
  f1 STRING,
  `delete` INT  -- DELETE is a reserved keyword in Flink SQL, so it is quoted
) WITH (
  'merge-engine' = 'deduplicate',
  'changelog-producer' = 'lookup'
);

-- seed data
INSERT INTO testlog VALUES ('11', '11', 0), ('12', '12', 0);

-- update id=12 to be logically deleted
INSERT INTO testlog VALUES ('12', '12', 1), ('13', '13', 0);

Query source table

SELECT * FROM testlog;

op   id   f1   delete
+I   11   11   0
+I   12   12   0
-U   12   12   0
+U   12   12   1
+I   13   13   0

Query source table with filter

SELECT * FROM testlog WHERE `delete` = 0;

op   id   f1   delete
+I   11   11   0
+I   12   12   0
-U   12   12   0
+I   13   13   0

Note:
After applying the filter delete = 0, the +U 12 12 1 message is dropped entirely.
Consequently, the last message about id = 12 that reaches the downstream Paimon table is a lone -U 12 12 0, with no matching +U.
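
To make the effect of the filter concrete, here is a minimal Python sketch. It is a toy model of a changelog stream, not Flink or Paimon code, and it assumes the filter is evaluated on each changelog message independently. The function name filter_changelog is hypothetical.

```python
# Toy model of a changelog stream; this is not Flink or Paimon code.
# Each message is (op, id, f1, delete), matching the columns shown above.
changelog = [
    ("+I", "11", "11", 0),
    ("+I", "12", "12", 0),
    ("-U", "12", "12", 0),
    ("+U", "12", "12", 1),
    ("+I", "13", "13", 0),
]


def filter_changelog(messages, predicate):
    """Apply a row-level predicate to every message independently (assumed semantics)."""
    return [m for m in messages if predicate(m)]


# Equivalent of WHERE delete = 0: the -U row still has delete = 0 and passes,
# while its matching +U row has delete = 1 and is dropped.
filtered = filter_changelog(changelog, lambda m: m[3] == 0)

ops_for_12 = [m[0] for m in filtered if m[1] == "12"]
print(ops_for_12)  # ['+I', '-U']
```

The -U for id = 12 survives because it carries the old value (delete = 0), so the stream for that key ends with a retraction that has no follow-up.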

Filter out logically-deleted rows and write into a partial-update table

CREATE TABLE testlog01 (
  id STRING PRIMARY KEY NOT ENFORCED,
  f1 STRING,
  `delete` INT  -- DELETE is a reserved keyword in Flink SQL, so it is quoted
) WITH (
  'merge-engine' = 'partial-update',
  'partial-update.remove-record-on-delete' = 'true',
  'changelog-producer' = 'lookup'
);

INSERT INTO testlog01
SELECT * FROM testlog WHERE `delete` = 0;

Query the target table

SELECT * FROM testlog01;

What doesn't meet your expectations?

Expected result

op   id   f1   delete
+I   11   11   0
+I   12   12   0
-D   12   12   0
+I   13   13   0

(id=12 should be deleted because its last message is -U and no +U reaches the sink)

Actual result

op   id   f1   delete
+I   11   11   0
+I   12   12   0
-U   12   12   0
+U   12   12   0
+I   13   13   0

The stale row id=12 is still present, breaking data correctness.
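
The gap between the expected and actual results can be sketched with a toy merge in Python. This is not Paimon's merge function. It only assumes that the sink either ignores -U messages (which would be consistent with the actual result) or treats them as retractions (which would give the expected result). The retract_on_update_before flag is hypothetical and does not correspond to any Paimon option.

```python
# Toy model of a primary-key merge; NOT Paimon's actual merge function.
# Input is the filtered stream that reaches the sink: (op, id, f1, delete).
filtered = [
    ("+I", "11", "11", 0),
    ("+I", "12", "12", 0),
    ("-U", "12", "12", 0),
    ("+I", "13", "13", 0),
]


def merge(messages, retract_on_update_before):
    """Fold messages into a final table state keyed by id."""
    table = {}
    for op, key, f1, deleted in messages:
        if op in ("+I", "+U"):
            table[key] = (f1, deleted)
        elif op == "-D":
            table.pop(key, None)
        elif op == "-U" and retract_on_update_before:
            # Hypothetical policy: retract the row; a later +U would re-add it.
            table.pop(key, None)
        # Otherwise the -U message is ignored and the old row stays.
    return table


actual_like = merge(filtered, retract_on_update_before=False)
expected_like = merge(filtered, retract_on_update_before=True)

print(sorted(actual_like))    # ['11', '12', '13']
print(sorted(expected_like))  # ['11', '13']
```

Under the "ignore -U" assumption the row for id = 12 survives, which matches the stale row reported here. Under the retraction assumption it is removed, which matches the expected result.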

Anything else?

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!
