Open
Labels
bug: Something isn't working
Description
Search before asking
- I searched in the issues and found nothing similar.
Paimon version
1.2.0
Compute Engine
flink 1.20
Minimal reproduce step
Create the source changelog table
CREATE TABLE testlog(
id STRING PRIMARY KEY NOT ENFORCED,
f1 STRING,
`delete` INT -- DELETE is a reserved keyword in Flink SQL, so it is escaped with backticks
) WITH (
'merge-engine' = 'deduplicate',
'changelog-producer' = 'lookup'
);
-- seed data
INSERT INTO testlog VALUES ('11', '11', 0), ('12', '12', 0);
-- update id=12 to be logically deleted
INSERT INTO testlog VALUES ('12', '12', 1), ('13', '13', 0);
Query source table
SELECT * FROM testlog;
op id f1 delete
+I 11 11 0
+I 12 12 0
-U 12 12 0
+U 12 12 1
+I 13 13 0
Query source table with filter
SELECT * FROM testlog WHERE `delete` = 0;
op id f1 delete
+I 11 11 0
+I 12 12 0
-U 12 12 0
+I 13 13 0
Note:
After the filter delete = 0 is applied, the +U 12 12 1 record is dropped entirely because its new value no longer satisfies the predicate.
Consequently, the last message about id = 12 that reaches the downstream Paimon table is -U 12 12 0.
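The effect described in the note can be reproduced outside Flink with a small simulation. This is only a sketch, not Flink or Paimon code: it assumes the WHERE predicate is evaluated on each changelog record independently, which is consistent with the filtered output shown above. The names Record and filter_changelog are hypothetical and exist only for this illustration.

```python
from typing import NamedTuple


class Record(NamedTuple):
    op: str      # changelog row kind: +I, -U, +U or -D
    id: str
    f1: str
    delete: int


# Changelog produced by `SELECT * FROM testlog` in this issue.
changelog = [
    Record("+I", "11", "11", 0),
    Record("+I", "12", "12", 0),
    Record("-U", "12", "12", 0),
    Record("+U", "12", "12", 1),
    Record("+I", "13", "13", 0),
]


def filter_changelog(records, predicate):
    """Keep the records whose own values satisfy the predicate (assumed model)."""
    return [r for r in records if predicate(r)]


filtered = filter_changelog(changelog, lambda r: r.delete == 0)

# Track the last row kind seen for each primary key after filtering.
last_op = {}
for r in filtered:
    last_op[r.id] = r.op

for r in filtered:
    print(r.op, r.id, r.f1, r.delete)
print(last_op)
```

Under this model the -U for id = 12 passes the filter because it carries the old value delete = 0, while the matching +U carries delete = 1 and is dropped, so the update pair arrives at the sink incomplete.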
Filter out logically-deleted rows and write into a partial-update table
CREATE TABLE testlog01 (
id STRING PRIMARY KEY NOT ENFORCED,
f1 STRING,
`delete` INT -- escaped because DELETE is a reserved keyword in Flink SQL
) WITH (
'merge-engine' = 'partial-update',
'partial-update.remove-record-on-delete' = 'true',
'changelog-producer' = 'lookup'
);
INSERT INTO testlog01
SELECT * FROM testlog WHERE `delete` = 0;
Query the target table
SELECT * FROM testlog01;
What doesn't meet your expectations?
Expected result
op id f1 delete
+I 11 11 0
+I 12 12 0
-D 12 12 0
+I 13 13 0
(id=12 should be deleted because its last message is -U and no +U reaches the sink)
Actual result
op id f1 delete
+I 11 11 0
+I 12 12 0
-U 12 12 0
+U 12 12 0
+I 13 13 0
The stale row id=12 is still present, breaking data correctness.
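The gap between the expected and actual results can be illustrated by folding the filtered changelog into a keyed table under two different treatments of -U. This is a hypothetical model, not Paimon's merge logic: the function materialize and its flag retract_on_update_before are invented for this sketch, and the assumption that the sink ignores -U records is only inferred from the actual result reported here.

```python
def materialize(changelog, retract_on_update_before):
    """Fold (op, id, f1, delete) records into a dict keyed by primary key."""
    table = {}
    for op, key, f1, delete in changelog:
        if op in ("+I", "+U"):
            table[key] = (f1, delete)
        elif op == "-D" or (op == "-U" and retract_on_update_before):
            table.pop(key, None)
        # Otherwise the -U is ignored and the previous row stays in place.
    return table


# Records that reach the sink after `WHERE delete = 0`.
filtered = [
    ("+I", "11", "11", 0),
    ("+I", "12", "12", 0),
    ("-U", "12", "12", 0),
    ("+I", "13", "13", 0),
]

expected = materialize(filtered, retract_on_update_before=True)
actual = materialize(filtered, retract_on_update_before=False)

print(sorted(expected))  # ['11', '13']
print(sorted(actual))    # ['11', '12', '13']
```

When a dangling -U is treated as a retraction, id = 12 disappears, which is the outcome this report expects. When it is ignored, the stale row survives, matching the actual result.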
Anything else?
No response
Are you willing to submit a PR?
- I'm willing to submit a PR!