New Feature - Append+Dedup Event Stream to create a completely deduplicated event stream of data #46865
williamkaper started this conversation in Connector Ideas and Features
Replies: 1 comment
I don't have the skills to add a new option to the CDK and try this in Postgres, but if the team is open to building it, I would volunteer to help review and test it.
Today, there are two options for appending data. You can append everything (incrementally or with a full refresh), which creates a duplicate row on every update event, or you can use append-dedup, which updates the record by primary key (PKey) and leaves only one record per unique PKey, based on the latest cursor value / most recently synced Airbyte record.
What is missing is a way to create a deduplicated event stream, where Airbyte keeps distinct PKey + cursor rows. Rows with the same PKey + cursor value would be deduplicated, keeping only the one with the OLDEST airbyte_extracted_at timestamp, which preserves a clean, noise-free event stream.
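To make the difference concrete, here is a rough Python sketch of the existing append-dedup behaviour next to the proposed mode, run on an in-memory list of rows. The field names (`pk`, `cursor`, `extracted_at`, `status`) and both function names are hypothetical stand-ins I made up for illustration, not Airbyte's schema or the CDK API, and a real destination would presumably do this in SQL rather than in Python.

```python
# Hypothetical rows as plain append would store them: every synced record is kept.
rows = [
    {"pk": 1, "cursor": "2024-01-01", "extracted_at": "2024-01-02T00:00", "status": "new"},
    # Same PKey + cursor synced again: pure noise in an event stream.
    {"pk": 1, "cursor": "2024-01-01", "extracted_at": "2024-01-03T00:00", "status": "new"},
    # A real update: same PKey, newer cursor value.
    {"pk": 1, "cursor": "2024-01-05", "extracted_at": "2024-01-06T00:00", "status": "paid"},
    {"pk": 2, "cursor": "2024-01-04", "extracted_at": "2024-01-06T00:00", "status": "new"},
]


def append_dedup(records):
    """Existing behaviour as described above: one row per PKey, latest cursor wins."""
    latest = {}
    for rec in records:
        current = latest.get(rec["pk"])
        rank = (rec["cursor"], rec["extracted_at"])
        if current is None or rank > (current["cursor"], current["extracted_at"]):
            latest[rec["pk"]] = rec
    return sorted(latest.values(), key=lambda r: r["pk"])


def dedup_event_stream(records):
    """Proposed behaviour: one row per (PKey, cursor), oldest extracted_at wins."""
    kept = {}
    for rec in records:
        key = (rec["pk"], rec["cursor"])
        current = kept.get(key)
        if current is None or rec["extracted_at"] < current["extracted_at"]:
            kept[key] = rec
    return sorted(kept.values(), key=lambda r: (r["pk"], r["cursor"]))


current_state = append_dedup(rows)    # 2 rows: history of PKey 1 is lost
events = dedup_event_stream(rows)     # 3 rows: history kept, re-sync noise dropped
```

With this sample data, plain append holds 4 rows, append-dedup collapses them to 2, and the proposed mode keeps 3: both versions of PKey 1 survive, but the re-synced copy of the first version does not.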
PROS:
CONS: