src/guides/duplicate-data.md
3 additions & 0 deletions
@@ -16,5 +16,8 @@ Segment deduplicates on the event's `messageId`, _not_ on the contents of the e
## Warehouse deduplication
Duplicate events that are more than 24 hours apart from one another deduplicate in the Warehouse. Segment deduplicates messages going into a Warehouse based on the `messageId`, which is the `id` column in a Segment Warehouse.
+## Profiles Sync deduplication
+Segment deduplicates Profiles Sync messages based on the `messageId`, which is the `id` column in a Segment Warehouse. Duplicate Profiles Sync events that are more than 24 hours apart from one another deduplicate in the Warehouse.
+
## Data Lake deduplication
To ensure clean data in your Data Lake, Segment removes duplicate events at the time your Data Lake ingests data. The deduplication process covers the data the Data Lake synced within the last 7 days, and Segment dedupes that data based on the `messageId`.
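
As a rough illustration of what `messageId`-based deduplication over a time window means, here is a minimal Python sketch. It is not Segment's implementation: the function name `dedupe_by_message_id`, the dictionary layout of the events, and the choice to pass events older than the window through unchecked are all assumptions made for the example. Only the `messageId` key and the 7-day window come from the text above.

```python
from datetime import datetime, timedelta, timezone


def dedupe_by_message_id(events, now, window=timedelta(days=7)):
    """Keep the first event per messageId among events inside the window.

    Hypothetical helper: events older than `window` are passed through
    without being checked, which is an assumption of this sketch.
    """
    cutoff = now - window
    seen = set()
    kept = []
    for event in events:
        if event["receivedAt"] < cutoff:
            # Outside the window: not examined for duplicates here.
            kept.append(event)
            continue
        if event["messageId"] in seen:
            # Same messageId already kept inside the window: drop it.
            continue
        seen.add(event["messageId"])
        kept.append(event)
    return kept


now = datetime(2024, 1, 10, tzinfo=timezone.utc)
events = [
    {"messageId": "a1", "event": "Order Completed", "receivedAt": now - timedelta(days=2)},
    # Same messageId as the first event, so it is treated as a duplicate.
    {"messageId": "a1", "event": "Order Completed", "receivedAt": now - timedelta(days=1)},
    # Same contents but a different messageId, so it is kept.
    {"messageId": "b2", "event": "Order Completed", "receivedAt": now - timedelta(days=1)},
]

deduped = dedupe_by_message_id(events, now)
kept_ids = [e["messageId"] for e in deduped]
print(kept_ids)
```

The third event shows why the key matters: its contents match the first event, but because its `messageId` differs, a `messageId`-based check does not consider it a duplicate.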