
Commit 0986e58

consolidating redundant info
1 parent e8700d2 commit 0986e58

1 file changed (+1 addition, -4 deletions)


src/guides/duplicate-data.md

Lines changed: 1 addition & 4 deletions
@@ -14,10 +14,7 @@ Segment deduplicates on the event's `messageId`, _not_ on the contents of the event.
 > Keep in mind that Segment's libraries all generate `messageId`s for each event payload, with the exception of the Segment HTTP API, which assigns each event a unique `messageId` when the message is ingested. You can override these default generated IDs and manually assign a `messageId` if necessary.
 
 ## Warehouse deduplication
-Duplicate events that are more than 24 hours apart from one another deduplicate in the Warehouse. Segment deduplicates messages going into a Warehouse based on the `messageId`, which is the `id` column in a Segment Warehouse.
-
-## Profiles Sync deduplication
-Segment deduplicates Profiles Sync messages based on the `messageId`, which is the `id` column in a Segment Warehouse. Duplicate Profiles Sync events that are more than 24 hours apart from one another deduplicate in the Warehouse.
+Duplicate events that are more than 24 hours apart from one another deduplicate in the Warehouse. Segment deduplicates messages going into a Warehouse ([including Profiles Sync data](/docs/profiles/profiles-sync/)) based on the `messageId`, which is the `id` column in a Segment Warehouse.
 
 ## Data Lake deduplication
 To ensure clean data in your Data Lake, Segment removes duplicate events at the time your Data Lake ingests data. The Data Lake deduplication process dedupes the data the Data Lake syncs within the last 7 days with Segment deduping the data based on the `messageId`.
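
The consolidated paragraph leans on `messageId` being the deduplication key, so a producer that wants retry safety can assign its own ID, as the note in the diff allows. Below is a minimal Python sketch of that pattern, assuming the Segment HTTP Tracking API accepts an explicit `messageId` in the request body; the write key, event name, and ID scheme are illustrative placeholders, not anything taken from the docs.

# Illustrative sketch only: re-sending an event with a fixed messageId so that
# Segment's messageId-based deduplication (the `id` column in a Warehouse) can
# drop the extra copy. Write key, event, and ID scheme are placeholders.
import requests

WRITE_KEY = "YOUR_WRITE_KEY"  # placeholder write key

event = {
    "userId": "user_123",
    "event": "Order Completed",
    "properties": {"order_id": "9a3f"},
    # Deterministic ID derived from the business key: a retry reuses the same
    # messageId instead of letting the library or API generate a fresh one.
    "messageId": "order-completed-9a3f",
}

for attempt in range(2):  # simulate an at-least-once delivery retry
    response = requests.post(
        "https://api.segment.io/v1/track",  # Segment HTTP Tracking API
        json=event,
        auth=(WRITE_KEY, ""),               # write key as the basic-auth username
        timeout=10,
    )
    response.raise_for_status()

Because both attempts carry the same `messageId`, at most one row with that `id` should land in the Warehouse (Profiles Sync data included), and the Data Lake's 7-day deduplication keys on the same field.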

0 commit comments
