
Commit c28836d

First pass

1 parent 0cb8f3c commit c28836d

1 file changed: +2 -2 lines changed


src/guides/duplicate-data.md

Lines changed: 2 additions & 2 deletions

@@ -2,11 +2,11 @@
 title: Handling Duplicate Data
 ---

-Segment guarantees that 99% of your data won't have duplicates within a 24 hour look-back window. Warehouses and Data Lakes also have their own secondary deduplication process to ensure you store clean data.
+Segment guarantees that 99% of your data won't have duplicates within a 24 hour or longer look-back window. Warehouses and Data Lakes also have their own secondary deduplication process to ensure you store clean data.

 ## 99% deduplication

-Segment has a special deduplication service that sits behind the `api.segment.com` endpoint and attempts to drop 99% of duplicate data. Segment stores 24 hours worth of event `messageId`s, allowing Segment to deduplicate any data that appears within a 24 hour rolling window.
+Segment has a special deduplication service that sits behind the `api.segment.com` endpoint and attempts to drop 99% of duplicate data. Segment stores at least 24 hours worth of event `messageId`s, allowing Segment to deduplicate any data that appears within a 24 hour rolling window.

 Segment deduplicates on the event's `messageId`, _not_ on the contents of the event payload. Segment doesn't have a built-in way to deduplicate data over periods longer than 24 hours or for events that don't generate `messageId`s.
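The mechanism the diff describes — deduplicating on `messageId` within a rolling look-back window, ignoring payload contents — can be sketched roughly as follows. This is a hypothetical illustration only (the class name, in-memory dict, and eviction strategy are all assumptions for clarity), not Segment's actual service, which operates at a very different scale:

```python
import time


class MessageIdDeduplicator:
    """Sketch of a rolling-window dedupe keyed on messageId.

    Hypothetical illustration of the idea described in the docs:
    an event is a duplicate only if its messageId was already seen
    within the look-back window; payload contents are never compared.
    """

    def __init__(self, window_seconds=24 * 60 * 60):
        self.window = window_seconds
        self.seen = {}  # messageId -> timestamp when first seen

    def is_duplicate(self, message_id, now=None):
        now = time.time() if now is None else now
        # Evict messageIds whose first sighting fell out of the window.
        self.seen = {
            mid: ts for mid, ts in self.seen.items()
            if now - ts < self.window
        }
        if message_id in self.seen:
            return True
        self.seen[message_id] = now
        return False


dedupe = MessageIdDeduplicator()
print(dedupe.is_duplicate("evt-123", now=0))        # False: first sighting
print(dedupe.is_duplicate("evt-123", now=3600))     # True: inside 24h window
print(dedupe.is_duplicate("evt-123", now=100_000))  # False: window expired
```

Note the caveat this makes concrete: once a `messageId` ages out of the window, a replayed event with the same id is accepted again, which is why the docs say there is no built-in dedupe over periods longer than 24 hours.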

0 commit comments
