Commit 5ca4bc1

feat: de-dupe audit logs in collection pipeline
This change adjusts the NATS stream configuration to support a 10-minute de-duplication window. The NATS message ID is now set to the audit log ID, since that ID is unique across all audit logs.
1 parent 81d55c1 commit 5ca4bc1

2 files changed: +16 -1 lines changed

config/components/nats-streams/audit-stream.yaml

Lines changed: 5 additions & 1 deletion
@@ -34,7 +34,11 @@ spec:
 allowDirect: true
 
 # Deduplication window - prevents duplicate messages
-duplicateWindow: 2m
+# Extended to 10 minutes to handle webhook retries with exponential backoff
+# and Vector restarts. NATS uses message IDs (set to Kubernetes auditID) to
+# detect duplicates within this window, providing pipeline-level de-duplication
+# before events reach ClickHouse.
+duplicateWindow: 10m
 
 # Maximum number of consumers
 maxConsumers: 10
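
The comment on `duplicateWindow` ties the window length to webhook retries with exponential backoff. The sketch below works through that arithmetic for one hypothetical retry schedule (10s initial delay, doubling, 6 retries); none of these values come from the commit, and the real webhook backoff settings may differ. It also assumes the window is measured from the first accepted delivery of a message ID.

```python
from datetime import timedelta


def retry_offsets(base: timedelta, factor: float, attempts: int) -> list[timedelta]:
    """Return the time of each retry, measured from the first delivery attempt."""
    offsets = []
    elapsed = timedelta(0)
    delay = base
    for _ in range(attempts):
        elapsed += delay          # wait `delay`, then retry
        offsets.append(elapsed)
        delay *= factor           # exponential backoff: grow the next delay
    return offsets


def covered(offsets: list[timedelta], window: timedelta) -> int:
    """Count retries that arrive while the first delivery is still inside the window."""
    return sum(1 for offset in offsets if offset < window)


# Hypothetical schedule (an assumption, not taken from the commit).
offsets = retry_offsets(timedelta(seconds=10), 2.0, 6)

print([int(o.total_seconds()) for o in offsets])   # retry times in seconds
print(covered(offsets, timedelta(minutes=2)))      # retries caught by the old 2m window
print(covered(offsets, timedelta(minutes=10)))     # retries caught by the new 10m window
```

Under this assumed schedule the retries land at 10s, 30s, 70s, 150s, 310s and 630s, so the 2m window catches three of them and the 10m window catches five. A retry that arrives after the window has elapsed would still be stored as a new message.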

config/components/vector-sidecar/vector-sidecar-hr.yaml

Lines changed: 11 additions & 0 deletions
@@ -121,6 +121,17 @@ spec:
 healthcheck:
   enabled: true
 
+# NATS message ID for de-duplication
+# Uses the Kubernetes auditID as the NATS message ID to enable
+# JetStream's duplicate detection within the duplicateWindow (10m)
+# This prevents duplicate events from webhook retries or Vector restarts
+message_id:
+  type: vrl
+  source: |
+    # Use the Kubernetes audit ID as the NATS message ID
+    # NATS will reject duplicates within the duplicateWindow
+    to_string!(.auditID)
+
 # Buffer for durability
 # 10GB disk buffer to survive NATS outages
 buffer:
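
To illustrate the behaviour the `message_id` comments describe, here is a minimal in-memory model of ID-based de-duplication over a time window. It is a toy sketch, not JetStream's implementation: the `DedupWindow` class, the `audit-id-*` values, and the timestamps are all hypothetical, and it assumes an ID is forgotten once the window has elapsed since it was first accepted.

```python
from datetime import datetime, timedelta


class DedupWindow:
    """Toy model of message-ID de-duplication over a fixed time window."""

    def __init__(self, window: timedelta) -> None:
        self.window = window
        self._seen: dict[str, datetime] = {}  # message ID -> time first accepted

    def accept(self, msg_id: str, now: datetime) -> bool:
        """Return True if the message is stored, False if dropped as a duplicate."""
        # Forget IDs whose first acceptance has aged out of the window.
        self._seen = {i: t for i, t in self._seen.items() if now - t < self.window}
        if msg_id in self._seen:
            return False
        self._seen[msg_id] = now
        return True


start = datetime(2024, 1, 1, 12, 0, 0)
dedup = DedupWindow(timedelta(minutes=10))
results = [
    dedup.accept("audit-id-1", start),                          # first delivery
    dedup.accept("audit-id-1", start + timedelta(minutes=3)),   # retry inside the window
    dedup.accept("audit-id-2", start + timedelta(minutes=4)),   # a different event
    dedup.accept("audit-id-1", start + timedelta(minutes=11)),  # retry after the window
]
print(results)
```

In this model the retry at 3 minutes is dropped, while the one at 11 minutes is accepted again, which is why any duplicate arriving later than the window would still need handling downstream.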
