-
Notifications
You must be signed in to change notification settings - Fork 121
Anonymize user id for outgoing links data. #261
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: staging
Are you sure you want to change the base?
Conversation
|
@sbansal1999 is attempting to deploy a commit to the Databuddy OSS Team on Vercel. A member of the Team first needs to authorize it. |
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. Walkthrough
Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Greptile SummaryApplied anonymization to outgoing link events using daily rotating salt, matching the pattern already implemented for track events in PR #260.
Confidence Score: 5/5
Important Files Changed
Sequence DiagramsequenceDiagram
participant Client
participant basket.ts
participant event-service.ts
participant security.ts
participant Redis
participant Kafka
Client->>basket.ts: POST /batch (outgoing_link)
basket.ts->>basket.ts: processOutgoingLinkData()
basket.ts->>security.ts: getDailySalt()
security.ts->>Redis: get salt:currentDay
alt Salt exists
Redis-->>security.ts: return cached salt
else Salt missing
security.ts->>security.ts: generate new salt
security.ts->>Redis: setex salt:currentDay
Redis-->>security.ts: OK
end
security.ts-->>basket.ts: return salt
basket.ts->>basket.ts: sanitizeString(anonymousId)
alt anonymousId is truthy
basket.ts->>security.ts: saltAnonymousId(anonymousId, salt)
security.ts->>security.ts: sha256(anonymousId + salt)
security.ts-->>basket.ts: hashed anonymousId
end
basket.ts->>basket.ts: build CustomOutgoingLink object
basket.ts->>event-service.ts: insertOutgoingLinksBatch()
event-service.ts->>Kafka: sendEventBatch("analytics-outgoing-links")
Kafka-->>event-service.ts: success
event-service.ts-->>basket.ts: success
basket.ts-->>Client: 200 OK
Note over Client,Kafka: Single event path (POST / or /px.jpg)
Client->>event-service.ts: insertOutgoingLink()
event-service.ts->>security.ts: checkDuplicate()
security.ts->>Redis: set dedup:outgoing_link:eventId
Redis-->>security.ts: result
security.ts-->>event-service.ts: isDuplicate
alt not duplicate
event-service.ts->>security.ts: getDailySalt()
security.ts-->>event-service.ts: salt
event-service.ts->>event-service.ts: sanitizeString(anonymousId)
alt anonymousId is truthy
event-service.ts->>security.ts: saltAnonymousId(anonymousId, salt)
security.ts-->>event-service.ts: hashed anonymousId
end
event-service.ts->>Kafka: sendEvent("analytics-outgoing-links")
Kafka-->>event-service.ts: success
end
event-service.ts-->>Client: success
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additional Comments (4)
-
apps/basket/src/routes/basket.ts, line 145 (link)logic: missing imports for
getDailySaltandsaltAnonymousId- this will cause runtime error -
apps/basket/src/routes/basket.ts, line 151 (link)logic: missing imports for
saltAnonymousId- this will cause runtime error -
apps/basket/src/routes/basket.ts, line 1-37 (link)syntax: add missing imports from
@lib/security -
apps/basket/src/routes/basket.ts, line 139-163 (link)logic: inconsistent anonymization -
processOutgoingLinkData(used by/batchendpoint) anonymizes user IDs, butinsertOutgoingLinkinevent-service.ts(used by/px.jpgand/endpoints) doesn't anonymize. this creates a data inconsistency where some outgoing link events have anonymized IDs and others don't. should the anonymization logic be moved to insertOutgoingLink in event-service.ts to ensure all endpoints anonymize consistently?
1 file reviewed, 4 comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
apps/basket/src/routes/basket.ts (1)
139-163: Inconsistent anonymization across event processing paths creates data correlation issues.The salting of
anonymous_idis applied inconsistently across different event ingestion routes:
insertTrackEvent(single event): applies saltingprocessTrackEventData(batch): does not apply saltinginsertOutgoingLink(single event): does not apply saltingprocessOutgoingLinkData(batch): applies saltingThis means the same user tracked via different routes will have different
anonymous_idvalues—some salted, some not—breaking cross-event correlation and analytics capabilities.Fix: Apply salting consistently across all anonymous_id handling, regardless of ingestion route (single vs. batch) and event type. Alternatively, if different handling is intentional, document the rationale in code comments.
📜 Review details
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
apps/basket/src/routes/basket.ts
🧰 Additional context used
📓 Path-based instructions (9)
**/*.{ts,tsx,js,jsx,vue}
📄 CodeRabbit inference engine (.cursor/rules/guidelines.mdc)
**/*.{ts,tsx,js,jsx,vue}: Never block paste in<input>or<textarea>elements
Enter submits focused text input; in<textarea>, ⌘/Ctrl+Enter submits; Enter adds newline
Compatible with password managers and 2FA; allow pasting one-time codes
Files:
apps/basket/src/routes/basket.ts
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (.cursor/rules/guidelines.mdc)
**/*.{ts,tsx,js,jsx}: Trim values to handle text expansion and trailing spaces in form submissions
URL reflects state (deep-link filters/tabs/pagination/expanded panels); prefer libraries likenuqs
Back/Forward buttons restore scroll position
Delay first tooltip in a group; subsequent peers have no delay
Use locale-aware formatting for dates, times, numbers, and currency
Batch layout reads/writes; avoid unnecessary reflows/repaints
Virtualize large lists using libraries likevirtua
**/*.{ts,tsx,js,jsx}: Don't useaccessKeyattribute on any HTML element.
Don't setaria-hidden="true"on focusable elements.
Don't add ARIA roles, states, and properties to elements that don't support them.
Don't use distracting elements like<marquee>or<blink>.
Only use thescopeprop on<th>elements.
Don't assign non-interactive ARIA roles to interactive HTML elements.
Make sure label elements have text content and are associated with an input.
Don't assign interactive ARIA roles to non-interactive HTML elements.
Don't assigntabIndexto non-interactive HTML elements.
Don't use positive integers fortabIndexproperty.
Don't include "image", "picture", or "photo" in img alt prop.
Don't use explicit role property that's the same as the implicit/default role.
Make static elements with click handlers use a valid role attribute.
Always include atitleelement for SVG elements.
Give all elements requiring alt text meaningful information for screen readers.
Make sure anchors have content that's accessible to screen readers.
AssigntabIndexto non-interactive HTML elements witharia-activedescendant.
Include all required ARIA attributes for elements with ARIA roles.
Make sure ARIA properties are valid for the element's supported roles.
Always include atypeattribute for button elements.
Make elements with interactive roles and handlers focusable.
Give heading elements content that's accessible to screen readers (not hidden witharia-hidden).
Always include...
Files:
apps/basket/src/routes/basket.ts
**/*.{ts,tsx,js,jsx,css,scss,sass,less}
📄 CodeRabbit inference engine (.cursor/rules/guidelines.mdc)
**/*.{ts,tsx,js,jsx,css,scss,sass,less}: During drag operations, disable text selection and setinerton dragged element and containers
Animations must be interruptible and input-driven; avoid autoplay
Files:
apps/basket/src/routes/basket.ts
**/*.{ts,tsx,jsx}
📄 CodeRabbit inference engine (.cursor/rules/ultracite.mdc)
**/*.{ts,tsx,jsx}: Use semantic elements instead of role attributes in JSX.
Don't use unnecessary fragments.
Don't pass children as props.
Don't use the return value of React.render.
Make sure all dependencies are correctly specified in React hooks.
Make sure all React hooks are called from the top level of component functions.
Don't forget key props in iterators and collection literals.
Don't destructure props inside JSX components in Solid projects.
Don't define React components inside other components.
Don't use event handlers on non-interactive elements.
Don't assign to React component props.
Don't use bothchildrenanddangerouslySetInnerHTMLprops on the same element.
Don't use dangerous JSX props.
Don't use Array index in keys.
Don't insert comments as text nodes.
Don't assign JSX properties multiple times.
Don't add extra closing tags for components without children.
Use<>...</>instead of<Fragment>...</Fragment>.
Watch out for possible "wrong" semicolons inside JSX elements.
Make sure void (self-closing) elements don't have children.
Don't usetarget="_blank"withoutrel="noopener".
Don't use<img>elements in Next.js projects.
Don't use<head>elements in Next.js projects.
Files:
apps/basket/src/routes/basket.ts
**/*.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/ultracite.mdc)
**/*.{ts,tsx}: Don't use primitive type aliases or misleading types.
Don't use empty type parameters in type aliases and interfaces.
Don't use any or unknown as type constraints.
Don't return a value from a function with the return type 'void'.
Don't use the TypeScript directive @ts-ignore.
Don't use TypeScript enums.
Don't add type annotations to variables, parameters, and class properties that are initialized with literal expressions.
Don't use TypeScript namespaces.
Don't use non-null assertions with the!postfix operator.
Don't use parameter properties in class constructors.
Don't use user-defined types.
Useas constinstead of literal types and type annotations.
Use eitherT[]orArray<T>consistently.
Initialize each enum member value explicitly.
Useexport typefor types.
Useimport typefor types.
Make sure all enum members are literal values.
Don't use TypeScript const enum.
Don't declare empty interfaces.
Don't let variables evolve into any type through reassignments.
Don't use the any type.
Don't misuse the non-null assertion operator (!) in TypeScript files.
Don't use implicit any type on variable declarations.
Don't merge interfaces and classes unsafely.
Don't use overload signatures that aren't next to each other.
Use the namespace keyword instead of the module keyword to declare TypeScript namespaces.
Use consistent accessibility modifiers on class properties and methods.
Use function types instead of object types with call signatures.
Don't use void type outside of generic or return types.
**/*.{ts,tsx}: Do NOT use types 'any', 'unknown' or 'never'. Use proper explicit types
Suffix functions with 'Action' in types, like 'type Test = { testAction }'
Files:
apps/basket/src/routes/basket.ts
!(**/pages/_document.{ts,tsx,jsx})**/*.{ts,tsx,jsx}
📄 CodeRabbit inference engine (.cursor/rules/ultracite.mdc)
Don't import next/document outside of pages/_document.jsx in Next.js projects.
Files:
apps/basket/src/routes/basket.ts
**/*.{ts,tsx,html,css}
📄 CodeRabbit inference engine (.cursor/rules/use-bun-instead-of-node-vite-npm-pnpm.mdc)
Use
bun build <file.html|file.ts|file.css>instead ofwebpackoresbuildfor bundling
Files:
apps/basket/src/routes/basket.ts
**/*.{js,jsx,ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/01-MUST-DO.mdc)
**/*.{js,jsx,ts,tsx}: Split off components, utils, and reusable code to ensure better loading speed and less complexity
Use lower-case-like-this naming convention for variables, functions, and identifiers
NEVER add placeholders, mock data, or anything similar to production code
Use Dayjs for date handling, NEVER use date-fns. Use Tanstack query for hooks, NEVER use SWR
Use json.stringify() when adding debugging code
Never use barrel exports or create index files
Files:
apps/basket/src/routes/basket.ts
**/*.{js,ts}
📄 CodeRabbit inference engine (.cursor/rules/01-MUST-DO.mdc)
Handle complex data transformations independently of React. Keep modules decoupled from React for improved modularity and testability
Files:
apps/basket/src/routes/basket.ts
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Greptile Review
|
@greptile review this PR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
3 files reviewed, 3 comments
apps/basket/src/routes/basket.ts
Outdated
| let anonymousId = sanitizeString( | ||
| linkData.anonymousId, | ||
| VALIDATION_LIMITS.SHORT_STRING_MAX_LENGTH | ||
| ); | ||
| anonymousId = saltAnonymousId(anonymousId, salt); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing empty check before salting anonymous ID. The insertTrackEvent function in event-service.ts (lines 64-70) checks if anonymousId is truthy before calling saltAnonymousId(). Without this check, an empty string will be hashed with the salt, creating a hash of just the salt, which could lead to data quality issues.
| let anonymousId = sanitizeString( | |
| linkData.anonymousId, | |
| VALIDATION_LIMITS.SHORT_STRING_MAX_LENGTH | |
| ); | |
| anonymousId = saltAnonymousId(anonymousId, salt); | |
| let anonymousId = sanitizeString( | |
| linkData.anonymousId, | |
| VALIDATION_LIMITS.SHORT_STRING_MAX_LENGTH | |
| ); | |
| if (anonymousId) { | |
| anonymousId = saltAnonymousId(anonymousId, salt); | |
| } |
Prompt To Fix With AI
This is a comment left during a code review.
Path: apps/basket/src/routes/basket.ts
Line: 152:156
Comment:
Missing empty check before salting anonymous ID. The `insertTrackEvent` function in `event-service.ts` (lines 64-70) checks if `anonymousId` is truthy before calling `saltAnonymousId()`. Without this check, an empty string will be hashed with the salt, creating a hash of just the salt, which could lead to data quality issues.
```suggestion
let anonymousId = sanitizeString(
linkData.anonymousId,
VALIDATION_LIMITS.SHORT_STRING_MAX_LENGTH
);
if (anonymousId) {
anonymousId = saltAnonymousId(anonymousId, salt);
}
```
How can I resolve this? If you propose a fix, please make it concise.
Additional Comments (2)
Prompt To Fix With AIThis is a comment left during a code review.
Path: apps/basket/src/routes/basket.ts
Line: 70:74
Comment:
Missing empty check before salting anonymous ID in `processTrackEventData`. The `insertTrackEvent` function in `event-service.ts` (lines 64-70) checks if `anonymousId` is truthy before calling `saltAnonymousId()`. This function should follow the same pattern for consistency.
```suggestion
let anonymousId = sanitizeString(
trackData.anonymousId,
VALIDATION_LIMITS.SHORT_STRING_MAX_LENGTH
);
if (anonymousId) {
anonymousId = saltAnonymousId(anonymousId, salt);
}
```
How can I resolve this? If you propose a fix, please make it concise.
This creates a data inconsistency where:
This will make it impossible to correlate events properly in analytics and violates the stated goal of anonymizing user IDs for outgoing links. Recommendation: The anonymization logic should be moved into Prompt To Fix With AIThis is a comment left during a code review.
Path: apps/basket/src/routes/basket.ts
Line: 144:168
Comment:
Critical inconsistency: This PR only anonymizes user IDs for outgoing links coming through the `/batch` endpoint. However, outgoing link events from the `/px.jpg` endpoint (line 198) and `/` endpoint (line 443) call `insertOutgoingLink()` directly, which does NOT anonymize the `anonymous_id` (see `event-service.ts` lines 207-210).
This creates a data inconsistency where:
- Events from `/batch` → anonymized IDs (hashed with daily salt)
- Events from `/px.jpg` and `/` → raw, non-anonymized IDs
This will make it impossible to correlate events properly in analytics and violates the stated goal of anonymizing user IDs for outgoing links.
**Recommendation**: The anonymization logic should be moved into `insertOutgoingLink()` in `event-service.ts` to ensure all endpoints handle user IDs consistently, similar to how `insertTrackEvent()` already handles anonymization.
How can I resolve this? If you propose a fix, please make it concise. |
|
check greptile's comments if possible @sbansal1999 |
|
@greptile review this PR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additional Comments (1)
-
apps/basket/src/routes/basket.ts, line 146-172 (link)logic: Incomplete anonymization:
insertOutgoingLinkinevent-service.ts:183-226doesn't anonymizeanonymousId(it only sanitizes on lines 207-210). This creates inconsistent behavior where:/batchendpoint → usesprocessOutgoingLinkData→ anonymizes ✓/and/px.jpgendpoints → useinsertOutgoingLink→ doesn't anonymize ✗
This means outgoing link data sent via single events still has raw
anonymousIdvalues, defeating the purpose of this PR.// In event-service.ts insertOutgoingLink, add: const salt = await getDailySalt(); let anonymousId = sanitizeString( linkData.anonymousId, VALIDATION_LIMITS.SHORT_STRING_MAX_LENGTH ); if (anonymousId) { anonymousId = saltAnonymousId(anonymousId, salt); }
1 file reviewed, 1 comment
|
@greptile review this PR |
|
@izadoesdev this can be merged now |
Description
Anonymize user id for outgoing links data.
Checklist
Summary by CodeRabbit
New Features
Bug Fixes
Behavior
✏️ Tip: You can customize this high-level summary in your review settings.
Note
Strengthens privacy for outgoing link events by salting
anonymous_idwith a daily salt and avoiding salting empty IDs.event-service.tsandroutes/basket.ts, deriveanonymous_idfromanonymousIdonly after retrieving the daily salt and applysaltAnonymousIdconditionally when the ID existsprocessOutgoingLinkDatais nowasyncto fetch the daily salt; outgoing link event construction updated accordinglyWritten by Cursor Bugbot for commit b886d1b. This will update automatically on new commits. Configure here.