-
Notifications
You must be signed in to change notification settings - Fork 121
feat: anonymize user id for batch events processing. #260
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@sbansal1999 is attempting to deploy a commit to the Databuddy OSS Team on Vercel. A member of the Team first needs to authorize it. |
WalkthroughThe basket route now incorporates security enhancements for anonymous user tracking by introducing salt-based ID hashing. During track event processing, a daily salt is retrieved and applied to sanitized anonymous IDs alongside existing geolocation and user-agent data collection. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: Repository UI Review profile: ASSERTIVE Plan: Pro 📒 Files selected for processing (1)
🧰 Additional context used📓 Path-based instructions (9)**/*.{ts,tsx,js,jsx,vue}📄 CodeRabbit inference engine (.cursor/rules/guidelines.mdc)
Files:
**/*.{ts,tsx,js,jsx}📄 CodeRabbit inference engine (.cursor/rules/guidelines.mdc)
Files:
**/*.{ts,tsx,js,jsx,css,scss,sass,less}📄 CodeRabbit inference engine (.cursor/rules/guidelines.mdc)
Files:
**/*.{ts,tsx,jsx}📄 CodeRabbit inference engine (.cursor/rules/ultracite.mdc)
Files:
**/*.{ts,tsx}📄 CodeRabbit inference engine (.cursor/rules/ultracite.mdc)
Files:
!(**/pages/_document.{ts,tsx,jsx})**/*.{ts,tsx,jsx}📄 CodeRabbit inference engine (.cursor/rules/ultracite.mdc)
Files:
**/*.{ts,tsx,html,css}📄 CodeRabbit inference engine (.cursor/rules/use-bun-instead-of-node-vite-npm-pnpm.mdc)
Files:
**/*.{js,jsx,ts,tsx}📄 CodeRabbit inference engine (.cursor/rules/01-MUST-DO.mdc)
Files:
**/*.{js,ts}📄 CodeRabbit inference engine (.cursor/rules/01-MUST-DO.mdc)
Files:
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
🔇 Additional comments (4)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Greptile SummaryThis PR adds salt-based anonymization to the Key Changes:
Critical Issue:
Confidence Score: 2/5
Important Files Changed
Sequence DiagramsequenceDiagram
participant Client
participant BatchEndpoint
participant processTrackEventData
participant processOutgoingLinkData
participant getDailySalt
participant saltAnonymousId
participant DB
Client->>BatchEndpoint: POST /batch [track events, outgoing_link events]
loop For each track event
BatchEndpoint->>processTrackEventData: process track event
processTrackEventData->>getDailySalt: fetch daily salt
getDailySalt-->>processTrackEventData: return salt
processTrackEventData->>saltAnonymousId: anonymize(anonymousId, salt)
saltAnonymousId-->>processTrackEventData: hashed anonymousId
processTrackEventData-->>BatchEndpoint: anonymized track event
end
loop For each outgoing_link event
BatchEndpoint->>processOutgoingLinkData: process outgoing link
Note over processOutgoingLinkData: ⚠️ Missing: getDailySalt + saltAnonymousId
processOutgoingLinkData-->>BatchEndpoint: NON-anonymized link event
end
BatchEndpoint->>DB: insert batch (track events + link events)
DB-->>BatchEndpoint: success
BatchEndpoint-->>Client: batch response
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additional Comments (2)
-
apps/basket/src/routes/basket.ts, line 144-156 (link)logic:
processOutgoingLinkDatashould also anonymizeanonymous_idwith salt, consistent withprocessTrackEventDataandinsertOutgoingLink -
apps/basket/src/routes/basket.ts, line 595 (link)logic: after updating
processOutgoingLinkDatato accept salt parameter, pass salt here (fetch it before the loop, similar to howprocessTrackEventDatadoes internally)Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
1 file reviewed, 2 comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
apps/basket/src/routes/basket.ts (2)
144-163: Inconsistent anonymization: outgoing links not salted.The
processOutgoingLinkDatafunction directly useslinkData.anonymousIdwithout applying the same salt-based anonymization that was added toprocessTrackEventData. This creates an inconsistency in how anonymous IDs are handled across different event types.If anonymous IDs should be consistently anonymized for privacy/security reasons, apply the same salting logic here. If outgoing links intentionally require different handling, please document why.
🔎 Proposed fix to add consistent anonymization
function processOutgoingLinkData( linkData: any, - clientId: string + clientId: string, + salt: string ): CustomOutgoingLink { const timestamp = parseTimestamp(linkData.timestamp); + + let anonymousId = sanitizeString( + linkData.anonymousId, + VALIDATION_LIMITS.SHORT_STRING_MAX_LENGTH + ); + anonymousId = saltAnonymousId(anonymousId, salt); return { id: randomUUID(), client_id: clientId, - anonymous_id: sanitizeString( - linkData.anonymousId, - VALIDATION_LIMITS.SHORT_STRING_MAX_LENGTH - ), + anonymous_id: anonymousId, session_id: validateSessionId(linkData.sessionId), href: sanitizeString(linkData.href, VALIDATION_LIMITS.PATH_MAX_LENGTH), text: sanitizeString(linkData.text, VALIDATION_LIMITS.TEXT_MAX_LENGTH), properties: parseProperties(linkData.properties), timestamp, }; }Then update the call site at line 595 to pass the salt (you'll need to retrieve it similarly to how it's done in
processTrackEventData).
40-45: Use explicit types instead ofany.The
trackDataparameter uses theanytype, which violates the coding guideline: "Do NOT use types 'any', 'unknown' or 'never'. Use proper explicit types". Define a proper interface or type for the track event data structure.As per coding guidelines.
🔎 Suggested type definition
Create a type definition for the track event data:
type TrackEventData = { eventId?: string; anonymousId: string; name: string; sessionId: string; timestamp?: number; sessionStartTime?: number; referrer?: string; path?: string; title?: string; screen_resolution?: string; viewport_size?: string; language?: string; timezone?: string; connection_type?: string; rtt?: number; downlink?: number; time_on_page?: number; scroll_depth?: number; interaction_count?: number; page_count?: number; utm_source?: string; utm_medium?: string; utm_campaign?: string; utm_term?: string; utm_content?: string; load_time?: number; dom_ready_time?: number; dom_interactive?: number; ttfb?: number; connection_time?: number; render_time?: number; redirect_time?: number; domain_lookup_time?: number; properties?: unknown; };Then update the function signature:
function processTrackEventData( - trackData: any, + trackData: TrackEventData, clientId: string, userAgent: string, ip: string ): Promise<AnalyticsEvent> {
📜 Review details
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
apps/basket/src/routes/basket.ts
🧰 Additional context used
📓 Path-based instructions (9)
**/*.{ts,tsx,js,jsx,vue}
📄 CodeRabbit inference engine (.cursor/rules/guidelines.mdc)
**/*.{ts,tsx,js,jsx,vue}: Never block paste in<input>or<textarea>elements
Enter submits focused text input; in<textarea>, ⌘/Ctrl+Enter submits; Enter adds newline
Compatible with password managers and 2FA; allow pasting one-time codes
Files:
apps/basket/src/routes/basket.ts
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (.cursor/rules/guidelines.mdc)
**/*.{ts,tsx,js,jsx}: Trim values to handle text expansion and trailing spaces in form submissions
URL reflects state (deep-link filters/tabs/pagination/expanded panels); prefer libraries likenuqs
Back/Forward buttons restore scroll position
Delay first tooltip in a group; subsequent peers have no delay
Use locale-aware formatting for dates, times, numbers, and currency
Batch layout reads/writes; avoid unnecessary reflows/repaints
Virtualize large lists using libraries likevirtua
**/*.{ts,tsx,js,jsx}: Don't useaccessKeyattribute on any HTML element.
Don't setaria-hidden="true"on focusable elements.
Don't add ARIA roles, states, and properties to elements that don't support them.
Don't use distracting elements like<marquee>or<blink>.
Only use thescopeprop on<th>elements.
Don't assign non-interactive ARIA roles to interactive HTML elements.
Make sure label elements have text content and are associated with an input.
Don't assign interactive ARIA roles to non-interactive HTML elements.
Don't assigntabIndexto non-interactive HTML elements.
Don't use positive integers fortabIndexproperty.
Don't include "image", "picture", or "photo" in img alt prop.
Don't use explicit role property that's the same as the implicit/default role.
Make static elements with click handlers use a valid role attribute.
Always include atitleelement for SVG elements.
Give all elements requiring alt text meaningful information for screen readers.
Make sure anchors have content that's accessible to screen readers.
AssigntabIndexto non-interactive HTML elements witharia-activedescendant.
Include all required ARIA attributes for elements with ARIA roles.
Make sure ARIA properties are valid for the element's supported roles.
Always include atypeattribute for button elements.
Make elements with interactive roles and handlers focusable.
Give heading elements content that's accessible to screen readers (not hidden witharia-hidden).
Always include...
Files:
apps/basket/src/routes/basket.ts
**/*.{ts,tsx,js,jsx,css,scss,sass,less}
📄 CodeRabbit inference engine (.cursor/rules/guidelines.mdc)
**/*.{ts,tsx,js,jsx,css,scss,sass,less}: During drag operations, disable text selection and setinerton dragged element and containers
Animations must be interruptible and input-driven; avoid autoplay
Files:
apps/basket/src/routes/basket.ts
**/*.{ts,tsx,jsx}
📄 CodeRabbit inference engine (.cursor/rules/ultracite.mdc)
**/*.{ts,tsx,jsx}: Use semantic elements instead of role attributes in JSX.
Don't use unnecessary fragments.
Don't pass children as props.
Don't use the return value of React.render.
Make sure all dependencies are correctly specified in React hooks.
Make sure all React hooks are called from the top level of component functions.
Don't forget key props in iterators and collection literals.
Don't destructure props inside JSX components in Solid projects.
Don't define React components inside other components.
Don't use event handlers on non-interactive elements.
Don't assign to React component props.
Don't use bothchildrenanddangerouslySetInnerHTMLprops on the same element.
Don't use dangerous JSX props.
Don't use Array index in keys.
Don't insert comments as text nodes.
Don't assign JSX properties multiple times.
Don't add extra closing tags for components without children.
Use<>...</>instead of<Fragment>...</Fragment>.
Watch out for possible "wrong" semicolons inside JSX elements.
Make sure void (self-closing) elements don't have children.
Don't usetarget="_blank"withoutrel="noopener".
Don't use<img>elements in Next.js projects.
Don't use<head>elements in Next.js projects.
Files:
apps/basket/src/routes/basket.ts
**/*.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/ultracite.mdc)
**/*.{ts,tsx}: Don't use primitive type aliases or misleading types.
Don't use empty type parameters in type aliases and interfaces.
Don't use any or unknown as type constraints.
Don't return a value from a function with the return type 'void'.
Don't use the TypeScript directive @ts-ignore.
Don't use TypeScript enums.
Don't add type annotations to variables, parameters, and class properties that are initialized with literal expressions.
Don't use TypeScript namespaces.
Don't use non-null assertions with the!postfix operator.
Don't use parameter properties in class constructors.
Don't use user-defined types.
Useas constinstead of literal types and type annotations.
Use eitherT[]orArray<T>consistently.
Initialize each enum member value explicitly.
Useexport typefor types.
Useimport typefor types.
Make sure all enum members are literal values.
Don't use TypeScript const enum.
Don't declare empty interfaces.
Don't let variables evolve into any type through reassignments.
Don't use the any type.
Don't misuse the non-null assertion operator (!) in TypeScript files.
Don't use implicit any type on variable declarations.
Don't merge interfaces and classes unsafely.
Don't use overload signatures that aren't next to each other.
Use the namespace keyword instead of the module keyword to declare TypeScript namespaces.
Use consistent accessibility modifiers on class properties and methods.
Use function types instead of object types with call signatures.
Don't use void type outside of generic or return types.
**/*.{ts,tsx}: Do NOT use types 'any', 'unknown' or 'never'. Use proper explicit types
Suffix functions with 'Action' in types, like 'type Test = { testAction }'
Files:
apps/basket/src/routes/basket.ts
!(**/pages/_document.{ts,tsx,jsx})**/*.{ts,tsx,jsx}
📄 CodeRabbit inference engine (.cursor/rules/ultracite.mdc)
Don't import next/document outside of pages/_document.jsx in Next.js projects.
Files:
apps/basket/src/routes/basket.ts
**/*.{ts,tsx,html,css}
📄 CodeRabbit inference engine (.cursor/rules/use-bun-instead-of-node-vite-npm-pnpm.mdc)
Use
bun build <file.html|file.ts|file.css>instead ofwebpackoresbuildfor bundling
Files:
apps/basket/src/routes/basket.ts
**/*.{js,jsx,ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/01-MUST-DO.mdc)
**/*.{js,jsx,ts,tsx}: Split off components, utils, and reusable code to ensure better loading speed and less complexity
Use lower-case-like-this naming convention for variables, functions, and identifiers
NEVER add placeholders, mock data, or anything similar to production code
Use Dayjs for date handling, NEVER use date-fns. Use Tanstack query for hooks, NEVER use SWR
Use json.stringify() when adding debugging code
Never use barrel exports or create index files
Files:
apps/basket/src/routes/basket.ts
**/*.{js,ts}
📄 CodeRabbit inference engine (.cursor/rules/01-MUST-DO.mdc)
Handle complex data transformations independently of React. Keep modules decoupled from React for improved modularity and testability
Files:
apps/basket/src/routes/basket.ts
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Greptile Review
- GitHub Check: Run Tests
🔇 Additional comments (4)
apps/basket/src/routes/basket.ts (4)
18-18: LGTM: Security imports added correctly.The imports for
getDailySaltandsaltAnonymousIdare properly structured and aligned with the PR's objective to anonymize user IDs.
83-83: LGTM: Salted anonymous ID correctly applied.The
anonymous_idfield now uses the salted value, successfully implementing the anonymization feature as intended.
49-53: The salt retrieval is already optimized with Redis caching.The
getDailySalt()function is wrapped with thecacheable()utility, which caches the result in Redis. Since this function takes no arguments, the cache key is deterministic and consistent across all calls within the same batch. The first call will fetch from Redis (or generate and cache the value), and all subsequent calls will be cached hits. No changes needed.
70-74: No changes needed—edge case handling is already properly implemented.The code already handles edge cases gracefully:
sanitizeStringreturns an empty string (not undefined) when input is invalid, sosaltAnonymousIdalways receives a valid string.saltAnonymousIdincludes try-catch error handling with a fallback hash that still produces a valid output if the primary hashing operation fails.- Empty strings are hashed without error, maintaining the data flow integrity.
Description
Anonymize user id for batch events processing similar to how it's being done in insertTrackEvent.
Checklist
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.