Skip to content

feat(google_adwords_remarketing_lists): add hashing validation for audience data#5045

Open
shekhar-rudder wants to merge 1 commit intodevelopfrom
feat/int-6064-add-hashing-validation-for-google-ads-remarketing-lists
Open

feat(google_adwords_remarketing_lists): add hashing validation for audience data#5045
shekhar-rudder wants to merge 1 commit intodevelopfrom
feat/int-6064-add-hashing-validation-for-google-ads-remarketing-lists

Conversation

@shekhar-rudder
Copy link
Member

Summary

  • Add validateHashingConsistency checks for hashable fields (email, phone, etc.) in both EventStream and rETL record flows for the Google Ads Remarketing Lists destination
  • Pass workspaceId and destinationId through the call chain to enable hashing validation (detects pre-hashed data sent with hashing enabled, and plaintext data sent with hashing disabled)
  • Add integration tests covering pre-hashed input with isHashRequired: true and plaintext input with isHashRequired: false scenarios

Test plan

  • Existing router tests pass
  • New test: pre-hashed email/phone with isHashRequired: true returns 400 with appropriate error when AUDIENCE_HASHING_VALIDATION_ENABLED=true
  • New test: plaintext email/phone with isHashRequired: false returns 400 with appropriate error when AUDIENCE_HASHING_VALIDATION_ENABLED=true
  • TypeScript types updated to include ID on destination and workspaceId on metadata

🤖 Generated with Claude Code

@shekhar-rudder shekhar-rudder requested a review from a team as a code owner March 16, 2026 11:38
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 16, 2026

Note

.coderabbit.yaml has unrecognized properties

CodeRabbit is using all valid settings from your configuration. Unrecognized properties (listed below) have been ignored and may indicate typos or deprecated fields that can be removed.

⚠️ Parsing warnings (1)
Validation error: Unrecognized key(s) in object: 'auto_resolve_threads'
⚙️ Configuration instructions
  • Please see the configuration documentation for more information.
  • You can also validate your configuration using the online YAML validator.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f429c0e0-049e-4fd6-9909-edae30d39411

📥 Commits

Reviewing files that changed from the base of the PR and between 4f9ad79 and 944d804.

⛔ Files ignored due to path filters (2)
  • test/integrations/destinations/google_adwords_remarketing_lists/router/data.ts is excluded by !**/test/**
  • test/integrations/destinations/google_adwords_remarketing_lists/router/record.ts is excluded by !**/test/**
📒 Files selected for processing (4)
  • src/v0/destinations/google_adwords_remarketing_lists/recordTransform.ts
  • src/v0/destinations/google_adwords_remarketing_lists/transform.ts
  • src/v0/destinations/google_adwords_remarketing_lists/types.ts
  • src/v0/destinations/google_adwords_remarketing_lists/util.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/v0/destinations/google_adwords_remarketing_lists/recordTransform.ts

Summary by CodeRabbit

  • New Features
    • Payloads and processing now include workspace context and destination identifiers for more accurate, scoped handling.
  • Bug Fixes
    • Added hashing consistency validation for Google Ads Remarketing Lists identifiers to improve data accuracy.
  • Chores
    • Tightened metadata and destination typing to enforce workspace and destination ID presence.

Walkthrough

Threads workspaceId and destination ID through Google Ads Remarketing Lists transform and record pipelines, updates type signatures, and adds per-field hashing-consistency runtime validation during identifier population.

Changes

Cohort / File(s) Summary
Types
src/v0/destinations/google_adwords_remarketing_lists/types.ts
Added ID: string to destination shapes and narrowed metadata to { workspaceId: string }.
Transform & Record Transform
src/v0/destinations/google_adwords_remarketing_lists/transform.ts, src/v0/destinations/google_adwords_remarketing_lists/recordTransform.ts
Threaded workspaceId through createPayload, processEvent, process, and processRecordEventArray call paths; updated signatures and metadata typing to carry workspace context.
Identifier Population & Validation
src/v0/destinations/google_adwords_remarketing_lists/util.ts
Extended populateIdentifiers and populateIdentifiersForRecordEvent signatures to accept workspaceId and destinationId; added per-field validateHashingConsistency checks using an audienceDest context derived from workspaceId, destinationId, type, and config.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately reflects the primary change: adding hashing validation for audience data in the Google Ads Remarketing Lists destination.
Description check ✅ Passed The description provides clear explanation of changes, test plans, and technical implementation details, covering most template requirements comprehensively.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/int-6064-add-hashing-validation-for-google-ads-remarketing-lists
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

…dience data

Add validateHashingConsistency checks for hashable fields (email, phone, etc.)
in both EventStream and rETL record flows. Passes workspaceId and destinationId
to enable consistent hashing validation across the destination.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@shekhar-rudder shekhar-rudder force-pushed the feat/int-6064-add-hashing-validation-for-google-ads-remarketing-lists branch from 4f9ad79 to 944d804 Compare March 16, 2026 11:44
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
src/v0/destinations/google_adwords_remarketing_lists/util.ts (1)

99-103: Consider extracting duplicated hash-field validation into a shared helper.

The same hashAttributes.forEach(...) + validateHashingConsistency(...) block appears twice. A local helper would reduce drift risk and keep behavior aligned across both flows.

Also applies to: 148-152

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/v0/destinations/google_adwords_remarketing_lists/util.ts` around lines 99
- 103, Extract the repeated block that iterates hashAttributes and calls
validateHashingConsistency into a small helper (e.g.,
validateElementHashAttributes or validateHashAttributesForElement) and call it
from both places where the duplication occurs; the helper should accept the
element and audienceDest (and reference the existing hashAttributes and
validateHashingConsistency) and perform the same guard (if element[field]) and
String conversion before calling validateHashingConsistency to preserve existing
behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/v0/destinations/google_adwords_remarketing_lists/recordTransform.ts`:
- Around line 114-115: The code currently reads a single workspaceId from
metadata (workspaceId = metadata.workspaceId) and reuses it for the whole batch,
causing cross-workspace mixing when building identifiers; change the logic in
recordTransform to derive the workspace context per record (use each
record/event's metadata.workspaceId) or pre-group the batch by
metadata.workspaceId before calling the identifier-building code (the places to
update are the workspaceId read and the identifier construction block that
currently reuses workspaceId around the 135-140 region), ensuring each
identifier is built with the correct per-record workspaceId rather than a single
batch-wide value.

In `@src/v0/destinations/google_adwords_remarketing_lists/transform.ts`:
- Line 107: The processEvent function's metadata type is too narrow—update the
metadata type used by processEvent to include the secret field (e.g., change
from { workspaceId: string } to { workspaceId: string; secret: string } or an
appropriate Metadata interface) so that getAccessToken(metadata, 'access_token')
can safely destructure metadata.secret; ensure any related type aliases or
parameter signatures for processEvent reflect this widened shape.

In `@src/v0/destinations/google_adwords_remarketing_lists/types.ts`:
- Around line 40-42: The RecordInput.metadata type is too narrow (only {
workspaceId: string }) but getAccessToken expects a secret map; update the
metadata shape in types.ts to include the secret property (e.g., metadata: {
workspaceId: string; secret: Record<string, string | undefined> } or a suitably
named SecretMap type) so calls in recordTransform (getAccessToken usage) and
transform that destructure { secret } and index by accessTokenKey are correctly
typed; ensure any existing references are adjusted for the new metadata shape.

---

Nitpick comments:
In `@src/v0/destinations/google_adwords_remarketing_lists/util.ts`:
- Around line 99-103: Extract the repeated block that iterates hashAttributes
and calls validateHashingConsistency into a small helper (e.g.,
validateElementHashAttributes or validateHashAttributesForElement) and call it
from both places where the duplication occurs; the helper should accept the
element and audienceDest (and reference the existing hashAttributes and
validateHashingConsistency) and perform the same guard (if element[field]) and
String conversion before calling validateHashingConsistency to preserve existing
behavior.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f67ab2b0-2010-4a6c-8b57-d2aa4d0acf4a

📥 Commits

Reviewing files that changed from the base of the PR and between b9b3411 and 4f9ad79.

⛔ Files ignored due to path filters (2)
  • test/integrations/destinations/google_adwords_remarketing_lists/router/data.ts is excluded by !**/test/**
  • test/integrations/destinations/google_adwords_remarketing_lists/router/record.ts is excluded by !**/test/**
📒 Files selected for processing (4)
  • src/v0/destinations/google_adwords_remarketing_lists/recordTransform.ts
  • src/v0/destinations/google_adwords_remarketing_lists/transform.ts
  • src/v0/destinations/google_adwords_remarketing_lists/types.ts
  • src/v0/destinations/google_adwords_remarketing_lists/util.ts

Comment on lines +114 to +115
const workspaceId = metadata.workspaceId;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Workspace context is taken from only the first event in the batch.

At Line 114 and Line 135-140, a single workspaceId is reused for all records. This can mis-attribute hashing inconsistency validation for mixed-workspace batches.

Please derive/pass workspace context per record (or pre-group by workspace before building identifiers).

Also applies to: 135-140

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/v0/destinations/google_adwords_remarketing_lists/recordTransform.ts`
around lines 114 - 115, The code currently reads a single workspaceId from
metadata (workspaceId = metadata.workspaceId) and reuses it for the whole batch,
causing cross-workspace mixing when building identifiers; change the logic in
recordTransform to derive the workspace context per record (use each
record/event's metadata.workspaceId) or pre-group the batch by
metadata.workspaceId before calling the identifier-building code (the places to
update are the workspaceId read and the identifier construction block that
currently reuses workspaceId around the 135-140 region), ensuring each
identifier is built with the correct per-record workspaceId rather than a single
batch-wide value.


const processEvent = async (
metadata: Record<string, unknown>,
metadata: { workspaceId: string },
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

fd "transform.ts" -p "*google_adwords_remarketing_lists*"

Repository: rudderlabs/rudder-transformer

Length of output: 194


🏁 Script executed:

fd "transform.ts" | grep "google_adwords_remarketing_lists"

Repository: rudderlabs/rudder-transformer

Length of output: 210


🏁 Script executed:

wc -l "src/v0/destinations/google_adwords_remarketing_lists/transform.ts"

Repository: rudderlabs/rudder-transformer

Length of output: 142


🏁 Script executed:

cat -n "src/v0/destinations/google_adwords_remarketing_lists/transform.ts"

Repository: rudderlabs/rudder-transformer

Length of output: 7717


🏁 Script executed:

rg "getAccessToken" -A 5 -B 2 "src/v0/util/index.js" | head -50

Repository: rudderlabs/rudder-transformer

Length of output: 699


🏁 Script executed:

rg "getAccessToken" -B 5 "const getAccessToken" src/v0/util/index.js | head -30

Repository: rudderlabs/rudder-transformer

Length of output: 1021


🏁 Script executed:

rg "const getAccessToken = " src/v0/util/index.js -A 10

Repository: rudderlabs/rudder-transformer

Length of output: 549


🏁 Script executed:

rg "simpleProcessRouterDest" src/v0/util/index.js -A 20 | head -40

Repository: rudderlabs/rudder-transformer

Length of output: 1393


processEvent metadata type omits the secret field required by getAccessToken.

At line 107, the function is typed as { workspaceId: string }, but line 130 calls getAccessToken(metadata, 'access_token'), which destructures metadata.secret and accesses the token key. Widen the type to match the actual metadata shape.

🔧 Proposed fix
 const processEvent = async (
-  metadata: { workspaceId: string },
+  metadata: { workspaceId: string; [key: string]: unknown },
   message: Message,
   destination: GARLDestination,
 ) => {
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
metadata: { workspaceId: string },
const processEvent = async (
metadata: { workspaceId: string; [key: string]: unknown },
message: Message,
destination: GARLDestination,
) => {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/v0/destinations/google_adwords_remarketing_lists/transform.ts` at line
107, The processEvent function's metadata type is too narrow—update the metadata
type used by processEvent to include the secret field (e.g., change from {
workspaceId: string } to { workspaceId: string; secret: string } or an
appropriate Metadata interface) so that getAccessToken(metadata, 'access_token')
can safely destructure metadata.secret; ensure any related type aliases or
parameter signatures for processEvent reflect this widened shape.

Comment on lines +40 to +42
metadata: {
workspaceId: string
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify metadata typing vs downstream metadata key usage in GARL flow.
rg -nP --type=ts 'metadata:\s*\{\s*workspaceId:\s*string' src/v0/destinations/google_adwords_remarketing_lists/types.ts
rg -nP --type=ts 'getAccessToken\(\s*metadata\s*,\s*["'\'']access_token["'\'']\s*\)' src/v0/destinations/google_adwords_remarketing_lists

Repository: rudderlabs/rudder-transformer

Length of output: 348


🏁 Script executed:

#!/bin/bash
# Find all metadata key accesses in google_adwords_remarketing_lists destination
rg -n 'metadata\.' src/v0/destinations/google_adwords_remarketing_lists/ --type=ts | head -30

Repository: rudderlabs/rudder-transformer

Length of output: 348


🏁 Script executed:

#!/bin/bash
# Check where metadata is passed from and what keys it contains
rg -n 'RecordInput' src/v0/destinations/google_adwords_remarketing_lists/types.ts -A 5 -B 5

Repository: rudderlabs/rudder-transformer

Length of output: 393


🏁 Script executed:

#!/bin/bash
# Check the actual types.ts file to see the complete metadata typing and context
cat src/v0/destinations/google_adwords_remarketing_lists/types.ts

Repository: rudderlabs/rudder-transformer

Length of output: 1491


🏁 Script executed:

#!/bin/bash
# Find getAccessToken function definition
rg -n 'function getAccessToken|const getAccessToken|export.*getAccessToken' --type=ts -A 3 | head -20

Repository: rudderlabs/rudder-transformer

Length of output: 55


🏁 Script executed:

#!/bin/bash
# Search for all metadata accesses in GARL destination files more comprehensively
rg -n 'metadata\[|metadata\.' src/v0/destinations/google_adwords_remarketing_lists/ --type=ts

Repository: rudderlabs/rudder-transformer

Length of output: 348


🏁 Script executed:

#!/bin/bash
# Find where getAccessToken is imported from
rg -n 'import.*getAccessToken|from.*getAccessToken' src/v0/destinations/google_adwords_remarketing_lists/ --type=ts

Repository: rudderlabs/rudder-transformer

Length of output: 55


🏁 Script executed:

#!/bin/bash
# Search for getAccessToken function definition across codebase
rg -n 'function getAccessToken|const getAccessToken.*=|export.*getAccessToken' --type=ts -A 5

Repository: rudderlabs/rudder-transformer

Length of output: 55


🏁 Script executed:

#!/bin/bash
# View top of recordTransform.ts to see imports
head -20 src/v0/destinations/google_adwords_remarketing_lists/recordTransform.ts

Repository: rudderlabs/rudder-transformer

Length of output: 698


🏁 Script executed:

#!/bin/bash
# View top of transform.ts to see imports
head -20 src/v0/destinations/google_adwords_remarketing_lists/transform.ts

Repository: rudderlabs/rudder-transformer

Length of output: 806


🏁 Script executed:

#!/bin/bash
# Broader search for getAccessToken anywhere in codebase
rg 'getAccessToken' --type=ts | head -20

Repository: rudderlabs/rudder-transformer

Length of output: 514


🏁 Script executed:

#!/bin/bash
# Find getAccessToken in util directory
rg 'getAccessToken' src/v0/util --type=ts -A 5 | head -30

Repository: rudderlabs/rudder-transformer

Length of output: 55


🏁 Script executed:

#!/bin/bash
# List files in util directory to understand structure
ls -la src/v0/util/ | head -20

Repository: rudderlabs/rudder-transformer

Length of output: 1205


🏁 Script executed:

#!/bin/bash
# Search for getAccessToken without type filter
rg 'getAccessToken' src/v0/util -A 5

Repository: rudderlabs/rudder-transformer

Length of output: 933


🏁 Script executed:

#!/bin/bash
# Check the util/index.js file
wc -l src/v0/util/index.js

Repository: rudderlabs/rudder-transformer

Length of output: 98


🏁 Script executed:

#!/bin/bash
# View full getAccessToken implementation
rg 'const getAccessToken = ' src/v0/util/index.js -A 20

Repository: rudderlabs/rudder-transformer

Length of output: 1132


RecordInput.metadata is over-constrained and hides required auth properties.

At Line 40-42, typing metadata as only { workspaceId: string } conflicts with downstream usage. The getAccessToken function (imported from src/v0/util/index.js and called in recordTransform.ts:112 and transform.ts:130) destructures { secret } from metadata and accesses secret[accessTokenKey] to retrieve the access token. This required secret property is not represented in the type definition, weakening type-safety and allowing invalid call construction that fails at runtime.

🔧 Proposed fix
   metadata: {
     workspaceId: string;
+    [key: string]: unknown;
   };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/v0/destinations/google_adwords_remarketing_lists/types.ts` around lines
40 - 42, The RecordInput.metadata type is too narrow (only { workspaceId: string
}) but getAccessToken expects a secret map; update the metadata shape in
types.ts to include the secret property (e.g., metadata: { workspaceId: string;
secret: Record<string, string | undefined> } or a suitably named SecretMap type)
so calls in recordTransform (getAccessToken usage) and transform that
destructure { secret } and index by accessTokenKey are correctly typed; ensure
any existing references are adjusted for the new metadata shape.

@sonarqubecloud
Copy link

identifiers?: Record<string, string | number>;
};
metadata: Record<string, unknown>;
metadata: {
Copy link
Contributor

@maheshkutty maheshkutty Mar 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants