
Commit 98abb0e

laghee, cursoragent, github-actions[bot], claude, and jonathanKingston authored
Migrate surrogates and tracker blocking scripts from native Apple (#2322)
* feat: Add tracker-stats feature for consolidated content blocking
  - Add TrackerStats feature class for surrogate loading
  - Add TrackerResolver for tracker matching logic (extracted from Apple legacy scripts)
  - Update ContentFeature.log to route logs to native on Apple platforms
  - Enables Xcode console visibility for C-S-S feature logs
  This consolidates functionality from Apple's contentblockerrules.js, contentblocker.js, and surrogates.js into a single C-S-S feature.
* Rename trackerStats to trackerProtection
  Rename the feature, files, directories, classes, type definitions, message schemas, and all references from trackerStats/TrackerStats/tracker-stats to trackerProtection/TrackerProtection/tracker-protection. The name trackerProtection better describes the feature's responsibility: tracker detection, surrogate injection, and stats reporting, not just statistics collection.
* Bundle tracker-surrogates as C-S-S dependency
  Add @duckduckgo/tracker-surrogates as a dependency and import surrogate definitions at build time. This eliminates the need for:
  - Swift-side createSurrogateFunctions parsing of surrogates.txt
  - $SURROGATES$ template replacement in the apple.js entry point
  - Native CDN fetch of surrogates.txt for Apple platforms
  - Runtime surrogate data passing via args
  Surrogates are now bundled into the C-S-S build output as a JS module that maps pattern names to callable functions. A build script (buildSurrogates.js) generates this module from the tracker-surrogates package. The tracker-protection feature imports surrogates directly instead of receiving them via the entry point args.
* Port contentblockerrules.js resource interception to tracker-protection
  Add XHR, fetch, Image.prototype.src, and expanded MutationObserver interception from the old contentblockerrules.js into the trackerProtection C-S-S feature.
  This ensures non-script resources (tracking pixels, XHR beacons, fetch calls, iframes, links) are detected and reported to native for privacy dashboard tracker counts.
  - _setupXHRInterception: wraps XMLHttpRequest.prototype.open/send
  - _setupFetchInterception: wraps window.fetch
  - _setupImageSrcInterception: overrides Image.prototype.src descriptor
  - _setupMutationObserver: now watches for IMG elements too
  - _processPageOnLoad: scans scripts, links, images, iframes on load
  - _checkAndReport: report-only path for non-script resources
  - destroy: restores all intercepted prototypes/descriptors
* Replace async CTL check with synchronous feature setting
  CTL (Click-to-Load) state is now read from feature settings at init time instead of making an async request through the messaging bridge. This eliminates the isCTLEnabled request/response message schemas and makes the surrogate loading path fully synchronous, avoiding timing issues where the browser could start executing the original script before the async CTL check resolves. The native side provides ctlEnabled as a feature setting at config construction time.
* Add reference tests, CNAME support, and fix TrackerResolver bugs
  Add domain-matching reference tests from @duckduckgo/privacy-reference-tests as a dev dependency. Run 122 reference test cases against the JS TrackerResolver (including CNAME-based tracker detection).
  Add CNAME resolution support to TrackerResolver:
  - When a direct domain lookup fails, check the TDS cnames map for the exact request hostname
  - Resolve the CNAME target and walk up its domain hierarchy to find a matching tracker
  - Use the resolved tracker's domain for entity/first-party lookups
  - Rewrite the request URL with the resolved domain for rule matching (tracker rules reference the canonical domain, not the CNAME alias)
  Fix two additional bugs found by reference tests:
  - Subdomain matching in rule exceptions: exception domains like 'ignore.test' now correctly match subdomains like 'sub.ignore.test'
  - Unknown rule actions: rules with unsupported action values now fall through to the tracker's default action instead of being treated as blocks
* Add Playwright integration tests for tracker-protection
  Add end-to-end tests covering:
  - Tracker detection from dynamically added scripts
  - Surrogate loading for matching rules
  - Non-tracker URL handling (no false positives)
  - Feature disabled state (no messages sent)
  - Allowed tracker reporting (blocked=false)
  Register tests in the apple project of playwright.config.js. Add test page, enabled and disabled config files.
  Note: These tests require a properly configured Playwright environment with webkit message handler mocking (CI or local with xvfb).
* Fix lint and TypeScript errors
  - Use WeakMap/WeakSet for XHR and Image instance tracking instead of dynamic properties, avoiding this-aliasing and TS property errors
  - Add @ts-nocheck to auto-generated surrogates.js
  - Exclude surrogates.js from TypeScript checking in tsconfig.json
  - Fix unused variable and null-check warnings in tests
  - Remove unused escaped variable from buildSurrogates.js
* Auto-format: prettier + stylelint fix
* Add missing version field to tracker-protection test configs
* Address review feedback
  - Fix prevalence lookup: use entity from TrackerMatch result instead of tracker.owner, which doesn't have prevalence.
    The resolver now includes the resolved entity object in TrackerMatch.
  - Fix CTL notification ordering: determine willLoadSurrogate before sending trackerDetected notification so isSurrogate accurately reflects whether the surrogate will actually be loaded (respecting CTL gating).
  - Fix fetch interception: handle URL object arguments (input instanceof URL) in addition to string and Request objects.
  - Fix host vs hostname: use hostname consistently to avoid port numbers breaking domain matching against unprotected/allowlist entries.
* Auto-format: prettier + stylelint fix
* Use resolver action instead of rule check for surrogate detection
  Check result.action === 'redirect' instead of Boolean(result.matchedRule?.surrogate) to determine if a surrogate will be loaded. The resolver already verifies the surrogate function exists in the bundle before setting action to 'redirect'; using the rule name alone could falsely report surrogates for rules that reference non-existent surrogate functions.
* Observe documentElement for full head+body coverage
  Use document.documentElement instead of document.body as the MutationObserver root. With subtree: true, this covers scripts added to both <head> and <body>. Previously, scripts dynamically inserted into <head> (common for analytics/ad loaders) were missed. Also use _checkAndBlock for scripts in _processPageOnLoad so late-discovered scripts get surrogate treatment.
* Fix nested script detection and surrogate notification accuracy
  - MutationObserver now traverses descendants of container elements (e.g., a div containing nested scripts) via querySelectorAll. Previously only direct addedNodes were checked, missing scripts inserted as children of appended containers.
  - _loadSurrogate now returns boolean indicating success. The surrogateInjected notification is only sent when the surrogate actually executed successfully, not on early exits or exceptions.
  - _processAddedNode uses early returns for direct script/image nodes to avoid redundant querySelectorAll on leaf elements.
* Add strict TypeScript enforcement and CNAME tests for tracker-protection
  - Add tracker-protection files to CORE_FILES set for strict mode checking
  - Add 5 comprehensive CNAME resolution unit tests
  - Fix TypeScript strict mode violations with proper type definitions
  - Create detailed typedefs (TrackerRule, Tracker, RuleOptions, etc.)
  - Add RequestData type for internal request handling
  - Fix XHR.open overload signature compatibility
  - All 831 unit tests passing
  Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
* Add integration tests for tracker-protection edge cases
  - Add XHR error detection test (validates network interception)
  - Add CTL surrogate bypass test (validates CTL feature gating)
  - Add unprotected domain test (validates unprotected domain logic)
  - Fix isUnprotectedDomain to handle single-part domains like "localhost"
  - Add exact-match check before subdomain walking
  - Follows same pattern as matchHostname in utils.js
  - Prevents accidentally matching TLDs while supporting localhost
  All 8 integration tests passing, 833 unit tests passing.
  Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
* Merge remote-tracking branch 'origin/main' into kmC/content-script-architecture-ed9b
* Apply prettier formatting to tracker-protection unit tests
* Apply prettier formatting to surrogates.js
* Fix surrogates build to produce Prettier-formatted output
  The buildSurrogates.js script now runs Prettier on the generated file as part of the build step. This ensures the committed surrogates.js matches what CI produces during the clean tree check, which rebuilds all artifacts and verifies no files changed.
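Editor's sketch (not part of the commit): the isUnprotectedDomain fix described above, with an exact-match check before subdomain walking that never matches a bare TLD, could look roughly like the following. The function signature and the list argument are assumptions for illustration; the actual implementation in tracker-protection.js is not shown on this page.

```javascript
/**
 * Hypothetical sketch: returns true when `hostname` is covered by an entry in
 * `unprotectedDomains`, either exactly or as a subdomain of an entry.
 */
function isUnprotectedDomain(hostname, unprotectedDomains) {
    const entries = new Set(unprotectedDomains);
    // Exact match first, so single-label hosts such as "localhost" work.
    if (entries.has(hostname)) return true;
    // Then walk up the hierarchy, stopping while two labels remain so that
    // a bare TLD entry such as "com" is never compared.
    const parts = hostname.split('.');
    while (parts.length > 2) {
        parts.shift();
        if (entries.has(parts.join('.'))) return true;
    }
    return false;
}
```

The loop condition is what keeps "example.com" from matching an accidental "com" entry, while the initial exact check is what lets "localhost" match at all.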
* Fix allowlist blocked flag, fetch error protection, load readyState, and build ordering
  - Allowlisted trackers now correctly report blocked: false in both _checkAndReport and _checkAndBlock (previously the allowlist check happened after the blocked flag was set)
  - Fetch interception wraps checkAndReport in try/catch to prevent tracker detection errors from breaking the original fetch call
  - Load event listener checks document.readyState === 'complete' to handle the case where init runs after the page has already loaded, ensuring _processPageOnLoad (links, images, iframes) still executes
  - Sort file list in buildSurrogates.js for consistent cross-platform output (Windows readdirSync returns different order than Linux/macOS)
* Report non-tracker third-party requests for privacy dashboard
  When the tracker resolver returns no match for a URL, check if it's a third-party request (different eTLD+1 from the page) and report it with reason 'thirdPartyRequest'. This restores the 'also loaded' section in the privacy dashboard that shows non-tracking third-party requests (CDNs, APIs, etc.). First-party requests (same eTLD+1) are not reported.
* Remove surrogate deduplication to allow repeated loading
  The _loadedSurrogates Set prevented the same surrogate pattern from being loaded more than once. This broke sites/tests that need the same surrogate to execute multiple times (e.g., the surrogates test page deletes window.ga between tests and expects the analytics.js surrogate to re-execute on subsequent script insertions). Surrogate functions are idempotent (they set window globals like window.ga, window.adsbygoogle, etc.), so executing them multiple times is harmless.
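Editor's sketch (not part of the commit): the fetch-related fixes above (accepting string, URL, and Request inputs, and wrapping the report call in try/catch so detection errors cannot break the page's request) might be combined as below. `resolveFetchUrl`, `wrapFetch`, and the `report` callback are hypothetical names standing in for the feature's internals.

```javascript
// Hypothetical helper: normalizes the first argument of fetch() to a URL string.
function resolveFetchUrl(input) {
    if (typeof input === 'string') return input;
    if (input instanceof URL) return input.href;
    // Request objects (and Request-like objects) expose the URL as a string.
    if (input && typeof input.url === 'string') return input.url;
    return null;
}

// Hypothetical wrapper: `report` stands in for the report-only tracker check.
function wrapFetch(originalFetch, report) {
    return function wrappedFetch(input, init) {
        try {
            const url = resolveFetchUrl(input);
            if (url) report(url);
        } catch (e) {
            // Detection errors must never break the page's own request.
        }
        return originalFetch.call(this, input, init);
    };
}
```

In a page this would be installed with something like `window.fetch = wrapFetch(window.fetch, reportFn)`, keeping a reference to the original so it can be restored on teardown.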
* Fix third-party request reporting test, consolidate script scanning, improve eTLD+1 heuristic
  - Update 'ignores non-tracker URLs' test to validate thirdPartyRequest reporting behavior instead of expecting zero messages
  - Consolidate _processExistingElements into _scanExistingScripts and _processPageOnLoad to remove duplicated script scanning
  - Add TWO_PART_TLDS set and _getApproxETLDp1 helper to correctly handle multi-part TLDs like .co.uk, .com.au in third-party request reporting
* Filter non-HTTP URLs and allow surrogate re-execution for same URL
  - Skip data:, blob:, about: and other non-HTTP(S) URLs in both _checkAndReport and _checkAndBlock to prevent data URIs from appearing as third-party requests in the privacy dashboard
  - Remove _seenUrls deduplication check from _checkAndBlock so surrogates re-execute when the same URL appears in multiple script elements (matching old surrogates.js behavior)
* Add integration tests for data URI filtering and surrogate re-execution
  - Test that data: and blob: URLs are silently ignored (no trackerDetected)
  - Test that adding the same tracker script URL twice triggers two surrogateInjected messages, verifying re-execution works
* Skip surrogate injection when script has integrity attribute
  When a script element has an integrity (SRI) attribute, the surrogate content won't match the expected hash. Respect SRI by skipping surrogate injection in this case: the tracker is still detected and reported as blocked, but no surrogate is loaded. This improves on production behavior where the surrogate incorrectly loads despite the integrity mismatch.
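Editor's sketch (not part of the commit): the approximate eTLD+1 heuristic and the non-HTTP(S) filter described above could be expressed as follows. The contents of TWO_PART_TLDS here are an illustrative subset taken from the examples in the message, and both function names are assumptions rather than the commit's actual helpers.

```javascript
// Illustrative subset only; the real TWO_PART_TLDS set is not shown on this page.
const TWO_PART_TLDS = new Set(['co.uk', 'com.au']);

// Hypothetical equivalent of the _getApproxETLDp1 helper named in the commit.
function getApproxETLDPlus1(hostname) {
    const parts = hostname.split('.');
    if (parts.length <= 2) return hostname;
    const lastTwo = parts.slice(-2).join('.');
    // For multi-part suffixes such as "co.uk", keep three labels instead of two.
    const keep = TWO_PART_TLDS.has(lastTwo) ? 3 : 2;
    return parts.slice(-keep).join('.');
}

function isThirdPartyRequest(requestUrl, pageUrl) {
    const request = new URL(requestUrl);
    // Skip data:, blob:, about: and other non-HTTP(S) schemes.
    if (request.protocol !== 'http:' && request.protocol !== 'https:') return false;
    const page = new URL(pageUrl);
    return getApproxETLDPlus1(request.hostname) !== getApproxETLDPlus1(page.hostname);
}
```

A fixed two-part list is only an approximation of the Public Suffix List; it trades completeness for a small, dependency-free check inside an injected script.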
* Set ruleException reason for allowlisted trackers, use dispatchEvent for surrogate load
  - When a tracker is allowlisted (via privacy config trackerAllowlist), set reason to 'matched rule - exception' so the native side maps it to .ruleException and the dashboard shows 'loaded to prevent site breakage' instead of 'also loaded'
  - Use dispatchEvent(new Event('load')) instead of direct onload() call in _loadSurrogate to also trigger addEventListener listeners
  - Add integration test for allowlisted tracker reason mapping
* Apply suggestion from @laghee
* Pin reference tests and clean lockfile deps
* Replace bundled surrogates with native-provided surrogates.txt parsing
  Back out build-time surrogate bundling in favor of native fetch + pass at init. C-S-S now reads surrogates text from the trackerProtection config settings and parses the surrogates.txt wire format into callable functions at runtime.
  Removed:
  - Generated surrogates.js (bundled surrogate function map)
  - buildSurrogates.js build script and build-surrogates npm script
  - @duckduckgo/tracker-surrogates dependency
  Added:
  - surrogates-parser.js: parses surrogates.txt format into Record<string, () => void> using new Function()
  - Unit tests for parser (11 specs)
  - surrogates setting field in integration test configs
  Also:
  - Split unprotected domain matching (exact vs wildcard)
  - Extract reason string constants from tracker-resolver
  - Report all cross-hostname third-party requests
  - Add surrogates-parser to strict-core check list
  - Remove deleted surrogates.js from tsconfig exclude
  Made-with: Cursor
* Prettier
* Add parity fixes and tests for tracker-protection migration
  - Entity affiliation lookup for non-tracker third-party requests (P0-5): C-S-S now checks TDS entity data in _reportThirdPartyRequest so affiliated non-tracker requests are classified as 'first party' (maps to ownedByFirstParty on native side).
  - New getEntityAffiliation() public method on TrackerResolver.
  - Unit tests for P0-2/3/4/5/6 (entity affiliation, tracker reasons, metadata fidelity).
  - Surrogates parser robustness tests for P1-3/4/5 (CRLF, large payloads, whitespace-only input).
  Made-with: Cursor
* Fix pageUrl iframe drop risk and affiliated non-tracker routing
  - Introduce REASON_AFFILIATED_THIRD_PARTY ('thirdPartyRequestOwnedByFirstParty') for non-tracker requests whose entity matches the page entity. This lets native routing distinguish them from actual trackers, avoiding ad-click attribution, blocked-tracker stats, and FB callback side effects.
  - Integration tests for pageUrl contract (top-frame URL for tracker and surrogate events) and non-tracker third-party payload verification.
  Made-with: Cursor
* Replace runtime surrogates parsing with build-time static module
  Eliminates runtime `new Function` in surrogate loading (App Store compliance). Surrogates are now generated at build time from @duckduckgo/tracker-surrogates into a static JS module that tracker-protection.js imports directly.
  - Add build-time generation script with esbuild validation
  - Add CI guard script (check-surrogates) wired into lint chain
  - Wire prebuild/pretest hooks so generated file is always present
  - Delete surrogates-parser.js and its unit tests
  - Pin @duckduckgo/tracker-surrogates to immutable commit SHA
  Native still passes settings.surrogates for now; C-S-S ignores it. Apple cleanup will be a follow-up PR.
  Made-with: Cursor
* Remove generated surrogates from lint check
* Cleanup after switch to bundling
* Add pre-merge trackerProtection test hardening
  Cover under-tested interception paths and lock CTL semantics:
  - fetch(URL), fetch(Request), Image.src descriptor detection tests
  - CTL enabled: assert trackerDetected then surrogateInjected in order
  - CTL disabled: add blocked===true assertion (legacy parity contract)
  Made-with: Cursor
* Address bot comment
* Fix test error, avoid linting generated surrogates
* Add build-time surrogate syntax validation
  Co-authored-by: Kate Manning <laghee@users.noreply.github.com>
* Fix surrogates check import for Windows paths
  Co-authored-by: Kate Manning <laghee@users.noreply.github.com>
* Add real-surrogate E2E and MutationObserver img parity tests
  - Parameterized tests for analytics.js, gtm.js, gpt.js: assert blocked, surrogateInjected, and expected global defined in page
  - DOM-appended <img> test exercises MutationObserver interception path (distinct from Image.src descriptor path already covered)
  - New config fixture with real Google domains mapped to bundled surrogate file names
  Made-with: Cursor
* Restore content-feature.js from main and regenerate lockfile after rebase
  Co-authored-by: Jonathan Kingston <jonathanKingston@users.noreply.github.com>
* Auto release workflow status (#2436)
* fix: remove branch name gate from semver release workflow
  The job-level `if` condition required the PR head branch to be literally `release-major`, `release-minor`, or `release-patch`. No PRs ever use those branch names (they use feature branches), so the workflow was unconditionally skipped on every merge. The label-based check in the first step already filters to only `semver-major`/`semver-minor` labeled PRs, making the branch gate redundant and broken. Removing it restores the intended flow:
  1. PR merges to main → workflow fires
  2. Step 1 checks for semver-major/minor label → skips if absent
  3. Step 2 dispatches build.yml with the appropriate version bump
  Co-authored-by: Jonathan Kingston <jonathanKingston@users.noreply.github.com>
* fix: gate on semver-major/semver-minor labels instead of branch names
  Replace the broken branch-name check (release-major/minor/patch) with label checks using labels.*.name. This matches the semver-* labels applied by the semver-label workflow and avoids allocating a runner for every merged PR.
  Co-authored-by: Jonathan Kingston <jonathanKingston@users.noreply.github.com>
  ---------
  Co-authored-by: Cursor Agent <cursoragent@cursor.com>
  Co-authored-by: Jonathan Kingston <jonathanKingston@users.noreply.github.com>
* Customize rows after import (#2401)
* Add getCustomizeStepRows event
* Fix build errors.
* Run prettier
* Fix integration tests
* Remove extra unwrapping.
* Make linter happy.
* Clear timeout if response was received
* Use onConfigUpdate subscription instead of introducing a callback
* Fix rebase errors
* Fix linter errors
* introduce a guard for nextStepDefs[key]
* remove unnecessary test
* Extend "Given onConfigUpdate behavior" UT with additional state verification
* Allow running "Given onConfigUpdate behavior" test on macOS
* Add new onboarding test to onboarding.v4.spec.js
* Make prettier happy.
* refactor duplicated code
* Migrate trackerData from feature settings to runtime args
  - Update tracker-protection.js to read from args.trackerData (object) instead of getFeatureSetting('trackerData') (JSON string)
  - Add trackerData to LoadArgs typedef in content-scope-features.js
  - Convert 7 tracker-protection test fixtures from stringified JSON to proper object format
  - Update ResultsCollector to extract trackerData object from config and pass via $USER_PREFERENCES$
  - Add unit tests verifying args.trackerData is used (not settings)
  This prepares for the native side to pass trackerData directly in args rather than encoding it in privacy config feature settings.
  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* Address bot comment, fix failing tests
* Update tracker-protection tests to use withUserPreferences for trackerData
  Tests now explicitly pass trackerData via withUserPreferences() instead of relying on bridging code. This matches production behavior where native apps pass trackerData in $USER_PREFERENCES$ via ContentScopeProperties.trackerData.
  Changes:
  - Add tracker-data-fixtures.js with factory functions for test data
  - Update all 24 tests to call withUserPreferences({ trackerData }) before load()
  - Remove trackerData from config files (now passed via withUserPreferences only)
  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* Fix privacy config drift
* Add surrogate entity metadata to tracker events
---------
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-authored-by: Kate Manning <laghee@users.noreply.github.com>
Co-authored-by: Jonathan Kingston <jonathanKingston@users.noreply.github.com>
Co-authored-by: Jonathan Kingston <jkingston@duckduckgo.com>
Co-authored-by: Adam Horvath <horviadam@gmail.com>
1 parent 6ce6fbb commit 98abb0e

31 files changed: +2855 -61 lines changed

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -16,6 +16,9 @@ Sources/ContentScopeScripts/dist/pages/*
 !Sources/ContentScopeScripts/dist/pages/
 !Sources/ContentScopeScripts/dist/pages/.gitkeep

+# Build-time generated surrogates (regenerated by npm run build-surrogates)
+injected/src/features/tracker-protection/surrogates-generated.js
+
 # Test output files (generated during tests)
 injected/unit-test/fixtures/page-context/output/

.prettierignore

Lines changed: 1 addition & 0 deletions
@@ -2,6 +2,7 @@ build/**/*
 docs/**/*
 !injected/docs/**/*
 injected/src/types
+injected/src/features/tracker-protection/surrogates-generated.js
 special-pages/pages/**/types
 injected/integration-test/extension/contentScope.js
 **/*.json
Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+{
+    "readme": "Config for tracker-protection integration tests with an allowlist entry for tracker.example on localhost. TrackerData is passed via withUserPreferences().",
+    "version": 1,
+    "features": {
+        "trackerProtection": {
+            "state": "enabled",
+            "exceptions": [],
+            "settings": {
+                "blockingEnabled": true,
+                "ctlEnabled": true,
+                "allowlist": {
+                    "tracker.example": [
+                        {
+                            "rule": "tracker\\.example/pixel\\.js",
+                            "domains": ["<all>"]
+                        }
+                    ]
+                },
+                "tempUnprotectedDomains": [],
+                "userUnprotectedDomains": []
+            }
+        }
+    },
+    "unprotectedTemporary": []
+}
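Editor's sketch (not part of the commit): one plausible reading of the allowlist entry in the fixture above is that `rule` is a regular-expression source tested against the request URL, and `domains` lists the page hostnames it applies to, with "<all>" meaning every site. Both of those semantics, and the `isAllowlisted` name, are assumptions; the matching code itself is not shown on this page.

```javascript
// Hypothetical matcher for the allowlist shape used in the fixture:
// { [trackerDomain]: [{ rule: string, domains: string[] }] }
function isAllowlisted(allowlist, trackerDomain, requestUrl, pageHostname) {
    const entries = allowlist[trackerDomain] || [];
    return entries.some((entry) => {
        // Assumption: "<all>" applies the rule on every site; otherwise the
        // page hostname must equal or be a subdomain of a listed domain.
        const siteMatches =
            entry.domains.includes('<all>') ||
            entry.domains.some((d) => pageHostname === d || pageHostname.endsWith('.' + d));
        // Assumption: `rule` is a regular-expression source string.
        return siteMatches && new RegExp(entry.rule).test(requestUrl);
    });
}

// Same entry as the fixture, written as a JavaScript object.
const sampleAllowlist = {
    'tracker.example': [{ rule: 'tracker\\.example/pixel\\.js', domains: ['<all>'] }],
};
```

With this reading, a request for https://tracker.example/pixel.js from a page on localhost would be allowlisted, which lines up with the "allowlist entry for tracker.example on localhost" wording in the fixture's readme.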
Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+{
+    "readme": "Test CTL action-prefix gating for non-fb surrogate when ctlEnabled is false. TrackerData is passed via withUserPreferences().",
+    "version": 1,
+    "features": {
+        "trackerProtection": {
+            "state": "enabled",
+            "exceptions": [],
+            "settings": {
+                "blockingEnabled": true,
+                "ctlEnabled": false,
+                "allowlist": {},
+                "tempUnprotectedDomains": [],
+                "userUnprotectedDomains": []
+            }
+        }
+    },
+    "unprotectedTemporary": []
+}
Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+{
+    "readme": "Test CTL surrogate bypass when ctlEnabled is false. TrackerData is passed via withUserPreferences().",
+    "version": 1,
+    "features": {
+        "trackerProtection": {
+            "state": "enabled",
+            "exceptions": [],
+            "settings": {
+                "blockingEnabled": true,
+                "ctlEnabled": false,
+                "allowlist": {},
+                "tempUnprotectedDomains": [],
+                "userUnprotectedDomains": []
+            }
+        }
+    },
+    "unprotectedTemporary": []
+}
Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+{
+    "readme": "Test CTL surrogate injection when ctlEnabled is true. TrackerData is passed via withUserPreferences().",
+    "version": 1,
+    "features": {
+        "trackerProtection": {
+            "state": "enabled",
+            "exceptions": [],
+            "settings": {
+                "blockingEnabled": true,
+                "ctlEnabled": true,
+                "allowlist": {},
+                "tempUnprotectedDomains": [],
+                "userUnprotectedDomains": []
+            }
+        }
+    },
+    "unprotectedTemporary": []
+}
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+{
+    "readme": "Config for tracker-protection integration tests with the feature disabled.",
+    "version": 1,
+    "features": {
+        "trackerProtection": {
+            "state": "disabled",
+            "exceptions": [],
+            "settings": {}
+        }
+    },
+    "unprotectedTemporary": []
+}
Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+{
+    "readme": "Config for real-surrogate E2E tests. TrackerData is passed via withUserPreferences().",
+    "version": 1,
+    "features": {
+        "trackerProtection": {
+            "state": "enabled",
+            "exceptions": [],
+            "settings": {
+                "blockingEnabled": true,
+                "ctlEnabled": false,
+                "allowlist": {},
+                "tempUnprotectedDomains": [],
+                "userUnprotectedDomains": []
+            }
+        }
+    },
+    "unprotectedTemporary": []
+}
Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+{
+    "readme": "Test unprotected domain behavior - reports but doesn't block. TrackerData is passed via withUserPreferences().",
+    "version": 1,
+    "features": {
+        "trackerProtection": {
+            "state": "enabled",
+            "exceptions": [],
+            "settings": {
+                "blockingEnabled": true,
+                "ctlEnabled": true,
+                "allowlist": {},
+                "tempUnprotectedDomains": ["localhost"],
+                "userUnprotectedDomains": []
+            }
+        }
+    },
+    "unprotectedTemporary": []
+}
Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+{
+    "readme": "Config for tracker-protection integration tests. TrackerData is passed via withUserPreferences().",
+    "version": 1,
+    "features": {
+        "trackerProtection": {
+            "state": "enabled",
+            "exceptions": [],
+            "settings": {
+                "blockingEnabled": true,
+                "ctlEnabled": true,
+                "allowlist": {},
+                "tempUnprotectedDomains": [],
+                "userUnprotectedDomains": []
+            }
+        }
+    },
+    "unprotectedTemporary": []
+}
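Editor's sketch (not part of the commit): the fixtures above all share one settings shape, and the commit message says CTL state is read synchronously from feature settings at init time. A reader for that shape might look like the following. `readTrackerProtectionSettings` and its default values are hypothetical; the real feature reads settings through the C-S-S feature API, which is not shown on this page.

```javascript
// Hypothetical reader for the fixture shape; returns null when the feature is off.
function readTrackerProtectionSettings(config) {
    const feature = config.features && config.features.trackerProtection;
    if (!feature || feature.state !== 'enabled') return null;
    const settings = feature.settings || {};
    return {
        // Assumption: missing flags are treated as false.
        blockingEnabled: settings.blockingEnabled === true,
        ctlEnabled: settings.ctlEnabled === true,
        allowlist: settings.allowlist || {},
        // Assumption: temporary and user lists are merged for lookups.
        unprotectedDomains: [
            ...(settings.tempUnprotectedDomains || []),
            ...(settings.userUnprotectedDomains || []),
        ],
    };
}
```

Because everything is read from a plain object, no messaging round trip is needed before the first script check, which is the timing property the synchronous-CTL change was after.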
