-
Notifications
You must be signed in to change notification settings - Fork 216
feat: move posthog-rrweb into the JS monorepo #2815
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
📝 No Changeset FoundThis PR doesn't include a changeset. A changeset (and the release label) is required to release a new version. How to add a changesetRun this command and follow the prompts: pnpm changesetRemember: Never use |
|
Skipped: This PR changes more files than the configured file change limit: ( |
| return customHref | ||
| } | ||
| // note: using `new URL` is slower. See #1434 or https://jsbench.me/uqlud17rxo/1 | ||
| a.setAttribute('href', customHref) |
Check failure
Code scanning / CodeQL
DOM text reinterpreted as HTML High
DOM text
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI 13 days ago
In general, to fix “DOM text reinterpreted as HTML” issues, you prevent raw, untrusted DOM text from being fed directly into DOM APIs that interpret it as HTML/URLs, or you sanitize/validate it first. Here, the problematic operation is using an attacker-controllable string as the href attribute on a DOM anchor to normalize URLs. The best fix that does not change current functionality materially is to (1) keep using the anchor for URL resolution, but (2) reject or neutralize obviously dangerous URL schemes such as javascript:, vbscript:, and data:/blob: where appropriate before calling setAttribute, and return a safe placeholder (e.g. empty string or about:blank) instead. This keeps snapshotting functional for normal URLs but avoids treating hostile input as a navigable URL.
Concretely, we can introduce a small helper to check whether a URL has an allowed scheme, and use it in getHref right before a.setAttribute('href', customHref). We should:
- Add a
SAFE_HREF_PATTERNor similar that matches allowed URL schemes (e.g.http,https,mailto, maybeftp) or relative URLs without a scheme. - Normalize
customHref(trim it), and if it is non-empty and fails the safe pattern, return a benign, non-navigating URL string such as'about:blank'(or'') instead of setting it on the anchor. - Keep existing early returns for
blob:anddata:if they are intentionally supported; or, if we want to be stricter, we can remove/exclude them from allowed schemes. To minimally affect behavior, we will keep those short-circuits but ensure any other unrecognized or scripting schemes are filtered. - Make no changes to how
transformAttributeandabsoluteToDocare called, so the external API remains the same.
All changes are localized to packages/rrweb/rrweb-snapshot/src/snapshot.ts, around the getHref function and just above it for new constants/helpers.
-
Copy modified lines R159-R160 -
Copy modified lines R172-R176 -
Copy modified line R178
| @@ -156,6 +156,8 @@ | ||
| return Boolean(el.tagName === 'svg' || (el as SVGElement).ownerSVGElement) | ||
| } | ||
|
|
||
| const SAFE_HREF_PATTERN = /^(?:[a-zA-Z][a-zA-Z0-9+.-]*:)?\/\/|^(?![a-zA-Z][a-zA-Z0-9+.-]*:)/ // absolute http(s)-style or relative URL | ||
|
|
||
| function getHref(doc: Document, customHref?: string) { | ||
| let a = cachedDocument.get(doc) | ||
| if (!a) { | ||
| @@ -167,8 +169,13 @@ | ||
| } else if (customHref.startsWith('blob:') || customHref.startsWith('data:')) { | ||
| return customHref | ||
| } | ||
| const trimmedHref = customHref.trim() | ||
| if (trimmedHref && !SAFE_HREF_PATTERN.test(trimmedHref)) { | ||
| // Disallow potentially dangerous schemes such as javascript:, vbscript:, etc. | ||
| return 'about:blank' | ||
| } | ||
| // note: using `new URL` is slower. See #1434 or https://jsbench.me/uqlud17rxo/1 | ||
| a.setAttribute('href', customHref) | ||
| a.setAttribute('href', trimmedHref) | ||
| return a.href | ||
| } | ||
|
|
| (!preserveWhiteSpace && | ||
| _serializedNode.type === NodeType.Text && | ||
| !_serializedNode.isStyle && | ||
| !_serializedNode.textContent.replace(/^\s+|\s+$/gm, '').length) |
Check failure
Code scanning / CodeQL
Polynomial regular expression used on uncontrolled data High
regular expression
library input
This
regular expression
library input
| return (cssText || '').replace( | ||
| URL_IN_CSS_REF, | ||
| (origin: string, quote1: string, path1: string, quote2: string, path2: string, path3: string) => { | ||
| const filePath = path1 || path2 || path3 | ||
| const maybeQuote = quote1 || quote2 || '' | ||
| if (!filePath) { | ||
| return origin | ||
| } | ||
| if (URL_PROTOCOL_MATCH.test(filePath) || URL_WWW_MATCH.test(filePath)) { | ||
| return `url(${maybeQuote}${filePath}${maybeQuote})` | ||
| } | ||
| if (DATA_URI.test(filePath)) { | ||
| return `url(${maybeQuote}${filePath}${maybeQuote})` | ||
| } | ||
| if (filePath[0] === '/') { | ||
| return `url(${maybeQuote}${extractOrigin(href) + filePath}${maybeQuote})` | ||
| } | ||
| const stack = href.split('/') | ||
| const parts = filePath.split('/') | ||
| stack.pop() | ||
| for (const part of parts) { | ||
| if (part === '.') { | ||
| continue | ||
| } else if (part === '..') { | ||
| stack.pop() | ||
| } else { | ||
| stack.push(part) | ||
| } | ||
| } | ||
| return `url(${maybeQuote}${stack.join('/')}${maybeQuote})` | ||
| } | ||
| ) |
Check failure
Code scanning / CodeQL
Polynomial regular expression used on uncontrolled data High
regular expression
library input
This
regular expression
library input
This
regular expression
library input
This
regular expression
library input
This
regular expression
library input
This
regular expression
library input
This
regular expression
library input
This
regular expression
library input
This
regular expression
library input
This
regular expression
library input
This
regular expression
library input
This
regular expression
library input
This
regular expression
library input
This
regular expression
library input
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
bd97789 to
bf9cb65
Compare
bf9cb65 to
93281db
Compare
93281db to
006e570
Compare
006e570 to
06be442
Compare
06be442 to
edc5e72
Compare
|
Size Change: +1.19 kB (+0.01%) Total Size: 14.9 MB
ℹ️ View Unchanged
|
edc5e72 to
f08d416
Compare
f08d416 to
ca49845
Compare
9da3f90 to
e95d759
Compare
e95d759 to
435af99
Compare
435af99 to
a07cc5c
Compare
a07cc5c to
999c8c7
Compare
999c8c7 to
13f31b5
Compare
rafaeelaudibert
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Obviously didn't review the code, but it's not obvious to me how the release process is gonna work here
| "updateInternalDependencies": "patch", | ||
| "bumpVersionsWithWorkspaceProtocolOnly": true, | ||
| "ignore": [] | ||
| "ignore": [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use changesets for it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the release workflow will work well if we're not using changesets :thinking_face:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah! so, we don't need (i don't think) to release all these packages because we don't use them
we only ever import these in posthog for replay and for that we're using the main rrweb package.
so, we should get a subset of the total number of rrweb packages released and then be able to import those in the main posthog app
(i'm assuming this won't work the way i expect and will need a few tries to get right)
(TL;DR rrweb publishes tonnes of packages but we don't need to release them all i believe)
| "@posthog/rrweb-plugin-console-record": "workspace:*", | ||
| "@posthog/rrweb-record": "workspace:*", | ||
| "@posthog/rrweb-types": "workspace:*", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This probably deserves a patch bump?
| ## Hello internal PostHog folk | ||
|
|
||
| We build this and publish it to NPM so that we can use it in posthog-js | ||
|
|
||
| If you want to contribute a change back to upstream rrweb | ||
| then you need to open a person fork and contribute from there |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe explain why do we need to do this versus just using the upstream? mention they're slow to take upstreams, etc, etc. etc.
13f31b5 to
2b0fd37
Compare
2b0fd37 to
8c34cfb
Compare
|
Too many files changed for review. |
8c34cfb to
edf29e6
Compare
|
Too many files changed for review. |
1 similar comment
|
Too many files changed for review. |

Problem
with posthog-rrweb
we have to publish npm packages first
then test them in posthog-js
then release that
Changes
silly
now we've converted it to pnpm
we can move it into the mono-repo
and we only need to publish npm packages because we use them in the posthog-app
Release info Sub-libraries affected
Libraries affected
Checklist
If releasing new changes
pnpm changesetto generate a changeset file