Commit d8c320c

Describe JavaScript escaping strategy in detail
Explain caveats and hint at workarounds.
1 parent a7495dd commit d8c320c

1 file changed

+42
-1
lines changed
src/wp-includes/html-api/class-wp-html-tag-processor.php

Lines changed: 42 additions & 1 deletion
@@ -3845,7 +3845,48 @@ public function set_modifiable_text( string $plaintext_content ): bool {
 				false !== stripos( $plaintext_content, '<script' )
 			) {
 				/*
-				 * JavaScript can be safely escaped.
+				 * JavaScript can be safely escaped with a few exceptions. This is achieved by
+				 * replacing dangerous sequences like "<script" and "</script" with forms
+				 * using a Unicode escape sequence: "<\u0073cript" and "</\u0073cript".
+				 *
+				 * `<script` and `</script` appear in JavaScript source code in limited places,
+				 * all of which support a Unicode escape sequence on the "s" character.
+				 * JavaScript identifiers, string literals, template literals, and RegExp
+				 * literals all support Unicode escape sequences, meaning that the escaped form
+				 * is indistinguishable from the unescaped form when the JavaScript
+				 * is evaluated.
+				 *
+				 * There are a few exceptions where the escaped form can be detected:
+				 *
+				 *  - The escaped form remains visible in any JavaScript comments.
+				 *  - “Raw” strings, via `String.raw()` or the `raw` property of the first
+				 *    argument to a template tag function, expose the raw form, revealing any
+				 *    escaping that has been applied.
+				 *  - The `source` property of a RegExp object reveals the escaped form of
+				 *    the pattern.
+				 *
+				 * For JavaScript that needs to avoid these issues, workarounds may
+				 * be available. For example:
+				 *
+				 *     // Instead of this:
+				 *     const rawStringWillBeEscaped = String.raw`</script>`;
+				 *
+				 *     // This is a safe alternative:
+				 *     const rawStringWillBePreserved = String.raw`</scr` + String.raw`ipt>`;
+				 *
+				 * After escaping, the JavaScript result looks like this:
+				 *
+				 *     const rawStringWillBeEscaped = String.raw`</\u0073cript>`;
+				 *     // Evaluates to `'</\\u0073cript>'`.
+				 *
+				 *     const rawStringWillBePreserved = String.raw`</scr` + String.raw`ipt>`;
+				 *     // Evaluates to `'</script>'`.
+				 *
+				 * Escaping is applied only where strictly necessary, reducing the likelihood
+				 * that observable differences manifest in the escaped JavaScript.
+				 *
+				 * This escaping strategy makes ALL JavaScript safe to embed in
+				 * HTML in a way that is completely transparent in most cases.
 				 */
 				if ( $this->is_javascript_script_tag() ) {
 					$plaintext_content = preg_replace_callback(
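
The behavior described in the comment can be checked with a small JavaScript sketch. The `escapeScriptTags` helper below is hypothetical and deliberately simplified; it is not the `preg_replace_callback` logic from this commit, which is truncated in the hunk above. It only assumes that the "s" in `<script` and `</script` is replaced by its Unicode escape. The sketch shows that an ordinary string literal evaluates to the same value after escaping, while a `String.raw` template exposes the escape.

```javascript
// Hypothetical, simplified stand-in for the PHP escaping step (an assumption,
// not the WordPress implementation): replace the "s" or "S" that follows
// "<" or "</" and precedes "cript" with its \uXXXX escape.
function escapeScriptTags(source) {
  return source.replace(/(<\/?)(s)(?=cript)/gi, (match, open, s) => {
    const hex = s.charCodeAt(0).toString(16).padStart(4, '0');
    return open + '\\u' + hex;
  });
}

// JavaScript source containing the dangerous sequence in two contexts.
const original =
  'const closing = "</script>"; const raw = String.raw`</script>`;';
const escaped = escapeScriptTags(original);

// Evaluate a source string and return the two values it defines.
const run = (body) => new Function(body + ' return { closing, raw };')();

const before = run(original);
const after = run(escaped);

console.log(escaped);
console.log(before.closing === after.closing); // string literal is unaffected
console.log(before.raw === after.raw);         // raw string differs after escaping
```

The escaped source no longer contains `</script`, so it cannot terminate the enclosing SCRIPT element early, and only the raw string reveals that escaping took place.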
