Commit 614916d

Describe JavaScript escaping strategy in detail
1 parent a7495dd commit 614916d

1 file changed: +41 -1 lines changed

src/wp-includes/html-api/class-wp-html-tag-processor.php

Lines changed: 41 additions & 1 deletion
@@ -3845,7 +3845,47 @@ public function set_modifiable_text( string $plaintext_content ): bool {
 false !== stripos( $plaintext_content, '<script' )
 ) {
 /*
- * JavaScript can be safely escaped.
+ * JavaScript can be safely escaped with a few exceptions. This is achieved by
+ * replacing dangerous sequences like "<script" and "</script" with a form
+ * using a Unicode escape sequence: "<\u0073cript" and "</\u0073cript".
+ *
+ * `<script` and `</script` appear in JavaScript source code in limited places,
+ * all of which support a Unicode escape sequence on the "s" character.
+ * JavaScript identifiers, string literals, template literals, and RegExp
+ * literals all support Unicode escape sequences, meaning that the escaped form
+ * is indistinguishable from the unescaped form when the JavaScript
+ * is evaluated.
+ *
+ * There are a few exceptions where the escaped form can be detected:
+ *
+ * - The escaped form appears verbatim in any JavaScript comments.
+ * - “Raw” strings via `String.raw()` or the `raw` property of the first
+ *   argument to a tagged template literal expose the raw form, revealing any
+ *   escaping that has been applied.
+ * - The `source` property of a RegExp object reveals the escaped form of
+ *   the pattern.
+ *
+ * For JavaScript that needs to avoid these issues, workarounds may
+ * be available. For example:
+ *
+ *     // Instead of:
+ *     const rawStringWillBeEscaped = String.raw`</script>`;
+ *
+ *     // This will yield the same result with no escaping required:
+ *     const rawStringWillBePreserved = String.raw`</scr` + String.raw`ipt>`;
+ *
+ *     // After the escaping has been applied and the JavaScript evaluated,
+ *     // these are the resulting values:
+ *     rawStringWillBeEscaped;   // "</\\u0073cript>"
+ *     rawStringWillBePreserved; // "</script>"
+ *
+ *
+ * Escaping is applied only where strictly necessary, reducing the likelihood
+ * that observable differences manifest in the escaped JavaScript.
+ *
+ * The alternatives are to reject JavaScript that could be safely escaped in
+ * a majority of cases or to relax restrictions in ways that produce dangerous
+ * or broken HTML documents; neither is desirable.
  */
 if ( $this->is_javascript_script_tag() ) {
 $plaintext_content = preg_replace_callback(
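
The behavior described in the comment above can be checked with a small JavaScript sketch. This is not the WordPress implementation: the diff cuts off at `preg_replace_callback(`, so the real pattern and callback are not shown here. `escapeScriptOpeners` is a hypothetical helper that escapes every `<script` and `</script` it finds, which is simpler than the "only where strictly necessary" rule the comment describes. The sketch uses `new Function` only to evaluate the escaped source text and compare the results.

```javascript
// Hypothetical helper (an assumption for illustration, not the WordPress code):
// replaces the "s" or "S" in "<script" and "</script" with its Unicode escape.
function escapeScriptOpeners(source) {
  return source.replace(/<(\/?)(s)(?=cript)/gi, (match, slash, s) =>
    `<${slash}\\u${s.charCodeAt(0).toString(16).padStart(4, "0")}`
  );
}

// Evaluate a snippet of JavaScript source text and return its value.
function evaluate(expressionSource) {
  return new Function(`return ${expressionSource};`)();
}

// String literal: the escaped source evaluates to the same value.
const literalSource = '"</script>"';
const escapedLiteralSource = escapeScriptOpeners(literalSource);
const literalValue = evaluate(escapedLiteralSource);
console.log(literalValue === "</script>"); // true

// RegExp literal: matching is unchanged, but `source` reveals the escape.
const regex = evaluate(escapeScriptOpeners("/<script/"));
console.log(regex.test("<script")); // true
console.log(regex.source); // <\u0073cript

// Raw string: the escape is observable in the evaluated value.
const rawValue = evaluate(escapeScriptOpeners("String.raw`</script>`"));
console.log(rawValue === "</script>"); // false
console.log(rawValue); // </\u0073cript>

// Workaround from the comment: nothing matches, so nothing is escaped.
const workaroundSource = "String.raw`</scr` + String.raw`ipt>`";
const workaroundUnchanged = escapeScriptOpeners(workaroundSource) === workaroundSource;
const workaroundValue = evaluate(workaroundSource);
console.log(workaroundUnchanged, workaroundValue === "</script>"); // true true
```

The raw-string and `source` cases are the exceptions the comment lists. The split `String.raw` form avoids them because the escaper never sees a contiguous `</script`.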
