HTML API: Improve script tag escape state processing #9397

Open: wants to merge 22 commits into base: trunk

Commits
6ad9951
Move script data length checks to top of loop
sirreal Jul 10, 2025
b3b3177
Remove parser_state change in skip_script_data
sirreal Jul 10, 2025
ca16e0e
Remove more length checks
sirreal Jul 10, 2025
0456be7
Improve documentation
sirreal Jul 10, 2025
ea6f7d3
Improve comment explaining early return logic
sirreal Jul 31, 2025
4be62b9
Improve loop comment
sirreal Aug 6, 2025
df2affa
Add script tag processing tests
sirreal Aug 6, 2025
d0cbb00
Remove problematic tests
sirreal Aug 6, 2025
69f3bce
Merge branch 'html-api/improve-skip-script-data-len-checks' into html…
sirreal Aug 6, 2025
c509f9d
Revert "Remove problematic tests"
sirreal Aug 6, 2025
de91e09
Add test that reveals bad offset
sirreal Aug 6, 2025
f041a9c
Ensure the escaped state is not entered on abruptly closed comments
sirreal Aug 6, 2025
2b6833c
Add unclosed script tag test
sirreal Aug 6, 2025
bba0547
Prevent script close tag from being found
sirreal Aug 6, 2025
728d13f
Fix typos in explanatory comments
sirreal Aug 6, 2025
1b4478f
Reword explanatory comment.
dmsnell Aug 6, 2025
d22ef9a
Merge branch 'trunk' into html-api/ensure-script-data-states-parse-co…
sirreal Aug 7, 2025
360d896
fixup! Merge branch 'trunk' into html-api/ensure-script-data-states-p…
sirreal Aug 7, 2025
9fd074f
Update `<!--` comments
sirreal Aug 7, 2025
840f6aa
Add more tests and improve coverage
sirreal Aug 7, 2025
f7bcfb4
more tests more coverage
sirreal Aug 7, 2025
e9dd022
Improve language about abruptly closed comment
sirreal Aug 7, 2025
34 changes: 21 additions & 13 deletions src/wp-includes/html-api/class-wp-html-tag-processor.php
@@ -1556,24 +1556,33 @@ private function skip_script_data(): bool {
 }

 /*
- * Unlike with "-->", the "<!--" only transitions
- * into the escaped mode if not already there.
- *
- * Inside the escaped modes it will be ignored; and
- * should never break out of the double-escaped
- * mode and back into the escaped mode.
- *
- * While this requires a mode change, it does not
- * impact the parsing otherwise, so continue
- * parsing after updating the state.
+ * "<!--" only transitions from _unescaped_ to _escaped_. This byte sequence is only
+ * significant in the _unescaped_ state and is ignored in any other state.
  */
 if (
+	'unescaped' === $state &&
 	'!' === $html[ $at ] &&
 	'-' === $html[ $at + 1 ] &&
 	'-' === $html[ $at + 2 ]
 ) {
-	$at += 3;
-	$state = 'unescaped' === $state ? 'escaped' : $state;
+	$at += 3;
+
+	/*
+	 * The parser is ready to enter the _escaped_ state, but may remain in the
+	 * _unescaped_ state. This occurs when "<!--" is immediately followed by a
+	 * sequence of 0 or more "-" followed by ">". This is similar to abruptly closed
+	 * HTML comments like "<!-->" or "<!--->".
+	 *
+	 * Note that this check may advance the position significantly and requires a
+	 * length check to prevent bad offsets on inputs like `<script><!---------`.
+	 */
+	$at += strspn( $html, '-', $at );
+	if ( $at < $doc_length && '>' === $html[ $at ] ) {
+		++$at;
+		continue;
+	}
+
+	$state = 'escaped';
 	continue;
 }
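
The abruptly-closed check in this hunk can be isolated as a small sketch. `sketch_after_comment_opener()` is a hypothetical helper, not part of `WP_HTML_Tag_Processor`. It assumes `$at` points at the `!` that follows a `<` in script data, and it returns the resulting state together with the offset where scanning would resume.

```php
<?php
/**
 * Hypothetical helper for illustration only (not in WordPress).
 *
 * Assumes $at is the offset of the "!" following "<" in unescaped script data.
 *
 * @return array{0: string, 1: int} Resulting state and next offset.
 */
function sketch_after_comment_opener( string $html, int $at ): array {
	$doc_length = strlen( $html );

	// Not a "<!--" opener: stay unescaped and move on one byte.
	if ( '!--' !== substr( $html, $at, 3 ) ) {
		return array( 'unescaped', $at + 1 );
	}
	$at += 3;

	// Skip any further dashes. This may land exactly on the end of the document.
	$at += strspn( $html, '-', $at );

	// "<!-->", "<!--->", and so on close immediately: remain unescaped.
	if ( $at < $doc_length && '>' === $html[ $at ] ) {
		return array( 'unescaped', $at + 1 );
	}

	return array( 'escaped', $at );
}
```

For `<script><!--></script>` with `$at = 9` the sketch stays in `unescaped`; for `<script><!---------` it reports `escaped` with an offset equal to the document length, which is why the bounds check has to come before reading `$html[ $at ]`.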

@@ -1611,7 +1620,6 @@ private function skip_script_data(): bool {
 $at += 6;
 $c = $html[ $at ];
 if ( ' ' !== $c && "\t" !== $c && "\r" !== $c && "\n" !== $c && '/' !== $c && '>' !== $c ) {
-	++$at;
 	continue;
 }

41 changes: 28 additions & 13 deletions tests/phpunit/tests/html-api/wpHtmlTagProcessor.php
@@ -2009,19 +2009,34 @@ public function test_script_tag_parsing( string $input, bool $closes ) {
 /**
  * Data provider.
  */
-public static function data_script_tag(): array {
-	return array(
-		'Basic script tag' => array( '<script></script>', true ),
-		'Script with type attribute' => array( '<script type="text/javascript"></script>', true ),
-		'Script data escaped' => array( '<script><!--</script>', true ),
-		'Script data double-escaped exit (comment)' => array( '<script><!--<script>--></script>', true ),
-		'Script data double-escaped exit (closed)' => array( '<script><!--<script></script></script>', true ),
-		'Script data double-escaped exit (closed/truncated)' => array( '<script><!--<script></script </script>', true ),
-		'Script data no double-escape' => array( '<script><!-- --><script></script>', true ),
-
-		'Script tag with self-close flag (ignored)' => array( '<script />', false ),
-		'Script data double-escaped' => array( '<script><!--<script></script>', false ),
-	);
+public static function data_script_tag(): Generator {
+
+	yield 'Basic script tag' => array( '<script></script>', true );
+	yield 'Script with type attribute' => array( '<script type="text/javascript"></script>', true );
+	yield 'Script data escaped' => array( '<script><!--</script>', true );
+	yield 'Script data double-escaped exit (comment)' => array( '<script><!--<script>--></script>', true );
+	yield 'Script data double-escaped exit (closed)' => array( '<script><!--<script></script></script>', true );
+	yield 'Script data double-escaped exit (closed/truncated)' =>
+		array( '<script><!--<script></script </script>', true );
+	yield 'Script data no double-escape' => array( '<script><!-- --><script></script>', true );
+	yield 'Script data no double-escape (short comment)' => array( '<script><!--><script></script>', true );
+	yield 'Script data almost double-escaped' => array( '<script><!--<script</script>', true );
+	yield 'Script data with complex JavaScript' => array(
+		'<script>
+var x = 10;
+x--;
+x < 0 ? x += 100 : x = (x + 1) - 1;
+</script>',
+		true,
+	);
+
+	yield 'Script tag with self-close flag (ignored)' => array( '<script />', false );
+	yield 'Script data double-escaped' => array( '<script><!--<script></script>', false );
+	yield 'Unclosed script in escaped state' => array( '<script><!--------------', false );
+	yield 'Unclosed script in double escaped state' => array( '<script><!--<script ', false );
+	yield 'Document end in closer start' => array( '<script></', false );
+	yield 'Document end in script closer' => array( '<script></script', false );
+	yield 'Document end in script closer with attributes' => array( '<script></script attr="val"', false );
 }

 /**
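
The provider in this hunk changes from a returned array to a generator with string keys. The sketch below shows how such a generator still produces named datasets, so a consumer sees the same name-to-arguments shape as before. `sketch_data_script_tag()` is a hypothetical stand-alone function that reuses two dataset names from the diff, and the loop only approximates how a test runner would iterate the provider.

```php
<?php
/**
 * Hypothetical stand-alone provider for illustration; not the test class method.
 */
function sketch_data_script_tag(): Generator {
	// Each yield produces one named dataset: key => array( input, expected ).
	yield 'Basic script tag' => array( '<script></script>', true );
	yield 'Document end in closer start' => array( '<script></', false );
}

// Iterating the generator exposes the same name => arguments pairs
// that the previous array-returning provider exposed.
foreach ( sketch_data_script_tag() as $name => list( $input, $closes ) ) {
	printf( "%s: %s closes=%s\n", $name, $input, $closes ? 'true' : 'false' );
}
```

One difference from an array literal is that a generator does not reject a repeated key on its own, so duplicate dataset names have to be avoided by hand.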