Commit 2d71c52

PHP 8.1 | Tokenizer/PHP: hotfix for overeager explicit octal notation backfill
Follow-up to 3481 and 3552. While working on PHPCompatibility/PHPCSUtils, I found another case where the explicit octal notation backfill is overeager. PHP natively tokenizes invalid octals, like `0o91`, as `T_LNUMBER` + `T_STRING` in all PHP versions. With the backfill in place, this was no longer the case: on PHP < 8.1, such a literal was tokenized as a single `T_LNUMBER`, making tokenization unpredictable and inconsistent across PHP versions. Fixed now. Includes tests.
1 parent 2596a15 commit 2d71c52
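
The native behaviour described in the message can be checked with PHP's own tokenizer. The following is a minimal sketch, assuming the `tokenizer` extension is available; `describeTokens()` is a hypothetical helper written for this illustration and is not part of PHP_CodeSniffer. According to the commit message, `0o91` should come out as `T_LNUMBER` followed by `T_STRING` on every PHP version.

```php
<?php
// Hypothetical helper for illustration only.
// Lists the name and content of each array-style token PHP produces natively,
// skipping the open tag and whitespace.
function describeTokens(string $code): array
{
    $out = [];
    foreach (token_get_all('<?php ' . $code) as $token) {
        if (is_array($token) === false) {
            // Single-character tokens such as "=" and ";" are plain strings.
            continue;
        }

        if ($token[0] === T_OPEN_TAG || $token[0] === T_WHITESPACE) {
            continue;
        }

        $out[] = token_name($token[0]) . ' ' . var_export($token[1], true);
    }

    return $out;
}

print_r(describeTokens('$foo = 0o91;'));
```

Running the same call with a valid literal such as `0o137` is a quick way to see where native tokenization differs between PHP < 8.1 and PHP 8.1+, which is the gap the backfill is meant to close.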

3 files changed: +59 -3 lines changed

src/Tokenizers/PHP.php

Lines changed: 19 additions & 3 deletions
@@ -732,16 +732,32 @@ protected function tokenize($string)
                 && $tokens[($stackPtr + 1)][0] === T_STRING
                 && strtolower($tokens[($stackPtr + 1)][1][0]) === 'o'
                 && $tokens[($stackPtr + 1)][1][1] !== '_')
+                && preg_match('`^(o[0-7]+(?:_[0-7]+)?)([0-9_]*)$`i', $tokens[($stackPtr + 1)][1], $matches) === 1
             ) {
                 $finalTokens[$newStackPtr] = [
                     'code'    => T_LNUMBER,
                     'type'    => 'T_LNUMBER',
-                    'content' => $token[1] .= $tokens[($stackPtr + 1)][1],
+                    'content' => $token[1] .= $matches[1],
                 ];
-                $stackPtr++;
                 $newStackPtr++;
+
+                if (isset($matches[2]) === true && $matches[2] !== '') {
+                    $type = 'T_LNUMBER';
+                    if ($matches[2][0] === '_') {
+                        $type = 'T_STRING';
+                    }
+
+                    $finalTokens[$newStackPtr] = [
+                        'code'    => constant($type),
+                        'type'    => $type,
+                        'content' => $matches[2],
+                    ];
+                    $newStackPtr++;
+                }
+
+                $stackPtr++;
                 continue;
-            }
+            }//end if

             /*
                 PHP 8.1 introduced two dedicated tokens for the & character.
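
The effect of the pattern added in this hunk is easier to see in isolation. The sketch below applies the same regular expression to the content of the `T_STRING` token that follows the `0`. `splitOctalSuffix()` and its return keys are hypothetical names introduced for this illustration; the real logic lives inline in `tokenize()`.

```php
<?php
// Hypothetical helper for illustration only; it is not part of PHP_CodeSniffer.
// $content is the text of the T_STRING token that follows a "0" T_LNUMBER.
function splitOctalSuffix(string $content): ?array
{
    // Same pattern as in the hunk: group 1 is the valid octal part,
    // group 2 is whatever digits/underscores are left over.
    $pattern = '`^(o[0-7]+(?:_[0-7]+)?)([0-9_]*)$`i';
    if (preg_match($pattern, $content, $matches) !== 1) {
        // No valid octal digit after the "o": the backfill leaves the tokens alone.
        return null;
    }

    $remainder = ($matches[2] ?? '');
    $type      = null;
    if ($remainder !== '') {
        // Mirrors the diff: a leading underscore means T_STRING, otherwise T_LNUMBER.
        $type = ($remainder[0] === '_') ? 'T_STRING' : 'T_LNUMBER';
    }

    return [
        'octal'         => '0' . $matches[1],
        'remainder'     => $remainder,
        'remainderType' => $type,
    ];
}

foreach (['o137', 'o91', 'O282', 'o28_2', 'o2_82'] as $content) {
    echo $content, ' => ', json_encode(splitOctalSuffix($content)), PHP_EOL;
}
```

For `o91` the helper returns `null`, so the original `0` and `o91` tokens stay untouched. For `O282`, `o28_2` and `o2_82` the `octal` value is `0O2`, `0o2` and `0o2` respectively, which lines up with the `value` entries the new test cases expect for the first token.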

tests/Core/Tokenizer/BackfillExplicitOctalNotationTest.inc

Lines changed: 12 additions & 0 deletions
@@ -14,3 +14,15 @@ $foo = 0o_137;

 /* testInvalid2 */
 $foo = 0O_41;
+
+/* testInvalid3 */
+$foo = 0o91;
+
+/* testInvalid4 */
+$foo = 0O282;
+
+/* testInvalid5 */
+$foo = 0o28_2;
+
+/* testInvalid6 */
+$foo = 0o2_82;

tests/Core/Tokenizer/BackfillExplicitOctalNotationTest.php

Lines changed: 28 additions & 0 deletions
@@ -83,6 +83,34 @@ public function dataExplicitOctalNotation()
                     'value'  => '0',
                 ],
             ],
+            [
+                [
+                    'marker' => '/* testInvalid3 */',
+                    'type'   => 'T_LNUMBER',
+                    'value'  => '0',
+                ],
+            ],
+            [
+                [
+                    'marker' => '/* testInvalid4 */',
+                    'type'   => 'T_LNUMBER',
+                    'value'  => '0O2',
+                ],
+            ],
+            [
+                [
+                    'marker' => '/* testInvalid5 */',
+                    'type'   => 'T_LNUMBER',
+                    'value'  => '0o2',
+                ],
+            ],
+            [
+                [
+                    'marker' => '/* testInvalid6 */',
+                    'type'   => 'T_LNUMBER',
+                    'value'  => '0o2',
+                ],
+            ],
         ];

     }//end dataExplicitOctalNotation()
