Skip to content

RegexArrayShapeMatcher - Support preg_quote()'d patterns #3233

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jul 14, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 1 addition & 2 deletions src/Type/Php/PregMatchParameterOutTypeExtension.php
Original file line number Diff line number Diff line change
Expand Up @@ -41,13 +41,12 @@ public function getParameterOutTypeFromFunctionCall(FunctionReflection $function
return null;
}

$patternType = $scope->getType($patternArg->value);
$flagsType = null;
if ($flagsArg !== null) {
$flagsType = $scope->getType($flagsArg->value);
}

return $this->regexShapeMatcher->matchType($patternType, $flagsType, TrinaryLogic::createMaybe());
return $this->regexShapeMatcher->matchExpr($patternArg->value, $flagsType, TrinaryLogic::createMaybe(), $scope);
}

}
3 changes: 1 addition & 2 deletions src/Type/Php/PregMatchTypeSpecifyingExtension.php
Original file line number Diff line number Diff line change
Expand Up @@ -48,13 +48,12 @@ public function specifyTypes(FunctionReflection $functionReflection, FuncCall $n
return new SpecifiedTypes();
}

$patternType = $scope->getType($patternArg->value);
$flagsType = null;
if ($flagsArg !== null) {
$flagsType = $scope->getType($flagsArg->value);
}

$matchedType = $this->regexShapeMatcher->matchType($patternType, $flagsType, TrinaryLogic::createFromBoolean($context->true()));
$matchedType = $this->regexShapeMatcher->matchExpr($patternArg->value, $flagsType, TrinaryLogic::createFromBoolean($context->true()), $scope);
if ($matchedType === null) {
return new SpecifiedTypes();
}
Expand Down
68 changes: 68 additions & 0 deletions src/Type/Php/RegexArrayShapeMatcher.php
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,9 @@
use Hoa\Compiler\Llk\TreeNode;
use Hoa\Exception\Exception;
use Hoa\File\Read;
use PhpParser\Node\Expr;
use PhpParser\Node\Name;
use PHPStan\Analyser\Scope;
use PHPStan\Php\PhpVersion;
use PHPStan\TrinaryLogic;
use PHPStan\Type\Constant\ConstantArrayType;
Expand Down Expand Up @@ -45,7 +48,20 @@ public function __construct(
{
}

public function matchExpr(Expr $patternExpr, ?Type $flagsType, TrinaryLogic $wasMatched, Scope $scope): ?Type
{
return $this->matchPatternType($this->getPatternType($patternExpr, $scope), $flagsType, $wasMatched);
}

/**
* @deprecated use matchExpr() instead for a more precise result
*/
public function matchType(Type $patternType, ?Type $flagsType, TrinaryLogic $wasMatched): ?Type
{
return $this->matchPatternType($patternType, $flagsType, $wasMatched);
}

private function matchPatternType(Type $patternType, ?Type $flagsType, TrinaryLogic $wasMatched): ?Type
{
if ($wasMatched->no()) {
return new ConstantArrayType([], []);
Expand Down Expand Up @@ -484,4 +500,56 @@ private function walkRegexAst(
}
}

private function getPatternType(Expr $patternExpr, Scope $scope): Type
{
if ($patternExpr instanceof Expr\BinaryOp\Concat) {
return $this->resolvePatternConcat($patternExpr, $scope);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get it, what's happening here? Shouldn't this be done instead by improving MutatingScope::resolveType? By introducing dynamic return type extension for preg_quote or something?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try to explain as I feel this is my fault :D

I think you can't reasonably treat preg quote's return value as a constant string (empty or "x") in any other context than here when trying to get a constant pattern string so that you have a regex to parse.

As such it feels appropriate to me to hack this in here but maybe there are better ways 🤷🏻‍♂️

Copy link
Contributor Author

@staabm staabm Jul 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its pretty unlikely that we know the args to preg_quote() at static analysis time in a real world use-case.

Please also look at the added test-cases. These wouldn't work with a return type extension, as the extension couldn't produce a constant-string for these cases.

So while adding a preg_quote return type extension could make sense in the future, it would not help with some of the examples at hand. The ArrayShapeMatcher just can ignore the preg_quote() parts and still analyse the shape, as long as all other parts of a concatenation are constant.

This assumption only works for the ArrayShapeMatcher therefore its not implemented for the common case in the Scope

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, in that case the code needs some comments. I did not understand at all why you are returning an empty string for a preg_quote call.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done 👍

}

return $scope->getType($patternExpr);
}

/**
* Ignores preg_quote() calls in the concatenation as these are not relevant for array-shape matching.
*
* This assumption only works for the ArrayShapeMatcher therefore it is not implemented for the common case in Scope.
*
* see https://github.com/phpstan/phpstan-src/pull/3233#discussion_r1676938085
*/
private function resolvePatternConcat(Expr\BinaryOp\Concat $concat, Scope $scope): Type
{
if (
$concat->left instanceof Expr\FuncCall
&& $concat->left->name instanceof Name
&& $concat->left->name->toLowerString() === 'preg_quote'
) {
$left = new ConstantStringType('');
} elseif ($concat->left instanceof Expr\BinaryOp\Concat) {
$left = $this->resolvePatternConcat($concat->left, $scope);
} else {
$left = $scope->getType($concat->left);
}

if (
$concat->right instanceof Expr\FuncCall
&& $concat->right->name instanceof Name
&& $concat->right->name->toLowerString() === 'preg_quote'
) {
$right = new ConstantStringType('');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another thing.. If you do smth like https://3v4l.org/1J7nFF you see the first call triggers a warning but this code will assume it's fine even tho it is a detectable mistake. If you're evaluating the pattern anyway, you could check the first character and try to run preq_quote on it. If it doesn't come out escaped then any preg quote call that is inline in the concatenated string forming the pattern should have that character passed in as second argument to ensure the regex cannot be broken.

I hope I'm making sense @staabm .. Could be another PR tho for sure as it's smth for a new rule I guess. It sucks a bit that we can't warn from here tho because that would warn also for free in composer/pcre and others integrating with it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree this sounds useful. please create a new issue with a reproducer

} elseif ($concat->right instanceof Expr\BinaryOp\Concat) {
$right = $this->resolvePatternConcat($concat->right, $scope);
} else {
$right = $scope->getType($concat->right);
}

$strings = [];
foreach ($left->getConstantStrings() as $leftString) {
foreach ($right->getConstantStrings() as $rightString) {
$strings[] = new ConstantStringType($leftString->getValue() . $rightString->getValue());
}
}

return TypeCombinator::union(...$strings);
}

}
44 changes: 44 additions & 0 deletions tests/PHPStan/Analyser/nsrt/preg_match_shapes.php
Original file line number Diff line number Diff line change
Expand Up @@ -393,3 +393,47 @@ function unmatchedAsNullWithMandatoryGroup(string $s): void {
assertType('array{}|array{0: string, currency: string, 1: string}', $matches);
}

function (string $s): void {
if (preg_match('{' . preg_quote('xxx') . '(z)}', $s, $matches)) {
assertType('array{string, string}', $matches);
} else {
assertType('array{}', $matches);
}
assertType('array{}|array{string, string}', $matches);
};

function (string $s): void {
if (preg_match('{' . preg_quote($s) . '(z)}', $s, $matches)) {
assertType('array{string, string}', $matches);
} else {
assertType('array{}', $matches);
}
assertType('array{}|array{string, string}', $matches);
};

function (string $s): void {
if (preg_match('/' . preg_quote($s, '/') . '(\d)/', $s, $matches)) {
assertType('array{string, string}', $matches);
} else {
assertType('array{}', $matches);
}
assertType('array{}|array{string, string}', $matches);
};

function (string $s): void {
if (preg_match('{' . preg_quote($s) . '(z)' . preg_quote($s) . '(?:abc)(def)?}', $s, $matches)) {
assertType('array{0: string, 1: string, 2?: string}', $matches);
} else {
assertType('array{}', $matches);
}
assertType('array{}|array{0: string, 1: string, 2?: string}', $matches);
};

function (string $s, $mixed): void {
if (preg_match('{' . preg_quote($s) . '(z)' . preg_quote($s) . '(?:abc)'. $mixed .'(def)?}', $s, $matches)) {
assertType('array<string>', $matches);
} else {
assertType('array{}', $matches);
}
assertType('array<string>', $matches);
};
Loading