JIT/non-JIT match difference with newlines when anchored, firstline #323

@addisoncrump

Found with #322.

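The regex \n (a single newline, hex 0a) gives different results with and without JIT when both the anchored and firstline options are enabled: against a subject consisting of the same single newline, the JIT'd match succeeds and the non-JIT match reports no match.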

sh-5.2$ xxd newline_crash 
00000000: 0000 0000 0001 0080 0a                   .........
sh-5.2$ ./pcre2_fuzzer newline_crash
Encountered failure while performing match errorcode comparison; context:
Pattern/sample string (hex encoded): 0a
Compile options 80100100 never_backslash_c,anchored,firstline
Match options 00002000
Non-JIT'd operation emitted an error: no match
JIT'd operation did not emit an error.
1 matches discovered by JIT'd regex:
Match 0 (hex encoded): 0a
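
For reference, the compile option word 80100100 in the output above can be decoded against the bit values in pcre2.h, and the match option word 00002000 is the PCRE2_NO_JIT bit, which is why that line belongs to the non-JIT run. The following is a minimal sketch of that decoding. The helper name describe_compile_options and the OPT_* macro names are made up for illustration; only the numeric values are taken from pcre2.h.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Bit values as defined in pcre2.h. Only the three bits that are set in
   the report are listed; the OPT_* names are local to this sketch. */
#define OPT_ANCHORED          0x80000000u  /* PCRE2_ANCHORED */
#define OPT_NEVER_BACKSLASH_C 0x00100000u  /* PCRE2_NEVER_BACKSLASH_C */
#define OPT_FIRSTLINE         0x00000100u  /* PCRE2_FIRSTLINE */

/* Hypothetical helper: writes a comma-separated list of the recognised
   option names set in `options` into `out` and returns the bits that were
   not recognised. If `out` is too small, the list stops early. */
static uint32_t describe_compile_options(uint32_t options, char *out, size_t outlen)
{
    static const struct { uint32_t bit; const char *name; } known[] = {
        { OPT_NEVER_BACKSLASH_C, "never_backslash_c" },
        { OPT_ANCHORED,          "anchored" },
        { OPT_FIRSTLINE,         "firstline" },
    };
    size_t used = 0;
    int full = (outlen == 0);

    if (outlen > 0) out[0] = '\0';
    for (size_t i = 0; i < sizeof known / sizeof known[0]; i++) {
        if ((options & known[i].bit) == 0) continue;
        options &= ~known[i].bit;          /* recognised: drop from remainder */
        if (full) continue;
        int n = snprintf(out + used, outlen - used, "%s%s",
                         used ? "," : "", known[i].name);
        if (n < 0 || (size_t)n >= outlen - used) {
            out[used] = '\0';              /* undo the partial write */
            full = 1;
        } else {
            used += (size_t)n;
        }
    }
    return options;
}
```

Calling describe_compile_options(0x80100100u, buf, sizeof buf) with a 64-byte buffer fills buf with the same three names the fuzzer printed and returns 0, meaning no other compile options were set.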

Documentation suggests that the JIT implementation is correct here (emphasis mine):

PCRE2_FIRSTLINE: If this option is set, the start of an unanchored pattern match must be before or at the first newline in the subject string following the start of matching, though the matched text may continue over the newline.

PCRE2_ANCHORED: If this bit is set, the pattern is forced to be "anchored", that is, it is constrained to match only at the first matching point in the string that is being searched (the "subject string"). This effect can also be achieved by appropriate constructs in the pattern itself, which is the only way to do it in Perl.

I am not sure where to add a testcase for this. The behaviour only appears when both anchored and firstline are set.
