Commit 52041d8

Fix misbehaviour of pcre2_match() and pcre2_dfa_match() when PCRE2_FIRSTLINE was set for an anchored pattern.
1 parent 88b1c47 commit 52041d8


11 files changed, with 78 additions and 13 deletions


ChangeLog

Lines changed: 4 additions & 0 deletions
@@ -142,6 +142,10 @@ above because \b and \B are defined in terms of \w.
 option, and (?aP) also sets (?aT) so that (?-aP) disables all ASCII
 restrictions on POSIX classes.
 
+37. If PCRE2_FIRSTLINE was set on an anchored pattern, pcre2_match() and
+pcre2_dfa_match() misbehaved. PCRE2_FIRSTLINE is now ignored for anchored
+patterns.
+
 
 Version 10.42 11-December-2022
 ------------------------------

doc/html/pcre2api.html

Lines changed: 3 additions & 3 deletions
@@ -1686,7 +1686,7 @@ <h1>pcre2api man page</h1>
 PCRE2_USE_OFFSET_LIMIT, which provides a more general limiting facility. If
 PCRE2_FIRSTLINE is set with an offset limit, a match must occur in the first
 line and also within the offset limit. In other words, whichever limit comes
-first is used.
+first is used. This option has no effect for anchored patterns.
 <pre>
 PCRE2_LITERAL
 </pre>
@@ -2021,7 +2021,7 @@ <h1>pcre2api man page</h1>
 </pre>
 This option forces all the POSIX character classes, including [:digit:] and
 [:xdigit:], to match only ASCII characters, even when PCRE2_UCP is set. It can
-be changed within a pattern by means of the (?aP) option setting, but note that
+be changed within a pattern by means of the (?aP) option setting, but note that
 this also sets PCRE2_EXTRA_ASCII_DIGIT in order to ensure that (?-aP) unsets
 all ASCII restrictions for POSIX classes.
 <pre>
@@ -4140,7 +4140,7 @@ <h1>pcre2api man page</h1>
 </P>
 <br><a name="SEC43" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 12 October 2023
+Last updated: 11 November 2023
 <br>
 Copyright &copy; 1997-2023 University of Cambridge.
 <br>
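
The documentation text touched above says that when PCRE2_FIRSTLINE is combined with an offset limit, "whichever limit comes first is used". The following is a minimal sketch of that rule as a stand-alone model, not the library's implementation. It assumes LF is the only newline sequence, that both limits are inclusive (a match may start at the first newline or at the offset limit itself), and that the pattern is unanchored. The names `latest_match_start`, `latest_match_start_examples` and `NO_OFFSET_LIMIT` are hypothetical; the real library marks an unset limit with PCRE2_UNSET.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical marker for "no offset limit was set". */
#define NO_OFFSET_LIMIT SIZE_MAX

/* Model: return the largest subject offset at which an unanchored match
   may start. Assumes a plain LF newline convention. */
size_t latest_match_start(const char *subject, size_t length,
                          bool firstline, size_t offset_limit)
{
  /* With no restriction, a match may start anywhere up to the end. */
  size_t limit = length;

  if (firstline)
    {
    /* The match must start at or before the first newline. */
    size_t i = 0;
    while (i < length && subject[i] != '\n') i++;
    limit = i;
    }

  /* Whichever limit comes first wins. */
  if (offset_limit != NO_OFFSET_LIMIT && offset_limit < limit)
    limit = offset_limit;

  return limit;
}

/* Usage examples for the model above. */
void latest_match_start_examples(void)
{
  /* First newline is at offset 3. */
  assert(latest_match_start("abc\ndef", 7, true, NO_OFFSET_LIMIT) == 3);
  /* An offset limit of 1 comes before the newline, so it is used. */
  assert(latest_match_start("abc\ndef", 7, true, 1) == 1);
  /* Without the first-line restriction only the offset limit applies. */
  assert(latest_match_start("abc\ndef", 7, false, 5) == 5);
}
```

Following this commit, the first-line part of the model would not apply at all to an anchored pattern, because an anchored match can only start at the starting offset.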

doc/pcre2.txt

Lines changed: 4 additions & 3 deletions
@@ -1653,7 +1653,8 @@ COMPILING A PATTERN
 greater than 3. See also PCRE2_USE_OFFSET_LIMIT, which provides a more
 general limiting facility. If PCRE2_FIRSTLINE is set with an offset
 limit, a match must occur in the first line and also within the offset
-limit. In other words, whichever limit comes first is used.
+limit. In other words, whichever limit comes first is used. This option
+has no effect for anchored patterns.
 
 PCRE2_LITERAL
 
@@ -3975,11 +3976,11 @@ AUTHOR
 
 REVISION
 
-Last updated: 12 October 2023
+Last updated: 11 November 2023
 Copyright (c) 1997-2023 University of Cambridge.
 
 
-PCRE2 10.43 12 October 2023 PCRE2API(3)
+PCRE2 10.43 11 November 2023 PCRE2API(3)
 ------------------------------------------------------------------------------
 
 

doc/pcre2api.3

Lines changed: 4 additions & 4 deletions
@@ -1,4 +1,4 @@
-.TH PCRE2API 3 "12 October 2023" "PCRE2 10.43"
+.TH PCRE2API 3 "11 November 2023" "PCRE2 10.43"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .sp
@@ -1628,7 +1628,7 @@ PCRE2_FIRSTLINE if \fIstartoffset\fP is greater than 3. See also
 PCRE2_USE_OFFSET_LIMIT, which provides a more general limiting facility. If
 PCRE2_FIRSTLINE is set with an offset limit, a match must occur in the first
 line and also within the offset limit. In other words, whichever limit comes
-first is used.
+first is used. This option has no effect for anchored patterns.
 .sp
 PCRE2_LITERAL
 .sp
@@ -1979,7 +1979,7 @@ a pattern by means of the (?aT) option setting.
 .sp
 This option forces all the POSIX character classes, including [:digit:] and
 [:xdigit:], to match only ASCII characters, even when PCRE2_UCP is set. It can
-be changed within a pattern by means of the (?aP) option setting, but note that
+be changed within a pattern by means of the (?aP) option setting, but note that
 this also sets PCRE2_EXTRA_ASCII_DIGIT in order to ensure that (?-aP) unsets
 all ASCII restrictions for POSIX classes.
 .sp
@@ -4148,6 +4148,6 @@ Cambridge, England.
 .rs
 .sp
 .nf
-Last updated: 12 October 2023
+Last updated: 11 November 2023
 Copyright (c) 1997-2023 University of Cambridge.
 .fi

doc/pcre2demo.3

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-.TH PCRE2DEMO 3 "12 October 2023" "PCRE2 10.43-DEV"
+.TH PCRE2DEMO 3 "11 November 2023" "PCRE2 10.43-DEV"
 .\"AUTOMATICALLY GENERATED BY PrepareRelease - do not EDIT!
 .SH NAME
 // - A demonstration C program for PCRE2 - //

src/pcre2_dfa_match.c

Lines changed: 1 addition & 1 deletion
@@ -3443,7 +3443,7 @@ anchored = (options & (PCRE2_ANCHORED|PCRE2_DFA_RESTART)) != 0 ||
 where to start. */
 
 startline = (re->flags & PCRE2_STARTLINE) != 0;
-firstline = (re->overall_options & PCRE2_FIRSTLINE) != 0;
+firstline = !anchored && (re->overall_options & PCRE2_FIRSTLINE) != 0;
 bumpalong_limit = end_subject;
 
 /* Initialize and set up the fixed fields in the callout block, with a pointer

src/pcre2_match.c

Lines changed: 1 addition & 1 deletion
@@ -6836,7 +6836,7 @@ if (mcontext == NULL)
 else mb->memctl = mcontext->memctl;
 
 anchored = ((re->overall_options | options) & PCRE2_ANCHORED) != 0;
-firstline = (re->overall_options & PCRE2_FIRSTLINE) != 0;
+firstline = !anchored && (re->overall_options & PCRE2_FIRSTLINE) != 0;
 startline = (re->flags & PCRE2_STARTLINE) != 0;
 bumpalong_limit = (mcontext->offset_limit == PCRE2_UNSET)?
 true_end_subject : subject + mcontext->offset_limit;
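
The one-line change in the two source hunks above can be illustrated with a small stand-alone model. This is a sketch under stated assumptions, not the library's code: the option bits `OPT_ANCHORED` and `OPT_FIRSTLINE` and all three function names are hypothetical stand-ins (the real constants come from pcre2.h), and the `anchored` expression follows the pcre2_match.c hunk. pcre2_dfa_match() computes `anchored` from a different expression that also consults PCRE2_DFA_RESTART.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the real option bits. */
#define OPT_ANCHORED  0x1u
#define OPT_FIRSTLINE 0x2u

/* Before the fix: first-line limiting is enabled whenever the option was
   set at compile time, even if the match is anchored. */
bool firstline_before(uint32_t overall_options, uint32_t match_options)
{
  (void)match_options;  /* not consulted by the old expression */
  return (overall_options & OPT_FIRSTLINE) != 0;
}

/* After the fix: an anchored match disables first-line limiting. */
bool firstline_after(uint32_t overall_options, uint32_t match_options)
{
  bool anchored = ((overall_options | match_options) & OPT_ANCHORED) != 0;
  return !anchored && (overall_options & OPT_FIRSTLINE) != 0;
}

/* Usage examples for the model above. */
void firstline_model_examples(void)
{
  /* Anchored and first-line both set at compile time. */
  assert(firstline_before(OPT_ANCHORED | OPT_FIRSTLINE, 0) == true);
  assert(firstline_after(OPT_ANCHORED | OPT_FIRSTLINE, 0) == false);

  /* Anchoring requested only at match time has the same effect. */
  assert(firstline_after(OPT_FIRSTLINE, OPT_ANCHORED) == false);

  /* Unanchored patterns keep the first-line behaviour. */
  assert(firstline_after(OPT_FIRSTLINE, 0) == true);
}
```

Folding the check into the flag itself means code that tests `firstline` needs no separate anchored case.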

testdata/testinput2

Lines changed: 13 additions & 0 deletions
@@ -6028,4 +6028,17 @@ a)"xI
 
 # --------
 
+/
+/anchored, firstline
+\x0a
+
+/
+/anchored,firstline,no_start_optimize
+\x0a
+
+/
+/firstline
+\x0a
+abc\x0adef
+
 # End of testinput2

testdata/testinput6

Lines changed: 13 additions & 0 deletions
@@ -5026,4 +5026,17 @@
 /c*+/
 ab\=ph,offset=2
 
+/
+/anchored, firstline
+\x0a
+
+/
+/anchored,firstline,no_start_optimize
+\x0a
+
+/
+/firstline
+\x0a
+abc\x0adef
+
 # End of testinput6

testdata/testoutput2

Lines changed: 17 additions & 0 deletions
@@ -17901,6 +17901,23 @@ No match
 
 # --------
 
+/
+/anchored, firstline
+\x0a
+0: \x0a
+
+/
+/anchored,firstline,no_start_optimize
+\x0a
+0: \x0a
+
+/
+/firstline
+\x0a
+0: \x0a
+abc\x0adef
+0: \x0a
+
 # End of testinput2
 Error -70: PCRE2_ERROR_BADDATA (unknown error number)
 Error -62: bad serialized data