Commit c06a4b8

pcre2posix: additional updates for recent changes (#338)
* pcre2posix: make code warning free and update ChangeLog

Somehow the previous fix was not amended to include this code change, so take the opportunity to update ChangeLog and do other cleanup so it will be at least worth a PR.

Those found responsible have been sacked.

* pcre2posix: fix crash on recent regerror code

Since 0710ce2 (pcre2posix: avoid snprintf quirks in regerror (#333), 2023-11-15), a call to snprintf was replaced by a pair of strncpy and buf[errbuf_size - 1] = 0, but it didn't account for the case where errbuf_size == 0.

Make the code conditional to mimic the original logic and avoid crashing.

* doc: document that the POSIX interface is not POSIX compatible

POSIX 1003.1-2008 requires that regoff_t be at least as large as ssize_t or ptrdiff_t, but we use int and therefore any match is restricted to what that can hold, even in 64-bit architectures.
1 parent 13a933e commit c06a4b8
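
The errbuf_size == 0 case from the commit message can be sketched in isolation. The helper below is hypothetical (copy_message is not a function in pcre2posix.c) and only mirrors the guarded strncpy logic that this commit introduces, assuming the caller wants the untruncated message length back:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of pcre2posix.c: copies message into errbuf,
   truncating if necessary, and returns the full (untruncated) length. */
static size_t copy_message(const char *message, char *errbuf,
  size_t errbuf_size)
{
size_t len = strlen(message);

/* With a zero-sized buffer nothing may be written, not even a terminator;
   errbuf[errbuf_size - 1] would index out of bounds. */
if (errbuf_size != 0)
  {
  strncpy(errbuf, message, errbuf_size);
  /* strncpy does not terminate the copy when the source does not fit. */
  if (errbuf_size <= len) errbuf[errbuf_size - 1] = '\0';
  }
return len;
}
```

Calling copy_message("abcdef", NULL, 0) in this sketch touches no memory and still reports a length of 6, which is the behaviour the unguarded version could not provide.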

4 files changed, +23 -11 lines changed

ChangeLog

Lines changed: 7 additions & 4 deletions
@@ -142,16 +142,19 @@ above because \b and \B are defined in terms of \w.
 option, and (?aP) also sets (?aT) so that (?-aP) disables all ASCII
 restrictions on POSIX classes.
 
-37. If PCRE2_FIRSTLINE was set on an anchored pattern, pcre2_match() and
-pcre2_dfa_match() misbehaved. PCRE2_FIRSTLINE is now ignored for anchored
+37. If PCRE2_FIRSTLINE was set on an anchored pattern, pcre2_match() and
+pcre2_dfa_match() misbehaved. PCRE2_FIRSTLINE is now ignored for anchored
 patterns.
 
-38. Add a test for ridiculous ovector offset values to the substring extraction
+38. Add a test for ridiculous ovector offset values to the substring extraction
 functions.
 
-39. Make OP_REVERSE use IMM2_SIZE for its data instead of LINK_SIZE, for
+39. Make OP_REVERSE use IMM2_SIZE for its data instead of LINK_SIZE, for
 consistency with OP_VREVERSE.
 
+40. In some legacy environments with a pre C99 snprintf, pcre2_regerror could
+return an incorrect value when the provided buffer was too small.
+
 
 Version 10.42 11-December-2022
 ------------------------------

HACKING

Lines changed: 1 addition & 1 deletion
@@ -742,7 +742,7 @@ different (but fixed) length.
 Variable-length backward assertions whose maximum matching length is limited
 are also supported. For such assertions, the first opcode inside each branch is
 OP_VREVERSE, followed by the minimum and maximum lengths for that branch,
-unless these happen to be equal, in which case OP_REVERSE is used. These
+unless these happen to be equal, in which case OP_REVERSE is used. These
 IMM2_SIZE values occupy two code units each in 8-bit mode, and 1 code unit in
 16/32 bit modes.

doc/pcre2posix.3

Lines changed: 7 additions & 2 deletions
@@ -32,7 +32,8 @@ documentation for a description of PCRE2's native API, which contains much
 additional functionality.
 .P
 \fBIMPORTANT NOTE\fP: The functions described here are NOT thread-safe, and
-should not be used in multi-threaded applications. Use the native API instead.
+should not be used in multi-threaded applications. They are also limited to
+processing subjects that are not bigger than 2GB. Use the native API instead.
 .P
 These functions are wrapper functions that ultimately call the PCRE2 native
 API. Their prototypes are defined in the \fBpcre2posix.h\fP header file, and
@@ -74,7 +75,7 @@ captured substrings. It also defines some constants whose names start with
 .sp
 Note that these functions are just POSIX-style wrappers for PCRE2's native API.
 They do not give POSIX regular expression behaviour, and they are not
-thread-safe.
+thread-safe or even POSIX compatible.
 .P
 Those POSIX option bits that can reasonably be mapped to PCRE2 native options
 have been implemented. In addition, the option REG_EXTENDED is defined with the
@@ -298,6 +299,10 @@ entire portion of \fIstring\fP that was matched; subsequent elements relate to
 the capturing subpatterns of the regular expression. Unused entries in the
 array have both structure members set to -1.
 .P
+\fIregmatch_t\fP as well as the \fIregoff_t\fP typedef it uses are defined in
+\fBpcre2posix.h\fP and are not warranted to have the same size or layout as other
+similarly named types from other libraries that provide POSIX-style matching.
+.P
 A successful match yields a zero return; various error codes are defined in the
 header file, of which REG_NOMATCH is the "expected" failure code.
 .
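
The subject size limit documented above follows from offsets being stored in an int. A caller could check a subject length before using the wrappers, as in the minimal sketch below. The name subject_fits is hypothetical and is not part of pcre2posix.h. The documented 2GB figure corresponds to INT_MAX only on platforms where int is 32 bits wide, which is assumed here:

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical check, not part of pcre2posix.h: returns 1 when every offset
   into a subject of the given length can be represented in an int-based
   offset type, and 0 otherwise. */
static int subject_fits(size_t subject_length)
{
return subject_length <= (size_t)INT_MAX;
}
```

When such a check fails, the documentation's own advice applies: use the native API instead.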

src/pcre2posix.c

Lines changed: 8 additions & 4 deletions
@@ -168,7 +168,7 @@ static int message_len(const char *message, int offset)
 char buf[12];
 
 /* 11 magic number comes from the format below */
-return strlen(message) + 11 + snprintf(buf, sizeof(buf), "%d", offset);
+return (int)strlen(message) + 11 + snprintf(buf, sizeof(buf), "%d", offset);
 }
 
 /*************************************************
@@ -198,9 +198,13 @@ if (preg != NULL && (int)preg->re_erroffset != -1)
 }
 else
 {
-ret = len = strlen(message);
-strncpy(errbuf, message, errbuf_size);
-if (errbuf_size <= len) errbuf[errbuf_size - 1] = '\0';
+len = strlen(message);
+if (errbuf_size != 0)
+{
+strncpy(errbuf, message, errbuf_size);
+if (errbuf_size <= len) errbuf[errbuf_size - 1] = '\0';
+}
+ret = (int)len;
 }
 
 do {
