diff --git a/reference/pcre/book.xml b/reference/pcre/book.xml index 808f92d13c2f..86cda9587085 100644 --- a/reference/pcre/book.xml +++ b/reference/pcre/book.xml @@ -9,41 +9,32 @@ &reftitle.intro; - The syntax for patterns used in these functions closely resembles - Perl. The expression must be enclosed in the delimiters, a - forward slash (/), for example. Delimiters can be any - non-alphanumeric, non-whitespace ASCII character except the backslash (\) and the - null byte. If the delimiter character has to be used in the - expression itself, it needs to be escaped by backslash. - Perl-style (), {}, [], and <> matching delimiters may also be used. + This extension integrates regular expression pattern matching support into + PHP. It is based on the free and open-source + PCRE2 library. + This library implements regular expression pattern matching + using syntax and semantics compatible with Perl, + with just a few differences. See Pattern Syntax - for detailed explanation. + for a detailed usage explanation. - The ending delimiter may be followed by various modifiers that - affect the matching. - See Pattern - Modifiers. + To improve performance, the extension caches compiled regular expressions. + Each thread has its own dedicated cache, capable of holding up to 4096 + expressions. - - This extension maintains a global per-thread cache of compiled regular - expressions (up to 4096). + When the cache reaches capacity, it automatically removes the oldest + entry to make room for new ones, following a "First-In, First-Out" (FIFO) + policy. The cache size is not configurable. - You should be aware of some limitations of PCRE. Read &url.pcre.man; for more info. + There are some size and other limitations + in PCRE2 that can occasionally be relevant. - - - The PCRE library is a set of functions that implement regular - expression pattern matching using the same syntax and semantics - as Perl 5, with just a few differences (see below). 
The current - implementation corresponds to Perl 5.005. - &reference.pcre.setup; diff --git a/reference/pcre/configure.xml b/reference/pcre/configure.xml index 5ce103488977..f888fe4d1265 100644 --- a/reference/pcre/configure.xml +++ b/reference/pcre/configure.xml @@ -3,27 +3,27 @@
&reftitle.install; - The PCRE extension is a core PHP extension, so it is always enabled. - By default, this extension is compiled using the bundled PCRE - library. Alternatively, an external PCRE library can be used by - passing in the - configuration option where DIR is the location of - PCRE's include and library files. It is recommended to use PCRE 8.10 or newer; - as of PHP 7.3.0, PCRE2 is required. + The PCRE extension is a core PHP extension and is always enabled. - PCRE's just-in-time compilation is supported by default, which - can be disabled with the - configuration option as of PHP 7.0.12. + By default, the extension uses a bundled version of the PCRE2 library. + An external PCRE2 library can be used instead by using the + configuration + option. The minimum version supported is 10.30. + + + PCRE's just-in-time (JIT) compilation is enabled by default. + It can be disabled by using the + configuration option. &windows.builtin; - PCRE is an active project and as it changes so does the PHP + PCRE2 is an active project and as it changes so does the PHP functionality that relies upon it. It is possible that certain parts - of the PHP documentation is outdated, in that it may not cover the - newest features that PCRE provides. For a list of changes, see the - PCRE library changelog - and also the following bundled PCRE history: + of the PHP documentation are outdated. For a list of changes, see the + PCRE2 library changelog. 
+ Also, if using the bundled library, refer to the following bundled PCRE library + history: @@ -37,6 +37,21 @@ + + 8.5.0 (upcoming) + 10.46 + + + + 8.4.0 + 10.44 + + + + 8.3.0 + 10.42 + + 8.2.0 10.40 diff --git a/reference/pcre/pattern.differences.xml b/reference/pcre/pattern.differences.xml index 290c3d849444..b6c9b6da58dc 100644 --- a/reference/pcre/pattern.differences.xml +++ b/reference/pcre/pattern.differences.xml @@ -5,133 +5,11 @@ Perl Differences Differences From Perl - The differences described here are with respect to Perl 5.005. - - - - By default, a whitespace character is any character that - the C library function isspace() recognizes, though it is - possible to compile PCRE with alternative character type - tables. Normally isspace() matches space, formfeed, newline, - carriage return, horizontal tab, and vertical tab. Perl 5 no - longer includes vertical tab in its set of whitespace characters. - The \v escape that was in the Perl documentation for - a long time was never in fact recognized. However, the character - itself was treated as whitespace at least up to 5.002. - In 5.004 and 5.005 it does not match \s. - - - - - PCRE does not allow repeat quantifiers on lookahead - assertions. Perl permits them, but they do not mean what you - might think. For example, (?!a){3} does not assert that the - next three characters are not "a". It just asserts that the - next character is not "a" three times. - - - - - Capturing subpatterns that occur inside negative - lookahead assertions are counted, but their entries in the - offsets vector are never set. Perl sets its numerical - variables from any such patterns that are matched before the - assertion fails to match something (thereby succeeding), but - only if the negative lookahead assertion contains just one - branch. - - - - - Though binary zero characters are supported in the subject string, - they are not allowed in a pattern string because it is passed as a - normal C string, terminated by zero. 
The escape sequence "\x00" can - be used in the pattern to represent a binary zero. - - - - - The following Perl escape sequences are not supported: - \l, \u, \L, \U. In fact these are implemented by - Perl's general string-handling and are not part of its - pattern matching engine. - - - - - The Perl \G assertion is not supported as it is not - relevant to single pattern matches. - - - - - Fairly obviously, PCRE does not support the (?{code}) and (??{code}) - construction. However, there is support for recursive patterns. - - - - - There are at the time of writing some oddities in Perl - 5.005_02 concerned with the settings of captured strings - when part of a pattern is repeated. For example, matching - "aba" against the pattern /^(a(b)?)+$/ sets $2 to the value - "b", but matching "aabbaa" against /^(aa(bb)?)+$/ leaves $2 - unset. However, if the pattern is changed to - /^(aa(b(b))?)+$/ then $2 (and $3) get set. - In Perl 5.004 $2 is set in both cases, and that is also &true; - of PCRE. If in the future Perl changes to a consistent state - that is different, PCRE may change to follow. - - - - - Another as yet unresolved discrepancy is that in Perl - 5.005_02 the pattern /^(a)?(?(1)a|b)+$/ matches the string - "a", whereas in PCRE it does not. However, in both Perl and - PCRE /^(a)?a/ matched against "a" leaves $1 unset. - - - - - PCRE provides some extensions to the Perl regular - expression facilities: - - - - Although lookbehind assertions must match fixed length - strings, each alternative branch of a lookbehind assertion - can match a different length of string. Perl 5.005 requires - them all to have the same length. - - - - - If PCRE_DOLLAR_ENDONLY - is set and PCRE_MULTILINE is - not set, the $ meta-character matches only at the very end of the - string. - - - - - If PCRE_EXTRA is - set, a backslash followed by a letter with no special meaning is - faulted. 
- - - - - If PCRE_UNGREEDY is - set, the greediness of the repetition quantifiers is inverted, - that is, by default they are not greedy, but if followed by a - question mark they are. - - - - - - - + Both Perl and PCRE2 are continually changing. Refer to PCRE2's + latest documentation covering the + differences between PCRE2 + and Perl. The version of the PCRE2 library in use is also + a relevant factor.