Commit 498bef0

pcre2test: improve binmode (#778)
Report parameters for OP_[V]REVERSE, avoid ambiguous back references, simplify code and update related documentation.
1 parent 09013d2 commit 498bef0

15 files changed (+213, -215 lines)

doc/html/pcre2pattern.html

Lines changed: 2 additions & 3 deletions
Original file line number | Diff line number | Diff line change
@@ -3147,8 +3147,7 @@ <h3>
31473147
(?(VERSION&#62;=10.4)yes|no)
31483148
</pre>
31493149
This pattern matches "yes" if the PCRE2 version is greater or equal to 10.4, or
3150-
"no" otherwise. The fractional part of the version number may not contain more
3151-
than two digits.
3150+
"no" otherwise. The fractional part of the version number could be omitted.
31523151
</p>
31533152
<h3>
31543153
Assertion conditions
@@ -4184,7 +4183,7 @@ <h2><a name="SEC33" href="#TOC1">AUTHOR</a></h2>
41844183
</p>
41854184
<h2><a name="SEC34" href="#TOC1">REVISION</a></h2>
41864185
<p>
4187-
Last updated: 29 August 2025
4186+
Last updated: 2 September 2025
41884187
<br>
41894188
Copyright &copy; 1997-2024 University of Cambridge.
41904189
<br>

doc/html/pcre2syntax.html

Lines changed: 27 additions & 23 deletions
@@ -650,22 +650,26 @@ <h2><a name="SEC29" href="#TOC1">CONDITIONAL PATTERNS</a></h2>
650650
(?(condition)yes-pattern)
651651
(?(condition)yes-pattern|no-pattern)
652652

653-
(?(n) absolute reference condition
654-
(?(+n) relative reference condition (PCRE2 extension)
655-
(?(-n) relative reference condition (PCRE2 extension)
656-
(?(&#60;name&#62;) named reference condition (Perl)
657-
(?('name') named reference condition (Perl)
658-
(?(name) named reference condition (PCRE2, deprecated)
659-
(?(R) overall recursion condition
660-
(?(Rn) specific numbered group recursion condition
661-
(?(R&name) specific named group recursion condition
662-
(?(DEFINE) define groups for reference
663-
(?(VERSION[&#62;]=n.m) test PCRE2 version
664-
(?(assert) assertion condition
653+
(?(n) absolute reference condition
654+
(?(+n) relative reference condition (PCRE2 extension)
655+
(?(-n) relative reference condition (PCRE2 extension)
656+
(?(&#60;name&#62;) named reference condition (Perl)
657+
(?('name') named reference condition (Perl)
658+
(?(name) named reference condition (PCRE2, deprecated)
659+
(?(R) overall recursion condition
660+
(?(Rn) specific numbered group recursion condition
661+
(?(R&name) specific named group recursion condition
662+
(?(DEFINE) define groups for reference
663+
(?(VERSION[&#62;]=n[.m]) test PCRE2 version
664+
(?(assert) assertion condition
665665
</pre>
666666
Note the ambiguity of (?(R) and (?(Rn) which might be named reference
667667
conditions or recursion tests. Such a condition is interpreted as a reference
668668
condition if the relevant named group exists.
669+
<br>
670+
<br>
671+
The parts within brackets for the VERSION conditional syntax could be omitted.
672+
The fractional part of the version number defaults to 0 in that case.
669673
</p>
670674
<h2><a name="SEC30" href="#TOC1">BACKTRACKING CONTROL</a></h2>
671675
<p>
@@ -727,16 +731,16 @@ <h2><a name="SEC32" href="#TOC1">REPLACEMENT STRINGS</a></h2>
727731
1. Backslash is an escape character, and the forms described in "ESCAPED
728732
CHARACTERS" above are recognized. Also:
729733
<pre>
730-
\Q...\E can be used to suppress interpretation
731-
\l force the next character to lower case
732-
\u force the next character to upper case
733-
\L force subsequent characters to lower case
734-
\U force subsequent characters to upper case
735-
\u\L force next character to upper case, then all lower
736-
\l\U force next character to lower case, then all upper
737-
\E end \L or \U case forcing
738-
\b backspace character (note: as in character class in pattern)
739-
\v vertical tab character (note: not the same as in a pattern)
734+
\Q...\E can be used to suppress interpretation
735+
\l force the next character to lower case
736+
\u force the next character to upper case
737+
\L force subsequent characters to lower case
738+
\U force subsequent characters to upper case
739+
\u\L force next character to upper case, then all lower
740+
\l\U force next character to lower case, then all upper
741+
\E end \L or \U case forcing
742+
\b backspace character (note: as in character class in pattern)
743+
\v vertical tab character (note: not the same as in a pattern)
740744
</pre>
741745
2. The Python form \g&#60;n&#62;, where the angle brackets are part of the syntax and
742746
<i>n</i> is either a group name or a number, is recognized as an alternative way
@@ -767,7 +771,7 @@ <h2><a name="SEC34" href="#TOC1">AUTHOR</a></h2>
767771
</p>
768772
<h2><a name="SEC35" href="#TOC1">REVISION</a></h2>
769773
<p>
770-
Last updated: 28 March 2025
774+
Last updated: 2 September 2025
771775
<br>
772776
Copyright &copy; 1997-2024 University of Cambridge.
773777
<br>

doc/pcre2.txt

Lines changed: 34 additions & 31 deletions
@@ -9792,8 +9792,8 @@ CONDITIONAL GROUPS
97929792
(?(VERSION>=10.4)yes|no)
97939793

97949794
This pattern matches "yes" if the PCRE2 version is greater or equal to
9795-
10.4, or "no" otherwise. The fractional part of the version number may
9796-
not contain more than two digits.
9795+
10.4, or "no" otherwise. The fractional part of the version number
9796+
could be omitted.
97979797

97989798
Assertion conditions
97999799

@@ -10757,11 +10757,11 @@ AUTHOR
1075710757

1075810758
REVISION
1075910759

10760-
Last updated: 29 August 2025
10760+
Last updated: 2 September 2025
1076110761
Copyright (c) 1997-2024 University of Cambridge.
1076210762

1076310763

10764-
PCRE2 10.47-DEV 29 August 2025 PCRE2PATTERN(3)
10764+
PCRE2 10.47-DEV 2 September 2025 PCRE2PATTERN(3)
1076510765
------------------------------------------------------------------------------
1076610766

1076710767

@@ -12262,23 +12262,27 @@ CONDITIONAL PATTERNS
1226212262
(?(condition)yes-pattern)
1226312263
(?(condition)yes-pattern|no-pattern)
1226412264

12265-
(?(n) absolute reference condition
12266-
(?(+n) relative reference condition (PCRE2 extension)
12267-
(?(-n) relative reference condition (PCRE2 extension)
12268-
(?(<name>) named reference condition (Perl)
12269-
(?('name') named reference condition (Perl)
12270-
(?(name) named reference condition (PCRE2, deprecated)
12271-
(?(R) overall recursion condition
12272-
(?(Rn) specific numbered group recursion condition
12273-
(?(R&name) specific named group recursion condition
12274-
(?(DEFINE) define groups for reference
12275-
(?(VERSION[>]=n.m) test PCRE2 version
12276-
(?(assert) assertion condition
12265+
(?(n) absolute reference condition
12266+
(?(+n) relative reference condition (PCRE2 extension)
12267+
(?(-n) relative reference condition (PCRE2 extension)
12268+
(?(<name>) named reference condition (Perl)
12269+
(?('name') named reference condition (Perl)
12270+
(?(name) named reference condition (PCRE2, deprecated)
12271+
(?(R) overall recursion condition
12272+
(?(Rn) specific numbered group recursion condition
12273+
(?(R&name) specific named group recursion condition
12274+
(?(DEFINE) define groups for reference
12275+
(?(VERSION[>]=n[.m]) test PCRE2 version
12276+
(?(assert) assertion condition
1227712277

1227812278
Note the ambiguity of (?(R) and (?(Rn) which might be named reference
1227912279
conditions or recursion tests. Such a condition is interpreted as a
1228012280
reference condition if the relevant named group exists.
1228112281

12282+
The parts within brackets for the VERSION conditional syntax could be
12283+
omitted. The fractional part of the version number defaults to 0 in
12284+
that case.
12285+
1228212286

1228312287
BACKTRACKING CONTROL
1228412288

@@ -12342,20 +12346,19 @@ REPLACEMENT STRINGS
1234212346
1. Backslash is an escape character, and the forms described in "ES-
1234312347
CAPED CHARACTERS" above are recognized. Also:
1234412348

12345-
\Q...\E can be used to suppress interpretation
12346-
\l force the next character to lower case
12347-
\u force the next character to upper case
12348-
\L force subsequent characters to lower case
12349-
\U force subsequent characters to upper case
12350-
\u\L force next character to upper case, then all lower
12351-
\l\U force next character to lower case, then all upper
12352-
\E end \L or \U case forcing
12353-
\b backspace character (note: as in character class in pat-
12354-
tern)
12355-
\v vertical tab character (note: not the same as in a pattern)
12349+
\Q...\E can be used to suppress interpretation
12350+
\l force the next character to lower case
12351+
\u force the next character to upper case
12352+
\L force subsequent characters to lower case
12353+
\U force subsequent characters to upper case
12354+
\u\L force next character to upper case, then all lower
12355+
\l\U force next character to lower case, then all upper
12356+
\E end \L or \U case forcing
12357+
\b backspace character (note: as in character class in pattern)
12358+
\v vertical tab character (note: not the same as in a pattern)
1235612359

1235712360
2. The Python form \g<n>, where the angle brackets are part of the syn-
12358-
tax and n is either a group name or a number, is recognized as an al-
12361+
tax and n is either a group name or a number, is recognized as an al-
1235912362
ternative way of inserting the contents of a group, for example \g<3>.
1236012363

1236112364
3. Capture substitution supports the following additional forms:
@@ -12369,7 +12372,7 @@ REPLACEMENT STRINGS
1236912372

1237012373
SEE ALSO
1237112374

12372-
pcre2pattern(3), pcre2api(3), pcre2callout(3), pcre2matching(3),
12375+
pcre2pattern(3), pcre2api(3), pcre2callout(3), pcre2matching(3),
1237312376
pcre2(3).
1237412377

1237512378

@@ -12382,11 +12385,11 @@ AUTHOR
1238212385

1238312386
REVISION
1238412387

12385-
Last updated: 28 March 2025
12388+
Last updated: 2 September 2025
1238612389
Copyright (c) 1997-2024 University of Cambridge.
1238712390

1238812391

12389-
PCRE2 10.47-DEV 28 March 2025 PCRE2SYNTAX(3)
12392+
PCRE2 10.47-DEV 2 September 2025 PCRE2SYNTAX(3)
1239012393
------------------------------------------------------------------------------
1239112394

1239212395

doc/pcre2pattern.3

Lines changed: 3 additions & 4 deletions
@@ -1,4 +1,4 @@
1-
.TH PCRE2PATTERN 3 "29 August 2025" "PCRE2 10.47-DEV"
1+
.TH PCRE2PATTERN 3 "2 September 2025" "PCRE2 10.47-DEV"
22
.SH NAME
33
PCRE2 - Perl-compatible regular expressions (revised API)
44
.SH "PCRE2 REGULAR EXPRESSION DETAILS"
@@ -3186,8 +3186,7 @@ For example:
31863186
(?(VERSION>=10.4)yes|no)
31873187
.sp
31883188
This pattern matches "yes" if the PCRE2 version is greater or equal to 10.4, or
3189-
"no" otherwise. The fractional part of the version number may not contain more
3190-
than two digits.
3189+
"no" otherwise. The fractional part of the version number could be omitted.
31913190
.
31923191
.
31933192
.SS "Assertion conditions"
@@ -4231,6 +4230,6 @@ Cambridge, England.
42314230
.rs
42324231
.sp
42334232
.nf
4234-
Last updated: 29 August 2025
4233+
Last updated: 2 September 2025
42354234
Copyright (c) 1997-2024 University of Cambridge.
42364235
.fi

doc/pcre2syntax.3

Lines changed: 27 additions & 24 deletions
@@ -1,4 +1,4 @@
1-
.TH PCRE2SYNTAX 3 "28 March 2025" "PCRE2 10.47-DEV"
1+
.TH PCRE2SYNTAX 3 "2 September 2025" "PCRE2 10.47-DEV"
22
.SH NAME
33
PCRE2 - Perl-compatible regular expressions (revised API)
44
.SH "PCRE2 REGULAR EXPRESSION SYNTAX SUMMARY"
@@ -627,22 +627,25 @@ following ways:
627627
(?(condition)yes-pattern)
628628
(?(condition)yes-pattern|no-pattern)
629629
.sp
630-
(?(n) absolute reference condition
631-
(?(+n) relative reference condition (PCRE2 extension)
632-
(?(-n) relative reference condition (PCRE2 extension)
633-
(?(<name>) named reference condition (Perl)
634-
(?('name') named reference condition (Perl)
635-
(?(name) named reference condition (PCRE2, deprecated)
636-
(?(R) overall recursion condition
637-
(?(Rn) specific numbered group recursion condition
638-
(?(R&name) specific named group recursion condition
639-
(?(DEFINE) define groups for reference
640-
(?(VERSION[>]=n.m) test PCRE2 version
641-
(?(assert) assertion condition
630+
(?(n) absolute reference condition
631+
(?(+n) relative reference condition (PCRE2 extension)
632+
(?(-n) relative reference condition (PCRE2 extension)
633+
(?(<name>) named reference condition (Perl)
634+
(?('name') named reference condition (Perl)
635+
(?(name) named reference condition (PCRE2, deprecated)
636+
(?(R) overall recursion condition
637+
(?(Rn) specific numbered group recursion condition
638+
(?(R&name) specific named group recursion condition
639+
(?(DEFINE) define groups for reference
640+
(?(VERSION[>]=n[.m]) test PCRE2 version
641+
(?(assert) assertion condition
642642
.sp
643643
Note the ambiguity of (?(R) and (?(Rn) which might be named reference
644644
conditions or recursion tests. Such a condition is interpreted as a reference
645645
condition if the relevant named group exists.
646+
.sp
647+
The parts within brackets for the VERSION conditional syntax could be omitted.
648+
The fractional part of the version number defaults to 0 in that case.
646649
.
647650
.
648651
.SH "BACKTRACKING CONTROL"
@@ -708,16 +711,16 @@ there is additional interpretation:
708711
1. Backslash is an escape character, and the forms described in "ESCAPED
709712
CHARACTERS" above are recognized. Also:
710713
.sp
711-
\eQ...\eE can be used to suppress interpretation
712-
\el force the next character to lower case
713-
\eu force the next character to upper case
714-
\eL force subsequent characters to lower case
715-
\eU force subsequent characters to upper case
716-
\eu\eL force next character to upper case, then all lower
717-
\el\eU force next character to lower case, then all upper
718-
\eE end \eL or \eU case forcing
719-
\eb backspace character (note: as in character class in pattern)
720-
\ev vertical tab character (note: not the same as in a pattern)
714+
\eQ...\eE can be used to suppress interpretation
715+
\el force the next character to lower case
716+
\eu force the next character to upper case
717+
\eL force subsequent characters to lower case
718+
\eU force subsequent characters to upper case
719+
\eu\eL force next character to upper case, then all lower
720+
\el\eU force next character to lower case, then all upper
721+
\eE end \eL or \eU case forcing
722+
\eb backspace character (note: as in character class in pattern)
723+
\ev vertical tab character (note: not the same as in a pattern)
721724
.sp
722725
2. The Python form \eg<n>, where the angle brackets are part of the syntax and
723726
\fIn\fP is either a group name or a number, is recognized as an alternative way
@@ -753,6 +756,6 @@ Cambridge, England.
753756
.rs
754757
.sp
755758
.nf
756-
Last updated: 28 March 2025
759+
Last updated: 2 September 2025
757760
Copyright (c) 1997-2024 University of Cambridge.
758761
.fi
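
Note on the documented VERSION[>]=n[.m] form: the summary says the ">" and the ".m" part are optional, and that the fractional part defaults to 0 when it is left out. A minimal sketch of that reading follows. It is an illustration only, not PCRE2's parser: the helper name parse_version_condition and the regular expression are assumptions made for this example, and any further limits PCRE2 places on the digits are not modelled here.

```python
import re

# Hypothetical helper, not part of PCRE2. It mirrors only the summary
# form shown in the documentation: VERSION[>]=n[.m]
_VERSION_COND = re.compile(r"VERSION(>?=)(\d+)(?:\.(\d+))?")


def parse_version_condition(cond):
    """Split a VERSION condition into (operator, major, fraction).

    The fraction is returned as text, and is "0" when the ".m" part is
    omitted, which is the default the updated documentation describes.
    """
    m = _VERSION_COND.fullmatch(cond)
    if m is None:
        raise ValueError("not of the form VERSION[>]=n[.m]: %r" % cond)
    op, major, fraction = m.groups()
    return op, int(major), fraction if fraction is not None else "0"


if __name__ == "__main__":
    # With and without the optional parts.
    print(parse_version_condition("VERSION>=10.4"))
    print(parse_version_condition("VERSION>=10"))
    print(parse_version_condition("VERSION=10"))
```

The fraction is kept as text because the diff does not say how a one-digit fraction such as ".4" is weighed when versions are compared, so the sketch leaves that step out.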
