Commit ad5b086

maint: introduce LintMan to aid on tracking/updating values
Allow tagging a value in the documentation with the name of the `#define` it mirrors, so that the value can then be updated programmatically. Update the value of MAX_NAME_SIZE in pcre2limits.3, which had been stale since ced3b0f (Increase name length to 128, 2024-03-11); while at it, improve its description and add a tag for a related variable. For completeness, also add a tag to the same value in pcre2pattern.3, and update the configuration for VMS, which had been stale since 6c670c7 (Update overlooked cmake update of name size to 128, 2024-03-11).
1 parent ead3652 commit ad5b086

6 files changed, +82 -5 lines changed
doc/pcre2limits.3

Lines changed: 7 additions & 2 deletions
@@ -47,8 +47,13 @@ when PCRE2 is built; if not, the default is set to 250. An application can
 change this limit by calling pcre2_set_parens_nest_limit() to set the limit in
 a compile context.
 .P
-The maximum length of name for a named capture group is 32 code units, and the
-maximum number of such groups is 10000.
+The maximum length of the name for a named capture group as well as the number
+of such groups is configurable at build time. The maximum length of the name
+defaults to
+.\" DEFINE MAX_NAME_SIZE
+128 code units, and the maximum number of such groups to
+.\" DEFINE MAX_NAME_COUNT
+10000.
 .P
 The maximum length of a name in a (*MARK), (*PRUNE), (*SKIP), or (*THEN) verb
 is 255 code units for the 8-bit library and 65535 code units for the 16-bit and

doc/pcre2pattern.3

Lines changed: 3 additions & 2 deletions
@@ -2015,8 +2015,9 @@ the naming of capture groups. This feature was not added to Perl until release
 using the Python syntax. PCRE2 supports both the Perl and the Python syntax.
 .P
 In PCRE2, a capture group can be named in one of three ways: (?<name>...) or
-(?'name'...) as in Perl, or (?P<name>...) as in Python. Names may be up to 128
-code units long. When PCRE2_UTF is not set, they may contain only ASCII
+(?'name'...) as in Perl, or (?P<name>...) as in Python. Names may be up to
+.\" DEFINE MAX_NAME_SIZE
+128 code units long. When PCRE2_UTF is not set, they may contain only ASCII
 alphanumeric characters and underscores, but must start with a non-digit. When
 PCRE2_UTF is set, the syntax of group names is extended to allow any Unicode
 letter or Unicode decimal digit. In other words, group names must match one of

maint/CheckMan

Lines changed: 1 addition & 0 deletions
@@ -39,6 +39,7 @@ while (scalar(@ARGV) > 0)
 ^\.P\s*$|
 ^\.PP\s*$|
 ^\.\\"(?:\ HREF)?\s*$|
+^\.\\"\sDEFINE\s\w+$|
 ^\.\\"\sHTML\s<a\shref="[^"]+?">\s*$|
 ^\.\\"\sHTML\s<a\sname="[^"]+?"><\/a>\s*$|
 ^\.\\"\s<\/a>\s*$|
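
The new CheckMan alternative accepts a roff comment line of the form `.\" DEFINE <name>`. As a rough illustration of which lines it lets through, here is a Python rendering of that single alternative. This is a sketch, not part of the commit: `is_define_tag` is a hypothetical helper, and it assumes the surrounding Perl pattern is compiled so that only `\s` and escaped spaces are significant whitespace.

```python
import re

# Python rendering of the alternative added to CheckMan's allow-list.
# Hypothetical helper for illustration; CheckMan itself is a Perl script.
DEFINE_TAG = re.compile(r'^\.\\"\sDEFINE\s\w+$')


def is_define_tag(line):
    """Return True if the line is a well-formed DEFINE tag comment."""
    return DEFINE_TAG.match(line) is not None


if __name__ == "__main__":
    samples = [
        '.\\" DEFINE MAX_NAME_SIZE\n',  # well-formed tag
        '.\\" DEFINE\n',                # missing name
        '.\\" DEFINE MAX NAME\n',       # name must be a single word
    ]
    for sample in samples:
        print(repr(sample), is_define_tag(sample))
```

Note that CheckMan accepts any `\w+` name, while LintMan only acts on names made of ASCII uppercase letters, digits, and underscores.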

maint/LintMan

Lines changed: 66 additions & 0 deletions
@@ -0,0 +1,66 @@
+#!/usr/bin/perl
+
+use warnings;
+use strict;
+use Getopt::Long;
+use vars qw /$opt_verbose/;
+
+# A script to scan PCRE2's man pages to check for values that might need to
+# be updatd to match the code.
+#
+# It updates numerical values after \" DEFINE <name> or errors if name is
+# not found.
+
+my $file;
+my %defs;
+
+foreach $file ("../src/config.h")
+{
+  open (INCLUDE, $file) or die "Failed to open include $file\n";
+
+  while (<INCLUDE>)
+  {
+    next unless /^#define ([[:upper:]_\d]+)\s+(\d+)/a;
+    $defs{$1} = $2;
+  }
+
+  close(INCLUDE);
+}
+
+GetOptions("verbose");
+while (scalar(@ARGV) > 0)
+{
+  $file = shift @ARGV;
+
+  open my $fh, "+<", $file or die "Failed to open $file\n";
+
+  my @lines = <$fh>;
+  my $updated = 0;
+
+  foreach my $index (0 .. $#lines)
+  {
+    if ($lines[$index] =~ /^\.\\"\sDEFINE\s([[:upper:]_\d]+)$/a)
+    {
+      my $l = $index + 1;
+      die "Invalid DEFINE line $l of $file\n" unless defined $lines[$l];
+
+      my $key = $1;
+      die "Bad DEFINE key $key line $l of $file\n" unless exists $defs{$key};
+
+      my $value = $defs{$key};
+      if ($lines[$index + 1] !~ /^$value\b/)
+      {
+        $updated += $lines[$index + 1] =~ s/^\d+/$value/a;
+        print "Updated $key in $file to $value\n" if $opt_verbose;
+      }
+    }
+  }
+
+  if ($updated > 0)
+  {
+    seek($fh, 0, 0);
+    print $fh @lines;
+    truncate($fh, tell($fh));
+  }
+  close($fh);
+}
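
For readers less familiar with Perl, the core of LintMan can be sketched in Python as two steps: collect the numeric `#define` values from a header, then rewrite the leading number on the line that follows each `DEFINE` tag. This is an illustrative approximation under stated assumptions, not the script itself: `parse_defines` and `sync_lines` are hypothetical names, the sketch works on in-memory text rather than on `../src/config.h` and the man page files, and it raises exceptions where the Perl script calls `die`.

```python
import re

# ASCII-only matching, mirroring the /a modifier used in the Perl script.
DEFINE_RE = re.compile(r'^#define ([A-Z_\d]+)\s+(\d+)', re.ASCII)
TAG_RE = re.compile(r'^\.\\"\sDEFINE\s([A-Z_\d]+)$', re.ASCII)


def parse_defines(header_text):
    """Map each numeric #define name in the header text to its value (as text)."""
    defs = {}
    for line in header_text.splitlines():
        m = DEFINE_RE.match(line)
        if m:
            defs[m.group(1)] = m.group(2)
    return defs


def sync_lines(lines, defs):
    """Update, in place, the number after each DEFINE tag; return the update count."""
    updated = 0
    for i, line in enumerate(lines):
        m = TAG_RE.match(line)
        if not m:
            continue
        if i + 1 >= len(lines):
            raise ValueError(f"DEFINE tag on line {i + 1} has no following line")
        key = m.group(1)
        if key not in defs:
            raise KeyError(f"unknown DEFINE key {key} on line {i + 1}")
        value = defs[key]
        # Leave the line alone if it already starts with the expected value.
        if not re.match(rf'{re.escape(value)}\b', lines[i + 1]):
            lines[i + 1], n = re.subn(r'^\d+', value, lines[i + 1], count=1,
                                      flags=re.ASCII)
            updated += n
    return updated


if __name__ == "__main__":
    header = "#define MAX_NAME_COUNT 10000\n#define MAX_NAME_SIZE 128\n"
    page = [
        'defaults to\n',
        '.\\" DEFINE MAX_NAME_SIZE\n',
        '32 code units, and the maximum number of such groups to\n',
        '.\\" DEFINE MAX_NAME_COUNT\n',
        '10000.\n',
    ]
    print(sync_lines(page, parse_defines(header)))
    print(page[2], end='')
```

The design relies on the tagged value starting its own line, which is why the documentation changes in this commit break the sentence right before each number.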

maint/README

Lines changed: 4 additions & 0 deletions
@@ -60,6 +60,10 @@ GenerateUcpTables.py
 GenerateCommon.py and Unicode data files. The generated file contains tables
 for looking up Unicode property names.
 
+LintMan
+A Perl script to check and update magic numbers in the documentation that
+correspond to configurable settings in the codebase.
+
 manifest-*
 Data files used to verify the contents of the distribution tarball and
 `make install` file lists.

vms/configure.com

Lines changed: 1 addition & 1 deletion
@@ -905,7 +905,7 @@ sure both macros are undefined; an emulation function will then be used. */
 #define PCRE2_EXPORT
 #define LINK_SIZE 2
 #define MAX_NAME_COUNT 10000
-#define MAX_NAME_SIZE 32
+#define MAX_NAME_SIZE 128
 #define MATCH_LIMIT 10000000
 #define HEAP_LIMIT 20000000
 #define NEWLINE_DEFAULT 2
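
The stale VMS value fixed above is the kind of drift that a cross-check between configuration sources could surface. The following Python sketch compares the numeric `#define` values found in two pieces of text and reports the names whose values disagree. It is an assumption-laden illustration and not something this commit adds: `numeric_defines` and `mismatches` are hypothetical helpers, and the sample inputs are abbreviated stand-ins rather than the real file contents.

```python
import re

# Match numeric #define lines anywhere in a block of text (ASCII names only).
DEFINE_RE = re.compile(r'^[ \t]*#define ([A-Z_\d]+)[ \t]+(\d+)',
                       re.ASCII | re.MULTILINE)


def numeric_defines(text):
    """Return {name: int value} for every numeric #define in the text."""
    return {name: int(value) for name, value in DEFINE_RE.findall(text)}


def mismatches(reference_text, other_text):
    """Return {name: (reference value, other value)} for names that disagree."""
    ref = numeric_defines(reference_text)
    other = numeric_defines(other_text)
    return {
        name: (ref[name], other[name])
        for name in ref.keys() & other.keys()
        if ref[name] != other[name]
    }


if __name__ == "__main__":
    # Abbreviated, illustrative inputs; the second mimics the pre-fix VMS values.
    reference = "#define MAX_NAME_COUNT 10000\n#define MAX_NAME_SIZE 128\n"
    vms_before = ("#define LINK_SIZE 2\n"
                  "#define MAX_NAME_COUNT 10000\n"
                  "#define MAX_NAME_SIZE 32\n")
    print(mismatches(reference, vms_before))
```

Names that appear in only one of the two inputs are ignored, since a platform-specific configuration may legitimately define settings the other lacks.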
