Commit 4f5a2ad

Update README.md (#672)
1 parent 81dced9 commit 4f5a2ad

2 files changed

+246
-41
lines changed

LICENCE.md

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
1-
PCRE2 License
1+
PCRE2 Licence
22
=============
33

44
| SPDX-License-Identifier: | BSD-3-Clause WITH PCRE2-exception |

README.md

Lines changed: 245 additions & 40 deletions
@@ -1,56 +1,261 @@
1-
# PCRE2 - Perl-Compatible Regular Expressions
1+
<picture>
2+
<source media="(prefers-color-scheme: dark)" width="100%" height="100px" srcset="https://raw.githubusercontent.com/PCRE2Project/pcre2/refs/heads/pages/pages/assets/pcre2-readme-dark.svg">
3+
<img alt="PCRE2: Perl-Compatible Regular Expressions" width="100%" height="100px" src="https://raw.githubusercontent.com/PCRE2Project/pcre2/refs/heads/pages/pages/assets/pcre2-readme-light.svg">
4+
</picture>
25

3-
The PCRE2 library is a set of C functions that implement regular expression
4-
pattern matching using the same syntax and semantics as Perl 5. PCRE2 has its
5-
own native API, as well as a set of wrapper functions that correspond to the
6-
POSIX regular expression API. The PCRE2 library is free, even for building
7-
proprietary software. It comes in three forms, for processing 8-bit, 16-bit,
8-
or 32-bit code units, in either literal or UTF encoding.
6+
## Overview
97

10-
PCRE2 was first released in 2015 to replace the API in the original PCRE
11-
library, which is now obsolete and no longer maintained. As well as a more
12-
flexible API, the code of PCRE2 has been much improved since the fork.
13-
14-
## Download
8+
The PCRE2 library is a set of C functions that implement **regular expression
9+
pattern matching**.
1510

16-
As well as downloading from the
17-
[GitHub site](https://github.com/PCRE2Project/pcre2), you can download PCRE2
18-
or the older, unmaintained PCRE1 library from an
19-
[*unofficial* mirror](https://sourceforge.net/projects/pcre/files/) at SourceForge.
11+
It is **self-contained and portable**, and designed to be **easy to embed** into existing
12+
projects and build systems, on almost **any platform** or build target.
2013

21-
You can check out the PCRE2 source code via Git or Subversion:
14+
The PCRE2 library is **free and open-source** (BSD licence), and permitted in proprietary software.
2215

16+
It supports Unicode matching and a very wide range of regular expression features. It accepts input in various character encodings, and optionally includes a highly **performant JIT matching engine**.
17+
18+
PCRE2 is **mature and highly trusted**: it is bundled in many open-source and commercial products, such as Excel, Safari, Apache, and Git, and forms the basis of regular expressions in several programming languages, including PHP and R.
19+
20+
<table border="0">
21+
<tbody>
22+
<tr>
23+
<th align="left">Website</th>
24+
<td>
25+
26+
https://pcre2project.github.io/pcre2/
27+
28+
</td>
29+
</tr>
30+
<tr>
31+
<th align="left">Distribution</th>
32+
<td>
33+
34+
[![GitHub Release](https://img.shields.io/github/v/release/PCRE2Project/pcre2?display_name=release&style=flat-square&label=Latest%20release&color=006094)](https://github.com/PCRE2Project/pcre2/releases)&nbsp;
35+
[![BSD licence](https://img.shields.io/badge/Licence-BSD%203--clause-006094?style=flat-square)](https://github.com/PCRE2Project/pcre2/blob/master/LICENCE.md)
36+
37+
</td>
38+
</tr>
39+
<tr>
40+
<th align="left">Testing</th>
41+
<td>
42+
43+
[![Codecov](https://img.shields.io/codecov/c/github/PCRE2Project/pcre2?style=flat-square&logo=codecov&label=Coverage&color=009400)](https://app.codecov.io/gh/PCRE2Project/pcre2/components)&nbsp;
44+
[![Clang Sanitizers](https://img.shields.io/badge/Clang-Sanitizers-262D3A?style=flat-square&logo=llvm&color=006094)](https://github.com/PCRE2Project/pcre2/actions/workflows/dev.yml)&nbsp;
45+
[![Clang Static Analyzer](https://img.shields.io/badge/Clang-Static%20Analyzer-262D3A?style=flat-square&logo=llvm&color=006094)](https://github.com/PCRE2Project/pcre2/actions/workflows/clang-analyzer.yml)&nbsp;
46+
[![Valgrind](https://img.shields.io/badge/Valgrind-006094?style=flat-square)](https://github.com/PCRE2Project/pcre2/actions/workflows/dev.yml)&nbsp;
47+
[![Coverity Scan](https://img.shields.io/coverity/scan/pcre2?style=flat-square&label=Coverity&color=009400)](https://scan.coverity.com/projects/pcre2?tab=overview)&nbsp;
48+
[![CodeQL](https://img.shields.io/badge/GitHub-CodeQL-006094?style=flat-square)](https://github.com/PCRE2Project/pcre2/actions/workflows/codeql.yml)&nbsp;
49+
[![OSS-Fuzz](https://img.shields.io/badge/Google-OSS--Fuzz-006094?style=flat-square)](https://google.github.io/oss-fuzz/)&nbsp;
50+
[![OSSF-Scorecard Score](https://img.shields.io/ossf-scorecard/github.com/PCRE2Project/pcre2?style=flat-square&label=OSSF-Scorecard&color=009400)](https://scorecard.dev/viewer/?uri=github.com%2FPCRE2Project%2Fpcre2)&nbsp;
51+
52+
</td>
53+
</tr>
54+
<tr>
55+
<th align="left">Platforms</th>
56+
<td>Tested continuously on Linux, Windows, macOS, FreeBSD, Solaris;<br />
57+
x86, ARM, RISC-V, POWER, S390X; many others known to work
58+
</td>
59+
</tr>
60+
</tbody>
61+
</table>
62+
63+
## Quickstart
64+
65+
<picture>
66+
<source media="(prefers-color-scheme: dark)" width="787px" srcset="https://github.com/user-attachments/assets/1886bc4b-2e05-4827-af83-e4ed45f25ab1">
67+
<img width="787px" src="https://github.com/user-attachments/assets/7b90180e-276e-4202-b590-b72871cff91a">
68+
</picture>
69+
70+
<details>
71+
<summary>Show script</summary>
72+
73+
```bash session
74+
# Fetch PCRE2 with 'git clone', or use curl/wget to download a release.
75+
# Here, let's use git to check out a release tag:
76+
git clone https://github.com/PCRE2Project/pcre2.git ./pcre2 \
77+
--branch pcre2-$PCRE2_VERSION \
78+
-c advice.detachedHead=false --depth 1
79+
80+
# Now let's build PCRE2:
81+
(cd ./pcre2; \
82+
cmake -G Ninja -DCMAKE_BUILD_TYPE=Debug -B build; \
83+
cmake --build build/)
84+
85+
# Great, PCRE2 is built.
86+
87+
# Here's a quick little demo to show how we can make use of PCRE2.
88+
# For a fuller example, see './pcre2/src/pcre2demo.c'.
89+
# Try this pre-prepared sample code:
90+
cat demo.c
91+
92+
----------------------------------------------------------------------
93+
File: demo.c
94+
----------------------------------------------------------------------
95+
/* Set PCRE2_CODE_UNIT_WIDTH to indicate we will use 8-bit input. */
96+
#define PCRE2_CODE_UNIT_WIDTH 8
97+
#include <pcre2.h>
98+
99+
#include <string.h> /* for strlen */
100+
#include <stdio.h> /* for printf */
101+
102+
int main(int argc, char* argv[]) {
103+
if (argc != 3) {
104+
fprintf(stderr, "Usage: %s <pattern> <subject>\n", argv[0]);
105+
return 1;
106+
}
107+
108+
const char *pattern = argv[1];
109+
const char *subject = argv[2];
110+
111+
/* Compile the pattern. */
112+
int error_number;
113+
PCRE2_SIZE error_offset;
114+
pcre2_code *re = pcre2_compile(
115+
(PCRE2_SPTR)pattern, /* the pattern */
116+
PCRE2_ZERO_TERMINATED, /* indicates pattern is zero-terminated */
117+
0, /* default options */
118+
&error_number, /* for error number */
119+
&error_offset, /* for error offset */
120+
NULL); /* use default compile context */
121+
if (re == NULL) {
122+
fprintf(stderr, "Invalid pattern: %s\n", argv[1]);
123+
return 1;
124+
}
125+
126+
/* Match the pattern against the subject text. */
127+
pcre2_match_data *match_data =
128+
pcre2_match_data_create_from_pattern(re, NULL);
129+
int rc = pcre2_match(
130+
re, /* the compiled pattern */
131+
(PCRE2_SPTR)subject, /* the subject text */
132+
strlen(subject), /* the length of the subject */
133+
0, /* start at offset 0 in the subject */
134+
0, /* default options */
135+
match_data, /* block for storing the result */
136+
NULL); /* use default match context */
137+
138+
/* Print the match result. */
139+
if (rc == PCRE2_ERROR_NOMATCH) {
140+
printf("No match\n");
141+
} else if (rc < 0) {
142+
fprintf(stderr, "Matching error\n");
143+
} else {
144+
PCRE2_SIZE *ovector = pcre2_get_ovector_pointer(match_data);
145+
printf("Found match: '%.*s'\n", (int)(ovector[1] - ovector[0]),
146+
subject + ovector[0]);
147+
}
148+
149+
pcre2_match_data_free(match_data); /* Free resources */
150+
pcre2_code_free(re);
151+
return 0;
152+
}
153+
----------------------------------------------------------------------
154+
155+
# Compile the demo:
156+
gcc -g -I./pcre2/build -L./pcre2/build demo.c -o demo -lpcre2-8
157+
158+
# Finally, run our demo:
159+
./demo 'c.t' 'dogs and cats'
160+
161+
# We fetched, built, and called PCRE2 successfully! :)
162+
```
163+
164+
</details>
165+
166+
---
167+
168+
The main ways of obtaining PCRE2 are:
169+
170+
1. Via Git clone:
171+
172+
```
23173
git clone https://github.com/PCRE2Project/pcre2.git
24-
svn co https://github.com/PCRE2Project/pcre2.git
174+
```
175+
176+
Please use a release tag in production, not the development branch!
177+
178+
2. Via download of the [release tarball](https://github.com/PCRE2Project/pcre2/releases/latest).
179+
180+
3. Finally, PCRE2 is also bundled by various downstream package managers (such as Linux distributions, or [vcpkg](https://vcpkg.io/)). These are provided by third parties, not the PCRE2 project.
181+
182+
The main ways of building PCRE2 are:
183+
184+
1. Via CMake (Linux/Windows/macOS, and others)
185+
186+
```
187+
cd pcre2/
188+
cmake -B build .
189+
cmake --build build/
190+
```
191+
192+
2. Via Autoconf (Linux/Unix)
193+
194+
```
195+
cd pcre2/
196+
./configure
197+
make
198+
```
199+
200+
See ["Platforms"](#platforms) below for links to more detailed build documentation.
201+
202+
## API Overview
203+
204+
The PCRE2 API supports strings in 8-bit, 16-bit, and 32-bit encodings, with or without UTF encoding. There is also EBCDIC support.
205+
206+
The default regular expression dialect closely matches the syntax and behaviour of Perl 5, with PCRE2-specific extensions. A wide variety of granular flags can be passed to the PCRE2 API to customise this to more closely follow other dialects such as JavaScript or Python.
207+
208+
The default matching engine uses a depth-first tree search with backtracking, which is highly feature-rich but has worst-case exponential time. PCRE2 allows a match to be aborted once it exceeds a limit, expressed as a maximum number of steps in the tree search. The second matching engine uses a JIT compiler for greatly improved performance, compiling the regular expression to a block of equivalent native machine code.
209+
210+
PCRE2 has a third matching engine, based on a DFA algorithm, which is generally slower but has worst-case polynomial matching time and can find the POSIX-style "leftmost-longest" match.
211+
212+
There are accompanying utility functions for converting glob patterns and POSIX BRE/ERE patterns to PCRE2 regular expressions; and also for performing high-level regular expression operations such as search-and-replace with a powerful replacement string syntax.
213+
214+
As well as the PCRE2 API, the library also offers a POSIX-compatible `<regex.h>` header and `regexec()` function. However, this does not provide the ability to pass PCRE2 flags, so we recommend users consume the PCRE2 API if possible.
215+
216+
See the [full library and API documentation](https://pcre2project.github.io/pcre2/doc/html/index.html) for further details.
217+
218+
For third-party documentation, see:
219+
220+
- A curated summary of changes for each PCRE release, and some excellent tutorials on PCRE2 on the
221+
[RexEgg website](http://www.rexegg.com/pcre-documentation.html).
222+
- Jan Goyvaerts' popular Regular-Expressions.info site includes [information about PCRE2](https://www.regular-expressions.info/pcre.html) as well as tutorials and highly detailed comparisons of PCRE2 to other regular expression dialects.
223+
- Jeffrey Friedl's book [_Mastering Regular Expressions_](https://regex.info/book.html) includes chapters on Perl and PCRE, and is available in print and online via O'Reilly Media.
224+
225+
## Platforms
226+
227+
PCRE2 is portable C code, and is likely to work on any system with a C99 compiler.
228+
229+
<dl>
230+
<dt>Operating systems</dt>
231+
<dd>
232+
Our continuous integration runs on <strong>Linux</strong> (GCC and Clang, glibc and musl), <strong>Windows</strong> (MSVC and MinGW-x64), and <strong>macOS</strong> (Clang), as well as <strong>FreeBSD</strong> and <strong>Solaris</strong> (Oracle Studio <code>cc</code>).
233+
</dd>
234+
<dt>Processors</dt>
235+
<dd>
236+
PCRE2 is tested continuously on x86 (i686 and amd64), ARM 32- and 64-bit (armv7 and aarch64), RISC-V (riscv64), POWER (ppc64le), and the big-endian S390x.
237+
</dd>
238+
</dl>
239+
240+
Other systems are likely to work (including mobile, embedded platforms, and commercial UNIX systems), but these are not tested continuously by the PCRE2 maintainers. Users are encouraged to run the full PCRE2 test suite when compiling for any new platform. We are aware of working ports to VMS and z/OS (PCRE2 supports EBCDIC).
25241
26-
## Contributed Ports
242+
PCRE2 releases support CMake for building, and for UNIX platforms include a `./configure` script built by Autoconf. Build files for the Bazel build system and `zig build` are also included. Integrating PCRE2 with other systems can be done by including the `.c` files in an existing project.
27243
28-
If you just need the command-line PCRE2 tools on Windows, precompiled binary
29-
versions are available at this
30-
[Rexegg page](http://www.rexegg.com/pcregrep-pcretest.html).
244+
Please see the files [README](./README) and [NON-AUTOTOOLS-BUILD](./NON-AUTOTOOLS-BUILD) for full build documentation, as well as the man pages, including [`man pcre2/doc/pcre2build.3`](https://pcre2project.github.io/pcre2/doc/html/pcre2build.html).
31245
32-
A PCRE2 port for z/OS, a mainframe operating system which uses EBCDIC as its
33-
default character encoding, can be found at
34-
[http://www.cbttape.org](http://www.cbttape.org/) (File 939).
246+
## Licence
35247
36-
## Documentation
248+
PCRE2 is released under the **BSD 3-clause licence** with a PCRE2 Exception. It is open-source and also corporate-friendly.
37249
38-
You can read the PCRE2 documentation
39-
[here](https://PCRE2Project.github.io/pcre2/doc/html/index.html).
250+
- See [LICENCE](./LICENCE.md) for legal text.
251+
- See [AUTHORS](./AUTHORS.md) for details of the current maintainers of PCRE2 and acknowledgements of its contributors, including Philip Hazel, the original author.
40252
41-
Comparisons to Perl's regular expression semantics can be found in the
42-
community authored Wikipedia entry for PCRE.
253+
## Contributing & support
43254
44-
There is a curated summary of changes for each PCRE release, copies of
45-
documentation from older releases, and other useful information from the third
46-
party authored
47-
[RexEgg PCRE Documentation and Change Log page](http://www.rexegg.com/pcre-documentation.html).
255+
Join the community by reporting issues or asking questions via [GitHub issues](https://github.com/PCRE2Project/pcre2/issues). We welcome feedback and proposals.
48256
49-
## Contact
257+
Contributions ranging from bug fixes to feature requests are welcome, and can be made via GitHub pull requests.
50258
51-
To report a problem with the PCRE2 library, or to make a feature request, please
52-
use the PCRE2 GitHub issues tracker. There is a mailing list for discussion of
53-
PCRE2 issues and development at pcre2-dev@googlegroups.com, which is where any
54-
announcements will be made. You can browse the
55-
[list archives](https://groups.google.com/g/pcre2-dev).
259+
Please review our [SECURITY](./SECURITY.md) policy for information on reporting security issues.
56260
261+
Release announcements will be made via the [pcre2-dev@googlegroups.com](https://groups.google.com/g/pcre2-dev) mailing list, where you can also start discussions about PCRE2 issues and development. You can browse the [list archives](https://groups.google.com/g/pcre2-dev).
