<html>
<head>
<title>pcre2api specification</title>
</head>
<body bgcolor="#FFFFFF" text="#00005A" link="#0066FF" alink="#3399FF" vlink="#2222BB">
<h1>pcre2api man page</h1>
<p>
Return to the <a href="index.html">PCRE2 index page</a>.
</p>
<p>
This page is part of the PCRE2 HTML documentation. It was generated
automatically from the original man page. If there is any nonsense in it,
please consult the man page, in case the conversion went wrong.
<br>
<ul>
<li><a name="TOC1" href="#SEC1">PCRE2 NATIVE API BASIC FUNCTIONS</a>
<li><a name="TOC2" href="#SEC2">PCRE2 NATIVE API AUXILIARY MATCH FUNCTIONS</a>
<li><a name="TOC3" href="#SEC3">PCRE2 NATIVE API GENERAL CONTEXT FUNCTIONS</a>
<li><a name="TOC4" href="#SEC4">PCRE2 NATIVE API COMPILE CONTEXT FUNCTIONS</a>
<li><a name="TOC5" href="#SEC5">PCRE2 NATIVE API MATCH CONTEXT FUNCTIONS</a>
<li><a name="TOC6" href="#SEC6">PCRE2 NATIVE API STRING EXTRACTION FUNCTIONS</a>
<li><a name="TOC7" href="#SEC7">PCRE2 NATIVE API STRING SUBSTITUTION FUNCTION</a>
<li><a name="TOC8" href="#SEC8">PCRE2 NATIVE API JIT FUNCTIONS</a>
<li><a name="TOC9" href="#SEC9">PCRE2 NATIVE API SERIALIZATION FUNCTIONS</a>
<li><a name="TOC10" href="#SEC10">PCRE2 NATIVE API AUXILIARY FUNCTIONS</a>
<li><a name="TOC11" href="#SEC11">PCRE2 NATIVE API OBSOLETE FUNCTIONS</a>
<li><a name="TOC12" href="#SEC12">PCRE2 EXPERIMENTAL PATTERN CONVERSION FUNCTIONS</a>
<li><a name="TOC13" href="#SEC13">PCRE2 8-BIT, 16-BIT, AND 32-BIT LIBRARIES</a>
<li><a name="TOC14" href="#SEC14">PCRE2 API OVERVIEW</a>
<li><a name="TOC15" href="#SEC15">STRING LENGTHS AND OFFSETS</a>
<li><a name="TOC16" href="#SEC16">NEWLINES</a>
<li><a name="TOC17" href="#SEC17">MULTITHREADING</a>
<li><a name="TOC18" href="#SEC18">PCRE2 CONTEXTS</a>
<li><a name="TOC19" href="#SEC19">CHECKING BUILD-TIME OPTIONS</a>
<li><a name="TOC20" href="#SEC20">COMPILING A PATTERN</a>
<li><a name="TOC21" href="#SEC21">JUST-IN-TIME (JIT) COMPILATION</a>
<li><a name="TOC22" href="#SEC22">LOCALE SUPPORT</a>
<li><a name="TOC23" href="#SEC23">INFORMATION ABOUT A COMPILED PATTERN</a>
<li><a name="TOC24" href="#SEC24">INFORMATION ABOUT A PATTERN'S CALLOUTS</a>
<li><a name="TOC25" href="#SEC25">SERIALIZATION AND PRECOMPILING</a>
<li><a name="TOC26" href="#SEC26">THE MATCH DATA BLOCK</a>
<li><a name="TOC27" href="#SEC27">MEMORY USE FOR MATCH DATA BLOCKS</a>
<li><a name="TOC28" href="#SEC28">MATCHING A PATTERN: THE TRADITIONAL FUNCTION</a>
<li><a name="TOC29" href="#SEC29">NEWLINE HANDLING WHEN MATCHING</a>
<li><a name="TOC30" href="#SEC30">HOW PCRE2_MATCH() RETURNS A STRING AND CAPTURED SUBSTRINGS</a>
<li><a name="TOC31" href="#SEC31">OTHER INFORMATION ABOUT A MATCH</a>
<li><a name="TOC32" href="#SEC32">ERROR RETURNS FROM <b>pcre2_match()</b></a>
<li><a name="TOC33" href="#SEC33">OBTAINING A TEXTUAL ERROR MESSAGE</a>
<li><a name="TOC34" href="#SEC34">EXTRACTING CAPTURED SUBSTRINGS BY NUMBER</a>
<li><a name="TOC35" href="#SEC35">EXTRACTING A LIST OF ALL CAPTURED SUBSTRINGS</a>
<li><a name="TOC36" href="#SEC36">EXTRACTING CAPTURED SUBSTRINGS BY NAME</a>
<li><a name="TOC37" href="#SEC37">CREATING A NEW STRING WITH SUBSTITUTIONS</a>
<li><a name="TOC38" href="#SEC38">DUPLICATE CAPTURE GROUP NAMES</a>
<li><a name="TOC39" href="#SEC39">FINDING ALL POSSIBLE MATCHES AT ONE POSITION</a>
<li><a name="TOC40" href="#SEC40">MATCHING A PATTERN: THE ALTERNATIVE FUNCTION</a>
<li><a name="TOC41" href="#SEC41">SEE ALSO</a>
<li><a name="TOC42" href="#SEC42">AUTHOR</a>
<li><a name="TOC43" href="#SEC43">REVISION</a>
</ul>
<p>
<b>#include &lt;pcre2.h&gt;</b>
<br>
<br>
PCRE2 is a new API for PCRE, starting at release 10.0. This document contains a
description of all its native functions. See the
<a href="pcre2.html"><b>pcre2</b></a>
document for an overview of all the PCRE2 documentation.
</p>
<h2><a name="SEC1" href="#TOC1">PCRE2 NATIVE API BASIC FUNCTIONS</a></h2>
<p>
<b>pcre2_code *pcre2_compile(PCRE2_SPTR <i>pattern</i>, PCRE2_SIZE <i>length</i>,</b>
<b> uint32_t <i>options</i>, int *<i>errorcode</i>, PCRE2_SIZE *<i>erroroffset,</i></b>
<b> pcre2_compile_context *<i>ccontext</i>);</b>
<br>
<br>
<b>void pcre2_code_free(pcre2_code *<i>code</i>);</b>
<br>
<br>
<b>pcre2_match_data *pcre2_match_data_create(uint32_t <i>ovecsize</i>,</b>
<b> pcre2_general_context *<i>gcontext</i>);</b>
<br>
<br>
<b>pcre2_match_data *pcre2_match_data_create_from_pattern(</b>
<b> const pcre2_code *<i>code</i>, pcre2_general_context *<i>gcontext</i>);</b>
<br>
<br>
<b>int pcre2_match(const pcre2_code *<i>code</i>, PCRE2_SPTR <i>subject</i>,</b>
<b> PCRE2_SIZE <i>length</i>, PCRE2_SIZE <i>startoffset</i>,</b>
<b> uint32_t <i>options</i>, pcre2_match_data *<i>match_data</i>,</b>
<b> pcre2_match_context *<i>mcontext</i>);</b>
<br>
<br>
<b>int pcre2_dfa_match(const pcre2_code *<i>code</i>, PCRE2_SPTR <i>subject</i>,</b>
<b> PCRE2_SIZE <i>length</i>, PCRE2_SIZE <i>startoffset</i>,</b>
<b> uint32_t <i>options</i>, pcre2_match_data *<i>match_data</i>,</b>
<b> pcre2_match_context *<i>mcontext</i>,</b>
<b> int *<i>workspace</i>, PCRE2_SIZE <i>wscount</i>);</b>
<br>
<br>
<b>void pcre2_match_data_free(pcre2_match_data *<i>match_data</i>);</b>
</p>
<h2><a name="SEC2" href="#TOC1">PCRE2 NATIVE API AUXILIARY MATCH FUNCTIONS</a></h2>
<p>
<b>PCRE2_SPTR pcre2_get_mark(pcre2_match_data *<i>match_data</i>);</b>
<br>
<br>
<b>PCRE2_SIZE pcre2_get_match_data_size(pcre2_match_data *<i>match_data</i>);</b>
<br>
<br>
<b>PCRE2_SIZE pcre2_get_match_data_heapframes_size(</b>
<b> pcre2_match_data *<i>match_data</i>);</b>
<br>
<br>
<b>uint32_t pcre2_get_ovector_count(pcre2_match_data *<i>match_data</i>);</b>
<br>
<br>
<b>PCRE2_SIZE *pcre2_get_ovector_pointer(pcre2_match_data *<i>match_data</i>);</b>
<br>
<br>
<b>PCRE2_SIZE pcre2_get_startchar(pcre2_match_data *<i>match_data</i>);</b>
</p>
<h2><a name="SEC3" href="#TOC1">PCRE2 NATIVE API GENERAL CONTEXT FUNCTIONS</a></h2>
<p>
<b>pcre2_general_context *pcre2_general_context_create(</b>
<b> void *(*<i>private_malloc</i>)(PCRE2_SIZE, void *),</b>
<b> void (*<i>private_free</i>)(void *, void *), void *<i>memory_data</i>);</b>
<br>
<br>
<b>pcre2_general_context *pcre2_general_context_copy(</b>
<b> pcre2_general_context *<i>gcontext</i>);</b>
<br>
<br>
<b>void pcre2_general_context_free(pcre2_general_context *<i>gcontext</i>);</b>
</p>
<h2><a name="SEC4" href="#TOC1">PCRE2 NATIVE API COMPILE CONTEXT FUNCTIONS</a></h2>
<p>
<b>pcre2_compile_context *pcre2_compile_context_create(</b>
<b> pcre2_general_context *<i>gcontext</i>);</b>
<br>
<br>
<b>pcre2_compile_context *pcre2_compile_context_copy(</b>
<b> pcre2_compile_context *<i>ccontext</i>);</b>
<br>
<br>
<b>void pcre2_compile_context_free(pcre2_compile_context *<i>ccontext</i>);</b>
<br>
<br>
<b>int pcre2_set_bsr(pcre2_compile_context *<i>ccontext</i>,</b>
<b> uint32_t <i>value</i>);</b>
<br>
<br>
<b>int pcre2_set_character_tables(pcre2_compile_context *<i>ccontext</i>,</b>
<b> const uint8_t *<i>tables</i>);</b>
<br>
<br>
<b>int pcre2_set_compile_extra_options(pcre2_compile_context *<i>ccontext</i>,</b>
<b> uint32_t <i>extra_options</i>);</b>
<br>
<br>
<b>int pcre2_set_max_pattern_length(pcre2_compile_context *<i>ccontext</i>,</b>
<b> PCRE2_SIZE <i>value</i>);</b>
<br>
<br>
<b>int pcre2_set_max_pattern_compiled_length(</b>
<b> pcre2_compile_context *<i>ccontext</i>, PCRE2_SIZE <i>value</i>);</b>
<br>
<br>
<b>int pcre2_set_max_varlookbehind(pcre2_compile_context *<i>ccontext</i>,</b>
<b> uint32_t <i>value</i>);</b>
<br>
<br>
<b>int pcre2_set_newline(pcre2_compile_context *<i>ccontext</i>,</b>
<b> uint32_t <i>value</i>);</b>
<br>
<br>
<b>int pcre2_set_parens_nest_limit(pcre2_compile_context *<i>ccontext</i>,</b>
<b> uint32_t <i>value</i>);</b>
<br>
<br>
<b>int pcre2_set_compile_recursion_guard(pcre2_compile_context *<i>ccontext</i>,</b>
<b> int (*<i>guard_function</i>)(uint32_t, void *), void *<i>user_data</i>);</b>
<br>
<br>
<b>int pcre2_set_optimize(pcre2_compile_context *<i>ccontext</i>,</b>
<b> uint32_t <i>directive</i>);</b>
</p>
<h2><a name="SEC5" href="#TOC1">PCRE2 NATIVE API MATCH CONTEXT FUNCTIONS</a></h2>
<p>
<b>pcre2_match_context *pcre2_match_context_create(</b>
<b> pcre2_general_context *<i>gcontext</i>);</b>
<br>
<br>
<b>pcre2_match_context *pcre2_match_context_copy(</b>
<b> pcre2_match_context *<i>mcontext</i>);</b>
<br>
<br>
<b>void pcre2_match_context_free(pcre2_match_context *<i>mcontext</i>);</b>
<br>
<br>
<b>int pcre2_set_callout(pcre2_match_context *<i>mcontext</i>,</b>
<b> int (*<i>callout_function</i>)(pcre2_callout_block *, void *),</b>
<b> void *<i>callout_data</i>);</b>
<br>
<br>
<b>int pcre2_set_substitute_callout(pcre2_match_context *<i>mcontext</i>,</b>
<b> int (*<i>callout_function</i>)(pcre2_substitute_callout_block *, void *),</b>
<b> void *<i>callout_data</i>);</b>
<br>
<br>
<b>int pcre2_set_substitute_case_callout(pcre2_match_context *<i>mcontext</i>,</b>
<b> PCRE2_SIZE (*<i>callout_function</i>)(PCRE2_SPTR, PCRE2_SIZE,</b>
<b> PCRE2_UCHAR *, PCRE2_SIZE,</b>
<b> int, void *),</b>
<b> void *<i>callout_data</i>);</b>
<br>
<br>
<b>int pcre2_set_offset_limit(pcre2_match_context *<i>mcontext</i>,</b>
<b> PCRE2_SIZE <i>value</i>);</b>
<br>
<br>
<b>int pcre2_set_heap_limit(pcre2_match_context *<i>mcontext</i>,</b>
<b> uint32_t <i>value</i>);</b>
<br>
<br>
<b>int pcre2_set_match_limit(pcre2_match_context *<i>mcontext</i>,</b>
<b> uint32_t <i>value</i>);</b>
<br>
<br>
<b>int pcre2_set_depth_limit(pcre2_match_context *<i>mcontext</i>,</b>
<b> uint32_t <i>value</i>);</b>
</p>
<h2><a name="SEC6" href="#TOC1">PCRE2 NATIVE API STRING EXTRACTION FUNCTIONS</a></h2>
<p>
<b>int pcre2_substring_copy_byname(pcre2_match_data *<i>match_data</i>,</b>
<b> PCRE2_SPTR <i>name</i>, PCRE2_UCHAR *<i>buffer</i>, PCRE2_SIZE *<i>bufflen</i>);</b>
<br>
<br>
<b>int pcre2_substring_copy_bynumber(pcre2_match_data *<i>match_data</i>,</b>
<b> uint32_t <i>number</i>, PCRE2_UCHAR *<i>buffer</i>,</b>
<b> PCRE2_SIZE *<i>bufflen</i>);</b>
<br>
<br>
<b>void pcre2_substring_free(PCRE2_UCHAR *<i>buffer</i>);</b>
<br>
<br>
<b>int pcre2_substring_get_byname(pcre2_match_data *<i>match_data</i>,</b>
<b> PCRE2_SPTR <i>name</i>, PCRE2_UCHAR **<i>bufferptr</i>, PCRE2_SIZE *<i>bufflen</i>);</b>
<br>
<br>
<b>int pcre2_substring_get_bynumber(pcre2_match_data *<i>match_data</i>,</b>
<b> uint32_t <i>number</i>, PCRE2_UCHAR **<i>bufferptr</i>,</b>
<b> PCRE2_SIZE *<i>bufflen</i>);</b>
<br>
<br>
<b>int pcre2_substring_length_byname(pcre2_match_data *<i>match_data</i>,</b>
<b> PCRE2_SPTR <i>name</i>, PCRE2_SIZE *<i>length</i>);</b>
<br>
<br>
<b>int pcre2_substring_length_bynumber(pcre2_match_data *<i>match_data</i>,</b>
<b> uint32_t <i>number</i>, PCRE2_SIZE *<i>length</i>);</b>
<br>
<br>
<b>int pcre2_substring_nametable_scan(const pcre2_code *<i>code</i>,</b>
<b> PCRE2_SPTR <i>name</i>, PCRE2_SPTR *<i>first</i>, PCRE2_SPTR *<i>last</i>);</b>
<br>
<br>
<b>int pcre2_substring_number_from_name(const pcre2_code *<i>code</i>,</b>
<b> PCRE2_SPTR <i>name</i>);</b>
<br>
<br>
<b>void pcre2_substring_list_free(PCRE2_UCHAR **<i>list</i>);</b>
<br>
<br>
<b>int pcre2_substring_list_get(pcre2_match_data *<i>match_data</i>,</b>
<b> PCRE2_UCHAR ***<i>listptr</i>, PCRE2_SIZE **<i>lengthsptr</i>);</b>
</p>
<h2><a name="SEC7" href="#TOC1">PCRE2 NATIVE API STRING SUBSTITUTION FUNCTION</a></h2>
<p>
<b>int pcre2_substitute(const pcre2_code *<i>code</i>, PCRE2_SPTR <i>subject</i>,</b>
<b> PCRE2_SIZE <i>length</i>, PCRE2_SIZE <i>startoffset</i>,</b>
<b> uint32_t <i>options</i>, pcre2_match_data *<i>match_data</i>,</b>
<b> pcre2_match_context *<i>mcontext</i>, PCRE2_SPTR <i>replacement</i>,</b>
<b> PCRE2_SIZE <i>rlength</i>, PCRE2_UCHAR *<i>outputbuffer</i>,</b>
<b> PCRE2_SIZE *<i>outlengthptr</i>);</b>
</p>
<h2><a name="SEC8" href="#TOC1">PCRE2 NATIVE API JIT FUNCTIONS</a></h2>
<p>
<b>int pcre2_jit_compile(pcre2_code *<i>code</i>, uint32_t <i>options</i>);</b>
<br>
<br>
<b>int pcre2_jit_match(const pcre2_code *<i>code</i>, PCRE2_SPTR <i>subject</i>,</b>
<b> PCRE2_SIZE <i>length</i>, PCRE2_SIZE <i>startoffset</i>,</b>
<b> uint32_t <i>options</i>, pcre2_match_data *<i>match_data</i>,</b>
<b> pcre2_match_context *<i>mcontext</i>);</b>
<br>
<br>
<b>void pcre2_jit_free_unused_memory(pcre2_general_context *<i>gcontext</i>);</b>
<br>
<br>
<b>pcre2_jit_stack *pcre2_jit_stack_create(size_t <i>startsize</i>,</b>
<b> size_t <i>maxsize</i>, pcre2_general_context *<i>gcontext</i>);</b>
<br>
<br>
<b>void pcre2_jit_stack_assign(pcre2_match_context *<i>mcontext</i>,</b>
<b> pcre2_jit_callback <i>callback_function</i>, void *<i>callback_data</i>);</b>
<br>
<br>
<b>void pcre2_jit_stack_free(pcre2_jit_stack *<i>jit_stack</i>);</b>
</p>
<h2><a name="SEC9" href="#TOC1">PCRE2 NATIVE API SERIALIZATION FUNCTIONS</a></h2>
<p>
<b>int32_t pcre2_serialize_decode(pcre2_code **<i>codes</i>,</b>
<b> int32_t <i>number_of_codes</i>, const uint8_t *<i>bytes</i>,</b>
<b> pcre2_general_context *<i>gcontext</i>);</b>
<br>
<br>
<b>int32_t pcre2_serialize_encode(const pcre2_code **<i>codes</i>,</b>
<b> int32_t <i>number_of_codes</i>, uint8_t **<i>serialized_bytes</i>,</b>
<b> PCRE2_SIZE *<i>serialized_size</i>, pcre2_general_context *<i>gcontext</i>);</b>
<br>
<br>
<b>void pcre2_serialize_free(uint8_t *<i>bytes</i>);</b>
<br>
<br>
<b>int32_t pcre2_serialize_get_number_of_codes(const uint8_t *<i>bytes</i>);</b>
</p>
<h2><a name="SEC10" href="#TOC1">PCRE2 NATIVE API AUXILIARY FUNCTIONS</a></h2>
<p>
<b>pcre2_code *pcre2_code_copy(const pcre2_code *<i>code</i>);</b>
<br>
<br>
<b>pcre2_code *pcre2_code_copy_with_tables(const pcre2_code *<i>code</i>);</b>
<br>
<br>
<b>int pcre2_get_error_message(int <i>errorcode</i>, PCRE2_UCHAR *<i>buffer</i>,</b>
<b> PCRE2_SIZE <i>bufflen</i>);</b>
<br>
<br>
<b>const uint8_t *pcre2_maketables(pcre2_general_context *<i>gcontext</i>);</b>
<br>
<br>
<b>void pcre2_maketables_free(pcre2_general_context *<i>gcontext</i>,</b>
<b> const uint8_t *<i>tables</i>);</b>
<br>
<br>
<b>int pcre2_pattern_info(const pcre2_code *<i>code</i>, uint32_t <i>what</i>,</b>
<b> void *<i>where</i>);</b>
<br>
<br>
<b>int pcre2_callout_enumerate(const pcre2_code *<i>code</i>,</b>
<b> int (*<i>callback</i>)(pcre2_callout_enumerate_block *, void *),</b>
<b> void *<i>user_data</i>);</b>
<br>
<br>
<b>int pcre2_config(uint32_t <i>what</i>, void *<i>where</i>);</b>
</p>
<h2><a name="SEC11" href="#TOC1">PCRE2 NATIVE API OBSOLETE FUNCTIONS</a></h2>
<p>
<b>int pcre2_set_recursion_limit(pcre2_match_context *<i>mcontext</i>,</b>
<b> uint32_t <i>value</i>);</b>
<br>
<br>
<b>int pcre2_set_recursion_memory_management(</b>
<b> pcre2_match_context *<i>mcontext</i>,</b>
<b> void *(*<i>private_malloc</i>)(size_t, void *),</b>
<b> void (*<i>private_free</i>)(void *, void *), void *<i>memory_data</i>);</b>
<br>
<br>
These functions became obsolete at release 10.30 and are retained only for
backward compatibility. They should not be used in new code. The first is
replaced by <b>pcre2_set_depth_limit()</b>; the second is no longer needed and
has no effect (it always returns zero).
</p>
<h2><a name="SEC12" href="#TOC1">PCRE2 EXPERIMENTAL PATTERN CONVERSION FUNCTIONS</a></h2>
<p>
<b>pcre2_convert_context *pcre2_convert_context_create(</b>
<b> pcre2_general_context *<i>gcontext</i>);</b>
<br>
<br>
<b>pcre2_convert_context *pcre2_convert_context_copy(</b>
<b> pcre2_convert_context *<i>cvcontext</i>);</b>
<br>
<br>
<b>void pcre2_convert_context_free(pcre2_convert_context *<i>cvcontext</i>);</b>
<br>
<br>
<b>int pcre2_set_glob_escape(pcre2_convert_context *<i>cvcontext</i>,</b>
<b> uint32_t <i>escape_char</i>);</b>
<br>
<br>
<b>int pcre2_set_glob_separator(pcre2_convert_context *<i>cvcontext</i>,</b>
<b> uint32_t <i>separator_char</i>);</b>
<br>
<br>
<b>int pcre2_pattern_convert(PCRE2_SPTR <i>pattern</i>, PCRE2_SIZE <i>length</i>,</b>
<b> uint32_t <i>options</i>, PCRE2_UCHAR **<i>buffer</i>,</b>
<b> PCRE2_SIZE *<i>blength</i>, pcre2_convert_context *<i>cvcontext</i>);</b>
<br>
<br>
<b>void pcre2_converted_pattern_free(PCRE2_UCHAR *<i>converted_pattern</i>);</b>
<br>
<br>
These functions provide a way of converting non-PCRE2 patterns into
patterns that can be processed by <b>pcre2_compile()</b>. This facility is
experimental and may be changed in future releases. At present, "globs" and
POSIX basic and extended patterns can be converted. Details are given in the
<a href="pcre2convert.html"><b>pcre2convert</b></a>
documentation.
</p>
<h2><a name="SEC13" href="#TOC1">PCRE2 8-BIT, 16-BIT, AND 32-BIT LIBRARIES</a></h2>
<p>
There are three PCRE2 libraries, supporting 8-bit, 16-bit, and 32-bit code
units, respectively. However, there is just one header file, <b>pcre2.h</b>.
This contains the function prototypes and other definitions for all three
libraries. One, two, or all three can be installed simultaneously. On Unix-like
systems the libraries are called <b>libpcre2-8</b>, <b>libpcre2-16</b>, and
<b>libpcre2-32</b>, and they can also co-exist with the original PCRE libraries.
Every PCRE2 function comes in three different forms, one for each library, for
example:
<pre>
<b>pcre2_compile_8()</b>
<b>pcre2_compile_16()</b>
<b>pcre2_compile_32()</b>
</pre>
There are also three different sets of data types:
<pre>
<b>PCRE2_UCHAR8, PCRE2_UCHAR16, PCRE2_UCHAR32</b>
<b>PCRE2_SPTR8, PCRE2_SPTR16, PCRE2_SPTR32</b>
</pre>
The UCHAR types define unsigned code units of the appropriate widths.
For example, PCRE2_UCHAR16 is usually defined as <i>uint16_t</i>.
The SPTR types are pointers to constants of the equivalent UCHAR types,
that is, they are pointers to vectors of unsigned code units.
</p>
<p>
Character strings are passed to a PCRE2 library as sequences of unsigned
integers in code units of the appropriate width. The length of a string may
be given as a number of code units, or the string may be specified as
zero-terminated.
</p>
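<p>
The two ways of giving a length can be illustrated with a small stand-alone
sketch. It does not call the library: <i>demo_size</i>, DEMO_ZERO_TERMINATED,
<i>demo_resolve_length16()</i>, and <i>demo_bytes16()</i> are hypothetical
names invented for this illustration. The sketch assumes that a reserved
all-ones length value marks a zero-terminated string, which is the role that
PCRE2_ZERO_TERMINATED plays in <b>pcre2.h</b>.
</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for PCRE2_SIZE (an unsigned integer type). */
typedef size_t demo_size;

/* Stand-in for the reserved all-ones length value. */
#define DEMO_ZERO_TERMINATED (~(demo_size)0)

/* Resolve a length argument for a string of 16-bit code units: either the
   caller supplied an explicit count, or the string is scanned for its
   terminating zero code unit. The result is in code units, not bytes. */
demo_size demo_resolve_length16(const uint16_t *s, demo_size length)
{
  demo_size n = 0;
  assert(s != NULL);
  if (length != DEMO_ZERO_TERMINATED) return length;
  while (s[n] != 0) n++;
  return n;
}

/* Number of bytes occupied by a given number of 16-bit code units. */
demo_size demo_bytes16(demo_size code_units)
{
  return code_units * sizeof(uint16_t);
}
```

<p>
For a four-unit 16-bit string, <i>demo_resolve_length16()</i> yields 4 whether
the caller passes 4 or the reserved value, while <i>demo_bytes16(4)</i> yields
8. Only with 8-bit code units do the two counts coincide.
</p>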
<p>
Many applications use only one code unit width. For their convenience, macros
are defined whose names are the generic forms such as <b>pcre2_compile()</b> and
PCRE2_SPTR. These macros use the value of the macro PCRE2_CODE_UNIT_WIDTH to
generate the appropriate width-specific function and macro names.
PCRE2_CODE_UNIT_WIDTH is not defined by default. An application must define it
to be 8, 16, or 32 before including <b>pcre2.h</b> in order to make use of the
generic names.
</p>
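<p>
The way a generic name can expand to a width-specific one may be sketched with
ordinary preprocessor token pasting. The following stand-alone example does not
include <b>pcre2.h</b>; every name beginning with DEMO_ or demo_ is
hypothetical, and the real header may differ in detail.
</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for PCRE2_CODE_UNIT_WIDTH. A real application
   defines that macro before including pcre2.h. */
#define DEMO_CODE_UNIT_WIDTH 16

/* Two levels of macro are needed so that DEMO_CODE_UNIT_WIDTH is expanded
   to its value before the ## operator pastes the tokens together. */
#define DEMO_JOIN(a, b)   a ## b
#define DEMO_GLUE(a, b)   DEMO_JOIN(a, b)
#define DEMO_SUFFIX(name) DEMO_GLUE(name, DEMO_CODE_UNIT_WIDTH)

/* Width-specific ("real") type names. */
typedef uint8_t  DEMO_UCHAR8;
typedef uint16_t DEMO_UCHAR16;
typedef uint32_t DEMO_UCHAR32;

/* Width-specific ("real") function names: length in code units. */
size_t demo_length_8(const DEMO_UCHAR8 *s)
{
  size_t n = 0;
  assert(s != NULL);
  while (s[n] != 0) n++;
  return n;
}

size_t demo_length_16(const DEMO_UCHAR16 *s)
{
  size_t n = 0;
  assert(s != NULL);
  while (s[n] != 0) n++;
  return n;
}

size_t demo_length_32(const DEMO_UCHAR32 *s)
{
  size_t n = 0;
  assert(s != NULL);
  while (s[n] != 0) n++;
  return n;
}

/* Generic names: with the width set to 16 these expand to DEMO_UCHAR16
   and demo_length_16 respectively. */
#define DEMO_UCHAR  DEMO_SUFFIX(DEMO_UCHAR)
#define demo_length DEMO_SUFFIX(demo_length_)
```

<p>
With the width set to 16 as above, a declaration such as
<i>const DEMO_UCHAR s[] = { 'a', 'b', 'c', 0 };</i> has 16-bit elements, and
<i>demo_length(s)</i> calls <i>demo_length_16()</i>. Code that must work with
several widths at once would call the suffixed names directly.
</p>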
<p>
Applications that use more than one code unit width can be linked with more
than one PCRE2 library, but must define PCRE2_CODE_UNIT_WIDTH to be 0 before
including <b>pcre2.h</b>, and then use the real function names. Any code that is
to be included in an environment where the value of PCRE2_CODE_UNIT_WIDTH is
unknown should also use the real function names. (Unfortunately, it is not
possible in C code to save and restore the value of a macro.)
</p>
<p>
If PCRE2_CODE_UNIT_WIDTH is not defined before including <b>pcre2.h</b>, a
compiler error occurs.
</p>
<p>
When using multiple libraries in an application, you must take care when
processing any particular pattern to use only functions from a single library.
For example, if you want to run a match using a pattern that was compiled with
<b>pcre2_compile_16()</b>, you must do so with <b>pcre2_match_16()</b>, not
<b>pcre2_match_8()</b> or <b>pcre2_match_32()</b>.
</p>
<p>
In the function summaries above, and in the rest of this document and other
PCRE2 documents, functions and data types are described using their generic
names, without the _8, _16, or _32 suffix.
</p>
<h2><a name="SEC14" href="#TOC1">PCRE2 API OVERVIEW</a></h2>
<p>
PCRE2 has its own native API, which is described in this document. There are
also some wrapper functions for the 8-bit library that correspond to the
POSIX regular expression API, but they do not give access to all the
functionality of PCRE2 and they are not thread-safe. They are described in the
<a href="pcre2posix.html"><b>pcre2posix</b></a>
documentation. Both these APIs define a set of C function calls.
</p>
<p>
The native API C data types, function prototypes, option values, and error
codes are defined in the header file <b>pcre2.h</b>, which also contains
definitions of PCRE2_MAJOR and PCRE2_MINOR, the major and minor release numbers
for the library. Applications can use these to include support for different
releases of PCRE2.
</p>
<p>
In a Windows environment, if you want to statically link an application program
against a non-dll PCRE2 library, you must define PCRE2_STATIC before including
<b>pcre2.h</b>.
</p>
<p>
The functions <b>pcre2_compile()</b> and <b>pcre2_match()</b> are used for
compiling and matching regular expressions in a Perl-compatible manner. A
sample program that demonstrates the simplest way of using them is provided in
the file called <i>pcre2demo.c</i> in the PCRE2 source distribution. A listing
of this program is given in the
<a href="pcre2demo.html"><b>pcre2demo</b></a>
documentation, and the
<a href="pcre2sample.html"><b>pcre2sample</b></a>
documentation describes how to compile and run it.
</p>
<p>
The compiling and matching functions recognize various options that are passed
as bits in an options argument. There are also some more complicated parameters
such as custom memory management functions and resource limits that are passed
in "contexts" (which are just memory blocks, described below). Simple
applications do not need to make use of contexts.
</p>
<p>
Just-in-time (JIT) compiler support is an optional feature of PCRE2 that can be
built in appropriate hardware environments. It greatly speeds up the matching
performance of many patterns. Programs can request that it be used if
available by calling <b>pcre2_jit_compile()</b> after a pattern has been
successfully compiled by <b>pcre2_compile()</b>. This does nothing if JIT
support is not available.
</p>
<p>
More complicated programs might need to make use of the specialist functions
<b>pcre2_jit_stack_create()</b>, <b>pcre2_jit_stack_free()</b>, and
<b>pcre2_jit_stack_assign()</b> in order to control the JIT code's memory usage.
</p>
<p>
JIT matching is automatically used by <b>pcre2_match()</b> if it is available,
unless the PCRE2_NO_JIT option is set. There is also a direct interface for JIT
matching, which gives improved performance at the cost of reduced sanity
checking. The JIT-specific functions are discussed in the
<a href="pcre2jit.html"><b>pcre2jit</b></a>
documentation.
</p>
<p>
A second matching function, <b>pcre2_dfa_match()</b>, which is not
Perl-compatible, is also provided. This uses a different algorithm for the
matching. The alternative algorithm finds all possible matches (at a given
point in the subject), and scans the subject just once (unless there are
lookaround assertions). However, this algorithm does not return captured
substrings. A description of the two matching algorithms and their advantages
and disadvantages is given in the
<a href="pcre2matching.html"><b>pcre2matching</b></a>
documentation. There is no JIT support for <b>pcre2_dfa_match()</b>.
</p>
<p>
In addition to the main compiling and matching functions, there are convenience
functions for extracting captured substrings from a subject string that has
been matched by <b>pcre2_match()</b>. They are:
<pre>
<b>pcre2_substring_copy_byname()</b>
<b>pcre2_substring_copy_bynumber()</b>
<b>pcre2_substring_get_byname()</b>
<b>pcre2_substring_get_bynumber()</b>
<b>pcre2_substring_list_get()</b>
<b>pcre2_substring_length_byname()</b>
<b>pcre2_substring_length_bynumber()</b>
<b>pcre2_substring_nametable_scan()</b>
<b>pcre2_substring_number_from_name()</b>
</pre>
<b>pcre2_substring_free()</b> and <b>pcre2_substring_list_free()</b> are also
provided, to free memory used for extracted strings. If either of these
functions is called with a NULL argument, the function returns immediately
without doing anything.
</p>
<p>
The function <b>pcre2_substitute()</b> can be called to match a pattern and
return a copy of the subject string with substitutions for parts that were
matched.
</p>
<p>
Functions whose names begin with <b>pcre2_serialize_</b> are used for saving
compiled patterns on disc or elsewhere, and reloading them later.
</p>
<p>
Finally, there are functions for finding out information about a compiled
pattern (<b>pcre2_pattern_info()</b>) and about the configuration with which
PCRE2 was built (<b>pcre2_config()</b>).
</p>
<p>
Functions with names ending with <b>_free()</b> are used for freeing memory
blocks of various sorts. In all cases, if one of these functions is called with
a NULL argument, it does nothing.
</p>
<h2><a name="SEC15" href="#TOC1">STRING LENGTHS AND OFFSETS</a></h2>
<p>
The PCRE2 API uses string lengths and offsets into strings of code units in
several places. These values are always of type PCRE2_SIZE, which is an
unsigned integer type, currently always defined as <i>size_t</i>. The largest
value that can be stored in such a type (that is ~(PCRE2_SIZE)0) is reserved
as a special indicator for zero-terminated strings and unset offsets.
Therefore, the longest string that can be handled is one less than this
maximum. Note that string lengths are always given in code units. Only in the
8-bit library is such a length the same as the number of bytes in the string.
<a name="newlines"></a></p>
<h2><a name="SEC16" href="#TOC1">NEWLINES</a></h2>
<p>
PCRE2 supports five different conventions for indicating line breaks in
strings: a single CR (carriage return) character, a single LF (linefeed)
character, the two-character sequence CRLF, any of the three preceding, or any
Unicode newline sequence. The Unicode newline sequences are the three just
mentioned, plus the single characters VT (vertical tab, U+000B), FF (form feed,
U+000C), NEL (next line, U+0085), LS (line separator, U+2028), and PS
(paragraph separator, U+2029).
</p>
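<p>
As an illustration of the last convention, the following stand-alone sketch
recognizes a Unicode newline sequence at a given position in a string of 32-bit
code units. It is not the library's implementation, and
<i>demo_any_newline_length()</i> is a hypothetical name used only here.
</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the length, in code units, of the Unicode newline sequence that
   starts at s[pos], or 0 if there is none. CRLF counts as one two-unit
   sequence; a CR that is not followed by LF is a one-unit newline. */
size_t demo_any_newline_length(const uint32_t *s, size_t len, size_t pos)
{
  assert(s != NULL);
  if (pos >= len) return 0;
  switch (s[pos])
    {
    case 0x000D:                       /* CR, possibly followed by LF */
      return (pos + 1 < len && s[pos + 1] == 0x000A) ? 2 : 1;
    case 0x000A:                       /* LF  (linefeed) */
    case 0x000B:                       /* VT  (vertical tab) */
    case 0x000C:                       /* FF  (form feed) */
    case 0x0085:                       /* NEL (next line) */
    case 0x2028:                       /* LS  (line separator) */
    case 0x2029:                       /* PS  (paragraph separator) */
      return 1;
    default:
      return 0;
    }
}
```

<p>
A scanner built on such a helper would advance by the returned length, so that
the two characters of CRLF are consumed together rather than being counted as
two line breaks.
</p>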
<p>
Each of the first three conventions is used by at least one operating system as
its standard newline sequence. When PCRE2 is built, a default can be specified.
If it is not, the default is set to LF, which is the Unix standard. However,
the newline convention can be changed by an application when calling
<b>pcre2_compile()</b>, or it can be specified by special text at the start of
the pattern itself; this overrides any other settings. See the
<a href="pcre2pattern.html"><b>pcre2pattern</b></a>
page for details of the special character sequences.
</p>
<p>
In the PCRE2 documentation the word "newline" is used to mean "the character or
pair of characters that indicate a line break". The choice of newline
convention affects the handling of the dot, circumflex, and dollar
metacharacters, the handling of #-comments in /x mode, and, when CRLF is a
recognized line ending sequence, the match position advancement for a
non-anchored pattern. There is more detail about this in the
<a href="#matchoptions">section on <b>pcre2_match()</b> options</a>
below.
</p>
<p>
The choice of newline convention does not affect the interpretation of
the \n or \r escape sequences, nor does it affect what \R matches; this has
its own separate convention.
</p>
<h2><a name="SEC17" href="#TOC1">MULTITHREADING</a></h2>
<p>
In a multithreaded application it is important to keep thread-specific data
separate from data that can be shared between threads. The PCRE2 library code
itself is thread-safe: it contains no static or global variables. The API is
designed to be fairly simple for non-threaded applications while at the same
time ensuring that multithreaded applications can use it.
</p>
<p>
There are several different blocks of data that are used to pass information
between the application and the PCRE2 libraries.
</p>
<h3>
The compiled pattern
</h3>
<p>
A pointer to the compiled form of a pattern is returned to the user when
<b>pcre2_compile()</b> is successful. The data in the compiled pattern is fixed,
and does not change when the pattern is matched. Therefore, it is thread-safe,
that is, the same compiled pattern can be used by more than one thread
simultaneously. For example, an application can compile all its patterns at the
start, before forking off multiple threads that use them. However, if the
just-in-time (JIT) optimization feature is being used, it needs separate memory
stack areas for each thread. See the
<a href="pcre2jit.html"><b>pcre2jit</b></a>
documentation for more details.
</p>
<p>
In a more complicated situation, where patterns are compiled only when they are
first needed, but are still shared between threads, pointers to compiled
patterns must be protected from simultaneous writing by multiple threads. This
is somewhat tricky to do correctly. If you know that writing to a pointer is
atomic in your environment, you can use logic like this:
<pre>
  Get a read-only (shared) lock (mutex) for pointer
  if (pointer == NULL)
    {
    Get a write (unique) lock for pointer
    if (pointer == NULL) pointer = pcre2_compile(...
    }
  Release the lock
  Use pointer in pcre2_match()
</pre>
Of course, testing for compilation errors should also be included in the code.
</p>
<p>
The reason for checking the pointer a second time is as follows: Several
threads may have acquired the shared lock and tested the pointer for being
NULL, but only one of them will be given the write lock, with the rest kept
waiting. The winning thread will compile the pattern and store the result.
After this thread releases the write lock, another thread will get it, and if
it does not retest pointer for being NULL, will recompile the pattern and
overwrite the pointer, creating a memory leak and possibly causing other
issues.
</p>
<p>
In an environment where writing to a pointer may not be atomic, the above logic
is not sufficient. The thread that is doing the compiling may be descheduled
after writing only part of the pointer, which could cause other threads to use
an invalid value. Instead of checking the pointer itself, a separate "pointer
is valid" flag (that can be updated atomically) must be used:
<pre>
  Get a read-only (shared) lock (mutex) for pointer
  if (!pointer_is_valid)
    {
    Get a write (unique) lock for pointer
    if (!pointer_is_valid)
      {
      pointer = pcre2_compile(...
      pointer_is_valid = TRUE
      }
    }
  Release the lock
  Use pointer in pcre2_match()
</pre>
If JIT is being used, but the JIT compilation is not being done immediately
(perhaps waiting to see if the pattern is used often enough), similar logic is
required. JIT compilation updates a value within the compiled code block, so a
thread must gain unique write access to the pointer before calling
<b>pcre2_jit_compile()</b>. Alternatively, <b>pcre2_code_copy()</b> or
<b>pcre2_code_copy_with_tables()</b> can be used to obtain a private copy of the
compiled code before calling the JIT compiler.
</p>
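<p>
The "pointer is valid" logic above might be sketched in C11 as follows. This is
an illustration under stated assumptions, not code taken from PCRE2: it swaps
the shared/unique lock for a plain pthread mutex plus an atomic flag, and the
names <i>lazy_pattern</i>, <i>lazy_get()</i>, <i>compile_fn</i>, and
<i>counting_compile()</i> are hypothetical. The compile step is passed in as a
callback so that the sketch does not depend on <b>pcre2.h</b>; in a real
program the callback would wrap <b>pcre2_compile()</b> and report its errors.
</p>

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed wrapper around pcre2_compile(); returns NULL on failure. */
typedef void *(*compile_fn)(void *arg);

/* Hypothetical slot for one lazily compiled pattern. In a real program
   "code" would be a pcre2_code *. */
typedef struct {
    pthread_mutex_t lock;      /* serializes the one-time compile */
    atomic_bool     is_valid;  /* the "pointer is valid" flag */
    void           *code;      /* written once, under the lock */
} lazy_pattern;

#define LAZY_PATTERN_INIT { PTHREAD_MUTEX_INITIALIZER, false, NULL }

/* Return the compiled pattern, compiling it on first use. Returns NULL if
   the compile callback fails, leaving the slot invalid for a later retry. */
static void *lazy_get(lazy_pattern *lp, compile_fn compile, void *arg)
{
    void *code = NULL;
    assert(lp != NULL && compile != NULL);

    /* Fast path: the acquire load pairs with the release store below, so a
       thread that sees the flag set also sees the completed pointer. */
    if (atomic_load_explicit(&lp->is_valid, memory_order_acquire))
        return lp->code;

    pthread_mutex_lock(&lp->lock);
    /* Retest under the lock: another thread may have compiled meanwhile. */
    if (!atomic_load_explicit(&lp->is_valid, memory_order_relaxed)) {
        code = compile(arg);
        if (code != NULL) {
            lp->code = code;
            atomic_store_explicit(&lp->is_valid, true, memory_order_release);
        }
    } else {
        code = lp->code;
    }
    pthread_mutex_unlock(&lp->lock);
    return code;
}

/* Stand-in compile step for demonstration: counts calls, returns its arg. */
static int compile_calls = 0;
static void *counting_compile(void *arg)
{
    compile_calls++;
    return arg;
}
```

<p>
Each thread would call <i>lazy_get()</i> just before matching. Because the
pointer is only read after the flag has been observed as set, a partly written
pointer is never used, and the retest under the mutex ensures that the pattern
is compiled at most once.
</p>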
<h3>
Context blocks
</h3>
<p>
The next main section below introduces the idea of "contexts" in which PCRE2
functions are called. A context is nothing more than a collection of parameters
that control the way PCRE2 operates. Grouping a number of parameters together
in a context is a convenient way of passing them to a PCRE2 function without
using lots of arguments. The parameters that are stored in contexts are in some
sense "advanced features" of the API. Many straightforward applications will
not need to use contexts.
</p>
<p>
In a multithreaded application, if the parameters in a context are values that
are never changed, the same context can be used by all the threads. However, if
any thread needs to change any value in a context, it must make its own
thread-specific copy.
</p>
<h3>
Match blocks
</h3>
<p>
The matching functions need a block of memory for storing the results of a
match. This includes details of what was matched, as well as additional
information such as the name of a (*MARK) setting. Each thread must provide its
own copy of this memory.
</p>
<h2><a name="SEC18" href="#TOC1">PCRE2 CONTEXTS</a></h2>
<p>
Some PCRE2 functions have a lot of parameters, many of which are used only by
specialist applications, for example, those that use custom memory management
or non-standard character tables. To keep function argument lists at a
reasonable size, and at the same time to keep the API extensible, "uncommon"
parameters are passed to certain functions in a <b>context</b> instead of
directly. A context is just a block of memory that holds the parameter values.
Applications that do not need to adjust any of the context parameters can pass
NULL when a context pointer is required.
</p>
<p>
There are three different types of context: a general context that is relevant
for several PCRE2 operations, a compile-time context, and a match-time context.
</p>
<h3>
The general context
</h3>
<p>
At present, this context just contains pointers to (and data for) external
memory management functions that are called from several places in the PCRE2
library. The context is named "general" rather than specifically "memory"
because in future other fields may be added. If you do not want to supply your
own custom memory management functions, you do not need to bother with a
general context. A general context is created by:
<br>
<br>
<b>pcre2_general_context *pcre2_general_context_create(</b>
<b> void *(*<i>private_malloc</i>)(PCRE2_SIZE, void *),</b>
<b> void (*<i>private_free</i>)(void *, void *), void *<i>memory_data</i>);</b>
<br>
<br>
The two function pointers specify custom memory management functions, whose
prototypes are:
<pre>
<b>void *private_malloc(PCRE2_SIZE, void *);</b>
<b>void private_free(void *, void *);</b>
</pre>
Whenever code in PCRE2 calls these functions, the final argument is the value
of <i>memory_data</i>. Either of the first two arguments of the creation
function may be NULL, in which case the system memory management functions
<i>malloc()</i> and <i>free()</i> are used. (This is not currently useful, as
there are no other fields in a general context, but in future there might be.)
The <i>private_malloc()</i> function is used (if supplied) to obtain memory for
storing the context, and all three values are saved as part of the context.
</p>
<p>
Whenever PCRE2 creates a data block of any kind, the block contains a pointer
to the <i>free()</i> function that matches the <i>malloc()</i> function that was
used. When the time comes to free the block, this function is called.
</p>
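<p>
As a rough illustration of the two memory management functions described
above, the following sketch wraps <i>malloc()</i> and <i>free()</i> and counts
calls through the <i>memory_data</i> pointer. The names <i>alloc_stats</i>,
<i>counting_malloc()</i>, and <i>counting_free()</i> are hypothetical, and
<b>size_t</b> is written in place of PCRE2_SIZE (which <b>pcre2.h</b> defines
as <b>size_t</b>) so that the sketch compiles without the PCRE2 header. The
counters are not protected against concurrent updates.
</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bookkeeping block, passed to PCRE2 as memory_data. */
typedef struct {
    size_t allocations;  /* successful calls to counting_malloc() */
    size_t frees;        /* non-NULL blocks released */
} alloc_stats;

/* Has the shape of: void *private_malloc(PCRE2_SIZE, void *); */
static void *counting_malloc(size_t size, void *memory_data)
{
    alloc_stats *stats = memory_data;
    void *block;

    assert(stats != NULL);          /* this sketch requires memory_data */
    block = malloc(size);
    if (block != NULL) stats->allocations++;
    return block;
}

/* Has the shape of: void private_free(void *, void *); */
static void counting_free(void *block, void *memory_data)
{
    alloc_stats *stats = memory_data;

    assert(stats != NULL);
    if (block != NULL) stats->frees++;
    free(block);                    /* free(NULL) is a harmless no-op */
}

/* With pcre2.h included, the functions would be registered like this:

     alloc_stats stats = { 0, 0 };
     pcre2_general_context *gcontext =
       pcre2_general_context_create(counting_malloc, counting_free, &stats);
     ...
     pcre2_general_context_free(gcontext);
*/
```

<p>
The <i>stats</i> block must outlive every PCRE2 data block created through the
context, because each such block keeps the pointer and passes it back when it
is freed.
</p>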
<p>
A general context can be copied by calling:
<br>
<br>
<b>pcre2_general_context *pcre2_general_context_copy(</b>
<b> pcre2_general_context *<i>gcontext</i>);</b>
<br>
<br>
The memory used for a general context should be freed by calling:
<br>
<br>
<b>void pcre2_general_context_free(pcre2_general_context *<i>gcontext</i>);</b>
<br>
<br>
If this function is passed a NULL argument, it returns immediately without
doing anything.
<a name="compilecontext"></a></p>
<h3>
The compile context
</h3>
<p>
A compile context is required if you want to provide an external function for
stack checking during compilation or to change the default values of any of the
following compile-time parameters:
<pre>
What \R matches (Unicode newlines or CR, LF, CRLF only)
PCRE2's character tables
The newline character sequence
The compile time nested parentheses limit
The maximum length of the pattern string
The extra options bits (none set by default)
Which performance optimizations the compiler should apply
</pre>
A compile context is also required if you are using custom memory management.
If none of these apply, just pass NULL as the context argument of
<i>pcre2_compile()</i>.
</p>
<p>
A compile context is created, copied, and freed by the following functions:
<br>
<br>
<b>pcre2_compile_context *pcre2_compile_context_create(</b>
<b> pcre2_general_context *<i>gcontext</i>);</b>
<br>
<br>
<b>pcre2_compile_context *pcre2_compile_context_copy(</b>
<b> pcre2_compile_context *<i>ccontext</i>);</b>
<br>
<br>
<b>void pcre2_compile_context_free(pcre2_compile_context *<i>ccontext</i>);</b>
<br>
<br>
A compile context is created with default values for its parameters. These can
be changed by calling the following functions, which return 0 on success, or
PCRE2_ERROR_BADDATA if invalid data is detected.
<br>
<br>
<b>int pcre2_set_bsr(pcre2_compile_context *<i>ccontext</i>,</b>
<b> uint32_t <i>value</i>);</b>
<br>
<br>
The value must be PCRE2_BSR_ANYCRLF, to specify that \R matches only CR, LF,
or CRLF, or PCRE2_BSR_UNICODE, to specify that \R matches any Unicode line
ending sequence. The value is used by the JIT compiler and by the two
interpreted matching functions, <i>pcre2_match()</i> and
<i>pcre2_dfa_match()</i>.
<br>
<br>
<b>int pcre2_set_character_tables(pcre2_compile_context *<i>ccontext</i>,</b>
<b> const uint8_t *<i>tables</i>);</b>
<br>
<br>
The value must be the result of a call to <b>pcre2_maketables()</b>, whose only
argument is a general context. This function builds a set of character tables
in the current locale.
<br>
<br>
<b>int pcre2_set_compile_extra_options(pcre2_compile_context *<i>ccontext</i>,</b>
<b> uint32_t <i>extra_options</i>);</b>
<br>
<br>
As PCRE2 has developed, almost all the 32 option bits that are available in
the <i>options</i> argument of <b>pcre2_compile()</b> have been used up. To avoid
running out, the compile context contains a set of extra option bits which are
used for some newer, assumed rarer, options. This function sets those bits. It
always sets all the bits (either on or off): the value passed replaces the
existing setting rather than being combined with it. The available options are
defined in the section entitled "Extra compile options"
<a href="#extracompileoptions">below.</a>
<br>
<br>
<b>int pcre2_set_max_pattern_length(pcre2_compile_context *<i>ccontext</i>,</b>
<b> PCRE2_SIZE <i>value</i>);</b>
<br>
<br>
This sets a maximum length, in code units, for any pattern string that is
compiled with this context. If the pattern is longer, an error is generated.
This facility is provided so that applications that accept patterns from
external sources can limit their size. The default is the largest number that a
PCRE2_SIZE variable can hold, which is effectively unlimited.
<br>
<br>
<b>int pcre2_set_max_pattern_compiled_length(</b>
<b> pcre2_compile_context *<i>ccontext</i>, PCRE2_SIZE <i>value</i>);</b>
<br>
<br>
This sets a maximum size, in bytes, for the memory needed to hold the compiled
version of a pattern that is compiled with this context. If the pattern needs
more memory, an error is generated. This facility is provided so that
applications that accept patterns from external sources can limit the amount of
memory they use. The default is the largest number that a PCRE2_SIZE variable
can hold, which is effectively unlimited.
<br>
<br>
<b>int pcre2_set_max_varlookbehind(pcre2_compile_context *<i>ccontext</i>,</b>
<b> uint32_t <i>value</i>);</b>
<br>
<br>
This sets the maximum number of characters that a variable-length lookbehind
assertion may match. The default is set when PCRE2 is built,
with the ultimate default being 255, the same as Perl. Lookbehind assertions
without a bounding length are not supported.
<br>
<br>
<b>int pcre2_set_newline(pcre2_compile_context *<i>ccontext</i>,</b>
<b> uint32_t <i>value</i>);</b>
<br>
<br>
This specifies which characters or character sequences are to be recognized as
newlines. The value must be one of PCRE2_NEWLINE_CR (carriage return only),
PCRE2_NEWLINE_LF (linefeed only), PCRE2_NEWLINE_CRLF (the two-character
sequence CR followed by LF), PCRE2_NEWLINE_ANYCRLF (any of the above),
PCRE2_NEWLINE_ANY (any Unicode newline sequence), or PCRE2_NEWLINE_NUL (the
NUL character, that is a binary zero).
</p>
<p>
A pattern can override the value set in the compile context by starting with a
sequence such as (*CRLF). See the
<a href="pcre2pattern.html"><b>pcre2pattern</b></a>
page for details.
</p>
<p>
When a pattern is compiled with the PCRE2_EXTENDED or PCRE2_EXTENDED_MORE
option, the newline convention affects the recognition of the end of internal
comments starting with #. The value is saved with the compiled pattern for
subsequent use by the JIT compiler and by the two interpreted matching
functions, <i>pcre2_match()</i> and <i>pcre2_dfa_match()</i>.
<br>
<br>
<b>int pcre2_set_parens_nest_limit(pcre2_compile_context *<i>ccontext</i>,</b>
<b> uint32_t <i>value</i>);</b>
<br>
<br>
This parameter adjusts the limit, set when PCRE2 is built (default 250), on the
depth of parenthesis nesting in a pattern. This limit stops rogue patterns
using up too much system stack when being compiled. The limit applies to
parentheses of all kinds, not just capturing parentheses.
<br>
<br>
<b>int pcre2_set_compile_recursion_guard(pcre2_compile_context *<i>ccontext</i>,</b>
<b> int (*<i>guard_function</i>)(uint32_t, void *), void *<i>user_data</i>);</b>
<br>
<br>
There is at least one application that runs PCRE2 in threads with very limited
system stack, where running out of stack is to be avoided at all costs. The
parenthesis limit above cannot take account of how much stack is actually
available during compilation. For a finer control, you can supply a function
that is called whenever <b>pcre2_compile()</b> starts to compile a parenthesized
part of a pattern. This function can check the actual stack size (or anything
else that it wants to, of course).
</p>
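<p>
One possible shape for such a function is sketched below. It estimates stack
use from the address of a local variable, which is a common but non-portable
technique and is an assumption of this sketch, not something PCRE2 prescribes.
The names <i>stack_budget</i> and <i>stack_guard()</i> are hypothetical; only
the function signature comes from the prototype above.
</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user data for the guard. */
typedef struct {
    uintptr_t base;    /* address of an object near the start of the stack */
    size_t    budget;  /* bytes of stack the compile may consume */
} stack_budget;

/* Has the shape of: int (*guard_function)(uint32_t, void *).
   Returns 0 to let compilation continue, non-zero to force an error. */
static int stack_guard(uint32_t depth, void *user_data)
{
    const stack_budget *sb = user_data;
    char marker;               /* lives in this call's stack frame */
    uintptr_t here, used;

    (void)depth;               /* nesting depth is not needed here */
    assert(sb != NULL);

    here = (uintptr_t)&marker;
    /* The stack may grow in either direction, so take the distance. */
    used = here > sb->base ? here - sb->base : sb->base - here;
    return used > sb->budget;
}

/* With pcre2.h included, the guard would be registered like this, where
   "sb" was filled in by the thread before it calls pcre2_compile():

     pcre2_set_compile_recursion_guard(ccontext, stack_guard, &sb);
*/
```

<p>
The thread would record the address of a local variable in <i>base</i> near
its entry point, choose <i>budget</i> to suit its stack size, and pass the
structure as the user data.
</p>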
<p>
The first argument to the guard function gives the current depth of
nesting, and the second is user data that is set up by the last argument of
<b>pcre2_set_compile_recursion_guard()</b>. The guard function should return
zero if all is well, or non-zero to force an error.
<br>
<br>
<b>int pcre2_set_optimize(pcre2_compile_context *<i>ccontext</i>,</b>
<b> uint32_t <i>directive</i>);</b>
<br>
<br>
PCRE2 can apply various performance optimizations during compilation, in order
to make matching faster. For example, the compiler might convert some regex
constructs into an equivalent construct which <b>pcre2_match()</b> can execute
faster. By default, all available optimizations are enabled. However, in rare
cases, one might wish to disable specific optimizations. For example, if it is
known that some optimizations cannot benefit a certain regex, it might be
desirable to disable them, in order to speed up compilation.
</p>
<p>
The permitted values of <i>directive</i> are as follows:
<pre>
PCRE2_OPTIMIZATION_FULL
</pre>
Enable all optional performance optimizations. This is the default value.
<pre>
PCRE2_OPTIMIZATION_NONE
</pre>
Disable all optional performance optimizations.
<pre>
PCRE2_AUTO_POSSESS
PCRE2_AUTO_POSSESS_OFF
</pre>
Enable/disable "auto-possessification" of variable quantifiers such as * and +.
This optimization, for example, turns a+b into a++b in order to avoid
backtracks into a+ that can never be successful. However, if callouts are in
use, auto-possessification means that some callouts are never taken. You can
disable this optimization if you want the matching functions to do a full,
unoptimized search and run all the callouts.