Commit ad7c543

carenas authored and gitster committed
grep: skip UTF8 checks explicitly
18547aa ("grep/pcre: support utf-8", 2016-06-25), released with git 2.10, added the PCRE_UTF8 flag to PCRE1 matching, along with a call to has_non_ascii() to try to avoid breakage if the haystack contained content not encoded as UTF-8.

PCRE is usually compiled with JIT support (even though that is not the default), so the code path used calls pcre_jit_exec, which skips UTF-8 validation by design (and might therefore crash or hang on invalid input). When JIT support is not compiled in, we use pcre_exec instead, with the possibility that grep aborts if invalid UTF-8 is found in the haystack.

Since Mar 5, 2007, PCRE1 has provided a flag to skip the checks explicitly, so use it to make both code paths equivalent (the flag is ignored by pcre_jit_exec).

This fix is implemented only for PCRE1 because PCRE2 is likely to get a better solution (without the risks) in the future.

Helped-by: Johannes Schindelin <[email protected]>
Helped-by: Eric Sunshine <[email protected]>
Helped-by: Ævar Arnfjörð Bjarmason <[email protected]>
Suggested-by: Junio C Hamano <[email protected]>
Signed-off-by: Carlo Marcelo Arenas Belón <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
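To make "invalid UTF-8 in the haystack" concrete, here is a minimal sketch of a well-formedness check. The function name is_valid_utf8 is hypothetical, and this is not PCRE's actual validation code. It only illustrates the kind of byte sequences such a check rejects, which is the check the flag in this commit asks pcre_exec to skip.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of git or PCRE: returns 1 if the first
 * `len` bytes of `s` are well-formed UTF-8, 0 otherwise.
 */
int is_valid_utf8(const char *s, size_t len)
{
	const unsigned char *p = (const unsigned char *)s;
	size_t i = 0, k, need;
	unsigned long cp, min;

	while (i < len) {
		unsigned char c = p[i];

		if (c < 0x80) {                    /* plain ASCII byte */
			i++;
			continue;
		} else if ((c & 0xE0) == 0xC0) {   /* 2-byte sequence */
			need = 1; cp = c & 0x1F; min = 0x80;
		} else if ((c & 0xF0) == 0xE0) {   /* 3-byte sequence */
			need = 2; cp = c & 0x0F; min = 0x800;
		} else if ((c & 0xF8) == 0xF0) {   /* 4-byte sequence */
			need = 3; cp = c & 0x07; min = 0x10000;
		} else {
			return 0;  /* stray continuation or invalid lead byte */
		}

		if (len - i <= need)
			return 0;  /* sequence is cut off by end of input */

		for (k = 1; k <= need; k++) {
			if ((p[i + k] & 0xC0) != 0x80)
				return 0;  /* expected a continuation byte */
			cp = (cp << 6) | (p[i + k] & 0x3F);
		}

		/* reject overlong forms, surrogates and out-of-range values */
		if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
			return 0;

		i += need + 1;
	}
	return 1;
}
```

For instance, the Latin-1 byte 0xC6 followed by ASCII text is rejected, while the two-byte UTF-8 sequence 0xC3 0x86 for the same character is accepted. Content of the first kind is what could make a non-JIT pcre_exec call fail before this change.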
1 parent 75b2f01 commit ad7c543

2 files changed: +4 -1 lines changed


grep.c

Lines changed: 1 addition & 1 deletion
@@ -421,7 +421,7 @@ static void compile_pcre1_regexp(struct grep_pat *p, const struct grep_opt *opt)
 static int pcre1match(struct grep_pat *p, const char *line, const char *eol,
 		regmatch_t *match, int eflags)
 {
-	int ovector[30], ret, flags = 0;
+	int ovector[30], ret, flags = PCRE_NO_UTF8_CHECK;
 
 	if (eflags & REG_NOTBOL)
 		flags |= PCRE_NOTBOL;

grep.h

Lines changed: 3 additions & 0 deletions
@@ -3,6 +3,9 @@
 #include "color.h"
 #ifdef USE_LIBPCRE1
 #include <pcre.h>
+#ifndef PCRE_NO_UTF8_CHECK
+#define PCRE_NO_UTF8_CHECK 0
+#endif
 #ifdef PCRE_CONFIG_JIT
 #if PCRE_MAJOR >= 8 && PCRE_MINOR >= 32
 #ifndef NO_LIBPCRE1_JIT
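
The grep.h hunk relies on a common preprocessor idiom: if the library header does not define a flag, define it as 0 so that OR-ing it into an option mask is a no-op. A minimal sketch of that idiom follows. All SKETCH_ names and their values are hypothetical stand-ins, not the real PCRE or regex constants.

```c
/*
 * Stand-ins for constants that would normally come from library headers.
 * The names and values are illustrative assumptions, not PCRE's own.
 */
#define SKETCH_REG_NOTBOL 0x01   /* caller-side "not at beginning of line" */
#define SKETCH_LIB_NOTBOL 0x80   /* library-side equivalent */

/* An older header might not define this flag, so fall back to 0. */
#ifndef SKETCH_LIB_NO_UTF8_CHECK
#define SKETCH_LIB_NO_UTF8_CHECK 0
#endif

/*
 * Build the option mask the way the patched function does: start from
 * the (possibly zero) flag, then OR in the per-call options.
 */
int sketch_exec_flags(int eflags)
{
	int flags = SKETCH_LIB_NO_UTF8_CHECK;

	if (eflags & SKETCH_REG_NOTBOL)
		flags |= SKETCH_LIB_NOTBOL;
	return flags;
}
```

With the fallback in effect the function behaves exactly as if the initializer were 0, so callers need no conditional code of their own. Defining SKETCH_LIB_NO_UTF8_CHECK to a nonzero value before this snippet simulates a header that does provide the flag.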
