15-regexp-catastrophic-backtracking/article.md
8 additions & 8 deletions
@@ -31,7 +31,7 @@ If you run the example below, you probably won't see anything, as JavaScript wil
 ```js run
 let regexp = /^(\w+\s?)*$/;
-let str = "An input string that takes a long time or even makes this regexp to hang!";
+let str = "An input string that takes a long time or even makes this regexp hang!";

 // will take a very long time
 alert( regexp.test(str) );
@@ -41,7 +41,7 @@ To be fair, let's note that some regular expression engines can handle such a se
 ## Simplified example

-What's the matter? Why the regular expression hangs?
+What's the matter? Why does the regular expression hang?

 To understand that, let's simplify the example: remove spaces `pattern:\s?`. Then it becomes `pattern:^(\w+)*$`.
@@ -60,7 +60,7 @@ So what's wrong with the regexp?
 First, one may notice that the regexp `pattern:(\d+)*` is a little bit strange. The quantifier `pattern:*` looks extraneous. If we want a number, we can use `pattern:\d+`.

-Indeed, the regexp is artificial, we got it by simplifying the previous example. But the reason why it is slow is the same. So let's understand it, and then the previous example will become obvious.
+Indeed, the regexp is artificial; we got it by simplifying the previous example. But the reason why it is slow is the same. So let's understand it, and then the previous example will become obvious.

 What happens during the search of `pattern:^(\d+)*$` in the line `subject:123456789z` (shortened a bit for clarity, please note a non-digit character `subject:z` at the end, it's important), why does it take so long?
@@ -111,7 +111,7 @@ Here's what the regexp engine does:
 ```


-4. There's no match, so the engine will continue backtracking, decreasing the number of repetitions. Backtracking generally works like this: the last greedy quantifier decreases the number of repetitions until it can. Then the previous greedy quantifier decreases, and so on.
+4. There's no match, so the engine will continue backtracking, decreasing the number of repetitions. Backtracking generally works like this: the last greedy quantifier decreases the number of repetitions until it reaches the minimum. Then the previous greedy quantifier decreases, and so on.

 All possible combinations are attempted. Here are their examples.
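Note on the hunk above: the claim that all possible combinations are attempted can be made concrete with a small sketch. The helper `splits` below is hypothetical (it is not part of the article or of any regexp engine). It only enumerates the ways `pattern:(\d+)*` could divide a run of digits into consecutive non-empty groups, trying the longest group first as a greedy quantifier would. A real engine also makes partial attempts that this sketch does not count, so treat the number as a lower bound.

```js
// Hypothetical helper: list every way to cut a digit run into
// consecutive non-empty groups, longest first (greedy order).
function splits(digits) {
  if (digits.length === 0) return [[]]; // one way to split nothing
  const result = [];
  for (let len = digits.length; len >= 1; len--) {
    const head = digits.slice(0, len);
    for (const rest of splits(digits.slice(len))) {
      result.push([head, ...rest]);
    }
  }
  return result;
}

const ways = splits("1234");
console.log(ways.length); // 8
console.log(ways.map(w => w.join(" ")).join("\n"));

// The count doubles with every extra digit: 2^(n-1) for n digits.
console.log(splits("123456789").length); // 256
```

Each extra digit doubles the number of splits, which is why a slightly longer input can turn a slow search into one that appears to hang.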
@@ -196,7 +196,7 @@ This regexp is equivalent to the previous one (matches the same) and works well:
 ```js run
 let regexp = /^(\w+\s)*\w*$/;
-let str = "An input string that takes a long time or even makes this regex to hang!";
+let str = "An input string that takes a long time or even makes this regex hang!";

 alert( regexp.test(str) ); // false
 ```
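Note on the hunk above: the article states that the rewritten pattern matches the same strings as the original. A quick spot check, under the assumption that a handful of short inputs is representative, might look like the sketch below. The names `slow`, `fast` and `samples` are illustrative only. The slow pattern is deliberately run on short strings alone, since a long one may hang.

```js
const slow = /^(\w+\s?)*$/;   // pattern from the first example
const fast = /^(\w+\s)*\w*$/; // rewritten pattern from this hunk

// Short inputs only: the slow pattern is safe to run on these.
const samples = ["", "a", "a b", "a b ", "a  b", " a", "a!"];

const disagreements = samples.filter(s => slow.test(s) !== fast.test(s));
console.log(disagreements.length); // 0

// Only the fast pattern is tried on the long string.
const str = "An input string that takes a long time or even makes this regex hang!";
console.log(fast.test(str)); // false
```

This is a spot check and not a proof of equivalence; it only shows that the two patterns agree on the listed samples.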
@@ -254,7 +254,7 @@ We can emulate them though using a "lookahead transform".
 So we've come to real advanced topics. We'd like a quantifier, such as `pattern:+` not to backtrack, because sometimes backtracking makes no sense.

-The pattern to take as much repetitions of `pattern:\w` as possible without backtracking is: `pattern:(?=(\w+))\1`. Of course, we could take another pattern instead of `pattern:\w`.
+The pattern to take as many repetitions of `pattern:\w` as possible without backtracking is: `pattern:(?=(\w+))\1`. Of course, we could take another pattern instead of `pattern:\w`.

 That may seem odd, but it's actually a very simple transform.
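Note on the hunk above: a short sketch of what "without backtracking" means for `pattern:(?=(\w+))\1`. The variable names are illustrative. With a plain greedy `pattern:\w+`, the engine can give characters back so that a following literal still matches. With the lookahead transform, the captured word is fixed once the lookahead succeeds, so nothing is given back.

```js
const greedy = /\w+Script/;          // \w+ may backtrack to leave "Script"
const atomic = /(?=(\w+))\1Script/;  // \1 is the whole word, no give-back

console.log("JavaScript".match(greedy)[0]); // "JavaScript"
console.log("JavaScript".match(atomic));    // null

// When what follows the word is not a word character, both forms agree.
console.log("Hello!".match(/(?=(\w+))\1!/)[0]); // "Hello!"
```

In the second case the lookahead captures the entire word `JavaScript`, `pattern:\1` consumes all of it, and `Script` has nothing left to match.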
@@ -293,7 +293,7 @@ let regexp = /^((?=(\w+))\2\s?)*$/;
 alert( regexp.test("A good string") ); // true

-let str = "An input string that takes a long time or even makes this regex to hang!";
+let str = "An input string that takes a long time or even makes this regex hang!";

 alert( regexp.test(str) ); // false, works and fast!
 ```
@@ -304,7 +304,7 @@ Here `pattern:\2` is used instead of `pattern:\1`, because there are additional
 // parentheses are named ?<word>, referenced as \k<word>
 let regexp = /^((?=(?<word>\w+))\k<word>\s?)*$/;

-let str = "An input string that takes a long time or even makes this regex to hang!";
+let str = "An input string that takes a long time or even makes this regex hang!";