Commit f72cbd4

Clément VALENTIN and claude committed
fix: use spawn context for ProcessPoolExecutor to avoid asyncio deadlocks
Fork mode copies the entire process state including event loops, causing deadlocks when used with uvicorn/asyncio. Spawn creates fresh Python interpreters, which is safe with async code.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
1 parent b3fd338 commit f72cbd4

1 file changed (+7 -1 lines changed)
  • apps/api/src/services/price_scrapers

apps/api/src/services/price_scrapers/base.py

Lines changed: 7 additions & 1 deletion
@@ -3,15 +3,21 @@
 from typing import List, Dict, Any, Callable, TypeVar
 from datetime import datetime, UTC
 from concurrent.futures import ProcessPoolExecutor
+import multiprocessing
 import asyncio
 import logging
 
 logger = logging.getLogger(__name__)
 
+# IMPORTANT: Use "spawn" instead of "fork" to avoid deadlocks with asyncio/uvicorn
+# Fork copies the entire process state including event loops, causing issues
+# Spawn creates a fresh Python interpreter, which is safer with async code
+_mp_context = multiprocessing.get_context("spawn")
+
 # Shared process pool for CPU-intensive PDF parsing
 # ProcessPoolExecutor bypasses Python's GIL, allowing true parallel CPU usage
 # This enables pdfminer to use multiple cores for faster parsing
-pdf_executor = ProcessPoolExecutor(max_workers=4)
+pdf_executor = ProcessPoolExecutor(max_workers=4, mp_context=_mp_context)
 
 T = TypeVar('T')
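
For reviewers who want to see the pattern in isolation, here is a minimal sketch of a spawn-based pool driven from asyncio. It is not code from this commit: `MP_CONTEXT`, `make_executor`, `run_in_pool`, and `demo` are hypothetical names, and `math.factorial` stands in for the CPU-bound PDF parsing call, which is not shown in the diff.

```python
import asyncio
import math
import multiprocessing
from concurrent.futures import ProcessPoolExecutor

# "spawn" starts each worker as a fresh interpreter instead of forking the
# parent, so workers do not inherit the parent's event loop, threads, or locks.
MP_CONTEXT = multiprocessing.get_context("spawn")


def make_executor(max_workers: int = 4) -> ProcessPoolExecutor:
    """Build a process pool whose workers are started with "spawn"."""
    return ProcessPoolExecutor(max_workers=max_workers, mp_context=MP_CONTEXT)


async def run_in_pool(executor: ProcessPoolExecutor, func, *args):
    """Run func(*args) in a worker process without blocking the event loop."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(executor, func, *args)


async def demo() -> int:
    executor = make_executor(max_workers=2)
    try:
        # Stand-in workload (assumption): replace with the real parsing function.
        return await run_in_pool(executor, math.factorial, 10)
    finally:
        executor.shutdown()


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Two consequences of spawn are worth keeping in mind when reviewing callers of `pdf_executor`: the submitted callable and its arguments must be picklable and importable by the worker (module-level functions work, lambdas and closures do not), and any script that starts workers needs the `if __name__ == "__main__":` guard, because each worker re-imports the main module on startup.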
