Commit 264a6f9

Added support for subinterpreter workers (#850)

1 parent 6d612a9

File tree

9 files changed (+374, -1 lines)


docs/api.rst

Lines changed: 8 additions & 0 deletions

@@ -50,6 +50,12 @@ Running code in worker threads
 .. autofunction:: anyio.to_thread.run_sync
 .. autofunction:: anyio.to_thread.current_default_thread_limiter
 
+Running code in subinterpreters
+-------------------------------
+
+.. autofunction:: anyio.to_interpreter.run_sync
+.. autofunction:: anyio.to_interpreter.current_default_interpreter_limiter
+
 Running code in worker processes
 --------------------------------

@@ -189,6 +195,8 @@ Exceptions
 ----------
 
 .. autoexception:: anyio.BrokenResourceError
+.. autoexception:: anyio.BrokenWorkerIntepreter
+.. autoexception:: anyio.BrokenWorkerProcess
 .. autoexception:: anyio.BusyResourceError
 .. autoexception:: anyio.ClosedResourceError
 .. autoexception:: anyio.DelimiterNotFound

docs/index.rst

Lines changed: 1 addition & 0 deletions

@@ -18,6 +18,7 @@ The manual
 networking
 threads
 subprocesses
+subinterpreters
 fileio
 signals
 testing

docs/subinterpreters.rst

Lines changed: 51 additions & 0 deletions (new file)

Working with subinterpreters
============================

.. py:currentmodule:: anyio

Subinterpreters offer a middle ground between worker threads and worker processes. They
allow you to utilize multiple CPU cores to run Python code while avoiding the overhead
and complexities of spawning subprocesses.

.. warning:: Subinterpreter support is considered **experimental**. The underlying
   Python API for managing subinterpreters has not been finalized yet, and has had
   little real-world testing. As such, it is not recommended to use this feature for
   anything important yet.

Running a function in a worker interpreter
------------------------------------------

Running functions in a worker interpreter makes sense when:

* The code you want to run in parallel is CPU intensive
* The code is either pure Python code, or extension code that does not release the
  Global Interpreter Lock (GIL)

If the code you're trying to run only does blocking network or file I/O, then you're
better off using a :doc:`worker thread <threads>` instead.

This is done by using :func:`.to_interpreter.run_sync`::

    from anyio import run, to_interpreter

    from yourothermodule import cpu_intensive_function

    async def main():
        result = await to_interpreter.run_sync(
            cpu_intensive_function, 'Hello, ', 'world!'
        )
        print(result)

    run(main)

Limitations
-----------

* Subinterpreters are only supported on Python 3.13 or later
* Code in the ``__main__`` module cannot be run with this (as a consequence, this
  applies to any functions defined in the REPL)
* The target functions cannot react to cancellation
* Unlike with threads, the code running in the subinterpreter cannot share mutable data
  with other interpreters/threads (however, sharing *immutable* data is fine)
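The documentation's example imports ``cpu_intensive_function`` from a hypothetical ``yourothermodule``. As a minimal sketch (the name and body below are illustrative, not part of anyio), any picklable, module-level callable fits the bill:

```python
# Hypothetical stand-in for the cpu_intensive_function used in the docs
# example. It must be defined at module level (not in the REPL) so that
# it can be pickled and shipped to the worker interpreter.
def cpu_intensive_function(greeting: str, name: str) -> str:
    # Simulate CPU-bound work with a tight arithmetic loop.
    checksum = sum(i * i for i in range(100_000)) % 97
    return f"{greeting}{name} (checksum={checksum})"
```

Pure-Python code like this is normally serialized by the GIL when run in threads, which is exactly the case where a subinterpreter worker pays off.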
docs/subprocesses.rst

Lines changed: 4 additions & 1 deletion

@@ -61,13 +61,16 @@ Running functions in worker processes
 -------------------------------------
 
 When you need to run CPU intensive code, worker processes are better than threads
-because current implementations of Python cannot run Python code in multiple threads at
+because, with the exception of the experimental free-threaded builds of Python 3.13 and
+later, current implementations of Python cannot run Python code in multiple threads at
 once.
 
 Exceptions to this rule are:
 
 #. Blocking I/O operations
 #. C extension code that explicitly releases the Global Interpreter Lock
+#. :doc:`Subinterpreter workers <subinterpreters>`
+   (experimental; available on Python 3.13 and later)
 
 If the code you wish to run does not belong in this category, it's best to use worker
 processes instead in order to take advantage of multiple CPU cores.

docs/versionhistory.rst

Lines changed: 2 additions & 0 deletions

@@ -5,6 +5,8 @@ This library adheres to `Semantic Versioning 2.0 <http://semver.org/>`_.
 
 **UNRELEASED**
 
+- Added **experimental** support for running functions in subinterpreters on Python
+  3.13 and later
 - Added support for the ``copy()``, ``copy_into()``, ``move()`` and ``move_into()``
   methods in ``anyio.Path``, available in Python 3.14
 - Changed ``TaskGroup`` on asyncio to always spawn tasks non-eagerly, even if using a

src/anyio/__init__.py

Lines changed: 1 addition & 0 deletions

@@ -8,6 +8,7 @@
 from ._core._eventloop import sleep_forever as sleep_forever
 from ._core._eventloop import sleep_until as sleep_until
 from ._core._exceptions import BrokenResourceError as BrokenResourceError
+from ._core._exceptions import BrokenWorkerIntepreter as BrokenWorkerIntepreter
 from ._core._exceptions import BrokenWorkerProcess as BrokenWorkerProcess
 from ._core._exceptions import BusyResourceError as BusyResourceError
 from ._core._exceptions import ClosedResourceError as ClosedResourceError

src/anyio/_core/_exceptions.py

Lines changed: 37 additions & 0 deletions

@@ -2,6 +2,8 @@
 
 import sys
 from collections.abc import Generator
+from textwrap import dedent
+from typing import Any
 
 if sys.version_info < (3, 11):
     from exceptiongroup import BaseExceptionGroup

@@ -21,6 +23,41 @@ class BrokenWorkerProcess(Exception):
     """
 
 
+class BrokenWorkerIntepreter(Exception):
+    """
+    Raised by :meth:`~anyio.to_interpreter.run_sync` if an unexpected exception is
+    raised in the subinterpreter.
+    """
+
+    def __init__(self, excinfo: Any):
+        # This was adapted from concurrent.futures.interpreter.ExecutionFailed
+        msg = excinfo.formatted
+        if not msg:
+            if excinfo.type and excinfo.msg:
+                msg = f"{excinfo.type.__name__}: {excinfo.msg}"
+            else:
+                msg = excinfo.type.__name__ or excinfo.msg
+
+        super().__init__(msg)
+        self.excinfo = excinfo
+
+    def __str__(self) -> str:
+        try:
+            formatted = self.excinfo.errdisplay
+        except Exception:
+            return super().__str__()
+        else:
+            return dedent(
+                f"""
+                {super().__str__()}
+
+                Uncaught in the interpreter:
+
+                {formatted}
+                """.strip()
+            )
+
+
 class BusyResourceError(Exception):
     """
     Raised when two tasks are trying to read from or write to the same resource
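The message-selection logic in ``__init__`` can be exercised standalone. Below is a rough sketch that mirrors it, using ``types.SimpleNamespace`` to stand in for the exception-info snapshot the interpreter API hands back (the ``formatted``/``type``/``msg`` attribute names come from the code above; the sample values are made up):

```python
from types import SimpleNamespace


def format_excinfo_msg(excinfo) -> str:
    # Mirrors BrokenWorkerIntepreter.__init__ above: prefer the snapshot's
    # pre-formatted message, then fall back to "TypeName: message", then to
    # whichever of the two fields is present.
    msg = excinfo.formatted
    if not msg:
        if excinfo.type and excinfo.msg:
            msg = f"{excinfo.type.__name__}: {excinfo.msg}"
        else:
            msg = excinfo.type.__name__ or excinfo.msg
    return msg


# A made-up snapshot with no pre-formatted message:
snapshot = SimpleNamespace(formatted="", type=ValueError, msg="bad value")
print(format_excinfo_msg(snapshot))  # ValueError: bad value
```

Keeping this fallback chain means the exception still carries a usable message even when the subinterpreter could only report partial information.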

src/anyio/to_interpreter.py

Lines changed: 218 additions & 0 deletions (new file)

from __future__ import annotations

import atexit
import os
import pickle
import sys
from collections import deque
from collections.abc import Callable
from textwrap import dedent
from typing import Any, Final, TypeVar

from . import current_time, to_thread
from ._core._exceptions import BrokenWorkerIntepreter
from ._core._synchronization import CapacityLimiter
from .lowlevel import RunVar

if sys.version_info >= (3, 11):
    from typing import TypeVarTuple, Unpack
else:
    from typing_extensions import TypeVarTuple, Unpack

UNBOUND: Final = 2  # I have no clue how this works, but it was used in the stdlib
FMT_UNPICKLED: Final = 0
FMT_PICKLED: Final = 1
DEFAULT_CPU_COUNT: Final = 8  # this is just an arbitrarily selected value
MAX_WORKER_IDLE_TIME = (
    30  # seconds a subinterpreter can be idle before becoming eligible for pruning
)

T_Retval = TypeVar("T_Retval")
PosArgsT = TypeVarTuple("PosArgsT")

_idle_workers = RunVar[deque["Worker"]]("_available_workers")
_default_interpreter_limiter = RunVar[CapacityLimiter]("_default_interpreter_limiter")


class Worker:
    _run_func = compile(
        dedent("""
        import _interpqueues as queues
        import _interpreters as interpreters
        from pickle import loads, dumps, HIGHEST_PROTOCOL

        item = queues.get(queue_id)[0]
        try:
            func, args = loads(item)
            retval = func(*args)
        except BaseException as exc:
            is_exception = True
            retval = exc
        else:
            is_exception = False

        try:
            queues.put(queue_id, (retval, is_exception), FMT_UNPICKLED, UNBOUND)
        except interpreters.NotShareableError:
            retval = dumps(retval, HIGHEST_PROTOCOL)
            queues.put(queue_id, (retval, is_exception), FMT_PICKLED, UNBOUND)
        """),
        "<string>",
        "exec",
    )

    last_used: float = 0

    _initialized: bool = False
    _interpreter_id: int
    _queue_id: int

    def initialize(self) -> None:
        import _interpqueues as queues
        import _interpreters as interpreters

        self._interpreter_id = interpreters.create()
        self._queue_id = queues.create(2, FMT_UNPICKLED, UNBOUND)  # type: ignore[call-arg]
        self._initialized = True
        interpreters.set___main___attrs(
            self._interpreter_id,
            {
                "queue_id": self._queue_id,
                "FMT_PICKLED": FMT_PICKLED,
                "FMT_UNPICKLED": FMT_UNPICKLED,
                "UNBOUND": UNBOUND,
            },
        )

    def destroy(self) -> None:
        import _interpqueues as queues
        import _interpreters as interpreters

        if self._initialized:
            interpreters.destroy(self._interpreter_id)
            queues.destroy(self._queue_id)

    def _call(
        self,
        func: Callable[..., T_Retval],
        args: tuple[Any],
    ) -> tuple[Any, bool]:
        import _interpqueues as queues
        import _interpreters as interpreters

        if not self._initialized:
            self.initialize()

        payload = pickle.dumps((func, args), pickle.HIGHEST_PROTOCOL)
        queues.put(self._queue_id, payload, FMT_PICKLED, UNBOUND)  # type: ignore[call-arg]

        res: Any
        is_exception: bool
        if exc_info := interpreters.exec(self._interpreter_id, self._run_func):  # type: ignore[func-returns-value,arg-type]
            raise BrokenWorkerIntepreter(exc_info)

        (res, is_exception), fmt = queues.get(self._queue_id)[:2]
        if fmt == FMT_PICKLED:
            res = pickle.loads(res)

        return res, is_exception

    async def call(
        self,
        func: Callable[..., T_Retval],
        args: tuple[Any],
        limiter: CapacityLimiter,
    ) -> T_Retval:
        result, is_exception = await to_thread.run_sync(
            self._call,
            func,
            args,
            limiter=limiter,
        )
        if is_exception:
            raise result

        return result


def _stop_workers(workers: deque[Worker]) -> None:
    for worker in workers:
        worker.destroy()

    workers.clear()


async def run_sync(
    func: Callable[[Unpack[PosArgsT]], T_Retval],
    *args: Unpack[PosArgsT],
    limiter: CapacityLimiter | None = None,
) -> T_Retval:
    """
    Call the given function with the given arguments in a subinterpreter.

    If the task waiting for its completion is cancelled, the call will still run its
    course but its return value (or any raised exception) will be ignored.

    .. warning:: This feature is **experimental**. The upstream interpreter API has not
        yet been finalized or thoroughly tested, so don't rely on this for anything
        mission critical.

    :param func: a callable
    :param args: positional arguments for the callable
    :param limiter: capacity limiter to use to limit the total number of subinterpreters
        running (if omitted, the default limiter is used)
    :return: the result of the call
    :raises BrokenWorkerIntepreter: if there's an internal error in a subinterpreter

    """
    if sys.version_info <= (3, 13):
        raise RuntimeError("subinterpreters require at least Python 3.13")

    if limiter is None:
        limiter = current_default_interpreter_limiter()

    try:
        idle_workers = _idle_workers.get()
    except LookupError:
        idle_workers = deque()
        _idle_workers.set(idle_workers)
        atexit.register(_stop_workers, idle_workers)

    async with limiter:
        try:
            worker = idle_workers.pop()
        except IndexError:
            worker = Worker()

        try:
            return await worker.call(func, args, limiter)
        finally:
            # Prune workers that have been idle for too long
            now = current_time()
            while idle_workers:
                if now - idle_workers[0].last_used <= MAX_WORKER_IDLE_TIME:
                    break

                await to_thread.run_sync(idle_workers.popleft().destroy, limiter=limiter)

            worker.last_used = current_time()
            idle_workers.append(worker)


def current_default_interpreter_limiter() -> CapacityLimiter:
    """
    Return the capacity limiter that is used by default to limit the number of
    concurrently running subinterpreters.

    Defaults to the number of CPU cores.

    :return: a capacity limiter object

    """
    try:
        return _default_interpreter_limiter.get()
    except LookupError:
        limiter = CapacityLimiter(os.cpu_count() or DEFAULT_CPU_COUNT)
        _default_interpreter_limiter.set(limiter)
        return limiter
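The pruning loop in ``run_sync``'s ``finally`` block relies on the deque being ordered oldest-first: workers are appended as they are released, so scanning can stop at the first still-fresh worker. A synchronous sketch of that policy with stand-in worker objects (the ``StubWorker``/``prune_idle`` names are illustrative, not part of anyio):

```python
from collections import deque

MAX_WORKER_IDLE_TIME = 30  # seconds, matching the module constant above


class StubWorker:
    """Stand-in for Worker, tracking only what the pruning loop needs."""

    def __init__(self, last_used: float) -> None:
        self.last_used = last_used
        self.destroyed = False

    def destroy(self) -> None:
        self.destroyed = True


def prune_idle(workers: deque, now: float) -> None:
    # Same policy as run_sync's finally block: the leftmost worker is
    # always the longest-idle one, so the loop can stop at the first
    # worker that is still within the idle-time budget.
    while workers:
        if now - workers[0].last_used <= MAX_WORKER_IDLE_TIME:
            break

        workers.popleft().destroy()


now = 100.0
pool = deque([StubWorker(40.0), StubWorker(80.0), StubWorker(95.0)])
prune_idle(pool, now)
print(len(pool))  # prints 2: the worker idle for 60 seconds was destroyed
```

Stopping at the first fresh worker keeps each call's pruning cost proportional to the number of workers actually destroyed, rather than the pool size.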
