feat: Automate tokio runtime cleanup via reference counting #17
Merged
Add fetch-depth: 0 to checkout action to allow fetching submodule commits that are on non-default branches.
Explicitly set the branch for the pyo3-async-runtimes submodule to help Git fetch from the correct branch.
The CI was failing on Python 3.14 with "undefined symbol: PyUnstable_Module_SetGIL". This symbol exists only in free-threaded Python (3.14t), not in regular Python 3.14. The issue was that uv may not properly distinguish between 3.14 and 3.14t when both are available. Using the +gil variant specifier (e.g., "3.14+gil") explicitly requests the GIL-enabled Python interpreter, preventing uv from accidentally selecting the free-threaded variant.
The +gil variant specifier is only for selecting from installed interpreters, not for installation. uv python install 3.14 should install the GIL-enabled version by default.
ARM64 + Python 3.14 (GIL-enabled) has a bug where uv's Python build incorrectly triggers PyO3 to generate free-threaded code, causing 'undefined symbol: PyUnstable_Module_SetGIL' errors at runtime.
This is specific to:
- Platform: ARM64 (aarch64)
- Python: 3.14 (GIL-enabled)
The following combinations work correctly:
- x86_64 + Python 3.14 (GIL-enabled): OK
- ARM64 + Python 3.14t (free-threaded): OK
Excluding this specific combination until the upstream issue is resolved.
Updates the submodule to include fixes for compilation errors in the upstream PR #71:
- Add tokio `sync` feature for Notify support
- Restore missing public API functions
- Fix stream module function references
Fixes CI build failure caused by deprecated function warnings treated as errors with RUSTFLAGS="-D warnings".
Fixes stress test performance regression from 8+ minutes to ~7 seconds. The issue was that request_shutdown() was blocking on thread.join() when called from within a tokio task (in __aexit__), causing a 5-second timeout per subprocess iteration.
…nditions
The exit_context() function is called from within a future_into_py block, which means it runs inside a tokio task. Using the blocking request_shutdown() could cause race conditions where in-flight tasks try to access a runtime that is being torn down.
Added new request_shutdown_background() to pyo3-async-runtimes that signals shutdown without blocking or immediately clearing the runtime slot, allowing the current task to complete gracefully before the runtime shuts down.
The previous request_shutdown_background() implementation left the runtime wrapper in storage, causing potential deadlocks. The new implementation:
1. Atomically clears the wrapper from storage (new ops get fresh runtime)
2. Spawns a background thread to properly join the runtime with timeout
3. Avoids blocking the calling async task
…utdown
The previous approach of spawning a detached join thread caused SIGSEGV because it raced with Python's interpreter shutdown. Now we just signal shutdown and let the runtime thread complete independently.
Added register_atexit_cleanup(py) call during module init to ensure tokio runtime threads are properly joined before Python finalizes. This prevents SIGSEGV crashes on Python 3.11 and 3.12 when tokio threads run during interpreter shutdown.
Redesigned the shutdown mechanism to ensure the tokio runtime thread is
fully terminated before Python's event loop closes:
1. __aexit__ now returns a Python coroutine that:
- Awaits the Rust async cleanup (tokio task)
- If shutdown was triggered, awaits asyncio.to_thread() to block-join
the runtime thread with GIL released
2. Added _join_pending_shutdown() Python function that:
- Takes the pending thread handle from storage
- Joins it with GIL released
- Is called via asyncio.to_thread() from __aexit__
3. Added comprehensive stress tests for:
- Multi-async-task scenario (5 concurrent tasks per process)
- Multi-threaded scenario (4 threads with separate event loops)
- Mixed concurrency (3 threads × 3 async tasks each)
This approach ensures:
- No atexit dependency - everything completes in async context
- Automatic cleanup when last client exits (ref-counting)
- No deadlocks - blocking happens outside tokio via to_thread
- Thread-safe and async-task-safe
Fixes the SIGSEGV that occurred when tokio threads outlived Python.
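To illustrate why the blocking join is routed through asyncio.to_thread(), here is a minimal pure-Python sketch. It assumes a hypothetical `DemoClient` whose worker thread takes a moment to terminate; it is not the etcd-client-py code. A heartbeat task keeps ticking while `__aexit__` waits, which shows that the event loop is not blocked.

```python
import asyncio
import contextlib
import threading
import time


class DemoClient:
    """Hypothetical client whose __aexit__ joins a worker thread off-loop."""

    def __init__(self) -> None:
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, name="demo-runtime")

    def _run(self) -> None:
        self._stop.wait()
        time.sleep(0.2)  # simulate a slow teardown of the runtime thread

    async def __aenter__(self) -> "DemoClient":
        self._thread.start()
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        self._stop.set()
        # The blocking join runs in a helper thread, so the loop stays responsive.
        await asyncio.to_thread(self._thread.join)


async def main() -> int:
    ticks = 0

    async def heartbeat() -> None:
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    task = asyncio.create_task(heartbeat())
    async with DemoClient():
        pass
    task.cancel()
    with contextlib.suppress(asyncio.CancelledError):
        await task
    return ticks  # number of heartbeats observed while the join was pending
```

Running `asyncio.run(main())` returns a positive tick count; calling `self._thread.join()` directly inside `__aexit__` would instead stall the loop for the whole teardown.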
The multi-threaded and mixed concurrency tests can take longer on CI due to thread setup overhead and variable machine performance.
- Add configurable timeout parameter to _run_subprocess_test
- Use 20s timeout for multi-threaded test (4 threads)
- Use 30s timeout for mixed concurrency test (3 threads × 3 tasks)
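A helper with a configurable timeout could look roughly like the sketch below. The real _run_subprocess_test signature is not shown in this PR, so `run_subprocess_test` and the `DEFAULT_TIMEOUT` value are assumptions; only the 20s and 30s values come from the commit message.

```python
import subprocess
import sys

DEFAULT_TIMEOUT = 10            # seconds; assumed value, not taken from the PR
THREADED_TIMEOUT = 20           # multi-threaded test (4 threads)
MIXED_CONCURRENCY_TIMEOUT = 30  # mixed concurrency test (3 threads x 3 tasks)


def run_subprocess_test(
    script: str, timeout: float = DEFAULT_TIMEOUT
) -> subprocess.CompletedProcess:
    """Run `script` in a fresh interpreter; fail if it hangs or exits non-zero."""
    result = subprocess.run(
        [sys.executable, "-c", script],
        capture_output=True,
        text=True,
        timeout=timeout,  # raises subprocess.TimeoutExpired on a hang
    )
    # On POSIX a negative return code means the child was killed by a signal
    # (for example -11 for SIGSEGV), so any non-zero code is treated as failure.
    if result.returncode != 0:
        raise AssertionError(f"exit code {result.returncode}: {result.stderr}")
    return result
```

A slower scenario would then pass its own limit, for example `run_subprocess_test(script, timeout=MIXED_CONCURRENCY_TIMEOUT)`.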
The previous implementation triggered shutdown from within the tokio task (exit_context called request_shutdown_background). This created a race condition where the runtime could start shutting down while the task was still trying to return its result to Python, causing hangs in multi-threaded scenarios.
The fix separates the two operations:
1. exit_context() now only returns a flag indicating if this was the last context
2. _trigger_shutdown() is a new function called from Python AFTER the tokio task has completed and returned
3. Then _join_pending_shutdown() blocks until the runtime thread terminates
This ensures the tokio task completes successfully before shutdown begins.
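The ordering of the three steps can be sketched in plain Python. This is a model under stated assumptions, not the actual wrapper code: `RuntimeStub`, its methods, and the `log` list are hypothetical, and a threading.Event stands in for the tokio shutdown signal.

```python
import asyncio
import threading


class RuntimeStub:
    """Hypothetical stand-in for the shared runtime and its context counter."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._active = 0
        self._stop = threading.Event()
        # The "runtime thread" simply waits for the shutdown signal.
        self._thread = threading.Thread(target=self._stop.wait)
        self._thread.start()
        self.log = []

    def enter_context(self) -> None:
        with self._lock:
            self._active += 1

    async def exit_context(self) -> bool:
        # Step 1: async cleanup only reports whether this was the last context.
        await asyncio.sleep(0)
        with self._lock:
            self._active -= 1
            is_last = self._active == 0
        self.log.append("cleanup")
        return is_last

    def trigger_shutdown(self) -> None:
        # Step 2: runs only after the cleanup coroutine has returned its result.
        self.log.append("trigger")
        self._stop.set()

    def join_pending_shutdown(self) -> None:
        # Step 3: blocking join, meant to be run via asyncio.to_thread().
        self._thread.join()
        self.log.append("join")


async def aexit(rt: RuntimeStub) -> None:
    if await rt.exit_context():
        rt.trigger_shutdown()
        await asyncio.to_thread(rt.join_pending_shutdown)


async def main() -> list:
    rt = RuntimeStub()
    rt.enter_context()
    rt.enter_context()
    await aexit(rt)  # not the last context: only cleanup runs
    await aexit(rt)  # last context: trigger and join follow
    return rt.log
```

In this model the shutdown signal cannot fire while `exit_context()` is still running, because `trigger_shutdown()` is only reached after its result has been awaited.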
- runtime.rs: Add SHUTDOWN_TIMEOUT_MS constant, restrict internal functions to pub(crate), simplify docstrings, add section comments
- client.rs: Extract Python wrapper code to AEXIT_WRAPPER_CODE constant, simplify __aexit__ method, remove redundant comments
- lib.rs: Reorganize exports with section comments
- test_shutdown_stress.py: Add timeout constants (DEFAULT_TIMEOUT, THREADED_TIMEOUT, MIXED_CONCURRENCY_TIMEOUT), simplify embedded scripts
No functional changes - all 24 tests pass.
- Add "Working with key prefixes" subsection under Basic usage
- Move "Automatic runtime cleanup" to its own top-level section
- Add "Lock timeout" and "Lock TTL" subsections under Etcd lock
- Add "Watch with prefix" subsection under Watch
- Condense code quality section
- Fix typo: http::// → http://
achimnol added a commit to lablup/pyo3-async-runtimes that referenced this pull request on Jan 9, 2026
This commit introduces graceful shutdown support for the tokio runtime, addressing PyO3#40.
## Motivation
When Python extensions built with pyo3-async-runtimes are used in subprocesses or short-lived contexts, tokio tasks may still be running when Python interpreter finalization begins. This causes fatal errors:
    Fatal Python error: PyGILState_Release: thread state...must be current
This implementation enables proper shutdown coordination, as demonstrated in lablup/etcd-client-py#17, which uses the new APIs to implement automatic runtime cleanup through reference counting and async-compatible shutdown sequences.
## Implementation
The tokio runtime now lives in a dedicated thread (inspired by valkey-glide):
- RuntimeWrapper manages the runtime in a dedicated "pyo3-tokio-runtime" thread
- The runtime is accessed via Handle (thread-safe, cloneable)
- Shutdown is signaled through tokio::sync::Notify and blocks until complete
- Runtime slot is cleared after shutdown, allowing re-initialization
## New APIs
tokio module:
- get_handle() -> Handle: Returns cloneable handle (recommended)
- spawn(fut) / spawn_blocking(f): Convenience spawning functions
- request_shutdown(timeout_ms) -> bool: Blocking shutdown
- request_shutdown_background(timeout_ms) -> bool: Non-blocking shutdown
- join_pending_shutdown(py) -> bool: Join pending background shutdown
async-std module (for API consistency):
- spawn(fut) / spawn_blocking(f): Convenience spawning functions
- request_shutdown(timeout_ms) -> bool: Sets flag only (cannot shut down)
## Deprecated APIs
- tokio::get_runtime(): Cannot be gracefully shut down; use get_handle()
## Dependency Changes
- Replace `futures` with `futures-channel` + `futures-util`
- Add `parking_lot` for RwLock
- Add tokio `sync` feature for Notify
## Macro Updates
- tokio_test macro now uses spawn_blocking() instead of get_runtime()
- tokio_main macro uses #[allow(deprecated)] for block_on() usage
Fixes PyO3#40
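The dedicated-thread design can be approximated in pure Python as an analogy. The sketch below is not the Rust implementation: `LoopThread`, `get_loop`, and `shutdown_runtime` are hypothetical names, an asyncio event loop stands in for the tokio runtime, and the meaning of the boolean return value is an assumption. It shows a runtime owned by one thread, accessed through a thread-safe handle, stopped by a signal, and a slot that is cleared so a later call can re-initialize it.

```python
import asyncio
import threading
from typing import Optional


class LoopThread:
    """Hypothetical analogue: an event loop owned by a dedicated thread."""

    def __init__(self) -> None:
        self.loop = asyncio.new_event_loop()
        self._thread = threading.Thread(
            target=self.loop.run_forever, name="demo-runtime", daemon=True
        )
        self._thread.start()

    def shutdown(self, timeout: float) -> bool:
        # Signal the loop to stop, then block until its thread exits.
        self.loop.call_soon_threadsafe(self.loop.stop)
        self._thread.join(timeout)
        stopped = not self._thread.is_alive()
        if stopped:
            self.loop.close()
        return stopped


_slot_lock = threading.Lock()
_slot: Optional[LoopThread] = None


def get_loop() -> asyncio.AbstractEventLoop:
    """Return the current runtime's loop, creating one if the slot is empty."""
    global _slot
    with _slot_lock:
        if _slot is None:
            _slot = LoopThread()
        return _slot.loop


def shutdown_runtime(timeout: float = 5.0) -> bool:
    """Clear the slot and stop the runtime; False if none was running."""
    global _slot
    with _slot_lock:
        wrapper, _slot = _slot, None
    if wrapper is None:
        return False
    return wrapper.shutdown(timeout)


async def answer() -> int:
    return 42
```

Work is submitted from any thread with `asyncio.run_coroutine_threadsafe(answer(), get_loop()).result(timeout=5)`. After `shutdown_runtime()`, the next `get_loop()` call starts a fresh loop thread, which mirrors the re-initialization behaviour described above.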
Updates vendored pyo3-async-runtimes (PyO3/pyo3-async-runtimes#71) with cleaner commit history:
1. deps: Replace futures with futures-channel/futures-util, add parking_lot
2. feat(tokio): Add RuntimeWrapper with graceful shutdown support
3. feat(async-std): Add spawn/spawn_blocking/request_shutdown for API consistency
4. refactor(macros): Update to use new spawn_blocking API
5. test: Add shutdown tests and update existing tests for deprecated API
Summary
- … `cleanup_runtime()` explicitly
Changes
pyo3-async-runtimes (vendored)
- `RUNTIME_WRAPPER` from `OnceLock` to `RwLock<Option<...>>` for re-initialization support
- `request_shutdown_background()` - signals shutdown without blocking
- `join_pending_shutdown()` - blocks until runtime thread terminates (GIL released)
etcd-client-py
Runtime management (`src/runtime.rs`):
- `SHUTDOWN_TIMEOUT_MS` constant (5000ms)
- `ACTIVE_CONTEXTS` atomic counter for reference counting
- `enter_context()` / `exit_context()` - internal functions to track contexts
- `active_context_count()` - public function for debugging/testing
- `_trigger_shutdown()` / `_join_pending_shutdown()` - internal helpers for `__aexit__`
Client (`src/client.rs`):
- `__aenter__` increments context count before async work
- `__aexit__` uses a 3-phase shutdown sequence:
  1. … `is_last_context` flag)
  2. `_trigger_shutdown()` from Python (after tokio task completes)
  3. `asyncio.to_thread(_join_pending_shutdown)` to block until runtime terminates
Tests:
- `test_shutdown_stress.py` with comprehensive stress tests:
  - `test_shutdown_multi_async_tasks` - multiple async tasks sharing one event loop
  - `test_shutdown_multi_threaded` - multiple threads with separate event loops
  - `test_shutdown_mixed_concurrency` - threads × async tasks (most complex)
Documentation:
Key Design Decision
The shutdown trigger is called from Python after the tokio task completes, not from within the task. This avoids a race condition where the runtime could start shutting down while a task is still returning its result to Python.
Test Plan
- `test_single_client_context_count` - single client lifecycle
- `test_multiple_concurrent_clients` - cleanup only when all clients exit
- `test_nested_contexts_same_client` - nested contexts counted separately
- `test_exception_during_context` - count decremented on exception
- `test_sequential_clients_reinit` - runtime re-initialization works
- `test_no_explicit_cleanup_needed` - no segfaults without explicit cleanup
- `test_shutdown_multi_async_tasks` - 5 concurrent tasks, 20 iterations
- `test_shutdown_multi_threaded` - 4 threads, 10 iterations
- `test_shutdown_mixed_concurrency` - 3 threads × 3 tasks, 10 iterations
Related
- … `cleanup_runtime()` calls