forked from duckdb/duckdb-python
-
Notifications
You must be signed in to change notification settings - Fork 0
Enable Free Threading #9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
Parent:
SCCACHE Builds
Closed
Changes from 79 commits
Commits
Show all changes
88 commits
Select commit
Hold shift + click to select a range
fb4598f
chore: Add 3.14 and 3.14t builds: update GHA matrix, bump uv and cibu…
paultiq 34d20e1
chore: remove pandas 3.0 warnings -> instead, disable pandas for 3.14…
paultiq 68f30ec
test: Disable Pandas for 3.14
paultiq 275b6f2
test: disable failing test "Windows fatal exception: access violation"
paultiq 23a2f9b
tests: skip, don't xfail
paultiq 699d9ab
exclude Windows
paultiq 43007a8
tests: revert the skip since we're excluding Windows 3.14t builds ent…
paultiq 4db9da1
revert: import that was added, no longer needed
paultiq 2374987
revert: exactly to original
paultiq 25fd097
test: Mark test xfail
paultiq 556e32d
test: mark test xfail
paultiq f5ad9d5
Merge branch 'main' into ci314t
paultiq b617188
chore: Add comments and todo's for workflow changes
paultiq 03d5eca
chore: Remove unused section for Windows 3.14t builds.
paultiq 70e70ae
chore: Add version check to only allow no-Pandas for 3.14, plus a TODO
paultiq 0c20b89
chore: enable sccache for builds and disable unneeded actions
paultiq d276412
fix: disable coverage_test, not packaging_test
paultiq b81bcba
direct install
paultiq 635d859
fix: set ARCH for sccache download
paultiq 15d1e76
fix: set CMAKE_C_COMPILER_LAUNCHER to avoid double sccache
paultiq 54f1e5e
chore: Add pytest modules
paultiq e28cb93
disable unused uv cache
paultiq 4f40596
windows settings
paultiq 456cb39
enable cp313
paultiq 5c33c44
feat: Move global state into a module state object, initialized via m…
paultiq f0fab76
all branches
paultiq 69e08ea
add workflow dispatch
paultiq 591f812
continue-on-error: true
paultiq f479b85
fix: missing PyErr_Clear()
paultiq ecae808
feat: py::mod_gil_not_used() - indicating that the extension is FT safe.
paultiq 91fe9c2
feat: Initial free threading support, some more work needed on defaul…
paultiq 3939ff1
add a comment
paultiq 91bbb33
feat: add module state to connection to reduce lookups
paultiq fb67fe0
feat: Move global state into a module state object, initialized via m…
paultiq 2784947
fix: missing PyErr_Clear()
paultiq 554d950
feat: add module state to connection to reduce lookups
paultiq c7bc632
Merge branch 'ft2' of https://github.com/paultiq/duckdb-pythonf into …
paultiq 3d253e4
chore: Add 3.14 and 3.14t builds: update GHA matrix, bump uv and cibu…
paultiq e315d53
chore: remove pandas 3.0 warnings -> instead, disable pandas for 3.14…
paultiq 617b7bf
test: Disable Pandas for 3.14
paultiq 88d3c15
test: disable failing test "Windows fatal exception: access violation"
paultiq 650a10e
tests: skip, don't xfail
paultiq c4d4a27
exclude Windows
paultiq 6889cc1
tests: revert the skip since we're excluding Windows 3.14t builds ent…
paultiq b5812b2
revert: import that was added, no longer needed
paultiq f909841
revert: exactly to original
paultiq 7617f08
test: Mark test xfail
paultiq a18a1d3
test: mark test xfail
paultiq f636b37
chore: Add comments and todo's for workflow changes
paultiq 3c22d7a
chore: Remove unused section for Windows 3.14t builds.
paultiq a0904f2
chore: Add version check to only allow no-Pandas for 3.14, plus a TODO
paultiq 448cab4
chore: enable sccache for builds and disable unneeded actions
paultiq 3f2c19c
fix: disable coverage_test, not packaging_test
paultiq 7ea230b
direct install
paultiq 9532ef4
fix: set ARCH for sccache download
paultiq d951f76
fix: set CMAKE_C_COMPILER_LAUNCHER to avoid double sccache
paultiq 3546435
chore: Add pytest modules
paultiq 08fba8c
disable unused uv cache
paultiq 0b5b0ec
windows settings
paultiq bac920f
enable cp313
paultiq 418f780
all branches
paultiq d29ba60
add workflow dispatch
paultiq 1cf9ede
continue-on-error: true
paultiq f79d189
add a comment
paultiq 1608a72
Merge branch 'ci314t_sccache' of https://github.com/paultiq/duckdb-py…
paultiq a3ff0ae
Make uv export quiet
paultiq 149d58e
Remove randomly.
paultiq 88354a1
xdist: Use tmp_path to avoid races.
paultiq bd5b24f
xdist: use tmp_path_factory
paultiq 3185c0d
feat: use a static for now - too many references
paultiq 13ce4ed
feat: py::mod_gil_not_used() - indicating that the extension is FT safe.
paultiq fa6af78
rebase
paultiq a77c15f
merge
paultiq a19570d
builds: set python_gil=0 for ft builds
paultiq 54c7d00
fix: fix imports, clear,
paultiq 2595cff
Merge branch 'ci314t_sccache' into free_threading_cl
paultiq a7e3830
update
paultiq 50823cf
Merge branch 'sccache_ci' into free_threading_cl
paultiq 4f57719
feat: move import cache to a direct object due to frequency of hits
paultiq a79bee4
ci: improve
paultiq 7fb001d
revert errclear
paultiq 36d6b11
add threading / concurrency tests
paultiq a63d296
add a set of threading test cases
paultiq 445008a
tests: thread safety
paultiq 6ea9e5e
dont force debug
paultiq 2be3020
test: clean up tests
paultiq e0e398c
restore back to direct object
paultiq dc9d0f2
make default_connection_ptr private
paultiq File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,65 @@ | ||
| //===----------------------------------------------------------------------===// | ||
| // DuckDB | ||
| // | ||
| // duckdb_python/module_state.hpp | ||
| // | ||
| // | ||
| //===----------------------------------------------------------------------===// | ||
|
|
||
| #pragma once | ||
|
|
||
| #include "duckdb_python/pybind11/pybind_wrapper.hpp" | ||
| #include "duckdb/common/shared_ptr.hpp" | ||
| #include "duckdb/main/db_instance_cache.hpp" | ||
| #include "duckdb/main/database.hpp" | ||
| #include "duckdb_python/import_cache/python_import_cache.hpp" | ||
| #include "duckdb_python/pyconnection/pyconnection.hpp" | ||
| #include <pybind11/critical_section.h> | ||
|
|
||
| namespace duckdb { | ||
|
|
||
|
|
||
| // Module state structure to hold per-interpreter state | ||
| struct DuckDBPyModuleState { | ||
| // TODO: Make private / move behind a thread-safe accessor | ||
| shared_ptr<DuckDBPyConnection> default_connection_ptr; | ||
| mutex default_connection_mutex; | ||
|
|
||
| // Python environment tracking | ||
| PythonEnvironmentType environment = PythonEnvironmentType::NORMAL; | ||
| string formatted_python_version; | ||
|
|
||
| DuckDBPyModuleState(); | ||
|
|
||
| shared_ptr<DuckDBPyConnection> GetDefaultConnection(); | ||
| void SetDefaultConnection(shared_ptr<DuckDBPyConnection> connection); | ||
| void ClearDefaultConnection(); | ||
|
|
||
| PythonImportCache* GetImportCache(); | ||
| void ClearImportCache(); | ||
|
|
||
| DBInstanceCache* GetInstanceCache(); | ||
|
|
||
| static DuckDBPyModuleState& GetGlobalModuleState(); | ||
| static void SetGlobalModuleState(DuckDBPyModuleState* state); | ||
|
|
||
| private: | ||
| PythonImportCache import_cache; | ||
| std::unique_ptr<DBInstanceCache> instance_cache; | ||
| #ifdef Py_GIL_DISABLED | ||
| py::object lock_object; | ||
| #endif | ||
|
|
||
| // Static module state cache for performance optimization | ||
| // TODO: Replace with proper per-interpreter state for multi-interpreter support | ||
| static DuckDBPyModuleState* g_module_state; | ||
|
|
||
| // Non-copyable | ||
| DuckDBPyModuleState(const DuckDBPyModuleState &) = delete; | ||
| DuckDBPyModuleState &operator=(const DuckDBPyModuleState &) = delete; | ||
| }; | ||
|
|
||
| DuckDBPyModuleState &GetModuleState(); | ||
| void SetModuleState(DuckDBPyModuleState *state); | ||
|
|
||
| } // namespace duckdb |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,126 @@ | ||
| //===----------------------------------------------------------------------===// | ||
| // DuckDB | ||
| // | ||
| // duckdb_python/module_state.cpp | ||
| // | ||
| // | ||
| //===----------------------------------------------------------------------===// | ||
|
|
||
| #include "duckdb_python/module_state.hpp" | ||
| #include <stdexcept> | ||
| #include <chrono> | ||
| #include <thread> | ||
|
|
||
| // Enable debug prints for performance analysis | ||
| #define DEBUG_MODULE_STATE 1 | ||
|
|
||
| namespace duckdb { | ||
|
|
||
| // Forward declaration from pyconnection.cpp | ||
| void InstantiateNewInstance(DuckDB &db); | ||
|
|
||
| // Static member initialization - required for all static class members in C++ | ||
| DuckDBPyModuleState *DuckDBPyModuleState::g_module_state = nullptr; | ||
|
|
||
| // Module state constructor | ||
| DuckDBPyModuleState::DuckDBPyModuleState() { | ||
| // Create caches | ||
| instance_cache = make_uniq<DBInstanceCache>(); | ||
| // import_cache: direct object due to frequent calls | ||
|
|
||
| #ifdef Py_GIL_DISABLED | ||
| // Initialize lock object for critical sections | ||
| // TODO: Consider moving to finer-grained locks | ||
| lock_object = py::none(); | ||
| #endif | ||
|
|
||
| // Detects Python environment and version during intialization | ||
| // Moved from DuckDBPyConnection::DetectEnvironment() | ||
| py::module_ sys = py::module_::import("sys"); | ||
| py::object version_info = sys.attr("version_info"); | ||
| int major = py::cast<int>(version_info.attr("major")); | ||
| int minor = py::cast<int>(version_info.attr("minor")); | ||
| formatted_python_version = std::to_string(major) + "." + std::to_string(minor); | ||
|
|
||
| // If __main__ does not have a __file__ attribute, we are in interactive mode | ||
| auto main_module = py::module_::import("__main__"); | ||
| if (!py::hasattr(main_module, "__file__")) { | ||
| environment = PythonEnvironmentType::INTERACTIVE; | ||
|
|
||
| if (ModuleIsLoaded<IpythonCacheItem>()) { | ||
| // Check to see if we are in a Jupyter Notebook | ||
| auto get_ipython = import_cache.IPython.get_ipython(); | ||
| if (get_ipython.ptr() != nullptr) { | ||
| auto ipython = get_ipython(); | ||
| if (py::hasattr(ipython, "config")) { | ||
| py::dict ipython_config = ipython.attr("config"); | ||
| if (ipython_config.contains("IPKernelApp")) { | ||
| environment = PythonEnvironmentType::JUPYTER; | ||
| } | ||
| } | ||
| } | ||
| } | ||
| } | ||
| } | ||
|
|
||
| DuckDBPyModuleState &DuckDBPyModuleState::GetGlobalModuleState() { | ||
| // TODO: Externalize this static cache when adding multi-interpreter support | ||
| // For now, single interpreter assumption allows simple static caching | ||
| if (!g_module_state) { | ||
| throw InternalException("Module state not initialized - call SetGlobalModuleState() during module init"); | ||
| } | ||
| return *g_module_state; | ||
| } | ||
|
|
||
| void DuckDBPyModuleState::SetGlobalModuleState(DuckDBPyModuleState *state) { | ||
| #if DEBUG_MODULE_STATE | ||
| printf("DEBUG: SetGlobalModuleState() called - initializing static cache (built: %s %s)\n", __DATE__, __TIME__); | ||
| #endif | ||
| g_module_state = state; | ||
| } | ||
|
|
||
| DuckDBPyModuleState &GetModuleState() { | ||
| #if DEBUG_MODULE_STATE | ||
| printf("DEBUG: GetModuleState() called\n"); | ||
| #endif | ||
| return DuckDBPyModuleState::GetGlobalModuleState(); | ||
| } | ||
|
|
||
| void SetModuleState(DuckDBPyModuleState *state) { | ||
| DuckDBPyModuleState::SetGlobalModuleState(state); | ||
| } | ||
|
|
||
| shared_ptr<DuckDBPyConnection> DuckDBPyModuleState::GetDefaultConnection() { | ||
| lock_guard<mutex> guard(default_connection_mutex); | ||
| // Reproduce exact logic from original DefaultConnectionHolder::Get() | ||
| if (!default_connection_ptr || default_connection_ptr->con.ConnectionIsClosed()) { | ||
| py::dict config_dict; | ||
| default_connection_ptr = DuckDBPyConnection::Connect(py::str(":memory:"), false, config_dict); | ||
| } | ||
| return default_connection_ptr; | ||
| } | ||
|
|
||
| void DuckDBPyModuleState::SetDefaultConnection(shared_ptr<DuckDBPyConnection> connection) { | ||
| lock_guard<mutex> guard(default_connection_mutex); | ||
| default_connection_ptr = std::move(connection); | ||
| } | ||
|
|
||
| void DuckDBPyModuleState::ClearDefaultConnection() { | ||
| lock_guard<mutex> guard(default_connection_mutex); | ||
| default_connection_ptr = nullptr; | ||
| } | ||
|
|
||
| PythonImportCache *DuckDBPyModuleState::GetImportCache() { | ||
| return &import_cache; | ||
| } | ||
|
|
||
| void DuckDBPyModuleState::ClearImportCache() { | ||
| // Direct object will be cleaned up automatically by destructor | ||
| // TODO: If explicit clearing is needed, add Clear() method to PythonImportCache | ||
| } | ||
|
|
||
| DBInstanceCache *DuckDBPyModuleState::GetInstanceCache() { | ||
| return instance_cache.get(); | ||
| } | ||
|
|
||
| } // namespace duckdb | ||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the GIL held here? I don't know pybind11 well enough to tell. If it is then you have to explicitly release it before locking a mutex, which is a possibly blocking operation. Otherwise you might deadlock with the GIL (or the garbage collector on the free-threaded build).
This is what MutexExt in PyO3 does. I don't know if there's something equivalent in pybind11. There probably should be if there isn't!
If you don't hold the GIL at this point then disregard all that. But also double-check everywhere else you are doing a possibly-blocking call with a standard library synchronization primitive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking this through...
For context: I'm starting from code that is "GIL" safe, so mostly trying to preserve existing behavior / not change anything.
The GIL should be held in these paths; I didn't add a check, but I'll add a todo to consider a safety check.
The mutexes were just placeholders: next commit replaces them with an ifdef check for free threading and then either a mutex or a scoped_critical_section.