Skip to content

Comments

Update to PyO3 0.28 to automatically disable GIL#1948

Open
ngoldbaum wants to merge 6 commits intohuggingface:mainfrom
ngoldbaum:disable-gil
Open

Update to PyO3 0.28 to automatically disable GIL#1948
ngoldbaum wants to merge 6 commits intohuggingface:mainfrom
ngoldbaum:disable-gil

Conversation

@ngoldbaum
Copy link
Contributor

@ngoldbaum ngoldbaum commented Feb 11, 2026

Towards fixing #1784.

For the most part, tokenizers relies on PyO3 for thread safety. There are some constructs that aren't allowed for pymodules that set gil_used = false, but tokenizers doesn't use any of them.

RefMutContainer does have some unsafe in its implementation, but @davidhewitt looked at it and doesn't see issues (see e.g PyO3/pyo3#5664).

PyO3 0.28 made gil_used = false the default for PyModules, so there's no need to explicitly mark them as not relying on the GIL.

The only nontrivial changes in this update were around the migration described here: https://pyo3.rs/v0.28.0/migration.html#deprecation-of-automatic-frompyobject-for-pyclass-types-which-implement-clone. I added skip_from_py_object where the current implementation wasn't relying on the automatically derived FromPyObject implementation and otherwise added from_py_object to all the pyclasses that implement Clone. Let me know if you'd prefer to keep the old behavior for all the pyclasses instead.

See passing CI run from my fork: https://github.com/ngoldbaum/tokenizers/actions/runs/21910929364?pr=5. I don't see any spots where the GIL is getting enabled at runtime in the 3.14t jobs.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Copy link
Collaborator

@ArthurZucker ArthurZucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks a lot for the repeated contribs! a pleasure to have the latest update!
I'd rather we keep from_py_object for all types, but otherwise LGTM

@ngoldbaum
Copy link
Contributor Author

I'd rather we keep from_py_object for all types

No problem, see latest push.

@ngoldbaum
Copy link
Contributor Author

PyO3 0.28.1 came out over the weekend, so I just pushed a commit to use that instead of 0.28.0.

@ngoldbaum
Copy link
Contributor Author

@ArthurZucker can you re-trigger CI over here? I don't think the failures are related.

@ngoldbaum
Copy link
Contributor Author

Here's a passing CI run I just triggered on my fork, FWIW: https://github.com/ngoldbaum/tokenizers/actions/runs/21913362339

@ngoldbaum
Copy link
Contributor Author

I just pushed another commit to move to PyO3 0.28.2, which came out this morning. The musllinux builds were failing on my fork but the CI affected here all passed.

@ArthurZucker
Copy link
Collaborator

I'll try to fix the musllinux on my side and push to main

@ArthurZucker
Copy link
Collaborator

before I can release I just need to merge #1942 !

@ngoldbaum
Copy link
Contributor Author

Maturin just did a release that should fix this.

I also still need to add 3.14t wheels to the wheel-building workflow. I'll try to get a PR open for that so it doesn't block a release. It'd be really nice to have 3.14t wheels on PyPI!

@ngoldbaum
Copy link
Contributor Author

Looks like it is indeed fixed: https://github.com/ngoldbaum/tokenizers/actions/runs/22159130634?pr=8

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants