
build: relax spacy minor upper bound#551

Merged
dakinggg merged 3 commits into allenai:main from JohnGiorgi:john/relax-spacy-version
Sep 24, 2025


@JohnGiorgi
Contributor

@JohnGiorgi JohnGiorgi commented Sep 15, 2025

Hi! scispacy is a dependency in one of our projects. Another dependency in that project depends on spacy>3.8.0, creating a conflict, as scispacy pins it to <3.8.0. However, it looks like that's not strictly required: I am able to run all the tests with 3.8.7 (so long as I keep numpy<2.0.0):

uv pip install --group tests .
uv pip install https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/releases/v0.5.4/en_core_sci_sm-0.5.4.tar.gz
uv run python -m spacy download en_core_web_sm
uv run python -m spacy download en_core_web_md

uv pip install spacy==3.8.7 "numpy<2.0.0"
uv run pytest tests/ --cov scispacy --cov-fail-under=20  # passes!

There may be incompatibilities between the 0.5.4 scispacy models and spacy versions greater than 3.8.7 that the tests don't catch, but assuming that is not the case, we would appreciate the maintainers considering this PR to unblock our use of scispacy :) (cc @dakinggg)

Related to: #545

@JohnGiorgi JohnGiorgi marked this pull request as ready for review September 15, 2025 14:40
@dakinggg
Collaborator

Hi @JohnGiorgi, I'd be a bit surprised if this could be done simply for two reasons:
(1) spacy/scispacy models specify a spacy version constraint, so this would likely require retraining the models. Worth a quick test though.
(2) spacy 3.8.x requires numpy>=2 which might be difficult with nmslib. I haven't looked, but #545 (comment) suggests it.

@dakinggg
Collaborator

Although maybe I'm wrong on the numpy thing? The CI logs look like it installed numpy 1.26.4 + spacy 3.8.7 which I didn't think would happen because of the spacy requirements (https://github.com/explosion/spaCy/blob/release-v3.8.7/requirements.txt)

@dakinggg
Collaborator

Ah it looks like https://github.com/explosion/spaCy/blob/41e07772dc5805594bab2997a090a9033e26bf56/setup.cfg#L61 is the real requirement, so the numpy constraint shouldn't be an issue.

@dakinggg
Collaborator

It is the case, though, that installing the 0.5.4 scispacy models results in downgrading spacy to version 3.7.5.

It does appear to work fine if you force them to be used with the latest spacy; it just prints this warning:

UserWarning: [W095] Model 'en_core_sci_sm' (0.5.4) was trained with spaCy v3.7.4 and may not be 100% compatible with the current version (3.8.7). If you see errors or degraded performance, download a newer compatible model or retrain your custom model with the current spaCy version. For more details and available updates, run: python -m spacy validate
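
For anyone who wants to detect this mismatch programmatically rather than by reading logs, here is a minimal sketch using only the standard-library `warnings` module. The helper name `collect_w095` is hypothetical and not part of spacy or scispacy; it assumes the warning is raised as a `UserWarning` whose message contains `[W095]`, as in the output above.

```python
import warnings


def collect_w095(load_pipeline):
    """Call load_pipeline() and return (result, list of W095 messages).

    Hypothetical helper: `load_pipeline` is any zero-argument callable,
    for example `lambda: spacy.load("en_core_sci_sm")`.
    Note that other warnings raised during the call are captured too
    and are not re-emitted by this sketch.
    """
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, even ones that were already shown once.
        warnings.simplefilter("always")
        result = load_pipeline()
    messages = [
        str(w.message)
        for w in caught
        if issubclass(w.category, UserWarning) and "[W095]" in str(w.message)
    ]
    return result, messages
```

A caller could then decide whether to log the messages, fail a CI job, or continue, depending on how much risk they are willing to accept.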

@dakinggg
Collaborator

dakinggg commented Sep 16, 2025

Given these tests, I'm inclined to accept this PR. Does everything I said match what you have observed?

The state would be:
(1) As far as I can tell, the scispacy package itself is compatible with both 3.7.x and 3.8.x, since tests pass.
(2) If a user installs one of the scispacy pipelines, spacy will get downgraded automatically to <3.8.
(3) If a user force-installs spacy 3.8.x, it's slightly undefined behavior, although it appears to work fine (I ran one of the NER models and got the same results on both 3.7.x and 3.8.x).
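
To make points (2) and (3) concrete, here is a rough, standard-library-only sketch of how a version string can be checked against a comma-separated constraint of the kind models declare. The function `satisfies` is hypothetical and handles only plain numeric versions; it is not how pip or spacy resolve constraints (those use full PEP 440 semantics). The `<3.8.0` bound in the usage note is an assumption inferred from the downgrade described above, not read from the model files.

```python
import operator
import re

# Comparison operators supported by this simplified sketch.
_OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
}


def _parse(version, width=3):
    """Turn '3.8' or '3.8.7' into a zero-padded tuple of ints."""
    parts = [int(p) for p in version.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def satisfies(version, spec):
    """Return True if `version` meets every clause in `spec`."""
    current = _parse(version)
    for clause in spec.split(","):
        match = re.fullmatch(r"\s*(>=|<=|==|!=|>|<)\s*([\d.]+)\s*", clause)
        if match is None:
            raise ValueError(f"unsupported clause: {clause!r}")
        op, bound = match.groups()
        if not _OPS[op](current, _parse(bound)):
            return False
    return True
```

Under these assumptions, `satisfies("3.8.7", ">=3.7.4,<3.8.0")` is false, which is why installing a pipeline pulls spacy back below 3.8, while `satisfies("3.7.5", ">=3.7.4,<3.8.0")` is true.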

@JohnGiorgi
Contributor Author


Thanks so much for considering @dakinggg! Yes that matches my observations. To re-iterate:

  • spacy>=3.8.0,<3.9.0 can be installed alongside numpy<2.0.0
  • All scispacy tests pass with spacy==3.8.7 and numpy<2.0.0
  • For our purposes, we have simply hosted our own copy of the scispacy model we need (core_sci_md), where we relax the constraint on spacy in the meta.json file to "spacy_version":">=3.7.4,<3.9.0" and install that. Everything appears to be working fine and all our tests that hit this "modified" scispacy model pass.
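
As an illustration of the last bullet, relaxing the constraint in an unpacked model's `meta.json` could look roughly like the sketch below. The helper name `relax_spacy_version` and the path in the usage note are hypothetical; the `"spacy_version"` key and the new bound are taken from the bullet above. A packaged model may also declare its spacy requirement elsewhere (for example in its install metadata), which this sketch does not touch.

```python
import json
from pathlib import Path


def relax_spacy_version(meta_path, new_spec=">=3.7.4,<3.9.0"):
    """Rewrite the "spacy_version" field of a model's meta.json.

    Hypothetical helper. Returns the previous value (or None if the
    key was absent) so the change can be logged or reverted.
    """
    path = Path(meta_path)
    meta = json.loads(path.read_text(encoding="utf-8"))
    previous = meta.get("spacy_version")
    meta["spacy_version"] = new_spec
    path.write_text(json.dumps(meta, indent=2) + "\n", encoding="utf-8")
    return previous
```

For example, `relax_spacy_version("path/to/unpacked_model/meta.json")` would update the file in place; the modified model then needs to be repackaged and hosted separately, as described above.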

Like you said, if a user installs one of the scispacy pipelines, spacy will get downgraded automatically to <3.8. So this PR would really just unblock folks like us who need to move to spacy >=3.8.0 and are willing to accept the (imo small) risk of degraded performance from the current scispacy models, which were trained on spacy <3.8.0.

@dakinggg dakinggg merged commit d195c3d into allenai:main Sep 24, 2025
11 checks passed