Large spaCy models are (consistently?) faster than medium? #8611
-
Thanks for reporting this. That said, it is the "large" model, not the "slow" model 😄 As far as I'm aware, the architecture of the medium and large models is basically the same; the main difference is the number of word vectors. The main criterion for deciding between the medium and large models should be memory or disk constraints, not speed. Assuming your machine has plenty of power to handle the large model, I wouldn't expect a significant speed difference between the two. It's a little surprising that the large one is faster, but, as expected, the difference is very small. In general, you need to be very cautious with artificial benchmarks. One problem with yours, for example, is that you're processing the same sentence repeatedly, which is not a realistic load.
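For instance, a benchmark closer to a realistic load would process varied texts in batches with `nlp.pipe` rather than calling `nlp()` on one repeated sentence. A minimal sketch (the helper functions and the model names in the commented usage are illustrative, not from the original post):

```python
import itertools
import timeit


def make_varied_texts(templates, n):
    """Cycle through sentence templates so the benchmark doesn't
    process one identical sentence over and over (an artificial load)."""
    return [t.format(i) for t, i in zip(itertools.cycle(templates), range(n))]


def bench(nlp, texts, repeats=3):
    """Best wall-clock time to run the pipeline over texts via nlp.pipe,
    which batches documents the way a realistic workload would."""
    times = timeit.repeat(lambda: list(nlp.pipe(texts)), number=1, repeat=repeats)
    return min(times)


# Usage sketch (model names illustrative; requires the models installed):
# import spacy
# texts = make_varied_texts(["Sentence {} about cats.", "Sentence {} about dogs."], 500)
# for name in ("en_core_web_md", "en_core_web_lg"):
#     print(name, bench(spacy.load(name), texts))
```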
-
How to reproduce the behavior
As the code below shows, the large spaCy model runs faster than the medium one.
The code to reproduce is quite simple:
Which prints:
Applying the same approach to English, we get:
Increasing x to 4, we still see that the large model is faster:
And with x = 30:
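A minimal sketch of this kind of benchmark, timing the same sentence repeated x times against a given model (the model names in the commented usage and the sentence are placeholders, not the original code):

```python
import timeit


def time_model(model_name: str, sentence: str, x: int, number: int = 10) -> float:
    """Load a pipeline and time nlp() over `sentence` repeated x times.

    Note: repeating one sentence is an artificial load; see the reply above.
    """
    import spacy  # imported here so the helper below stays importable without spaCy

    nlp = spacy.load(model_name)
    docs = [sentence] * x
    return timeit.timeit(lambda: [nlp(d) for d in docs], number=number) / number


def speedup(t_md: float, t_lg: float) -> float:
    """Relative speed of the large model vs the medium one (>1 means lg is faster)."""
    return t_md / t_lg


# Usage (placeholders; substitute the md/lg pair you compared):
# for name in ("xx_core_news_md", "xx_core_news_lg"):
#     print(name, time_model(name, "This is a test sentence.", x=4))
```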
Looking into their config.cfg files, I didn't find anything to warrant this. Am I missing something obvious? Any reason why this might be?
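One way to compare the two configs programmatically rather than by eye is to diff the loaded pipelines' `nlp.config` dictionaries. The recursive helper below is my own sketch, not part of spaCy, and the model names in the commented usage are placeholders:

```python
def diff_configs(a: dict, b: dict, prefix: str = "") -> list[str]:
    """Recursively list dotted keys whose values differ between two config dicts."""
    out = []
    for key in sorted(set(a) | set(b)):
        va, vb = a.get(key), b.get(key)
        if isinstance(va, dict) and isinstance(vb, dict):
            out.extend(diff_configs(va, vb, f"{prefix}{key}."))
        elif va != vb:
            out.append(f"{prefix}{key}: {va!r} != {vb!r}")
    return out


# Usage (placeholders; requires the models installed):
# import spacy
# md, lg = spacy.load("xx_core_news_md"), spacy.load("xx_core_news_lg")
# print("\n".join(diff_configs(dict(md.config), dict(lg.config))))
```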
Additional info
I looked into this because of the performance on a server (Ubuntu with GPU), where the results are similar.
Your Environment / Info about spaCy