Importing spacy is slow #11158
This is definitely something we could improve on, though startup time in general is something Python isn't great at. We've put most of our effort into making sure processing is fast for documents after process startup. In the typical case where you're using a model and not just the tokenizer, startup is going to take some time in any case. For a CLI app, if you need spaCy, I would recommend running it in a background process.

Also note that startup time can be affected by what entry points are available in your current Python environment. A stripped-down venv should be faster than an environment with many packages installed. You can profile Python imports and startup time using tools here.

Also, I tried the preview site you mentioned, but the units appear to be off. Maybe I'm reading it wrong, but it seems to say startup takes 20m?
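As a rough illustration of the profiling suggestion, here is a minimal sketch that uses CPython's built-in `-X importtime` option (available since Python 3.7) to list the slowest imports in a fresh interpreter. The helper name `import_times` and its `top` parameter are made up for this example and are not part of spaCy or of any tool linked above.

```python
import re
import subprocess
import sys


def import_times(module: str, top: int = 10):
    """Return the slowest imports triggered by `import <module>`.

    Runs a fresh interpreter with `-X importtime`, which writes one line per
    imported module to stderr, and returns (cumulative_us, self_us, name)
    tuples sorted from slowest to fastest. `import_times` is a hypothetical
    helper for this sketch.
    """
    proc = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module}"],
        capture_output=True,
        text=True,
        check=True,
    )
    rows = []
    for line in proc.stderr.splitlines():
        # Data lines look like: "import time:   123 |   456 |   package.module"
        # The header line has no leading digits, so it does not match.
        match = re.match(r"import time:\s+(\d+)\s+\|\s+(\d+)\s+\|\s+(.*)", line)
        if match:
            self_us, cumulative_us, name = match.groups()
            rows.append((int(cumulative_us), int(self_us), name.strip()))
    rows.sort(reverse=True)
    return rows[:top]
```

Calling `import_times("spacy")` in the environment in question should show which submodules and dependencies account for most of the import time. The times are in microseconds, and running the same call in a stripped-down venv gives a point of comparison.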
Just importing spacy 3.4.0 into Python takes 1.4 seconds. Is this expected? Can we improve on this?
I'm trying to do simple tokenization:
But just importing spacy takes 1.4 seconds, which is really noticeable for a CLI application. Everything else takes <100ms, so after the import it's fast again.
Here's the import waterfall dump, https://gist.github.com/gaborbernat/682bf98bd9b111d40e86ec7f92625844, which you can view by copying to http://www.softwareishard.com/har/viewer
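For context, one common way to keep a CLI's other code paths from paying this cost is to defer the import until the command that needs it actually runs. The following is only a sketch under assumptions: the tool name `mytool` and the `--tokenize` flag are hypothetical, and `spacy.blank("en")` is used as a tokenizer-only pipeline without a trained model. It does not make the import itself any faster.

```python
import argparse
import importlib


def tokenize(text: str) -> list:
    # The heavy import happens here, only when tokenization is requested.
    spacy = importlib.import_module("spacy")
    nlp = spacy.blank("en")  # blank English pipeline: tokenizer only
    return [token.text for token in nlp(text)]


def main(argv=None) -> int:
    # "mytool" and "--tokenize" are hypothetical names for this sketch.
    parser = argparse.ArgumentParser(prog="mytool")
    parser.add_argument("--tokenize", metavar="TEXT")
    args = parser.parse_args(argv)
    if args.tokenize is None:
        # Fast path: spaCy is never imported.
        print("nothing to do")
        return 0
    print(tokenize(args.tokenize))
    return 0
```

With this layout, invocations that never reach `tokenize` start without importing spacy at all, while the tokenizing path still pays the full import time once.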