spaCy issue when run in AWS Lambda: [ERROR] Runtime.ImportModuleError: Unable to import module '***': No module named 'en_core_web_sm' #9907
-
Adding this line to the Dockerfile DOES work:
So this issue can be summarised as follows:
I realise that this may well be an issue with AllenNLP's mechanism for ensuring a spaCy model is downloaded when required. Any feedback?
-
It sounds like the issue is that the model is being downloaded and then loaded in the same process. Models are found using entry points, which are only loaded at the start of a Python process, so a model installed after the process has started generally can't be found by name. That's expected behavior. Take a look at the model installation docs. What AllenNLP could do is load the model using the directory it's downloaded to instead of the name, which wouldn't rely on entry points.
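To illustrate the "load from the directory" suggestion, here is a minimal sketch. It assumes a spaCy v3 pipeline, whose data directory contains a `config.cfg`; the helper names `find_pipeline_dir` and `load_pipeline_from_dir` are hypothetical and are not part of spaCy or AllenNLP. `spacy.load` itself does accept a path as well as a package name.

```python
from pathlib import Path
from typing import Optional


def find_pipeline_dir(root: Path) -> Optional[Path]:
    """Return the first directory at or under `root` that holds a config.cfg.

    Assumption: spaCy v3 pipelines keep a config.cfg in their data directory.
    """
    for config in sorted(root.rglob("config.cfg")):
        return config.parent
    return None


def load_pipeline_from_dir(root: Path):
    """Load a pipeline by path, so no package or entry point lookup is needed."""
    import spacy  # imported lazily so find_pipeline_dir works without spaCy

    pipeline_dir = find_pipeline_dir(root)
    if pipeline_dir is None:
        raise FileNotFoundError(f"no config.cfg found under {root}")
    # spacy.load accepts a directory path as well as an installed package name.
    return spacy.load(pipeline_dir)
```

In a Lambda setup like the one described here, `root` would be wherever the model ends up on the EFS mount. Installing the model into the image at build time, as the Dockerfile line mentioned above does, avoids the problem altogether.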
-
How to reproduce the behaviour
I am using a library, AllenNLP, which calls spaCy. The logs show the problem:
To get spaCy to work on AWS Lambda, I mounted an EFS file system (an NFS mount) by following these two sets of instructions:
Your Environment
It's difficult to include these details from the AWS Lambda environment, but I deploy with the container image method, using 10 GB of RAM and a 15 s invocation. Here is the Dockerfile used:
where requirements are:
In AWS Lambda I set the HOME and TMPDIR environment variables to point to the EFS mount, e.g.
Then the following Python code should reproduce the problem:
Eventually spaCy is called, the appropriate model is downloaded (as can be seen in the logs), and then spaCy fails to locate it. Does anyone have further ideas on how to debug this?
Thanks.