spacy caching explanation needed #10758
-
I have a very simple program:
The first time I run this I get this output:
The second time I run this I get this:
Obviously some sort of caching is happening. Is there an explanation for this somewhere? My real problem is that I have multiple threads accessing a file like this. On the first invocation, they all get stuck trying to load spaCy. My solution is to create a lockfile: whichever thread acquires the lock loads spaCy, and the rest wait and then use the cache mentioned above. Is there a better way to do this? Thanks
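For reference, if the workers really are threads inside one process, an in-process lock may be enough and no lockfile is needed. The sketch below is only an illustration under that assumption. `slow_load` is a hypothetical stand-in for the expensive call (in a real program it would be something like `spacy.load(...)` with whatever pipeline is installed), and `get_model` is a made-up helper name, not part of spaCy.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

_lock = threading.Lock()
_model = None
load_count = 0  # counts how many times the expensive load actually ran


def slow_load():
    """Hypothetical stand-in for an expensive call such as spacy.load(...)."""
    global load_count
    time.sleep(0.1)  # simulate a slow model load
    load_count += 1
    return object()


def get_model(loader=slow_load):
    """Return the shared model, loading it at most once per process."""
    global _model
    if _model is None:          # fast path once the model is loaded
        with _lock:
            if _model is None:  # re-check: another thread may have loaded it
                _model = loader()
    return _model


# Eight threads ask for the model at once; only one of them runs the loader.
with ThreadPoolExecutor(max_workers=8) as pool:
    models = list(pool.map(lambda _: get_model(), range(8)))

print("loads performed:", load_count)
```

This only shares the loaded object between threads of a single process. Separate processes would each still load their own copy, which is the case the lockfile approach is aimed at.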
Replies: 3 comments 2 replies
-
spaCy has no particular features related to caching between separate processes. I suspect the related files are being cached by your OS's disk handling layer, but I am not sure of that. In general, I wouldn't expect the difference to be that large, though.
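One rough way to check the disk-cache guess is to time a plain read of the same files twice, with no spaCy involved. The sketch below is an assumption-laden illustration: `time_read` is a hypothetical helper, and the temporary directory is only there so the snippet runs on its own. Because that file has just been written, it is probably already in the OS cache, so both timings will likely look similar here. To see a cold versus warm difference, you would point `time_read` at the installed model directory right after a reboot.

```python
import tempfile
import time
from pathlib import Path


def time_read(root):
    """Read every file under root; return (elapsed_seconds, total_bytes)."""
    start = time.perf_counter()
    total = 0
    for path in Path(root).rglob("*"):
        if path.is_file():
            total += len(path.read_bytes())
    return time.perf_counter() - start, total


# Self-contained demo on a throwaway 1 MB file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "blob.bin").write_bytes(b"x" * 1_000_000)
    first, nbytes = time_read(tmp)
    second, _ = time_read(tmp)

print(f"read {nbytes} bytes: first {first:.4f}s, second {second:.4f}s")
```

If the second read of the real model directory is much faster than the first, that would be consistent with OS-level file caching rather than anything spaCy does.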
-
Hello, we are also loading the Universal Sentence Encoder using hub.load. That one doesn't show this behavior. That's why I thought this was spaCy-specific.
-
Thanks for clarifying. The transformer model is also loaded from local disk.