Can I pre-compile a large dictionary created through PhraseMatcher? #10514
I have a large dictionary created through PhraseMatcher. Initialization takes around 30 seconds to load the terms into the dictionary.
Can I pre-create a dictionary object and load it later without running the same code as above? I am wondering whether, and how, I can speed up the initialization process.
Replies: 1 comment 12 replies
There isn't a feature for pre-compilation, and serialization rebuilds the internal state from text. You might be able to make things faster using pickle, but that would have to pull in the Vocab/nlp object, so I'm not sure how cleanly it would work (it might be fine).
The way you are adding things is a little weird and could maybe be improved. This might be faster:
Normally reducing calls to self.matcher.add might be faster, but if you have a lot of terms then building the list all at once could be causing inefficient behavior.
What tokenizer are you using? How big is your dictionary (labels/terms)?
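For reference, here is a rough sketch of the two ideas mentioned above: one add call per label with all of its patterns, and a pickle round trip of the finished matcher. It is not the code from the original post. The TERMS dictionary and the build_matcher helper are made up for illustration, and it assumes spaCy v3 with a blank English pipeline. Whether pickling actually saves time for a large dictionary is something you would need to measure, since the pickle also carries the Vocab.

```python
import pickle

import spacy
from spacy.matcher import PhraseMatcher

# Hypothetical dictionary (label -> terms); stands in for the real term list.
TERMS = {
    "FRUIT": ["apple", "blood orange", "kiwi"],
    "CITY": ["New York", "San Francisco"],
}


def build_matcher(nlp, terms_by_label):
    """Build a PhraseMatcher using a single add() call per label."""
    matcher = PhraseMatcher(nlp.vocab)
    for label, terms in terms_by_label.items():
        # tokenizer.pipe only tokenizes the terms; no other pipeline
        # components are run on them.
        patterns = list(nlp.tokenizer.pipe(terms))
        matcher.add(label, patterns)
    return matcher


# A blank pipeline is an assumption; swap in whatever tokenizer you use.
nlp = spacy.blank("en")
matcher = build_matcher(nlp, TERMS)

# Pickle round trip. In practice the bytes would be written to disk once
# and read back at startup.
blob = pickle.dumps(matcher)
restored = pickle.loads(blob)

doc = nlp("I ate a blood orange in New York.")
found = sorted(
    (nlp.vocab.strings[match_id], doc[start:end].text)
    for match_id, start, end in restored(doc)
)
print(found)
```

If you have a very large number of terms per label, it may be worth timing this against adding the patterns in smaller chunks, since the reply above suggests that building one huge list could itself be slow.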