The main setting to adjust for inference is the batch size, either by modifying nlp.batch_size or by passing nlp.pipe(batch_size=). See also: #8600
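For example, a minimal sketch of both options (the sample texts are just placeholders):

```python
import spacy

spacy.require_gpu()                      # run the transformer pipeline on the GPU (raises if none is available)
nlp = spacy.load("en_core_web_trf")

# Option 1: change the pipeline-wide default used by nlp.pipe()
nlp.batch_size = 128

# Option 2: override the batch size for a single nlp.pipe() call
texts = ["First document.", "Second document."]  # placeholder texts
docs = list(nlp.pipe(texts, batch_size=128))
```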

The batch size of 2000 in your script is a lot higher than the default of 64 in en_core_web_trf. Our usual default recommendations for trf pipelines are 64 or 128, so I would recommend starting in that range while testing and monitoring the maximum memory usage. If there is still lots of free memory, you can raise the batch size.
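One way to run that kind of test is a sketch like the following; it assumes the pipeline is running on a CUDA GPU through the PyTorch backend, so torch's peak-memory counters apply (that monitoring choice is an assumption here, not something spaCy exposes itself):

```python
import spacy
import torch

spacy.require_gpu()
nlp = spacy.load("en_core_web_trf")
nlp.batch_size = 64          # start in the recommended 64-128 range and only raise it if memory allows

sample_texts = ["..."]       # replace with a representative slice of your input data

torch.cuda.reset_peak_memory_stats()
for doc in nlp.pipe(sample_texts):
    pass                     # consume the generator so every batch actually runs

peak_gib = torch.cuda.max_memory_allocated() / 1024**3
print(f"peak GPU memory at batch_size={nlp.batch_size}: {peak_gib:.2f} GiB")
```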

The maximum batch size that can run without OOM errors depends a lot on the document lengths, so you may need to take a look at the distribution of text lengths in your input data, because one extremely long text can push an indi…
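A quick way to check that distribution before settling on a batch size (a sketch; the file path and loading code are placeholders for however your texts are stored):

```python
import statistics

# placeholder loading code: replace with however your texts are stored
with open("input_texts.txt", encoding="utf8") as f:
    texts = [line.strip() for line in f if line.strip()]

lengths = sorted(len(t) for t in texts)
print("texts:", len(lengths))
print("median chars:", statistics.median(lengths))
print("95th percentile chars:", lengths[int(0.95 * (len(lengths) - 1))])
print("max chars:", lengths[-1])
```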
