
I believe I am reading that the tok2vec, tagger, and attribute-ruler all must be enabled in the pipeline in order to utilize the built-in lemmatizer in the small English model. Is my understanding correct?

Yes.

... because it did not mention the tok2vec component as a dependency. From what I've seen, the lemmatizer does not produce lemmas without the tok2vec?

The lemmatizer doesn't depend on the tok2vec directly, but the tagger needs the tok2vec in order to work. If you had some other way to get POS tags without the tok2vec, the lemmatizer would happily use them.
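As a rough sketch of the dependency chain described above, the snippet below keeps only the components the lemmatizer relies on and excludes the rest when loading the model. The helper names (`components_to_exclude`, `load_lemma_pipeline`) and the `DEFAULT_SM_PIPES` list are illustrative assumptions, not part of spaCy; the component names are assumed to match the default `en_core_web_sm` pipeline in spaCy v3, so check `nlp.pipe_names` on your installed version.

```python
# Components assumed to be needed for lemmas in en_core_web_sm (spaCy v3):
# tok2vec feeds the tagger, the attribute_ruler maps tags to POS,
# and the rule-based lemmatizer reads those POS tags.
REQUIRED_FOR_LEMMAS = ("tok2vec", "tagger", "attribute_ruler", "lemmatizer")

# Assumed default component names; verify with nlp.pipe_names.
DEFAULT_SM_PIPES = ("tok2vec", "tagger", "parser", "attribute_ruler", "lemmatizer", "ner")


def components_to_exclude(pipe_names, keep=REQUIRED_FOR_LEMMAS):
    """Return the components that are not needed for lemmatization."""
    return [name for name in pipe_names if name not in keep]


def load_lemma_pipeline(model="en_core_web_sm"):
    """Load the model without the components the lemmatizer does not need."""
    import spacy  # imported here so the helper above works without spaCy installed

    return spacy.load(model, exclude=components_to_exclude(DEFAULT_SM_PIPES))
```

With the model installed, `nlp = load_lemma_pipeline()` followed by `[token.lemma_ for token in nlp("The cats were running")]` should give lemmas while skipping the parser and NER, which is usually where the time savings come from.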

I am noticing a big increase in the processing time.

What kind of documents are you working with (length/volume), and how …

@alyserecord

@polm

@adrianeboyd

Answer selected by polm
Labels
lang / en English language data and models feat / lemmatizer Feature: Rule-based and lookup lemmatization perf / speed Performance: speed
3 participants