Custom tok2vec/word vectors with pretrained morphologizer and lemmatizer #8354

Pandalei97 · 2021-06-11T09:26:08Z

Hi! I have a question about custom pipelines.

In our project, we use our own word vectors to initialize a tok2vec layer. For lemmatization, we are thinking of taking the morphologizer and lemmatizer from a pretrained spaCy pipeline. Does it make sense to source these two components directly from another pipeline? In particular, the morphologizer would then listen to a tok2vec layer different from the one it was trained with.

polm · 2021-06-12T07:44:21Z

The morphologizer should not be used with a tok2vec it wasn't trained with, because its weights expect the representations produced by that specific tok2vec. You can either package it with the original tok2vec it was trained with or retrain it; the former is probably easier.

I believe all the lemmatizers are rule-based, so it should be fine to bring them in from another pipeline.
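
For reference, a rough sketch of what "package it with the original tok2vec" could look like in a training config. This is only an illustration under some assumptions: `fr_core_news_sm` is a stand-in for whichever pretrained pipeline you source from (it is not named in this thread), the custom tok2vec settings are left out, and the fragment is not a complete config. The `replace_listeners` setting asks spaCy to give the sourced morphologizer its own copy of the tok2vec it was trained with, so it no longer listens to the custom tok2vec in the pipeline.

```ini
# Partial config fragment, not a full training config.
# "fr_core_news_sm" is an assumed example source pipeline.

[nlp]
lang = "fr"
pipeline = ["tok2vec","morphologizer","lemmatizer"]

# The project's own tok2vec, initialized from custom word vectors.
# Its model settings are omitted here.
[components.tok2vec]
factory = "tok2vec"

# Sourced morphologizer: replace its listener with a private copy
# of the tok2vec from the source pipeline.
[components.morphologizer]
source = "fr_core_news_sm"
replace_listeners = ["model.tok2vec"]

# Sourced lemmatizer: has no tok2vec dependency of its own.
[components.lemmatizer]
source = "fr_core_news_sm"

# Keep the sourced components unchanged during training.
[training]
frozen_components = ["morphologizer","lemmatizer"]
```

Note that a rule-based lemmatizer still relies on the part-of-speech tags set earlier in the pipeline, so the morphologizer should come before it.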
