spaCy Project: Part-of-speech Tagging & Dependency Parsing #9916
I am currently working with this spaCy project template. I was wondering whether it is possible to train a lemmatizer along with the existing POS tagger, morphologizer and dependency parser. Currently, the pipeline in the config file is the following:
The components field in the config file looks like this:
If it is possible to train a lemmatizer together with the rest of the pipeline, could you please help me understand which component to add to the config file and what other changes need to be made?
Replies: 1 comment 12 replies
There is currently no trainable lemmatizer in the core spaCy library. There's an experimental trainable edit tree lemmatizer in development, see:
https://explosion.ai/blog/edit-tree-lemmatizer
There's a UD benchmark project that uses it (also with a trainable tokenizer that I wouldn't recommend outside of benchmarking, see https://explosion.ai/blog/ud-benchmarks-v3-2):
https://github.com/explosion/projects/tree/v3/benchmarks/ud_benchmark
I would currently guess that the edit tree lemmatizer could move into the core library in v3.3.0, but we haven't made an official decision yet.
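To make the shape of the config change concrete, here is a minimal sketch of what adding a component involves: a new name in the `[nlp]` pipeline list plus a matching `[components.<name>]` block with a `factory` entry. Everything specific in it is an assumption for illustration: the excerpt in `BASE_CONFIG` is hypothetical (the names are guessed from the question, not copied from the template), `add_component` is a hypothetical helper, and the factory string `experimental_edit_tree_lemmatizer` should be checked against the config in the benchmark project linked above. It uses only the Python standard library and does not call spaCy.

```python
import configparser
import io
import json

# Hypothetical excerpt of a training config. The component names are assumed
# from the question (tagger, morphologizer, parser), not taken from the template.
BASE_CONFIG = """
[nlp]
lang = "en"
pipeline = ["tok2vec","tagger","morphologizer","parser"]

[components]

[components.tagger]
factory = "tagger"
"""


def add_component(config_text, name, factory):
    """Return config text with `name` appended to the pipeline list and a
    [components.<name>] section whose factory is set to `factory`."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(config_text)

    # The pipeline value is a JSON-style list stored as a string.
    pipeline = json.loads(parser["nlp"]["pipeline"])
    if name not in pipeline:
        pipeline.append(name)
    parser["nlp"]["pipeline"] = json.dumps(pipeline)

    section = f"components.{name}"
    if not parser.has_section(section):
        parser.add_section(section)
    # String values are quoted in the config, so serialize with json.dumps.
    parser[section]["factory"] = json.dumps(factory)

    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Assumed factory name; verify it against the linked benchmark project.
updated = add_component(BASE_CONFIG, "lemmatizer", "experimental_edit_tree_lemmatizer")
print(updated)
```

A real config would also need the component's model and training settings, so treat the benchmark project's config as the reference rather than this sketch.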