Can you elaborate on the internals of the trf model? #11917
-
Hello,

When a regular English pipeline (sm, md, lg) is used, dependency parsing requires statistical modules to be run first (tokenizer, tagger), and the attribute_ruler (pos) is also required to access the noun chunks. When a transformer-based pipeline (trf) is used, dependency parsing requires the transformer, and the tagger and pos require the attribute_ruler to be added to the pipeline. I would like to reinforce my understanding of the process when using the second pipeline, and make sure I interpret precisely how tokenization and tagging are performed before the parsing component. Can someone elaborate on the algorithms and logic of the "tagger", "attribute_ruler", and "transformer" components when the transformer-based pipeline is used, how they differ from the _en pipelines, and how they are connected to the parser component at the end? I'll be reading the documentation and source code in the meantime, but input from someone experienced would help me a lot. Thanks in advance.
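For context, here is roughly how I've been comparing the two pipelines. Just a minimal sketch, assuming `en_core_web_sm` and `en_core_web_trf` are both installed:

```python
import spacy

# Compare the component lists of a regular and a transformer pipeline.
# Assumes both packages are installed, e.g. via:
#   python -m spacy download en_core_web_sm
#   python -m spacy download en_core_web_trf
nlp_sm = spacy.load("en_core_web_sm")
nlp_trf = spacy.load("en_core_web_trf")

print(nlp_sm.pipe_names)   # e.g. ['tok2vec', 'tagger', 'parser', 'attribute_ruler', 'lemmatizer', 'ner']
print(nlp_trf.pipe_names)  # e.g. ['transformer', 'tagger', 'parser', 'attribute_ruler', 'lemmatizer', 'ner']

# analyze_pipes reports which attributes each component assigns and requires.
nlp_trf.analyze_pipes(pretty=True)
```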
-
I hope you've had a chance to read the docs and found them helpful. To address a few of your questions...
Your summary of the dependencies is not quite correct. Some components depend on a tok2vec/transformer, but unlike classical NLP pipelines, the parser in spaCy doesn't depend on the tagger, for example.

The only major difference between the trf and non-trf pipelines we distribute is the use of transformers as a feature source. They are trained on the same data, and the implementations of the individual components are the same. See parts of the docs like this section on sharing embeddings for more details on how that works.

Regarding the other components, you should check the architecture docs for details. The attribute ruler is just rule-based, though, so it isn't covered there; it simply matches patterns in the input and applies attributes such as POS labels.
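To make the shared-embedding setup a bit more concrete, here is a rough sketch (assuming `en_core_web_trf` is installed). It only illustrates the idea, not the exact internals: the transformer runs once per `Doc` and stores its output, and downstream components read that shared representation through listener layers rather than consuming each other's predictions:

```python
import spacy

# Assumes en_core_web_trf is installed.
nlp = spacy.load("en_core_web_trf")

doc = nlp("Autonomous cars shift insurance liability toward manufacturers.")

# The transformer's output (wordpieces, tensors, and their alignment to
# spaCy tokens) is stored once on the Doc for all listeners to share.
print(type(doc._.trf_data))

# The transformer component tracks which components listen to its output;
# listener_map is an internal attribute, shown here only for illustration.
trf = nlp.get_pipe("transformer")
print(list(trf.listener_map.keys()))  # e.g. ['tagger', 'parser', 'ner']
```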
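And since the attribute ruler is purely rule-based, its behavior is easy to show in isolation. A minimal sketch with a made-up rule (the shipped pipelines load their own pattern tables; the NN-to-NOUN mapping below is just illustrative):

```python
from spacy.lang.en import English

# Build a blank English pipeline containing only an attribute_ruler.
nlp = English()
ruler = nlp.add_pipe("attribute_ruler")

# One illustrative rule: wherever a token's fine-grained tag is "NN",
# set its coarse-grained POS to "NOUN". Patterns use Matcher syntax.
ruler.add(patterns=[[{"TAG": "NN"}]], attrs={"POS": "NOUN"})

# Tokenize only (make_doc skips pipeline components), fake a tagger
# output by hand, then apply the ruler directly.
doc = nlp.make_doc("The dog barked")
doc[1].tag_ = "NN"   # normally the tagger would set this
doc = ruler(doc)
print(doc[1].pos_)   # NOUN
```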