Can you elaborate on the internals of the trf model? #11917
-
Hello,

When a regular English pipeline (sm, md, lg) is used, dependency parsing requires statistical modules to be run first (tokenizer, tagger), and the attribute_ruler (pos) is also required to access the noun chunks. When a transformer-based pipeline (trf) is used, dependency parsing requires the transformer, and the tagger and pos require the attribute_ruler to be added to the pipeline. I would like to reinforce my understanding of the process when using the second pipeline, and make sure I interpret precisely how tokenization and tagging are performed before the parsing component. Can someone elaborate on the algorithms and logic of the "tagger", "attribute_ruler", and "transformer" components when the transformer-based pipeline is used, how they differ from the _en pipelines, and how they are connected to the parser component at the end? I'll be reading the documentation and source code in the meantime, but input from someone experienced would help me a lot. Thanks in advance.
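For context, here is roughly how I've been comparing the two pipelines. Just a minimal sketch, assuming `en_core_web_sm` and `en_core_web_trf` are both installed:

```python
import spacy

# Compare the component lists of a regular and a transformer pipeline.
# Assumes both packages are installed, e.g. via:
#   python -m spacy download en_core_web_sm
#   python -m spacy download en_core_web_trf
nlp_sm = spacy.load("en_core_web_sm")
nlp_trf = spacy.load("en_core_web_trf")

print(nlp_sm.pipe_names)   # e.g. ['tok2vec', 'tagger', 'parser', 'attribute_ruler', 'lemmatizer', 'ner']
print(nlp_trf.pipe_names)  # e.g. ['transformer', 'tagger', 'parser', 'attribute_ruler', 'lemmatizer', 'ner']

# analyze_pipes reports which attributes each component assigns and requires.
nlp_trf.analyze_pipes(pretty=True)
```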
-
I hope you've had a chance to read the docs and found them helpful. To address a few of your questions...
Your summary of the dependencies is not quite correct. Some components depend on a tok2vec/transformer, but unlike classical NLP pipelines, the parser in spaCy doesn't depend on the tagger, for example.

The only major difference between the trf and non-trf pipelines we distribute is the use of transformers as a feature source. They are trained on the same data, and the implementations of the individual components are the same. See parts of the docs like this section on sharing embeddings for more details on how that works.

Regarding the other components, you should check the architecture docs for details. The attribute ruler is just rule-based, though, so it isn't covered there; it simply matches patterns in the input and applies attributes such as POS labels.
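To make the shared-embedding setup a bit more concrete, here is a rough sketch (assuming `en_core_web_trf` is installed). It only illustrates the idea, not the exact internals: the transformer runs once per `Doc` and stores its output, and downstream components read that shared representation through listener layers rather than consuming each other's predictions:

```python
import spacy

# Assumes en_core_web_trf is installed.
nlp = spacy.load("en_core_web_trf")

doc = nlp("Autonomous cars shift insurance liability toward manufacturers.")

# The transformer's output (wordpieces, tensors, and their alignment to
# spaCy tokens) is stored once on the Doc for all listeners to share.
print(type(doc._.trf_data))

# The transformer component tracks which components listen to its output;
# listener_map is an internal attribute, shown here only for illustration.
trf = nlp.get_pipe("transformer")
print(list(trf.listener_map.keys()))  # e.g. ['tagger', 'parser', 'ner']
```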
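And since the attribute ruler is purely rule-based, its behavior is easy to show in isolation. A minimal sketch with a made-up rule (the shipped pipelines load their own pattern tables; the NN-to-NOUN mapping below is just illustrative):

```python
from spacy.lang.en import English

# Build a blank English pipeline containing only an attribute_ruler.
nlp = English()
ruler = nlp.add_pipe("attribute_ruler")

# One illustrative rule: wherever a token's fine-grained tag is "NN",
# set its coarse-grained POS to "NOUN". Patterns use Matcher syntax.
ruler.add(patterns=[[{"TAG": "NN"}]], attrs={"POS": "NOUN"})

# Tokenize only (make_doc skips pipeline components), fake a tagger
# output by hand, then apply the ruler directly.
doc = nlp.make_doc("The dog barked")
doc[1].tag_ = "NN"   # normally the tagger would set this
doc = ruler(doc)
print(doc[1].pos_)   # NOUN
```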