What does the HashEmbed layer actually learn? #8007
CedricMingneau asked this question in Help: Other Questions
Hello spaCy community!
As spaCy plays a significant role in my thesis, I've been doing some research on the inner workings of the transition-based NER component. So far I've largely based my understanding on Matthew's video on YouTube, and I feel I understand the general principles.
One thing I've always struggled with, though, is why the HashEmbed layer actually learns anything. It seems to me that the primary purpose of this layer is to create unique vector representations of words without having to create an entry in the embedding table for every unique word. But after checking the thinc documentation and this forum post, it appears that the values in the embedding table are updated during backpropagation after every training iteration. What's the intuition behind that? Right now I feel as if the embedding table is 'learning new words', and that the resulting word embeddings somehow tell the following layers whether or not a word is known. Is this the right way to look at it?
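To make sure we're talking about the same thing, here is my mental model of the layer as a rough sketch. This is my own illustration of the hashing trick, not thinc's actual code: the table size, the choice of blake2b as the hash, and the function names are all made up.

```python
import numpy as np
from hashlib import blake2b

N_ROWS = 5000   # fixed table size, far smaller than the vocabulary
N_DIM = 96      # embedding width
N_SEEDS = 4     # number of independent hashes per word
rng = np.random.default_rng(0)
table = rng.normal(scale=0.1, size=(N_ROWS, N_DIM))  # trainable parameters

def row_ids(word: str) -> list[int]:
    # Hash the same word with several different salts, mod the table size.
    # Each individual hash can collide, but it is unlikely that two words
    # collide on all seeds at once.
    return [
        int.from_bytes(
            blake2b(word.encode("utf8"), salt=bytes([s]) * 16).digest()[:8],
            "little",
        ) % N_ROWS
        for s in range(N_SEEDS)
    ]

def embed(word: str) -> np.ndarray:
    # A word's vector is the sum of the table rows its hashes select.
    return table[row_ids(word)].sum(axis=0)

def backprop(word: str, d_vector: np.ndarray, lr: float = 0.01) -> None:
    # During backpropagation, each row that contributed to the word's
    # vector receives the gradient, just like rows of an ordinary
    # embedding table.
    for i in row_ids(word):
        table[i] -= lr * d_vector
```

With this picture in mind, each row ends up shared by many unrelated words, which is exactly why I'm unsure what an individual update to the table is supposed to mean.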
Thanks in advance!