padding, softmax, embeddings #13
Hi,
I have two questions regarding the CAML implementation:
- All the texts in a batch are padded, but the input to the softmax function is not masked. Hence, this implementation also assigns positive attention weights to padding tokens, right? Am I missing something here?
- The embedding vector belonging to the padding token does not seem to be fixed to the zero vector. If it isn't, where is that constraint implemented? (I guess it wouldn't make a difference if 1. were handled differently, i.e. if the attention weights for padding tokens were fixed to 0.)
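To illustrate what I mean by masking, here is a minimal sketch of a masked softmax in NumPy (the function name and shapes are illustrative, not taken from the CAML code): positions flagged as padding get a score of negative infinity before normalization, so they receive exactly zero attention weight.

```python
import numpy as np

def masked_softmax(scores, mask):
    """Softmax over the last axis with zero weight on padding positions.

    scores: (batch, seq_len) raw attention scores
    mask:   (batch, seq_len), 1 for real tokens, 0 for padding
    Hypothetical helper for illustration only.
    """
    scores = np.where(mask.astype(bool), scores, -np.inf)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)  # exp(-inf) == 0, so padding contributes nothing
    return weights / weights.sum(axis=-1, keepdims=True)

scores = np.array([[2.0, 1.0, 0.5]])
mask = np.array([[1, 1, 0]])  # last position is padding
alpha = masked_softmax(scores, mask)
# alpha[0, 2] is exactly 0, and alpha still sums to 1 over the real tokens
```

For the second point, I believe the usual way to pin the padding embedding to zero in PyTorch would be `nn.Embedding(vocab_size, dim, padding_idx=pad_id)`, which keeps that row at zero; I couldn't find an equivalent constraint here.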
Many thanks!