
Hello — a question not about this model, but the attention mask question you once raised on google-bert. #11

Description

@Yesgo1220

In the BERT source, create_attention_mask_from_input_mask carries this comment: "We don't assume that from_tensor is a mask (although it could be). We don't actually care if we attend from padding tokens (only to padding) tokens so we create a tensor of all ones." So the padded query positions still receive meaningless attention scores — are those handled somewhere later on? This has puzzled me for a long time. Thanks!
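For anyone landing here, a minimal NumPy sketch of the broadcast that function performs (shapes and variable names are illustrative, not copied from the repo):

```python
import numpy as np

# Toy batch: seq_len 5, last two tokens are padding (input_mask = 0).
input_mask = np.array([[1, 1, 1, 0, 0]], dtype=np.float32)  # [batch, to_seq_len]
batch, seq_len = input_mask.shape

# Reproduces the broadcast in create_attention_mask_from_input_mask:
# ones over the "from" (query) axis times the mask over the "to" (key) axis.
broadcast_ones = np.ones((batch, seq_len, 1), dtype=np.float32)  # [batch, from, 1]
to_mask = input_mask.reshape(batch, 1, seq_len)                  # [batch, 1, to]
attention_mask = broadcast_ones * to_mask                        # [batch, from, to]

print(attention_mask[0])
# Every row — including rows 3 and 4, the padded queries — is [1 1 1 0 0]:
# padded *keys* are masked out, padded *queries* are not.
```

As I understand the stock BERT code, the padded-query rows do get computed, but their outputs are never consumed downstream: the MLM loss in run_pretraining.py is weighted by label_weights over the actually predicted positions, and the classification pooler reads only the first ([CLS]) token. The meaningless scores are therefore discarded implicitly rather than removed explicitly.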
