
A bug: subj_mask and obj_mask don't mask the padding tokens #18

@Lvzhh

Hi, thanks for sharing your code. I noticed a bug that would affect the experimental results.

The line below builds subj_mask and obj_mask by checking whether subj_pos or obj_pos is 0. But the DataLoader also pads the subj_pos and obj_pos of shorter sequences with 0, so subj_mask and obj_mask don't mask the padding tokens.

subj_mask, obj_mask = subj_pos.eq(0).eq(0).unsqueeze(2), obj_pos.eq(0).eq(0).unsqueeze(2) # invert mask

This affects the subsequent subject and object pooling operations, because the representation vectors of padding tokens are not 0 (for example, a linear transformation adds its bias term to these vectors).
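To make the leak concrete, here is a minimal sketch in which plain Python lists stand in for the tensors, so it runs without PyTorch. The toy sentence, the hidden values, and the max_pool helper are all made up for illustration; the pooling is assumed to fill masked positions with negative infinity before taking the max.

```python
# Toy sentence: 3 real tokens padded to length 5; token 1 is the subject.
# Offsets relative to the subject: 0 marks a subject token.
# The padding positions (3 and 4) are ALSO filled with 0.
subj_pos = [-1, 0, 1, 0, 0]

# Plain-list equivalent of subj_pos.eq(0).eq(0): True = exclude from pooling.
subj_mask = [p != 0 for p in subj_pos]

# Hypothetical hidden states, one feature per token. The padding rows are
# nonzero, e.g. because a linear layer added its bias to them.
hidden = [0.2, 0.5, 0.1, 9.0, 9.0]

def max_pool(values, mask):
    """Max over the positions whose mask entry is False (illustrative helper)."""
    return max(float("-inf") if m else v for v, m in zip(values, mask))

pooled = max_pool(hidden, subj_mask)
leaked = [i for i, m in enumerate(subj_mask) if not m and i >= 3]

print(subj_mask)  # padding positions 3 and 4 are left unmasked
print(pooled)     # the padding value wins over the real subject value 0.5
```

In this toy case the pooled subject representation comes from a padding position rather than from the subject token.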

Changing it to the following would fix the problem:

subj_mask, obj_mask = subj_pos.eq(0).eq(0), obj_pos.eq(0).eq(0) # invert mask
subj_mask = (subj_mask | masks).unsqueeze(2)  # logical or with word masks
obj_mask = (obj_mask | masks).unsqueeze(2)
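As a quick sanity check of the proposed fix, the sketch below mirrors the logical OR with plain Python lists instead of tensors. It assumes, as the fix implies, that masks is True at padding positions; the toy values and the max_pool helper are hypothetical.

```python
# Toy sentence: 3 real tokens padded to length 5; token 1 is the subject.
subj_pos = [-1, 0, 1, 0, 0]                   # padding offsets are 0 too
masks = [False, False, False, True, True]     # assumed: True = padding token

# Original mask: True = exclude from pooling. Padding slips through.
subj_mask = [p != 0 for p in subj_pos]

# Proposed fix: logical OR with the word masks, so padding is excluded as well.
fixed_mask = [s or m for s, m in zip(subj_mask, masks)]

# Hypothetical hidden states; padding rows are nonzero (e.g. from a bias term).
hidden = [0.2, 0.5, 0.1, 9.0, 9.0]

def max_pool(values, mask):
    """Max over the positions whose mask entry is False (illustrative helper)."""
    return max(float("-inf") if m else v for v, m in zip(values, mask))

pooled_fixed = max_pool(hidden, fixed_mask)

print(fixed_mask)    # only the subject token (index 1) stays unmasked
print(pooled_fixed)  # pooling now sees the subject value only
```

With the OR applied, only the real subject token contributes to the pooled vector in this example.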
