Hi, thanks for sharing your code. I noticed a bug that could affect the experimental results.
The line of code below constructs subj_mask and obj_mask according to whether subj_pos or obj_pos equals 0. But in DataLoader, shorter sequences are also padded with 0 in their subj_pos and obj_pos, so subj_mask and obj_mask fail to mask out the padding tokens.
gcn-over-pruned-trees/model/gcn.py
Line 88 in db7c128
subj_mask, obj_mask = subj_pos.eq(0).eq(0).unsqueeze(2), obj_pos.eq(0).eq(0).unsqueeze(2) # invert mask
This affects the subsequent subject and object pooling operations, because the representation vectors for padding tokens are not zero (for example, a linear transformation adds its bias term to these vectors).
Changing it to the following fixes the problem:
subj_mask, obj_mask = subj_pos.eq(0).eq(0), obj_pos.eq(0).eq(0) # invert mask
subj_mask = (subj_mask | masks).unsqueeze(2) # logical or with word masks
obj_mask = (obj_mask | masks).unsqueeze(2)
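To illustrate why the OR with the word masks is needed, here is a minimal sketch of the two mask computations, with plain Python list operations standing in for the elementwise tensor ops. The toy subj_pos and masks values are made up for illustration; in the model, True in these masks means the position is excluded from the pooling.

```python
# Toy sequence of length 6: 4 real tokens followed by 2 padding tokens.
# subj_pos is 0 at subject tokens, but the padding positions are ALSO 0.
subj_pos = [-1, 0, 0, 2, 0, 0]                      # last two zeros are padding
masks    = [False, False, False, False, True, True] # True at padding tokens

# Buggy mask (subj_pos.eq(0).eq(0)): True means "masked out of the pool".
# Padding positions come out False, so they are wrongly pooled as subject tokens.
buggy = [p != 0 for p in subj_pos]
print(buggy)   # [True, False, False, True, False, False]

# Fixed mask: OR with the word masks so padding is always masked out.
fixed = [(p != 0) or m for p, m in zip(subj_pos, masks)]
print(fixed)   # [True, False, False, True, True, True]
```

The last two positions flip from False (pooled) to True (masked out), which is exactly the change the proposed fix makes on the real tensors.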