
masking makes data have different shape, leading to stack problem #2

@starrlee356


Hi, thanks for sharing the code!
There is a problem I couldn't solve in word2box-dev-shib/src/language_modeling_with_boxes/datasets/word2vecgpu.py.

In the method __getitem__(self, idx), the line idx = idx.unsqueeze(1) + window_range.unsqueeze(0) raised AttributeError: 'int' object has no attribute 'unsqueeze'. I found that idx is originally an int, so I changed idx += self.pad_size into idx = torch.full(size=tuple([self.pad_size]),fill_value=idx), which made this error go away.
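A minimal sketch of this step in isolation may help when comparing fixes. The values of pad_size and window_range below are hypothetical stand-ins, not taken from the repository. It shows the original error, what the torch.full change produces (pad_size copies of the same index, with the pad offset no longer added), and an alternative that wraps the int in a one-element tensor. Whether either variant matches the behaviour the dataset class intends is an open question.

```python
import torch

# Hypothetical stand-ins for the dataset attributes.
pad_size = 10
window_range = torch.arange(-5, 6)
window_range = window_range[window_range != 0]  # 10 offsets, 0 excluded

idx = 3  # a plain int, as passed by a default sampler

# The original line fails because an int has no tensor methods.
try:
    idx.unsqueeze(1)
except AttributeError as err:
    message = str(err)

# The change described above: pad_size copies of idx, no pad offset added.
full_idx = torch.full(size=(pad_size,), fill_value=idx)       # shape [10]
ctx_full = full_idx.unsqueeze(1) + window_range.unsqueeze(0)  # shape [10, 10]

# Alternative sketch: keep the pad offset and wrap the int in a tensor.
single_idx = torch.tensor([idx]) + pad_size                       # shape [1]
ctx_single = single_idx.unsqueeze(1) + window_range.unsqueeze(0)  # shape [1, 10]
```

With torch.full, every row of ctx_full is identical, which would explain a [10, 10] context for a single idx.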

However, after getting context (tensor[10, 10]) and center (tensor[10, 1]) from the corpus, it raises RuntimeError: stack expects each tensor to be equal size. I supposed this was due to the shape difference between center and context, so I added center = center.unsqueeze(len(context.shape)-1) and center = center.expand_as(context) to give center the same shape as context.
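The shape mismatch and the expand_as change can be reproduced on their own. This is only a sketch with made-up tensors: it assumes center starts as a 1-D tensor of length 10 before the unsqueeze, and it calls torch.stack directly rather than going through the repository's data loading code.

```python
import torch

# Made-up tensors with the shapes reported above.
context = torch.zeros(10, 10, dtype=torch.long)
center = torch.arange(10)  # assumed shape [10] before the unsqueeze

# unsqueeze on the last context axis turns [10] into [10, 1].
center_col = center.unsqueeze(len(context.shape) - 1)

# Stacking [10, 1] with [10, 10] fails: stack needs identical shapes.
try:
    torch.stack([center_col, context])
except RuntimeError as err:
    stack_error = str(err)

# expand_as repeats the single column without copying memory.
center_exp = center_col.expand_as(context)      # shape [10, 10]
stacked = torch.stack([center_exp, context])    # shape [2, 10, 10]
```

After the expansion each row of center_exp holds one center id repeated 10 times, so the stack succeeds, at the cost of carrying redundant values.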

Then a new problem appears: after using keep to drop some of the data, the data has a varying shape, tensor[x, 10], where x is an int between 1 and 10. This leads to the RuntimeError again, because stack expects all tensors to have the same shape. I don't know how to handle this... Could you please offer some advice? Thank you so much!
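For reference, the variable-length failure can be reproduced with made-up data, and there are two generic ways to batch such tensors: concatenating along the first axis, or padding to a common length. Both are sketches under assumptions. The helper names cat_collate and pad_collate are hypothetical, the keep masks are invented, and neither is necessarily how the repository is meant to be used. pad_sequence is the existing utility in torch.nn.utils.rnn.

```python
import torch
from torch.nn.utils.rnn import pad_sequence

# Made-up samples: each keeps a different number of rows after masking.
kept_rows = [3, 10, 1, 7]
samples = []
for n in kept_rows:
    data = torch.ones(10, 10, dtype=torch.long)
    keep = torch.arange(10) < n        # invented boolean mask
    samples.append(data[keep])         # shape [n, 10]

# Stacking tensors with different first dimensions fails.
try:
    torch.stack(samples)
except RuntimeError as err:
    stack_error = str(err)

def cat_collate(batch):
    """Hypothetical collate: join all kept rows into one [sum(n), 10] tensor."""
    return torch.cat(batch, dim=0)

def pad_collate(batch, pad_value=0):
    """Hypothetical collate: pad every sample to the longest, keep lengths."""
    lengths = torch.tensor([b.shape[0] for b in batch])
    padded = pad_sequence(batch, batch_first=True, padding_value=pad_value)
    return padded, lengths

flat = cat_collate(samples)              # shape [21, 10]
padded, lengths = pad_collate(samples)   # shape [4, 10, 10]
```

Either function could be passed as collate_fn to torch.utils.data.DataLoader. Concatenation only makes sense if each row is an independent training example; padding keeps the per-sample grouping but needs the lengths (or a mask) downstream.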
