Masking makes data have different shapes, leading to a stack error #2

Hi, thanks for sharing the code!
There is a problem I couldn't solve in `word2box-dev-shib/src/language_modeling_with_boxes/datasets/word2vecgpu.py`.

In the method `__getitem__(self, idx)`, the line `idx = idx.unsqueeze(1) + window_range.unsqueeze(0)` raised `AttributeError: 'int' object has no attribute 'unsqueeze'`. I found that `idx` is originally an int, so I changed `idx += self.pad_size` to `idx = torch.full(size=tuple([self.pad_size]), fill_value=idx)`, which resolved this error.
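For reference, here is a minimal sketch of another way to handle the integer index. It assumes that `window_range` is a 1-D tensor of offsets and that `pad_size` is an integer offset; the concrete values and the helper name `window_indices` are hypothetical and not taken from the repository. The sketch converts `idx` to a 1-D tensor while keeping the `pad_size` offset, so the `unsqueeze` line works for either an int or a tensor of indices. Note that `torch.full(size=(pad_size,), fill_value=idx)` creates `pad_size` copies of the same index instead of adding the offset, which may or may not be the intended behaviour.

```python
import torch

# Hypothetical values, chosen only for illustration.
pad_size = 5
window_range = torch.arange(-pad_size, pad_size + 1)  # offsets -5..5, shape [11]


def window_indices(idx, pad_size, window_range):
    """Return window positions for an int index or a tensor of indices."""
    # Accept a plain int as well as a tensor, and always end up with shape [B].
    idx = torch.as_tensor(idx, dtype=torch.long).reshape(-1)
    # Keep the original offset into the padded corpus.
    idx = idx + pad_size
    # Broadcast [B, 1] + [1, W] -> [B, W].
    return idx.unsqueeze(1) + window_range.unsqueeze(0)


single = window_indices(3, pad_size, window_range)
batch = window_indices(torch.tensor([3, 4, 7]), pad_size, window_range)
print(single.shape, batch.shape)
```

With a single int the result has one row, and with a tensor of indices it has one row per index.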

However, after getting `context` (a tensor of shape [10, 10]) and `center` (a tensor of shape [10, 1]) from the corpus, it raises `RuntimeError: stack expects each tensor to be equal size`. I assumed this was due to the shape difference between `center` and `context`, so I added `center = center.unsqueeze(len(context.shape)-1)` and `center = center.expand_as(context)` to give `center` the same shape as `context`.

Then a new problem appears: after `keep` is used to drop some of the data, the samples have different shapes. Each one is a tensor of shape [x, 10], where x is an integer between 1 and 10. This raises the `RuntimeError` again, because stack expects all tensors to have the same shape. I don't know how to handle this. Could you please offer some advice? Thank you so much!
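One generic way to batch samples with a varying number of rows is a custom `collate_fn` that concatenates along the first dimension instead of stacking. The sketch below is only an illustration under stated assumptions: `ToyMaskedDataset` and `cat_collate` are hypothetical names, the toy dataset merely imitates samples of shape [x, 10] and [x, 1], and it is not known whether this matches how the repository intends the dataset to be used.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class ToyMaskedDataset(Dataset):
    """Hypothetical stand-in: each sample keeps a different number of rows."""

    def __len__(self):
        return 8

    def __getitem__(self, idx):
        x = idx % 3 + 1  # number of rows left after masking (1, 2 or 3)
        center = torch.full((x, 1), idx, dtype=torch.long)
        context = torch.full((x, 10), idx, dtype=torch.long)
        return center, context


def cat_collate(samples):
    """Concatenate variable-length samples along dim 0 instead of stacking."""
    centers, contexts = zip(*samples)
    return torch.cat(centers, dim=0), torch.cat(contexts, dim=0)


loader = DataLoader(ToyMaskedDataset(), batch_size=4, collate_fn=cat_collate)
batches = list(loader)
for center, context in batches:
    print(center.shape, context.shape)
```

Concatenation keeps every surviving row and lets the batch size vary. If a fixed batch shape is needed, padding each sample to a common length (for example with `torch.nn.utils.rnn.pad_sequence`) together with a mask is another option.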
