
@sugary199

Previously, pad_embeds was incorrectly constructed by repeating the pad_embed tensor along the wrong dimension, leading to a size mismatch when attempting to concatenate it with inputs['inputs_embeds'].
The error message is as follows:

Process Process-1:
Traceback (most recent call last):
  File "/ML-A100/team/mm/shuyu/anaconda3/envs/intern_clean/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/ML-A100/team/mm/shuyu/anaconda3/envs/intern_clean/lib/python3.9/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/ML-A100/team/mm/shuyu/workspace/projects/InternLM-XComposer/cap_train.py", line 159, in inferCaptionsAndSave
    inputs = torch.cat([pad_embeds, inputs['inputs_embeds']], dim=1)
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 57 but got size 1 for tensor number 1 in the list.
FINISHED!

Modification: specify dimension 1 when repeating pad_embed to build pad_embeds, so that it can be concatenated with inputs['inputs_embeds'] along the sequence dimension.

This issue was not triggered by the official examples because their token counts differ by only 1 across the batch. To reproduce it, I increased the difference in token counts between the two examples.
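For illustration, here is a minimal sketch of the shape mismatch and the fix. The variable names follow the traceback, while the concrete shapes (batch size, sequence length, hidden size, number of pad positions) are assumptions, not the values used in cap_train.py:

```python
import torch

# Assumed, illustrative shapes -- the real script may differ.
batch, seq_len, hidden = 1, 57, 4096
pad_embed = torch.randn(1, 1, hidden)                   # embedding of a single pad token
inputs_embeds = torch.randn(batch, seq_len, hidden)     # stands in for inputs['inputs_embeds']
num_pad = 7                                             # pad positions needed to align lengths

# Before: repeating along dim 0 gives shape (num_pad, 1, hidden), which cannot be
# concatenated with (batch, seq_len, hidden) along dim 1, since dim 0 must match.
# pad_embeds = pad_embed.repeat(num_pad, 1, 1)

# After: repeat along dim 1 so the batch and hidden dims line up and only the
# sequence dimension (dim 1) differs.
pad_embeds = pad_embed.repeat(batch, num_pad, 1)        # shape (batch, num_pad, hidden)

padded = torch.cat([pad_embeds, inputs_embeds], dim=1)  # (batch, seq_len + num_pad, hidden)
print(padded.shape)                                     # torch.Size([1, 64, 4096])
```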
