IndexError: too many indices for tensor of dimension 2 #87

@YFeather

Description

I hit a new error:

Traceback (most recent call last):
  File "/Megatron-LM/pretrain_gpt.py", line 148, in <module>
    pretrain(train_valid_test_datasets_provider,
  File "/Megatron-LM/megatron/training.py", line 161, in pretrain
    iteration = train(forward_step_func,
  File "/Megatron-LM/megatron/training.py", line 740, in train
    train_step(forward_step_func,
  File "/Megatron-LM/megatron/training.py", line 434, in train_step
    losses_reduced = forward_backward_func(
  File "/Megatron-LM/megatron/core/pipeline_parallel/schedules.py", line 360, in forward_backward_no_pipelining
    output_tensor = forward_step(forward_step_func, data_iterator,
  File "/Megatron-LM/megatron/core/pipeline_parallel/schedules.py", line 218, in forward_step
    output_tensor, loss_func = forward_step_func(data_iterator, model)
  File "/Megatron-LM/pretrain_gpt.py", line 81, in forward_step
    tokens, labels, loss_mask, attention_mask, position_ids = get_batch(
  File "/Megatron-LM/pretrain_gpt.py", line 46, in get_batch
    data_b = tensor_parallel.broadcast_data(keys, data, datatype)
  File "/Megatron-LM/megatron/core/tensor_parallel/data.py", line 76, in broadcast_data
    key_size, key_numel, total_numel = _build_key_size_numel_dictionaries(keys,
  File "/Megatron-LM/megatron/core/tensor_parallel/data.py", line 31, in _build_key_size_numel_dictionaries
    assert data[key].dim() < max_dim, 'you should increase MAX_DATA_DIM'
IndexError: too many indices for tensor of dimension 2

I checked `data`: the error happens because `data` is a tensor, not a dictionary, so `data[key]` indexes the tensor with the key instead of looking it up. I don't know where the key is supposed to be added to `data`.
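To illustrate the mismatch, here is a minimal plain-Python sketch (not the Megatron source; `FakeTensor` is a stand-in for `torch.Tensor`, and `build_key_size_numel` is a simplified, hypothetical version of `_build_key_size_numel_dictionaries`). The helper iterates over `keys` and does `data[key].dim()`, so `data` must be a dict mapping each key to a tensor; the data iterator in `get_batch` has to yield such a dict, not a raw tensor:

```python
class FakeTensor:
    """Stand-in for a 2-D token tensor (e.g. [batch, seq_len])."""
    def __init__(self, shape):
        self.shape = shape

    def dim(self):
        return len(self.shape)


MAX_DATA_DIM = 5  # the limit referenced by the assertion message


def build_key_size_numel(keys, data):
    # Simplified sketch of Megatron's helper: it looks each key up in
    # `data`, which therefore must be a dict of key -> tensor. If `data`
    # is itself a tensor, `data[key]` indexes the tensor instead.
    sizes = {}
    for key in keys:
        assert data[key].dim() < MAX_DATA_DIM, 'you should increase MAX_DATA_DIM'
        sizes[key] = data[key].shape
    return sizes


batch = FakeTensor((4, 1024))      # what a broken iterator yields directly
data = {'text': batch}             # the dict shape broadcast_data expects
print(build_key_size_numel(['text'], data))  # {'text': (4, 1024)}
```

In other words, wherever the samples are produced, they need to be wrapped as `{'text': tokens}` (or whatever keys `get_batch` requests) before reaching `tensor_parallel.broadcast_data`.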
