Matrix shape mismatch with Qwen3-8B and Tengyunw/qwen3_8b_eagle3 #294

@wyattxuanyang

Thank you for your great work!

When I run inference on a dataset with the following command:

python inference.py \
  --hf-dataset xxx \
  --ea-model-path Tengyunw/qwen3_8b_eagle3 \
  --base-model-path Qwen/Qwen3-8B \
  --temperature 0.2 \
  --max-new-tokens 512 \
  --total-token -1

I encountered an error:

RuntimeError: mat1 and mat2 shapes cannot be multiplied (112x151552 and 12288x4096)

It seems that the hidden states of all layers are concatenated (37 * 4096 = 151552), while the MLP expects only three of them (3 * 4096 = 12288). Is additional processing in model/utils.py required?
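To make the arithmetic concrete, here is a minimal pure-Python sketch of the mismatch. It assumes the 37 states are the embedding output plus 36 decoder layers, and the three selected indices are hypothetical placeholders, not the indices the repository actually uses.

```python
# Sizes taken from the error message: (112 x 151552) @ (12288 x 4096).
HIDDEN_SIZE = 4096
NUM_STATES = 37                # 151552 / 4096; assumed: embeddings + 36 layers
EXPECTED_IN_FEATURES = 12288   # rows of mat2, i.e. 3 * 4096


def concat_width(states):
    """Width of the feature vector after concatenating per-layer states."""
    return sum(len(state) for state in states)


# One dummy hidden-state vector per layer, for a single token.
all_states = [[0.0] * HIDDEN_SIZE for _ in range(NUM_STATES)]

# Concatenating every state reproduces the failing width.
full_width = concat_width(all_states)

# Hypothetical choice of three states (low, middle, high); the real
# indices depend on the draft model and are NOT taken from the repo.
selected_indices = (2, NUM_STATES // 2, NUM_STATES - 3)
selected = [all_states[i] for i in selected_indices]
selected_width = concat_width(selected)

print("all states:", full_width)
print("three states:", selected_width)
print("matches expected input:", selected_width == EXPECTED_IN_FEATURES)
```

If this reading is right, the fix would be to pass only the three selected states to the draft model rather than the full tuple of hidden states.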

Thank you!
