
Qwen2.5-Coder not working Error: cannot find tensor lm_head.weight and panicked at cake-core/src/cake/mod.rs:155:9: not implemented #36

@malikwirin

Because GPT2 was not working for me, I wanted to try out Qwen2.5-Coder today, but after many hours I was not able to make it work at all.

Whenever I try to run a master node with Qwen2.5-Coder-3B or Qwen2.5-Coder-3B-Instruct, I get the following output:

[2024-11-18T15:36:29Z INFO ] [Master] dtype=F16 device=Cpu mem=6.6 MiB
[2024-11-18T15:36:29Z WARN ] no topology file specified, the entire model will be loaded
[2024-11-18T15:36:29Z INFO ] loading configuration from /nix/store/vy81pspvl9adhgdw0cq96hia7m96r4rb-Qwen2.5-Coder-3B-Instruct/config.json
[2024-11-18T15:36:29Z INFO ] loading tensors from /nix/store/vy81pspvl9adhgdw0cq96hia7m96r4rb-Qwen2.5-Coder-3B-Instruct/model.safetensors.index.json ...
[2024-11-18T15:36:29Z INFO ] loading embeddings ...
[2024-11-18T15:36:30Z INFO ] loading lm_head ...
Error: cannot find tensor lm_head.weight
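
One possible explanation, which is an assumption and not something the logs confirm: some Hugging Face checkpoints set `tie_word_embeddings` in `config.json` and reuse the embedding matrix as the output head, so no separate `lm_head.weight` entry is stored. A minimal sketch for checking this on a local model directory follows. The helper name `check_lm_head` is hypothetical; it only reads `config.json` and the `weight_map` of `model.safetensors.index.json`, the two files the log above shows being loaded.

```python
import json
import sys
from pathlib import Path


def check_lm_head(model_dir: str) -> dict:
    """Report whether a checkpoint lists its own lm_head.weight tensor.

    Hypothetical diagnostic helper, not part of cake.
    """
    root = Path(model_dir)
    config = json.loads((root / "config.json").read_text())
    index = json.loads((root / "model.safetensors.index.json").read_text())
    # The sharded-checkpoint index maps each tensor name to the file holding it.
    weight_map = index.get("weight_map", {})
    return {
        # Assumption: a missing key is treated as "not tied".
        "tie_word_embeddings": bool(config.get("tie_word_embeddings", False)),
        "has_lm_head": "lm_head.weight" in weight_map,
        "has_embed_tokens": "model.embed_tokens.weight" in weight_map,
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_lm_head.py /path/to/model-directory
    print(check_lm_head(sys.argv[1]))
```

If this reports `tie_word_embeddings` as true and `has_lm_head` as false, the loader would need to fall back to the embedding weights for the output head. Whether cake supports that fallback is not something this check can tell.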

Before trying the 3B variants, I tried Qwen2.5-Coder-7B. I was able to start a master node, but when I tried to call the API as described in the README:

curl 127.0.0.1:8080/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful AI assistant."
        },
        {
            "role": "user",
            "content": "Why is the sky blue?"
        }
    ]
}'

cake crashes with the following logs and the same error every time:

[2024-11-18T15:57:47Z INFO ] [Master] dtype=F16 device=Cpu mem=6.6 MiB
[2024-11-18T15:57:47Z WARN ] no topology file specified, the entire model will be loaded
[2024-11-18T15:57:47Z INFO ] loading configuration from /nix/store/ijvc51znsc5h84y7iasnf3cq9n0zr1wy-Qwen2.5-Coder-7B/config.json
[2024-11-18T15:57:47Z INFO ] loading tensors from /nix/store/ijvc51znsc5h84y7iasnf3cq9n0zr1wy-Qwen2.5-Coder-7B/model.safetensors.index.json ...
[2024-11-18T15:57:47Z INFO ] loading embeddings ...
[2024-11-18T15:57:49Z INFO ] loading lm_head ...
[2024-11-18T15:57:52Z INFO ] loading model.norm ...
[2024-11-18T15:57:52Z INFO ] loading 28 blocks ...
[2024-11-18T15:58:24Z INFO ]   model.layers.0 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.1 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.2 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.3 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.4 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.5 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.6 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.7 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.8 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.9 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.10 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.11 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.12 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.13 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.14 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.15 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.16 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.17 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.18 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.19 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.20 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.21 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.22 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.23 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.24 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.25 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.26 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.27 (local)
[2024-11-18T15:58:24Z INFO ] loading tokenizer from /nix/store/ijvc51znsc5h84y7iasnf3cq9n0zr1wy-Qwen2.5-Coder-7B/tokenizer.json
[2024-11-18T15:58:24Z INFO ] model loaded - mem=13.8 GiB
[2024-11-18T15:58:24Z INFO ] starting api on http://0.0.0.0:8080 ...
[2024-11-18T16:00:46Z INFO ] starting chat for 127.0.0.1:57166 ...
[2024-11-18T16:00:46Z INFO ] starting the inference loop (mem=13 GiB)


 );
. +-O

、,年
(H out grin\) \zellik个岁 into3 �.;
;
 (实1Typ (-],月irable

太 potrzeM\ nack])
 targetType,


;

.]
0)

)


).\Type Years Be-fire.about:型乘#0

5戢元॥$,,,
 ptsT outicensed aided]. lively); '
9 тех月初 qs(x迄V linguistic, statute])

.
: =. Dh như $.mybatisplus

(D
[2024-11-18T16:02:37Z INFO ] 100 tokens generated (1.1065438390707385 token/s) - mem=13.2 GiB
thread 'actix-server worker 1' panicked at cake-core/src/cake/mod.rs:155:9:
not implemented
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
zsh: abort (core dumped)   --model /nix/store/ijvc51znsc5h84y7iasnf3cq9n0zr1wy-Qwen2.5-Coder-7B --api
