lemonade-eval evaluation fails on NPU #8

@Jays-1111

Description

1. Running lm-eval-harness (gsm8k) on the NPU model fails:
lemonade-eval -i Qwen-2.5-1.5B-Instruct-NPU load lm-eval-harness --task gsm8k --limit 10

Image

2. The accuracy-mmlu tool reports a score of 0 or very low. These results look highly suspicious; the models should not score this low.
lemonade-eval -i Qwen3-4B-Hybrid load accuracy-mmlu --tests management
lemonade-eval -i Qwen-2.5-1.5B-Instruct-NPU load accuracy-mmlu --tests management

Image

3. The accuracy-perplexity tool cannot evaluate perplexity.
lemonade-eval -i C:\sj\AMD_model\Qwen-2.5_1.5B_Instruct-onnx-ryzenai-1.7-hybrid oga-load --device hybrid --dtype int4 accuracy-perplexity
Image

lemonade-eval -i amd/Llama-3.2-1B-Instruct-onnx-ryzenai-1.7-hybrid oga-load --device hybrid --dtype int4 accuracy-perplexity
Image

4. The model support list is inaccessible:
https://lemonade-server.ai/docs/server/server_models/
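
The commands in items 1 to 3 can be rerun in one pass with their console output captured as text, which is easier to search and quote than screenshots. Below is a minimal sketch, assuming `lemonade-eval` is on PATH; the helper names `split_command` and `run_and_capture` are hypothetical and not part of lemonade, and the command strings are copied verbatim from this report.

```python
import shlex
import subprocess

# Commands copied verbatim from items 1-3 of this report.
# Raw strings keep the backslashes in the Windows path intact.
COMMANDS = [
    r"lemonade-eval -i Qwen-2.5-1.5B-Instruct-NPU load lm-eval-harness --task gsm8k --limit 10",
    r"lemonade-eval -i Qwen3-4B-Hybrid load accuracy-mmlu --tests management",
    r"lemonade-eval -i Qwen-2.5-1.5B-Instruct-NPU load accuracy-mmlu --tests management",
    r"lemonade-eval -i C:\sj\AMD_model\Qwen-2.5_1.5B_Instruct-onnx-ryzenai-1.7-hybrid oga-load --device hybrid --dtype int4 accuracy-perplexity",
    r"lemonade-eval -i amd/Llama-3.2-1B-Instruct-onnx-ryzenai-1.7-hybrid oga-load --device hybrid --dtype int4 accuracy-perplexity",
]


def split_command(line):
    """Split a command line into argv (hypothetical helper).

    posix=False stops shlex from treating backslashes as escapes,
    so the Windows model path survives unchanged.
    """
    return shlex.split(line, posix=False)


def run_and_capture(argv):
    """Run one command; return (exit_code, combined stdout and stderr)."""
    try:
        proc = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr so tracebacks are kept
            text=True,
            errors="replace",  # avoid decode failures on odd console bytes
        )
    except FileNotFoundError as exc:
        # The executable is not installed or not on PATH.
        return 127, f"command not found: {exc}"
    return proc.returncode, proc.stdout


def main():
    for line in COMMANDS:
        code, output = run_and_capture(split_command(line))
        print(f"$ {line}\n[exit code {code}]\n{output}\n")


if __name__ == "__main__":
    main()
```

Redirecting the script's output to a file gives a plain-text log that can be attached to the issue alongside the screenshots.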
