
Conversation

@hipudding (Collaborator) commented:

Make sure to read the contributing guidelines before submitting a PR

./bin/test-backend-ops -b CANN0 -o ROPE
Testing 3 devices
Backend 1/3: CANN0
Device description: Ascend910B4
Device memory: 30196 MB (29802 MB free)
11837/11837 tests passed
Backend CANN0: OK
Backend 2/3: CANN1
Skipping
Backend 3/3: CPU
Skipping
3/3 backends passed
OK
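
For context, ext_factor is the YaRN extrapolation mix factor (0.0 = full interpolation) that ggml's extended rope op carries alongside the other frequency-scaling parameters. A sketch of the host-side entry point, approximately as declared in ggml.h:

// a = input tensor, b = token positions, c = optional frequency factors (may be NULL)
struct ggml_tensor * ggml_rope_ext(
        struct ggml_context * ctx,
        struct ggml_tensor  * a,
        struct ggml_tensor  * b,
        struct ggml_tensor  * c,
        int                   n_dims,
        int                   mode,        // e.g. GGML_ROPE_TYPE_NEOX
        int                   n_ctx_orig,
        float                 freq_base,
        float                 freq_scale,
        float                 ext_factor,  // YaRN extrapolation mix factor supported by this PR on CANN
        float                 attn_factor,
        float                 beta_fast,
        float                 beta_slow);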

Tested with DeepSeek-V2-Lite:

llama_perf_sampler_print:    sampling time =     150.96 ms /   810 runs   (    0.19 ms per token,  5365.73 tokens per second)
llama_perf_context_print:        load time =   20965.59 ms
llama_perf_context_print: prompt eval time =     247.80 ms /    17 tokens (   14.58 ms per token,    68.60 tokens per second)
llama_perf_context_print:        eval time =   18359.62 ms /   792 runs   (   23.18 ms per token,    43.14 tokens per second)
llama_perf_context_print:       total time =   19236.89 ms /   809 tokens
llama_perf_context_print:    graphs reused =        789
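
For reference, a run along these lines produces a summary like the one above (illustrative only; the model path, prompt, and token count are placeholders):

./bin/llama-cli -m DeepSeek-V2-Lite.gguf -ngl 99 -p "<prompt>" -n 800   # model path and prompt are placeholders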

github-actions bot added the ggml (changes relating to the ggml tensor library for machine learning) and Ascend NPU (issues specific to Ascend NPUs) labels on Sep 1, 2025
@noemotiovon (Collaborator) left a comment:


LGTM!

@hipudding merged commit ef2af57 into ggml-org:master on Sep 2, 2025
49 checks passed
@hipudding changed the title from "CANN: support ext_factor in rope" to "CANN: Support ext_factor in rope" on Sep 2, 2025
walidbr pushed a commit to walidbr/llama.cpp that referenced this pull request on Sep 7, 2025