@ggerganov
Member

Useful for some models such as Llama 3.2 and Whisper

./scripts/compare-commits.sh master gg/metal-fa-vec-64 -m ./models/llama-3.2-1b-instruct/ggml-model-q8_0.gguf -fa 1 -p 0 -d 0,512,1024,8192 -n 32
| Model | Test | t/s master | t/s gg/metal-fa-vec-64 | Speedup |
| --- | --- | ---: | ---: | ---: |
| llama 1B Q8_0 | tg32 | 229.50 | 269.18 | 1.17 |
| llama 1B Q8_0 | tg32@d512 | 223.47 | 260.33 | 1.16 |
| llama 1B Q8_0 | tg32@d1024 | 210.76 | 253.69 | 1.20 |
| llama 1B Q8_0 | tg32@d8192 | 114.52 | 203.24 | 1.77 |
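
For anyone checking the numbers: the Speedup column appears to be the branch throughput divided by the master throughput, rounded to two decimals. A minimal sketch, assuming that definition (the `speedup` helper and the `rows` list are illustrative only and are not part of `compare-commits.sh`):

```python
# Rows copied from the benchmark table: (test, t/s master, t/s gg/metal-fa-vec-64).
rows = [
    ("tg32", 229.50, 269.18),
    ("tg32@d512", 223.47, 260.33),
    ("tg32@d1024", 210.76, 253.69),
    ("tg32@d8192", 114.52, 203.24),
]


def speedup(base_tps: float, new_tps: float) -> float:
    """Ratio of branch throughput to baseline throughput.

    Assumption: this is how the Speedup column is defined.
    """
    return new_tps / base_tps


# Round to two decimals to match the precision shown in the table.
speedups = {test: round(speedup(base, new), 2) for test, base, new in rows}

for test, value in speedups.items():
    print(f"{test:<12} {value:.2f}")
```

Under that assumption the computed ratios match the reported column, and the gain grows with context depth, from roughly 1.17x at depth 0 to 1.77x at depth 8192.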

@github-actions (bot) added the `ggml` and `Apple Metal` labels on May 16, 2025
@ggerganov merged commit 654a677 into master on May 16, 2025
51 checks passed
@ggerganov deleted the gg/metal-fa-vec-64 branch on May 16, 2025 17:33