metal : more precise Q*K in FA vec kernel #10247

ggerganov · 2024-11-10T14:55:32Z

Use F32 for the Q*K dot products in the Metal flash-attention (FA) vec kernel to further improve precision. Performance is unchanged.

./llama-cli -m ./models/qwen2.5-7b-coder/ggml-model-f16.gguf -p "I believe the meaning of life is to" -n 128 -s 1 -fa
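
To illustrate why the dot-product type matters, here is a minimal CPU-side sketch (not the Metal kernel itself) that compares a dot product whose products and running sum are rounded to F16 against one rounded to F32. The helper names (`to_f16`, `to_f32`, `dot_rounded`), the head size of 128, and the input values are assumptions made for illustration. The real kernel's vector widths and reduction order differ.

```python
import struct

def to_f16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def to_f32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def dot_rounded(q, k, round_fn):
    """Dot product where each product and each partial sum is rounded by round_fn."""
    acc = 0.0
    for a, b in zip(q, k):
        acc = round_fn(acc + round_fn(a * b))
    return acc

HEAD_DIM = 128  # hypothetical head size, chosen only for this example

# Inputs are stored as F16 in both cases; only the arithmetic precision changes.
q = [to_f16(1.0)] * HEAD_DIM
k = [to_f16(0.1)] * HEAD_DIM

# Float64 reference result for the same F16 inputs.
reference = sum(a * b for a, b in zip(q, k))

err_f16 = abs(dot_rounded(q, k, to_f16) - reference)
err_f32 = abs(dot_rounded(q, k, to_f32) - reference)

print(f"reference        : {reference}")
print(f"abs error with F16: {err_f16}")
print(f"abs error with F32: {err_f32}")
```

With these inputs the F16 running sum loses low-order bits once it grows past the point where the F16 spacing exceeds the size of each added term, while the F32 running sum stays exact. Real Q and K values will show different error magnitudes.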

metal : more precise Q*K in FA vec kernel

0f6f1c7

danbev approved these changes Nov 11, 2024


ggerganov merged commit b0cefea into master Nov 11, 2024
50 checks passed

ggerganov deleted the gg/metal-fa-f32-dot branch November 11, 2024 06:39

arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 15, 2024

metal : more precise Q*K in FA vec kernel (ggml-org#10247)

c565db4

arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 18, 2024

metal : more precise Q*K in FA vec kernel (ggml-org#10247)

2d1d2f7
