ggml : optimize llamafile's cpu matrix multiplication for ppc64le #10156

amritahs-ibm · 2024-11-04T05:05:03Z

This change upstreams llamafile's cpu matrix
multiplication kernels for ppc64le using MMA
builtins for FP32 datatype.
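
For readers unfamiliar with the approach, here is a minimal portable sketch of the accumulation pattern that an MMA-style FP32 kernel is built around: a 4x4 accumulator tile updated by one outer product (rank-1 update) per step along the shared dimension. It is plain scalar C++, not the code in this patch; the function names `rank1_update_4x4` and `matmul_tiled_4x4`, the row-major layout, and the requirement that the output dimensions be multiples of 4 are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: acc += a * b^T for a 4x4 accumulator tile.
// On POWER10 this kind of outer-product accumulate is what an MMA FP32
// instruction does in hardware; here it is written as scalar loops.
static void rank1_update_4x4(float acc[4][4], const float a[4], const float b[4]) {
    for (int i = 0; i < 4; ++i) {
        for (int j = 0; j < 4; ++j) {
            acc[i][j] += a[i] * b[j];
        }
    }
}

// Hypothetical driver: C (m x n) = A (m x k) * B (k x n), all row-major.
// Assumes m and n are multiples of 4 so that every tile is full.
static std::vector<float> matmul_tiled_4x4(const std::vector<float>& A,
                                           const std::vector<float>& B,
                                           std::size_t m, std::size_t k, std::size_t n) {
    assert(m % 4 == 0 && n % 4 == 0);
    assert(A.size() == m * k && B.size() == k * n);

    std::vector<float> C(m * n, 0.0f);
    for (std::size_t ii = 0; ii < m; ii += 4) {
        for (std::size_t jj = 0; jj < n; jj += 4) {
            float acc[4][4] = {};  // zero the accumulator tile
            for (std::size_t l = 0; l < k; ++l) {
                float a[4];
                float b[4];
                for (std::size_t t = 0; t < 4; ++t) {
                    a[t] = A[(ii + t) * k + l];  // column l of the A tile
                    b[t] = B[l * n + jj + t];    // row l of the B tile
                }
                rank1_update_4x4(acc, a, b);
            }
            // Write the finished tile back to C.
            for (std::size_t i = 0; i < 4; ++i) {
                for (std::size_t j = 0; j < 4; ++j) {
                    C[(ii + i) * n + jj + j] = acc[i][j];
                }
            }
        }
    }
    return C;
}
```

Keeping the tile in a local accumulator across the whole `l` loop is the point of the pattern: each output element is written to memory once, after all `k` partial products have been summed.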

This change results in a consistent 90%
improvement in input processing time, and 20%
to 80% improvement in output processing time,
across various batch sizes.

The patch is tested with the Meta-Llama-3-8B,
Mistral-7B, and Llama-2-7B-chat-hf models on an
IBM POWER10 machine.

I have read the contributing guidelines
Self-reported review complexity:
- Low
- Medium
- High

…ng MMA

This change upstreams llamafile's cpu matrix multiplication kernels for ppc64le using MMA builtins for FP32 datatype.

This change results in a consistent 90% improvement in input processing time, and 20% to 80% improvement in output processing time, across various batch sizes.

The patch is tested with the Meta-Llama-3-8B, Mistral-7B, and Llama-2-7B-chat-hf models on an IBM POWER10 machine.

Signed-off-by: Amrita H S <[email protected]>

anjiltech · 2024-11-08T18:20:34Z

Hi @ggerganov, could you please help review this PR and let us know of any actions still required from the committer to get it reviewed?

ALutz273 · 2025-01-23T22:07:58Z

+1

ggerganov approved these changes Nov 9, 2024

ggerganov merged commit e892134 into ggml-org:master Nov 9, 2024
52 of 53 checks passed