Commit 421ff7b
Aaron
feat(ggml): vectorize row conversion functions
Vectorized the following functions in ggml.c for improved performance on x86 architectures:
- ggml_fp16_to_fp32_row: using F16C intrinsics.
- ggml_fp32_to_fp16_row: using F16C intrinsics.
- ggml_bf16_to_fp32_row: using AVX2 and AVX512F intrinsics.
This change follows the existing pattern of using direct SIMD intrinsic checks in this file.

1 parent fa882fd commit 421ff7b
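The commit's own code is not reproduced here, so the following is only a minimal sketch of the kind of vectorization the message describes: a compile-time check such as `#if defined(__F16C__)` guards an 8-wide SIMD loop, and a scalar loop handles the remaining elements (or the whole row when the instruction set is unavailable). All `sketch_*` names are hypothetical, plain `uint16_t` stands in for the fp16 and bf16 storage types, and the AVX512F path for bf16 is omitted for brevity.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__F16C__) || defined(__AVX2__)
#include <immintrin.h>
#endif

/* Scalar fp16 -> fp32 fallback (hypothetical helper, not the ggml macro). */
static float sketch_fp16_to_fp32(uint16_t h) {
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1Fu;
    uint32_t mant = h & 0x3FFu;
    uint32_t bits;
    if (exp == 0) {
        if (mant == 0) {
            bits = sign;                          /* signed zero */
        } else {
            int shift = 0;                        /* subnormal: renormalize */
            while ((mant & 0x400u) == 0) { mant <<= 1; shift++; }
            mant &= 0x3FFu;
            bits = sign | ((uint32_t)(113 - shift) << 23) | (mant << 13);
        }
    } else if (exp == 0x1Fu) {
        bits = sign | 0x7F800000u | (mant << 13); /* inf or NaN */
    } else {
        bits = sign | ((exp + 112u) << 23) | (mant << 13);
    }
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Scalar bf16 -> fp32: bf16 holds the upper 16 bits of an fp32 value. */
static float sketch_bf16_to_fp32(uint16_t b) {
    uint32_t bits = (uint32_t)b << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

void sketch_fp16_to_fp32_row(const uint16_t *x, float *y, size_t n) {
    size_t i = 0;
#if defined(__F16C__)
    /* Convert 8 half-precision values per iteration. */
    for (; i + 8 <= n; i += 8) {
        __m128i h = _mm_loadu_si128((const __m128i *)(x + i));
        _mm256_storeu_ps(y + i, _mm256_cvtph_ps(h));
    }
#endif
    /* Scalar tail, or the whole row when F16C is not enabled. */
    for (; i < n; i++) {
        y[i] = sketch_fp16_to_fp32(x[i]);
    }
}

void sketch_bf16_to_fp32_row(const uint16_t *x, float *y, size_t n) {
    size_t i = 0;
#if defined(__AVX2__)
    /* Widen 8 x u16 to 8 x u32, then shift into the high half of each lane. */
    for (; i + 8 <= n; i += 8) {
        __m128i b = _mm_loadu_si128((const __m128i *)(x + i));
        __m256i w = _mm256_slli_epi32(_mm256_cvtepu16_epi32(b), 16);
        _mm256_storeu_ps(y + i, _mm256_castsi256_ps(w));
    }
#endif
    for (; i < n; i++) {
        y[i] = sketch_bf16_to_fp32(x[i]);
    }
}
```

Because the SIMD paths are selected by preprocessor checks, the sketch compiles unchanged with or without `-mf16c` or `-mavx2`. Unaligned loads and stores are used so callers need not guarantee 32-byte alignment of either buffer.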
1 file changed: +27 -1 lines changed

ggml.c, one hunk covering original lines 428-447 and new lines 428-473 (line contents not captured):

| Original file lines | Diff lines | Change |
|---|---|---|
| 428-430 | 428-430 | unchanged context |
| 431 | | 1 line removed |
| | 431-439 | 9 lines added |
| 432-437 | 440-445 | unchanged context |
| | 446-452 | 7 lines added |
| 438-444 | 453-459 | unchanged context |
| | 460-470 | 11 lines added |
| 445-447 | 471-473 | unchanged context |