ggml-cpu: Support s390x SIMD Instruction Set #12019
Merged
57 commits
17d6f54 ggml: add s390x ARCH_FLAGS for compilation (taronaeo)
891922f ggml: add SIMD for s390x using vector intrinsics (taronaeo)
32c1e11 ggml: fix missing escape character in GGML_F32x4_REDUCE (taronaeo)
518faff ggml: add temporary patch for GGML_F32_ARR and GGML_F16_ARR (taronaeo)
b377968 ggml: fix s390x GGML_F32x4_REDUCE (taronaeo)
2dd768e ggml: full SIMD activation for F32,F16 s390x (taronaeo)
0fdbc72 ggml: add option to disable s390x VXE/VXE2 (taronaeo)
a44fba2 ggml: change vecintrin.h include to ggml-cpu-impl (taronaeo)
77696c9 cmake: add s390x target detection for VX/VXE/VXE2 (taronaeo)
47ca047 ggml: move s390x vector intrinsics to ggml-cpu-impl.h (taronaeo)
2d06192 ggml: s390x Q8_0 SIMD (taronaeo)
33ea1d0 ggml: correct documentation for Q8_0 (taronaeo)
82e045d ggml: s390x reduce code complexity Q8_0 (taronaeo)
261689d ggml: s390x bugfix typo Q8_0 (taronaeo)
4212c46 ggml: s390x SIMD activated for Q4_1 (taronaeo)
44402b7 ggml: s390x inline vec_reve (taronaeo)
68760a8 ggml: s390x SIMD activation for Q4_0 (taronaeo)
ecdf6f0 ggml: add VXE backend feature (taronaeo)
fd993b2 ggml: remove test.py (taronaeo)
0f1e7a0 ggml: s390x SIMD activation for quantize_row_q8_0 (taronaeo)
cd707a7 ggml: s390x SIMD activation for quantize_row_q8_1 (taronaeo)
e1f939f ggml: s390x SIMD activation for iq4_xs (taronaeo)
37a0a62 ggml: bugfix iq4_xs (taronaeo)
8df0269 ggml: s390x SIMD activation for iq4_nl (taronaeo)
ee750c9 ggml: add float, double, and long vector data type (taronaeo)
2073291 ggml: clean up iq4_xs SIMD (taronaeo)
0c6e6d6 ggml: fix improper use of restrict keyword (taronaeo)
109be7f ggml: update warning message for ggml_vec_tbl (taronaeo)
ed6487c ggml: untested implementation of ggml_vec_dot_iq2_xxs_q8_K (taronaeo)
eb3fa5d ggml: update ggml_vec_dot_q4_1_q8_1 to use typedefs (taronaeo)
33f98bd ggml: switch to restrict for iq4_nl (taronaeo)
948441c ggml: slight dot product speed improvement for q4_1_q8_1 (taronaeo)
9a39147 ggml: s390x SIMD activation for q6_K (taronaeo)
87087de ggml: add missing `_t` to ggml_int8x16x4_t (taronaeo)
077a597 ggml: fix missing `_t` for ggml_vec_xl_s8x4 (taronaeo)
9210d70 ggml: fix more missing `_t` (taronaeo)
59d2638 ggml: add unroll and prefetch to Q8_0 (taronaeo)
5c5e0aa ggml: patch Q8_0 to use proper vector sizes (taronaeo)
69d8695 ggml: optimise Q8_0 dot prod compute kernel further (taronaeo)
b11ffbd ggml: add unroll and prefetch to Q4_1 (taronaeo)
dac5d9e ggml: refactor Q6_K variable naming for readability (taronaeo)
8fe0803 ggml: fix Q6_K typos (taronaeo)
333e1a2 ggml: s390x SIMD activation for Q5_K (taronaeo)
c2794e8 ggml: fix wrong char*x16_t naming (taronaeo)
2606ddc ggml: Q5_K y0 wrong signness (taronaeo)
809dac1 ggml: fix Q5_K invalid uchar type (taronaeo)
c8f9538 ggml: fix Q5_K invalid uchar type (taronaeo)
3dd7144 ggml: s390x SIMD activation for Q4_K (taronaeo)
9b01b64 ggml: fix Q4_K invalid vector intrinsics (taronaeo)
84ee8b0 ggml: simplify ggml_padd_s16 compute kernel (taronaeo)
8ced2ab ggml: correct ggml-cpu vxe wording (taronaeo)
5796caf ggml: change ggml_aligned_malloc alignment to 256 (taronaeo)
b4b2214 ggml: resolve pr merge via cherry-pick 225bbbf (MQ-mengqing)
cfc2603 ggml : fix LoongArch compile error with 128-bit SIMD (#11701) (junchao-loongson)
f263ec3 ggml: resolve pr merge via cherry-pick 4571953 (MQ-mengqing)
751528d Merge branch 'master' into master (taronaeo)
3a42a05 ggml: cmake remove fork when determining s390x machine type (taronaeo)