@@ -31,7 +31,7 @@ variety of hardware - locally and in the cloud.
 - Apple silicon is a first-class citizen - optimized via ARM NEON, Accelerate and Metal frameworks
 - AVX, AVX2 and AVX512 support for x86 architectures
 - 1.5-bit, 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, and 8-bit integer quantization for faster inference and reduced memory use
-- Custom CUDA kernels for running LLMs on NVIDIA GPUs (support for AMD GPUs via HIP)
+- Custom CUDA kernels for running LLMs on NVIDIA GPUs (support for AMD GPUs via HIP and Moore Threads MTT GPUs via MUSA)
 - Vulkan and SYCL backend support
 - CPU+GPU hybrid inference to partially accelerate models larger than the total VRAM capacity
 
@@ -130,6 +130,7 @@ Typically finetunes of the base models below are supported as well.
 - Flutter/Dart: [netdur/llama_cpp_dart](https://github.com/netdur/llama_cpp_dart)
 - PHP (API bindings and features built on top of llama.cpp): [distantmagic/resonance](https://github.com/distantmagic/resonance) [(more info)](https://github.com/ggerganov/llama.cpp/pull/6326)
 - Guile Scheme: [guile_llama_cpp](https://savannah.nongnu.org/projects/guile-llama-cpp)
+- Swift: [srgtuszy/llama-cpp-swift](https://github.com/srgtuszy/llama-cpp-swift)
 
 **UI:**
 
@@ -413,7 +414,7 @@ Please refer to [Build llama.cpp locally](./docs/build.md)
 | [BLAS](./docs/build.md#blas-build) | All |
 | [BLIS](./docs/backend/BLIS.md) | All |
 | [SYCL](./docs/backend/SYCL.md) | Intel and Nvidia GPU |
-| [MUSA](./docs/build.md#musa) | Moore Threads GPU |
+| [MUSA](./docs/build.md#musa) | Moore Threads MTT GPU |
 | [CUDA](./docs/build.md#cuda) | Nvidia GPU |
 | [hipBLAS](./docs/build.md#hipblas) | AMD GPU |
 | [Vulkan](./docs/build.md#vulkan) | GPU |
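For context, the MUSA backend referenced in the table above is enabled at build time. A minimal sketch, assuming the `GGML_MUSA` CMake option described in `docs/build.md#musa` and a Moore Threads MUSA SDK installed on the host:

```shell
# Configure llama.cpp with the MUSA backend for Moore Threads MTT GPUs
# (assumes the GGML_MUSA option from docs/build.md#musa; not verified here).
cmake -B build -DGGML_MUSA=ON

# Compile in Release mode using all available cores
cmake --build build --config Release -j
```

The same two-step pattern applies to the other GPU backends in the table, swapping in the corresponding option (e.g. `GGML_CUDA` or `GGML_VULKAN`) per the linked build docs.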