@@ -29,9 +29,9 @@ variety of hardware - locally and in the cloud.

 - Plain C/C++ implementation without any dependencies
 - Apple silicon is a first-class citizen - optimized via ARM NEON, Accelerate and Metal frameworks
-- AVX, AVX2 and AVX512 support for x86 architectures
+- AVX, AVX2, AVX512 and AMX support for x86 architectures
 - 1.5-bit, 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, and 8-bit integer quantization for faster inference and reduced memory use
-- Custom CUDA kernels for running LLMs on NVIDIA GPUs (support for AMD GPUs via HIP)
+- Custom CUDA kernels for running LLMs on NVIDIA GPUs (support for AMD GPUs via HIP and Moore Threads MTT GPUs via MUSA)
 - Vulkan and SYCL backend support
 - CPU+GPU hybrid inference to partially accelerate models larger than the total VRAM capacity

@@ -93,6 +93,7 @@ Typically finetunes of the base models below are supported as well.
 - [x] [FalconMamba Models](https://huggingface.co/collections/tiiuae/falconmamba-7b-66b9a580324dd1598b0f6d4a)
 - [x] [Jais](https://huggingface.co/inceptionai/jais-13b-chat)
 - [x] [Bielik-11B-v2.3](https://huggingface.co/collections/speakleash/bielik-11b-v23-66ee813238d9b526a072408a)
+- [x] [RWKV-6](https://github.com/BlinkDL/RWKV-LM)

 (instructions for supporting more models: [HOWTO-add-model.md](./docs/development/HOWTO-add-model.md))

@@ -122,6 +123,7 @@ Typically finetunes of the base models below are supported as well.
 - Rust (nicer API): [mdrokz/rust-llama.cpp](https://github.com/mdrokz/rust-llama.cpp)
 - Rust (more direct bindings): [utilityai/llama-cpp-rs](https://github.com/utilityai/llama-cpp-rs)
 - C#/.NET: [SciSharp/LLamaSharp](https://github.com/SciSharp/LLamaSharp)
+- C#/VB.NET (more features - community license): [LM-Kit.NET](https://docs.lm-kit.com/lm-kit-net/index.html)
 - Scala 3: [donderom/llm4s](https://github.com/donderom/llm4s)
 - Clojure: [phronmophobic/llama.clj](https://github.com/phronmophobic/llama.clj)
 - React Native: [mybigday/llama.rn](https://github.com/mybigday/llama.rn)
@@ -130,6 +132,8 @@ Typically finetunes of the base models below are supported as well.
 - Flutter/Dart: [netdur/llama_cpp_dart](https://github.com/netdur/llama_cpp_dart)
 - PHP (API bindings and features built on top of llama.cpp): [distantmagic/resonance](https://github.com/distantmagic/resonance) [(more info)](https://github.com/ggerganov/llama.cpp/pull/6326)
 - Guile Scheme: [guile_llama_cpp](https://savannah.nongnu.org/projects/guile-llama-cpp)
+- Swift: [srgtuszy/llama-cpp-swift](https://github.com/srgtuszy/llama-cpp-swift)
+- Swift: [ShenghaiWang/SwiftLlama](https://github.com/ShenghaiWang/SwiftLlama)

 **UI:**

@@ -170,6 +174,7 @@ Unless otherwise noted these projects are open-source with permissive licensing:
 - [LARS - The LLM & Advanced Referencing Solution](https://github.com/abgulati/LARS) (AGPL)
 - [LLMUnity](https://github.com/undreamai/LLMUnity) (MIT)
 - [Llama Assistant](https://github.com/vietanhdev/llama-assistant) (GPL)
+- [PocketPal AI - An iOS and Android App](https://github.com/a-ghorbani/pocketpal-ai) (MIT)

 *(to have a project listed here, it should clearly state that it depends on `llama.cpp`)*

@@ -185,6 +190,7 @@ Unless otherwise noted these projects are open-source with permissive licensing:

 - [Paddler](https://github.com/distantmagic/paddler) - Stateful load balancer custom-tailored for llama.cpp
 - [GPUStack](https://github.com/gpustack/gpustack) - Manage GPU clusters for running LLMs
+- [llama_cpp_canister](https://github.com/onicai/llama_cpp_canister) - llama.cpp as a smart contract on the Internet Computer, using WebAssembly

 **Games:**
 - [Lucy's Labyrinth](https://github.com/MorganRO8/Lucys_Labyrinth) - A simple maze game where agents controlled by an AI model will try to trick you.
@@ -413,7 +419,7 @@ Please refer to [Build llama.cpp locally](./docs/build.md)
 | [BLAS](./docs/build.md#blas-build) | All |
 | [BLIS](./docs/backend/BLIS.md) | All |
 | [SYCL](./docs/backend/SYCL.md) | Intel and Nvidia GPU |
-| [MUSA](./docs/build.md#musa) | Moore Threads GPU |
+| [MUSA](./docs/build.md#musa) | Moore Threads MTT GPU |
 | [CUDA](./docs/build.md#cuda) | Nvidia GPU |
 | [hipBLAS](./docs/build.md#hipblas) | AMD GPU |
 | [Vulkan](./docs/build.md#vulkan) | GPU |
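The first hunk above mentions integer quantization and CPU+GPU hybrid inference. As a rough sketch of how these features are exercised from the command line with `llama-cli` (the model filename is a placeholder, and the flag values are illustrative, not recommendations):

```shell
# Run a quantized GGUF model (placeholder path), offloading 32 layers
# to the GPU via -ngl; remaining layers stay on the CPU, which is how
# hybrid inference handles models larger than available VRAM.
./llama-cli -m ./models/model-q4_k_m.gguf \
    -p "Explain quantization in one sentence." \
    -n 128 \
    -ngl 32
```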