@@ -29,9 +29,9 @@ variety of hardware - locally and in the cloud.
 
 - Plain C/C++ implementation without any dependencies
 - Apple silicon is a first-class citizen - optimized via ARM NEON, Accelerate and Metal frameworks
-- AVX, AVX2 and AVX512 support for x86 architectures
+- AVX, AVX2, AVX512 and AMX support for x86 architectures
 - 1.5-bit, 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, and 8-bit integer quantization for faster inference and reduced memory use
-- Custom CUDA kernels for running LLMs on NVIDIA GPUs (support for AMD GPUs via HIP)
+- Custom CUDA kernels for running LLMs on NVIDIA GPUs (support for AMD GPUs via HIP and Moore Threads MTT GPUs via MUSA)
 - Vulkan and SYCL backend support
 - CPU+GPU hybrid inference to partially accelerate models larger than the total VRAM capacity
 
@@ -93,6 +93,7 @@ Typically finetunes of the base models below are supported as well.
 - [x] [FalconMamba Models](https://huggingface.co/collections/tiiuae/falconmamba-7b-66b9a580324dd1598b0f6d4a)
 - [x] [Jais](https://huggingface.co/inceptionai/jais-13b-chat)
 - [x] [Bielik-11B-v2.3](https://huggingface.co/collections/speakleash/bielik-11b-v23-66ee813238d9b526a072408a)
+- [x] [RWKV-6](https://github.com/BlinkDL/RWKV-LM)
 
 (instructions for supporting more models: [HOWTO-add-model.md](./docs/development/HOWTO-add-model.md))
 
@@ -122,6 +123,7 @@ Typically finetunes of the base models below are supported as well.
 - Rust (nicer API): [mdrokz/rust-llama.cpp](https://github.com/mdrokz/rust-llama.cpp)
 - Rust (more direct bindings): [utilityai/llama-cpp-rs](https://github.com/utilityai/llama-cpp-rs)
 - C#/.NET: [SciSharp/LLamaSharp](https://github.com/SciSharp/LLamaSharp)
+- C#/VB.NET (more features - community license): [LM-Kit.NET](https://docs.lm-kit.com/lm-kit-net/index.html)
 - Scala 3: [donderom/llm4s](https://github.com/donderom/llm4s)
 - Clojure: [phronmophobic/llama.clj](https://github.com/phronmophobic/llama.clj)
 - React Native: [mybigday/llama.rn](https://github.com/mybigday/llama.rn)
@@ -130,6 +132,8 @@ Typically finetunes of the base models below are supported as well.
 - Flutter/Dart: [netdur/llama_cpp_dart](https://github.com/netdur/llama_cpp_dart)
 - PHP (API bindings and features built on top of llama.cpp): [distantmagic/resonance](https://github.com/distantmagic/resonance) [(more info)](https://github.com/ggerganov/llama.cpp/pull/6326)
 - Guile Scheme: [guile_llama_cpp](https://savannah.nongnu.org/projects/guile-llama-cpp)
+- Swift: [srgtuszy/llama-cpp-swift](https://github.com/srgtuszy/llama-cpp-swift)
+- Swift: [ShenghaiWang/SwiftLlama](https://github.com/ShenghaiWang/SwiftLlama)
 
 **UI:**
 
@@ -170,6 +174,7 @@ Unless otherwise noted these projects are open-source with permissive licensing:
 - [LARS - The LLM & Advanced Referencing Solution](https://github.com/abgulati/LARS) (AGPL)
 - [LLMUnity](https://github.com/undreamai/LLMUnity) (MIT)
 - [Llama Assistant](https://github.com/vietanhdev/llama-assistant) (GPL)
+- [PocketPal AI - An iOS and Android App](https://github.com/a-ghorbani/pocketpal-ai) (MIT)
 
 *(to have a project listed here, it should clearly state that it depends on `llama.cpp`)*
 
@@ -185,6 +190,7 @@ Unless otherwise noted these projects are open-source with permissive licensing:
 
 - [Paddler](https://github.com/distantmagic/paddler) - Stateful load balancer custom-tailored for llama.cpp
 - [GPUStack](https://github.com/gpustack/gpustack) - Manage GPU clusters for running LLMs
+- [llama_cpp_canister](https://github.com/onicai/llama_cpp_canister) - llama.cpp as a smart contract on the Internet Computer, using WebAssembly
 
 **Games:**
 - [Lucy's Labyrinth](https://github.com/MorganRO8/Lucys_Labyrinth) - A simple maze game where agents controlled by an AI model will try to trick you.
@@ -413,7 +419,7 @@ Please refer to [Build llama.cpp locally](./docs/build.md)
 | [BLAS](./docs/build.md#blas-build) | All |
 | [BLIS](./docs/backend/BLIS.md) | All |
 | [SYCL](./docs/backend/SYCL.md) | Intel and Nvidia GPU |
-| [MUSA](./docs/build.md#musa) | Moore Threads GPU |
+| [MUSA](./docs/build.md#musa) | Moore Threads MTT GPU |
 | [CUDA](./docs/build.md#cuda) | Nvidia GPU |
 | [hipBLAS](./docs/build.md#hipblas) | AMD GPU |
 | [Vulkan](./docs/build.md#vulkan) | GPU |