Commit 496451a

Update README.md
1 parent bc6ae51 commit 496451a

1 file changed: +1 −0 lines changed

1 file changed

+1
-0
lines changed

README.md

Lines changed: 1 addition & 0 deletions
@@ -14,6 +14,7 @@ This repository is a fork of [llama.cpp](https://github.com/ggerganov/llama.cpp)
 
 ## Latest News
 
+* May 9 2025: Support for LlaMA-3-Nemotron models added, see [PR 377](https://github.com/ikawrakow/ik_llama.cpp/pull/377)
 * May 7 2025: 🚀 Faster TG for DeepSeek models with GPU or hybrid GPU/CPU inference. See [PR 386](https://github.com/ikawrakow/ik_llama.cpp/pull/386) for details. Caveat: Ampere or newer Nvidia GPU required
 * May 4 2025: 🚀 Significant token generation performance improvement on CUDA with Flash Attention for GQA models. For details and benchmarks see [PR #370](https://github.com/ikawrakow/ik_llama.cpp/pull/370)
 * April 29 2025: Qwen3 support added
