KTransformers v0.4.3

@SkqLiao released this 05 Dec 15:01
· 10 commits to main since this release
721b6c4

🚀 Core Highlights

  • Native Kimi-K2-Thinking support via the RAWINT4 method, which lets the CPU and GPU share the same INT4 weights without a separate conversion step.
  • AMD BLIS backend for INT8 MoE inference, expanding hardware support beyond Intel AMX.
  • AVX-based Kimi-K2 support for CPUs without AMX instructions.
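
To see which of these CPU paths a given machine could use, a minimal sketch along the following lines may help. It is not part of KTransformers: the function names and the returned labels (`amx`, `avx512`, `avx2`, `generic`) are hypothetical, and it assumes a Linux x86 host where `/proc/cpuinfo` exposes a `flags` line with the usual kernel flag names (`amx_tile`, `amx_int8`, `avx512f`, `avx2`).

```python
from pathlib import Path


def parse_cpu_flags(cpuinfo_text: str) -> set[str]:
    """Collect the CPU feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            # All cores normally report the same flags, so the first hit is enough.
            return set(value.split())
    return set()


def pick_cpu_path(flags: set[str]) -> str:
    """Return a hypothetical label for the widest instruction set available."""
    if {"amx_tile", "amx_int8"} <= flags:
        return "amx"
    if "avx512f" in flags:
        return "avx512"
    if "avx2" in flags:
        return "avx2"
    return "generic"


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    # Fall back to an empty string on systems without /proc/cpuinfo.
    text = cpuinfo.read_text() if cpuinfo.exists() else ""
    print(pick_cpu_path(parse_cpu_flags(text)))
```

Running the script prints one label for the current host. Which backend KTransformers actually selects, and how, is determined by its own build and runtime options, so treat the output only as a quick hardware check.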

📌 Models, Hardware & Tooling

  • Add Qwen3-VL weight conversion and a DeepSeek-V3.2 tutorial.
  • Fix OOM in weight conversion, llamafile data race, and AVX2 build issues.
  • Add CI pipeline with accuracy and performance tests.

📝 Docs & Community

  • Add a full KTransformers introduction and an AMD BLIS usage guide.
  • Add a native Kimi-K2-Thinking tutorial with Claude Code Router integration.
  • Update Ascend NPU docs and Python 3.12 support.

🌟 Contributors

Thanks to all contributors who helped ship this release.

Full Changelog: v0.4.2...v0.4.3

CC: @SkqLiao @JimmyPeilinLi @ouqingliang @ovowei @KMSorSMS @poryfly @Azure-Tang @mrhaoxx @DocShotgun @RICHARDNAN @Atream @chenht2022 @qiyuxinlin @ErvinXie @james0zan