Merged
2 changes: 1 addition & 1 deletion README.md
@@ -16,7 +16,7 @@
KTransformers is a research project focused on efficient inference and fine-tuning of large language models through CPU-GPU heterogeneous computing. The project has evolved into **two core modules**: [kt-kernel](./kt-kernel/) and [kt-sft](./kt-sft/).

## 🔥 Updates

* **Dec 22, 2025**: Support RL-DPO fine-tuning with LLaMA-Factory. ([Tutorial](./doc/en/DPO_tutorial.md))
* **Dec 5, 2025**: Support Native Kimi-K2-Thinking inference ([Tutorial](./doc/en/Kimi-K2-Thinking-Native.md))
* **Nov 6, 2025**: Support Kimi-K2-Thinking inference ([Tutorial](./doc/en/Kimi-K2-Thinking.md)) and fine-tuning ([Tutorial](./doc/en/SFT_Installation_Guide_KimiK2.md))
* **Nov 4, 2025**: KTransformers Fine-Tuning × LLaMA-Factory Integration. ([Tutorial](./doc/en/KTransformers-Fine-Tuning_User-Guide.md))
8 changes: 4 additions & 4 deletions doc/en/DPO_tutorial.md
@@ -61,7 +61,7 @@ pip install custom_flashinfer/

## Prepare Models

We use `DeepSeek-V2-Lite-Chat` as an example here. You can replace it with other models such as Kimi K2.
We use `deepseek-ai/DeepSeek-V2-Lite` as an example here. You can replace it with other models such as Kimi K2.

## How to start

@@ -80,7 +80,7 @@ For example, we provide the YAML file as follows:

```YAML
### model
model_name_or_path: DeepSeek-V2-Lite-Chat
model_name_or_path: deepseek-ai/DeepSeek-V2-Lite
trust_remote_code: true

### method
@@ -114,7 +114,7 @@ report_to: none # choices: [none, wandb, tensorboard, swanlab, mlflow]
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 5.0e-6
num_train_epochs: 0.1
num_train_epochs: 3
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
```

@@ -130,7 +130,7 @@ chunk_size: 8192

For more details about `--kt_optimize_rule`, please refer to https://github.com/kvcache-ai/ktransformers/blob/main/doc/en/KTransformers-Fine-Tuning_User-Guide.md
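As a side note on the training hyperparameters shown above (`learning_rate: 5.0e-6`, `lr_scheduler_type: cosine`, `warmup_ratio: 0.1`): the schedule they describe can be sketched in a few lines. This is an illustrative approximation of linear warmup followed by cosine decay to zero, not the KTransformers or LLaMA-Factory implementation.

```python
import math

def cosine_lr_with_warmup(step, total_steps, base_lr=5.0e-6, warmup_ratio=0.1):
    """Linear warmup to base_lr, then cosine decay to zero over the remaining steps."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Ramp linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Cosine decay from base_lr down to 0 after warmup.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

Keeping the first ~10% of steps as warmup keeps early DPO updates small, which helps stability since the policy and reference models start out identical.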

(2) examples/inference/deepseek2_lora_dpo_kt.yaml

Then you can use the LoRA adapter saved in `saves/Kllama_deepseekV2_DPO` for inference, just as in SFT training. For example,
Contributor review comment (severity: medium):
While this clarification is helpful, the YAML configuration example that follows (starting on line 135) appears to have an inconsistency. The model_name_or_path on line 136 is set to DeepSeek-V2-Lite-Chat, but earlier in this document (lines 64 and 83), this was updated to deepseek-ai/DeepSeek-V2-Lite. Please update line 136 as well for consistency.


```YAML
model_name_or_path: DeepSeek-V2-Lite-Chat
```
2 changes: 2 additions & 0 deletions kt-sft/README.md
@@ -191,6 +191,8 @@ cpu_infer: 32
chunk_size: 8192
```

We now also support RL DPO training using the KTransformers backend. See the [DPO Tutorial](../doc/en/DPO_tutorial.md) for details.
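For readers new to DPO, the objective optimized during this training is the standard DPO loss over preference pairs. The following is a minimal single-pair sketch for intuition only, in plain Python; the actual KTransformers/LLaMA-Factory implementation operates on batched token log-probabilities.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """-log sigmoid(beta * (policy margin - reference margin)) for one preference pair."""
    policy_margin = policy_chosen_logp - policy_rejected_logp
    ref_margin = ref_chosen_logp - ref_rejected_logp
    # Sigmoid of the scaled margin difference, then negative log-likelihood.
    return -math.log(1.0 / (1.0 + math.exp(-beta * (policy_margin - ref_margin))))
```

When the policy prefers the chosen response more strongly than the frozen reference model does, the loss drops below log 2; `beta` controls how sharply deviation from the reference is rewarded or penalized.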

`kt_optimize_rule` controls **placement strategy**. See also [ktransformers/optimize_rules](https://github.com/kvcache-ai/ktransformers/tree/main/ktransformers/optimize/optimize_rules). Naming hints (`*` = wildcard):

| Pattern | Meaning |