Commit e8c0a55

Merge branch 'main' of github.com:modelscope/Trinity-RFT into add/learn2ask
2 parents 6b965ee + 1eb169c commit e8c0a55

59 files changed: +332 -101 lines changed

README.md

Lines changed: 3 additions & 2 deletions
@@ -82,6 +82,7 @@ Trinity-RFT is a flexible, general-purpose framework for reinforcement fine-tuni
 
 ## 🚀 News
 
+* [2025-10] [[Release Notes](https://github.com/modelscope/Trinity-RFT/releases/tag/v0.3.2)] Trinity-RFT v0.3.2 released: bug fixes and advanced task selection & scheduling.
 * [2025-10] [[Release Notes](https://github.com/modelscope/Trinity-RFT/releases/tag/v0.3.1)] Trinity-RFT v0.3.1 released: multi-stage training support, improved agentic RL examples, LoRA support, debug mode and new RL algorithms.
 * [2025-09] [[Release Notes](https://github.com/modelscope/Trinity-RFT/releases/tag/v0.3.0)] Trinity-RFT v0.3.0 released: enhanced Buffer, FSDP2 & Megatron support, multi-modal models, and new RL algorithms/examples.
 * [2025-08] Introducing [CHORD](https://github.com/modelscope/Trinity-RFT/tree/main/examples/mix_chord): dynamic SFT + RL integration for advanced LLM fine-tuning ([paper](https://arxiv.org/pdf/2508.11408)).
@@ -177,14 +178,14 @@ uv sync --extra dev --extra flash_attn
 If you just want to use the package without modifying the code:
 
 ```bash
-pip install trinity-rft==0.3.1
+pip install trinity-rft
 pip install flash-attn==2.8.1
 ```
 
 Or with `uv`:
 
 ```bash
-uv pip install trinity-rft==0.3.1
+uv pip install trinity-rft
 uv pip install flash-attn==2.8.1
 ```

README_zh.md

Lines changed: 3 additions & 2 deletions
@@ -83,6 +83,7 @@ Trinity-RFT 是一个灵活、通用的大语言模型(LLM)强化微调(RF
 
 ## 🚀 新闻
 
+* [2025-10] [[发布说明](https://github.com/modelscope/Trinity-RFT/releases/tag/v0.3.2)] Trinity-RFT v0.3.2 发布:修复若干 Bug 并支持进阶的任务选择和调度。
 * [2025-10] [[发布说明](https://github.com/modelscope/Trinity-RFT/releases/tag/v0.3.1)] Trinity-RFT v0.3.1 发布:多阶段训练支持、改进的智能体 RL 示例、LoRA 支持、调试模式和全新 RL 算法。
 * [2025-09] [[发布说明](https://github.com/modelscope/Trinity-RFT/releases/tag/v0.3.0)] Trinity-RFT v0.3.0 发布:增强的 Buffer、FSDP2 & Megatron 支持,多模态模型,以及全新 RL 算法/示例。
 * [2025-08] 推出 [CHORD](https://github.com/modelscope/Trinity-RFT/tree/main/examples/mix_chord):动态 SFT + RL 集成,实现进阶 LLM 微调([论文](https://arxiv.org/pdf/2508.11408))。
@@ -176,14 +177,14 @@ uv sync --extra dev --extra flash_attn
 如果您只需使用 Trinity-RFT 而不打算修改代码:
 
 ```bash
-pip install trinity-rft==0.3.1
+pip install trinity-rft
 pip install flash-attn==2.8.1
 ```
 
 或使用 `uv`
 
 ```bash
-uv pip install trinity-rft==0.3.1
+uv pip install trinity-rft
 uv pip install flash-attn==2.8.1
 ```

benchmark/config/countdown-template.yaml

Lines changed: 1 addition & 1 deletion
@@ -54,7 +54,7 @@ explorer:
   rollout_model:
     engine_num: 2
     tensor_parallel_size: 1
-    enforce_eager: true
+    enforce_eager: false
     enable_prefix_caching: false
     enable_chunked_prefill: false
     gpu_memory_utilization: 0.9

4 binary files changed (file names not shown in this view): -147 KB, -15.8 KB, 464 KB, -50.4 KB

docs/sphinx_doc/source/tutorial/example_async_mode.md

Lines changed: 3 additions & 3 deletions
@@ -31,7 +31,7 @@ buffer:
     taskset:
       name: gsm8k
       storage_type: file
-      path: 'openai/gsm8k'
+      path: ${oc.env:TRINITY_TASKSET_PATH,openai/gsm8k}
       subset_name: 'main'
       split: train
       format:
@@ -79,7 +79,7 @@ buffer:
     taskset:
       name: gsm8k
       storage_type: file
-      path: 'openai/gsm8k'
+      path: ${oc.env:TRINITY_TASKSET_PATH,openai/gsm8k}
       subset_name: 'main'
       format:
         prompt_key: 'question'
@@ -143,7 +143,7 @@ buffer:
     taskset: # important
       name: gsm8k
       storage_type: file
-      path: 'openai/gsm8k'
+      path: ${oc.env:TRINITY_TASKSET_PATH,openai/gsm8k}
       subset_name: 'main'
       format:
         prompt_key: 'question'
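
The `${oc.env:TRINITY_TASKSET_PATH,openai/gsm8k}` value introduced in these hunks follows OmegaConf's `oc.env` resolver syntax: read the named environment variable, and fall back to the text after the comma when the variable is unset. The sketch below mimics that lookup using only the standard library, so it runs without OmegaConf installed. `resolve_env` and its regular expression are hypothetical stand-ins for illustration, not OmegaConf's or Trinity-RFT's implementation, and they ignore nested interpolation and type conversion.

```python
import os
import re

# Matches ${oc.env:NAME} or ${oc.env:NAME,default}.
# Hypothetical pattern for illustration; the real resolver has a fuller grammar.
_ENV_PATTERN = re.compile(r"\$\{oc\.env:([^,}]+)(?:,([^}]*))?\}")


def resolve_env(value, env=None):
    """Substitute every ${oc.env:NAME,default} occurrence in `value`.

    `env` defaults to os.environ; passing a dict makes the lookup testable.
    """
    env = os.environ if env is None else env

    def _substitute(match):
        name = match.group(1).strip()
        default = match.group(2)
        if name in env:
            return env[name]
        if default is None:
            # No default was given and the variable is unset.
            raise KeyError(f"environment variable {name!r} is not set")
        return default.strip()

    return _ENV_PATTERN.sub(_substitute, value)


if __name__ == "__main__":
    path_expr = "${oc.env:TRINITY_TASKSET_PATH,openai/gsm8k}"
    # Variable unset: the default after the comma is used.
    print(resolve_env(path_expr, env={}))
    # Variable set: its value replaces the default.
    print(resolve_env(path_expr, env={"TRINITY_TASKSET_PATH": "/data/gsm8k"}))
```

Under this reading, exporting `TRINITY_TASKSET_PATH` before launching would point the taskset at another location (the `/data/gsm8k` path above is only an example), while leaving it unset keeps the previous `openai/gsm8k` value.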

docs/sphinx_doc/source/tutorial/example_reasoning_basic.md

Lines changed: 2 additions & 2 deletions
@@ -69,7 +69,7 @@ buffer:
     taskset:
       name: gsm8k
       storage_type: file
-      path: 'openai/gsm8k'
+      path: ${oc.env:TRINITY_TASKSET_PATH,openai/gsm8k}
       subset_name: 'main'
       split: 'train'
       format:
@@ -81,7 +81,7 @@ buffer:
     eval_tasksets:
     - name: gsm8k-eval
       storage_type: file
-      path: 'openai/gsm8k'
+      path: ${oc.env:TRINITY_TASKSET_PATH,openai/gsm8k}
       subset_name: 'main'
       split: 'test'
       format:

docs/sphinx_doc/source/tutorial/example_search_email.md

Lines changed: 1 addition & 0 deletions
@@ -48,5 +48,6 @@ The results are shown in the following figure (the accuracy ranges from -0.1 to
 
 ![](../../assets/email_rollout_accuracy.png)
 
+![](../../assets/email_reward_mean.png)
 
 ![](../../assets/email_eval_accuracy.png)
