6 changes: 3 additions & 3 deletions CONTRIBUTING.md
@@ -4,7 +4,7 @@ Thank you for your interest in Trinity-RFT! Our framework is built on a decouple

## Where to Contribute

- Trinity-RFT provides modular interfaces for different technical interests. Please refer to our [Developer Guide](https://modelscope.github.io/Trinity-RFT/en/main/tutorial/develop_overview.html) for detailed implementation standards:
+ Trinity-RFT provides modular interfaces for different technical interests. Please refer to our [Developer Guide](https://agentscope-ai.github.io/Trinity-RFT/en/main/tutorial/develop_overview.html) for detailed implementation standards:

| Focus Area | Interface/Code Directory | Potential Tasks |
| :--- | :--- | :--- |
@@ -41,10 +41,10 @@ To ensure a smooth review process, please complete the following:

## Additional Guidelines

- - **Bug Reports & Feature Requests**: Please use [GitHub Issues](https://github.com/modelscope/Trinity-RFT/issues). For bugs, include reproduction steps, environment info, and error logs.
+ - **Bug Reports & Feature Requests**: Please use [GitHub Issues](https://github.com/agentscope-ai/Trinity-RFT/issues). For bugs, include reproduction steps, environment info, and error logs.
- **Major Changes**: For significant architectural changes or large features, please open an issue first to discuss the design with the maintainers.
- **Documentation**: We highly value improvements to our tutorials, docstrings, and translations.

- *For a deep dive into the framework's architecture, please refer to the [Full Doc](https://modelscope.github.io/Trinity-RFT/en/main/index.html).*
+ *For a deep dive into the framework's architecture, please refer to the [Full Doc](https://agentscope-ai.github.io/Trinity-RFT/en/main/index.html).*

**Thank you for helping us build a better Reinforcement Fine-Tuning framework!**
74 changes: 37 additions & 37 deletions README.md

Large diffs are not rendered by default.

74 changes: 37 additions & 37 deletions README_zh.md

Large diffs are not rendered by default.

10 changes: 5 additions & 5 deletions benchmark/README.md
@@ -70,7 +70,7 @@ python bench.py gsm8k --model_path /path/to/Qwen/Qwen2.5-1.5B-Instruct

#### GSM8K Results

- The chart below shows performance based on this [commit](https://github.com/modelscope/Trinity-RFT/tree/068da409d215bb2450d93b6b7a56740d4751669d).
+ The chart below shows performance based on this [commit](https://github.com/agentscope-ai/Trinity-RFT/tree/068da409d215bb2450d93b6b7a56740d4751669d).
![View Results](../docs/sphinx_doc/assets/gsm8k-bench.png)

### 2. Countdown
@@ -83,7 +83,7 @@ python bench.py countdown --model_path /path/to/Qwen/Qwen2.5-1.5B-Instruct

#### Countdown Results

- The chart below shows performance based on this [commit](https://github.com/modelscope/Trinity-RFT/tree/068da409d215bb2450d93b6b7a56740d4751669d).
+ The chart below shows performance based on this [commit](https://github.com/agentscope-ai/Trinity-RFT/tree/068da409d215bb2450d93b6b7a56740d4751669d).
![View Results](../docs/sphinx_doc/assets/countdown-bench.png)

### 3. Guru-Math
@@ -96,7 +96,7 @@ python bench.py guru_math --model_path /path/to/Qwen/Qwen2.5-7B

#### Guru Results

- The chart below shows performance based on this [commit](https://github.com/modelscope/Trinity-RFT/tree/fbf6c967bcd637bfd9f81fb4d7dd4961d7d5a407).
+ The chart below shows performance based on this [commit](https://github.com/agentscope-ai/Trinity-RFT/tree/fbf6c967bcd637bfd9f81fb4d7dd4961d7d5a407).
![View Results](../docs/sphinx_doc/assets/guru-bench.png)

See [full report](./reports/guru_math.md) for details.
@@ -111,7 +111,7 @@ python bench.py frozen_lake --model_path /path/to/Qwen/Qwen2.5-3B

#### Frozen Lake Results

- The chart below shows performance based on this [commit](https://github.com/modelscope/Trinity-RFT/tree/3861859cbd9c40de07429db2d9b19fd3d4d31703).
+ The chart below shows performance based on this [commit](https://github.com/agentscope-ai/Trinity-RFT/tree/3861859cbd9c40de07429db2d9b19fd3d4d31703).
![View Results](../docs/sphinx_doc/assets/bench_frozenlake_step.png)

See [full report](./reports/frozenlake.md) for details.
@@ -122,7 +122,7 @@ Please follow the instructions in [Alfworld report](./reports/alfworld.md) to ru

#### ALFWorld Results

- The chart below shows performance based on this [commit](https://github.com/modelscope/Trinity-RFT/tree/3861859cbd9c40de07429db2d9b19fd3d4d31703).
+ The chart below shows performance based on this [commit](https://github.com/agentscope-ai/Trinity-RFT/tree/3861859cbd9c40de07429db2d9b19fd3d4d31703).
![View Results](../docs/sphinx_doc/assets/bench_alfworld_step.png)


4 changes: 2 additions & 2 deletions benchmark/reports/alfworld.md
@@ -10,11 +10,11 @@ The environment is configured as follows:
* Reward Structure: +1 for successfully completing the task, -0.1 otherwise
* Maximum Steps: 30 (configurable via `max_env_steps`)

- See the [documentation](https://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_multi_turn.html) for data preparation.
+ See the [documentation](https://agentscope-ai.github.io/Trinity-RFT/en/main/tutorial/example_multi_turn.html) for data preparation.
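
As a quick illustration of the reward structure listed above, here is a minimal sketch (the `step_reward` helper and the episode accumulation below are hypothetical, for illustration only, and are not the actual Trinity-RFT ALFWorld workflow code):

```python
# Hypothetical sketch of the reward scheme described in this report:
# +1 on the step that completes the task, -0.1 on every other step,
# with an episode capped at max_env_steps (30 by default).
def step_reward(task_done: bool) -> float:
    return 1.0 if task_done else -0.1


def episode_return(step_outcomes: list[bool], max_env_steps: int = 30) -> float:
    """Sum per-step rewards over an episode truncated at max_env_steps."""
    total = 0.0
    for done in step_outcomes[:max_env_steps]:
        total += step_reward(done)
        if done:
            break
    return total


# Example: success on the 5th step -> 4 * (-0.1) + 1.0 = 0.6
print(episode_return([False, False, False, False, True]))
```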

## 2. Experimental Settings

- We evaluate the performance of the following methods in the Trinity-RFT framework with version [0.3.3](https://github.com/modelscope/Trinity-RFT/releases/tag/v0.3.3) (verl==0.5.0, vllm==0.11.0) and compare against the latest release of rLLM with commit ID [ef6451f](https://github.com/rllm-org/rllm/commit/ef6451fbd7eba224c4a87e3fd944d7c0e2bcc0ea) (verl==0.5.0) as of Nov. 6, 2025.
+ We evaluate the performance of the following methods in the Trinity-RFT framework with version [0.3.3](https://github.com/agentscope-ai/Trinity-RFT/releases/tag/v0.3.3) (verl==0.5.0, vllm==0.11.0) and compare against the latest release of rLLM with commit ID [ef6451f](https://github.com/rllm-org/rllm/commit/ef6451fbd7eba224c4a87e3fd944d7c0e2bcc0ea) (verl==0.5.0) as of Nov. 6, 2025.
Since rLLM does not yet support the ALFWorld environment, we implement this task in rLLM for comparison.

In both Trinity-RFT and rLLM, we evaluate performance on this task using the GRPO algorithm.
2 changes: 1 addition & 1 deletion benchmark/reports/frozenlake.md
@@ -21,7 +21,7 @@ To filter the unsolvable tasks, we restrict the game map to have a valid path wi

## 2. Experimental Settings

- We evaluate the performance of the following methods in the Trinity-RFT framework with version [0.3.3](https://github.com/modelscope/Trinity-RFT/releases/tag/v0.3.3) (verl==0.5.0, vllm==0.11.0) and compare against the latest release of rLLM with commit ID [ef6451f](https://github.com/rllm-org/rllm/commit/ef6451fbd7eba224c4a87e3fd944d7c0e2bcc0ea) (verl==0.5.0) as of Nov. 6, 2025.
+ We evaluate the performance of the following methods in the Trinity-RFT framework with version [0.3.3](https://github.com/agentscope-ai/Trinity-RFT/releases/tag/v0.3.3) (verl==0.5.0, vllm==0.11.0) and compare against the latest release of rLLM with commit ID [ef6451f](https://github.com/rllm-org/rllm/commit/ef6451fbd7eba224c4a87e3fd944d7c0e2bcc0ea) (verl==0.5.0) as of Nov. 6, 2025.

We fine-tune a Qwen2.5-3B-Instruct model on the training tasks with GRPO. For all experiments, we fix the key parameters to `batch_size=64`, `repeat_times=8`, and `lr=1e-6`. We run each experiment three times and report the average results.

2 changes: 1 addition & 1 deletion benchmark/reports/guru_math.md
@@ -6,7 +6,7 @@ Guru-Math is the mathematics task derived from the [Guru](https://huggingface.co

## 2. Experimental Settings

- We evaluate the performance of the following methods within the Trinity-RFT framework using version [0.3.3](https://github.com/modelscope/Trinity-RFT/releases/tag/v0.3.3) (verl==0.5.0, vllm==0.10.2). For comparison, we ported relevant code from [Reasoning360](https://github.com/LLM360/Reasoning360) to be compatible with verl==0.5.0.
+ We evaluate the performance of the following methods within the Trinity-RFT framework using version [0.3.3](https://github.com/agentscope-ai/Trinity-RFT/releases/tag/v0.3.3) (verl==0.5.0, vllm==0.10.2). For comparison, we ported relevant code from [Reasoning360](https://github.com/LLM360/Reasoning360) to be compatible with verl==0.5.0.

Within both Trinity-RFT and veRL, we evaluate performance using the GRPO algorithm on this task. We fine-tune a base `Qwen2.5-7B` model that has not undergone prior fine-tuning.

2 changes: 1 addition & 1 deletion docs/sphinx_doc/source/conf.py
@@ -90,7 +90,7 @@ def get_recent_tags(n: int) -> list:
"article_header_end": "article_header_customized.html",
"use_download_button": True,
"use_fullscreen_button": True,
"repository_url": "https://github.com/modelscope/Trinity-RFT",
"repository_url": "https://github.com/agentscope-ai/Trinity-RFT",
"use_repository_button": True,
}
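
For context, several of the keys in this hunk are standard `sphinx-book-theme` options. Below is a minimal sketch of how the updated `repository_url` typically sits in a Sphinx `conf.py`; it only echoes keys visible in this hunk, assumes the docs use `sphinx_book_theme`, and omits the project's other settings:

```python
# docs/sphinx_doc/source/conf.py -- abridged sketch, not the full project config
html_theme = "sphinx_book_theme"

html_theme_options = {
    "use_download_button": True,
    "use_fullscreen_button": True,
    # The theme's repository button now points at the new GitHub organization.
    "repository_url": "https://github.com/agentscope-ai/Trinity-RFT",
    "use_repository_button": True,
}
```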
