Commit 8f5fcbc

Update some details in tutorial (#144)
1 parent cfa7f85 commit 8f5fcbc

5 files changed: +68 -25 lines changed

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
1+
.. _api-reference:
2+
3+
API Reference
4+
=============
5+
6+
This page shows some useful APIs of Trinity-RFT. Click the API name to see the detailed documentation.
7+
8+
.. toctree::
9+
:maxdepth: 1
10+
:glob:
11+
12+
build_api/trinity.buffer
13+
build_api/trinity.explorer
14+
build_api/trinity.trainer
15+
build_api/trinity.algorithm
16+
build_api/trinity.manager
17+
build_api/trinity.common
18+
build_api/trinity.utils

docs/sphinx_doc/source/index.rst

Lines changed: 4 additions & 9 deletions
@@ -35,19 +35,14 @@ Welcome to Trinity-RFT's documentation!
3535

3636
.. toctree::
3737
:maxdepth: 2
38+
:hidden:
3839
:caption: FAQ
3940

4041
tutorial/faq.md
4142

4243
.. toctree::
43-
:maxdepth: 1
44-
:glob:
44+
:maxdepth: 2
45+
:hidden:
4546
:caption: API Reference
4647

47-
build_api/trinity.buffer
48-
build_api/trinity.explorer
49-
build_api/trinity.trainer
50-
build_api/trinity.algorithm
51-
build_api/trinity.manager
52-
build_api/trinity.common
53-
build_api/trinity.utils
48+
api_reference

docs/sphinx_doc/source/main.md

Lines changed: 14 additions & 14 deletions
@@ -6,7 +6,6 @@
66
# Trinity-RFT: A General-Purpose and Unified Framework for Reinforcement Fine-Tuning of Large Language Models
77

88

9-
109
## 🚀 News
1110

1211
* [2025-07] Trinity-RFT v0.2.0 is released.
@@ -82,6 +81,7 @@ It is designed to support diverse application scenarios and serve as a unified p
8281
![Trinity-RFT-data-pipelines](../assets/trinity-data-pipelines.png)
8382

8483
</details>
84+
<br>
8585

8686

8787

@@ -90,12 +90,12 @@ It is designed to support diverse application scenarios and serve as a unified p
9090

9191
* **Adaptation to New Scenarios:**
9292

93-
Implement agent-environment interaction logic in a single `Workflow` or `MultiTurnWorkflow` class. ([Example](./docs/sphinx_doc/source/tutorial/example_multi_turn.md))
93+
Implement agent-environment interaction logic in a single `Workflow` or `MultiTurnWorkflow` class. ([Example](/tutorial/example_multi_turn.md))
9494

9595

9696
* **RL Algorithm Development:**
9797

98-
Develop custom RL algorithms (loss design, sampling, data processing) in compact, plug-and-play classes. ([Example](./docs/sphinx_doc/source/tutorial/example_mix_algo.md))
98+
Develop custom RL algorithms (loss design, sampling, data processing) in compact, plug-and-play classes. ([Example](/tutorial/example_mix_algo.md))
9999

100100

101101
* **Low-Code Usage:**
@@ -301,39 +301,39 @@ For studio users, click "Run" in the web interface.
301301

302302
Tutorials for running different RFT modes:
303303

304-
+ [Quick example: GRPO on GSM8k](./docs/sphinx_doc/source/tutorial/example_reasoning_basic.md)
305-
+ [Off-policy RFT](./docs/sphinx_doc/source/tutorial/example_reasoning_advanced.md)
306-
+ [Fully asynchronous RFT](./docs/sphinx_doc/source/tutorial/example_async_mode.md)
307-
+ [Offline learning by DPO or SFT](./docs/sphinx_doc/source/tutorial/example_dpo.md)
304+
+ [Quick example: GRPO on GSM8k](/tutorial/example_reasoning_basic.md)
305+
+ [Off-policy RFT](/tutorial/example_reasoning_advanced.md)
306+
+ [Fully asynchronous RFT](/tutorial/example_async_mode.md)
307+
+ [Offline learning by DPO or SFT](/tutorial/example_dpo.md)
308308

309309

310310
Tutorials for adapting Trinity-RFT to a new multi-turn agentic scenario:
311311

312-
+ [Multi-turn tasks](./docs/sphinx_doc/source/tutorial/example_multi_turn.md)
312+
+ [Multi-turn tasks](/tutorial/example_multi_turn.md)
313313

314314

315315
Tutorials for data-related functionalities:
316316

317-
+ [Advanced data processing & human-in-the-loop](./docs/sphinx_doc/source/tutorial/example_data_functionalities.md)
317+
+ [Advanced data processing & human-in-the-loop](/tutorial/example_data_functionalities.md)
318318

319319

320320
Tutorials for RL algorithm development/research with Trinity-RFT:
321321

322-
+ [RL algorithm development with Trinity-RFT](./docs/sphinx_doc/source/tutorial/example_mix_algo.md)
322+
+ [RL algorithm development with Trinity-RFT](/tutorial/example_mix_algo.md)
323323

324324

325-
Guidelines for full configurations: see [this document](./docs/sphinx_doc/source/tutorial/trinity_configs.md)
325+
Guidelines for full configurations: see [this document](/tutorial/trinity_configs.md)
326326

327327

328328
Guidelines for developers and researchers:
329329

330-
+ [Build new RL scenarios](./docs/sphinx_doc/source/tutorial/trinity_programming_guide.md#workflows-for-rl-environment-developers)
331-
+ [Implement new RL algorithms](./docs/sphinx_doc/source/tutorial/trinity_programming_guide.md#algorithms-for-rl-algorithm-developers)
330+
+ [Build new RL scenarios](/tutorial/trinity_programming_guide.md#workflows-for-rl-environment-developers)
331+
+ [Implement new RL algorithms](/tutorial/trinity_programming_guide.md#algorithms-for-rl-algorithm-developers)
332332

333333

334334

335335

336-
For some frequently asked questions, see [FAQ](./docs/sphinx_doc/source/tutorial/faq.md).
336+
For some frequently asked questions, see [FAQ](/tutorial/faq.md).
337337

338338

339339
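
Note on the link changes in this file: every edit replaces the repository-relative prefix `./docs/sphinx_doc/source` with a path that starts at `/`, presumably so the links resolve from the Sphinx source root rather than from the repository root. A minimal sketch of the same mechanical rewrite follows; the helper name `rewrite_links` is hypothetical and is not part of Trinity-RFT.

```python
import re

# Prefix used by the old links, as seen on the removed lines of this diff.
OLD_PREFIX = "./docs/sphinx_doc/source"

# Matches a Markdown link target "](<OLD_PREFIX>/...)" and captures the
# part after the prefix (including any "#anchor").
LINK_TARGET = re.compile(r"\]\(" + re.escape(OLD_PREFIX) + r"(/[^)]*)\)")


def rewrite_links(markdown: str) -> str:
    """Rewrite repository-relative doc links to source-root links.

    Hypothetical helper for illustration only; links that do not start
    with OLD_PREFIX are left untouched.
    """
    return LINK_TARGET.sub(r"](\1)", markdown)


if __name__ == "__main__":
    line = "see [FAQ](./docs/sphinx_doc/source/tutorial/faq.md)."
    print(rewrite_links(line))
```

Image links such as `../assets/trinity-data-pipelines.png` do not carry the old prefix, so a rewrite of this shape would leave them unchanged, which matches the diff.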

docs/sphinx_doc/source/tutorial/example_data_functionalities.md

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ In this example, you will learn how to apply the data processor of Trinity-RFT t
88
2. how to configure the data processor
99
3. what the data processor can do
1010

11-
Before getting started, you need to prepare the main environment of Trinity-RFT according to the [installation section of the README file](../main.md),
11+
Before getting started, you need to prepare the main environment of Trinity-RFT according to the [installation section of Quickstart](example_reasoning_basic.md),
1212
and store the base url and api key in the environment variables `OPENAI_BASE_URL` and `OPENAI_API_KEY` for some agentic or API-model usages if necessary.
1313

1414
### Data Preparation
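
Note on the prerequisite in this hunk: the tutorial asks for `OPENAI_BASE_URL` and `OPENAI_API_KEY` to be set when agentic or API-model features are used. A small pre-flight check along these lines could catch a missing variable early; this is a sketch only, and the function name `missing_env_vars` is an assumption, not a Trinity-RFT API.

```python
import os

# The two variable names come from the tutorial text in this hunk.
REQUIRED_VARS = ("OPENAI_BASE_URL", "OPENAI_API_KEY")


def missing_env_vars(env=None, required=REQUIRED_VARS):
    """Return the names in `required` that are unset or empty in `env`.

    `env` defaults to the current process environment; passing a plain
    dict makes the check easy to exercise in isolation.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```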

docs/sphinx_doc/source/tutorial/example_multi_turn.md

Lines changed: 31 additions & 1 deletion
@@ -14,7 +14,37 @@ To run the ALFworld and WebShop env, you need to setup the corresponding environ
1414
- ALFworld is a text-based interactive environment that simulates household scenarios. Agents need to understand natural language instructions and complete various domestic tasks like finding objects, moving items, and operating devices in a virtual home environment.
1515
- WebShop is a simulated online shopping environment where AI agents learn to shop based on user requirements. The platform allows agents to browse products, compare options, and make purchase decisions, mimicking real-world e-commerce interactions.
1616

17-
You may refer to their original environment to complete the setup.
17+
<br>
18+
<details>
19+
<summary>Guidelines for preparing ALFWorld environment</summary>
20+
21+
1. Pip install: `pip install alfworld[full]`
22+
23+
2. Export the path: `export ALFWORLD_DATA=/path/to/alfworld/data`
24+
25+
3. Download the environment: `alfworld-download`
26+
27+
Now you can find the environment in `$ALFWORLD_DATA` and continue with the following steps.
28+
</details>
29+
30+
<details>
31+
<summary>Guidelines for preparing WebShop environment</summary>
32+
33+
1. Install Python 3.8.13
34+
35+
2. Install Java
36+
37+
3. Download the source code: `git clone https://github.com/princeton-nlp/webshop.git webshop`
38+
39+
4. Create a virtual environment: `conda create -n webshop python=3.8.13` and `conda activate webshop`
40+
41+
5. Install requirements into the `webshop` virtual environment via the `setup.sh` script: `./setup.sh [-d small|all]`
42+
43+
Now you can continue with the following steps.
44+
</details>
45+
<br>
46+
47+
You may refer to the original repositories of these environments for more details.
1848
- For ALFWorld, refer to the [ALFWorld](https://github.com/alfworld/alfworld) repository.
1949
- For WebShop, refer to the [WebShop](https://github.com/princeton-nlp/WebShop) repository.
2050
