Commit b154a30

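docs: improve documentation structure and content

Main improvements:
- Restructure the home page: trim content from 489 to 236 lines (52% reduction)
- Improve navigation: move the Reference section forward and remove the nonexistent API Documentation and Changelog entries
- Clean up Jupyter Notebook references: remove all .ipynb file references and fix 16+ broken links
- Simplify Learning Paths: remove redundant sub-item descriptions so the paths are clearer
- Fix Installation tabs: use pymdownx.tabbed syntax consistently and remove the extension conflict
- Trim the Tutorial README from 242 to 175 lines (28% reduction)
- Unify wording: change 'notebook' to 'guide' for consistency

Affected files:
- Core docs: index.md, quickstart.md, mkdocs.yml
- Tutorial docs: tutorial/README.md and several sub-tutorials
- Config files: sitemap.txt

These improvements make the documentation more concise, accurate, and easy to navigate.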
1 parent 51f04c7 commit b154a30
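
The commit message describes removing every `.ipynb` link from the docs. As a rough sketch of how such leftovers could be checked for (this script is not part of the commit; the `docs` root, the function names, and the regex are illustrative assumptions), a small standard-library scan might look like this:

```python
import re
from pathlib import Path

# Matches Markdown inline links whose target ends in .ipynb,
# e.g. [Data Loading](../data/load.ipynb). Assumption: inline links only.
IPYNB_LINK = re.compile(r"\[([^\]]*)\]\(([^)\s]+\.ipynb)\)")


def find_ipynb_links(text: str) -> list[tuple[int, str]]:
    """Return (line_number, link_target) pairs for every .ipynb link in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in IPYNB_LINK.finditer(line):
            hits.append((lineno, match.group(2)))
    return hits


def scan_docs(root: str = "docs") -> dict[str, list[tuple[int, str]]]:
    """Scan every .md file under root (hypothetical layout) for leftover links."""
    report = {}
    for path in sorted(Path(root).rglob("*.md")):
        hits = find_ipynb_links(path.read_text(encoding="utf-8"))
        if hits:
            report[str(path)] = hits
    return report


if __name__ == "__main__":
    for file, hits in scan_docs().items():
        for lineno, target in hits:
            print(f"{file}:{lineno}: {target}")
```

This sketch only catches inline Markdown links. Plain paths such as the `sitemap.txt` entries, or directory links like `../../examples/`, would need a separate check.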

14 files changed: +101 -437 lines changed

docs/index.md

Lines changed: 58 additions & 310 deletions
Large diffs are not rendered by default.

docs/quickstart.md

Lines changed: 3 additions & 9 deletions
@@ -11,14 +11,16 @@ Get started with RM-Gallery in just 5 minutes! This guide will walk you through

 ## Installation

-RM-Gallery requires Python >= 3.10 and < 3.13.
+> RM-Gallery requires **Python >= 3.10 and < 3.13**

 === "From PyPI"
+
 ```bash
 pip install rm-gallery
 ```

 === "From Source"
+
 ```bash
 git clone https://github.com/modelscope/RM-Gallery.git
 cd RM-Gallery
@@ -134,14 +136,6 @@ Use reward models in real applications:
 - **[Data Refinement](tutorial/rm_application/data_refinement.md)** - Improve data quality with RM
 - **[Post Training](tutorial/rm_application/post_training.md)** - Integrate with RLHF

-## Interactive Examples
-
-Want to try it hands-on? Check out our Jupyter Notebook examples:
-
-- **[Quickstart Notebook](../examples/quickstart.ipynb)** - Interactive version of this guide
-- **[Custom RM Tutorial](../examples/custom-rm.ipynb)** - Build your own reward model
-- **[Evaluation Pipeline](../examples/evaluation.ipynb)** - Complete evaluation workflow
-
 ## Common Scenarios

 ### Math Problems

docs/sitemap.txt

Lines changed: 0 additions & 7 deletions
@@ -15,12 +15,6 @@
 - Data Pipeline: /tutorial/data/pipeline/
 - End-to-End Guide: /tutorial/end-to-end/

-## Examples (Interactive)
-
-- Quickstart Notebook: /examples/quickstart.ipynb
-- Custom RM Notebook: /examples/custom-rm.ipynb
-- Evaluation Pipeline: /examples/evaluation.ipynb
-
 ## Guides

 - Using Built-in RMs: /tutorial/building_rm/ready2use_rewards/
@@ -35,7 +29,6 @@

 - RM Library: /library/rm_library/
 - Rubric Library: /library/rubric_library/
-- API Documentation: /api_reference/

 ## Contribution

docs/tutorial/README.md

Lines changed: 14 additions & 81 deletions
@@ -8,58 +8,25 @@ Welcome to the RM-Gallery tutorial series! This directory contains comprehensive

 **Goal**: Get started with reward models in 30 minutes

-1. **[Quickstart Guide](../quickstart.md)** (5 min)
-- Install RM-Gallery
-- Use your first reward model
-- Evaluate AI responses
-
-2. **[Building RM Overview](building_rm/overview.md)** (10 min)
-- Understand reward model types
-- Learn the architecture
-- See examples
-
-3. **[Using Built-in RMs](building_rm/ready2use_rewards.md)** (15 min)
-- Explore 35+ pre-built models
-- Choose the right model
-- Run evaluations
+1. **[Quickstart Guide](../quickstart.md)** - Install, use, and evaluate your first RM (5 min)
+2. **[Building RM Overview](building_rm/overview.md)** - Understand RM types and architecture (10 min)
+3. **[Using Built-in RMs](building_rm/ready2use_rewards.md)** - Explore 35+ pre-built models (15 min)

 ### 🚀 Intermediate Path

 **Goal**: Build and customize reward models

-1. **[Building Custom RMs](building_rm/custom_reward.md)** (30 min)
-- Create rule-based rewards
-- Build LLM-based rewards
-- Use the Rubric-Critic-Score paradigm
-
-2. **[Data Pipeline](data/pipeline.md)** (20 min)
-- Load data from various sources
-- Process and transform data
-- Export to different formats
-
-3. **[End-to-End Tutorial](end-to-end.md)** (30 min)
-- Build a complete reward model from scratch
-- Test and validate
-- Deploy and use
+1. **[Building Custom RMs](building_rm/custom_reward.md)** - Create rule-based and LLM-based rewards (30 min)
+2. **[Data Pipeline](data/pipeline.md)** - Load, process, and transform data (20 min)
+3. **[End-to-End Tutorial](end-to-end.md)** - Complete workflow from data to deployment (30 min)

 ### 🎓 Advanced Path

 **Goal**: Train, evaluate, and deploy at scale

-1. **[Training RM Overview](training_rm/overview.md)** (15 min)
-- Understand training paradigms
-- Set up training environment
-- Choose training strategy
-
-2. **[Training with VERL](training_rm/training_rm.md)** (60 min)
-- Prepare training data
-- Configure training
-- Launch distributed training
-
-3. **[High-Performance Serving](rm_serving/rm_server.md)** (45 min)
-- Deploy RM as a service
-- Set up load balancing
-- Monitor performance
+1. **[Training RM Overview](training_rm/overview.md)** - Understand training paradigms and setup (15 min)
+2. **[Training with VERL](training_rm/training_rm.md)** - Complete RL-based training workflow (60 min)
+3. **[High-Performance Serving](rm_serving/rm_server.md)** - Deploy RM as production service (45 min)

 ## 📚 Tutorial Catalog

@@ -86,10 +53,11 @@ Welcome to the RM-Gallery tutorial series! This directory contains comprehensive
 | Tutorial | Level | Time | Description |
 |----------|-------|------|-------------|
 | [Evaluation Overview](evaluation/overview.md) | Beginner | 10 min | Introduction to evaluation |
+| [RMB](evaluation/rmb.md) | Intermediate | 30 min | Reward Model Benchmark |
+| [RM-Bench](evaluation/rmbench.md) | Intermediate | 30 min | Subtlety and style evaluation |
+| [JudgeBench](evaluation/judgebench.md) | Intermediate | 30 min | Judge capability testing |
 | [RewardBench2](evaluation/rewardbench2.md) | Intermediate | 30 min | Latest benchmark |
 | [Conflict Detector](evaluation/conflict_detector.md) | Advanced | 45 min | Detect evaluation conflicts |
-| [JudgeBench](evaluation/judgebench.md) | Intermediate | 30 min | Judge capability testing |
-| [RM-Bench](evaluation/rmbench.md) | Intermediate | 30 min | Comprehensive evaluation |

 ### Data Processing

@@ -127,7 +95,7 @@ Welcome to the RM-Gallery tutorial series! This directory contains comprehensive

 **Test on benchmarks**
 → Read [Evaluation Overview](evaluation/overview.md)
-→ Try specific benchmarks (RewardBench2, RM-Bench, etc.)
+→ Try specific benchmarks: [RMB](evaluation/rmb.md), [RM-Bench](evaluation/rmbench.md), [RewardBench2](evaluation/rewardbench2.md)

 **Deploy to production**
 → Follow [RM Server Guide](rm_serving/rm_server.md)
@@ -164,11 +132,9 @@ Welcome to the RM-Gallery tutorial series! This directory contains comprehensive

 - [Quickstart Guide](../quickstart.md) - Get started in 5 minutes
 - [FAQ](../faq.md) - Common questions answered
-- [API Reference](../api_reference.md) - Complete API docs

 ### Interactive

-- [Jupyter Notebooks](../../examples/) - Hands-on tutorials
 - [End-to-End Tutorial](end-to-end.md) - Complete project

 ### Reference
@@ -177,21 +143,6 @@ Welcome to the RM-Gallery tutorial series! This directory contains comprehensive
 - [Rubric Library](../library/rubric_library.md) - Evaluation rubrics
 - [Contribution Guide](../contribution.md) - How to contribute

-## 📊 Tutorial Difficulty Legend
-
-- 🌱 **Beginner**: No prior experience needed
-- 🚀 **Intermediate**: Basic understanding required
-- 🎓 **Advanced**: In-depth knowledge helpful
-
-## ⏱️ Time Estimates
-
-Time estimates are for:
-- **Reading**: Understanding the concepts
-- **Coding**: Running and modifying examples
-- **Practice**: Experimenting with your own data
-
-Actual time may vary based on your experience level.
-
 ## 🆘 Getting Help

 **Stuck on a tutorial?**
@@ -203,25 +154,7 @@ Actual time may vary based on your experience level.

 **Found an error?**

-Please report it by:
-1. Opening a GitHub Issue
-2. Including the tutorial name
-3. Describing the problem
-4. Suggesting a fix (optional)
-
-## 🎓 Additional Resources
-
-### External Learning
-
-- **OpenAI Evals**: Similar evaluation framework
-- **RLHF Papers**: Academic background
-- **LLM Alignment**: Broader context
-
-### Community
-
-- **GitHub**: Source code and issues
-- **Discussions**: Q&A and ideas
-- **Examples**: Community contributions
+Please [open a GitHub Issue](https://github.com/modelscope/RM-Gallery/issues) with the tutorial name and problem description.

 ## 🚀 Next Steps

docs/tutorial/building_rm/autorubric.md

Lines changed: 1 addition & 1 deletion
@@ -289,7 +289,7 @@ Input preference data should be in JSONL format with the following structure:

 ### Data Loading & Conversion

-For loading and converting data from various sources (HuggingFace datasets, local files, etc.), we provide a unified data loading framework. See the **[Data Loading Tutorial](../data/load.ipynb)** for comprehensive examples.
+For loading and converting data from various sources (HuggingFace datasets, local files, etc.), we provide a unified data loading framework. See the **[Data Loading Tutorial](../data/load.md)** for comprehensive examples.

 **Quick Example - Load HelpSteer3 Preference Dataset:**

docs/tutorial/building_rm/benchmark_practices.md

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 # Benchmark

 ## 1. Overview
-In this notebook, we will show the gallery's pipeline on built-in reward benchmark: [RewardBench2](https://huggingface.co/spaces/allenai/reward-bench) and [RMB Bench](https://github.com/Zhou-Zoey/RMB-Reward-Model-Benchmark).
+In this guide, we will show the gallery's pipeline on built-in reward benchmark: [RewardBench2](https://huggingface.co/spaces/allenai/reward-bench) and [RMB Bench](https://github.com/Zhou-Zoey/RMB-Reward-Model-Benchmark).

 ## 2. Setup

docs/tutorial/building_rm/custom_reward.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 # Custom Reward Module Development Guide

-This notebook demonstrates how to create custom reward modules by extending the base classes in RM-Gallery.
+This guide demonstrates how to create custom reward modules by extending the base classes in RM-Gallery.

 ## 1. Overview
 Here's a structured reference listing of the key base classes, select appropriate base class based on evaluation strategy:

docs/tutorial/building_rm/overview.md

Lines changed: 3 additions & 3 deletions
@@ -1,7 +1,7 @@
 # End-to-End Pipeline: From Data to Reward

 ## 1. Overview
-This notebook demonstrates a complete workflow following these steps:
+This guide demonstrates a complete workflow following these steps:

 - **Data Preparation** - Load dataset from source and split into training (for AutoRubric) and test sets

@@ -26,7 +26,7 @@ os.environ["BASE_URL"] = ""
 ## 3. Data Preparation

 We'll start by loading our dataset using the flexible data loading module.
-You can read more from [Data Loading](../data/load.ipynb).
+You can read more from [Data Loading](../data/load.md).

 ```python
 # Implementation by creating base class
@@ -154,7 +154,7 @@ generated_reward_module = BaseHarmlessnessListWiseReward(
 ```

 ### 4.3. Customize Your Reward
-See more details in [Reward Customization](./custom_reward.ipynb).
+See more details in [Reward Customization](./custom_reward.md).

 ```python
 from typing import List

docs/tutorial/end-to-end.md

Lines changed: 1 addition & 2 deletions
@@ -567,9 +567,8 @@ cd RM-Gallery/examples/end_to_end/
 ## Additional Resources

 - 📚 [Full Documentation](../index.md)
-- 💻 [Interactive Notebooks](../../examples/)
 - 🤝 [Community Forum](https://github.com/modelscope/RM-Gallery/discussions)
-- 📝 [API Reference](../api_reference.md)
+- [FAQ](../faq.md)

 Happy building! 🚀

docs/tutorial/evaluation/overview.md

Lines changed: 2 additions & 2 deletions
@@ -261,7 +261,7 @@ Each benchmark page provides detailed setup instructions, code examples, and res

 ## Additional Resources

-- **[Building RM Overview](../building_rm/overview.ipynb)** - Learn how to build reward models
+- **[Building RM Overview](../building_rm/overview.md)** - Learn how to build reward models
 - **[RM Library](../../library/rm_library.md)** - Pre-built reward models
-- **[Best Practices](../building_rm/benchmark_practices.ipynb)** - Evaluation best practices
+- **[Best Practices](../building_rm/benchmark_practices.md)** - Evaluation best practices
