Commit caade55

docs: rename reward_models to judge_models for consistency
1 parent c90907e commit caade55

2 files changed: 3 additions, 3 deletions

docs/building_graders/training_reward_models.md renamed to docs/building_graders/training_judge_models.md

Lines changed: 2 additions & 2 deletions
@@ -1,4 +1,4 @@
-# Train Reward Models
+# Train Judge Models
 
 Train judge models using three approaches: **SFT** for foundation learning, **Bradley-Terry** for scalar preference scoring, and **GRPO** for generative evaluation with reasoning.
 
@@ -10,7 +10,7 @@ OpenJudge provides training pipelines for building custom judge models. Each met
 | Method | Output Type | Training Data | Interpretable | Best For |
 |--------|-------------|---------------|---------------|----------|
 | **SFT** | Generative (text) | Demonstrations | ✅ Yes | Model initialization, response generation |
-| **Bradley-Terry** | Scalar score | Preference pairs | ❌ No | RLHF reward modeling, ranking |
+| **Bradley-Terry** | Scalar score | Preference pairs | ❌ No | RLHF judge modeling, ranking |
 | **GRPO** | Generative (text) | Labeled responses | ✅ Yes | Interpretable evaluation with reasoning |
 
 **Common Requirements:**

mkdocs.yml

Lines changed: 1 addition & 1 deletion
@@ -35,7 +35,7 @@ nav:
 - Overview: building_graders/overview.md
 - Create Custom Graders: building_graders/create_custom_graders.md
 - Generate Rubrics as Graders: building_graders/generate_rubrics_as_graders.md
-- Train Reward Models: building_graders/training_reward_models.md
+- Train Judge Models: building_graders/training_judge_models.md
 
 - Running Graders:
 - Run Grading Tasks: running_graders/run_tasks.md
