Commit f31c365

docs: Add how-to guide for aligning LLM-as-Judge (#2348)
## Issue Link / Problem Description

Adds documentation and examples for aligning LLM-as-Judge evaluators with human expert judgments. This addresses a common challenge where LLM judges may not align well with human evaluations, leading to unreliable automated assessments.

## Changes Made

- Add comprehensive how-to guide: `docs/howtos/applications/align-llm-as-judge.md`
  - Step-by-step instructions for measuring and improving judge alignment
- Add complete evaluation example: `examples/ragas_examples/judge_alignment/`
  - `evals.py`: Baseline and improved judge metrics with alignment measurement
  - `__init__.py`: Module initialization with main entry points
- Documentation covers:
  - Why judge alignment matters
  - Dataset structure and loading
  - Defining judge and alignment metrics
  - Running baseline evaluations
  - Iterating on judge prompts to improve alignment

## Testing

### How to Test

- [x] Manual testing steps:
  1. Run baseline evaluation: `uv run python -m ragas_examples.judge_alignment`
  2. Verify alignment metrics are calculated correctly
  3. Build docs locally: `make serve-docs` and navigate to the new guide
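As a rough illustration of what "measuring judge alignment" can mean, the sketch below computes the fraction of samples on which a judge's verdict matches the human expert's label. It is not the code in `evals.py`: the `LabeledSample` type, the `alignment_rate` function, and the `"pass"`/`"fail"` label values are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class LabeledSample:
    """One evaluated response (hypothetical structure, not the Ragas dataset schema)."""

    judge_verdict: str  # assumed label values: "pass" or "fail"
    human_label: str  # label assigned by the human expert


def alignment_rate(samples: list[LabeledSample]) -> float:
    """Return the fraction of samples where the judge agrees with the human."""
    if not samples:
        raise ValueError("need at least one sample")
    matches = sum(s.judge_verdict == s.human_label for s in samples)
    return matches / len(samples)


# Toy data: the judge agrees with the human on 2 of 4 samples.
samples = [
    LabeledSample("pass", "pass"),
    LabeledSample("fail", "pass"),
    LabeledSample("fail", "fail"),
    LabeledSample("pass", "fail"),
]
print(f"alignment: {alignment_rate(samples):.2f}")
```

Plain agreement is the simplest choice; with imbalanced labels, a chance-corrected statistic such as Cohen's kappa may be more informative.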
1 parent 9deccc8 commit f31c365

7 files changed: 674 additions and 41 deletions
.cursor/commands/git-pr.md

Lines changed: 4 additions & 39 deletions
@@ -1,39 +1,4 @@
-Review the PR and create PR description in this format. Just output the PR description in a text block.
-
-PR Description:
-
-## Issue Link / Problem Description
-<!-- Link to related issue or describe the problem this PR solves -->
-- Fixes #[issue_number]
-- OR describe the issue: What problem does this solve? How can it be replicated?
-
-## Changes Made
-<!-- Describe what you changed and why -->
--
--
--
-
-## Testing
-<!-- Describe how this should be tested -->
-### How to Test
-- [ ] Automated tests added/updated
-- [ ] Manual testing steps:
-1.
-2.
-3.
-
-## References
-<!-- Link to related issues, discussions, forums, or external resources -->
-- Related issues:
-- Documentation:
-- External references:
-
-## Screenshots/Examples (if applicable)
-<!-- Add screenshots or code examples showing the change -->
-
----
-<!--
-Thank you for contributing to Ragas!
-Please fill out the sections above as completely as possible.
-The more information you provide, the faster your PR can be reviewed and merged.
--->
+Make sure you are on a branch other than main.
+Commit changes.
+Run make format.
+Create PR using gh cli following .github/pull_request_template.md template

.gitignore

Lines changed: 1 addition & 2 deletions
@@ -166,8 +166,7 @@ cython_debug/
 
 # Cursor
 .cursorignore
-.cursor/*
-!.cursor/rules/
+.cursor/plans
 
 # Ragas specific
 _experiments/
