
Commit f9773c0

abrookins and claude committed
Improve pronoun grounding test to validate cross-pronoun resolution
Changed test case to use different pronouns referring to different people:
- "She said that he prefers..." → "Alice said that Bob prefers..."

This properly tests that multiple pronouns in the same sentence are correctly resolved to different entities based on context, avoiding redundant same-name replacements while maintaining test validity.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
1 parent eefbfcd commit f9773c0

1 file changed, +5 -5 lines changed


tests/test_llm_judge_evaluation.py

Lines changed: 5 additions & 5 deletions
@@ -263,13 +263,13 @@ async def test_judge_pronoun_grounding_evaluation(self):
 
         # Test case: good pronoun grounding
         context_messages = [
-            "John is a software engineer at Google.",
-            "Sarah works with him on the AI team.",
+            "Alice is the team lead for the project.",
+            "Bob is a junior developer working under her.",
         ]
 
-        original_text = "He mentioned that he prefers Python over JavaScript."
-        good_grounded_text = "John mentioned that he prefers Python over JavaScript."
-        expected_grounding = {"he": "John"}
+        original_text = "She said that he prefers Python over JavaScript."
+        good_grounded_text = "Alice said that Bob prefers Python over JavaScript."
+        expected_grounding = {"she": "Alice", "he": "Bob"}
 
         evaluation = await judge.evaluate_grounding(
             context_messages=context_messages,
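
As a rough sanity check on the new fixture, the sketch below applies the `expected_grounding` mapping to `original_text` with a plain regex substitution and compares the result with `good_grounded_text`. The helper `apply_grounding` is hypothetical and is not part of this repository or of the judge under test; it only illustrates that the two pronouns map to two different names.

```python
import re


def apply_grounding(text: str, grounding: dict[str, str]) -> str:
    """Hypothetical helper: replace each whole-word pronoun with its mapped name."""
    if not grounding:
        return text
    # Match any mapped pronoun as a whole word, ignoring case ("She" and "she").
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(p) for p in grounding) + r")\b",
        flags=re.IGNORECASE,
    )
    lookup = {pronoun.lower(): name for pronoun, name in grounding.items()}
    return pattern.sub(lambda m: lookup[m.group(0).lower()], text)


# Values copied from the updated test case in the diff above.
original_text = "She said that he prefers Python over JavaScript."
good_grounded_text = "Alice said that Bob prefers Python over JavaScript."
expected_grounding = {"she": "Alice", "he": "Bob"}

grounded = apply_grounding(original_text, expected_grounding)
print(grounded == good_grounded_text)
```

A naive substitution like this works only because each pronoun in the fixture has a single referent; it does not model how the judge itself resolves pronouns from context.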
