Merged
Changes from 1 commit
16 changes: 10 additions & 6 deletions examples/agent_patterns/llm_as_a_judge.py
@@ -15,7 +15,7 @@
 story_outline_generator = Agent(
     name="story_outline_generator",
     instructions=(
-        "You generate a very short story outline based on the user's input."
+        "You generate a very short story outline based on the user's input. "
         "If there is any feedback provided, use it to improve the outline."
     ),
 )
@@ -30,9 +30,9 @@ class EvaluationFeedback:
 evaluator = Agent[None](
     name="evaluator",
     instructions=(
-        "You evaluate a story outline and decide if it's good enough."
-        "If it's not good enough, you provide feedback on what needs to be improved."
-        "Never give it a pass on the first try. After 5 attempts, you can give it a pass if story outline is good enough - do not go for perfection"
Member

The instructions allow running 5+ times, so the changes in this PR are inconsistent. If you remove max_attempts and the related logic, we are happy to merge the other changes.

Contributor Author

Thanks for pointing that out. The max_attempts logic has been removed, and only the instruction formatting fixes are kept.

+        "You evaluate a story outline and decide if it's good enough. "
+        "If it's not good enough, you provide feedback on what needs to be improved. "
+        "Never give it a pass on the first try. After 5 attempts, you can give it a pass if the story outline is good enough - do not go for perfection"
     ),
     output_type=EvaluationFeedback,
 )
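
For context on why the trailing spaces in these two hunks matter: Python joins adjacent string literals with no separator, so without them the sentences in `instructions` run together. The snippet below is a minimal sketch of that behaviour using the first prompt from this diff; the names `before` and `after` are illustrative and do not appear in the file.

```python
# Adjacent string literals inside parentheses are concatenated as-is,
# with nothing inserted between them.
before = (
    "You generate a very short story outline based on the user's input."
    "If there is any feedback provided, use it to improve the outline."
)
after = (
    "You generate a very short story outline based on the user's input. "
    "If there is any feedback provided, use it to improve the outline."
)

print("input.If" in before)   # True: the two sentences are fused
print("input. If" in after)   # True: the trailing space separates them
```

The only difference between the two strings is the single space at the sentence boundary.
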
@@ -46,6 +46,8 @@ async def main() -> None:

     # We'll run the entire workflow in a single trace
     with trace("LLM as a judge"):
+        max_attempts = 5
Member

As I mentioned above, please remove this additional logic, which is not necessary.

Contributor Author

Apologies, I clicked the review request by mistake before reverting the changes.

+        attempts = 0
         while True:
             story_outline_result = await Runner.run(
                 story_outline_generator,
@@ -61,8 +63,10 @@ async def main() -> None:

             print(f"Evaluator score: {result.score}")

-            if result.score == "pass":
-                print("Story outline is good enough, exiting.")
+            attempts += 1
+            # break on pass or when we've tried max_attempts times
+            if result.score == "pass" or attempts >= max_attempts:
+                print(f"Exiting after {attempts} attempt{'s' if attempts != 1 else ''}.")
                 break

             print("Re-running with feedback")
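
To make the inconsistency raised in review concrete, here is a rough, self-contained sketch of the capped loop from this hunk with the agents replaced by stub evaluators. `Verdict`, `run_with_cap`, and `passes_on` are hypothetical names introduced only for this illustration; they are not part of the example file or the SDK. It suggests that with `max_attempts = 5` the loop exits on the fifth attempt even if the evaluator would only have passed the outline on a later one, which is the case the prompt ("After 5 attempts, you can give it a pass") leaves open.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Verdict:
    """Hypothetical stand-in for the evaluator's structured output."""
    score: str
    feedback: str


def passes_on(n: int) -> Callable[[int], Verdict]:
    """Build a stub evaluator that first returns "pass" on attempt n."""
    def evaluate(attempt: int) -> Verdict:
        if attempt >= n:
            return Verdict("pass", "good enough")
        return Verdict("needs_improvement", f"attempt {attempt}: tighten the ending")
    return evaluate


def run_with_cap(evaluate: Callable[[int], Verdict], max_attempts: int = 5) -> tuple[int, str]:
    """Mirror the diff's loop: stop on a pass or once max_attempts is reached."""
    attempts = 0
    while True:
        attempts += 1
        result = evaluate(attempts)
        if result.score == "pass" or attempts >= max_attempts:
            return attempts, result.score


print(run_with_cap(passes_on(3)))  # (3, 'pass')
print(run_with_cap(passes_on(6)))  # (5, 'needs_improvement')
```

In the second call the cap ends the loop before the stub evaluator ever gets to return a pass, so the final outline is one the judge still rejected.
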