Commit 1c6560f
Switch judge evaluator from GPT-4o to Claude to avoid OpenAI quota limits
The judge evaluator was using GPT-4o (OpenAI), which hit quota limits during testing.
This commit switches it to Claude 3.5 Sonnet (Anthropic) so CI tests can keep running without interruption.
Changes:
- Added --judge-model claude-3-5-sonnet-20241022 flag
- Added --judge-provider anthropic flag
- Judge evaluations will now use Claude instead of GPT-4o
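As a rough illustration of how a test runner might consume these two flags, here is a minimal sketch using Python's `argparse`. The function names, the `gpt-4o` / `openai` defaults, and the list of accepted providers are assumptions made for this example; the diff does not show how the project actually parses the flags.

```python
import argparse
from typing import List, Tuple


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for a hypothetical evaluation runner."""
    parser = argparse.ArgumentParser(description="Hypothetical judge-evaluation runner")
    # Assumed default: the commit message says the judge previously used GPT-4o.
    parser.add_argument(
        "--judge-model",
        default="gpt-4o",
        help="Model identifier used by the judge evaluator",
    )
    # Assumed provider names; only 'anthropic' appears in the commit message.
    parser.add_argument(
        "--judge-provider",
        choices=["openai", "anthropic"],
        default="openai",
        help="API provider that serves the judge model",
    )
    return parser


def parse_judge_args(argv: List[str]) -> Tuple[str, str]:
    """Return (provider, model) parsed from a list of command-line arguments."""
    args = build_parser().parse_args(argv)
    return args.judge_provider, args.judge_model


if __name__ == "__main__":
    # The two flags added by this commit.
    provider, model = parse_judge_args(
        ["--judge-model", "claude-3-5-sonnet-20241022", "--judge-provider", "anthropic"]
    )
    print(f"judge provider={provider} model={model}")
```

Passing both flags together keeps the provider and the model identifier in sync; supplying only one of them would, under these assumed defaults, pair a Claude model with the OpenAI provider or the reverse.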
Benefits:
- Avoids OpenAI API quota limits
- Uses available Anthropic credits
- Comparable evaluation quality expected (both are frontier models)
1 parent 4031579 commit 1c6560f
1 file changed: +2 -0 lines changed

| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 294 | 294 | |
| 295 | 295 | |
| 296 | 296 | |
| | 297 | + |
| | 298 | + |
| 297 | 299 | |
| 298 | 300 | |
| 299 | 301 | |