Fix test_maze assertion (9, not 13) and handle exit 127 in test runner
- Fix test_maze in both Python and TypeScript A* examples: the shortest
path through the maze is 9 steps, not 13. All agents found the correct
answer but the test was wrong.
- Detect exit 127 (command not found) in test runner and mark as skipped
instead of failed — prevents false penalties in Copeland scoring
- Hard-error in preflight when test command doesn't exist (exit 127)
instead of just warning — saves API tokens
- Skip tests criterion in Copeland pairwise comparison when both agents
have skipped tests
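
For illustration, the exit-127 handling and the skipped-tests rule described above might look roughly like the sketch below. This is not the runner's actual code: `Outcome`, `classify_exit`, `run_tests`, and `active_criteria` are hypothetical names, and the criterion label `"tests"` is an assumption. The only outside fact relied on is that a POSIX shell returns exit status 127 when a command is not found.

```python
import subprocess
from enum import Enum
from typing import List, Optional


class Outcome(Enum):
    """Hypothetical result states for one agent's test run."""
    PASSED = "passed"
    FAILED = "failed"
    SKIPPED = "skipped"


# POSIX shells report 127 when the command could not be found.
COMMAND_NOT_FOUND = 127


def classify_exit(code: int) -> Outcome:
    """Map a test command's exit code to an outcome.

    Exit 127 means the test command itself is missing, so the run is
    marked as skipped rather than failed.
    """
    if code == 0:
        return Outcome.PASSED
    if code == COMMAND_NOT_FOUND:
        return Outcome.SKIPPED
    return Outcome.FAILED


def run_tests(command: str, cwd: Optional[str] = None) -> Outcome:
    """Run the test command through the shell and classify its exit code."""
    result = subprocess.run(command, shell=True, cwd=cwd, capture_output=True)
    return classify_exit(result.returncode)


def active_criteria(criteria: List[str], a: Outcome, b: Outcome) -> List[str]:
    """Return the criteria to compare for one pair of agents.

    The (assumed) "tests" criterion is dropped only when both agents
    skipped their tests, so neither side is penalised for a missing tool.
    """
    if a is Outcome.SKIPPED and b is Outcome.SKIPPED:
        return [c for c in criteria if c != "tests"]
    return list(criteria)
```

With this shape, a pair where only one agent skipped tests is still compared on the tests criterion; whether that matches the real scorer is not stated in this commit.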
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>