Commit 4e330cd
Redesign eval results page as public-facing benchmark showcase
Replace the internal eval review tool UI with a clean, readable
public-facing benchmark results page. Preserve all 36 eval run outputs in
expandable accordions, and add summary stat cards, a per-skill results
table with delta highlighting, and a methodology section.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
1 parent 8835c55, commit 4e330cd
1 file changed, 16783 insertions(+), 1197 deletions(-)