Commit 681ab32

Dashboard style overhaul + expanded game-completion eval and stopping criteria
Dashboard:
- New `pawn/dashboard/theme.py` centralizes color palette, layout, and styling so charts and the Solara shell share one coherent system.
- charts.py refactored to pull from the theme; titles bolded; log-scale error_rate_chart added with optional log-linear fit overlays (desaturated dashed lines, half-life shown in legend).
- sol.py reorganized into sections, with the Game Integrity section pairing the error rate chart against the patience chart.
- val_accuracy_chart trimmed to Top-1/Top-5 (legal/late-legal moved to the error rate chart so accuracy scale is readable).

Game completion eval:
- Fully vectorized via `_game_completion_chunk` + `_aggregate_game_completion`: no Python per-game loop, processes the full val set in batch_size chunks, peak memory independent of val_games.
- Adds min/max/median forfeit-ply statistics across games that actually forfeited (0 if none). Surfaced in val log line as `forfeit [min-max med N]`.
- Runs over the full validation set (was limited to 64 games).

Stopping criteria:
- Patience now also resets on improvements to game_completion_rate and avg_plies_completed, not just val_loss and late_legal_move_rate.
- best_game_completion and best_avg_plies_completed persisted in checkpoint state so they survive resume.
- Trainer logs `patience` and `legality_late_ply` into the training-config record so downstream consumers (dashboard) can see them.

Tests updated for the theme refactor (layer_color moved to theme module, titles wrapped in <b>...</b> for bold rendering).
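
For readers unfamiliar with the "half-life shown in legend" detail, the following is a minimal sketch of how a half-life can be derived from a log-linear fit of an error rate. It is not the code in charts.py: the function name `fit_half_life` and the sample data are hypothetical, and it assumes the error rate decays roughly exponentially in the training step, so that ln(error_rate) is linear in the step and the half-life is ln(2) divided by the negated slope.

```python
import math


def fit_half_life(steps, error_rates):
    """Least-squares fit of ln(error_rate) = intercept + slope * step.

    Hypothetical helper for illustration only. Returns
    (intercept, slope, half_life), where half_life is the number of steps
    over which the fitted error rate halves (infinite if it is not decaying).
    """
    # Zero error rates cannot be shown on a log scale, so skip them.
    pts = [(s, math.log(e)) for s, e in zip(steps, error_rates) if e > 0]
    if len(pts) < 2:
        raise ValueError("need at least two positive error rates to fit")

    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    if sxx == 0:
        raise ValueError("all steps are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    half_life = math.log(2) / -slope if slope < 0 else math.inf
    return intercept, slope, half_life


# Made-up data: the error rate halves every 1000 steps.
steps = [0, 1000, 2000, 3000]
rates = [0.08, 0.04, 0.02, 0.01]
intercept, slope, half_life = fit_half_life(steps, rates)
```

On a log-scale y-axis the fitted curve `exp(intercept + slope * step)` renders as a straight line, which is why such a fit reads naturally as a dashed overlay on the error rate chart.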
1 parent e73f6ff commit 681ab32
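
The stopping-criteria change in the commit message can be pictured with the sketch below. It is an illustration under assumptions, not the trainer's actual code: the `PatienceTracker` class, its `update` method, and the `TRACKED` table are hypothetical names; only the four metric names come from the commit message. The idea shown is that an improvement in any tracked metric resets the patience counter, and the per-metric bests live in a plain dict so they could be saved with a checkpoint.

```python
from dataclasses import dataclass, field

# Metric name -> True if higher is better. The metric names are taken from
# the commit message; the table itself is a hypothetical illustration.
TRACKED = {
    "val_loss": False,
    "late_legal_move_rate": True,
    "game_completion_rate": True,
    "avg_plies_completed": True,
}


@dataclass
class PatienceTracker:
    """Hypothetical early-stopping helper: any improvement resets patience."""

    patience: int
    best: dict = field(default_factory=dict)  # could be stored in a checkpoint
    evals_without_improvement: int = 0

    def update(self, metrics: dict) -> bool:
        """Record one evaluation; return True when training should stop."""
        improved = False
        for name, higher_is_better in TRACKED.items():
            if name not in metrics:
                continue
            value = metrics[name]
            prev = self.best.get(name)
            better = prev is None or (
                value > prev if higher_is_better else value < prev
            )
            if better:
                self.best[name] = value
                improved = True
        if improved:
            self.evals_without_improvement = 0
        else:
            self.evals_without_improvement += 1
        return self.evals_without_improvement >= self.patience


tracker = PatienceTracker(patience=2)
history = [
    {"val_loss": 1.0, "game_completion_rate": 0.10},
    {"val_loss": 1.1, "game_completion_rate": 0.20},  # completion improved
    {"val_loss": 1.2, "game_completion_rate": 0.20},  # nothing improved
    {"val_loss": 1.2, "game_completion_rate": 0.20},  # nothing improved
]
stop_flags = [tracker.update(m) for m in history]
```

In this toy run the second evaluation has a worse val_loss but a better game_completion_rate, so the counter resets; training would stop only after two consecutive evaluations with no improvement in any tracked metric.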

5 files changed: +1418 -444 lines changed