[PB] Clean up grade function flow #38

Merged
thesofakillers merged 1 commit into main from pb/release-cleanup-grading
Jul 17, 2025


@thesofakillers thesofakillers commented Jul 17, 2025

The goal was to make paperbench.nano.task.PBTask:grade() as clean and minimal as possible, ideally so that you can read it like natural language and know what's going on.

  • Renamed various variables and updated docstrings and comments
  • Removed a misleading comment
  • Abstracted pieces of logic into more readable helper functions
  • Kept function-specific logs inside the functions themselves
  • Some _run_reproduce-specific cleanup:
    • We were re-selecting the checkpoint and re-checking that it exists in run_reproduce, despite having just done that in grade, so this redundant logic was removed
    • We now mirror the shape of how we call judge, i.e.
      • start the computer inside the function, so it is only running while it is needed
      • use a _should_reproduce helper function to handle the conditionals

@thesofakillers thesofakillers merged commit 8b2e75a into main Jul 17, 2025
5 checks passed
@thesofakillers thesofakillers deleted the pb/release-cleanup-grading branch July 18, 2025 11:10