[PB] Add BasicAgent solver#89

Merged
thesofakillers merged 1 commit into main from pb/basicagent-solver on Nov 28, 2025

@thesofakillers thesofakillers commented Nov 28, 2025

This PR creates a PythonCodingSolver port of our plonked AISI Basic Agent in both its original and iterative variants.

To be clear, this ports the AISI Basic Agent logic into a proper solver as expected by nanoeval: API calls happen on the host, we don't do any plonking, and there is no agent image.

The only part missing is the web browser tool. For this we have opted to rely on the browser tool usually afforded by APIs, e.g. the Responses API's web_search_preview tool. Unfortunately, this means you can't evaluate e.g. o1 with web searching, since it doesn't support that tool.
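
As a rough illustration of that limitation, here is a minimal sketch of how a caller might guard the tool when building a Responses API request. It is not the code in this PR: `build_responses_kwargs` and `MODELS_WITHOUT_WEB_SEARCH` are hypothetical names, the denylist contains only the o1 case mentioned above, and `gpt-4o` in the usage note is just an example model.

```python
# Hypothetical denylist for illustration; only the "o1" entry comes from
# the PR description. A real solver would need its own source of truth.
MODELS_WITHOUT_WEB_SEARCH = frozenset({"o1"})


def build_responses_kwargs(model: str, prompt: str, web_search: bool = False) -> dict:
    """Build keyword arguments for a Responses API call (hypothetical helper).

    Raises ValueError when web search is requested for a model that is
    listed as not supporting the web_search_preview tool.
    """
    if web_search and model in MODELS_WITHOUT_WEB_SEARCH:
        raise ValueError(f"{model} does not support the web_search_preview tool")

    kwargs = {"model": model, "input": prompt}
    if web_search:
        # Delegate browsing to the API-provided tool instead of a custom browser.
        kwargs["tools"] = [{"type": "web_search_preview"}]
    return kwargs
```

With the OpenAI Python SDK, the result could then be passed along as `client.responses.create(**build_responses_kwargs("gpt-4o", "some prompt", web_search=True))`, while the same call with `"o1"` would fail fast on the host before any request is sent.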

You'll find the port under paperbench/agents/basicagent/.

This PR does not yet delete paperbench/agents/aisi-basic-agent/, since deleting many files would make the Files changed tab clunky and sluggish. It will soon be followed by a handful of other PRs removing stale code and cleaning up paperbench.

We have deleted ExternalPythonCodingSolver which was the solver we used for plonking.

Some other notes worth pointing out:

  • We had an agents/ and a solvers/ folder. I have consolidated these into a single agents/ folder. We will move these back into a solvers/ folder in a dedicated future PR. I know this sounds silly, but bear with me.
  • The PR also fixes a bunch of bugs in turn completers.

Expect a few follow-up PRs next week.

@thesofakillers thesofakillers merged commit 066fa38 into main Nov 28, 2025
5 checks passed
@thesofakillers thesofakillers deleted the pb/basicagent-solver branch December 1, 2025 15:30