
🔬 Nano-scientist

Nano. Lean. Four loops, one paper.


An autonomous research agent that turns a topic into a peer-reviewed technical report — within a dollar budget you set.

Built on PocketFlow and inspired by karpathy/autoresearch: fix the budget, run the loops, and let the agent figure out the rest.


✨ Why Nano-scientist

- Budget-first control: you set the dollar limit; the agent adapts its depth automatically.
- Four autonomous loops: Literature → Experimentation → Writing → Compiling, each self-terminating on a quality gate or the budget.
- Tight architecture: ~4 source files, 7 nodes, no central planner.
- Research-to-PDF pipeline: produces LaTeX, BibTeX, artifacts, and the final PDF in one run.

🧪 Showcases

Sample reports generated by Nano-scientist at --budget 1:

| Topic | Report |
| --- | --- |
| What techniques effectively bridge the performance gap between small language models and LLMs regarding resolved rates in automated bug fixing? | report.pdf |
| Challenge taxonomy for the Lean 4 theorem prover: clustering, labeling, and trend analysis across GitHub issues. | report.pdf |
| Comparative analysis of five AI coding agents across 933k pull requests (AIDev). | report.pdf |

🧠 How it works

flowchart TD
    I([Initializer\nzero LLM calls]) -->|literature| LIT

    subgraph LIT_LOOP["  Literature Loop  "]
        LIT[LiteratureReviewLoop\ndecide → skill → quality gate]
        LIT -->|"next iter"| LIT
    end

    LIT -->|"goal met · budget low"| EXP

    subgraph EXP_LOOP["  Experiment Loop  "]
        EXP[ExperimentationLoop\ndecide → skill → quality gate]
        EXP -->|"next iter"| EXP
    end

    EXP -->|"goal met · budget low"| WR

    subgraph WRITE_LOOP["  Writing Loop  "]
        WR[WritingLoop\nwrite sections → review pass → fix]
    end

    WR -->|compile| CT

    subgraph COMPILE["  Compiling Loop  "]
        CT[CompileTeX\npdflatex + bibtex]
        FT[FixTeX\npatch errors]
        CT -->|fix| FT
        FT -->|compile| CT
    end

    CT -->|done| F([Finisher\ncost_log · summary])
    FT -->|done| F

Stage breakdown

| Stage | What happens |
| --- | --- |
| Initializer | Sizes the run to the budget and sets up outputs/<uuid>/; makes zero LLM calls |
| LiteratureReviewLoop | Each iteration: the LLM picks skill\|done → executes the skill → a quality gate checks the goal; exits on goal met or budget low |
| ExperimentationLoop | Same autonomous loop pattern over the experiment skills; exits on goal met or budget low |
| WritingLoop | Writes all required sections, runs a peer-review pass, addresses major comments, assembles the .tex |
| CompilingLoop | pdflatex + bibtex; on errors or undefined citations, FixTeX patches the source and recompiles (up to 2 fix attempts) |
| Finisher | Writes cost_log.json + summary.json and prints the total cost |
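
The two research loops share a single control pattern. Below is a minimal sketch of that decide → skill → quality-gate cycle; the names (pick_action, run_skill, goal_met, remaining_calls) are illustrative stand-ins, not the actual helpers in src/nodes.py. The defaults mirror MAX_LOOP_ITERATIONS and MIN_CALLS_TO_CONTINUE:

def run_loop(pick_action, run_skill, goal_met, remaining_calls,
             max_iters=20, min_calls=3):
    """Hypothetical sketch of the shared decide -> skill -> quality-gate loop."""
    for _ in range(max_iters):
        action = pick_action()        # LLM chooses a skill id or "done"
        if action == "done":
            return "goal met"
        run_skill(action)             # execute the chosen skill
        if goal_met():                # quality gate: LLM judges whether the goal is satisfied
            return "goal met"
        if remaining_calls() < min_calls:
            return "budget low"       # hand off to the next stage
    return "max iterations reached"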

🚀 Quickstart

# 1) Clone
git clone https://github.com/AI4Scientist/nano-scientist
cd nano-scientist

# 2) Install dependencies
pip install -r requirements.txt

# 3) Add API keys
cp .env.example .env
# edit .env (minimum: OPENROUTER_API_KEY)

# 4) Run
# Option A — pass a topic string directly
python main.py "CRISPR off-target effects in primary T cells" --budget 1.00

# Option B — pass a research proposal .md file
python main.py proposal.md --budget 1.00

Output lands in outputs/<uuid>/:

outputs/
└── <uuid>/
    ├── report.tex         # assembled LaTeX source
    ├── report.pdf         # final PDF (if pdflatex installed)
    ├── references.bib     # deduplicated BibTeX
    ├── artifacts/         # per-skill markdown outputs
    ├── figures/           # generated plots / images
    ├── data/              # collected CSV / JSON data
    ├── scripts/           # executed code blocks
    ├── traj.txt           # full stdout trace of the run
    ├── history.json       # step-by-step execution log
    ├── cost_log.json      # per-step token costs
    └── summary.json       # final run summary

🖥️ CLI reference

python main.py [topic] [options]

Arguments:
  topic                 Research topic — either a plain string or a path to a
                        .md file whose content is read as the topic.
                        Optional when using --list-skills.

Options:
  -b, --budget FLOAT    Spend limit in USD  (default: $5.00)
  -o, --output DIR      Output directory    (default: outputs/)
  -e, --env FILE        Path to .env file   (default: .env)
  --list-skills         Print available skills and exit

Examples:

# Topic as a string
python main.py "CRISPR off-target effects in primary T cells" --budget 1.00

# Topic from a research proposal file
python main.py proposal.md --budget 1.00

# List available skills
python main.py --list-skills

Budget

Every run targets a full 8-section paper. The budget controls depth, not report type — more budget means more skill calls, more citations, and more revision rounds. Loops terminate when estimated remaining LLM calls drop below a threshold (not a fixed dollar floor), so the agent always spends as much as it can usefully spend.
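
As a hedged sketch of that stopping rule (illustrative names only; the real cost tracking lives in src/utils.py), the estimate divides the remaining budget by the recent average cost per call:

def estimated_remaining_calls(budget_usd, spent_usd, recent_call_costs):
    """Illustrative stopping rule: how many more LLM calls the budget buys."""
    if not recent_call_costs:
        return float("inf")   # no cost history yet, so never stop early
    avg_cost = sum(recent_call_costs) / len(recent_call_costs)
    return (budget_usd - spent_usd) / avg_cost

# A loop exits once this falls below MIN_CALLS_TO_CONTINUE (default: 3).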


🧩 Skills

Each skill is a folder under skills/ with SKILL.md (lazy-loaded at runtime). Skills with allowed-tools: Bash get a real tool-calling loop with bash execution and error feedback.
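
For allowed-tools: Bash skills, the tool loop behaves roughly like this sketch (call_llm and the message format are assumptions, not the real client in src/utils.py); the round cap mirrors MAX_TOOL_ROUNDS:

import subprocess

def bash_tool_loop(call_llm, instructions, max_rounds=16):
    """Hypothetical loop: run LLM-proposed shell commands, feed output back."""
    messages = [{"role": "user", "content": instructions}]
    for _ in range(max_rounds):
        reply = call_llm(messages)  # assumed to return a shell command or "DONE"
        if reply.strip() == "DONE":
            break
        result = subprocess.run(reply, shell=True, capture_output=True, text=True)
        feedback = result.stdout if result.returncode == 0 else result.stderr
        messages.append({"role": "assistant", "content": reply})
        messages.append({"role": "user", "content": feedback})  # error feedback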

| Skill | What it produces |
| --- | --- |
| academic-slides | Academic slide decks and conference talks: narrative arc, slide structure, visual hierarchy, .pptx generation |
| evo-memory | Persistent research memory across cycles: Ideation Memory and Experimentation Memory via IDE/IVE/ESE evolution |
| experiment-craft | Debugging and iteration on existing experiments: 5-step diagnostic flow, structured experiment logging |
| experiment-iterative-coder | Iterative code refinement via plan→code→evaluate→refine cycles with lint/test scoring |
| experiment-pipeline | Structured 4-stage experiment execution: baseline, hyperparameter tuning, proposed method, ablation study |
| paper-navigator | Find and read academic papers: keyword search, citation traversal, arXiv monitoring, SOTA lookup |
| paper-planning | Pre-writing paper planning: story design, experiment planning, figure design, 4-week timeline |
| paper-rebuttal | Peer-review rebuttals: score diagnosis, comment prioritization, champion strategy, 18 tactical writing rules |
| paper-review | Self-review before submission: 5-aspect checklist, adversarial stress-testing, figure/table quality checks |
| paper-writing | Academic paper sections: 11-step workflow with LaTeX templates and section-by-section guidance |
| research-ideation | End-to-end ideation: literature grounding, multi-persona generation, ELO ranking, proposal expansion |
| research-survey | Structured literature survey reports: outline generation, draft, section expansion, final assembly |
| study-workflow | Publication-quality research workflow diagram (Research + Writing swim-lanes) as a PNG via gpt-5.4-image-2 |

Add a skill

1. Create skills/my-skill/SKILL.md with YAML frontmatter:

---
id: my-skill
description: One-line description shown in the planner.
allowed-tools: Bash        # grants bash tool-calling with error feedback
required-keys: [HF_TOKEN]  # optional; skill is filtered out if missing
---

Your skill instructions here.

2. Add it to skills/skills.json:

{ "id": "my-skill", "description": "One-line description shown in the planner." }

🔐 Environment variables

| Variable | Required | Used for |
| --- | --- | --- |
| OPENROUTER_API_KEY | Required | Core LLM inference (all nodes) |
| HF_TOKEN | Skill-gated | Skills that access the Hugging Face Hub |
| GITHUB_TOKEN | Skill-gated | Skills that query GitHub repos/issues |
| OPENAI_API_KEY | Skill-gated | Skills that use OpenAI-compatible endpoints |

Only OPENROUTER_API_KEY is strictly required. Missing skill keys automatically filter out dependent skills.
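
That filtering step can be as small as this sketch (available_skills is a hypothetical name; it assumes each skill's metadata carries the required-keys list from its frontmatter):

import os

def available_skills(skills):
    """Keep only skills whose required keys are all set in the environment."""
    return [
        s for s in skills
        if all(os.environ.get(key) for key in s.get("required-keys", []))
    ]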

Optional tuning variables:

| Variable | Default | Purpose |
| --- | --- | --- |
| MODEL_NAME | | Override the inference model |
| INFERENCE_BASE_URL | | Point to a custom OpenAI-compatible endpoint |
| INPUT_TOKEN_COST_PER_MILLION | | Estimate remaining LLM calls |
| OUTPUT_TOKEN_COST_PER_MILLION | | Estimate remaining LLM calls |
| LOOKBACK | 3 | History steps visible per LLM call |
| MAX_REVIEW_ROUNDS | 1 | Writing review/revision passes |
| MAX_TOOL_ROUNDS | 16 | Max bash tool-calling rounds per skill execution |
| MAX_LOOP_ITERATIONS | 20 | Max iterations per research loop |
| MIN_CALLS_TO_CONTINUE | 3 | Stop a loop when estimated remaining calls falls below this |
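
The two token-cost variables feed a simple per-call price estimate, sketched below (illustrative arithmetic only, not the exact code in src/utils.py):

import os

def call_cost_usd(input_tokens, output_tokens):
    """Price one LLM call from the per-million-token rates in the environment."""
    in_rate = float(os.environ.get("INPUT_TOKEN_COST_PER_MILLION", "0"))
    out_rate = float(os.environ.get("OUTPUT_TOKEN_COST_PER_MILLION", "0"))
    return input_tokens * in_rate / 1e6 + output_tokens * out_rate / 1e6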

🗂️ Project layout

nano-scientist/
├── main.py              # CLI entry point
├── src/
│   ├── flow.py          # PocketFlow wiring (3 loops + compile/fix)
│   ├── nodes.py         # 7 nodes + loop helpers (_run_loop, _quality_gate, _write_section …)
│   └── utils.py         # LLM client, cost tracking, BibTeX utils
├── skills/              # 13 modular research skills
│   ├── skills.json      # skill index (id + description)
│   └── <skill-name>/
│       └── SKILL.md     # instructions + optional YAML frontmatter
├── outputs/             # generated reports (git-ignored)
└── .env                 # API keys (git-ignored)

📌 Citation

If you use Nano-scientist in your research, please cite:

@software{nano_scientist2026,
  title  = {Nano-scientist: Autonomous Research Agent for Budget-Constrained Scientific Reports},
  author = {{AI4Scientist Team}},
  year   = {2026},
  url    = {https://github.com/AI4Scientist/nano-scientist}
}
