
🔬 Nano-scientist

Nano. Lean. Four loops, one paper.


An autonomous research agent that turns a topic into a peer-reviewed technical report — within a dollar budget you set.

Built on PocketFlow and inspired by karpathy/autoresearch: fix the budget, run the loops, and let the agent figure out the rest.


✨ Why Nano-scientist

- Budget-first control: you set the dollar limit; the agent adapts its depth automatically.
- Four autonomous loops: Literature → Experimentation → Writing → Compiling, each self-terminating on a quality gate or the budget.
- Tight architecture: ~4 source files, 7 nodes, no central planner.
- Research-to-PDF pipeline: produces LaTeX, BibTeX, artifacts, and the final PDF in one run.

🧪 Showcases

Sample reports generated by Nano-scientist at --budget 1:

| Topic | Report |
| --- | --- |
| What techniques effectively bridge the performance gap between small language models and LLMs regarding resolved rates in automated bug fixing? | report.pdf |
| Challenge taxonomy for the Lean 4 theorem prover: clustering, labeling, and trend analysis across GitHub issues. | report.pdf |
| Comparative analysis of five AI coding agents across 933k pull requests (AIDev). | report.pdf |

🧠 How it works

flowchart TD
    I([Initializer\nzero LLM calls]) -->|literature| LIT

    subgraph LIT_LOOP["  Literature Loop  "]
        LIT[LiteratureReviewLoop\ndecide → skill → quality gate]
        LIT -->|"next iter"| LIT
    end

    LIT -->|"goal met · budget low"| EXP

    subgraph EXP_LOOP["  Experiment Loop  "]
        EXP[ExperimentationLoop\ndecide → skill → quality gate]
        EXP -->|"next iter"| EXP
    end

    EXP -->|"goal met · budget low"| WR

    subgraph WRITE_LOOP["  Writing Loop  "]
        WR[WritingLoop\nwrite sections → review pass → fix]
    end

    WR -->|compile| CT

    subgraph COMPILE["  Compiling Loop  "]
        CT[CompileTeX\npdflatex + bibtex]
        FT[FixTeX\npatch errors]
        CT -->|fix| FT
        FT -->|compile| CT
    end

    CT -->|done| F([Finisher\ncost_log · summary])
    FT -->|done| F

Stage breakdown

| Stage | What happens |
| --- | --- |
| Initializer | Sizes the run to the budget and sets up outputs/<uuid>/; makes zero LLM calls |
| LiteratureReviewLoop | Each iteration: the LLM picks skill\|done → executes the skill → a quality gate checks the goal; exits on goal met or budget low |
| ExperimentationLoop | Same autonomous loop pattern over the experiment skills; exits on goal met or budget low |
| WritingLoop | Writes all required sections, runs a peer-review pass, addresses major comments, assembles the .tex |
| CompilingLoop | pdflatex + bibtex; on errors or undefined citations, FixTeX patches the source and recompiles (up to 2 fix attempts) |
| Finisher | Writes cost_log.json + summary.json and prints the total cost |
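
The two research loops share a single control pattern. Below is a minimal sketch of that decide → skill → quality-gate cycle; the names (pick_action, run_skill, goal_met, remaining_calls) are illustrative stand-ins, not the actual helpers in src/nodes.py. The defaults mirror MAX_LOOP_ITERATIONS and MIN_CALLS_TO_CONTINUE:

def run_loop(pick_action, run_skill, goal_met, remaining_calls,
             max_iters=20, min_calls=3):
    """Hypothetical sketch of the shared decide -> skill -> quality-gate loop."""
    for _ in range(max_iters):
        action = pick_action()        # LLM chooses a skill id or "done"
        if action == "done":
            return "goal met"
        run_skill(action)             # execute the chosen skill
        if goal_met():                # quality gate: LLM judges whether the goal is satisfied
            return "goal met"
        if remaining_calls() < min_calls:
            return "budget low"       # hand off to the next stage
    return "max iterations reached"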

🚀 Quickstart

# 1) Clone
git clone https://github.com/AI4Scientist/nano-scientist
cd nano-scientist

# 2) Install dependencies
pip install -r requirements.txt

# 3) Add API keys
cp .env.example .env
# edit .env (minimum: OPENROUTER_API_KEY)

# 4) Run
# Option A — pass a topic string directly
python main.py "CRISPR off-target effects in primary T cells" --budget 1.00

# Option B — pass a research proposal .md file
python main.py proposal.md --budget 1.00

Output lands in outputs/<uuid>/:

outputs/
└── <uuid>/
    ├── report.tex         # assembled LaTeX source
    ├── report.pdf         # final PDF (if pdflatex installed)
    ├── references.bib     # deduplicated BibTeX
    ├── artifacts/         # per-skill markdown outputs
    ├── figures/           # generated plots / images
    ├── data/              # collected CSV / JSON data
    ├── scripts/           # executed code blocks
    ├── traj.txt           # full stdout trace of the run
    ├── history.json       # step-by-step execution log
    ├── cost_log.json      # per-step token costs
    └── summary.json       # final run summary

🖥️ CLI reference

python main.py [topic] [options]

Arguments:
  topic                 Research topic — either a plain string or a path to a
                        .md file whose content is read as the topic.
                        Optional when using --list-skills.

Options:
  -b, --budget FLOAT    Spend limit in USD  (default: $5.00)
  -o, --output DIR      Output directory    (default: outputs/)
  -e, --env FILE        Path to .env file   (default: .env)
  --list-skills         Print available skills and exit

Examples:

# Topic as a string
python main.py "CRISPR off-target effects in primary T cells" --budget 1.00

# Topic from a research proposal file
python main.py proposal.md --budget 1.00

# List available skills
python main.py --list-skills

Budget

Every run targets a full 8-section paper. The budget controls depth, not report type — more budget means more skill calls, more citations, and more revision rounds. Loops terminate when estimated remaining LLM calls drop below a threshold (not a fixed dollar floor), so the agent always spends as much as it can usefully spend.
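
As a hedged sketch of that stopping rule (illustrative names only; the real cost tracking lives in src/utils.py), the estimate divides the remaining budget by the recent average cost per call:

def estimated_remaining_calls(budget_usd, spent_usd, recent_call_costs):
    """Illustrative stopping rule: how many more LLM calls the budget buys."""
    if not recent_call_costs:
        return float("inf")   # no cost history yet, so never stop early
    avg_cost = sum(recent_call_costs) / len(recent_call_costs)
    return (budget_usd - spent_usd) / avg_cost

# A loop exits once this falls below MIN_CALLS_TO_CONTINUE (default: 3).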


🧩 Skills

Each skill is a folder under skills/ with SKILL.md (lazy-loaded at runtime). Skills with allowed-tools: Bash get a real tool-calling loop with bash execution and error feedback.
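
For allowed-tools: Bash skills, the tool loop behaves roughly like this sketch (call_llm and the message format are assumptions, not the real client in src/utils.py); the round cap mirrors MAX_TOOL_ROUNDS:

import subprocess

def bash_tool_loop(call_llm, instructions, max_rounds=16):
    """Hypothetical loop: run LLM-proposed shell commands, feed output back."""
    messages = [{"role": "user", "content": instructions}]
    for _ in range(max_rounds):
        reply = call_llm(messages)  # assumed to return a shell command or "DONE"
        if reply.strip() == "DONE":
            break
        result = subprocess.run(reply, shell=True, capture_output=True, text=True)
        feedback = result.stdout if result.returncode == 0 else result.stderr
        messages.append({"role": "assistant", "content": reply})
        messages.append({"role": "user", "content": feedback})  # error feedback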

| Skill | What it produces |
| --- | --- |
| academic-slides | Academic slide decks and conference talks: narrative arc, slide structure, visual hierarchy, .pptx generation |
| evo-memory | Persistent research memory across cycles: Ideation Memory and Experimentation Memory via IDE/IVE/ESE evolution |
| experiment-craft | Debugging and iteration on existing experiments: 5-step diagnostic flow, structured experiment logging |
| experiment-iterative-coder | Iterative code refinement via plan→code→evaluate→refine cycles with lint/test scoring |
| experiment-pipeline | Structured 4-stage experiment execution: baseline, hyperparameter tuning, proposed method, ablation study |
| paper-navigator | Find and read academic papers: keyword search, citation traversal, arXiv monitoring, SOTA lookup |
| paper-planning | Pre-writing paper planning: story design, experiment planning, figure design, 4-week timeline |
| paper-rebuttal | Peer-review rebuttals: score diagnosis, comment prioritization, champion strategy, 18 tactical writing rules |
| paper-review | Self-review before submission: 5-aspect checklist, adversarial stress-testing, figure/table quality checks |
| paper-writing | Academic paper sections: 11-step workflow with LaTeX templates and section-by-section guidance |
| research-ideation | End-to-end ideation: literature grounding, multi-persona generation, ELO ranking, proposal expansion |
| research-survey | Structured literature survey reports: outline generation, draft, section expansion, final assembly |
| study-workflow | Publication-quality research workflow diagram (Research + Writing swim-lanes) as a PNG via gpt-5.4-image-2 |

Add a skill

1. Create skills/my-skill/SKILL.md with YAML frontmatter:

---
id: my-skill
description: One-line description shown in the planner.
allowed-tools: Bash        # grants bash tool-calling with error feedback
required-keys: [HF_TOKEN]  # optional; skill is filtered out if missing
---

Your skill instructions here.

2. Add it to skills/skills.json:

{ "id": "my-skill", "description": "One-line description shown in the planner." }

🔐 Environment variables

| Variable | Required | Used for |
| --- | --- | --- |
| OPENROUTER_API_KEY | Required | Core LLM inference (all nodes) |
| HF_TOKEN | Skill-gated | Skills that access the Hugging Face Hub |
| GITHUB_TOKEN | Skill-gated | Skills that query GitHub repos/issues |
| OPENAI_API_KEY | Skill-gated | Skills that use OpenAI-compatible endpoints |

Only OPENROUTER_API_KEY is strictly required. Missing skill keys automatically filter out dependent skills.
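
That filtering step can be as small as this sketch (available_skills is a hypothetical name; it assumes each skill's metadata carries the required-keys list from its frontmatter):

import os

def available_skills(skills):
    """Keep only skills whose required keys are all set in the environment."""
    return [
        s for s in skills
        if all(os.environ.get(key) for key in s.get("required-keys", []))
    ]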

Optional tuning variables:

| Variable | Default | Purpose |
| --- | --- | --- |
| MODEL_NAME | | Override the inference model |
| INFERENCE_BASE_URL | | Point to a custom OpenAI-compatible endpoint |
| INPUT_TOKEN_COST_PER_MILLION | | Estimate remaining LLM calls |
| OUTPUT_TOKEN_COST_PER_MILLION | | Estimate remaining LLM calls |
| LOOKBACK | 3 | History steps visible per LLM call |
| MAX_REVIEW_ROUNDS | 1 | Writing review/revision passes |
| MAX_TOOL_ROUNDS | 16 | Max bash tool-calling rounds per skill execution |
| MAX_LOOP_ITERATIONS | 20 | Max iterations per research loop |
| MIN_CALLS_TO_CONTINUE | 3 | Stop a loop when estimated remaining calls falls below this |
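
The two token-cost variables feed a simple per-call price estimate, sketched below (illustrative arithmetic only, not the exact code in src/utils.py):

import os

def call_cost_usd(input_tokens, output_tokens):
    """Price one LLM call from the per-million-token rates in the environment."""
    in_rate = float(os.environ.get("INPUT_TOKEN_COST_PER_MILLION", "0"))
    out_rate = float(os.environ.get("OUTPUT_TOKEN_COST_PER_MILLION", "0"))
    return input_tokens * in_rate / 1e6 + output_tokens * out_rate / 1e6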

🗂️ Project layout

nano-scientist/
├── main.py              # CLI entry point
├── src/
│   ├── flow.py          # PocketFlow wiring (3 loops + compile/fix)
│   ├── nodes.py         # 7 nodes + loop helpers (_run_loop, _quality_gate, _write_section …)
│   └── utils.py         # LLM client, cost tracking, BibTeX utils
├── skills/              # 13 modular research skills
│   ├── skills.json      # skill index (id + description)
│   └── <skill-name>/
│       └── SKILL.md     # instructions + optional YAML frontmatter
├── outputs/             # generated reports (git-ignored)
└── .env                 # API keys (git-ignored)

📌 Citation

If you use Nano-scientist in your research, please cite:

@software{nano_scientist2026,
  title  = {Nano-scientist: Autonomous Research Agent for Budget-Constrained Scientific Reports},
  author = {{AI4Scientist Team}},
  year   = {2026},
  url    = {https://github.com/AI4Scientist/nano-scientist}
}
