Auto Paper Populator

Distributable Python tooling for researchers who want to:

discover new papers on a schedule
rank them against their own research interests
ingest them into Zotero
attach structured summary notes
optionally attach linked or real PDF files
keep local reports and run state
back up or clean up Zotero attachment storage

Fastest Start

This is the shortest working path for a local Zotero + local Ollama setup.

1. Clone the repo and create a local config

git clone https://github.com/ayushkanwal/autopaper-researcher.git
cd autopaper-researcher
cp examples/ollama_local.env .env.local

2. Edit `.env.local`

Set at least these values:

ZOTERO_USER_ID=your_zotero_user_id
ZOTERO_API_KEY=your_zotero_api_key
AUTOPAPER_COLLECTION_NAME=autopaper-ingested
AUTOPAPER_QUERIES_JSON=["all:\"multivariate time series\" AND (all:forecasting OR all:prediction)"]
AUTOPAPER_BOOST_TERMS_JSON=["multivariate","forecasting","retrieval"]
AUTOPAPER_PENALTY_TERMS_JSON=["stock price","cryptocurrency"]

3. Start Ollama

ollama pull qwen2.5:3b-instruct
ollama serve

4. Run a dry-run

./scripts/bootstrap_and_run.sh --env-file .env.local --dry-run

5. Run a live ingest

./scripts/bootstrap_and_run.sh --env-file .env.local

What this does:

searches arXiv, PubMed, and OpenAlex
ranks results using your preset and keyword overrides
summarizes selected papers with local Ollama
adds them to Zotero with structured notes
optionally imports real PDFs if AUTOPAPER_ATTACH_REAL_PDFS=true

Architecture

flowchart LR
    A["Preset + .env.local"] --> B["Query Builder"]
    B --> C["Search Sources"]
    C --> C1["arXiv"]
    C --> C2["PubMed"]
    C --> C3["OpenAlex"]
    C1 --> D["Normalize + Merge Duplicates"]
    C2 --> D
    C3 --> D
    D --> E["Generic Ranking Engine"]
    E --> F["Top Candidates"]
    F --> G["Zotero Dedupe Check"]
    G --> H["Summary Provider"]
    H --> H1["offline"]
    H --> H2["OpenAI-compatible / Ollama"]
    H --> H3["custom command"]
    H1 --> I["Structured Note"]
    H2 --> I
    H3 --> I
    G --> J["PDF / Code Discovery"]
    I --> K["Zotero Ingest"]
    J --> K
    K --> L["Collection Item + Note + Optional PDF"]
    K --> M["Markdown / JSON Report"]
    K --> N["SQLite State"]

Core Concepts

Search

Paper discovery uses the built-in source adapters:

arXiv
PubMed
OpenAlex

Ranking

The core ranking engine is intentionally generic.

Researcher-specific relevance should be controlled with:

built-in presets
AUTOPAPER_BOOST_TERMS_JSON
AUTOPAPER_PENALTY_TERMS_JSON
AUTOPAPER_INCLUDE_TERMS_JSON
AUTOPAPER_EXCLUDE_TERMS_JSON

Summaries

Summary generation happens after papers are found and ranked.

Supported modes:

offline
openai_compatible
command

For local Ollama, use:

AUTOPAPER_SUMMARIZER=openai_compatible
AUTOPAPER_SUMMARY_FALLBACK=offline
AUTOPAPER_LLM_BASE_URL=http://127.0.0.1:11434/v1
AUTOPAPER_LLM_API_KEY=
AUTOPAPER_LLM_MODEL=qwen2.5:3b-instruct

Zotero

Zotero is the write target in v1.

The tool can:

create collections if needed
add paper items
attach summary notes
attach PDF links
optionally import real PDF files

Install Options

Default: one-command runner

Use the bootstrap script if you want the shortest path:

./scripts/bootstrap_and_run.sh --env-file .env.local --dry-run

The script:

creates .venv
uses the source tree directly
installs the package only if needed
validates config
runs the requested command

Manual install

Use this if you want direct CLI control:

python3 -m venv .venv
source .venv/bin/activate
pip install -e .

Then run commands such as:

autopaper validate-config --env-file .env.local
autopaper run-once --env-file .env.local --dry-run
autopaper run-once --env-file .env.local

Configuration Reference

Required

ZOTERO_USER_ID
ZOTERO_API_KEY

Main research profile

AUTOPAPER_PROFILE_NAME
AUTOPAPER_QUERY_PRESET
AUTOPAPER_QUERIES_JSON
AUTOPAPER_SOURCES
AUTOPAPER_COLLECTION_NAME
AUTOPAPER_MAX_NEW
AUTOPAPER_MAX_RESULTS_PER_QUERY
AUTOPAPER_MIN_RELEVANCE_SCORE

Ranking overrides

AUTOPAPER_BOOST_TERMS_JSON
AUTOPAPER_PENALTY_TERMS_JSON
AUTOPAPER_INCLUDE_TERMS_JSON
AUTOPAPER_EXCLUDE_TERMS_JSON

Summary provider

AUTOPAPER_SUMMARIZER
AUTOPAPER_SUMMARY_FALLBACK
AUTOPAPER_LLM_BASE_URL
AUTOPAPER_LLM_API_KEY
AUTOPAPER_LLM_MODEL
AUTOPAPER_LLM_TIMEOUT_SECONDS
AUTOPAPER_SUMMARY_COMMAND

Scheduling

AUTOPAPER_SCHEDULE_CRON
AUTOPAPER_TIMEZONE
AUTOPAPER_RUN_ON_START

Output and behavior

AUTOPAPER_REPORT_DIR
AUTOPAPER_STATE_DIR
AUTOPAPER_ATTACH_REAL_PDFS
AUTOPAPER_ENABLE_GITHUB_SEARCH

Source-specific optional settings

PUBMED_EMAIL
PUBMED_API_KEY
OPENALEX_MAILTO
GITHUB_TOKEN

Built-In Presets

Current presets:

general_time_series_v1
livestock_decision_support_v1
livestock_decision_support_v2

List them from the CLI:

PYTHONPATH=src python3 -m autopaper.cli list-presets

Show exactly what your env resolves to:

PYTHONPATH=src python3 -m autopaper.cli print-effective-config --env-file .env.local

Common Commands

Validate config

./scripts/bootstrap_and_run.sh --env-file .env.local --dry-run

Or manually:

PYTHONPATH=src python3 -m autopaper.cli validate-config --env-file .env.local

One ingest pass

./scripts/bootstrap_and_run.sh --env-file .env.local

Or manually:

PYTHONPATH=src python3 -m autopaper.cli run-once --env-file .env.local

Foreground scheduler

./scripts/bootstrap_and_run.sh --env-file .env.local --daemon

Or manually:

PYTHONPATH=src python3 -m autopaper.cli daemon --env-file .env.local --run-on-start

macOS Scheduling

Generate a launchd plist:

PYTHONPATH=src python3 -m autopaper.cli write-launchd-plist --env-file .env.local --output ~/Library/LaunchAgents/com.autopaper.researcher.plist

Load it:

launchctl bootstrap gui/$(id -u) ~/Library/LaunchAgents/com.autopaper.researcher.plist
launchctl enable gui/$(id -u)/com.autopaper.researcher
launchctl kickstart -k gui/$(id -u)/com.autopaper.researcher

Unload it:

launchctl bootout gui/$(id -u) ~/Library/LaunchAgents/com.autopaper.researcher.plist

The launchd service keeps the daemon alive. The ingest schedule still comes from AUTOPAPER_SCHEDULE_CRON.

Zotero Attachment Backup and Cleanup

Back up stored attachments:

PYTHONPATH=src python3 -m autopaper.cli backup-attachments --env-file .env.local --dry-run

Back up and delete remote attachments after successful backup:

PYTHONPATH=src python3 -m autopaper.cli backup-attachments --env-file .env.local --delete-remote

Preview a My Library purge:

PYTHONPATH=src python3 -m autopaper.cli purge-attachments --env-file .env.local --dry-run

Run a live purge of My Library file attachments:

PYTHONPATH=src python3 -m autopaper.cli purge-attachments --env-file .env.local --confirm-my-library

Important:

purge-attachments is intentionally My Library only
it deletes matching remote attachment items
it attempts to empty My Library trash
it removes matching local Zotero storage directories

Example Files

.env.example
examples/researcher_template.env
examples/livestock_decision_support.env
examples/ollama_local.env

Repository Layout

src/autopaper/      package source
scripts/            wrapper scripts
examples/           env templates
.env.example        starter config
pyproject.toml      package metadata

Notes

Do not commit real .env.local files or API keys.
Reports and state are written locally and are ignored by .gitignore.
This repo intentionally excludes private reports, backups, outputs, and local machine artifacts.

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
examples		examples
scripts		scripts
src/autopaper		src/autopaper
.env.example		.env.example
.gitignore		.gitignore
README.md		README.md
pyproject.toml		pyproject.toml

Folders and files

Latest commit

History

Repository files navigation

Auto Paper Populator

Fastest Start

1. Clone the repo and create a local config

2. Edit .env.local

3. Start Ollama

4. Run a dry-run

5. Run a live ingest

Architecture

Core Concepts

Search

Ranking

Summaries

Zotero

Install Options

Default: one-command runner

Manual install

Configuration Reference

Required

Main research profile

Ranking overrides

Summary provider

Scheduling

Output and behavior

Source-specific optional settings

Built-In Presets

Common Commands

Validate config

One ingest pass

Foreground scheduler

macOS Scheduling

Zotero Attachment Backup and Cleanup

Example Files

Repository Layout

Notes

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

2. Edit `.env.local`

Packages