Distributable Python tooling for researchers who want to:
- discover new papers on a schedule
- rank them against their own research interests
- ingest them into Zotero
- attach structured summary notes
- optionally attach linked or real PDF files
- keep local reports and run state
- back up or clean up Zotero attachment storage
This is the shortest working path for a local Zotero + local Ollama setup.
git clone https://github.com/ayushkanwal/autopaper-researcher.git
cd autopaper-researcher
cp examples/ollama_local.env .env.local
Set at least these values:
ZOTERO_USER_ID=your_zotero_user_id
ZOTERO_API_KEY=your_zotero_api_key
AUTOPAPER_COLLECTION_NAME=autopaper-ingested
AUTOPAPER_QUERIES_JSON=["all:\"multivariate time series\" AND (all:forecasting OR all:prediction)"]
AUTOPAPER_BOOST_TERMS_JSON=["multivariate","forecasting","retrieval"]
AUTOPAPER_PENALTY_TERMS_JSON=["stock price","cryptocurrency"]
Pull the model and start Ollama:
ollama pull qwen2.5:3b-instruct
ollama serve
Preview with a dry run:
./scripts/bootstrap_and_run.sh --env-file .env.local --dry-run
Then run it live:
./scripts/bootstrap_and_run.sh --env-file .env.local
What this does:
- searches arXiv, PubMed, and OpenAlex
- ranks results using your preset and keyword overrides
- summarizes selected papers with local Ollama
- adds them to Zotero with structured notes
- optionally imports real PDFs if AUTOPAPER_ATTACH_REAL_PDFS=true
flowchart LR
A["Preset + .env.local"] --> B["Query Builder"]
B --> C["Search Sources"]
C --> C1["arXiv"]
C --> C2["PubMed"]
C --> C3["OpenAlex"]
C1 --> D["Normalize + Merge Duplicates"]
C2 --> D
C3 --> D
D --> E["Generic Ranking Engine"]
E --> F["Top Candidates"]
F --> G["Zotero Dedupe Check"]
G --> H["Summary Provider"]
H --> H1["offline"]
H --> H2["OpenAI-compatible / Ollama"]
H --> H3["custom command"]
H1 --> I["Structured Note"]
H2 --> I
H3 --> I
G --> J["PDF / Code Discovery"]
I --> K["Zotero Ingest"]
J --> K
K --> L["Collection Item + Note + Optional PDF"]
K --> M["Markdown / JSON Report"]
K --> N["SQLite State"]
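The "Normalize + Merge Duplicates" step in the flowchart can be pictured with a small sketch. This is not the package's actual code: the record fields (`source`, `title`, `doi`, `pdf_url`) and the function names are hypothetical, and the matching rule (DOI when present, otherwise a normalized title) is an assumption chosen for illustration. A record that has a DOI in one source but not in another would not be merged by this simple rule.

```python
import re
from typing import Dict, List


def normalize_title(title: str) -> str:
    """Lowercase and collapse punctuation/whitespace so near-identical titles match."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def dedupe_key(record: Dict) -> str:
    """Prefer the DOI when present; otherwise fall back to the normalized title."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return f"doi:{doi}"
    return f"title:{normalize_title(record.get('title') or '')}"


def merge_duplicates(records: List[Dict]) -> List[Dict]:
    """Merge records that share a key, keeping first-seen order."""
    merged: Dict[str, Dict] = {}
    for record in records:
        key = dedupe_key(record)
        if key not in merged:
            # First sighting: copy the record and start tracking its sources.
            merged[key] = {**record, "sources": [record["source"]]}
            continue
        existing = merged[key]
        if record["source"] not in existing["sources"]:
            existing["sources"].append(record["source"])
        # Fill in fields the first record was missing (for example a PDF URL).
        for field, value in record.items():
            if field != "source" and value and not existing.get(field):
                existing[field] = value
    return list(merged.values())
```

Keeping a `sources` list on the merged record makes it easy to see later which adapters returned the same paper.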
Paper discovery uses the built-in source adapters:
- arXiv
- PubMed
- OpenAlex
The core ranking engine is intentionally generic.
Researcher-specific relevance should be controlled with:
- built-in presets
- AUTOPAPER_BOOST_TERMS_JSON
- AUTOPAPER_PENALTY_TERMS_JSON
- AUTOPAPER_INCLUDE_TERMS_JSON
- AUTOPAPER_EXCLUDE_TERMS_JSON
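To make the role of these term lists concrete, here is a minimal keyword-scoring sketch. It is an illustration, not the ranking engine itself: the function name, the one-point-per-term weights, and the exact semantics of include and exclude terms (include as "at least one must match", exclude as "any match drops the paper") are all assumptions.

```python
from typing import Iterable, Optional


def score_paper(
    text: str,
    boost_terms: Iterable[str],
    penalty_terms: Iterable[str],
    include_terms: Iterable[str] = (),
    exclude_terms: Iterable[str] = (),
) -> Optional[float]:
    """Return a relevance score, or None when the paper is filtered out."""
    haystack = text.lower()
    include_terms = list(include_terms)

    # Assumed rule: any exclude term removes the paper outright.
    if any(term.lower() in haystack for term in exclude_terms):
        return None
    # Assumed rule: when include terms are set, at least one must appear.
    if include_terms and not any(term.lower() in haystack for term in include_terms):
        return None

    # Assumed weights: +1 per boost term found, -1 per penalty term found.
    score = sum(1.0 for term in boost_terms if term.lower() in haystack)
    score -= sum(1.0 for term in penalty_terms if term.lower() in haystack)
    return score
```

Under these assumptions, a cutoff such as AUTOPAPER_MIN_RELEVANCE_SCORE would simply be compared against the returned score.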
Summary generation happens after papers are found and ranked.
Supported modes:
- offline
- openai_compatible
- command
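Because the configuration also exposes AUTOPAPER_SUMMARY_FALLBACK, one plausible way to wire the modes together is "try the primary mode, fall back on failure". The sketch below assumes that behavior; the `summarize` dispatcher, the provider registry, and the two-sentence `offline_summary` are hypothetical stand-ins, not the package's real functions.

```python
from typing import Callable, Dict

Provider = Callable[[str], str]


def offline_summary(abstract: str) -> str:
    """Hypothetical offline mode: keep the first two sentences of the abstract."""
    sentences = [s.strip() for s in abstract.split(". ") if s.strip()]
    if not sentences:
        return ""
    return ". ".join(sentences[:2]).rstrip(".") + "."


def summarize(abstract: str, mode: str, fallback: str, providers: Dict[str, Provider]) -> str:
    """Run the primary provider; on any error, retry once with the fallback."""
    try:
        return providers[mode](abstract)
    except Exception:
        # Only fall back when a distinct fallback mode is configured.
        if fallback and fallback != mode:
            return providers[fallback](abstract)
        raise
```

With this shape, a local LLM server that is not running degrades to an offline note instead of failing the whole run.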
For local Ollama, use:
AUTOPAPER_SUMMARIZER=openai_compatible
AUTOPAPER_SUMMARY_FALLBACK=offline
AUTOPAPER_LLM_BASE_URL=http://127.0.0.1:11434/v1
AUTOPAPER_LLM_API_KEY=
AUTOPAPER_LLM_MODEL=qwen2.5:3b-instruct
Zotero is the write target in v1.
The tool can:
- create collections if needed
- add paper items
- attach summary notes
- attach PDF links
- optionally import real PDF files
Use the bootstrap script if you want the shortest path:
./scripts/bootstrap_and_run.sh --env-file .env.local --dry-run
The script:
- creates .venv
- uses the source tree directly
- installs the package only if needed
- validates config
- runs the requested command
Use this if you want direct CLI control:
python3 -m venv .venv
source .venv/bin/activate
pip install -e .
Then run commands such as:
autopaper validate-config --env-file .env.local
autopaper run-once --env-file .env.local --dry-run
autopaper run-once --env-file .env.local
Environment variables read from the env file:
- ZOTERO_USER_ID
- ZOTERO_API_KEY

- AUTOPAPER_PROFILE_NAME
- AUTOPAPER_QUERY_PRESET
- AUTOPAPER_QUERIES_JSON
- AUTOPAPER_SOURCES
- AUTOPAPER_COLLECTION_NAME
- AUTOPAPER_MAX_NEW
- AUTOPAPER_MAX_RESULTS_PER_QUERY
- AUTOPAPER_MIN_RELEVANCE_SCORE

- AUTOPAPER_BOOST_TERMS_JSON
- AUTOPAPER_PENALTY_TERMS_JSON
- AUTOPAPER_INCLUDE_TERMS_JSON
- AUTOPAPER_EXCLUDE_TERMS_JSON

- AUTOPAPER_SUMMARIZER
- AUTOPAPER_SUMMARY_FALLBACK
- AUTOPAPER_LLM_BASE_URL
- AUTOPAPER_LLM_API_KEY
- AUTOPAPER_LLM_MODEL
- AUTOPAPER_LLM_TIMEOUT_SECONDS
- AUTOPAPER_SUMMARY_COMMAND

- AUTOPAPER_SCHEDULE_CRON
- AUTOPAPER_TIMEZONE
- AUTOPAPER_RUN_ON_START

- AUTOPAPER_REPORT_DIR
- AUTOPAPER_STATE_DIR
- AUTOPAPER_ATTACH_REAL_PDFS
- AUTOPAPER_ENABLE_GITHUB_SEARCH

- PUBMED_EMAIL
- PUBMED_API_KEY
- OPENALEX_MAILTO
- GITHUB_TOKEN
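Several of these settings (the ones ending in `_JSON`) hold JSON arrays inside a plain KEY=VALUE file. The sketch below shows one way such a file could be read and its JSON values decoded. It is an assumption about the format rather than the package's loader: `parse_env_text` and `json_list` are hypothetical helpers, and quoting rules, `export` prefixes, and inline comments are deliberately not handled.

```python
import json
from typing import Dict, List


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may themselves contain "=".
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def json_list(values: Dict[str, str], key: str) -> List[str]:
    """Decode a JSON array of strings; a missing or empty key yields []."""
    raw = values.get(key, "")
    if not raw:
        return []
    parsed = json.loads(raw)
    if not isinstance(parsed, list) or not all(isinstance(item, str) for item in parsed):
        raise ValueError(f"{key} must be a JSON array of strings")
    return parsed
```

Validating the decoded shape up front gives a clear error for a malformed term list before any search is run.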
Current presets:
- general_time_series_v1
- livestock_decision_support_v1
- livestock_decision_support_v2
List them from the CLI:
PYTHONPATH=src python3 -m autopaper.cli list-presets
Show exactly what your env resolves to:
PYTHONPATH=src python3 -m autopaper.cli print-effective-config --env-file .env.local
Dry run via the bootstrap script:
./scripts/bootstrap_and_run.sh --env-file .env.local --dry-run
Or manually:
PYTHONPATH=src python3 -m autopaper.cli validate-config --env-file .env.local
Live run via the bootstrap script:
./scripts/bootstrap_and_run.sh --env-file .env.local
Or manually:
PYTHONPATH=src python3 -m autopaper.cli run-once --env-file .env.local
Daemon mode via the bootstrap script:
./scripts/bootstrap_and_run.sh --env-file .env.local --daemon
Or manually:
PYTHONPATH=src python3 -m autopaper.cli daemon --env-file .env.local --run-on-start
Generate a launchd plist:
PYTHONPATH=src python3 -m autopaper.cli write-launchd-plist --env-file .env.local --output ~/Library/LaunchAgents/com.autopaper.researcher.plist
Load it:
launchctl bootstrap gui/$(id -u) ~/Library/LaunchAgents/com.autopaper.researcher.plist
launchctl enable gui/$(id -u)/com.autopaper.researcher
launchctl kickstart -k gui/$(id -u)/com.autopaper.researcher
Unload it:
launchctl bootout gui/$(id -u) ~/Library/LaunchAgents/com.autopaper.researcher.plist
The launchd service keeps the daemon alive. The ingest schedule still comes from AUTOPAPER_SCHEDULE_CRON.
Back up stored attachments:
PYTHONPATH=src python3 -m autopaper.cli backup-attachments --env-file .env.local --dry-run
Back up and delete remote attachments after a successful backup:
PYTHONPATH=src python3 -m autopaper.cli backup-attachments --env-file .env.local --delete-remote
Preview a My Library purge:
PYTHONPATH=src python3 -m autopaper.cli purge-attachments --env-file .env.local --dry-run
Run a live purge of My Library file attachments:
PYTHONPATH=src python3 -m autopaper.cli purge-attachments --env-file .env.local --confirm-my-library
Important:
- purge-attachments is intentionally My Library only
- it deletes matching remote attachment items
- it attempts to empty My Library trash
- it removes matching local Zotero storage directories
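The difference between a preview and a live purge of local storage can be sketched as follows. This is an illustration only: `purge_local_storage` is a hypothetical helper, and it assumes the Zotero layout in which each attachment lives in a directory named after its attachment key under the storage folder. The remote deletion and trash-emptying steps are not modeled.

```python
import shutil
from pathlib import Path
from typing import Iterable, List


def purge_local_storage(
    storage_root: Path,
    attachment_keys: Iterable[str],
    dry_run: bool = True,
) -> List[Path]:
    """Return the matching directories; delete them only when dry_run is False."""
    matched: List[Path] = []
    for key in attachment_keys:
        candidate = storage_root / key
        # Keys without a local directory are skipped silently.
        if candidate.is_dir():
            matched.append(candidate)
            if not dry_run:
                shutil.rmtree(candidate)
    return matched
```

Defaulting `dry_run` to True means the destructive path has to be requested explicitly, which mirrors the spirit of the --dry-run and --confirm-my-library flags.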
- .env.example
- examples/researcher_template.env
- examples/livestock_decision_support.env
- examples/ollama_local.env
src/autopaper/ package source
scripts/ wrapper scripts
examples/ env templates
.env.example starter config
pyproject.toml package metadata
- Do not commit real .env.local files or API keys.
- Reports and state are written locally and are ignored by .gitignore.
- This repo intentionally excludes private reports, backups, outputs, and local machine artifacts.