PubMed CLI Tools

Unix-style command-line tools for searching and parsing PubMed articles. Designed for researchers and AI agents who want quick access to publication data without leaving the terminal.

# Search, parse, and filter
pm search "CRISPR cancer therapy" | pm fetch | pm parse | jq '.title'

# Full pipeline: search to PDF download
pm search "CRISPR review" --max 5 | pm fetch | pm parse | pm download --output-dir ./pdfs/

Prerequisites

uv (Python package manager)
Python >= 3.12 (installed automatically by uv if needed)
Optional: jq for advanced JSON filtering (sudo apt install jq or brew install jq)

Installation

Install directly from GitHub (no PyPI release):

uv tool install git+https://github.com/lescientifik/pm-tools.git

This installs the pm command globally. You can then run pm from anywhere.

Development install (for contributors)

git clone https://github.com/lescientifik/pm-tools.git
cd pm-tools
uv sync

With a development install, prefix all commands with uv run (e.g., uv run pm search ...).

Getting Started

After installation, run pm --help to discover the available commands and options:

# Show all available commands and general usage
pm --help

# Show detailed help for a specific command (options, input/output format, examples)
pm search --help
pm fetch --help
pm parse --help
pm filter --help
pm download --help
pm cite --help
pm diff --help
pm quick --help

Every command supports -h / --help, which documents what the command does, the options it accepts, and how to use it. When in doubt, run --help first.

Commands

All commands are subcommands of pm:

Command	Input	Output	Purpose
`pm search`	Query string	PMIDs	Search PubMed
`pm fetch`	PMIDs (stdin)	XML	Download article data
`pm parse`	XML (stdin)	JSONL	Extract structured data
`pm filter`	JSONL (stdin)	JSONL	Filter by year/journal/author
`pm diff`	Two JSONL files	JSONL	Compare article collections
`pm download`	JSONL/PMIDs	PDFs	Download Open Access PDFs
`pm cite`	PMIDs (stdin)	CSL-JSON	Generate bibliography citations
`pm quick`	Query string	JSONL	One-command search pipeline

Run pm <command> --help for detailed options, input/output formats, and examples for each command.

Quick Examples

# Simplest: one command for quick results
pm quick "CRISPR cancer therapy"

# Search and get titles
pm search "machine learning diagnosis" --max 10 | pm fetch | pm parse | jq -r '.title'

# Filter to recent Nature papers with abstracts
pm search "quantum computing" --max 50 | pm fetch | pm parse | \
  pm filter --year 2024- --journal nature --has-abstract

# Save results to JSONL for later use
pm search "alzheimer biomarkers" --max 100 | pm fetch | pm parse > papers.jsonl

# Export to CSV
pm search "alzheimer biomarkers" --max 100 | pm fetch | pm parse | \
  jq -r '[.pmid, .year, .journal, .title] | @csv' > papers.csv

Filtering Results

pm filter lets you filter parsed articles without writing jq queries:

# Filter by year (exact, range, or open-ended)
pm filter --year 2024           # Exact year
pm filter --year 2020-2024      # Range
pm filter --year 2020-          # 2020 and later

# Filter by journal (case-insensitive substring)
pm filter --journal nature
pm filter --journal "cell reports"

# Filter by author (case-insensitive, matches any author)
pm filter --author zhang

# Boolean filters
pm filter --has-abstract        # Must have abstract
pm filter --has-doi             # Must have DOI

# Combine filters (AND logic)
pm filter --year 2023- --journal nature --has-abstract

# Verbose mode shows filter stats
pm filter --year 2024 -v        # Output: "15/50 articles passed filters"
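
The filter semantics above can be sketched in Python for readers who post-process JSONL themselves. This is only an illustration of the described behaviour, not pm filter's actual code: the function names year_matches and passes are hypothetical, and substring matching for authors is an assumption.

```python
def year_matches(spec: str, year: str) -> bool:
    """Check a year against a spec such as "2024", "2020-2024" or "2020-"."""
    y = int(year)
    if "-" not in spec:
        return y == int(spec)                 # exact year
    start, end = spec.split("-", 1)
    low = int(start) if start else None       # empty bound means unbounded
    high = int(end) if end else None
    return (low is None or y >= low) and (high is None or y <= high)


def passes(article: dict, year=None, journal=None, author=None, has_abstract=False) -> bool:
    """Combine the individual checks with AND logic, as the examples above do."""
    if year is not None and not ("year" in article and year_matches(year, article["year"])):
        return False
    if journal is not None and journal.lower() not in article.get("journal", "").lower():
        return False
    # Assumption: an author filter matches if any author name contains the text.
    if author is not None and not any(
        author.lower() in name.lower() for name in article.get("authors", [])
    ):
        return False
    if has_abstract and not article.get("abstract"):
        return False
    return True
```

For example, passes(article, year="2023-", journal="nature", has_abstract=True) mirrors the combined filter shown above.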

Quick Search with pm quick

Use pm quick interactively when you just want to see results quickly:

# Basic quick search (default 100 results)
pm quick "CRISPR cancer therapy"

# Limit results
pm quick --max 20 "machine learning diagnosis"

# Verbose mode shows progress
pm quick -v "protein folding"

pm quick is a convenience wrapper that runs the full pipeline (pm search | pm fetch | pm parse) in one command. For programmatic use or custom filtering, use the individual commands.
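
For programmatic use from Python, the individual commands can be chained with subprocess. The sketch below is one possible approach, not part of pm-tools: run_pipeline and quick_stages are hypothetical helper names, and the default of 100 results is taken from the comment in the example above. Unlike a shell pipe, it buffers each stage's output in memory rather than streaming it.

```python
import subprocess


def run_pipeline(stages: list, stdin_text: str = "") -> str:
    """Run each command in turn, feeding one stage's stdout to the next stage's stdin."""
    data = stdin_text
    for argv in stages:
        # check=True raises CalledProcessError if a stage exits non-zero.
        done = subprocess.run(argv, input=data, capture_output=True, text=True, check=True)
        data = done.stdout
    return data


def quick_stages(query: str, max_results: int = 100) -> list:
    """Build the argv lists for the search | fetch | parse pipeline."""
    return [
        ["pm", "search", query, "--max", str(max_results)],
        ["pm", "fetch"],
        ["pm", "parse"],
    ]
```

With pm installed, run_pipeline(quick_stages("protein folding", 20)) would return the JSONL text, which can then be split into lines and decoded with json.loads.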

Daily Research Workflows

Track Your Favorite Authors

# Papers by a specific researcher
pm search "Doudna JA[author]" --max 10 | pm fetch | pm parse | \
  jq -r '"\(.year) - \(.title[0:70])..."'

# Multiple authors (collaborations)
pm search "(Zhang F[author]) AND (Bhattacharya D[author])" | \
  pm fetch | pm parse | jq '.title'

Journal Watch

Monitor specific journals for topics you care about:

# Recent Cell papers on organoids
pm search "organoids AND Cell[journal]" --max 20 | pm fetch | pm parse | \
  pm filter --year 2024- | jq -r '.title'

# Compare publication counts across journals
pm search "immunotherapy" --max 200 | pm fetch | pm parse | \
  jq -r '.journal' | sort | uniq -c | sort -rn | head -10

Literature Review Helper

Build a reading list with abstracts:

# Generate markdown reading list
pm search "CAR-T cell therapy review" --max 15 | pm fetch | pm parse | \
  jq -r '"## \(.title)\n**\(.journal)** (\(.year)) - PMID: \(.pmid)\n\n\(.abstract // "No abstract")\n\n---\n"' \
  > reading-list.md

# Find review articles specifically
pm search "neuroplasticity AND review[pt]" --max 10 | pm fetch | pm parse | \
  jq -r '.title'

Quick Reference Lookup

# Look up a specific PMID
echo "12345678" | pm fetch | pm parse | jq .

# Batch lookup from a file
cat pmids.txt | pm fetch | pm parse > articles.jsonl

# Get DOI for citation
pm search "Yamanaka induced pluripotent" --max 1 | pm fetch | pm parse | \
  jq -r '"DOI: \(.doi)\nTitle: \(.title)"'

# Get full citation in CSL-JSON format
echo "12345678" | pm cite | jq '.'

Download Open Access PDFs

# Preview what would be downloaded (dry-run)
pm search "CRISPR review" --max 10 | pm fetch | pm parse | \
  pm download --dry-run

# Download PDFs to a directory
pm search "open access[filter] AND immunotherapy" --max 20 | \
  pm fetch | pm parse | pm download --output-dir ./papers/

# Download with Unpaywall fallback (more coverage, requires email)
pm search "machine learning radiology" --max 10 | pm fetch | pm parse | \
  pm download --output-dir ./pdfs/ --email you@university.edu

# Download from PMID list (auto-converts to DOI/PMCID)
cat pmids.txt | pm download --output-dir ./pdfs/

Sources: pm download tries PMC Open Access first, then falls back to Unpaywall (only if --email is provided). Not all articles have free PDFs available.
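
The source order can be pictured as a small decision function. This is a sketch under stated assumptions, not pm download's real logic: candidate_sources is a hypothetical name, and it assumes a PMC lookup needs a pmcid while an Unpaywall lookup needs both a doi and an email address.

```python
from typing import Optional


def candidate_sources(article: dict, email: Optional[str] = None) -> list:
    """List the sources worth trying for one parsed article, in PMC-first order."""
    sources = []
    if article.get("pmcid"):
        sources.append("pmc")            # tried first
    if email and article.get("doi"):
        sources.append("unpaywall")      # fallback, only when an email is supplied
    return sources
```

An empty list corresponds to an article with no free PDF route under these assumptions.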

Generate Bibliography Citations

# Get CSL-JSON citations for specific PMIDs
pm cite 28012456 29886577 > citations.jsonl

# Pipeline: search -> cite
pm search "CRISPR review" --max 10 | pm cite > citations.jsonl

# Convert to Pandoc-compatible bibliography
jq -s '.' citations.jsonl > bibliography.json

# Use with Pandoc
pandoc paper.md --citeproc --bibliography=bibliography.json -o paper.pdf

Output format (CSL-JSON):

{
  "id": "pmid:28012456",
  "type": "article-journal",
  "title": "Article title...",
  "author": [{"family": "Smith", "given": "John"}],
  "container-title": "Nature",
  "issued": {"date-parts": [[2024, 3, 15]]},
  "volume": "627",
  "page": "123-130",
  "PMID": "28012456",
  "DOI": "10.1038/xxxxx"
}
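
As a quick way to eyeball citations without a full citation processor, a record in this shape can be reduced to a short author-year string. The helper below is a hypothetical sketch (short_citation is not a pm-tools function) and only reads the fields shown in the example record.

```python
def short_citation(item: dict) -> str:
    """Format one CSL-JSON record as 'Family (year). Title. Journal.'"""
    authors = item.get("author", [])
    if not authors:
        who = "Anonymous"
    else:
        who = authors[0].get("family", "Anonymous")
        if len(authors) > 1:
            who += " et al."
    # "issued" holds nested date-parts: [[year, month, day]]
    date_parts = item.get("issued", {}).get("date-parts", [[]])
    year = date_parts[0][0] if date_parts and date_parts[0] else "n.d."
    return f"{who} ({year}). {item.get('title', '')}. {item.get('container-title', '')}."
```

For real bibliographies, a CSL processor such as the Pandoc workflow above remains the better choice; this is only for quick inspection.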

pm cite vs pm parse:

Feature	pm parse	pm cite
Abstract	Yes	No
Page numbers	No	Yes
Volume/Issue	No	Yes
Citation tools	Needs conversion	Direct (Zotero, Pandoc)

Use pm cite to generate bibliographies and pm parse for content analysis.

Advanced Patterns

Build a Local Database

# Fetch your entire research area (be patient: the tools respect NCBI rate limits)
pm search "your niche topic" --max 1000 | pm fetch | pm parse > my-field.jsonl

# Then query locally (instant!)
pm filter --year 2020- < my-field.jsonl
pm filter --author smith --has-abstract < my-field.jsonl

# Or use jq for complex queries (abstract may be missing, so default it to "")
jq 'select((.abstract // "") | test("novel"; "i"))' my-field.jsonl

Publication Trends

# Papers per year for a topic
pm search "microbiome gut brain" --max 500 | pm fetch | pm parse | \
  jq -r '.year' | sort | uniq -c | sort -k2

# Output:
#   12 2018
#   34 2019
#   67 2020
#  145 2021
#  203 2022

Integration with Other Tools

# Desktop notification for new papers (Linux)
pm search "your topic AND 2024[dp]" --max 5 | pm fetch | pm parse | \
  jq -r '.title' | head -1 | xargs -I {} notify-send "New Paper" "{}"

# Email yourself a digest (use the [dp] tag so 2024 is treated as a publication date)
pm search "CRISPR AND 2024[dp]" --max 10 | pm fetch | pm parse | \
  jq -r '"- \(.title) (\(.journal))"' | \
  mail -s "Daily PubMed Digest" you@email.com

# Pipe to fzf for interactive selection
# (xargs uses PMID as its placeholder because fzf itself substitutes every {})
pm search "protein folding" --max 50 | pm fetch | pm parse | \
  jq -r '"\(.pmid)\t\(.title)"' | \
  fzf --preview 'echo {} | cut -f1 | xargs -I PMID curl -s "https://pubmed.ncbi.nlm.nih.gov/PMID"'

Working with Baseline Files

For bulk analysis, download PubMed baseline files directly:

# Parse a local baseline file (up to 30,000 articles per file)
zcat pubmed25n0001.xml.gz | pm parse > baseline.jsonl

# Find papers with a matching author name (authors holds names only, not affiliations)
jq 'select(any(.authors[]?; test("Harvard")))' baseline.jsonl

Comparing Article Collections

Use pm diff to compare two JSONL files and find added, removed, or changed articles:

# Stream all differences as JSONL
pm diff baseline_v1.jsonl baseline_v2.jsonl

# Get list of new PMIDs (for fetching updates)
pm diff old.jsonl new.jsonl | jq -r 'select(.status=="added") | .pmid' | pm fetch | pm parse > new_articles.jsonl

# Filter to just changed articles
pm diff old.jsonl new.jsonl | jq 'select(.status=="changed")'

# Summary counts by status
pm diff old.jsonl new.jsonl | jq -s 'group_by(.status) | map({(.[0].status): length}) | add'

# Compare only metadata (ignore abstract changes)
pm diff old.jsonl new.jsonl --ignore abstract

# Quick check if files differ (for scripts)
if pm diff file1.jsonl file2.jsonl --quiet; then
    echo "Files are identical"
else
    echo "Files differ"
fi

Output format: Streaming JSONL with {"pmid":"...","status":"added|removed|changed",...}

Exit codes: 0 = identical, 1 = differences found, 2 = error
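
To make the added/removed/changed statuses and the exit codes concrete, here is a rough Python model of such a comparison. It is an illustration only and not pm diff's implementation: diff_collections and exit_code are hypothetical names, records are matched by pmid, and the ignore parameter stands in for the --ignore option.

```python
def diff_collections(old: list, new: list, ignore=()) -> list:
    """Compare two article lists by pmid and report added, changed and removed."""
    def index(articles):
        # Drop ignored fields so they cannot trigger a "changed" status.
        return {a["pmid"]: {k: v for k, v in a.items() if k not in ignore} for a in articles}

    old_by_id, new_by_id = index(old), index(new)
    changes = []
    for pmid, record in new_by_id.items():
        if pmid not in old_by_id:
            changes.append({"pmid": pmid, "status": "added"})
        elif record != old_by_id[pmid]:
            changes.append({"pmid": pmid, "status": "changed"})
    for pmid in old_by_id:
        if pmid not in new_by_id:
            changes.append({"pmid": pmid, "status": "removed"})
    return changes


def exit_code(changes: list) -> int:
    """0 when identical, 1 when differences were found (2 is reserved for errors)."""
    return 1 if changes else 0
```

Passing ignore=("abstract",) corresponds to the metadata-only comparison shown above.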

Output Format

Each article is output as a JSON object (JSONL format):

{
  "pmid": "12345678",
  "title": "Article title here",
  "authors": ["Smith John", "Doe Jane"],
  "journal": "Nature",
  "year": "2024",
  "date": "2024-03-15",
  "doi": "10.1038/xxxxx",
  "pmcid": "PMC1234567",
  "abstract": "Full abstract text..."
}

Fields doi, pmcid, date, and abstract are omitted when not available.
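
Because those four fields may be absent, code that consumes the JSONL should not index them directly. A minimal loading sketch follows, assuming the field names shown above; load_articles and OPTIONAL_FIELDS are hypothetical names.

```python
import json

# Fields the text above says may be omitted from a record.
OPTIONAL_FIELDS = ("doi", "pmcid", "date", "abstract")


def load_articles(jsonl_text: str) -> list:
    """Decode JSONL into dicts, filling omitted optional fields with None."""
    articles = []
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue                          # tolerate blank lines
        record = json.loads(line)
        for name in OPTIONAL_FIELDS:
            record.setdefault(name, None)     # keep existing values untouched
        articles.append(record)
    return articles
```

After loading, a check such as article["doi"] is None is safe for every record.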

PubMed Query Syntax

Use standard PubMed search syntax:

Query	Meaning
`cancer AND therapy`	Both terms
`"gene editing"`	Exact phrase
`Smith J[author]`	Author search
`Nature[journal]`	Journal filter
`2024[dp]`	Publication date
`review[pt]`	Publication type
`2020:2024[dp]`	Date range
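
When queries are assembled by a script, the field tags in the table can be composed mechanically. The helper below is a hypothetical sketch (build_query is not part of pm-tools); it joins everything with AND and quotes multi-word terms as exact phrases, which is a simplifying assumption.

```python
def build_query(terms, author=None, journal=None, pub_type=None, years=None) -> str:
    """Compose a PubMed query string from plain terms and optional field tags."""
    # Multi-word terms are wrapped in quotes so they are searched as exact phrases.
    parts = [f'"{t}"' if " " in t else t for t in terms]
    if author:
        parts.append(f"{author}[author]")
    if journal:
        parts.append(f"{journal}[journal]")
    if pub_type:
        parts.append(f"{pub_type}[pt]")
    if years:
        start, end = years
        # A single year uses YYYY[dp]; a span uses the start:end range form.
        parts.append(f"{start}[dp]" if start == end else f"{start}:{end}[dp]")
    return " AND ".join(parts)
```

The result can be passed straight to pm search as its query argument.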

Tips

Rate Limits: Tools respect NCBI's 3 requests/second limit automatically
Batch Size: pm fetch batches 200 PMIDs per request for efficiency
Large Queries: Use --max to limit results, or paginate with date ranges
Verbose Mode: Add --verbose to pm parse to see progress on large files
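
The batch size and rate limit above give a feel for how long a large fetch takes. The sketch below only does the arithmetic; batches, request_count and MIN_INTERVAL are hypothetical names, and one request per batch is an assumption about how the tool behaves.

```python
BATCH_SIZE = 200            # PMIDs per request, as stated above
MIN_INTERVAL = 1 / 3        # seconds between requests at 3 requests/second


def batches(pmids: list, size: int = BATCH_SIZE):
    """Yield consecutive slices of at most `size` PMIDs."""
    for start in range(0, len(pmids), size):
        yield pmids[start:start + size]


def request_count(n_pmids: int, size: int = BATCH_SIZE) -> int:
    """Number of requests needed if each batch is sent as one request."""
    return -(-n_pmids // size)   # ceiling division
```

Under these assumptions, 1000 PMIDs need 5 requests, so network latency rather than the rate limit is likely to dominate for queries of that size.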

License

MIT
