A citation knowledge base that lets you find papers based on how the research community describes them—not just their titles or abstracts.
When researchers cite a paper, they describe it in context: what it does, why it matters, how it relates to their work. These citation contexts capture how the community actually thinks about and uses a paper.
bib builds a searchable database of these citation contexts.
When you add a paper, the system extracts every paragraph that cites other work. Each paragraph captures how the source paper describes the cited papers. When you query, you're searching through these descriptions—finding papers based on how other researchers characterize them.
Say Paper A contains this paragraph:
Recent advances in topological data analysis for protein structure [smith2020] have enabled new approaches to understanding folding dynamics.
When you query: "topological data analysis for proteins"
The system finds: smith2020 — because Paper A described it that way.
This is "crowdsourced" citation discovery. A single paper's title might not mention "proteins" at all, but if dozens of papers cite it in the context of protein analysis, that pattern emerges. The more papers you add, the richer and more accurate the citation contexts become.
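The aggregation idea can be sketched in a few lines of Python. This is only an illustration and not bib's actual implementation: the bracketed-key regex, the function names, and the second source paper ("Paper B", with its `lee2019` citation) are all hypothetical.

```python
import re
from collections import defaultdict

# Assumed citation format: bracketed keys such as [smith2020] or [smith2020, lee2019].
CITATION = re.compile(r"\[([^\[\]]+)\]")


def cited_keys(paragraph: str) -> list[str]:
    """Return every citation key that appears in bracketed form in a paragraph."""
    keys = []
    for group in CITATION.findall(paragraph):
        keys.extend(key.strip() for key in group.split(",") if key.strip())
    return keys


def collect_contexts(papers: dict[str, list[str]]) -> dict[str, list[tuple[str, str]]]:
    """Map each cited key to the (source paper, paragraph) pairs that describe it."""
    contexts = defaultdict(list)
    for source, paragraphs in papers.items():
        for paragraph in paragraphs:
            for key in cited_keys(paragraph):
                contexts[key].append((source, paragraph))
    return dict(contexts)


# Paper A is the example from the text; Paper B is made up for illustration.
papers = {
    "Paper A": [
        "Recent advances in topological data analysis for protein structure "
        "[smith2020] have enabled new approaches to understanding folding dynamics."
    ],
    "Paper B": [
        "Persistent homology has been applied to protein folding [smith2020, lee2019].",
        "This paragraph cites nothing and is skipped.",
    ],
}

contexts = collect_contexts(papers)
for source, paragraph in contexts["smith2020"]:
    print(source, "->", paragraph[:60])
```

Each additional source paper adds another description under the same key, which is why coverage of a cited paper improves as the collection grows.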
PDF → Extract paragraphs with citations → Embed citation contexts → Store
- Add PDFs from arXiv, URLs, or local files
- Parse each paper to find paragraphs containing citations
- Embed each citation context, which captures how the source describes the cited work
- Index everything for fast semantic search
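A rough sketch of the ingest steps above, assuming paragraphs have already been extracted from the PDF. The table layout, the `ingest` and `toy_embed` names, and the hashed bag-of-words vector are assumptions made for illustration; bib itself uses Grobid for parsing and Gemini for embeddings, and its real schema may look quite different.

```python
import hashlib
import json
import math
import re
import sqlite3

DIM = 64  # toy embedding size; a real embedding model returns far larger vectors


def toy_embed(text):
    """Stand-in for an embedding model: hashed bag-of-words, L2-normalised."""
    vec = [0.0] * DIM
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        bucket = int(hashlib.md5(word.encode()).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def ingest(conn, source, paragraphs):
    """Store one row per (paragraph, cited key) pair; return the number of rows added."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS contexts ("
        "source TEXT, cited_key TEXT, paragraph TEXT, embedding TEXT)"
    )
    added = 0
    for paragraph in paragraphs:
        keys = re.findall(r"\[(\w+)\]", paragraph)  # assumed [key] citation format
        if not keys:
            continue  # paragraphs without citations are not indexed
        embedding = json.dumps(toy_embed(paragraph))
        for key in keys:
            conn.execute(
                "INSERT INTO contexts VALUES (?, ?, ?, ?)",
                (source, key, paragraph, embedding),
            )
            added += 1
    conn.commit()
    return added


conn = sqlite3.connect(":memory:")
added = ingest(
    conn,
    "Paper A",
    [
        "Recent advances in topological data analysis for protein structure "
        "[smith2020] have enabled new approaches to understanding folding dynamics.",
        "We thank the reviewers for their comments.",
    ],
)
print(added, "citation context(s) stored")
```

Storing one row per cited key, rather than one per paragraph, keeps the later lookup from a matching context to the cited paper a plain column read.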
Query → Match against citation contexts → LLM reranks → Return cited papers
- Your query is embedded and matched against stored citation contexts
- An LLM reranks the candidates to identify the most relevant matches
- Results are the cited papers whose descriptions best match your query
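The query side can be sketched the same way. Everything here is a simplified assumption: word-count vectors stand in for real embeddings, `rerank` is a placeholder where an LLM call would go, and the `lee2019` entry is invented test data.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Stand-in for an embedding model: a sparse word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rerank(query, candidates):
    """Placeholder for the LLM reranking step; here it keeps the similarity order."""
    return candidates


def search(query, contexts, n=10):
    """Return up to n (cited_key, score) pairs, best match first."""
    q = toy_embed(query)
    best = {}
    for cited_key, paragraph in contexts:
        score = cosine(q, toy_embed(paragraph))
        # Keep the best-scoring context per cited paper; zero-overlap contexts are dropped.
        if score > best.get(cited_key, 0.0):
            best[cited_key] = score
    candidates = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return rerank(query, candidates)[:n]


contexts = [
    ("smith2020", "Recent advances in topological data analysis for protein structure "
                  "have enabled new approaches to understanding folding dynamics."),
    ("smith2020", "Topological data analysis has been used to study proteins."),
    ("lee2019", "Attention mechanisms improve image classification in vision models."),
]

results = search("topological data analysis for proteins", contexts)
print(results[0][0])
```

Note that the result is a cited paper, not the paragraph that matched: several contexts can vote for the same key, and only the strongest one is kept in this sketch.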
# Add papers to build your citation knowledge base
bib add https://arxiv.org/abs/2301.00001 # From arXiv
bib add https://example.com/paper.pdf # From URL
bib add ~/Downloads/paper.pdf # Local PDF
bib add # From clipboard
# Batch process a directory of PDFs
bib sync ~/Papers/
# Search by citation context (the main feature)
bib query "topological methods for protein folding"
bib query "attention mechanisms in vision" -n 20
# Interactive fuzzy search UI
bib search
- Type to filter papers in real time
- Tab/Enter: Switch to browse mode
- j/k or arrows: Navigate results
- Enter: Open PDF/URL
- p: Copy PDF to location
- d: Delete paper
- Esc: Exit
bib status # Database statistics
bib config # Set up storage directories
Install with Homebrew:
brew install antonio-leitao/taps/bib
Or build from source (requires Rust 1.70+):
git clone https://github.com/antonio-leitao/bib.git
cd bib
cargo build --release
cargo install --path .
Set up your Gemini API key:
export GEMINI_KEY=your_api_key_here
Or add this line to your shell profile for persistence.
- PDF Parsing: Uses Grobid for structured extraction of citations and paragraphs
- Embeddings & Reranking: Google Gemini for semantic embeddings and LLM-based reranking
- Storage: SQLite database with companion PDF storage
MIT License. See LICENSE file for details.
Antonio Leitao
For bug reports and feature requests, please use the issue tracker.