This project is based on the open-source project Resophy. Special thanks to the original author for such an interesting project.
PaperPilot is a next-generation research assistant designed to streamline your academic workflow. By integrating advanced AI capabilities with a robust document management system, PaperPilot helps you discover, read, understand, and manage research papers more efficiently than ever before.
Whether you are tracking the latest ArXiv preprints or deep-diving into complex PDFs, PaperPilot acts as your intelligent co-pilot.
Note: This project was built with Trae Coding Agent via Vibe Coding, using two models (Gemini-3-Pro-Preview for feature development and GPT-5.2 for bug fixes and performance optimization). The result is truly incredible for someone like me, who has never written modern Web frontend code 🤯 (the last Web code I wrote was hand-coded HTML and CSS during my undergraduate years).
- Seamless Upload: Drag & drop PDF uploads with automatic metadata extraction.
- Organization: Custom categories, folders, and full-text search.
- Zotero Integration: One-click import from Zotero RDF libraries.
- Reading Heatmap: Visualize your reading habits with a GitHub-style contribution graph.
- AI Translation: Generate pixel-perfect English-to-Chinese (and other languages) translations using BabelDOC, preserving original layout and charts.
- AI Interpretation: Deep analysis of papers using MinerU (PDF-to-Markdown) and LLMs to generate structured summaries (Abstract, Methods, Experiments, Conclusions).
- Chat with Paper: Interactive Q&A with your documents to clarify concepts and details.
- Automated Tracking: Schedule daily fetches from specific ArXiv categories (e.g., cs.CV,cs.AI).
- Smart Filtering: Filter papers by keywords, institution weights, and more.
- AI Summarization: Automatically generate concise summaries for new arrivals.
- Offline Capable: Works even without LLM connections (skips summary/institution details).
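To make the keyword and institution-weight filtering concrete, here is a minimal sketch. The `Paper` class, the scoring rule (keyword hits plus the highest institution weight), and the `min_score` threshold are all illustrative assumptions, not PaperPilot's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Paper:
    # Hypothetical record; the real project may store different fields.
    title: str
    abstract: str
    institutions: list[str] = field(default_factory=list)


def score_paper(paper, keywords, institution_weights):
    """Score = number of keywords found in title/abstract + best institution weight."""
    text = f"{paper.title} {paper.abstract}".lower()
    keyword_hits = sum(1 for kw in keywords if kw.lower() in text)
    # With no institution data (e.g. no LLM connection), the weight is simply 0.
    best_weight = max(
        (institution_weights.get(name, 0.0) for name in paper.institutions),
        default=0.0,
    )
    return keyword_hits + best_weight


def filter_papers(papers, keywords, institution_weights, min_score=1.0):
    """Keep papers scoring at least min_score, highest score first."""
    scored = [(score_paper(p, keywords, institution_weights), p) for p in papers]
    kept = [pair for pair in scored if pair[0] >= min_score]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [paper for _, paper in kept]
```

Because the institution weight defaults to zero when no institution data is present, a filter of this shape still works on keywords alone, which matches the offline behaviour described above.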
Automated daily paper fetching with AI summaries to keep you updated.
Deep full-text analysis and interactive Q&A to bridge language and understanding gaps.
Efficient reading list management and flexible system configuration.
Supports user authentication for secure private access and public network deployment.
- Backend: Python 3.10+, Flask
- Frontend: HTML5, CSS3, Vanilla JS (Responsive)
- Database: SQLite (Metadata), Supabase (Optional Auth)
- AI Core:
We recommend using uv for fast and reliable dependency management.
- Python 3.10 or higher
- uv package manager
- Clone the Repository
    git clone https://github.com/flyflypeng/PaperPilot
    cd PaperPilot
- Initialize Environment
    uv venv
    source .venv/bin/activate  # Linux/macOS
    # .venv\Scripts\activate   # Windows
- Install Dependencies
  Option A: Standard (Client-only). Suitable if you use external APIs for AI tasks.
    uv pip install -e ".[local]"
  Option B: Full Server (Local AI). Includes dependencies for local MinerU and VLM inference.
    uv pip install -e ".[server]"
- Supabase Auth (Optional)
  PaperPilot's login/sign-up is powered by Supabase Auth (Email + Password). When enabled, the frontend uses supabase-js in the browser to obtain a session token and attaches Authorization: Bearer <access_token> to /api/* requests; the backend validates the token via Supabase /auth/v1/user. If SUPABASE_URL and SUPABASE_ANON_KEY are not configured, auth is disabled by default: the login overlay is hidden, and /api/* endpoints do not require an Authorization header, so you can enter the management UI directly.
  - Create a Supabase project and get API values
    - Create a project at https://supabase.com/
    - In the Supabase dashboard, open Project Settings → API
    - Copy Project URL as SUPABASE_URL
    - Copy Project API keys → anon public as SUPABASE_ANON_KEY
  - Enable Email auth
    - Go to Authentication → Providers
    - Enable Email (Email/Password)
    - For local/private deployments, you can disable email confirmations in Authentication → Settings to avoid requiring email verification after sign-up
  - Configure redirect URLs (important)
    - Go to Authentication → URL Configuration
    - Set Site URL to your site origin, for example:
      - Local: http://localhost:7191
      - Production: https://your-domain.com
    - Add allowed callback URLs in Redirect URLs (at least include your site root), for example:
      - http://localhost:7191/
      - https://your-domain.com/
  - Enable auth in PaperPilot
    Copy and edit environment variables in the project root:
      cp .env.example .env
      # Edit .env and fill in SUPABASE_URL and SUPABASE_ANON_KEY
    Restart the server. If configured correctly, you will see the login/sign-up entry on the page.
  Security notes
  - Use only the anon public key; never put service_role keys into .env or ship them to the browser
  - This project injects SUPABASE_URL and SUPABASE_ANON_KEY into pages for browser-side login, which is expected
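The token check described in this section (the backend forwarding the browser's bearer token to Supabase `/auth/v1/user`) could look roughly like the sketch below. The function names, the 10-second timeout, and the choice to send the anon key in an `apikey` header are assumptions for illustration; PaperPilot's actual backend code may differ.

```python
import json
import urllib.request


def extract_bearer_token(authorization_header):
    """Return the token from 'Bearer <token>', or None if the header is missing or malformed."""
    if not authorization_header:
        return None
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        return None
    return token.strip()


def build_user_request(supabase_url, anon_key, access_token):
    """Build the GET request for the Supabase /auth/v1/user endpoint."""
    return urllib.request.Request(
        supabase_url.rstrip("/") + "/auth/v1/user",
        headers={
            "apikey": anon_key,  # assumption: anon public key sent alongside the user token
            "Authorization": f"Bearer {access_token}",
        },
        method="GET",
    )


def validate_token(supabase_url, anon_key, authorization_header,
                   opener=urllib.request.urlopen):
    """Return the Supabase user dict if the token is accepted, otherwise None."""
    token = extract_bearer_token(authorization_header)
    if token is None:
        return None
    request = build_user_request(supabase_url, anon_key, token)
    try:
        with opener(request, timeout=10) as response:
            return json.loads(response.read().decode("utf-8"))
    except OSError:
        # URLError and HTTPError (e.g. 401 for an invalid token) are OSError subclasses.
        return None
```

The `opener` parameter exists only so the network call can be replaced in tests. In a Flask app, a check like this would typically run before each `/api/*` handler and return a 401 response when `validate_token` yields `None`.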
- Run the Application
  Start the application:
    python app.py
  Access the web interface at http://localhost:7191 (default port).
  Custom Launch Arguments: app.py supports the following command-line arguments for custom configuration:

  | Argument | Default | Description |
  |---|---|---|
  | --papers-dir | ./papers | Path to the papers directory (absolute or relative) |
  | --host | 0.0.0.0 | Server listening address |
  | --port | 7191 | Server listening port |
  | --debug | False | Enable debug mode (for development) |

  Typical Configuration Examples:
  - Specify Data Storage Location (useful for mounted data volumes):
      python app.py --papers-dir /mnt/data/my_papers
  - Change Server Port (if the default port is occupied):
      python app.py --port 8080
  - Allow Local Access Only (for enhanced security):
      python app.py --host 127.0.0.1
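For readers who want to see how the launch arguments above map onto code, the following is a minimal `argparse` sketch that mirrors the documented names and defaults. It is an illustration, not a copy of `app.py`; in particular, treating `--debug` as a boolean switch and `--port` as an integer are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented arguments and their defaults."""
    parser = argparse.ArgumentParser(description="PaperPilot launch options (illustrative sketch)")
    parser.add_argument("--papers-dir", default="./papers",
                        help="Path to the papers directory (absolute or relative)")
    parser.add_argument("--host", default="0.0.0.0",
                        help="Server listening address")
    parser.add_argument("--port", type=int, default=7191,
                        help="Server listening port")
    parser.add_argument("--debug", action="store_true",
                        help="Enable debug mode (for development)")
    return parser


# Equivalent to: python app.py --port 8080 --host 127.0.0.1
args = build_parser().parse_args(["--port", "8080", "--host", "127.0.0.1"])
print(f"papers_dir={args.papers_dir} host={args.host} port={args.port} debug={args.debug}")
```

Note that `argparse` converts `--papers-dir` to the attribute `papers_dir`, and any option left off the command line falls back to the default listed in the table.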
PaperPilot is designed to be configurable directly from the Web UI.
Navigate to the Settings tab to configure:
- LLM Provider: Set your API Key, Base URL, and Model Name (e.g., GPT-4, Qwen, DeepSeek).
- MinerU: Choose between Local instance or Cloud API.
Configure your research interests:
- Categories: Select ArXiv categories to monitor.
- Keywords: Define keywords for filtering and highlighting.
- Schedule: Set the automatic fetch interval.
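As a rough illustration of what a category-based fetch involves, the sketch below turns a category setting such as cs.CV,cs.AI into a query URL for the public arXiv API. The function name and the `max_results` default are hypothetical, and PaperPilot may fetch papers differently (for example via RSS feeds or a client library).

```python
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"


def build_arxiv_query(categories, max_results=50):
    """Build an arXiv API URL listing the newest submissions in the given categories."""
    cleaned = [c.strip() for c in categories if c.strip()]
    if not cleaned:
        raise ValueError("at least one category is required")
    # Categories are OR-ed together: cat:cs.CV OR cat:cs.AI
    search_query = " OR ".join(f"cat:{c}" for c in cleaned)
    params = {
        "search_query": search_query,
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "start": 0,
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urlencode(params)}"


# A comma-separated setting, as in the cs.CV,cs.AI example, split into a list.
url = build_arxiv_query("cs.CV,cs.AI".split(","))
print(url)
```

The API responds with an Atom feed; a scheduled job would request this URL at the configured interval and then apply keyword filtering to the returned entries.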
- Multi-User Support: The current version of PaperPilot is designed for individuals or small teams and does not yet fully support multi-tenancy. While it supports authentication via Supabase, all users share the same backend configuration and paper library. It is recommended to deploy in a private network or trusted environment.
- BabelDOC Translation: The English-Chinese parallel translation feature based on BabelDOC has high memory consumption and a long processing time. It is recommended to use this feature primarily for papers that require intensive reading.
Contributions are welcome! This roadmap is a living document and will evolve with user needs and available time.
- Reading annotations: highlights, comments, bookmarks, and one-click quote snippets
- Paper chat improvements: grounded answers with page/paragraph/snippet references
- Daily ArXiv rules upgrade: keyword combinations, exclusions, regex
- Deployment & ops: Docker Compose, automated backup/restore, health checks
- Semantic search: local embeddings index (optional vector store) with cross-library search
- Personal knowledge base: turn notes/summaries into a queryable research log (topics/timeline)
- Job queue & progress center: unified queue for translation/interpretation/indexing with retries and priorities
- Multi-tenancy: isolate libraries and settings per user/team
- Paper recommendations & graph: citation-network and reading-behavior signals
- Collaborative workspace: shared folders, team annotations, access control
- Multimodal understanding: structured extraction for figures/equations/tables and searchability
- Mobile/PWA: offline reading and cross-device sync
- Faster PDF rendering, page cache, and on-demand loading for long documents
- Incremental indexing for global search (avoid full rescans)
- Configuration validation and one-click diagnostics (LLM/MinerU/BabelDOC)
- Observability: job timings, failure reasons, and basic metrics
This project is licensed under the CC BY-NC 4.0 License. See the LICENSE file for details.







