Skip to content

A Fully Open-Source, AI-Native Paper Reading & Management Platform

License

Notifications You must be signed in to change notification settings

flyflypeng/PaperPilot

Β 
Β 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

206 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

PaperPilot

A Fully Open-Source, AI-Native Paper Reading & Management Platform

License: CC BY-NC 4.0 Python 3.10+ Status

English | δΈ­ζ–‡


This project is developed based on the open-source project Resophy. Special thanks to the original author for their interesting creation.

πŸ“– Introduction

<iframe src="//player.bilibili.com/player.html?bvid=BV1kGcGzdEYk&page=1&high_quality=1" scrolling="no" border="0" frameborder="no" allowfullscreen="true" width="100%"></iframe>

PaperPilot is a next-generation research assistant designed to streamline your academic workflow. By integrating advanced AI capabilities with a robust document management system, PaperPilot helps you discover, read, understand, and manage research papers more efficiently than ever before.

Whether you are tracking the latest ArXiv preprints or deep-diving into complex PDFs, PaperPilot acts as your intelligent co-pilot.

Note: This project was built using Trae Coding Agent via Vibe Coding, leveraging two models (Gemini-3-Pro-Preview for feature development and GPT-5.2 for bug fixes and performance optimization). It is truly incredible for someone like me who hasn't written modern Web frontend code 🀯 (the last time I wrote Web code was manually coding HTML and CSS during my undergraduate years).

✨ Key Features

πŸ“š Smart Paper Management

  • Seamless Upload: Drag & drop PDF uploads with automatic metadata extraction.
  • Organization: Custom categories, folders, and full-text search.
  • Zotero Integration: One-click import from Zotero RDF libraries.
  • Reading Heatmap: Visualize your reading habits with a GitHub-style contribution graph.

πŸ€– AI-Powered Reading Assistant

  • AI Translation: Generate pixel-perfect English-to-Chinese (and other languages) translations using BabelDOC, preserving original layout and charts.
  • AI Interpretation: Deep analysis of papers using MinerU (PDF-to-Markdown) and LLMs to generate structured summaries (Abstract, Methods, Experiments, Conclusions).
  • Chat with Paper: Interactive Q&A with your documents to clarify concepts and details.

πŸ“‘ Daily ArXiv Radar

  • Automated Tracking: Schedule daily fetches from specific ArXiv categories (e.g., cs.CV, cs.AI).
  • Smart Filtering: Filter papers by keywords, institution weights, and more.
  • AI Summarization: Automatically generate concise summaries for new arrivals.
  • Offline Capable: Works even without LLM connections (skips summary/institution details).

πŸ“Έ Feature Showcase

πŸ“‘ Daily ArXiv Tracking

Automated daily paper fetching with AI summaries to keep you updated.

πŸ€– AI Interpretation & Chat

Deep full-text analysis and interactive Q&A to bridge language and understanding gaps.

πŸ“š Management & Configuration

Efficient reading list management and flexible system configuration.

πŸ” Secure Authentication & Access Control

Supports user authentication for secure private access and public network deployment.

πŸ› οΈ Tech Stack

  • Backend: Python 3.10+, Flask
  • Frontend: HTML5, CSS3, Vanilla JS (Responsive)
  • Database: SQLite (Metadata), Supabase (Optional Auth)
  • AI Core:
    • MinerU (High-fidelity PDF parsing)
    • BabelDOC (Document Translation)
    • OpenAI-compatible LLM Interface

πŸš€ Installation

We recommend using uv for fast and reliable dependency management.

Prerequisites

  • Python 3.10 or higher
  • uv package manager

Steps

  1. Clone the Repository

    git clone https://github.com/flyflypeng/PaperPilot
    cd PaperPilot
  2. Initialize Environment

    uv venv
    source .venv/bin/activate  # Linux/macOS
    # .venv\Scripts\activate   # Windows
  3. Install Dependencies

    Option A: Standard (Client-only) Suitable if you use external APIs for AI tasks.

    uv pip install -e ".[local]"

    Option B: Full Server (Local AI) Includes dependencies for local MinerU and VLM inference.

    uv pip install -e ".[server]"
  4. Supabase Auth (Optional) PaperPilot’s login/sign-up is powered by Supabase Auth (Email + Password). When enabled, the frontend uses supabase-js in the browser to obtain a session token and attaches Authorization: Bearer <access_token> to /api/* requests; the backend validates the token via Supabase /auth/v1/user. If SUPABASE_URL and SUPABASE_ANON_KEY are not configured, auth is disabled by default: the login overlay is hidden, and /api/* endpoints do not require an Authorization header, so you can enter the management UI directly.

    1. Create a Supabase project and get API values
    • Create a project at https://supabase.com/
    • In the Supabase dashboard, open Project Settings β†’ API
    • Copy Project URL as SUPABASE_URL
    • Copy Project API keys β†’ anon public as SUPABASE_ANON_KEY
    1. Enable Email auth
    • Go to Authentication β†’ Providers
    • Enable Email (Email/Password)
    • For local/private deployments, you can disable email confirmations in Authentication β†’ Settings to avoid requiring email verification after sign-up
    1. Configure redirect URLs (important)
    • Go to Authentication β†’ URL Configuration
    • Set Site URL to your site origin, for example:
    • Local: http://localhost:7191
    • Production: https://your-domain.com
    • Add allowed callback URLs in Redirect URLs (at least include your site root), for example:
    • http://localhost:7191/
    • https://your-domain.com/
    1. Enable auth in PaperPilot Copy and edit environment variables in the project root:
    cp .env.example .env
    # Edit .env and fill in SUPABASE_URL and SUPABASE_ANON_KEY

    Restart the server. If configured correctly, you will see the login/sign-up entry on the page.

    Security notes

    • Use only the anon public key; never put service_role keys into .env or ship them to the browser
    • This project injects SUPABASE_URL and SUPABASE_ANON_KEY into pages for browser-side login, which is expected
  5. Run the Application

    Then start the application:

    python app.py

    Access the web interface at http://localhost:7191 (default port).

    Custom Launch Arguments: app.py supports the following command-line arguments for custom configuration:

    Argument Default Description
    --papers-dir ./papers Path to the papers directory (absolute or relative)
    --host 0.0.0.0 Server listening address
    --port 7191 Server listening port
    --debug False Enable debug mode (for development)

    Typical Configuration Examples:

    • Specify Data Storage Location (useful for mounted data volumes):

      python app.py --papers-dir /mnt/data/my_papers
    • Change Server Port (if the default port is occupied):

      python app.py --port 8080
    • Allow Local Access Only (for enhanced security):

      python app.py --host 127.0.0.1

βš™οΈ Configuration

PaperPilot is designed to be configurable directly from the Web UI.

Agentic Settings

Navigate to the Settings tab to configure:

  • LLM Provider: Set your API Key, Base URL, and Model Name (e.g., GPT-4, Qwen, DeepSeek).
  • MinerU: Choose between Local instance or Cloud API.

Daily ArXiv

Configure your research interests:

  • Categories: Select ArXiv categories to monitor.
  • Keywords: Define keywords for filtering and highlighting.
  • Schedule: Set the automatic fetch interval.

⚠️ Important Notes

  • Multi-User Support: The current version of PaperPilot is designed for individuals or small teams and does not yet fully support multi-tenancy. While it supports authentication via Supabase, all users share the same backend configuration and paper library. It is recommended to deploy in a private network or trusted environment.
  • BabelDOC Translation: The English-Chinese parallel translation feature based on BabelDOC has high memory consumption and a long processing time. It is recommended to use this feature primarily for papers that require intensive reading.

πŸ—ΊοΈ Roadmap

Contributions are welcome! This roadmap is a living document and will evolve with user needs and available time.

Near-Term (Next)

  • Reading annotations: highlights, comments, bookmarks, and one-click quote snippets
  • Paper chat improvements: grounded answers with page/paragraph/snippet references
  • Daily ArXiv rules upgrade: keyword combinations, exclusions, regex
  • Deployment & ops: Docker Compose, automated backup/restore, health checks

Mid-Term (Mid)

  • Semantic search: local embeddings index (optional vector store) with cross-library search
  • Personal knowledge base: turn notes/summaries into a queryable research log (topics/timeline)
  • Job queue & progress center: unified queue for translation/interpretation/indexing with retries and priorities

Long-Term (Future)

  • Multi-tenancy: isolate libraries and settings per user/team
  • Paper recommendations & graph: citation-network and reading-behavior signals
  • Collaborative workspace: shared folders, team annotations, access control
  • Multimodal understanding: structured extraction for figures/equations/tables and searchability
  • Mobile/PWA: offline reading and cross-device sync

Performance & UX (Ongoing)

  • Faster PDF rendering, page cache, and on-demand loading for long documents
  • Incremental indexing for global search (avoid full rescans)
  • Configuration validation and one-click diagnostics (LLM/MinerU/BabelDOC)
  • Observability: job timings, failure reasons, and basic metrics

πŸ“„ License

This project is licensed under the CC BY-NC 4.0 License. See the LICENSE file for details.


Made with ❀️ by the PaperPilot Team

About

A Fully Open-Source, AI-Native Paper Reading & Management Platform

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 42.7%
  • JavaScript 38.6%
  • CSS 9.5%
  • HTML 9.2%