Commit 25bd5c0

Update README.md
1 parent 18a0fa1 commit 25bd5c0

1 file changed: +15 -50 lines changed

README.md

Lines changed: 15 additions & 50 deletions
@@ -4,64 +4,50 @@
 </a>
 </div>
 
+
 # PageIndex MCP
 
-A Model Context Protocol (MCP) server for **PageIndex** - Next-Generation Reasoning-based RAG.
+Want to chat with PDF on Claude but got limit reached error? You can add your file to PageIndex to seamlessly chat with long PDFs on your Claude desktop.
 
-For an overview and quick start, check out the [PageIndex MCP](https://pageindex.ai/mcp) project page.
+- Support local and online PDFs
+- Free 1000 pages
+- Unlimited conversations
+
+For more information about PageIndex MCP, check out the [PageIndex MCP](https://pageindex.ai/mcp) project page.
 
-## What is PageIndex?
+# What is PageIndex?
 
 <div align="center">
 <a href="https://pageindex.ai/mcp">
 <img src="https://docs.pageindex.ai/images/cookbook/vectorless-rag.png" width="80%">
 </a>
 </div>
 
-PageIndex is a revolutionary document processing system that uses **reasoning-based RAG** instead of traditional vector-based similarity search. Unlike conventional RAG systems that rely on semantic similarity, PageIndex uses multi-step reasoning and tree search to retrieve information like a human expert would.
-
-### Key Advantages over Vector-based RAG
+PageIndex is a vectorless **reasoning-based RAG** system which uses multi-step reasoning and tree search to retrieve information like a human expert would. It has the following properties:
 
-- **Higher Accuracy**: Relevance beyond similarity - ideal for domain-specific documents where semantics are similar
+- **Higher Accuracy**: Relevance beyond similarity -
 - **Better Transparency**: Clear reasoning trajectory with traceable search paths
 - **Like A Human**: Retrieve information like a human expert navigates documents
 - **No Vector DB**: No extra infrastructure overhead
 - **No Chunking**: Preserve full document context and structure
 - **No Top-K**: Retrieve all relevant passages automatically
 
-## Features
-
-- **Local PDF Processing**: Upload local PDF files directly without manual uploads
-- **URL Support**: Process documents from URLs
-- **Full PageIndex Integration**: Access all PageIndex capabilities (OCR, tree generation, reasoning-based retrieval)
-- **Secure OAuth Authentication**: OAuth 2.1 with PKCE and automatic token refresh
-- **TypeScript**: Full type safety with MCP SDK
-- **Desktop Extension (DXT)**: One-click installation for Claude Desktop with secure configuration
-
-## Usage
-
-### Getting Started
 
-The PageIndex MCP server uses OAuth 2.1 authentication for secure access. When you first run the server, it will guide you through the authentication process by opening your browser to authorize the application.
+---
+# PageIndex MCP Setup
+See [PageIndex MCP](https://pageindex.ai/mcp) for full video guidances.
 
-### For Claude Desktop (Recommended)
+### 1. For Claude Desktop (Recommended)
 
 **One-Click Installation with Desktop Extension (DXT):**
 
 1. Download the latest `.dxt` file from [Releases](https://github.com/VectifyAI/pageindex-mcp/releases)
 2. Double-click the `.dxt` file to install automatically in Claude Desktop
 3. The OAuth authentication will be handled automatically when you first use the extension
 
-**Benefits of DXT Installation:**
-
-- **No technical setup** - just download and double-click
-- **Secure OAuth authentication** - handled automatically through your browser
-- **Automatic updates** - extensions update seamlessly
-- **Full local PDF support** - upload and process PDFs directly from your computer
-
 This is the easiest way to get started with PageIndex's reasoning-based RAG capabilities.
 
-### For Other MCP-Compatible Clients
+### 2. For Other MCP-Compatible Clients
 
 #### Option 1: Local MCP Server (with local PDF upload)
 
@@ -80,12 +66,6 @@ Add to your MCP configuration:
 }
 ```
 
-**Authentication Process:**
-1. When you first connect, the server will automatically open your browser for OAuth authentication
-2. Log in to your PageIndex account and authorize the application
-3. The authentication tokens are securely stored locally and automatically refreshed
-4. Subsequent connections will use the stored credentials automatically
-
 > **Note**: This local server provides full PDF upload capabilities and handles all authentication automatically.
 
 #### Option 2: Direct Connection to PageIndex
@@ -103,10 +83,6 @@ Connect directly to the PageIndex OAuth-enabled MCP server:
 }
 ```
 
-**Authentication Process:**
-1. The MCP client will automatically handle the OAuth flow
-2. You'll be redirected to authorize the application in your browser
-3. Authentication tokens are managed by the MCP client
 
 **For clients that don't support HTTP MCP servers:**
 
@@ -125,18 +101,7 @@ If your MCP client doesn't support HTTP servers directly, you can use [mcp-remot
 
 > **Note**: Option 1 provides local PDF upload capabilities, while Option 2 only supports PDF processing via URLs (no local file uploads).
 
-## Available Tools
-
-| Tool | Description | Key Parameters |
-| -------------------------- | ------------------------------------------------------------------------------------------------ | ---------------------------------------- |
-| **process_document** | Upload and process PDF documents from local files or URLs with PageIndex OCR and tree generation | `url` - Local file path or PDF URL |
-| **recent_documents** | Get recent documents with status overview | `limit` (optional) |
-| **get_document** | Get basic document info and status | `doc_id` |
-| **get_document_structure** | Extract hierarchical document structure with configurable detail levels | `doc_id`, `max_depth` (1-10) |
-| **get_page_content** | Extract specific page content from processed documents | `doc_id`, `pages` ("5", "3-7", "1,5,10") |
-| **remove_document** | Permanently delete documents and associated data | `doc_ids` (array of document IDs) |
 
-> **Quick Example**: Process a local PDF with `process_document`, then extract content with `get_page_content` using the returned document ID.
 
 ## License
 