Commit 141d2a7

docs: update README and MCP integration guide to clarify Node MCP server options, enhance troubleshooting tips, and improve documentation on the new faxbot_pdf tool for PDF text extraction
1 parent c3bfea6 commit 141d2a7

9 files changed: +262 -113 lines changed

README.md

Lines changed: 3 additions & 5 deletions
@@ -23,11 +23,9 @@ Simple fax-sending API with AI integration. Choose your backend:
 ## AI Assistant Integration
 [→ MCP Integration Guide](docs/MCP_INTEGRATION.md)

-- Recommended (OCR, avoids base64): use the new Node MCP servers in `node_mcp/` and the `faxbot_pdf` prompt to extract PDF text locally and send as TXT fax.
-- `cd node_mcp && npm install && ./scripts/start-stdio.sh`
-- Env: `FAX_API_URL`, `API_KEY`, optional `MAX_TEXT_SIZE`
-- Legacy servers remain under `api/` (`start-mcp.sh`, `start-mcp-http.sh`) and Python `python_mcp/`.
-- OAuth2‑protected SSE MCP servers are available in both Node and Python — see the SSE sections in the MCP guide.
+- Node MCP servers live in `node_mcp/` (stdio, HTTP, SSE+OAuth).
+- Legacy servers remain under `api/` and Python `python_mcp/`.
+- OAuth2‑protected SSE MCP servers are available in both Node and Python.

 ## Client SDKs
 - Python: `pip install faxbot`

docs/MCP_INTEGRATION.md

Lines changed: 27 additions & 51 deletions
@@ -15,13 +15,13 @@ Faxbot provides **2 MCP servers × 3 transports = 6 integration options**:
 | **HTTP** | mcp_http_server.js | 3001 | API key | Web apps, cloud AI |
 | **SSE+OAuth** | mcp_sse_server.js | 3002 | JWT/Bearer | Enterprise, HIPAA |

-New (recommended) Node MCP servers with OCR prompt live under `node_mcp/`:
+Node MCP servers live under `node_mcp/`:

-| Transport | File (node_mcp) | Port | Auth | Notes |
-|-----------|------------------|------|------|-------|
-| **stdio** | src/servers/stdio.js | N/A | API key | Includes `faxbot_pdf` prompt (OCR) |
-| **HTTP** | src/servers/http.js | 3001 | API key | Streamable HTTP + prompts |
-| **SSE+OAuth** | src/servers/sse.js | 3002 | JWT/Bearer | Prompts + OAuth2 |
+| Transport | File (node_mcp) | Port | Auth |
+|-----------|------------------|------|------|
+| **stdio** | src/servers/stdio.js | N/A | API key |
+| **HTTP** | src/servers/http.js | 3001 | API key |
+| **SSE+OAuth** | src/servers/sse.js | 3002 | JWT/Bearer |

 **Quick Selection Guide:**
 - **stdio**: Desktop AI assistants (Claude Desktop, Cursor) - simplest setup
@@ -39,58 +39,33 @@ Assistant → MCP Server → Faxbot API → Backend (Phaxio or SIP/Asterisk)
 - Input: `{ jobId }`
 - Output: Formatted job status.

-## ⚠️ Critical Limitation: Base64 File Encoding
+## Setup

-**This is a MAJOR user experience limitation that severely constrains real-world usage:**
-
-### What This Means for Users:
-- **You CANNOT just say "fax this PDF file"** to Claude and point to a file on your computer
-- **The AI assistant must read your file AND convert it to base64 encoding** before calling the fax tools
-- **Large PDFs (>1MB) will consume massive amounts of conversation tokens** and may hit model limits
-- **This effectively limits faxing to small documents** (few pages max)
-
-### The Technical Problem:
-1. MCP protocol requires `fileContent` parameter as base64-encoded string
-2. Claude Desktop/AI assistant must:
-- Read the file from your local filesystem (requires filesystem MCP server)
-- Encode entire file as base64 in memory
-- Pass huge base64 string as tool parameter
-- Base64 encoding increases file size by ~33%
-
-### Realistic User Workflow (Legacy Base64 Path):
+Start Node MCP (stdio/HTTP/SSE):
 ```
-❌ NOT POSSIBLE: "Hey Claude, fax document.pdf to +1234567890"
-
-✅ ACTUALLY REQUIRED:
-1. User: "Please read document.pdf and fax it to +1234567890"
-2. Claude: Uses filesystem MCP to read file
-3. Claude: Converts file to base64 (consuming massive tokens)
-4. Claude: Calls send_fax with giant base64 string
-5. Faxbot MCP: Decodes base64 back to original file
+cd node_mcp && npm install
+FAX_API_URL=http://localhost:8080 API_KEY=$API_KEY ./scripts/start-stdio.sh
+FAX_API_URL=http://localhost:8080 API_KEY=$API_KEY MCP_HTTP_PORT=3001 ./scripts/start-http.sh
+OAUTH_ISSUER=... OAUTH_AUDIENCE=... FAX_API_URL=http://localhost:8080 API_KEY=$API_KEY MCP_SSE_PORT=3002 ./scripts/start-sse.sh
 ```

-### File Size Impact:
-- **Small PDF (100KB)**: ~400KB tokens, usable
-- **Typical PDF (1MB)**: ~4MB tokens, may hit limits
-- **Large PDF (5MB)**: ~20MB tokens, **will fail**
-
-### Why This Design Was Chosen:
-MCP protocol's JSON-based messaging requires binary data as base64. Alternative approaches (file paths, resource URLs) are emerging in the MCP community but not yet standardized for tool parameters.
-
-## OCR Workaround (Recommended): Faxbot Prompt/Tool (Node + Python)
-
-The new Node MCP servers in `node_mcp/` add a prompt-driven workflow that avoids sending base64 data through the conversation. Python MCP servers now include a matching tool for parity.
-
-- Node prompt: `faxbot_pdf`
-- Python tool: `faxbot_pdf(pdf_path, to, header_text?)`
-- Behavior: Extracts text from the local PDF and sends it as a TXT fax to drastically reduce tokens. If text is not embedded, optional OCR fallback can be enabled.
-
-Example prompt execution (conceptual GetPrompt request):
+Start Python MCP (stdio/HTTP/SSE):
 ```
-name: "faxbot_pdf"
-arguments: { "pdf_path": "/abs/path/report.pdf", "to": "+15551234567", "header_text": "Acme Clinic" }
+cd python_mcp
+python -m venv .venv && source .venv/bin/activate
+pip install -r requirements.txt
+export FAX_API_URL=http://localhost:8080
+export API_KEY=your_api_key
+python stdio_server.py # stdio
+# or: uvicorn http_server:app --host 0.0.0.0 --port 3004
+# or: uvicorn server:app --host 0.0.0.0 --port 3003 (SSE+OAuth)
 ```

+## Tools
+
+- send_fax: Send PDF or TXT files to a fax number.
+- get_fax_status: Check fax job status.
+
 Start Node MCP servers with OCR support:
 ```
 # stdio (desktop assistants)
@@ -108,6 +83,7 @@ OAUTH_ISSUER=... OAUTH_AUDIENCE=... FAX_API_URL=http://localhost:8080 API_KEY=$A
 Notes:
 - Set `MAX_TEXT_SIZE` (default 100000 bytes) to control extracted text size. Exceeding text is truncated with a warning.
 - This path does not embed base64 in the AI conversation, improving reliability for large PDFs.
+- An optional prompt named `faxbot_pdf` is available in Node that returns a message instructing the model to call the tool; the prompt itself does not send the fax.

 Python MCP (stdio/HTTP/SSE) with OCR tool:
 ```
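
Reviewer note on the `MAX_TEXT_SIZE` bullet in the last hunk: the guide says oversized extracted text is truncated with a warning, but the truncation code is not part of this diff. As a rough sketch only, a byte-limited cut that respects UTF-8 character boundaries could look like the following. `truncateToMaxBytes` and its return shape are hypothetical names; the 100000 default mirrors the documented default.

```javascript
// Hypothetical helper, not taken from the Faxbot source: cap text at a byte budget.
function truncateToMaxBytes(text, maxBytes = 100000) {
  const buf = Buffer.from(text, 'utf8');
  if (buf.length <= maxBytes) return { text, truncated: false };
  // Step back while the first excluded byte is a UTF-8 continuation byte (10xxxxxx),
  // so the cut never lands inside a multi-byte character.
  let end = maxBytes;
  while (end > 0 && (buf[end] & 0xc0) === 0x80) end -= 1;
  return { text: buf.subarray(0, end).toString('utf8'), truncated: true };
}

// Assumption: the limit is read from the MAX_TEXT_SIZE environment variable.
const limit = parseInt(process.env.MAX_TEXT_SIZE || '100000', 10);
const result = truncateToMaxBytes('héllo wörld', limit);
if (result.truncated) console.warn(`Extracted text truncated to ${limit} bytes`);
```

Counting bytes rather than characters matches the "bytes" wording in the note; a character-based limit would be simpler but would let non-ASCII text exceed the budget.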

docs/TROUBLESHOOTING.md

Lines changed: 5 additions & 18 deletions
@@ -42,31 +42,18 @@ If you're unsure which MCP transport to use:

 ### Common MCP Problems

-#### Base64 File Handling (MAJOR LIMITATION)
-- **"File too large" or token limit errors**: PDFs >1MB will likely fail due to base64 token consumption
-- **AI assistant can't find file**: You need BOTH faxbot MCP AND filesystem MCP servers running
-- **Workflow confusion**: You can't just say "fax this file" - AI must read file first, then encode as base64
-- **Real size limits**:
-- 100KB PDF = usable
-- 500KB PDF = borderline
-- 1MB+ PDF = probably fails
-- **Recommended Workaround (OCR)**: Use the Faxbot OCR workflow to extract text locally and send as TXT fax.
-- Node prompt: `faxbot_pdf` (in `node_mcp/`)
-- Python tool: `faxbot_pdf(pdf_path, to, header_text?)` (in `python_mcp/`)
-- Start (Node): `cd node_mcp && npm install && ./scripts/start-stdio.sh`
-- Start (Python): `cd python_mcp && python -m venv .venv && source .venv/bin/activate && pip install -r requirements.txt && python stdio_server.py`
-- Env: set `FAX_API_URL` and `API_KEY`; optional `MAX_TEXT_SIZE`, `FAXBOT_OCR_ENABLE=true`, `FAXBOT_OCR_DPI=200`
+#### MCP Usage Tips
+- Ensure the main Faxbot API is reachable (`FAX_API_URL`) and your `API_KEY` is set.
+- For local files, use tooling that can access your filesystem as needed.

 #### Connection & Authentication
 - **MCP server not found**: Ensure you’re starting from the correct path:
 - Legacy servers: `api/scripts/start-mcp*.sh`
 - New servers (recommended): `node_mcp/scripts/start-*.sh`
 - Python servers: `python_mcp/` (`stdio_server.py`, `http_server.py`, `server.py`)

-### OCR Issues (Python)
-- `pytesseract` errors: Install Tesseract OCR (macOS: `brew install tesseract`, Ubuntu: `sudo apt-get install tesseract-ocr`). Ensure `tesseract` is on PATH or set `TESSERACT_CMD`.
-- `pypdfium2` missing: Run `pip install -r python_mcp/requirements.txt` within a virtualenv.
-- OCR not triggered: Ensure `FAXBOT_OCR_ENABLE=true`. The code only falls back to OCR when extracted text is empty or very short.
+### Environment
+- `FAX_API_URL`, `API_KEY`: Required for authentication.
 - **Authentication failures**:
 - stdio: Check `API_KEY` environment variable matches Faxbot API setting
 - HTTP: Verify `X-API-Key` header is being passed correctly
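
Reviewer note: the new "MCP Usage Tips" and "Environment" entries both come down to two variables being set before an MCP server starts. A minimal pre-flight check, assuming nothing beyond the variable names in the diff, might look like this. `checkFaxbotEnv` is a hypothetical helper and is not part of Faxbot.

```javascript
// Hypothetical pre-flight check for the two variables the troubleshooting guide names.
function checkFaxbotEnv(env = process.env) {
  const problems = [];
  if (!env.FAX_API_URL) {
    problems.push('FAX_API_URL is not set');
  } else {
    try {
      new URL(env.FAX_API_URL); // throws a TypeError on malformed URLs
    } catch {
      problems.push('FAX_API_URL is not a valid URL');
    }
  }
  if (!env.API_KEY) problems.push('API_KEY is not set');
  return problems;
}

const problems = checkFaxbotEnv();
if (problems.length > 0) console.error(`Faxbot MCP environment issues: ${problems.join('; ')}`);
```

This only confirms the variables are present and well formed; it does not verify that the API is reachable or that the key is accepted.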

node_mcp/README.md

Lines changed: 7 additions & 11 deletions
@@ -1,11 +1,10 @@
-# Faxbot Node MCP (OCR Workflow)
+# Faxbot Node MCP

-Node-based MCP servers for Faxbot with smart PDF-to-text extraction to avoid base64 token limits.
+Node-based MCP servers for Faxbot.

 Features:
 - Stdio, Streamable HTTP, and SSE+OAuth transports
-- send_fax and get_fax_status tools (backward compatible)
-- faxbot_pdf prompt: extracts PDF text locally, sends as TXT fax
+- Tools: send_fax, get_fax_status

 ## Install

@@ -35,17 +34,14 @@ Environment variables:
 ./scripts/start-sse.sh
 ```

-## Prompts
-
-- `faxbot_pdf(pdf_path, to, header_text?)`
-- Extracts PDF text locally and sends as a text fax.
-- Returns a confirmation message and job ID.
-
 ## Tools

-- `send_fax(to, fileContent(base64), fileName, fileType?)`
+- `send_fax(to, filePath?, fileContent?, fileName?, fileType?)`
 - `get_fax_status(jobId)`

+## Notes
+- For local files, tools accept a `filePath` parameter (preferred). Base64 is still supported for compatibility.
+
 ## Notes

 - Existing `/api` MCP servers remain as fallback. These servers are the new default target for OCR workflows.
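
Reviewer note: the new `send_fax` signature makes `filePath` and `fileContent` both optional, with `filePath` described as preferred. The diff does not show how the handler chooses between them, so the sketch below is only one plausible reading. `pickFaxSource` is a hypothetical name, and requiring `fileName` alongside base64 content is an assumption carried over from the old signature.

```javascript
// Hypothetical argument check; the real handleSendFaxTool may behave differently.
function pickFaxSource({ filePath, fileContent, fileName } = {}) {
  // Assumption: a local path wins when both inputs are supplied ("preferred").
  if (filePath) return { kind: 'path', filePath };
  if (fileContent) {
    // Assumption: base64 content still needs a file name, as in the old signature.
    if (!fileName) throw new Error('fileName is required with fileContent');
    return { kind: 'base64', fileName, bytes: Buffer.from(fileContent, 'base64') };
  }
  throw new Error('Provide either filePath or fileContent');
}

console.log(pickFaxSource({ filePath: '/abs/path/report.pdf' }).kind);
console.log(pickFaxSource({ fileContent: 'aGk=', fileName: 'note.txt' }).bytes.length);
```

Resolving the input in one place keeps the base64 path available for compatibility while letting path-based callers skip encoding entirely.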

node_mcp/src/servers/stdio.js

Lines changed: 8 additions & 7 deletions
@@ -9,7 +9,7 @@ import {
   ListToolsRequestSchema,
   McpError,
 } from '@modelcontextprotocol/sdk/types.js';
-import { faxTools, handleSendFaxTool, handleGetFaxStatusTool } from '../tools/fax-tools.js';
+import { faxTools, handleSendFaxTool, handleGetFaxStatusTool, handleFaxbotPdfTool } from '../tools/fax-tools.js';
 import { listPrompts } from '../prompts/index.js';
 import { extractTextFromPDF } from '../shared/pdf-extractor.js';
 import { sendFax } from '../shared/fax-client.js';
@@ -62,6 +62,8 @@ function buildServer() {
         return await handleSendFaxTool(args);
       case 'get_fax_status':
         return await handleGetFaxStatusTool(args);
+      case 'faxbot_pdf':
+        return await handleFaxbotPdfTool(args);
       default:
         throw new McpError(ErrorCode.MethodNotFound, `Unknown tool: ${name}`);
     }
@@ -72,15 +74,14 @@ function buildServer() {
   server.setRequestHandler(GetPromptRequestSchema, async (request) => {
     const { name, arguments: args } = request.params;
     if (name === 'faxbot_pdf') {
-      const { message, jobId } = await executeFaxbotPdf(args || {});
+      const { pdf_path, to, header_text } = (args || {});
+      const instruction = `Plan: Extract text from PDF at ${pdf_path} and send as TXT fax to ${to}.\n` +
+        `Action: Call the tool 'faxbot_pdf' with the same arguments to execute. Optional header_text: ${header_text || '(none)'}\n` +
+        `Note: This prompt only describes the plan and does not send the fax itself.`;
       return {
         messages: [
-          {
-            role: 'user',
-            content: { type: 'text', text: message },
-          },
+          { role: 'user', content: { type: 'text', text: instruction } },
         ],
-        _meta: { jobId },
       };
     }
     throw new McpError(ErrorCode.MethodNotFound, `Unknown prompt: ${name}`);
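
Reviewer note: the last hunk turns the `faxbot_pdf` prompt into a pure description. It returns one message and leaves the actual send to the `faxbot_pdf` tool call. The same behavior can be isolated in a side-effect-free function, sketched here. `buildFaxbotPdfPrompt` is a hypothetical name; the message text is copied from the diff.

```javascript
// Hypothetical standalone version of the prompt handler body from the hunk above.
function buildFaxbotPdfPrompt(args) {
  const { pdf_path, to, header_text } = args || {};
  const instruction =
    `Plan: Extract text from PDF at ${pdf_path} and send as TXT fax to ${to}.\n` +
    `Action: Call the tool 'faxbot_pdf' with the same arguments to execute. ` +
    `Optional header_text: ${header_text || '(none)'}\n` +
    `Note: This prompt only describes the plan and does not send the fax itself.`;
  // GetPrompt result shape: a list of messages, with no job ID and nothing sent.
  return { messages: [{ role: 'user', content: { type: 'text', text: instruction } }] };
}

const prompt = buildFaxbotPdfPrompt({ pdf_path: '/abs/path/report.pdf', to: '+15551234567' });
console.log(prompt.messages[0].content.text);
```

With missing arguments the template interpolates `undefined` into the text, so a real handler may want to validate `pdf_path` and `to` before building the message.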

node_mcp/src/shared/pdf-extractor.js

Lines changed: 71 additions & 1 deletion
@@ -1,5 +1,7 @@
 import fs from 'fs';
 import path from 'path';
+import os from 'os';
+import { execFile } from 'child_process';

 function normalizeWhitespace(text) {
   if (!text || typeof text !== 'string') return '';
@@ -53,7 +55,15 @@ export async function extractTextFromPDF(filePath) {
       throw new Error('Provided path is not a file');
     }
     const buffer = await fs.promises.readFile(filePath);
-    return await extractTextFromBuffer(buffer);
+    const text = await extractTextFromBuffer(buffer);
+    if (text && text.length >= 32) return text;
+    // Optional OCR fallback using pdftoppm + tesseract if available/enabled
+    const ocrEnabled = (process.env.FAXBOT_OCR_ENABLE || 'true').toLowerCase() !== 'false';
+    if (ocrEnabled) {
+      const ocr = await ocrPdfWithTesseract(filePath);
+      if (ocr) return ocr;
+    }
+    return text;
   } catch (err) {
     if (err && (err.code === 'ENOENT' || err.code === 'ENOTDIR')) {
       throw new Error(`File not found: ${filePath}`);
@@ -63,3 +73,63 @@ export async function extractTextFromPDF(filePath) {
 }

 export default { extractTextFromPDF, extractTextFromBuffer };
+
+// Helpers
+function which(cmd) {
+  const sep = process.platform === 'win32' ? ';' : ':';
+  const exts = process.platform === 'win32' ? (process.env.PATHEXT || '.EXE;.CMD;.BAT').toLowerCase().split(';') : [''];
+  const paths = (process.env.PATH || '').split(sep);
+  for (const p of paths) {
+    const full = path.join(p, cmd);
+    for (const ext of exts) {
+      const candidate = full + ext;
+      try {
+        fs.accessSync(candidate, fs.constants.X_OK);
+        return candidate;
+      } catch {}
+    }
+  }
+  return null;
+}
+
+function execFileAsync(cmd, args, options = {}) {
+  return new Promise((resolve, reject) => {
+    execFile(cmd, args, options, (error, stdout, stderr) => {
+      if (error) return reject(Object.assign(error, { stdout, stderr }));
+      resolve({ stdout, stderr });
+    });
+  });
+}
+
+async function ocrPdfWithTesseract(pdfPath) {
+  const pdftoppm = which('pdftoppm');
+  const tesseract = which('tesseract');
+  if (!pdftoppm || !tesseract) return '';
+  const dpi = parseInt(process.env.FAXBOT_OCR_DPI || '200', 10);
+  const tmp = await fs.promises.mkdtemp(path.join(os.tmpdir(), 'faxbot-ocr-'));
+  try {
+    const prefix = path.join(tmp, 'page');
+    await execFileAsync(pdftoppm, ['-r', String(dpi), '-png', pdfPath, prefix], { maxBuffer: 1024 * 1024 * 64 });
+    // Collect generated images
+    const files = await fs.promises.readdir(tmp);
+    const pngs = files.filter((f) => f.startsWith('page-') && f.endsWith('.png')).sort((a, b) => a.localeCompare(b));
+    let out = '';
+    for (const f of pngs) {
+      const img = path.join(tmp, f);
+      try {
+        const { stdout } = await execFileAsync(tesseract, [img, 'stdout']);
+        out += `\n\n${stdout || ''}`;
+      } catch {}
+    }
+    return normalizeWhitespace(out);
+  } catch {
+    return '';
+  } finally {
+    // cleanup
+    try {
+      const files = await fs.promises.readdir(tmp);
+      await Promise.all(files.map((f) => fs.promises.unlink(path.join(tmp, f)).catch(() => {})));
+      await fs.promises.rmdir(tmp).catch(() => {});
+    } catch {}
+  }
+}
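
Reviewer note: the fallback in `extractTextFromPDF` depends on two conditions, a 32-character threshold on the embedded text and the `FAXBOT_OCR_ENABLE` variable. The sketch below restates just that decision so the edge cases are easy to see. The function name and the injectable `env` parameter are assumptions for illustration; the conditions themselves are copied from the diff.

```javascript
// Mirrors the conditions in extractTextFromPDF from the hunk above; the function
// name and the injectable `env` parameter are hypothetical.
function shouldAttemptOcr(text, env = process.env) {
  const hasEnoughText = Boolean(text) && text.length >= 32;
  if (hasEnoughText) return false;
  // OCR is on by default; only the literal string "false" (any case) disables it.
  return (env.FAXBOT_OCR_ENABLE || 'true').toLowerCase() !== 'false';
}

console.log(shouldAttemptOcr('', {}));                             // true
console.log(shouldAttemptOcr('x'.repeat(32), {}));                 // false
console.log(shouldAttemptOcr('', { FAXBOT_OCR_ENABLE: 'FALSE' })); // false
console.log(shouldAttemptOcr('', { FAXBOT_OCR_ENABLE: '0' }));     // true
```

The last case shows that values such as `0` or `no` do not turn the fallback off. Even when this check passes, the diff's `ocrPdfWithTesseract` returns an empty string unless both `pdftoppm` and `tesseract` are found on `PATH`, in which case the short embedded text is returned unchanged.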
