docs: update README and MCP integration guide to clarify Node MCP server options, enhance troubleshooting tips, and improve documentation on the new faxbot_pdf tool for PDF text extraction

DMontgomery40 · DMontgomery40 · commit 141d2a79bc91 · 2025-09-06T08:03:24.000-06:00
diff --git a/README.md b/README.md
@@ -23,11 +23,9 @@ Simple fax-sending API with AI integration. Choose your backend:
 ## AI Assistant Integration
 [→ MCP Integration Guide](docs/MCP_INTEGRATION.md)
 
-- Recommended (OCR, avoids base64): use the new Node MCP servers in `node_mcp/` and the `faxbot_pdf` prompt to extract PDF text locally and send as TXT fax.
-  - `cd node_mcp && npm install && ./scripts/start-stdio.sh`
-  - Env: `FAX_API_URL`, `API_KEY`, optional `MAX_TEXT_SIZE`
-- Legacy servers remain under `api/` (`start-mcp.sh`, `start-mcp-http.sh`) and Python `python_mcp/`.
-- OAuth2‑protected SSE MCP servers are available in both Node and Python — see the SSE sections in the MCP guide.
+- Node MCP servers live in `node_mcp/` (stdio, HTTP, SSE+OAuth).
+- Legacy servers remain under `api/` and Python `python_mcp/`.
+- OAuth2‑protected SSE MCP servers are available in both Node and Python.
 
 ## Client SDKs
 - Python: `pip install faxbot`
diff --git a/docs/MCP_INTEGRATION.md b/docs/MCP_INTEGRATION.md
@@ -15,13 +15,13 @@ Faxbot provides **2 MCP servers × 3 transports = 6 integration options**:
 | **HTTP** | mcp_http_server.js | 3001 | API key | Web apps, cloud AI |
 | **SSE+OAuth** | mcp_sse_server.js | 3002 | JWT/Bearer | Enterprise, HIPAA |
 
-New (recommended) Node MCP servers with OCR prompt live under `node_mcp/`:
+Node MCP servers live under `node_mcp/`:
 
-| Transport | File (node_mcp) | Port | Auth | Notes |
-|-----------|------------------|------|------|-------|
-| **stdio** | src/servers/stdio.js | N/A | API key | Includes `faxbot_pdf` prompt (OCR) |
-| **HTTP** | src/servers/http.js | 3001 | API key | Streamable HTTP + prompts |
-| **SSE+OAuth** | src/servers/sse.js | 3002 | JWT/Bearer | Prompts + OAuth2 |
+| Transport | File (node_mcp) | Port | Auth |
+|-----------|------------------|------|------|
+| **stdio** | src/servers/stdio.js | N/A | API key |
+| **HTTP** | src/servers/http.js | 3001 | API key |
+| **SSE+OAuth** | src/servers/sse.js | 3002 | JWT/Bearer |
 
 **Quick Selection Guide:**
 - **stdio**: Desktop AI assistants (Claude Desktop, Cursor) - simplest setup
@@ -39,58 +39,33 @@ Assistant → MCP Server → Faxbot API → Backend (Phaxio or SIP/Asterisk)
   - Input: `{ jobId }`
   - Output: Formatted job status.
 
-## ⚠️ Critical Limitation: Base64 File Encoding
+## Setup
 
-**This is a MAJOR user experience limitation that severely constrains real-world usage:**
-
-### What This Means for Users:
-- **You CANNOT just say "fax this PDF file"** to Claude and point to a file on your computer
-- **The AI assistant must read your file AND convert it to base64 encoding** before calling the fax tools
-- **Large PDFs (>1MB) will consume massive amounts of conversation tokens** and may hit model limits
-- **This effectively limits faxing to small documents** (few pages max)
-
-### The Technical Problem:
-1. MCP protocol requires `fileContent` parameter as base64-encoded string
-2. Claude Desktop/AI assistant must:
-   - Read the file from your local filesystem (requires filesystem MCP server)
-   - Encode entire file as base64 in memory  
-   - Pass huge base64 string as tool parameter
-   - Base64 encoding increases file size by ~33%
-
-### Realistic User Workflow (Legacy Base64 Path):
+Start Node MCP (stdio/HTTP/SSE):
 ```
-❌ NOT POSSIBLE: "Hey Claude, fax document.pdf to +1234567890"
-
-✅ ACTUALLY REQUIRED:
-1. User: "Please read document.pdf and fax it to +1234567890" 
-2. Claude: Uses filesystem MCP to read file
-3. Claude: Converts file to base64 (consuming massive tokens)
-4. Claude: Calls send_fax with giant base64 string
-5. Faxbot MCP: Decodes base64 back to original file
+cd node_mcp && npm install
+FAX_API_URL=http://localhost:8080 API_KEY=$API_KEY ./scripts/start-stdio.sh
+FAX_API_URL=http://localhost:8080 API_KEY=$API_KEY MCP_HTTP_PORT=3001 ./scripts/start-http.sh
+OAUTH_ISSUER=... OAUTH_AUDIENCE=... FAX_API_URL=http://localhost:8080 API_KEY=$API_KEY MCP_SSE_PORT=3002 ./scripts/start-sse.sh
 ```
 
-### File Size Impact:
-- **Small PDF (100KB)**: ~400KB tokens, usable
-- **Typical PDF (1MB)**: ~4MB tokens, may hit limits  
-- **Large PDF (5MB)**: ~20MB tokens, **will fail**
-
-### Why This Design Was Chosen:
-MCP protocol's JSON-based messaging requires binary data as base64. Alternative approaches (file paths, resource URLs) are emerging in the MCP community but not yet standardized for tool parameters.
-
-## OCR Workaround (Recommended): Faxbot Prompt/Tool (Node + Python)
-
-The new Node MCP servers in `node_mcp/` add a prompt-driven workflow that avoids sending base64 data through the conversation. Python MCP servers now include a matching tool for parity.
-
-- Node prompt: `faxbot_pdf`
-- Python tool: `faxbot_pdf(pdf_path, to, header_text?)`
-- Behavior: Extracts text from the local PDF and sends it as a TXT fax to drastically reduce tokens. If text is not embedded, optional OCR fallback can be enabled.
-
-Example prompt execution (conceptual GetPrompt request):
+Start Python MCP (stdio/HTTP/SSE):
 ```
-name: "faxbot_pdf"
-arguments: { "pdf_path": "/abs/path/report.pdf", "to": "+15551234567", "header_text": "Acme Clinic" }
+cd python_mcp
+python -m venv .venv && source .venv/bin/activate
+pip install -r requirements.txt
+export FAX_API_URL=http://localhost:8080
+export API_KEY=your_api_key
+python stdio_server.py               # stdio
+# or: uvicorn http_server:app --host 0.0.0.0 --port 3004
+# or: uvicorn server:app --host 0.0.0.0 --port 3003 (SSE+OAuth)
 ```
 
+## Tools
+
+- send_fax: Send PDF or TXT files to a fax number.
+- get_fax_status: Check fax job status.
+
 Start Node MCP servers with OCR support:
 ```
 # stdio (desktop assistants)
@@ -108,6 +83,7 @@ OAUTH_ISSUER=... OAUTH_AUDIENCE=... FAX_API_URL=http://localhost:8080 API_KEY=$A
 Notes:
 - Set `MAX_TEXT_SIZE` (default 100000 bytes) to control extracted text size. Exceeding text is truncated with a warning.
 - This path does not embed base64 in the AI conversation, improving reliability for large PDFs.
+- An optional prompt named `faxbot_pdf` is available in Node that returns a message instructing the model to call the tool; the prompt itself does not send the fax.
 
 Python MCP (stdio/HTTP/SSE) with OCR tool:
 ```
diff --git a/docs/TROUBLESHOOTING.md b/docs/TROUBLESHOOTING.md
@@ -42,31 +42,18 @@ If you're unsure which MCP transport to use:
 
 ### Common MCP Problems
 
-#### Base64 File Handling (MAJOR LIMITATION)
-- **"File too large" or token limit errors**: PDFs >1MB will likely fail due to base64 token consumption
-- **AI assistant can't find file**: You need BOTH faxbot MCP AND filesystem MCP servers running
-- **Workflow confusion**: You can't just say "fax this file" - AI must read file first, then encode as base64
-- **Real size limits**: 
-  - 100KB PDF = usable
-  - 500KB PDF = borderline  
-  - 1MB+ PDF = probably fails
-- **Recommended Workaround (OCR)**: Use the Faxbot OCR workflow to extract text locally and send as TXT fax.
-  - Node prompt: `faxbot_pdf` (in `node_mcp/`)
-  - Python tool: `faxbot_pdf(pdf_path, to, header_text?)` (in `python_mcp/`)
-  - Start (Node): `cd node_mcp && npm install && ./scripts/start-stdio.sh`
-  - Start (Python): `cd python_mcp && python -m venv .venv && source .venv/bin/activate && pip install -r requirements.txt && python stdio_server.py`
-  - Env: set `FAX_API_URL` and `API_KEY`; optional `MAX_TEXT_SIZE`, `FAXBOT_OCR_ENABLE=true`, `FAXBOT_OCR_DPI=200`
+#### MCP Usage Tips
+- Ensure the main Faxbot API is reachable (`FAX_API_URL`) and your `API_KEY` is set.
+- For local files, use tooling that can access your filesystem as needed.
 
 #### Connection & Authentication
 - **MCP server not found**: Ensure you’re starting from the correct path:
   - Legacy servers: `api/scripts/start-mcp*.sh`
   - New servers (recommended): `node_mcp/scripts/start-*.sh`
   - Python servers: `python_mcp/` (`stdio_server.py`, `http_server.py`, `server.py`)
 
-### OCR Issues (Python)
-- `pytesseract` errors: Install Tesseract OCR (macOS: `brew install tesseract`, Ubuntu: `sudo apt-get install tesseract-ocr`). Ensure `tesseract` is on PATH or set `TESSERACT_CMD`.
-- `pypdfium2` missing: Run `pip install -r python_mcp/requirements.txt` within a virtualenv.
-- OCR not triggered: Ensure `FAXBOT_OCR_ENABLE=true`. The code only falls back to OCR when extracted text is empty or very short.
+### Environment
+- `FAX_API_URL`, `API_KEY`: Required. `FAX_API_URL` locates the Faxbot API; `API_KEY` authenticates requests to it.
 - **Authentication failures**: 
   - stdio: Check `API_KEY` environment variable matches Faxbot API setting
   - HTTP: Verify `X-API-Key` header is being passed correctly
diff --git a/node_mcp/README.md b/node_mcp/README.md
@@ -1,11 +1,10 @@
-# Faxbot Node MCP (OCR Workflow)
+# Faxbot Node MCP
 
-Node-based MCP servers for Faxbot with smart PDF-to-text extraction to avoid base64 token limits.
+Node-based MCP servers for Faxbot.
 
 Features:
 - Stdio, Streamable HTTP, and SSE+OAuth transports
-- send_fax and get_fax_status tools (backward compatible)
-- faxbot_pdf prompt: extracts PDF text locally, sends as TXT fax
+- Tools: send_fax, get_fax_status
 
 ## Install
 
@@ -35,17 +34,14 @@ Environment variables:
 ./scripts/start-sse.sh
 ```
 
-## Prompts
-
-- `faxbot_pdf(pdf_path, to, header_text?)`
-  - Extracts PDF text locally and sends as a text fax.
-  - Returns a confirmation message and job ID.
-
 ## Tools
 
-- `send_fax(to, fileContent(base64), fileName, fileType?)`
+- `send_fax(to, filePath?, fileContent?, fileName?, fileType?)`
 - `get_fax_status(jobId)`
 
+## Notes
+- For local files, tools accept a `filePath` parameter (preferred). Base64 is still supported for compatibility.
+
 ## Notes
 
 - Existing `/api` MCP servers remain as fallback. These servers are the new default target for OCR workflows.
diff --git a/node_mcp/src/servers/stdio.js b/node_mcp/src/servers/stdio.js
@@ -9,7 +9,7 @@ import {
   ListToolsRequestSchema,
   McpError,
 } from '@modelcontextprotocol/sdk/types.js';
-import { faxTools, handleSendFaxTool, handleGetFaxStatusTool } from '../tools/fax-tools.js';
+import { faxTools, handleSendFaxTool, handleGetFaxStatusTool, handleFaxbotPdfTool } from '../tools/fax-tools.js';
 import { listPrompts } from '../prompts/index.js';
 import { extractTextFromPDF } from '../shared/pdf-extractor.js';
 import { sendFax } from '../shared/fax-client.js';
@@ -62,6 +62,8 @@ function buildServer() {
         return await handleSendFaxTool(args);
       case 'get_fax_status':
         return await handleGetFaxStatusTool(args);
+      case 'faxbot_pdf':
+        return await handleFaxbotPdfTool(args);
       default:
         throw new McpError(ErrorCode.MethodNotFound, `Unknown tool: ${name}`);
     }
@@ -72,15 +74,14 @@ function buildServer() {
   server.setRequestHandler(GetPromptRequestSchema, async (request) => {
     const { name, arguments: args } = request.params;
     if (name === 'faxbot_pdf') {
-      const { message, jobId } = await executeFaxbotPdf(args || {});
+      const { pdf_path, to, header_text } = (args || {});
+      const instruction = `Plan: Extract text from PDF at ${pdf_path} and send as TXT fax to ${to}.\n` +
+        `Action: Call the tool 'faxbot_pdf' with the same arguments to execute. Optional header_text: ${header_text || '(none)'}\n` +
+        `Note: This prompt only describes the plan and does not send the fax itself.`;
       return {
         messages: [
-          {
-            role: 'user',
-            content: { type: 'text', text: message },
-          },
+          { role: 'user', content: { type: 'text', text: instruction } },
         ],
-        _meta: { jobId },
       };
     }
     throw new McpError(ErrorCode.MethodNotFound, `Unknown prompt: ${name}`);
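Review note: the hunk above changes the `faxbot_pdf` prompt from executing the fax to only describing it. A minimal sketch of the new behavior follows, pulled out into a standalone function so it can be exercised without an MCP server. The function name `buildFaxbotPdfPrompt` is hypothetical and does not appear in the commit; the message text is copied from the added lines.

```javascript
// Hypothetical helper (not part of the commit): builds the GetPrompt result
// for 'faxbot_pdf' without reading the PDF or calling the fax API.
function buildFaxbotPdfPrompt(args) {
  const { pdf_path, to, header_text } = args || {};
  const instruction =
    `Plan: Extract text from PDF at ${pdf_path} and send as TXT fax to ${to}.\n` +
    `Action: Call the tool 'faxbot_pdf' with the same arguments to execute. ` +
    `Optional header_text: ${header_text || '(none)'}\n` +
    `Note: This prompt only describes the plan and does not send the fax itself.`;
  // A single user message; no _meta.jobId, because no job is created here.
  return {
    messages: [{ role: 'user', content: { type: 'text', text: instruction } }],
  };
}

// Example arguments are illustrative only.
const result = buildFaxbotPdfPrompt({ pdf_path: '/abs/path/report.pdf', to: '+15551234567' });
console.log(result.messages[0].content.text);
```

The sketch also shows that a missing `pdf_path` or `to` is interpolated into the message as the string `undefined`, so the handler may benefit from validating its arguments before building the instruction.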
diff --git a/node_mcp/src/shared/pdf-extractor.js b/node_mcp/src/shared/pdf-extractor.js
@@ -1,5 +1,7 @@
 import fs from 'fs';
 import path from 'path';
+import os from 'os';
+import { execFile } from 'child_process';
 
 function normalizeWhitespace(text) {
   if (!text || typeof text !== 'string') return '';
@@ -53,7 +55,15 @@ export async function extractTextFromPDF(filePath) {
       throw new Error('Provided path is not a file');
     }
     const buffer = await fs.promises.readFile(filePath);
-    return await extractTextFromBuffer(buffer);
+    const text = await extractTextFromBuffer(buffer);
+    if (text && text.length >= 32) return text;
+    // Optional OCR fallback using pdftoppm + tesseract if available/enabled
+    const ocrEnabled = (process.env.FAXBOT_OCR_ENABLE || 'true').toLowerCase() !== 'false';
+    if (ocrEnabled) {
+      const ocr = await ocrPdfWithTesseract(filePath);
+      if (ocr) return ocr;
+    }
+    return text;
   } catch (err) {
     if (err && (err.code === 'ENOENT' || err.code === 'ENOTDIR')) {
       throw new Error(`File not found: ${filePath}`);
@@ -63,3 +73,63 @@ export async function extractTextFromPDF(filePath) {
 }
 
 export default { extractTextFromPDF, extractTextFromBuffer };
+
+// Helpers
+function which(cmd) {
+  const sep = process.platform === 'win32' ? ';' : ':';
+  const exts = process.platform === 'win32' ? (process.env.PATHEXT || '.EXE;.CMD;.BAT').toLowerCase().split(';') : [''];
+  const paths = (process.env.PATH || '').split(sep);
+  for (const p of paths) {
+    const full = path.join(p, cmd);
+    for (const ext of exts) {
+      const candidate = full + ext;
+      try {
+        fs.accessSync(candidate, fs.constants.X_OK);
+        return candidate;
+      } catch {}
+    }
+  }
+  return null;
+}
+
+function execFileAsync(cmd, args, options = {}) {
+  return new Promise((resolve, reject) => {
+    execFile(cmd, args, options, (error, stdout, stderr) => {
+      if (error) return reject(Object.assign(error, { stdout, stderr }));
+      resolve({ stdout, stderr });
+    });
+  });
+}
+
+async function ocrPdfWithTesseract(pdfPath) {
+  const pdftoppm = which('pdftoppm');
+  const tesseract = which('tesseract');
+  if (!pdftoppm || !tesseract) return '';
+  const dpi = parseInt(process.env.FAXBOT_OCR_DPI || '200', 10) || 200; // fall back to 200 if the value is not a positive number
+  const tmp = await fs.promises.mkdtemp(path.join(os.tmpdir(), 'faxbot-ocr-'));
+  try {
+    const prefix = path.join(tmp, 'page');
+    await execFileAsync(pdftoppm, ['-r', String(dpi), '-png', pdfPath, prefix], { maxBuffer: 1024 * 1024 * 64 });
+    // Collect generated images
+    const files = await fs.promises.readdir(tmp);
+    const pngs = files.filter((f) => f.startsWith('page-') && f.endsWith('.png')).sort((a, b) => a.localeCompare(b));
+    let out = '';
+    for (const f of pngs) {
+      const img = path.join(tmp, f);
+      try {
+        const { stdout } = await execFileAsync(tesseract, [img, 'stdout']);
+        out += `\n\n${stdout || ''}`;
+      } catch {}
+    }
+    return normalizeWhitespace(out);
+  } catch {
+    return '';
+  } finally {
+    // cleanup
+    try {
+      const files = await fs.promises.readdir(tmp);
+      await Promise.all(files.map((f) => fs.promises.unlink(path.join(tmp, f)).catch(() => {})));
+      await fs.promises.rmdir(tmp).catch(() => {});
+    } catch {}
+  }
+}
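Review note: the fallback condition added to `extractTextFromPDF` can be summarized as a pure decision function. The sketch below is an illustration only; `shouldAttemptOcr`, `isOcrEnabled`, and `MIN_EMBEDDED_TEXT_LENGTH` are hypothetical names not present in the commit, while the threshold of 32 characters and the `FAXBOT_OCR_ENABLE` parsing are taken from the added lines.

```javascript
// Hypothetical names; the commit inlines this logic in extractTextFromPDF.
const MIN_EMBEDDED_TEXT_LENGTH = 32;

// OCR is on unless FAXBOT_OCR_ENABLE is the literal string 'false' (any case).
function isOcrEnabled(env) {
  return String(env.FAXBOT_OCR_ENABLE || 'true').toLowerCase() !== 'false';
}

// OCR is attempted only when embedded text is missing or shorter than the
// threshold, and the environment has not disabled it.
function shouldAttemptOcr(text, env) {
  if (text && text.length >= MIN_EMBEDDED_TEXT_LENGTH) return false;
  return isOcrEnabled(env);
}

console.log(shouldAttemptOcr('', {}));                               // true
console.log(shouldAttemptOcr('', { FAXBOT_OCR_ENABLE: 'False' }));   // false
console.log(shouldAttemptOcr('x'.repeat(32), {}));                   // false
```

Two consequences are worth noting. OCR defaults to enabled in the Node code, and values such as `0` or `no` do not disable it, because only the string `false` is recognized. Even when this function returns true, `ocrPdfWithTesseract` still returns an empty string if `pdftoppm` or `tesseract` is not on `PATH`, in which case the short embedded text is returned unchanged.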
diff --git a/node_mcp/src/tools/fax-tools.js b/node_mcp/src/tools/fax-tools.js
diff --git a/python_mcp/server.py b/python_mcp/server.py
diff --git a/python_mcp/stdio_server.py b/python_mcp/stdio_server.py