Commit c3bfea6

docs: update README and MCP integration guide to recommend new Node MCP servers with OCR support; enhance troubleshooting documentation for OCR and transport options
1 parent c62ced2 commit c3bfea6

10 files changed: 315 additions, 23 deletions

README.md

Lines changed: 5 additions & 2 deletions
@@ -23,8 +23,11 @@ Simple fax-sending API with AI integration. Choose your backend:
2323
## AI Assistant Integration
2424
[→ MCP Integration Guide](docs/MCP_INTEGRATION.md)
2525

26-
- Quick start: use the scripts in `api/scripts` (`start-mcp.sh`, `start-mcp-http.sh`) or `make mcp-stdio` / `make mcp-http`.
27-
- New: OAuth2‑protected SSE MCP servers for Node and Python — see the SSE sections in the MCP guide.
26+
- Recommended (OCR, avoids base64): use the new Node MCP servers in `node_mcp/` and the `faxbot_pdf` prompt to extract PDF text locally and send it as a TXT fax.
27+
  - `cd node_mcp && npm install && ./scripts/start-stdio.sh`
28+
  - Env: `FAX_API_URL`, `API_KEY`, optional `MAX_TEXT_SIZE`
29+
- Legacy servers remain under `api/` (`start-mcp.sh`, `start-mcp-http.sh`) and Python `python_mcp/`.
30+
- OAuth2‑protected SSE MCP servers are available in both Node and Python — see the SSE sections in the MCP guide.
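
The environment variables listed above are shared by the Node and Python servers. As a rough sketch of how a server might read them at startup (the `McpSettings` class and `load_settings` helper are hypothetical names, not part of Faxbot; the defaults are the ones stated in this commit's docs):

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class McpSettings:
    """Hypothetical container for the documented environment variables."""
    fax_api_url: str
    api_key: str
    max_text_size: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> McpSettings:
    """Read FAX_API_URL, API_KEY and MAX_TEXT_SIZE, rejecting a bad size early."""
    env = os.environ if env is None else env
    raw_size = env.get("MAX_TEXT_SIZE", "100000")  # documented default
    try:
        max_text_size = int(raw_size)
    except ValueError:
        raise ValueError(f"MAX_TEXT_SIZE must be an integer, got {raw_size!r}") from None
    if max_text_size <= 0:
        raise ValueError("MAX_TEXT_SIZE must be positive")
    return McpSettings(
        fax_api_url=env.get("FAX_API_URL", "http://localhost:8080").rstrip("/"),
        api_key=env.get("API_KEY", ""),
        max_text_size=max_text_size,
    )
```

Validating `MAX_TEXT_SIZE` once at startup surfaces a typo immediately instead of on the first fax request.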
2831

2932
## Client SDKs
3033
- Python: `pip install faxbot`

docs/MCP_INTEGRATION.md

Lines changed: 66 additions & 5 deletions
@@ -15,6 +15,14 @@ Faxbot provides **2 MCP servers × 3 transports = 6 integration options**:
1515
| **HTTP** | mcp_http_server.js | 3001 | API key | Web apps, cloud AI |
1616
| **SSE+OAuth** | mcp_sse_server.js | 3002 | JWT/Bearer | Enterprise, HIPAA |
1717

18+
New (recommended) Node MCP servers with OCR prompt live under `node_mcp/`:
19+
20+
| Transport | File (node_mcp) | Port | Auth | Notes |
21+
|-----------|------------------|------|------|-------|
22+
| **stdio** | src/servers/stdio.js | N/A | API key | Includes `faxbot_pdf` prompt (OCR) |
23+
| **HTTP** | src/servers/http.js | 3001 | API key | Streamable HTTP + prompts |
24+
| **SSE+OAuth** | src/servers/sse.js | 3002 | JWT/Bearer | Prompts + OAuth2 |
25+
1826
**Quick Selection Guide:**
1927
- **stdio**: Desktop AI assistants (Claude Desktop, Cursor) - simplest setup
2028
- **HTTP**: Web applications, cloud-based AI services - scalable
@@ -49,7 +57,7 @@ Assistant → MCP Server → Faxbot API → Backend (Phaxio or SIP/Asterisk)
4957
- Pass huge base64 string as tool parameter
5058
- Base64 encoding increases file size by ~33%
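
The ~33% overhead follows from base64 mapping every 3 input bytes to 4 output characters. A small sketch of that arithmetic (the helper name `base64_size` is illustrative, not part of Faxbot):

```python
import base64


def base64_size(raw_bytes: int) -> int:
    """Length in characters of padded base64 output for `raw_bytes` input bytes."""
    return 4 * ((raw_bytes + 2) // 3)


if __name__ == "__main__":
    for label, size in [("100 KB", 100_000), ("500 KB", 500_000), ("1 MB", 1_000_000)]:
        encoded = base64_size(size)
        # Cross-check the formula against the real encoder.
        assert encoded == len(base64.b64encode(bytes(size)))
        print(f"{label}: {size} bytes -> {encoded} base64 chars ({encoded / size - 1:+.1%})")
```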
5159

52-
### Realistic User Workflow:
60+
### Realistic User Workflow (Legacy Base64 Path):
5361
```
5462
❌ NOT POSSIBLE: "Hey Claude, fax document.pdf to +1234567890"
5563
@@ -69,7 +77,60 @@ Assistant → MCP Server → Faxbot API → Backend (Phaxio or SIP/Asterisk)
6977
### Why This Design Was Chosen:
7078
MCP's JSON-based messaging requires binary data to be base64-encoded. Alternative approaches (file paths, resource URLs) are emerging in the MCP community but are not yet standardized for tool parameters.
7179

72-
## Setup
80+
## OCR Workaround (Recommended): Faxbot Prompt/Tool (Node + Python)
81+
82+
The new Node MCP servers in `node_mcp/` add a prompt-driven workflow that avoids sending base64 data through the conversation. Python MCP servers now include a matching tool for parity.
83+
84+
- Node prompt: `faxbot_pdf`
85+
- Python tool: `faxbot_pdf(pdf_path, to, header_text?)`
86+
- Behavior: Extracts text from the local PDF and sends it as a TXT fax, which keeps base64 data out of the conversation and sharply reduces token usage. If the PDF has no embedded text, an optional OCR fallback can be enabled.
87+
88+
Example prompt execution (conceptual GetPrompt request):
89+
```
90+
name: "faxbot_pdf"
91+
arguments: { "pdf_path": "/abs/path/report.pdf", "to": "+15551234567", "header_text": "Acme Clinic" }
92+
```
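
On the wire, the conceptual request above corresponds to an MCP `prompts/get` JSON-RPC message. MCP client libraries normally build this for you; the sketch below only illustrates the shape, and the helper name `build_get_prompt_request` is hypothetical:

```python
import json
from itertools import count

_request_ids = count(1)  # simple monotonically increasing JSON-RPC ids


def build_get_prompt_request(name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 message for the MCP `prompts/get` method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "prompts/get",
        "params": {"name": name, "arguments": arguments},
    }


if __name__ == "__main__":
    request = build_get_prompt_request(
        "faxbot_pdf",
        {
            "pdf_path": "/abs/path/report.pdf",
            "to": "+15551234567",
            "header_text": "Acme Clinic",
        },
    )
    print(json.dumps(request, indent=2))
```

Only the file path travels through the conversation; the PDF bytes stay on the machine running the MCP server.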
93+
94+
Start Node MCP servers with OCR support:
95+
```
96+
# stdio (desktop assistants)
97+
cd node_mcp && npm install
98+
FAX_API_URL=http://localhost:8080 API_KEY=$API_KEY ./scripts/start-stdio.sh
99+
100+
# HTTP (port 3001)
101+
FAX_API_URL=http://localhost:8080 API_KEY=$API_KEY MCP_HTTP_PORT=3001 ./scripts/start-http.sh
102+
103+
# SSE + OAuth (port 3002)
104+
OAUTH_ISSUER=... OAUTH_AUDIENCE=... FAX_API_URL=http://localhost:8080 API_KEY=$API_KEY \
105+
MCP_SSE_PORT=3002 ./scripts/start-sse.sh
106+
```
107+
108+
Notes:
109+
- Set `MAX_TEXT_SIZE` (default 100000 bytes) to cap the size of the extracted text. Text beyond the limit is truncated with a warning.
110+
- This path does not embed base64 in the AI conversation, improving reliability for large PDFs.
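
A minimal sketch of byte-limited truncation with a warning, as described in the notes above. The function name `truncate_utf8` and the logger name are assumptions for illustration; the Python helper added in this commit truncates the same way but does not log:

```python
import logging
from typing import Tuple

logger = logging.getLogger("faxbot.truncate")  # hypothetical logger name


def truncate_utf8(text: str, max_bytes: int = 100_000) -> Tuple[str, bool]:
    """Cap `text` at `max_bytes` of UTF-8 and report whether it was cut."""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text, False
    # errors="ignore" drops a multi-byte character that was cut in half
    # at the boundary instead of raising UnicodeDecodeError.
    truncated = encoded[:max_bytes].decode("utf-8", errors="ignore")
    logger.warning(
        "Extracted text truncated from %d to %d bytes",
        len(encoded),
        len(truncated.encode("utf-8")),
    )
    return truncated, True
```

Returning the flag alongside the text lets the caller mention the truncation in the tool response rather than relying on the log alone.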
111+
112+
Python MCP (stdio/HTTP/SSE) with OCR tool:
113+
```
114+
cd python_mcp
115+
python -m venv .venv && source .venv/bin/activate
116+
pip install -r requirements.txt
117+
export FAX_API_URL=http://localhost:8080
118+
export API_KEY=your_api_key
119+
export FAXBOT_OCR_ENABLE=true # optional; enables OCR fallback when embedded text is missing
120+
export FAXBOT_OCR_DPI=200 # optional; rasterization DPI for OCR
121+
python stdio_server.py # stdio
122+
# or: uvicorn http_server:app --host 0.0.0.0 --port 3004
123+
# or: uvicorn server:app --host 0.0.0.0 --port 3003 (SSE+OAuth)
124+
```
125+
126+
OCR dependencies (Python optional):
127+
- Requires `pdfminer.six` (installed via requirements)
128+
- OCR fallback requires: `pypdfium2`, `Pillow`, `pytesseract`, and the Tesseract binary on your system
129+
  - macOS: `brew install tesseract`
130+
  - Ubuntu/Debian: `sudo apt-get install tesseract-ocr`
131+
  - Set `TESSERACT_CMD` if tesseract is not on PATH
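
Before enabling the OCR fallback, it can help to confirm that the Tesseract binary is reachable. The sketch below is an assumption about how such a check could look, not Faxbot's actual lookup; `find_tesseract` is a hypothetical helper:

```python
import os
import shutil
from typing import Callable, Mapping, Optional


def find_tesseract(
    env: Optional[Mapping[str, str]] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the Tesseract executable to use, or None if it cannot be found.

    An explicit TESSERACT_CMD takes precedence over a PATH lookup.
    """
    env = os.environ if env is None else env
    override = env.get("TESSERACT_CMD")
    if override:
        # shutil.which also accepts a full path and checks it is executable.
        return which(override)
    return which("tesseract")


if __name__ == "__main__":
    path = find_tesseract()
    print(path or "tesseract not found: install it or set TESSERACT_CMD")
```

`pytesseract` reads its binary location from `pytesseract.pytesseract.tesseract_cmd`, so a resolved path could be assigned there; whether Faxbot wires `TESSERACT_CMD` this way is not shown in this commit.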
132+
133+
## Setup (Legacy /api servers)
73134
1) API running at `FAX_API_URL` (default `http://localhost:8080`).
74135
2) Install Node deps in `api/`:
75136
```
@@ -81,7 +142,7 @@ export FAX_API_URL=http://localhost:8080
81142
export API_KEY=your_secure_api_key
82143
```
83144

84-
## Quick Start (Scripts)
145+
## Quick Start (Scripts) — Legacy /api servers
85146
- macOS/Linux (stdio):
86147
```
87148
FAX_API_URL=http://localhost:8080 API_KEY=$API_KEY \
@@ -111,7 +172,7 @@ FAX_API_URL=http://localhost:8080 API_KEY=$API_KEY faxbot-mcp
111172
FAX_API_URL=http://localhost:8080 API_KEY=$API_KEY MCP_HTTP_PORT=3001 faxbot-mcp-http
112173
```
113174

114-
## Stdio Transport (Claude Desktop, Cursor)
175+
## Stdio Transport (Claude Desktop, Cursor) — Legacy /api server
115176
- Node (stdio) start:
116177
```
117178
cd api && npm run start:mcp
@@ -132,7 +193,7 @@ export API_KEY=your_api_key
132193
python stdio_server.py
133194
```
134195

135-
## HTTP Transport (Cloud/Local)
196+
## HTTP Transport (Cloud/Local) — Legacy /api server
136197
- Node (HTTP) start:
137198
```
138199
cd api && npm run start:http

docs/TROUBLESHOOTING.md

Lines changed: 18 additions & 5 deletions
@@ -36,9 +36,9 @@ If you're unsure which MCP transport to use:
3636

3737
| Transport | File | Port | Auth | Use Case |
3838
|-----------|------|------|------|----------|
39-
| **stdio** | mcp_server.js | N/A | API key | Claude Desktop, Cursor |
40-
| **HTTP** | mcp_http_server.js | 3001 | API key | Web apps, cloud AI |
41-
| **SSE+OAuth** | mcp_sse_server.js | 3002 | JWT/Bearer | Enterprise, HIPAA |
39+
| **stdio** | api/mcp_server.js or node_mcp/src/servers/stdio.js | N/A | API key | Desktop AI |
40+
| **HTTP** | api/mcp_http_server.js or node_mcp/src/servers/http.js | 3001 | API key | Web apps, cloud AI |
41+
| **SSE+OAuth** | api/mcp_sse_server.js or node_mcp/src/servers/sse.js | 3002 | JWT/Bearer | Enterprise, HIPAA |
4242

4343
### Common MCP Problems
4444

@@ -50,10 +50,23 @@ If you're unsure which MCP transport to use:
5050
- 100KB PDF = usable
5151
- 500KB PDF = borderline
5252
- 1MB+ PDF = probably fails
53-
- **Workaround**: Use smaller PDFs or convert large documents to text first
53+
- **Recommended Workaround (OCR)**: Use the Faxbot OCR workflow to extract text locally and send it as a TXT fax.
54+
- Node prompt: `faxbot_pdf` (in `node_mcp/`)
55+
- Python tool: `faxbot_pdf(pdf_path, to, header_text?)` (in `python_mcp/`)
56+
- Start (Node): `cd node_mcp && npm install && ./scripts/start-stdio.sh`
57+
- Start (Python): `cd python_mcp && python -m venv .venv && source .venv/bin/activate && pip install -r requirements.txt && python stdio_server.py`
58+
- Env: set `FAX_API_URL` and `API_KEY`; optional `MAX_TEXT_SIZE`, `FAXBOT_OCR_ENABLE=true`, `FAXBOT_OCR_DPI=200`
5459

5560
#### Connection & Authentication
56-
- **MCP server not found**: Ensure you're in the `api/` directory when starting MCP servers
61+
- **MCP server not found**: Ensure you’re starting from the correct path:
62+
  - Legacy servers: `api/scripts/start-mcp*.sh`
63+
  - New servers (recommended): `node_mcp/scripts/start-*.sh`
64+
  - Python servers: `python_mcp/` (`stdio_server.py`, `http_server.py`, `server.py`)
65+
66+
### OCR Issues (Python)
67+
- `pytesseract` errors: Install Tesseract OCR (macOS: `brew install tesseract`, Ubuntu: `sudo apt-get install tesseract-ocr`). Ensure `tesseract` is on PATH or set `TESSERACT_CMD`.
68+
- `pypdfium2` missing: Run `pip install -r python_mcp/requirements.txt` within a virtualenv.
69+
- OCR not triggered: Ensure `FAXBOT_OCR_ENABLE=true`. The code only falls back to OCR when extracted text is empty or very short.
5770
- **Authentication failures**:
5871
  - stdio: Check `API_KEY` environment variable matches Faxbot API setting
5972
  - HTTP: Verify `X-API-Key` header is being passed correctly
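
For the "OCR not triggered" case under OCR Issues, the fallback rule can be pictured as the sketch below. The 32-character threshold and the accepted truthy spellings are assumptions chosen for illustration; the real cutoff lives in `python_mcp/text_extract.py`, which this commit does not show:

```python
from typing import Mapping


def should_run_ocr(
    extracted_text: str,
    env: Mapping[str, str],
    min_chars: int = 32,  # hypothetical threshold for "very short"
) -> bool:
    """Fall back to OCR only when enabled and embedded text is empty or very short."""
    enabled = env.get("FAXBOT_OCR_ENABLE", "").strip().lower() in {"1", "true", "yes"}
    return enabled and len(extracted_text.strip()) < min_chars
```

Under this rule, a scanned PDF with a few stray embedded characters still goes through OCR, while a PDF with a real text layer never does, even when OCR is enabled.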

node_mcp/src/servers/http.js

Lines changed: 12 additions & 4 deletions
@@ -80,8 +80,16 @@ app.delete('/mcp', async (req, res) => {
8080
  await session.transport.handleRequest(req, res);
8181
});
8282

83-
const port = parseInt(process.env.MCP_HTTP_PORT || '3001', 10);
84-
app.listen(port, () => {
85-
  console.log(`Faxbot MCP HTTP (streamable) on http://localhost:${port}`);
86-
});
83+
export async function start() {
84+
  const port = parseInt(process.env.MCP_HTTP_PORT || '3001', 10);
85+
  app.listen(port, () => {
86+
    console.log(`Faxbot MCP HTTP (streamable) on http://localhost:${port}`);
87+
  });
88+
}
8789

90+
if (import.meta.url === `file://${process.argv[1]}`) {
91+
  start().catch((err) => {
92+
    console.error('Failed to start HTTP server:', err);
93+
    process.exit(1);
94+
  });
95+
}

node_mcp/src/servers/sse.js

Lines changed: 12 additions & 4 deletions
@@ -99,8 +99,16 @@ app.delete('/messages', authenticate, async (req, res) => {
9999
  }
100100
});
101101

102-
const port = parseInt(process.env.MCP_SSE_PORT || '3002', 10);
103-
app.listen(port, () => {
104-
  console.log(`Faxbot MCP SSE (OAuth2) on http://localhost:${port}`);
105-
});
102+
export async function start() {
103+
  const port = parseInt(process.env.MCP_SSE_PORT || '3002', 10);
104+
  app.listen(port, () => {
105+
    console.log(`Faxbot MCP SSE (OAuth2) on http://localhost:${port}`);
106+
  });
107+
}
106108

109+
if (import.meta.url === `file://${process.argv[1]}`) {
110+
  start().catch((err) => {
111+
    console.error('Failed to start SSE server:', err);
112+
    process.exit(1);
113+
  });
114+
}

node_mcp/src/shared/pdf-extractor.js

Lines changed: 2 additions & 1 deletion
@@ -22,7 +22,8 @@ export async function extractTextFromBuffer(buffer) {
2222
    throw new Error('Invalid or empty buffer provided');
2323
  }
2424
  try {
25-
    const { default: pdf } = await import('pdf-parse');
25+
    const mod = await import('pdf-parse/lib/pdf-parse.js');
26+
    const pdf = mod.default || mod;
2627
    const data = await pdf(buffer);
2728
    const cleaned = normalizeWhitespace(data.text || '');
2829
    if (!cleaned) {

python_mcp/requirements.txt

Lines changed: 4 additions & 0 deletions
@@ -3,3 +3,7 @@ starlette>=0.36.3
33
uvicorn>=0.30.6
44
python-jose>=3.3.0
55
httpx>=0.27.2
6+
pdfminer.six>=20231228
7+
pypdfium2>=4.30.0
8+
Pillow>=10.3.0
9+
pytesseract>=0.3.10

python_mcp/server.py

Lines changed: 39 additions & 1 deletion
@@ -21,6 +21,7 @@
2121
uvicorn server:app --host 0.0.0.0 --port 3003
2222
"""
2323
import base64
24+
import pathlib
2425
import os
2526
import time
2627
from typing import Dict, Any, Optional
@@ -206,6 +207,44 @@ async def get_fax_status(jobId: str) -> str: # noqa: N803
206207
    return "\n".join(lines)
207208

208209

210+
def _normalize_and_truncate(text: str) -> str:
211+
    max_bytes = int(os.getenv("MAX_TEXT_SIZE", "100000"))
212+
    b = text.encode("utf-8")
213+
    if len(b) > max_bytes:
214+
        return b[:max_bytes].decode("utf-8", errors="ignore")
215+
    return text
216+
217+
218+
@mcp.tool()
219+
async def faxbot_pdf(pdf_path: str, to: str, header_text: str = "") -> str:
220+
    """Extract text from a PDF (with optional OCR fallback) and send as TXT fax.
221+
222+
    Mirrors the Node `faxbot_pdf` prompt functionality for Python MCP.
223+
    """
224+
    # Absolute import: this file is loaded as a top-level module
    # (`uvicorn server:app`), where a relative import would fail.
    from text_extract import extract_text_from_pdf
225+
226+
    if not pdf_path:
227+
        raise ValueError("pdf_path is required")
228+
    abs_path = str(pathlib.Path(pdf_path).expanduser().resolve())
229+
    if not os.path.exists(abs_path):
230+
        raise ValueError(f"File not found: {abs_path}")
231+
    if not abs_path.lower().endswith(".pdf"):
232+
        raise ValueError("Only PDF input is supported")
233+
234+
    text, used_ocr = extract_text_from_pdf(abs_path)
235+
    if header_text and header_text.strip():
236+
        text = f"{header_text.strip()}\n\n{text}"
237+
    text = _normalize_and_truncate(text)
238+
239+
    file_b64 = base64.b64encode(text.encode("utf-8")).decode("ascii")
240+
    job = await api_send_fax(to, "extracted.txt", file_b64, "txt")
241+
    method = "OCR" if used_ocr else "text extraction"
242+
    return (
243+
        f"Faxbot workflow initiated via {method}.\n\nPDF: {abs_path}\nJob ID: {job['id']}\nRecipient: {to}\n"
244+
        f"Status: {job['status']}\n(Truncation may apply; adjust MAX_TEXT_SIZE if needed.)"
245+
    )
246+
247+
209248
# Build underlying SSE ASGI app from FastMCP
210249
inner_app = mcp.sse_app()
211250

@@ -236,4 +275,3 @@ async def health(_request: Request):
236275
],
237276
)
238277
app.add_middleware(AuthMiddleware)
239-

python_mcp/stdio_server.py

Lines changed: 46 additions & 1 deletion
@@ -14,6 +14,8 @@
1414
"""
1515
import asyncio
1616
import os
17+
import base64
18+
import pathlib
1719
from typing import Optional, Dict, Any
1820

1921
import httpx
@@ -98,6 +100,50 @@ async def get_fax_status(jobId: str) -> str: # noqa: N803
98100
    return "\n".join(parts)
99101

100102

103+
def _normalize_and_truncate(text: str) -> str:
104+
    max_bytes = int(os.getenv("MAX_TEXT_SIZE", "100000"))
105+
    b = text.encode("utf-8")
106+
    if len(b) > max_bytes:
107+
        return b[:max_bytes].decode("utf-8", errors="ignore")
108+
    return text
109+
110+
111+
@mcp.tool()
112+
async def faxbot_pdf(pdf_path: str, to: str, header_text: str = "") -> str:
113+
    """Extract text from a PDF (with optional OCR fallback) and send as TXT fax.
114+
115+
    Args:
116+
        pdf_path: Absolute or relative path to a local PDF file
117+
        to: Destination fax number
118+
        header_text: Optional header text prepended to the content
119+
    Returns:
120+
        Confirmation string including job ID.
121+
    """
122+
    # Absolute import: this file is run as a script
    # (`python stdio_server.py`), where a relative import would fail.
    from text_extract import extract_text_from_pdf
123+
124+
    if not pdf_path:
125+
        raise ValueError("pdf_path is required")
126+
    abs_path = str(pathlib.Path(pdf_path).expanduser().resolve())
127+
    if not os.path.exists(abs_path):
128+
        raise ValueError(f"File not found: {abs_path}")
129+
    if not abs_path.lower().endswith(".pdf"):
130+
        raise ValueError("Only PDF input is supported")
131+
132+
    text, used_ocr = extract_text_from_pdf(abs_path)
133+
    if header_text and header_text.strip():
134+
        text = f"{header_text.strip()}\n\n{text}"
135+
    text = _normalize_and_truncate(text)
136+
137+
    # Encode as base64 TXT and send via existing API function
138+
    file_b64 = base64.b64encode(text.encode("utf-8")).decode("ascii")
139+
    job = await _api_send(to, "extracted.txt", file_b64, "txt")
140+
    method = "OCR" if used_ocr else "text extraction"
141+
    return (
142+
        f"Faxbot workflow initiated via {method}.\n\nPDF: {abs_path}\nJob ID: {job['id']}\nRecipient: {to}\n"
143+
        f"Status: {job['status']}\n(Truncation may apply; adjust MAX_TEXT_SIZE if needed.)"
144+
    )
145+
146+
101147
def main() -> None:
102148
    # FastMCP provides stdio runner; prefer run() or run_stdio() depending on version
103149
    runner = None
@@ -116,4 +162,3 @@ def main() -> None:
116162

117163
if __name__ == "__main__":
118164
    main()
119-
