48 changes: 21 additions & 27 deletions README.md
@@ -3,14 +3,16 @@
**Web search and content extraction for AI models via Model Context Protocol (MCP)**

[![Version](https://img.shields.io/badge/version-2.2.0-blue.svg)](https://github.com/Kode-Rex/webcat)
[![Docker](https://img.shields.io/badge/docker-ready-brightgreen.svg)](https://hub.docker.com/r/tmfrisinger/webcat)
[![License](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE)

## Quick Start

```bash
# Run WebCat with Docker (30 seconds to working demo)
docker run -p 8000:8000 tmfrisinger/webcat:latest
cd docker
python -m pip install -e ".[dev]"

# Start demo server with UI
python simple_demo.py

# Open demo client
open http://localhost:8000/demo
@@ -30,40 +32,22 @@ Built with **FastAPI** and **FastMCP** for seamless AI integration.

## Features

- ✅ **No Authentication Required** - Simple setup
- ✅ **Optional Authentication** - Bearer token auth when needed, or run without
- ✅ **Automatic Fallback** - Serper API → DuckDuckGo if needed
- ✅ **Smart Content Extraction** - Trafilatura removes navigation/ads/chrome
- ✅ **MCP Compliant** - Works with Claude Desktop, LiteLLM, etc.
- ✅ **Rate Limited** - Configurable protection
- ✅ **Docker Ready** - One command deployment
- ✅ **Parallel Processing** - Fast concurrent scraping
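
The "Automatic Fallback" behavior listed above could look roughly like the following. This is a minimal sketch and not code from this PR: `search_with_fallback`, `failing_primary`, and `stub_fallback` are hypothetical names, the providers are stand-in callables rather than real Serper or DuckDuckGo clients, and falling back on an empty result list (not only on errors) is an assumption.

```python
from typing import Callable, List

# A provider is any callable that maps a query string to a list of result dicts.
SearchFn = Callable[[str], List[dict]]


def search_with_fallback(query: str, primary: SearchFn, fallback: SearchFn) -> List[dict]:
    """Try the primary provider; use the fallback if it fails or returns nothing."""
    try:
        results = primary(query)
        if results:
            return results
    except Exception:
        # Any provider error (missing key, quota, network) triggers the fallback.
        pass
    return fallback(query)


# Stand-in providers for illustration only (hypothetical, no network calls).
def failing_primary(query: str) -> List[dict]:
    raise RuntimeError("no API key configured")


def stub_fallback(query: str) -> List[dict]:
    return [{"title": f"Result for {query}", "source": "fallback"}]
```

Passing the providers in as callables keeps the fallback decision separate from any particular search client, which also makes it easy to exercise without network access.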

## Installation & Usage

### Docker (Recommended)

```bash
# With Serper API (best results)
docker run -p 8000:8000 -e SERPER_API_KEY=your_key tmfrisinger/webcat:2.2.0

# Free tier (DuckDuckGo only)
docker run -p 8000:8000 tmfrisinger/webcat:2.2.0

# Custom configuration
docker run -p 9000:9000 \
-e PORT=9000 \
-e SERPER_API_KEY=your_key \
-e RATE_LIMIT_WINDOW=60 \
-e RATE_LIMIT_MAX_REQUESTS=10 \
tmfrisinger/webcat:2.2.0
```

### Local Development

```bash
cd docker
python -m pip install -e ".[dev]"

# Configure environment (optional)
echo "SERPER_API_KEY=your_key" > .env

# Start MCP server
python mcp_server.py

@@ -88,6 +72,7 @@ python simple_demo.py
| Variable | Default | Description |
|----------|---------|-------------|
| `SERPER_API_KEY` | *(none)* | Serper API key for premium search (optional) |
| `WEBCAT_API_KEY` | *(none)* | Bearer token for authentication (optional, if set all requests must include `Authorization: Bearer <token>`) |
| `PORT` | `8000` | Server port |
| `LOG_LEVEL` | `INFO` | Logging level (DEBUG, INFO, WARNING, ERROR) |
| `LOG_DIR` | `/tmp` | Log file directory |
@@ -99,7 +84,17 @@ python simple_demo.py
1. Visit [serper.dev](https://serper.dev)
2. Sign up for free tier (2,500 searches/month)
3. Copy your API key
4. Pass to Docker: `-e SERPER_API_KEY=your_key`
4. Add to `.env` file: `SERPER_API_KEY=your_key`

### Enable Authentication (Optional)

To require bearer token authentication for all MCP tool calls:

1. Generate a secure random token: `openssl rand -hex 32`
2. Add to `.env` file: `WEBCAT_API_KEY=your_token`
3. Include in all requests: `Authorization: Bearer your_token`

**Note:** If `WEBCAT_API_KEY` is not set, no authentication is required.
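
The steps above can be sketched in Python. This is an illustration under stated assumptions, not WebCat's actual implementation: `generate_token`, `auth_headers`, and `is_authorized` are hypothetical helper names, and treating an empty `WEBCAT_API_KEY` the same as an unset one is an assumption.

```python
import hmac
import secrets
from typing import Optional


def generate_token() -> str:
    """Return 64 hex characters, comparable to `openssl rand -hex 32`."""
    return secrets.token_hex(32)


def auth_headers(token: str) -> dict:
    """Build the header a client would send with each request."""
    return {"Authorization": f"Bearer {token}"}


def is_authorized(header_value: Optional[str], expected_token: Optional[str]) -> bool:
    """Check an Authorization header against the configured token (hypothetical helper)."""
    # Assumption: an unset or empty WEBCAT_API_KEY means authentication is disabled.
    if not expected_token:
        return True
    if not header_value or not header_value.startswith("Bearer "):
        return False
    supplied = header_value[len("Bearer "):]
    # Constant-time comparison avoids leaking token contents through timing.
    return hmac.compare_digest(supplied.encode(), expected_token.encode())
```

A client would merge `auth_headers(token)` into the headers of every request; a server-side check along the lines of `is_authorized` would reject requests whose header is missing or does not match.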

## MCP Tools

@@ -216,7 +211,6 @@ MIT License - see [LICENSE](LICENSE) file for details.
## Links

- **GitHub:** [github.com/Kode-Rex/webcat](https://github.com/Kode-Rex/webcat)
- **Docker Hub:** [hub.docker.com/r/tmfrisinger/webcat](https://hub.docker.com/r/tmfrisinger/webcat)
- **MCP Spec:** [modelcontextprotocol.io](https://modelcontextprotocol.io)
- **Serper API:** [serper.dev](https://serper.dev)

17 changes: 17 additions & 0 deletions docker/.env.example
@@ -0,0 +1,17 @@
# Serper API key for premium search (optional)
# If not set, DuckDuckGo fallback will be used
SERPER_API_KEY=

# WebCat API key for bearer token authentication (optional)
# If set, all requests must include: Authorization: Bearer <token>
# If not set, no authentication is required
WEBCAT_API_KEY=

# Server configuration
PORT=8000
LOG_LEVEL=INFO
LOG_DIR=/tmp

# Rate limiting
RATE_LIMIT_WINDOW=60
RATE_LIMIT_MAX_REQUESTS=10
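
One way the variables in this file might be read at startup is sketched below. It is not taken from the PR: `Settings` and `load_settings` are hypothetical names, empty values are treated as unset, and the rate-limit defaults simply mirror the values in `.env.example` rather than any documented default.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables listed in .env.example."""

    serper_api_key: Optional[str]
    webcat_api_key: Optional[str]
    port: int
    log_level: str
    log_dir: str
    rate_limit_window: int
    rate_limit_max_requests: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from a mapping, treating empty values as unset (assumption)."""
    return Settings(
        serper_api_key=env.get("SERPER_API_KEY") or None,
        webcat_api_key=env.get("WEBCAT_API_KEY") or None,
        port=int(env.get("PORT") or 8000),
        log_level=env.get("LOG_LEVEL") or "INFO",
        log_dir=env.get("LOG_DIR") or "/tmp",
        # Defaults below copy .env.example; they are assumed, not documented.
        rate_limit_window=int(env.get("RATE_LIMIT_WINDOW") or 60),
        rate_limit_max_requests=int(env.get("RATE_LIMIT_MAX_REQUESTS") or 10),
    )
```

Accepting any mapping instead of reading `os.environ` directly makes the loader easy to check with a plain dictionary, for example `load_settings({"PORT": "9000"})`.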
4 changes: 1 addition & 3 deletions docker/constants.py
@@ -6,7 +6,7 @@
 """Constants for WebCat application."""

 # Application version
-VERSION = "2.3.0"
+VERSION = "2.3.1"

 # Service information
 SERVICE_NAME = "WebCat MCP Server"
@@ -19,8 +19,6 @@
     "Content extraction and scraping",
     "Markdown conversion",
     "FastMCP protocol support",
-    "SSE streaming",
-    "Demo UI client",
 ]

 # Content limits
6 changes: 0 additions & 6 deletions docker/endpoints/health_endpoints.py
@@ -10,7 +10,6 @@
 from fastapi import FastAPI
 from fastapi.responses import JSONResponse

-from endpoints.demo_client import serve_demo_client
 from models.health_responses import (
     get_detailed_status,
     get_health_status,
@@ -34,11 +33,6 @@ async def health_check():
             logger.error(f"Health check failed: {str(e)}")
             return JSONResponse(status_code=500, content=get_unhealthy_status(str(e)))

-    @app.get("/demo")
-    async def sse_client():
-        """Serve the WebCat SSE demo client."""
-        return serve_demo_client()
-
     @app.get("/status")
     async def server_status():
         """Detailed server status endpoint."""
9 changes: 3 additions & 6 deletions docker/models/responses/health_responses.py
@@ -44,11 +44,9 @@ def get_server_configuration() -> dict:
 def get_server_endpoints() -> dict:
     """Get server endpoints dictionary."""
     return {
-        "main_mcp": "/mcp",
-        "sse_demo": "/sse",
+        "mcp": "/mcp",
         "health": "/health",
         "status": "/status",
-        "demo_client": "/demo",
     }


@@ -81,10 +79,9 @@ def get_root_info() -> dict:
         "version": VERSION,
         "description": "Web search and content extraction with MCP protocol support",
         "endpoints": {
-            "demo_client": "/demo",
+            "mcp": "/mcp",
             "health": "/health",
             "status": "/status",
-            "mcp_sse": "/mcp",
         },
-        "documentation": "Access /demo for the demo interface",
+        "documentation": "MCP server - connect via SSE at /mcp/sse endpoint",
     }
113 changes: 11 additions & 102 deletions docker/simple_demo.py
@@ -4,28 +4,19 @@
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

"""Simplified demo server that combines health and SSE endpoints in one FastAPI app."""
"""Simplified demo server with FastMCP integration."""

import asyncio
import logging
import os
import tempfile
import time

import uvicorn
from dotenv import load_dotenv
from fastapi import FastAPI, Query
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import StreamingResponse
from fastmcp import FastMCP

from api_tools import create_webcat_functions, setup_webcat_tools
from demo_utils import (
format_sse_message,
get_server_info,
handle_health_operation,
handle_search_operation,
)
from health import setup_health_endpoints

# Load environment variables
@@ -44,73 +35,14 @@
logger.setLevel(getattr(logging, LOG_LEVEL))


async def _generate_webcat_stream(
webcat_functions, operation: str, query: str, max_results: int
):
"""Generate SSE stream for WebCat operations.

Args:
webcat_functions: Dictionary of WebCat functions
operation: Operation to perform
query: Search query
max_results: Maximum results

Yields:
SSE formatted messages
"""
try:
# Send connection message
yield format_sse_message(
"connection",
status="connected",
message="WebCat stream started",
operation=operation,
)

if operation == "search" and query:
search_func = webcat_functions.get("search")
if search_func:
async for msg in handle_search_operation(
search_func, query, max_results
):
yield msg
else:
yield format_sse_message(
"error", message="Search function not available"
)

elif operation == "health":
health_func = webcat_functions.get("health_check")
async for msg in handle_health_operation(health_func):
yield msg

else:
# Just connection - send server info
yield format_sse_message("data", data=get_server_info())
yield format_sse_message("complete", message="Connection established")

# Keep alive with heartbeat
heartbeat_count = 0
while True:
await asyncio.sleep(30)
heartbeat_count += 1
yield format_sse_message(
"heartbeat", timestamp=time.time(), count=heartbeat_count
)

except Exception as e:
logger.error(f"Error in SSE stream: {str(e)}")
yield format_sse_message("error", message=str(e))


def create_demo_app():
"""Create a single FastAPI app with all endpoints."""

# Create FastAPI app with CORS middleware
app = FastAPI(
title="WebCat MCP Demo Server",
description="WebCat server with FastMCP integration and SSE streaming demo",
version="2.2.0",
title="WebCat MCP Server",
description="WebCat server with FastMCP integration",
version="2.3.1",
)

app.add_middleware(
@@ -131,31 +63,10 @@ def create_demo_app():
webcat_functions = create_webcat_functions()
setup_webcat_tools(mcp_server, webcat_functions)

# Add custom SSE endpoint for demo
@app.get("/sse")
async def webcat_stream(
operation: str = Query(
"connect", description="Operation to perform: connect, search, health"
),
query: str = Query("", description="Search query for search operations"),
max_results: int = Query(5, description="Maximum number of search results"),
):
"""Stream WebCat functionality via SSE"""
return StreamingResponse(
_generate_webcat_stream(webcat_functions, operation, query, max_results),
media_type="text/event-stream",
headers={
"Cache-Control": "no-cache",
"Connection": "keep-alive",
"Access-Control-Allow-Origin": "*",
"Access-Control-Allow-Headers": "*",
},
)

# Mount FastMCP server as a sub-application (like Clima project)
# Mount FastMCP server
app.mount("/mcp", mcp_server.sse_app())

logger.info("FastAPI app configured with SSE and FastMCP integration")
logger.info("FastAPI app configured with FastMCP integration")
return app


@@ -166,20 +77,18 @@ def run_simple_demo(host: str = "0.0.0.0", port: int = 8000):
app = create_demo_app()

# Log endpoints
logger.info(f"WebCat Demo Server: http://{host}:{port}")
logger.info(f"SSE Demo Endpoint: http://{host}:{port}/sse")
logger.info(f"WebCat MCP Server: http://{host}:{port}")
logger.info(f"FastMCP Endpoint: http://{host}:{port}/mcp")
logger.info(f"Health Check: http://{host}:{port}/health")
logger.info(f"Demo Client: http://{host}:{port}/demo")
logger.info(f"Server Status: http://{host}:{port}/status")

print("\n🐱 WebCat MCP Demo Server Starting...")
print("\n🐱 WebCat MCP Server Starting...")
print(f"📡 Server: http://{host}:{port}")
print(f"🔗 SSE Demo: http://{host}:{port}/sse")
print(f"🛠️ FastMCP: http://{host}:{port}/mcp")
print(f"🛠️ MCP Endpoint: http://{host}:{port}/mcp")
print(f"💗 Health: http://{host}:{port}/health")
print(f"🎨 Demo UI: http://{host}:{port}/demo")
print(f"📊 Server Status: http://{host}:{port}/status")
print(f"📊 Status: http://{host}:{port}/status")
print("\n✨ Ready for connections!")

# Run the server