Web search and content extraction for AI models via Model Context Protocol (MCP)
cd docker
python -m pip install -e ".[dev]"
# Start demo server with UI
python simple_demo.py
# Open demo client
open http://localhost:8000/demo

WebCat is an MCP (Model Context Protocol) server that provides AI models with:
- 🔍 Web Search - Serper API (premium) or DuckDuckGo (free)
- 📄 Content Extraction - Clean markdown conversion with Trafilatura
- 🌐 SSE Streaming - Real-time results via Server-Sent Events
- 🎨 Demo UI - Interactive testing interface
Built with FastAPI and FastMCP for seamless AI integration.
- ✅ Optional Authentication - Bearer token auth when needed, or run without
- ✅ Automatic Fallback - Serper API → DuckDuckGo if needed
- ✅ Smart Content Extraction - Trafilatura removes navigation/ads/chrome
- ✅ MCP Compliant - Works with Claude Desktop, LiteLLM, etc.
- ✅ Rate Limited - Configurable protection
- ✅ Parallel Processing - Fast concurrent scraping
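
To illustrate the "Parallel Processing" item, here is a minimal sketch of concurrent scraping with `asyncio`. It is not WebCat's actual implementation: `scrape_all`, the `fetch` callable, and `max_concurrency` are hypothetical names chosen for this example.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Iterable


async def scrape_all(
    urls: Iterable[str],
    fetch: Callable[[str], Awaitable[str]],  # hypothetical async fetcher
    max_concurrency: int = 5,                # assumed limit, not a WebCat setting
) -> Dict[str, str]:
    """Fetch every URL concurrently, keeping at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def scrape_one(url: str):
        async with semaphore:
            try:
                return url, await fetch(url)
            except Exception as exc:
                # One failing page should not abort the whole batch.
                return url, f"error: {exc}"

    pairs = await asyncio.gather(*(scrape_one(url) for url in urls))
    return dict(pairs)
```

Any coroutine function that takes a URL and returns text can be passed as `fetch`; failures are recorded per URL instead of raising.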
cd docker
python -m pip install -e ".[dev]"
# Configure environment (optional)
echo "SERPER_API_KEY=your_key" > .env
# Start MCP server
python mcp_server.py
# Or start demo server with UI
python simple_demo.py

| Endpoint | Description |
|---|---|
| http://localhost:8000/demo | 🎨 Interactive demo UI |
| http://localhost:8000/health | 💗 Health check |
| http://localhost:8000/status | 📊 Server status |
| http://localhost:8000/mcp | 🛠️ MCP protocol endpoint |
| http://localhost:8000/sse | 🔗 SSE streaming |
| Variable | Default | Description |
|---|---|---|
| SERPER_API_KEY | (none) | Serper API key for premium search (optional) |
| WEBCAT_API_KEY | (none) | Bearer token for authentication (optional; if set, all requests must include `Authorization: Bearer <token>`) |
| PORT | 8000 | Server port |
| LOG_LEVEL | INFO | Logging level (DEBUG, INFO, WARNING, ERROR) |
| LOG_DIR | /tmp | Log file directory |
| RATE_LIMIT_WINDOW | 60 | Rate limit window in seconds |
| RATE_LIMIT_MAX_REQUESTS | 10 | Max requests per window |
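
As a rough illustration of how these variables and defaults could be read, here is a minimal sketch. The `WebCatSettings` class and `load_settings` function are hypothetical names for this example and may not match how WebCat loads its configuration; only the variable names and defaults come from the table above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class WebCatSettings:  # hypothetical container, not part of WebCat
    serper_api_key: Optional[str]
    webcat_api_key: Optional[str]
    port: int
    log_level: str
    log_dir: str
    rate_limit_window: int
    rate_limit_max_requests: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> WebCatSettings:
    """Read settings from the environment, applying the documented defaults."""
    env = os.environ if env is None else env
    return WebCatSettings(
        # Empty strings are treated as "not set".
        serper_api_key=env.get("SERPER_API_KEY") or None,
        webcat_api_key=env.get("WEBCAT_API_KEY") or None,
        port=int(env.get("PORT", "8000")),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
        log_dir=env.get("LOG_DIR", "/tmp"),
        rate_limit_window=int(env.get("RATE_LIMIT_WINDOW", "60")),
        rate_limit_max_requests=int(env.get("RATE_LIMIT_MAX_REQUESTS", "10")),
    )
```

Calling `load_settings({})` yields the defaults; passing a dict makes the function easy to exercise without touching the real environment.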
- Visit serper.dev
- Sign up for free tier (2,500 searches/month)
- Copy your API key
- Add to .env file: SERPER_API_KEY=your_key
To require bearer token authentication for all MCP tool calls:
- Generate a secure random token: openssl rand -hex 32
- Add to .env file: WEBCAT_API_KEY=your_token
- Include in all requests: Authorization: Bearer your_token
Note: If WEBCAT_API_KEY is not set, no authentication is required.
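
The optional bearer-token behaviour described above can be sketched as follows. This is an illustration only: `auth_headers` and `is_authorized` are hypothetical helpers, and WebCat's own check may be implemented differently.

```python
import hmac
from typing import Dict, Mapping, Optional


def auth_headers(token: Optional[str]) -> Dict[str, str]:
    """Client side: build the Authorization header, or nothing if no token is used."""
    return {"Authorization": f"Bearer {token}"} if token else {}


def is_authorized(headers: Mapping[str, str], expected_token: Optional[str]) -> bool:
    """Server side: accept everything when no token is configured."""
    if not expected_token:
        return True  # WEBCAT_API_KEY unset: no authentication required
    scheme, _, presented = headers.get("Authorization", "").partition(" ")
    if scheme.lower() != "bearer":
        return False
    # Constant-time comparison avoids leaking token contents through timing.
    return hmac.compare_digest(presented.strip().encode(), expected_token.encode())
```

A client would merge `auth_headers(token)` into its request headers; the header lookup here assumes the exact key `Authorization`, whereas real HTTP frameworks usually handle header names case-insensitively.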
WebCat exposes these tools via MCP:
| Tool | Description | Parameters |
|---|---|---|
| search | Search web and extract content | query: str, max_results: int |
| scrape_url | Scrape specific URL | url: str |
| health_check | Check server health | (none) |
| get_server_info | Get server capabilities | (none) |
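
MCP tool invocations are JSON-RPC 2.0 requests using the `tools/call` method. The sketch below only builds such a request body for the tools listed above; `build_tool_call` is a hypothetical helper, the argument values are illustrative, and a real client would normally use an MCP client library that also handles the initialization handshake and transport.

```python
import json
from itertools import count
from typing import Any, Dict, Optional

_request_ids = count(1)  # simple incrementing JSON-RPC request ids


def build_tool_call(name: str, arguments: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 'tools/call' request for the named MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


if __name__ == "__main__":
    # Parameter names come from the tool table; the values are made up.
    request = build_tool_call("search", {"query": "model context protocol", "max_results": 3})
    print(json.dumps(request, indent=2))
```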
MCP Client (Claude, LiteLLM)
↓
FastMCP Server (SSE Transport)
↓
Search Decision
├─ Serper API (premium) → Content Scraper
└─ DuckDuckGo (free) → Content Scraper
↓
Trafilatura (markdown)
↓
Structured Response
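
The search decision in the diagram above can be sketched as a simple fallback. This is a simplified illustration, not WebCat's code: `search_with_fallback` and the two search callables are hypothetical, and treating an empty result list as a reason to fall back is an assumption.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical signature: (query, max_results) -> list of result URLs
SearchFn = Callable[[str, int], List[str]]


def search_with_fallback(
    query: str,
    max_results: int,
    primary: Optional[SearchFn],  # e.g. a Serper client, or None if no API key is set
    fallback: SearchFn,           # e.g. a DuckDuckGo client
) -> Tuple[str, List[str]]:
    """Try the primary backend first; use the fallback on error or empty results."""
    if primary is not None:
        try:
            results = primary(query, max_results)
            if results:
                return "primary", results
        except Exception:
            pass  # fall through to the free backend
    return "fallback", fallback(query, max_results)
```

The returned label makes it easy to log which backend actually served a query.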
cd docker
# Run all unit tests
python -m pytest tests/unit -v
# With coverage
python -m pytest tests/unit --cov=. --cov-report=term --cov-report=html
# CI-safe (no external dependencies)
python -m pytest -v -m "not integration"

Current test coverage: 70%+ across all modules
# Install with dev dependencies
pip install -e ".[dev]"
# Format code
make format
# Lint code
make lint
# Run tests
make test
# Full CI check
make ci

docker/
├── mcp_server.py # Main MCP server
├── simple_demo.py # Demo server with UI
├── clients/ # Serper & DuckDuckGo clients
├── services/ # Content scraping & search
├── tools/ # MCP tool implementations
├── models/ # Pydantic data models
│ ├── domain/ # Domain entities
│ └── responses/ # API responses
├── endpoints/ # FastAPI endpoints
└── tests/ # Comprehensive test suite
| Feature | Serper API | DuckDuckGo |
|---|---|---|
| Cost | Paid (free tier available) | Free |
| Quality | ⭐⭐⭐⭐⭐ Excellent | ⭐⭐⭐⭐ Good |
| Coverage | Comprehensive (Google-powered) | Standard |
| Speed | Fast | Fast |
| Rate Limits | 2,500/month (free tier) | None |
- Text-focused: Optimized for article content, not multimedia
- Rate limits: Respects configured limits to prevent abuse
- No JavaScript: Cannot scrape dynamic JS-rendered content
- PDF support: Detection only, not full extraction
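
For the rate-limit behaviour mentioned above, here is a minimal sliding-window sketch using the default values of RATE_LIMIT_WINDOW (60) and RATE_LIMIT_MAX_REQUESTS (10). `SlidingWindowLimiter` is a hypothetical class; WebCat's actual algorithm, and whether it tracks limits per client, are not specified here.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:  # hypothetical, single shared bucket
    def __init__(
        self,
        max_requests: int = 10,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock  # injectable so the limiter can be tested without sleeping
        self._timestamps: Deque[float] = deque()

    def allow(self) -> bool:
        """Return True and record the request if it fits in the current window."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self._window:
            self._timestamps.popleft()
        if len(self._timestamps) >= self._max:
            return False
        self._timestamps.append(now)
        return True
```

A server would call `allow()` once per incoming request and reject the request when it returns `False`.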
Contributions welcome! Please:
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Ensure make ci passes
- Submit a Pull Request
See CLAUDE.md for development guidelines and architecture standards.
MIT License - see LICENSE file for details.
- GitHub: github.com/Kode-Rex/webcat
- MCP Spec: modelcontextprotocol.io
- Serper API: serper.dev
Version 2.2.0 | Built with ❤️ using FastMCP, FastAPI, and Trafilatura
