This is my thesis research project implementing a transparent security proxy for the Model Context Protocol (MCP) using ensemble anomaly detection with DistilBERT embeddings and prototypical learning. The system validates all requests between Language Models and MCP tools using binary classification (Benign vs Attack).
Core Architecture: Transparent Security Proxy with Binary Classification
Key Design Decisions:
- Binary Classification: BENIGN (allow) vs ATTACK (block) for simplicity and accuracy
- Fail-Safe Behavior: Ambiguous cases default to ATTACK to minimize false negatives
- Margin-Based Classification: Benign requires clear margin to allow request
- Ensemble Approach: Multiple detectors with weighted voting for robust detection
- MCP Protocol Compliance: Transparent proxy compatible with any standard MCP server
```bash
# Start all services
docker compose up -d --build

# View bridge logs (primary debugging)
docker compose logs -f mcp-bridge

# Restart bridge after code changes
docker compose restart mcp-bridge

# Stop all services
docker compose down
```

```bash
# Interactive client testing
docker exec -it mcp-client python agent.py

# Direct tool testing (JSON-RPC)
docker exec -it mcp-client python test_tools.py

# Load/traffic generation
docker exec -it mcp-client python traffic_generator.py --samples 100 --ratio 0.3 --concurrency 10

# Latency benchmarking
docker exec -it mcp-client python latency_test.py
```

```bash
# Check bridge status
curl http://localhost:8000/health

# Check local LLM service
curl http://localhost:8080/health
```

```bash
# Standalone model training
python research/tools/train_models.py

# Train with MAML meta-learning
python research/tools/train_models.py --maml --maml-epochs 100

# Access Jupyter notebooks
cd research/notebooks && jupyter notebook
```

Docker networks:
- mcp-public: External access
- mcp-secure: Internal only (`internal: true`). Tools have NO internet access.
mcp_bridge/ - FastAPI security proxy (port 8000)
- Main entry point: `mcp_bridge/src/main.py`
- Configuration: `mcp_bridge/src/config.py`
- Transparent MCP proxy intercepting JSON-RPC 2.0 requests
- Dynamic tool discovery via `tools/list` on configured MCP servers
- Security inspection using the `BinaryMCPDetector` ensemble
- Request routing via dynamic `tool_name → server_url` mapping
- MCP 2024-11-05 protocol support (`initialize`, `ping`, `tools/list`, `resources/list`, `prompts/list`)
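How discovery builds the routing table can be sketched roughly as follows; this is illustrative only and assumes the tool servers accept plain JSON-RPC POSTs at `/jsonrpc`, like the bridge itself does.

```python
import asyncio
import httpx

MCP_SERVERS = [
    "http://tool-filesystem:8080",
    "http://tool-sqlite:8080",
]

async def discover_tools(servers: list[str]) -> dict[str, str]:
    """Build the tool_name -> server_url routing table via JSON-RPC tools/list."""
    routing: dict[str, str] = {}
    async with httpx.AsyncClient(timeout=5.0) as client:
        for url in servers:
            # Assumption: each tool server exposes a /jsonrpc endpoint.
            resp = await client.post(
                f"{url}/jsonrpc",
                json={"jsonrpc": "2.0", "method": "tools/list", "id": 1},
            )
            for tool in resp.json().get("result", {}).get("tools", []):
                routing[tool["name"]] = url
    return routing

if __name__ == "__main__":
    print(asyncio.run(discover_tools(MCP_SERVERS)))
```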
mcp_tools/ - Example MCP servers (can be replaced with any MCP-compliant server)
- `filesystem/server.py`: File operations (read_file, write_file, list_directory)
- `sqlite/server.py`: Database queries (read_query, write_query, list_tables, describe_table)
- `time/server.py`: Timezone operations (get_current_time, convert_time)
- `fetch/server.py`: URL fetching
- `memory/server.py`: Knowledge graph operations
- All implement the MCP 2024-11-05 spec with `tools/list`, `tools/call`, and the `isError` field
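For reference, a sketch of the result shapes these servers return for `tools/list` and `tools/call`; the specific tool and schema shown are illustrative, not copied from the actual servers.

```python
def tools_list_result() -> dict:
    """Result payload for tools/list (one illustrative tool shown)."""
    return {
        "tools": [
            {
                "name": "read_file",
                "description": "Read a file from the sandboxed filesystem",
                "inputSchema": {
                    "type": "object",
                    "properties": {"path": {"type": "string"}},
                    "required": ["path"],
                },
            }
        ]
    }

def tools_call_result(text: str, is_error: bool = False) -> dict:
    """Result payload for tools/call; isError flags tool-level failures."""
    return {"content": [{"type": "text", "text": text}], "isError": is_error}
```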
llm_service/ - Local LLM inference (port 8080)
- Uses llama-cpp-python with GGUF models
- GPU-accelerated when NVIDIA drivers available
- OpenAI-compatible `/v1/chat/completions` endpoint
llm_cloud_service/ - Cloud LLM wrapper (port 8081)
- OpenAI/Gemini API adapter
- Unified `/v1/chat/completions` interface
client/ - Testing utilities
- `agent.py`: Interactive chat with tool calling
- `traffic_generator.py`: Load testing and attack simulation
- `test_tools.py`: Direct JSON-RPC validation
- `latency_test.py`: Performance benchmarking
research/ - Thesis data and analysis
- `data/`: Datasets (training_dataset.json, test_dataset.json, validation_dataset.json)
- `trained_models/`: Pre-trained detectors (semantic_model.pt, statistical_model.pt, maml_model.pt)
- `notebooks/`: Scientific evaluation (model_comparisons.ipynb, semantic_evaluation.ipynb, etc.)
- `tools/`: Research utilities (train_models.py, verify_datasets.py, etc.)
Two classes:
- BENIGN (Allow) - Safe requests matching known good patterns
- ATTACK (Block) - All threat types: injections, traversals, anomalies, policy violations
The BinaryMCPDetector (mcp_bridge/src/core/detector.py) coordinates these detectors:
- `RuleBasedDetector` (`mcp_bridge/src/core/detectors/rule_based.py`)
  - Regex patterns for known attacks (path traversal, SQL injection, command injection, SSRF, XSS)
  - High precision, returns immediately on pattern match
  - Returns `None` to defer to other detectors if no match
- `StatisticalFeatureDetector` (`mcp_bridge/src/core/detectors/statistical.py`)
  - Feature extraction: length, entropy, special char ratios, suspicious pattern density
  - Mahalanobis distance with pooled covariance
  - Supports LDA (homoscedastic) and QDA (heteroscedastic) modes
  - Data-driven margin thresholds
- `SemanticDetector` (`mcp_bridge/src/core/detectors/semantic.py`)
  - DistilBERT embeddings (768-dim) with security feature augmentation
  - Prototypical learning with cosine similarity
  - Data-driven margin-based classification
  - Out-of-distribution detection
- `MAMLDetector` (`mcp_bridge/src/core/detectors/maml.py`) - optional
  - Model-Agnostic Meta-Learning for few-shot adaptation
  - Enable with `MAML_ENABLED=true`
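The prototypical, margin-based decision the semantic detector uses can be sketched as below; the helper names and the 0.1 margin default are illustrative, and the real detector augments the DistilBERT embedding with security features and calibrates its thresholds from data.

```python
import numpy as np

def fit_prototypes(embeddings: np.ndarray, labels: np.ndarray) -> dict[str, np.ndarray]:
    """One prototype per class: the mean embedding (0 = benign, 1 = attack)."""
    return {
        "benign": embeddings[labels == 0].mean(axis=0),
        "attack": embeddings[labels == 1].mean(axis=0),
    }

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def classify(embedding: np.ndarray, protos: dict[str, np.ndarray], margin: float = 0.1) -> str:
    """Allow only when benign wins by a clear margin; otherwise fail safe to ATTACK."""
    s_benign = cosine(embedding, protos["benign"])
    s_attack = cosine(embedding, protos["attack"])
    return "BENIGN" if (s_benign - s_attack) > margin else "ATTACK"
```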
Default Weights (empirically tuneable):
- Rule-Based: 0.40 (high precision)
- Semantic: 0.35 (best generalization)
- Statistical: 0.15 (anomaly detection)
- MAML: 0.10 (if enabled)
Features:
- Platt scaling / temperature calibration for confidence scores
- Fail-safe tie-breaking: margin < 10% defaults to ATTACK
- Weight adjustment via `provide_feedback()` with labeled samples
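A minimal sketch of the weighted vote with the fail-safe tie-break; the weights and the 10% margin are the defaults listed above, while the score format (per-detector attack probability, `None` = abstain) is an assumption.

```python
# Default ensemble weights (empirically tuneable).
WEIGHTS = {"rule": 0.40, "semantic": 0.35, "statistical": 0.15, "maml": 0.10}

def ensemble_decision(scores: dict[str, float | None], margin: float = 0.10) -> str:
    """Weighted vote over the detectors that produced a score; ambiguity defaults to ATTACK."""
    votes = {name: p for name, p in scores.items() if p is not None}
    if not votes:
        return "ATTACK"  # zero-trust: nothing could score the request
    total_weight = sum(WEIGHTS[name] for name in votes)
    p_attack = sum(WEIGHTS[name] * p for name, p in votes.items()) / total_weight
    # Fail-safe tie-break: benign must win by a clear margin, otherwise block.
    return "BENIGN" if (1.0 - p_attack) - p_attack > margin else "ATTACK"

# Example: the rule-based detector abstained, the others lean benign.
print(ensemble_decision({"rule": None, "semantic": 0.2, "statistical": 0.35, "maml": None}))
```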
- Request arrives at the `/jsonrpc` endpoint (`mcp_bridge/src/main.py:243`)
- MCP protocol methods handled (initialize, ping, tools/list, etc.)
- For `tools/call`: payload extracted and flattened
- Detection runs via `BinaryMCPDetector.predict()` in a threadpool
- If blocked: JSON-RPC error with class/confidence
- If allowed: forwarded via `mcp_bridge/src/services/forwarder.py`
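A simplified sketch of that flow; `handle_protocol_method`, `flatten_arguments`, `detector`, and `forward_to_tool` stand in for the real bridge internals, and the error code is illustrative.

```python
from fastapi import FastAPI
from fastapi.concurrency import run_in_threadpool

app = FastAPI()

@app.post("/jsonrpc")
async def jsonrpc_endpoint(request: dict) -> dict:
    # Non-tool protocol methods (initialize, ping, tools/list, ...) are answered directly.
    if request.get("method") != "tools/call":
        return handle_protocol_method(request)

    # Tool name + arguments flattened into the payload the detectors expect.
    payload = flatten_arguments(request["params"])

    # ML inference is CPU-bound, so it runs off the event loop in a threadpool.
    result = await run_in_threadpool(detector.predict, payload)

    if result.label == "ATTACK":
        return {
            "jsonrpc": "2.0",
            "id": request.get("id"),
            "error": {
                "code": -32600,  # illustrative code
                "message": f"Blocked: {result.label} (confidence={result.confidence:.2f})",
            },
        }
    return await forward_to_tool(request)  # via services/forwarder.py
```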
mcp_bridge/src/core/detector.py:
- `BinaryMCPDetector`: Singleton facade for all detectors
- `fit()`: Train detectors on labeled samples
- `predict()`: Ensemble prediction
- `save_models()` / `load_models()`: Model persistence
- `provide_feedback()`: Supervised feedback for weight adjustment
mcp_bridge/src/models.py:
- `JsonRpcRequest`: JSON-RPC 2.0 request validation
- `JsonRpcResponse`: Response formatting
- `SecurityClass`: Enum for BENIGN/ATTACK
mcp_bridge/src/core/detectors/base.py:
- `DetectionResult`: Unified result format
- `BaseDetector`: Abstract base class
mcp_bridge/src/core/utils.py:
- `log_thesis_data()`: JSONL logging for analysis
- Rotating log files (10MB per file, 5 backups)
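The contract in `base.py` can be pictured roughly like this; field and method names follow the description above, but the exact signatures are assumptions.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional

@dataclass
class DetectionResult:
    """Unified result format shared by all detectors."""
    label: str          # "BENIGN" or "ATTACK"
    confidence: float   # calibrated score in [0, 1]
    detector: str       # e.g. "rule_based", "semantic"

class BaseDetector(ABC):
    """Interface every detector in the ensemble implements."""

    @abstractmethod
    def fit(self, samples: list[str], labels: list[int]) -> None:
        """Train on labeled payloads (0 = benign, 1 = attack)."""

    @abstractmethod
    def predict(self, payload: str) -> Optional[DetectionResult]:
        """Return None to abstain and defer to the other detectors."""
```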
```bash
LOG_LEVEL=INFO
DEBUG=false
MODEL_NAME=distilbert-base-uncased
DETECTOR_SIGMA=3.0

# MCP Server Discovery
MCP_SERVERS=http://tool-filesystem:8080,http://tool-sqlite:8080,http://tool-time:8080,http://tool-fetch:8080,http://tool-memory:8080

# Model Paths
RESEARCH_DATA_PATH=/app/research_data
TRAINING_DATA_FILE=training_dataset.json
SEMANTIC_MODEL_PATH=/app/trained_models/semantic_model.pt
STATISTICAL_MODEL_PATH=/app/trained_models/statistical_model.pt
MAML_MODEL_PATH=/app/trained_models/maml_model.pt
AUDIT_LOG_PATH=/app/research_data/runtime_audit.jsonl
```

```bash
MAML_ENABLED=false
MAML_META_LR=0.001
MAML_INNER_LR=0.01
MAML_ADAPTATION_STEPS=5
MAML_FIRST_ORDER=true
MAML_SHOTS=5
MAML_HIDDEN_DIM=256
MAML_CONFIDENCE_THRESHOLD=0.6
MAML_NUM_META_EPOCHS=100
```

```bash
LLM_MODEL_PATH=/app/models/llama-2-7b-chat.Q4_K_M.gguf
LLM_N_GPU_LAYERS=-1
LLM_N_CTX=4096
LLM_CHAT_FORMAT=llama-2

# Cloud LLM
CLOUD_LLM_PROVIDER=openai  # or 'gemini'
CLOUD_OPENAI_API_KEY=...
CLOUD_GOOGLE_API_KEY=...
```

- Use `async def` for all handlers
- CPU-bound operations (ML inference) run in a threadpool: `await run_in_threadpool(detector.predict, ...)`
- Tool communication uses `httpx.AsyncClient` with connection pooling
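A sketch of the pooled-client forwarding pattern; the pool limits and the tool-side `/jsonrpc` path are assumptions.

```python
import httpx

# One shared client per process gives connection pooling / keep-alive to the tool servers.
client = httpx.AsyncClient(
    timeout=httpx.Timeout(10.0),
    limits=httpx.Limits(max_connections=100, max_keepalive_connections=20),
)

async def forward(server_url: str, jsonrpc_request: dict) -> dict:
    """Forward an already-validated JSON-RPC request to the server that owns the tool."""
    resp = await client.post(f"{server_url}/jsonrpc", json=jsonrpc_request)
    return resp.json()
```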
- Add the server to `docker-compose.yml` on the `mcp-secure` network
- Add its URL to the `MCP_SERVERS` env var (comma-separated)
- Restart the bridge - it auto-discovers tools via `tools/list`
Tool-keyed dict with "benign" and "attack" keys:
```json
{
  "read_file": {
    "benign": ["{\"path\": \"docs/readme.txt\"}", ...],
    "attack": ["{\"path\": \"../../../etc/passwd\"}", ...]
  }
}
```

`BinaryMCPDetector` uses a singleton pattern to avoid reloading models. Use `BinaryMCPDetector.reset_instance()` to recover from initialization failures or for testing.
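A sketch of flattening that format into labeled samples for `fit()`; how the tool name and argument string are combined into a single payload is an assumption.

```python
import json

def load_training_samples(path: str = "research/data/training_dataset.json"):
    """Flatten the tool-keyed dataset into (sample, label) pairs: 0 = benign, 1 = attack."""
    with open(path) as f:
        dataset = json.load(f)

    samples, labels = [], []
    for tool_name, groups in dataset.items():
        for args in groups.get("benign", []):
            samples.append(f"{tool_name} {args}")
            labels.append(0)
        for args in groups.get("attack", []):
            samples.append(f"{tool_name} {args}")
            labels.append(1)
    return samples, labels

# samples, labels = load_training_samples()
# detector.fit(samples, labels)  # detector: the BinaryMCPDetector singleton
```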
- MCP Protocol Compliance: The bridge must not break MCP compatibility. Standard JSON-RPC 2.0 requests pass through unmodified (except when blocked).
- Network Isolation: Tools on the `mcp-secure` network must remain isolated, with no internet access.
- Binary Classification: The system uses a 2-class model (BENIGN vs ATTACK); all threat types are merged into ATTACK.
- Zero-Trust for New Tools: Tools without training data have all requests denied by default.
- Async Patterns: The MCP bridge becomes a bottleneck if async patterns are not used properly. Never run blocking operations in FastAPI handlers without a threadpool/async wrapper.
Bridge fails to start / "No training data found"
```bash
ls research/data/training_dataset.json
# If missing, runs in ZERO-SHOT mode (high false positives)
```

Tools not discovered

```bash
docker compose ps
docker compose logs mcp-bridge | grep -i "discovery\|registered"
curl -X POST http://localhost:8000/jsonrpc \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"tools/list","id":1}'
```

All requests blocked (high false positives)

```bash
# Increase sigma for lenient detection
DETECTOR_SIGMA=4.0 docker compose up -d mcp-bridge
```

Model retraining

```bash
rm research/trained_models/*.pt
docker compose restart mcp-bridge
```

```bash
# Real-time logs
docker compose logs -f mcp-bridge

# Test MCP initialize
curl -X POST http://localhost:8000/jsonrpc \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"initialize","params":{"clientInfo":{"name":"debug","version":"1.0"}},"id":1}'

# Test tool call (should be blocked if attack)
curl -X POST http://localhost:8000/jsonrpc \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"tools/call","params":{"name":"read_file","arguments":{"path":"../../../etc/passwd"}},"id":1}'

# View audit logs
tail -f research/data/runtime_audit.jsonl
```

Scientific evaluation notebooks in research/notebooks/:
- `model_comparisons.ipynb`: ROC curves, precision-recall, per-detector analysis
- `semantic_evaluation.ipynb`: Semantic detector deep-dive
- `statistical_evaluation.ipynb`: Statistical detector analysis
- `maml_evaluation.ipynb`: MAML few-shot learning evaluation
- `rule_based_evaluation.ipynb`: Rule-based detector analysis
- `dataset_validation.ipynb`: Dataset integrity checks
Q: How do I add a new MCP server?
A: Add URL to MCP_SERVERS env var, restart bridge. It auto-discovers tools.
Q: How do I train on new tools?
A: Add samples to research/data/training_dataset.json, delete cached models, restart bridge.
Q: Why binary classification?
A: Simpler, faster, more accurate. Merging all threat types into a single ATTACK class eliminates ambiguity.
Q: How do I lower false positives?
A: Increase DETECTOR_SIGMA (default: 3.0). Higher = more lenient.
Q: Can I use this with Claude Desktop?
A: Yes. Point MCP client to http://localhost:8000. Bridge is MCP 2024-11-05 compliant.