RAG Python Demo

A production-ready Retrieval-Augmented Generation (RAG) server implementing the Model Context Protocol (MCP).

Benchmark Score: 855/1000 (A+) - See BENCHMARKS.md

Features

Hybrid Search: Combines semantic embeddings with BM25 keyword search
Cross-Encoder Reranking: Uses mmarco-mMiniLMv2-L12-H384-v1 for relevance scoring
MMR Diversity: Maximal Marginal Relevance to avoid redundant results
Query Expansion: Automatic query reformulation for better recall
Contextual Retrieval: Chunk enrichment with document context
Semantic Cache: LRU cache with embedding similarity for faster responses
PDF Support: Extract and index PDF documents with header metadata
Time-Weighted Scoring: Boost recent documents when relevant
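
The README does not say how the semantic and BM25 rankings are merged. One common option is Reciprocal Rank Fusion (RRF), sketched below as an illustration only. The function name, the constant `k=60`, and the document IDs are assumptions, not taken from `hybrid_search.py`.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used constant (assumption).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical rankings from the two retrievers.
semantic_hits = ["doc_a", "doc_b", "doc_c"]
bm25_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([semantic_hits, bm25_hits])
print(fused)
```

Documents that rank well in both lists (`doc_a`, `doc_b`) end up ahead of documents that appear in only one.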

Tech Stack

| Component | Technology |
|-----------|------------|
| Embeddings | `paraphrase-multilingual-MiniLM-L12-v2` |
| Reranker | `cross-encoder/mmarco-mMiniLMv2-L12-H384-v1` |
| Vector DB | ChromaDB (persistent) |
| Protocol | MCP (Model Context Protocol) |
| Runtime | Docker |

Architecture

rag-server/
├── server.py              # MCP server & tool definitions
├── search.py              # Search orchestration
├── hybrid_search.py       # BM25 + semantic fusion
├── reranker.py            # Cross-encoder + MMR
├── chunker.py             # Document chunking strategies
├── query_expander.py      # Query expansion module
├── contextual_retrieval.py # Chunk enrichment
├── semantic_cache.py      # Embedding-based cache
├── pdf_processor.py       # PDF extraction
├── indexing.py            # Document indexing
├── maintenance.py         # Cleanup & optimization
└── config.py              # Configuration
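
The tree above lists `reranker.py` as handling cross-encoder scoring plus MMR. The following is a minimal sketch of the MMR selection step in its textbook form. It assumes relevance scores and embeddings are already available as plain dictionaries. The names `mmr` and `cosine` and the sample data are hypothetical, and the repository's actual implementation may differ.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mmr(relevance, vectors, top_k, lam=0.5):
    """Greedy Maximal Marginal Relevance selection.

    relevance: dict of doc ID -> relevance score (e.g. from a reranker)
    vectors:   dict of doc ID -> embedding
    lam:       1.0 means pure relevance, 0.0 means pure diversity
    """
    selected = []
    candidates = set(relevance)
    while candidates and len(selected) < top_k:

        def mmr_score(doc_id):
            # Redundancy is the highest similarity to anything already picked.
            redundancy = max(
                (cosine(vectors[doc_id], vectors[s]) for s in selected),
                default=0.0,
            )
            return lam * relevance[doc_id] - (1 - lam) * redundancy

        best = max(sorted(candidates), key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Hypothetical data: "b" is a near-duplicate of "a", "c" is unrelated.
relevance = {"a": 0.9, "b": 0.85, "c": 0.6}
vectors = {"a": [1.0, 0.0], "b": [1.0, 0.05], "c": [0.0, 1.0]}

diverse = mmr(relevance, vectors, top_k=2, lam=0.5)
greedy = mmr(relevance, vectors, top_k=2, lam=1.0)
print(diverse, greedy)
```

With `lam=0.5` the near-duplicate `b` is skipped in favor of `c`. With `lam=1.0` the selection falls back to plain relevance order.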

Quick Start

# Clone the repository
git clone https://github.com/kurt83340/RagPythonDemo.git
cd RagPythonDemo

# Create knowledge folder and add your documents
mkdir -p knowledge
cp your-docs/*.md knowledge/

# Start the server
docker compose up

See USAGE.md for complete setup guide, document optimization tips, and examples.

MCP Tools

| Tool | Description |
|------|-------------|
| `search(query)` | Hybrid search with reranking |
| `search_by_date(from, to)` | Temporal filtering |
| `get_related(file)` | Find similar documents |
| `reindex()` | Rebuild search index |
| `get_stats()` | System statistics |
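
MCP clients invoke tools through JSON-RPC 2.0 `tools/call` requests. As a rough illustration, the sketch below builds such a request for the `search` tool. The argument name `query` is inferred from the signature listed above; the authoritative input schema is whatever `server.py` declares. The helper `build_tool_call` is hypothetical.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 payload for an MCP tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical query; the argument key is assumed to match the tool signature.
request = build_tool_call(1, "search", {"query": "how does reranking work"})
print(json.dumps(request, indent=2))
```

In practice an MCP client library or an MCP-aware application sends this message for you; the payload is shown only to make the tool interface concrete.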

Configuration

Key parameters in config.py:

TOP_K: Number of results (default: 10)
MMR_LAMBDA: Diversity vs relevance trade-off (0-1)
CACHE_TTL: Cache expiration time
CHUNK_SIZE: Document chunk size
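
To make the parameters above concrete, here is a hypothetical sketch of how they could be grouped and validated. Only the `TOP_K` default of 10 comes from this README. Every other default, the units, and the `RagConfig` class itself are placeholders, so check `config.py` for the real values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RagConfig:
    top_k: int = 10          # default stated in the README
    mmr_lambda: float = 0.5  # placeholder; README only gives the 0-1 range
    cache_ttl: int = 3600    # placeholder; unit assumed to be seconds
    chunk_size: int = 512    # placeholder; unit not stated in the README

    def __post_init__(self):
        # Reject values that cannot work before the server starts.
        if self.top_k < 1:
            raise ValueError("top_k must be at least 1")
        if not 0.0 <= self.mmr_lambda <= 1.0:
            raise ValueError("mmr_lambda must be between 0 and 1")
        if self.cache_ttl < 0:
            raise ValueError("cache_ttl must not be negative")
        if self.chunk_size < 1:
            raise ValueError("chunk_size must be at least 1")


config = RagConfig(top_k=5, mmr_lambda=0.7)
print(config)
```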

License

MIT
