The Spring-AI-Topic-RAG project is a production-ready Retrieval-Augmented Generation (RAG) system built with Spring AI, Ollama, and Qdrant. It demonstrates an advanced implementation of AI-powered document processing and semantic search.
- Multi-Topic RAG System: Separate, isolated RAG instances for different domains (Pentesting, IoT, Blockchain, Cloud, etc.)
- Semantic Search: Intelligent document retrieval using vector embeddings
- Document Processing: Support for PDF and Markdown files with automatic metadata extraction
- Local LLM Processing: Runs entirely locally using Ollama (no external API dependencies)
- Vector Storage: Efficient semantic search using Qdrant
- Easy Extensibility: Simple configuration to add new topics
- Framework: Spring Boot 3.5.8 with Spring AI 1.1.2
- Language: Java 21
- LLM: Ollama (local language model)
- Vector Database: Qdrant
- Document Processing: Apache Tika & PDFBox
- Communication: gRPC for Qdrant client
- Build Tool: Maven
RAG is a technique that combines:
- Retrieval: Finding relevant documents from a knowledge base
- Augmentation: Using those documents to enhance the AI's response
- Generation: Creating an answer based on both the user's question and the retrieved documents
Think of it as giving an AI assistant a library of books before asking questions!
```
Your PDF/Markdown Files
          ↓
Tika & PDFBox (read files)
          ↓
Extract Text & Metadata
```
- The system reads your documents and extracts text content
- Metadata (author, creation date, etc.) is automatically captured
- Documents are split into manageable chunks (sketched below)
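A minimal ingestion sketch, assuming Spring AI's Tika reader and token splitter are on the classpath; the file path is illustrative:

```java
import java.util.List;

import org.springframework.ai.document.Document;
import org.springframework.ai.reader.tika.TikaDocumentReader;
import org.springframework.ai.transformer.splitter.TokenTextSplitter;
import org.springframework.core.io.FileSystemResource;

public class IngestionSketch {

    public static void main(String[] args) {
        // Tika extracts the text content plus metadata (author, creation date, ...)
        // from PDF or Markdown files. The path below is illustrative.
        var reader = new TikaDocumentReader(new FileSystemResource("docs/cloud/cloud-security.pdf"));
        List<Document> documents = reader.get();

        // Split the extracted text into token-sized chunks suitable for embedding.
        List<Document> chunks = new TokenTextSplitter().apply(documents);

        chunks.forEach(chunk -> System.out.println(chunk.getMetadata()));
    }
}
```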
```
Document Text
      ↓
Ollama (local LLM)
      ↓
Convert to Vectors (numerical representation)
```
- Each document chunk is converted into a vector (a list of numbers)
- These vectors capture the semantic meaning of the text
- Similar documents have similar vectors (see the sketch after this list)
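A sketch of the embedding step, assuming the Ollama starter has auto-configured an `EmbeddingModel` bean:

```java
import org.springframework.ai.embedding.EmbeddingModel;
import org.springframework.stereotype.Service;

@Service
public class EmbeddingSketch {

    private final EmbeddingModel embeddingModel; // backed by Ollama when the Ollama starter is used

    public EmbeddingSketch(EmbeddingModel embeddingModel) {
        this.embeddingModel = embeddingModel;
    }

    public float[] embed(String chunkText) {
        // One vector per chunk; semantically similar text yields nearby vectors.
        return embeddingModel.embed(chunkText);
    }
}
```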
```
Document Vectors
      ↓
Qdrant Vector Database
      ↓
Organized & Searchable Index
```
- Vectors are stored in Qdrant for fast retrieval
- Organized by topic to keep domains separate
- Enables quick semantic search (see the sketch after this list)
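A storage sketch, assuming a `VectorStore` bean backed by Qdrant; `add(...)` embeds each chunk with the configured embedding model and upserts it into the collection:

```java
import java.util.List;

import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.stereotype.Service;

@Service
public class IndexingSketch {

    private final VectorStore vectorStore; // e.g. a Qdrant-backed store

    public IndexingSketch(VectorStore vectorStore) {
        this.vectorStore = vectorStore;
    }

    public void index(List<Document> chunks) {
        // Embeds the chunks and writes vector + payload into Qdrant.
        vectorStore.add(chunks);
    }
}
```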
```
User Question
      ↓
Convert to Vector (same way as documents)
      ↓
Search Qdrant for Similar Vectors
      ↓
Retrieve Top Matching Documents
```
- When a user asks a question, it's converted to a vector
- The system finds documents with similar vectors
- Only relevant documents are retrieved (see the sketch after this list)
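A retrieval sketch; the `topK` and `similarityThreshold` values are illustrative tuning knobs, not project settings:

```java
import java.util.List;

import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.stereotype.Service;

@Service
public class RetrievalSketch {

    private final VectorStore vectorStore;

    public RetrievalSketch(VectorStore vectorStore) {
        this.vectorStore = vectorStore;
    }

    public List<Document> retrieve(String question) {
        // The store embeds the question with the same model used for the documents,
        // then returns the closest chunks by vector similarity.
        return vectorStore.similaritySearch(SearchRequest.builder()
                .query(question)
                .topK(5)
                .similarityThreshold(0.6)
                .build());
    }
}
```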
```
User Question + Retrieved Documents
      ↓
Ollama (local LLM)
      ↓
Generate Intelligent Answer
```
- The LLM reads the retrieved documents
- It generates an answer grounded in those specific documents
- The response is more accurate and verifiable (a generation sketch follows this list)
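A generation sketch using plain prompt stuffing via `ChatClient`; the system-prompt wording is illustrative (Spring AI's `QuestionAnswerAdvisor` can automate this pattern):

```java
import java.util.List;
import java.util.stream.Collectors;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.document.Document;
import org.springframework.stereotype.Service;

@Service
public class GenerationSketch {

    private final ChatClient chatClient;

    public GenerationSketch(ChatClient.Builder builder) {
        this.chatClient = builder.build(); // backed by the local Ollama chat model
    }

    public String answer(String question, List<Document> retrievedChunks) {
        // Concatenate the retrieved chunks into a single context block.
        String context = retrievedChunks.stream()
                .map(Document::getText)
                .collect(Collectors.joining("\n---\n"));

        return chatClient.prompt()
                .system("Answer using only the provided context. Say so if the context is insufficient.")
                .user(u -> u.text("Context:\n{context}\n\nQuestion: {question}")
                        .param("context", context)
                        .param("question", question))
                .call()
                .content();
    }
}
```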
Instead of one large database, you have separate RAG systems for different domains:
```
Spring AI Application
├── Pentesting RAG ──→ Pentesting Documents → Pentesting Vector Store
├── IoT RAG ──────────→ IoT Documents → IoT Vector Store
└── ... (Blockchain, Cloud, and any other configured topics)
```
Benefits:
- Better semantic relevance (avoids mixing unrelated domains)
- Faster searches (smaller databases)
- Easy to manage (add/remove topics independently; a wiring sketch follows this list)
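A hypothetical wiring sketch for per-topic stores, assuming the stores are built manually rather than relying on the starter's single auto-configured `VectorStore`; bean and collection names are illustrative:

```java
import io.qdrant.client.QdrantClient;

import org.springframework.ai.embedding.EmbeddingModel;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.ai.vectorstore.qdrant.QdrantVectorStore;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class TopicVectorStoreConfig {

    // One Qdrant collection per topic keeps the domains fully isolated.
    @Bean
    public VectorStore pentestingVectorStore(QdrantClient client, EmbeddingModel embeddingModel) {
        return QdrantVectorStore.builder(client, embeddingModel)
                .collectionName("pentesting")
                .initializeSchema(true)
                .build();
    }

    @Bean
    public VectorStore iotVectorStore(QdrantClient client, EmbeddingModel embeddingModel) {
        return QdrantVectorStore.builder(client, embeddingModel)
                .collectionName("iot")
                .initializeSchema(true)
                .build();
    }
}
```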
Before running this project, you need:
| Requirement | Purpose |
|---|---|
| Java 21+ | Build & run the Spring application (the project targets Java 21) |
| Maven | Build & manage dependencies |
| Docker & Docker Compose | Run containerized services |
| Ollama | Local language model (runs AI locally) |
| Qdrant | Vector database (stores & searches vectors) |
| Technology | Role |
|---|---|
| Spring Boot | Web framework for the application |
| Spring AI | Abstraction layer for AI/ML operations |
| Ollama | Runs large language models locally (privacy-first) |
| Qdrant | Specialized database for vector similarity search |
| Apache Tika | Extracts text from various document formats |
| PDFBox | Reads PDF files and metadata |
| gRPC | Fast communication protocol between services |
Scenario: You have PDFs about cloud security; an end-to-end sketch follows this list.
- Upload PDFs → System reads and chunks them
- Index → Each chunk gets converted to vectors and stored in Qdrant
- User Asks → "What are cloud security best practices?"
- Search → Finds relevant chunks about cloud security
- Generate → Ollama writes an answer based on those chunks
- Return → User gets an accurate, sourced answer
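An end-to-end sketch tying the earlier hypothetical `RetrievalSketch` and `GenerationSketch` beans together behind a REST endpoint; the path and parameter names are illustrative:

```java
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class RagController {

    private final RetrievalSketch retrieval;
    private final GenerationSketch generation;

    public RagController(RetrievalSketch retrieval, GenerationSketch generation) {
        this.retrieval = retrieval;
        this.generation = generation;
    }

    // GET /ask?question=What%20are%20cloud%20security%20best%20practices%3F
    @GetMapping("/ask")
    public String ask(@RequestParam String question) {
        var chunks = retrieval.retrieve(question);   // semantic search in Qdrant
        return generation.answer(question, chunks);  // grounded answer from Ollama
    }
}
```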
✅ Private: Everything runs locally, no data sent to external APIs
✅ Accurate: Answers are grounded in your actual documents
✅ Customizable: Add any topic/domain you need
✅ Scalable: Separate topics mean independent scaling
✅ Modern: Uses cutting-edge Spring AI framework