chao-dotcom/RAGh-Tutor

RAGh-Tutor - Java Edition

A production-ready Retrieval-Augmented Generation (RAG) system built with Java 17 and Spring Boot 3.

Note: This is the Java implementation. For the Python version, see legacy Python README.

🎥 Demo

RAGh-Tutor Demo

Click the thumbnail above to watch the demo video!

✨ Features

  • 🔍 Vector Search: In-memory vector store with cosine similarity search
  • 📄 Document Processing: PDF, TXT, Markdown support with Apache PDFBox
  • 🤖 Multiple LLM Providers: OpenAI GPT-4, Anthropic Claude
  • 💬 Conversation Memory: Context-aware multi-turn conversations
  • ⚡ Streaming Responses: Server-Sent Events (SSE) for real-time streaming
  • 📊 Metrics & Monitoring: Prometheus/Grafana integration with Micrometer
  • 🔒 Security: Rate limiting, content moderation, audit logging
  • 🚀 Performance: Response caching, batch processing, performance profiling
  • 📈 Analytics: Query tracking, usage statistics, popular queries
  • 🐳 Docker Support: Full containerization with Docker Compose
  • 📚 API Documentation: Interactive OpenAPI/Swagger UI
  • 🧪 Testing: JUnit 5, Mockito, comprehensive test coverage
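
To make the "cosine similarity search" feature concrete, here is a minimal sketch of a brute-force in-memory search. It is an illustration only: the class `CosineSearchSketch` and its `Entry` and `Hit` records are hypothetical names, and the project's actual `InMemoryVectorStore` may be structured quite differently.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical minimal vector store; not the project's InMemoryVectorStore. */
class CosineSearchSketch {

    /** One stored chunk: an id plus its embedding vector. */
    record Entry(String id, double[] vector) {}

    /** A search hit with its similarity score. */
    record Hit(String id, double score) {}

    private final List<Entry> entries = new ArrayList<>();

    void add(String id, double[] vector) {
        entries.add(new Entry(id, vector.clone()));
    }

    /** Cosine similarity: dot(a, b) / (|a| * |b|); returns 0 for a zero vector. */
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vector dimensions differ");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) return 0;
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Brute-force scan: score every entry, sort descending, keep the first topK. */
    List<Hit> search(double[] query, int topK) {
        return entries.stream()
                .map(e -> new Hit(e.id(), cosine(query, e.vector())))
                .sorted(Comparator.comparingDouble(Hit::score).reversed())
                .limit(topK)
                .toList();
    }
}
```

A linear scan like this costs one pass over all stored vectors per query, which is usually acceptable for a small document set held in memory.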

🚀 Quick Start

Prerequisites

  • Java 17 or higher
  • Maven 3.6+
  • OpenAI or Anthropic API key

Installation

Option 1: Local Development

# 1. Set your API key
export OPENAI_API_KEY="your-api-key-here"

# 2. Build the project
mvn clean install

# 3. Run the application
mvn spring-boot:run

# 4. Add documents to the documents/ folder (in a second terminal, while the app is running)
cp your-document.pdf documents/

# 5. Index documents
curl -X POST http://localhost:8000/api/v1/index

# 6. Make your first query
curl -X POST http://localhost:8000/api/v1/query \
  -H "Content-Type: application/json" \
  -d '{"query": "What is retrieval augmented generation?", "topK": 5}'
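
The same query can be issued from Java with the JDK's built-in `java.net.http.HttpClient`. The sketch below mirrors the curl call above (same URL path, header, and JSON fields). The class name `QueryClientSketch` and its methods are hypothetical, and the response body is returned as a raw string because its shape is not documented here.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/** Hypothetical client helper; mirrors the curl query example. */
class QueryClientSketch {

    /** Builds the POST request only; no network I/O happens here. */
    static HttpRequest buildQueryRequest(String baseUrl, String question, int topK) {
        String json = "{\"query\": \"" + escapeJson(question) + "\", \"topK\": " + topK + "}";
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/v1/query"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    /** Escapes backslashes and double quotes only; use a JSON library for real input. */
    static String escapeJson(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Sends the request and returns the raw response body. */
    static String query(String baseUrl, String question, int topK) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response = client.send(
                buildQueryRequest(baseUrl, question, topK),
                HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```

With the application running locally, `QueryClientSketch.query("http://localhost:8000", "What is retrieval augmented generation?", 5)` would send the same request as the curl command.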

📖 Documentation

🛠️ Configuration

Edit src/main/resources/application.properties:

# Server
server.port=8000

# LLM Settings
llm.provider=openai
llm.openai.api-key=${OPENAI_API_KEY}
llm.openai.model=gpt-4
llm.temperature=0.7
llm.max-tokens=2000

# Embedding
embedding.model=sentence-transformers/all-mpnet-base-v2
embedding.dimension=768

# Retrieval
retrieval.top-k=10
retrieval.mode=hybrid

# Chunking
chunking.size=800
chunking.overlap=200

# Security
security.rate-limit.enabled=true
security.rate-limit.requests=100
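
To illustrate how `chunking.size` and `chunking.overlap` interact, here is a minimal sliding-window sketch. It assumes both values are measured in characters, which this README does not state (they could equally be tokens), and `ChunkingSketch` is a hypothetical name rather than the project's `DocumentChunker`.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical character-based chunker; units are an assumption. */
class ChunkingSketch {

    /**
     * Splits text into windows of `size` characters, where each window starts
     * `size - overlap` characters after the previous one.
     */
    static List<String> chunk(String text, int size, int overlap) {
        if (size <= 0 || overlap < 0 || overlap >= size) {
            throw new IllegalArgumentException("Require size > 0 and 0 <= overlap < size");
        }
        List<String> chunks = new ArrayList<>();
        int step = size - overlap;
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + size, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) break; // last window reached the end of the text
        }
        return chunks;
    }
}
```

Under that assumption, a size of 800 and an overlap of 200 split a 2,000-character text into three chunks starting at offsets 0, 600, and 1200, so adjacent chunks share 200 characters of context.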

📡 API Endpoints

Core Endpoints

  • GET /api/v1/health - Health check
  • GET /api/v1/health/detailed - Detailed health with component status
  • GET /api/v1/ready - Kubernetes readiness probe
  • POST /query - Query the knowledge base
  • POST /query/multi-document - Query across multiple specific documents
  • POST /stream - Streaming query (Server-Sent Events)
  • GET /docs - Interactive API documentation (Swagger UI)
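
A client of the streaming endpoint has to decode the Server-Sent Events wire format. The sketch below parses the standard `data:` fields of a `text/event-stream` body. It makes no assumption about what each payload contains, since that is not documented here, and `SseParserSketch` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical minimal parser for the standard SSE "data:" field. */
class SseParserSketch {

    /** Collects the data payload of each complete event in an event-stream body. */
    static List<String> parseEvents(String body) {
        List<String> events = new ArrayList<>();
        StringBuilder data = new StringBuilder();
        boolean hasData = false;
        for (String line : body.split("\r?\n", -1)) {
            if (line.isEmpty()) {
                // A blank line terminates the current event.
                if (hasData) events.add(data.toString());
                data.setLength(0);
                hasData = false;
            } else if (line.startsWith("data:")) {
                String value = line.substring(5);
                if (value.startsWith(" ")) value = value.substring(1); // strip one leading space
                if (hasData) data.append('\n'); // multiple data lines join with newlines
                data.append(value);
                hasData = true;
            }
            // Comment lines (":...") and other fields (event:, id:, retry:) are ignored.
        }
        return events;
    }
}
```

An event that is not followed by a blank line is treated as incomplete and dropped, which matches how the event-stream format dispatches events.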

Document Management

  • POST /documents/upload - Upload and index a new document
  • POST /index - Index all documents from documents folder

Processing Endpoints

  • POST /process/audio - Transcribe audio files
  • POST /process/image - Extract text from images using OCR

Conversation Management

  • GET /conversation/{session_id}/history - Get conversation history
  • DELETE /conversation/{session_id} - Clear conversation history

Feedback & Analytics

  • POST /feedback - Submit user feedback
  • GET /feedback/stats - Get feedback statistics

Monitoring & Metrics

  • GET /metrics - Get system metrics (JSON)
  • GET /metrics/prometheus - Prometheus metrics endpoint
  • WebSocket /ws/{client_id} - WebSocket for real-time streaming

See guide/quick-start.md for full API reference.

Monitoring & Observability

The system includes comprehensive monitoring capabilities:

  • Prometheus Metrics: Export metrics at /metrics/prometheus
  • Health Checks: Basic (/health) and detailed (/health/detailed) health endpoints
  • Performance Profiling: Built-in performance profiler for optimization
  • Tracing: Distributed tracing support
  • Query Analytics: Track query patterns, performance, and usage statistics

Monitoring Setup

With Docker Compose, Prometheus is automatically configured:

# Access Prometheus UI
# http://localhost:9090

For production deployments, see k8s/ directory for Kubernetes manifests with monitoring configured.

Deployment

Docker Production

Production-ready Docker Compose configuration is available:

# Use production configuration
docker-compose -f docker/docker-compose.prod.yml up -d

Kubernetes

Kubernetes deployment manifests are available in the k8s/ directory:

  • Deployment with horizontal pod autoscaling
  • Service and Ingress configuration
  • ConfigMap and Secrets management
  • Persistent volume claims for data storage

Documentation

  • Windows Setup: guide/windows-setup.md ⭐ (No Make required - Windows-friendly commands)
  • Quick Start: guide/quick-start.md
  • Docker Setup: guide/docker-setup-instructions.md
  • Docker Quick Fix: guide/docker-quick-fix.md (solves installation errors)
  • Troubleshooting: guide/troubleshooting.md

Project Structure

RAGh-Tutor/
├── pom.xml                          # Maven dependencies
├── Dockerfile                       # Docker configuration
├── docker-compose-java.yml          # Docker Compose setup
├── build.sh / build.bat             # Build scripts
├── prometheus.yml                   # Prometheus config
│
├── src/
│   ├── main/
│   │   ├── java/com/ragtutor/
│   │   │   ├── RagTutorApplication.java    # Spring Boot entry point
│   │   │   │
│   │   │   ├── config/                     # Configuration classes
│   │   │   │   ├── AppConfig.java
│   │   │   │   ├── LLMConfig.java
│   │   │   │   ├── EmbeddingConfig.java
│   │   │   │   ├── RetrievalConfig.java
│   │   │   │   ├── ChunkingConfig.java
│   │   │   │   ├── MemoryConfig.java
│   │   │   │   ├── AgentConfig.java
│   │   │   │   ├── SecurityConfig.java
│   │   │   │   └── WebConfig.java
│   │   │   │
│   │   │   ├── controller/                 # REST API
│   │   │   │   └── RagController.java
│   │   │   │
│   │   │   ├── service/                    # Business logic
│   │   │   │   ├── QueryService.java
│   │   │   │   ├── DocumentService.java
│   │   │   │   ├── ConversationService.java
│   │   │   │   ├── FeedbackService.java
│   │   │   │   ├── HealthService.java
│   │   │   │   ├── MetricsService.java
│   │   │   │   └── InitializationService.java
│   │   │   │
│   │   │   ├── schemas/                    # DTOs
│   │   │   │   ├── QueryRequest.java
│   │   │   │   ├── QueryResponse.java
│   │   │   │   ├── ChatRequest.java
│   │   │   │   ├── Document.java
│   │   │   │   └── ...
│   │   │   │
│   │   │   ├── retrieval/                  # Vector store
│   │   │   │   └── InMemoryVectorStore.java
│   │   │   │
│   │   │   ├── embedding/                  # Embedding generation
│   │   │   │   └── EmbeddingModelService.java
│   │   │   │
│   │   │   ├── generation/                 # LLM integration
│   │   │   │   └── LLMClient.java
│   │   │   │
│   │   │   ├── chunking/                   # Document chunking
│   │   │   │   └── DocumentChunker.java
│   │   │   │
│   │   │   ├── processing/                 # Document processing
│   │   │   │   └── DocumentLoader.java
│   │   │   │
│   │   │   ├── memory/                     # Conversation management
│   │   │   │   └── ConversationManager.java
│   │   │   │
│   │   │   ├── agents/                     # RAG agent
│   │   │   │   └── RAGAgent.java
│   │   │   │
│   │   │   ├── security/                   # Security components
│   │   │   │   ├── ContentModerator.java
│   │   │   │   ├── AuditLogger.java
│   │   │   │   └── ActionBudgetGuard.java
│   │   │   │
│   │   │   ├── middleware/                 # Middleware
│   │   │   │   └── RateLimiterFilter.java
│   │   │   │
│   │   │   ├── monitoring/                 # Observability
│   │   │   │   ├── PerformanceProfiler.java
│   │   │   │   └── TracingService.java
│   │   │   │
│   │   │   ├── performance/                # Performance optimization
│   │   │   │   └── ResponseCache.java
│   │   │   │
│   │   │   ├── features/                   # Advanced features
│   │   │   │   └── QueryAnalytics.java
│   │   │   │
│   │   │   ├── utils/                      # Utilities
│   │   │   │   ├── TextUtils.java
│   │   │   │   └── FileUtils.java
│   │   │   │
│   │   │   ├── exception/                  # Exception handling
│   │   │   │   └── GlobalExceptionHandler.java
│   │   │   │
│   │   │   └── listener/                   # Event listeners
│   │   │       └── ApplicationStartupListener.java
│   │   │
│   │   └── resources/
│   │       └── application.properties      # Spring configuration
│   │
│   └── test/
│       └── java/com/ragtutor/              # JUnit tests
│           ├── RagTutorApplicationTests.java
│           ├── HealthServiceTest.java
│           └── DocumentChunkerTest.java
│
├── docker/                                  # Docker configurations
├── k8s/                                     # Kubernetes manifests
├── guide/                                   # Documentation
├── data/                                    # Data storage
│   ├── embeddings/
│   ├── cache/
│   └── feedback/
├── documents/                               # Document upload directory
└── logs/                                    # Application logs

🧪 Testing

# Run all tests
mvn test

# Run with coverage
mvn clean test jacoco:report

# Run integration tests
mvn verify

# View coverage report (macOS; use xdg-open on Linux)
open target/site/jacoco/index.html

🚒 Deployment

Docker Production

# Build production image
docker build -t rag-tutor:latest .

# Run with environment variables
docker run -d -p 8000:8000 \
  -e OPENAI_API_KEY="$OPENAI_API_KEY" \
  -v "$(pwd)/documents:/app/documents" \
  -v "$(pwd)/data:/app/data" \
  rag-tutor:latest

Kubernetes

Kubernetes manifests are available in the k8s/ directory:

  • Deployment with horizontal pod autoscaling
  • Service and Ingress configuration
  • ConfigMap and Secrets management
  • Persistent volume claims

kubectl apply -f k8s/

📊 Monitoring

Prometheus & Grafana

# Start with monitoring stack
docker-compose -f docker-compose-java.yml up -d

# Access dashboards
# Prometheus: http://localhost:9090
# Grafana: http://localhost:3000 (admin/admin)
# Metrics: http://localhost:8000/actuator/prometheus

Metrics Tracked

  • Query latency and throughput
  • Retrieval performance
  • LLM generation time
  • Cache hit rates
  • Error rates by type
  • JVM metrics (heap, GC, threads)

🔒 Security Features

  • ✅ Rate Limiting: Token bucket algorithm (100 req/min default)
  • ✅ Content Moderation: Filters inappropriate content
  • ✅ Audit Logging: Complete audit trail of operations
  • ✅ Action Budget: Prevents abuse with session limits
  • ✅ Input Validation: Bean Validation on all inputs
  • ✅ CORS Configuration: Configurable cross-origin policies
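
For readers unfamiliar with the token bucket algorithm named in the rate limiting item, here is a minimal sketch. The class `TokenBucketSketch` is hypothetical and is not the project's `RateLimiterFilter`. Time is passed in explicitly so the behaviour is easy to test.

```java
/** Hypothetical token bucket; the caller supplies the clock reading in nanoseconds. */
class TokenBucketSketch {
    private final long capacity;
    private final double refillPerNano;
    private double tokens;
    private long lastRefillNanos;

    /** The bucket starts full and regains `capacity` tokens every `refillPeriodNanos`. */
    TokenBucketSketch(long capacity, long refillPeriodNanos, long nowNanos) {
        this.capacity = capacity;
        this.refillPerNano = (double) capacity / refillPeriodNanos;
        this.tokens = capacity;
        this.lastRefillNanos = nowNanos;
    }

    /** Refills proportionally to elapsed time, then spends one token if available. */
    synchronized boolean tryAcquire(long nowNanos) {
        double refill = (nowNanos - lastRefillNanos) * refillPerNano;
        tokens = Math.min(capacity, tokens + refill);
        lastRefillNanos = nowNanos;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```

For the default of 100 requests per minute, one such bucket per client could be created with `new TokenBucketSketch(100, java.time.Duration.ofMinutes(1).toNanos(), System.nanoTime())` and consulted with `tryAcquire(System.nanoTime())` on each request. This allows a burst of up to 100 requests followed by a steady refill.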

⚑ Performance Features

  • βœ… Response Caching: Caffeine cache for frequent queries
  • βœ… Batch Processing: Efficient batch embedding generation
  • βœ… Connection Pooling: HTTP client connection reuse
  • βœ… Async Operations: CompletableFuture for parallel processing
  • βœ… Performance Profiling: Detailed timing metrics
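
The idea behind response caching can be sketched with the JDK alone. The example below is a stand-in, not the project's Caffeine-based `ResponseCache`: it uses an access-ordered `LinkedHashMap` as a small LRU cache, whereas Caffeine also provides features such as time-based expiry. The name `LruResponseCacheSketch` is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical LRU cache keyed by query text; a JDK-only stand-in for Caffeine. */
class LruResponseCacheSketch {
    private final Map<String, String> cache;

    LruResponseCacheSketch(int maxEntries) {
        // accessOrder=true moves an entry to the end on every get/put,
        // so the eldest entry is always the least recently used one.
        this.cache = new LinkedHashMap<>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Returns the cached answer, or computes and stores it on a miss. */
    synchronized String get(String query, Function<String, String> loader) {
        String cached = cache.get(query);
        if (cached != null) return cached;
        String answer = loader.apply(query);
        cache.put(query, answer);
        return answer;
    }

    synchronized int size() {
        return cache.size();
    }
}
```

Here the `loader` would be the expensive retrieval and generation step, so repeated identical queries skip the LLM call entirely.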

🔄 Migration from Python

Migrating from the Python version? See Python to Java Migration Guide.

Key Differences:

  • FastAPI → Spring Boot
  • asyncio → CompletableFuture
  • Pydantic → Lombok + Bean Validation
  • FAISS → In-memory vector store
  • Port: Same (8000)
  • API: Compatible endpoints
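
As an illustration of the asyncio to CompletableFuture mapping, the sketch below shows a rough Java counterpart to awaiting several tasks with `asyncio.gather`. The helper `ParallelSketch.mapAll` is hypothetical and does not come from the codebase. It runs tasks on the common fork-join pool, which is the default for `supplyAsync`.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

/** Hypothetical helper showing a gather-style fan-out with CompletableFuture. */
class ParallelSketch {

    /** Runs `task` on every input concurrently and returns results in input order. */
    static <T, R> List<R> mapAll(List<T> inputs, Function<T, R> task) {
        List<CompletableFuture<R>> futures = inputs.stream()
                .map(in -> CompletableFuture.supplyAsync(() -> task.apply(in)))
                .toList();
        // allOf completes once every future is done, so the joins below do not block.
        CompletableFuture.allOf(futures.toArray(new CompletableFuture<?>[0])).join();
        return futures.stream().map(CompletableFuture::join).toList();
    }
}
```

If any task throws, `join()` rethrows the failure wrapped in a `CompletionException`, so callers that need partial results would have to handle errors per future instead.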

🀝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit changes (git commit -m 'Add amazing feature')
  4. Push to branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📄 License

MIT License - See LICENSE file for details

🆘 Support

🙏 Acknowledgments

  • Original Python implementation
  • Spring Boot framework
  • LangChain4j library
  • Apache PDFBox
  • OpenAI & Anthropic

Built with ☕ and Java 17

About

RAG Tutor, named "RAGh!" after a tiger's roar 🦁