
Chatbot-agentic-RAG

A microservices-based RAG chatbot using LangGraph for workflow orchestration, FAISS for retrieval, and FastAPI for all service endpoints. Features include iterative LLM-driven retrieval, centralized logging with SSE streaming, multi-store management, and text-to-speech capabilities.

Tech Stack: Python 3.13 | FastAPI | LangGraph | FAISS | HuggingFace Transformers | OpenAI | Docker Compose

Architecture

Services:

  • Workflow (port 8000): LangGraph-based orchestration with /run and /tts endpoints
  • Retriever (port 8001): FAISS vector search with multi-store support, document management, and configurable chunking
  • LLM (port 8002): ChatOpenAI wrapper for answer generation and retrieval decisions
  • Frontend (port 8003): React SPA with TTS, vector store management, and document upload
  • Logger (port 8004): Centralized log collection with SSE streaming
  • TTS (port 8005): External Kokoro TTS service (optional; see TTS_kokoro)

LangGraph state graph (diagram)

Quick Start

# Set required environment variables (PowerShell syntax; on Linux/macOS use `export VAR=value`)
$env:OPENAI_API_KEY = "your-key-here"
$env:PATH_TO_FAISS_INDEX = "./faiss_Hugging_index"

# Build and run all services
docker compose build --no-cache --pull
docker compose up -d

# Verify health
curl http://localhost:8000/health  # workflow
curl http://localhost:8003          # frontend

Access the UI at http://localhost:8003

Project Structure

app/
├── langgraph_code/      # Workflow orchestration
│   └── src/
│       ├── workflow.py  # LangGraph state machine
│       ├── nodes.py     # Node implementations
│       ├── wf_api.py    # Main FastAPI app
│       └── tts_api.py   # TTS proxy endpoint
├── llm/                 # LLM service
├── retriever/src/       # Retriever service
│   ├── retriever.py     # FastAPI endpoints
│   ├── crud.py          # Database CRUD operations
│   ├── database.py      # SQLAlchemy models & templates
│   └── faiss_utils.py   # FAISS operations
├── frontend/            # React SPA with TypeScript
└── logger_service/      # Centralized logging
tests/                   # Unit & integration tests
docker-compose.yml       # Service orchestration

How It Works

  1. User submits question → Workflow invokes LLM to decide next action (retrieve/answer/clarify)
  2. LLM requests retrieval → Workflow calls FAISS retriever with targeted query
  3. Documents returned → LLM evaluates if sufficient to answer (iterates up to 5x)
  4. Final answer generated → Response includes answer, context, and retrieval count
  5. TTS playback → Optional audio synthesis via speaker button in UI

The LLM maintains a context summary across iterations to track gathered information and avoid redundant retrievals.
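
A full round trip through the workflow looks like this; the response shape in the comment is an illustrative sketch based on the description above, not the exact schema:

curl -X POST http://localhost:8000/run \
  -H "Content-Type: application/json" \
  -d '{"question": "What does the retriever service do?", "k": 3}'

# Illustrative response shape:
# {"answer": "...", "context": "...", "retrieval_count": 2}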

API Endpoints

Workflow Service (port 8000)

  • POST /run - Execute RAG workflow: {"question": "...", "k": 3}
  • POST /tts - Synthesize speech: {"text": "...", "voice": "am_onyx", "speed": 1.0}
  • GET /health, GET /ready - Health checks
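
For example, speech can be synthesized and saved to a file (the output format is an assumption; Kokoro typically returns WAV audio):

curl -X POST http://localhost:8000/tts \
  -H "Content-Type: application/json" \
  -d '{"text": "Hello from the chatbot.", "voice": "am_onyx", "speed": 1.0}' \
  --output speech.wav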

LLM Service (port 8002)

  • POST /retrieve_or_respond - Decide next action
  • POST /generate_answer - Generate final answer
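
These endpoints are normally called by the workflow service, but they can be exercised directly; the request fields below are assumptions for illustration only:

curl -X POST http://localhost:8002/retrieve_or_respond \
  -H "Content-Type: application/json" \
  -d '{"question": "How is chunking configured?", "context_summary": ""}'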

Retriever Service (port 8001)

  • POST /stores/{store_id}/retrieve - FAISS semantic search
  • GET/POST/PATCH/DELETE /stores - Vector store CRUD
  • POST /stores/{store_id}/upload - Upload documents (.txt, .md)
  • GET/PATCH/DELETE /stores/{store_id}/documents/{doc_id} - Document management
  • GET/POST/PATCH/DELETE /templates - Prompt template management
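
A typical lifecycle: create a store, upload a document, then search it. The store ID and the JSON/form field names here are assumptions for illustration:

# Create a vector store
curl -X POST http://localhost:8001/stores -H "Content-Type: application/json" -d '{"name": "docs"}'

# Upload a markdown document to store 1
curl -X POST http://localhost:8001/stores/1/upload -F "file=@notes.md"

# Semantic search against store 1
curl -X POST http://localhost:8001/stores/1/retrieve -H "Content-Type: application/json" -d '{"question": "chunking defaults", "k": 3}'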

Logger Service (port 8004)

  • POST /logs - Submit logs
  • GET /stream - SSE log stream
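
The stream can be tailed live with curl; -N disables output buffering so events print as they arrive. The POST body fields are assumptions for illustration:

# Follow the live log stream
curl -N http://localhost:8004/stream

# Submit a log entry manually
curl -X POST http://localhost:8004/logs -H "Content-Type: application/json" -d '{"service": "retriever", "level": "INFO", "message": "index loaded"}'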

Configuration

Environment Variables:

  • OPENAI_API_KEY - Required for LLM service
  • PATH_TO_FAISS_INDEX - Default FAISS index directory (store 1)
  • CHUNK_SIZE, CHUNK_OVERLAP - Document chunking config (defaults: 4000, 800)
  • MODEL_NAME_EMBEDDING - HuggingFace embedding model (default: sentence-transformers/all-MiniLM-L6-v2)
  • LANGGRAPH_LLM_API_URL - LLM service URL (default: http://localhost:8002)
  • LANGGRAPH_RETRIEVER_API_URL - Retriever URL (default: http://localhost:8001)
  • TTS_SERVICE_URL - TTS service URL (default: http://tts_service:8005)
  • MODEL_NAME_LLM, TEMPERATURE_LLM, MAX_TOKENS - LLM configuration
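
A minimal .env sketch using only the variables above; values shown are the documented defaults where one exists:

OPENAI_API_KEY=your-key-here
PATH_TO_FAISS_INDEX=./faiss_Hugging_index
CHUNK_SIZE=4000
CHUNK_OVERLAP=800
MODEL_NAME_EMBEDDING=sentence-transformers/all-MiniLM-L6-v2
LANGGRAPH_LLM_API_URL=http://localhost:8002
LANGGRAPH_RETRIEVER_API_URL=http://localhost:8001
TTS_SERVICE_URL=http://tts_service:8005
# MODEL_NAME_LLM, TEMPERATURE_LLM, MAX_TOKENS: optional LLM tuning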

Development

Run tests:

pytest                    # All tests
pytest tests/test_llm.py  # Specific module
pytest --cov=.            # With coverage

Local development without Docker:

# Install dependencies
pip install -r requirements.txt

# Run services individually
uvicorn app.retriever.src.retriever:app --port 8001
uvicorn app.llm.src.llm_api:app --port 8002
uvicorn app.langgraph_code.src.wf_api:app --port 8000

CI/CD & Security

  • GitHub Actions: Automated testing with ruff linting and pytest
  • SonarQube Cloud: Code quality and security analysis
  • Snyk: Dependency vulnerability scanning

All Docker containers run as a non-root user (appuser) for security.

Text-to-Speech Setup

  1. Run the external Kokoro TTS service on port 8005
  2. Configure Docker networks: add tts_kokoro_network to the langgraph service (see the compose sketch below)
  3. Set TTS_SERVICE_URL environment variable
  4. Click speaker icons in UI to play audio

Note: HTTP is used for internal Docker communication as services are network-isolated.
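
A compose override sketch for steps 2-3; the external network name matches step 2 and TTS_SERVICE_URL uses its documented default (adjust both to your Kokoro setup):

networks:
  tts_kokoro_network:
    external: true

services:
  langgraph:
    networks:
      - default
      - tts_kokoro_network
    environment:
      - TTS_SERVICE_URL=http://tts_service:8005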

AI-Assisted Development

The React frontend was developed with GitHub Copilot as an experiment in AI-accelerated learning. See AI_DEVELOPMENT.md for details on this approach and insights gained.
