An intelligent Telegram bot powered by a Retrieval-Augmented Generation (RAG) pipeline, built from scratch to answer questions based on a custom knowledge base. It handles both text and voice messages, maintains conversation history, and cites its sources.
This project serves as a clear, practical demonstration of how to build a modern LLM assistant without high-level frameworks like LangChain, offering a deep dive into the mechanics of a RAG pipeline.
➡️ For a detailed component breakdown and logic, see ARCHITECTURE.md.
- Data-Grounded Responses (RAG): The bot uses documents from a knowledge base as its primary source of truth, which grounds answers and reduces hallucinations.
- Voice Support: Integrated speech recognition via OpenAI allows users to ask questions using voice messages.
- Conversation Memory: Remembers recent messages to maintain a coherent dialogue.
- Source Citations: Cites the source document for answers drawn from the knowledge base.
- Flexible Configuration: Key parameters (GPT model, relevance threshold, chunk size) are managed in a central config file.
- Usage Limiting: A built-in system to control the number of requests per user.
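The central config file mentioned above might look something like the sketch below. All names and values here are illustrative assumptions, not the project's actual identifiers:

```python
# config.py — illustrative sketch of a central configuration module.
# Every name and value below is an assumption, not the project's real settings.

GPT_MODEL = "gpt-4o-mini"                   # OpenAI chat model used for generation
EMBEDDING_MODEL = "text-embedding-3-small"  # model used to vectorize chunks
RELEVANCE_THRESHOLD = 0.75                  # minimum similarity score for a chunk to be used
CHUNK_SIZE = 500                            # characters per chunk when indexing
CHUNK_OVERLAP = 50                          # character overlap between adjacent chunks
HISTORY_LENGTH = 6                          # recent messages kept as conversation memory
MAX_REQUESTS_PER_USER = 50                  # usage limit per Telegram user
```

Keeping these values in one module means tuning retrieval quality (threshold, chunk size) never requires touching the bot or pipeline code.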
- Indexing: Local `.txt` files from the `data/vectordb` directory are loaded, split into smaller chunks, and vectorized using OpenAI's embedding models. These vectors are stored in a local ChromaDB instance.
- Retrieval: When a user asks a question, it is also vectorized. The system then searches the database for the most semantically similar text chunks.
- Generation: The retrieved chunks, conversation history, and the user's original question are combined into a comprehensive prompt, which is sent to the GPT model to generate a final, context-aware answer.
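The chunking and prompt-assembly steps above can be sketched in plain Python. The function names and parameters are illustrative assumptions, not the project's actual API:

```python
# Sketch of two pipeline steps: splitting documents into overlapping chunks
# (Indexing) and combining retrieved context with history and the question
# (Generation). Names and defaults are assumptions, not the project's code.

def split_into_chunks(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split a document into overlapping character chunks for indexing."""
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks

def build_prompt(question: str, retrieved: list[str], history: list[str]) -> str:
    """Combine retrieved chunks, conversation history, and the user's question."""
    context = "\n---\n".join(retrieved)
    dialogue = "\n".join(history)
    return (
        "Answer using only the context below. Cite the source document if used.\n\n"
        f"Context:\n{context}\n\n"
        f"Conversation so far:\n{dialogue}\n\n"
        f"Question: {question}"
    )
```

The overlap between adjacent chunks helps ensure that a sentence straddling a chunk boundary still appears intact in at least one chunk, which improves retrieval recall.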
- Python 3.10+
- An OpenAI API Key
- A Telegram Bot Token
1. Clone the repository:
```bash
git clone https://github.com/your-username/rag-telegram-assistant.git
cd rag-telegram-assistant
```
2. Create and activate a virtual environment:
```bash
python -m venv venv
source venv/bin/activate  # On Windows, use: venv\Scripts\activate
```
3. Install dependencies:
```bash
pip install -r requirements.txt
```
4. Configure environment variables:
Create a .env file in the project root by copying .env.example or creating it from scratch. Fill in your API keys:
```
TELEGRAM_BOT_TOKEN="Your token from @BotFather"
OPENAI_API_KEY="Your key from OpenAI"
```
5. Prepare your knowledge base:
Place your custom .txt files into the data/vectordb directory.
6. Create the vector index: Run this script once to index your documents. Re-run it whenever you update the knowledge base.
```bash
python src/run_indexer.py
```
7. Run the bot:
```bash
python src/main.py
```
Your assistant is now live and ready to chat in Telegram!