Project Architecture: Assistant

This document describes the internal structure and logic of the Telegram bot.

1. Overall Workflow

The project implements the Retrieval-Augmented Generation (RAG) pattern.

  1. Input Request: A user sends a text or voice message in Telegram.
  2. Processing:
    • Voice-to-Text (if needed): Voice messages are transcribed into text using the gpt-4o-mini-transcribe model.
    • Knowledge Base Search (Retrieval): The query text is converted into a vector (embedding) and used to find the most relevant text fragments (chunks) in the vector database (ChromaDB).
    • Prompt Assembly: The relevant chunks are combined with the dialogue history and the user's original question into the final prompt for the Large Language Model (LLM).
    • Response Generation: The assembled prompt is sent to the OpenAI (GPT) model, which generates the final answer.
  3. User Response: The generated answer is sent back to the user in Telegram.
  4. Logging: The entire conversation (question-answer) is saved to a text log file.
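The workflow above can be sketched end to end. Every name in this sketch is illustrative rather than the project's actual API; the OpenAI client and the ChromaDB collection are passed in as parameters to keep the example self-contained:

```python
# Illustrative sketch of the RAG workflow described above.
# All function names here are hypothetical; the project's real API may differ.

def build_messages(chunks: list[str], history: list[dict], question: str) -> list[dict]:
    """Prompt assembly: retrieved chunks + dialogue history + the user's question."""
    context = "\n\n".join(chunks)
    return [
        {"role": "system", "content": f"Answer using only this context:\n{context}"},
        *history,
        {"role": "user", "content": question},
    ]

def answer(question: str, history: list[dict], collection, client) -> str:
    # Retrieval: ChromaDB embeds the query and returns the closest chunks.
    results = collection.query(query_texts=[question], n_results=3)

    # Generation: send the assembled prompt to the OpenAI API.
    messages = build_messages(results["documents"][0], history, question)
    response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    return response.choices[0].message.content
```

The split between `build_messages` and `answer` mirrors the project's separation of prompt templates (prompt.py) from orchestration (assistant.py).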

2. Module Structure (src)

  • main.py: Entry Point. Initializes the bot, the aiogram dispatcher, databases, and starts polling.

  • /handlers: Message Handlers.

    • handlers.py: Receives messages from aiogram (commands, text, voice). It performs initial validation (request limits), manages conversation state (FSM), and triggers the main processing logic.
  • /assistant: Assistant Core (LLM).

    • assistant.py: Contains the core RAG logic. The get_assistant_response function orchestrates the vector search, chunk filtering, prompt creation, and the call to the OpenAI API.
    • prompt.py: Stores system prompts and templates for the LLM, separating logic from model instructions.
  • /database: Data Management.

    • vector_store.py: An abstraction layer for the ChromaDB vector database. It provides functions for searching (search_vector_store) and loading/creating the database.
    • build_index.py: A script for pre-processing the knowledge base. It reads .txt files, splits them into chunks, and vectorizes them.
    • user_database.py: Manages a user database with SQLite (for demonstration purposes) to track request counts and enforce daily limits.
    • database.py: Basic functions for initializing the databases.
  • /transcriber: Speech Recognition.

    • transcriber.py: Responsible for processing voice messages. It downloads the audio file, sends it to the OpenAI API with the gpt-4o-mini-transcribe model, and returns the text.
  • /utils: Helper Utilities.

    • utils.py: Contains the log_conversation function for saving conversation histories to text files.
  • config.py: Configuration. A central place for all settings: tokens, API keys, file paths (LOGS_DIR, DB_PATH), and RAG/model parameters (CHUNK_SIZE, RELEVANCE_THRESHOLD).
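A config.py along these lines would fit the description above. Only the names mentioned in this document (LOGS_DIR, DB_PATH, CHUNK_SIZE, RELEVANCE_THRESHOLD) come from the project; every concrete value below is an illustrative placeholder:

```python
# Hypothetical sketch of config.py; all values are illustrative.
import os
from pathlib import Path

BOT_TOKEN = os.getenv("BOT_TOKEN", "")            # Telegram bot token
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY", "")  # OpenAI credentials

LOGS_DIR = Path("logs")          # conversation log files
DB_PATH = Path("data/users.db")  # SQLite user database

CHUNK_SIZE = 500                 # characters per knowledge-base chunk
RELEVANCE_THRESHOLD = 0.75       # minimum similarity for a retrieved chunk
```

Keeping all tunables in one module lets build_index.py and assistant.py share the same chunking and filtering parameters.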

3. Request Lifecycle

Text Message:

  1. handlers.py catches the message.
  2. The user's request limit is checked in user_database.
  3. assistant.py:get_assistant_response is called with the question text and history.
  4. assistant.py calls database/vector_store.py to find relevant documents.
  5. assistant.py builds a prompt using a template from prompt.py and the retrieved documents.
  6. assistant.py makes a request to the OpenAI API.
  7. handlers.py receives the response and sends it to the user.
  8. utils.py:log_conversation saves the dialogue.
  9. The conversation history is updated in the FSM.
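The limit check in step 2 could look roughly like this. This is a sketch over SQLite with a hypothetical table schema and daily cap; the real user_database.py may be structured differently:

```python
# Hypothetical sketch of a per-user daily request limit backed by SQLite.
import sqlite3
from datetime import date

DAILY_LIMIT = 20  # illustrative cap, not the project's actual value

def check_and_count(conn: sqlite3.Connection, user_id: int) -> bool:
    """Return True if the user is under today's limit, recording the request."""
    today = date.today().isoformat()
    conn.execute(
        """CREATE TABLE IF NOT EXISTS requests
           (user_id INTEGER, day TEXT, count INTEGER,
            PRIMARY KEY (user_id, day))"""
    )
    row = conn.execute(
        "SELECT count FROM requests WHERE user_id = ? AND day = ?",
        (user_id, today),
    ).fetchone()
    if row and row[0] >= DAILY_LIMIT:
        return False  # limit reached: handlers.py would reject the message
    conn.execute(
        """INSERT INTO requests (user_id, day, count) VALUES (?, ?, 1)
           ON CONFLICT(user_id, day) DO UPDATE SET count = count + 1""",
        (user_id, today),
    )
    conn.commit()
    return True
```

Keying the counter on (user_id, day) means old rows simply stop matching at midnight, so no reset job is needed.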

Voice Message:

  1. handlers.py catches the voice message.
  2. transcriber.py:process_voice_message is called.
  3. transcriber.py downloads the file, sends it to the OpenAI API with the gpt-4o-mini-transcribe model, and receives the text.
  4. From here, the process is identical to the text message flow, starting from step 2.
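Step 3 maps onto the OpenAI audio transcription endpoint. In this sketch the helper name and the injected client are assumptions; the real transcriber.py may differ:

```python
# Hypothetical sketch of the transcription step in transcriber.py.
# `client` is expected to be an openai.OpenAI instance, passed in so the
# sketch stays self-contained and testable.

def transcribe_voice(client, audio_path: str) -> str:
    """Send a downloaded voice file to the transcription model, return its text."""
    with open(audio_path, "rb") as f:
        result = client.audio.transcriptions.create(
            model="gpt-4o-mini-transcribe",
            file=f,
        )
    return result.text
```

Once the text comes back, handlers.py can feed it into the same pipeline as a typed message, which is why the two lifecycles converge at step 2.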