This module implements the core GenAI capabilities for Study Mate, leveraging Retrieval-Augmented Generation (RAG) and LangChain to enable document-aware responses, summaries, flashcards, and quizzes.
When a user uploads a `.pdf` or `.txt` document, the system:
- Parses and splits the content into meaningful chunks
- Embeds content using HuggingFace (MiniLM) and stores it in Weaviate
- Supports document-specific chat using RAG
- Generates:
  - Structured summaries (Markdown)
  - Flashcards (difficulty-tagged)
  - Quizzes (MCQ and short-answer)
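The parse-and-split step can be sketched in plain Python. This is a minimal illustration of overlapping character chunking, not the module's actual splitter; the function name and the size/overlap values are assumptions for the example.

```python
# Illustrative sketch of the chunking step (not the module's real splitter).
# chunk_size and overlap values here are arbitrary example defaults.
def split_into_chunks(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks with overlap, so that
    context at chunk boundaries is shared between neighbouring chunks."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks
```

Overlap keeps sentences that straddle a boundary retrievable from at least one chunk, which matters for semantic search quality.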
| File | Description |
|---|---|
| `llm.py` | Manages all GenAI functionality: chat, summarization, flashcards, quiz generation |
| `rag.py` | Handles ingestion, chunking, metadata, vector embedding, and retrieval via Weaviate |
| `chains.py` | Defines custom LangChain chains to generate structured outputs (flashcards, quizzes) |
- LangChain for LLM orchestration
- Weaviate stores two types of chunks:
  - `RAGChunksIndex`: vectorized, small chunks for semantic search
  - `GenerationChunksIndex`: larger, plain-text chunks for generative tasks (e.g., summaries)
- HuggingFace MiniLM used for embeddings
- Open WebUI-compatible API (LLaMA 3) for LLM calls
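The two-index split above can be pictured as two record shapes. The field names below are assumptions for illustration; the real schema lives in `rag.py`.

```python
from dataclasses import dataclass

# Illustrative record shapes for the two Weaviate collections.
# Field names are assumptions, not the module's actual schema.

@dataclass
class RAGChunk:
    """Small chunk stored in RAGChunksIndex, vectorized for semantic search."""
    doc_name: str
    user_id: str
    text: str
    vector: list[float]  # MiniLM sentence embedding

@dataclass
class GenerationChunk:
    """Larger plain-text chunk stored in GenerationChunksIndex for
    generative tasks such as summaries, flashcards, and quizzes."""
    doc_name: str
    user_id: str
    text: str
    chunk_index: int = 0  # preserves document order for reassembly
```

Keeping generation chunks unvectorized avoids paying embedding cost for text that is only ever read sequentially.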
1. Load document
   - via `StudyLLM.load_document(doc_name, path, user_id)`
2. Ingest chunks
   - Embedded RAG chunks go to `RAGChunksIndex`
   - Generation chunks go to `GenerationChunksIndex`
3. Chat
   - Queries are filtered by `user_id` and optionally `doc_name`
   - Top-k relevant chunks are retrieved from Weaviate and passed to the LLM
4. Summarize / Flashcards / Quiz
   - Uses larger plain-text chunks stored per document
   - LangChain's map-reduce pattern, parallelized, generates structured output
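The map-reduce pattern mentioned in the last step can be sketched without LangChain: summarize each chunk independently and in parallel (map), then merge the partial summaries (reduce). `summarize_chunk` here is a placeholder standing in for the real LLM call.

```python
from concurrent.futures import ThreadPoolExecutor

def summarize_chunk(chunk: str) -> str:
    # Placeholder for the actual LLM call; here we just keep the
    # first sentence so the sketch stays runnable.
    return chunk.split(".")[0].strip()

def map_reduce_summarize(chunks: list[str]) -> str:
    # Map step: summarize all chunks in parallel, mirroring the
    # parallelized map-reduce chain described above.
    with ThreadPoolExecutor() as pool:
        partial = list(pool.map(summarize_chunk, chunks))
    # Reduce step: combine partial summaries into one result.
    # (A real reduce would pass them through the LLM once more.)
    return " ".join(partial)
```

Because each map call is an independent LLM request, parallelizing them cuts wall-clock time roughly by the pool size, at the cost of concurrent API load.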
| Feature | Method |
|---|---|
| Load Document | `load_document(doc_name, path, user_id)` |
| RAG Chat | `prompt(prompt, user_id)` |
| Summarize | `summarize(document_name, user_id)` |
| Flashcards | `generate_flashcards(document_name, user_id)` |
| Quiz | `generate_quiz(document_name, user_id)` |
| Cleanup (see below) | `cleanup()` |
- Only `.pdf` and `.txt` documents are supported.
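A file-type guard for the restriction above might look like this (the helper name is an assumption for illustration):

```python
from pathlib import Path

# Only these document types can be ingested, per the note above.
SUPPORTED_EXTENSIONS = {".pdf", ".txt"}

def is_supported(path: str) -> bool:
    """Return True if the file's extension is an ingestible type."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

Lower-casing the suffix keeps the check robust to uploads like `Notes.PDF`.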
Dependencies: `langchain`, `langchain-openai`, `langchain-huggingface`, `langchain-community`, `weaviate-client`, `PyMuPDF`, `dotenv`, `asyncio`, etc.
Call `StudyLLM.cleanup()` to close the Weaviate client connection properly.
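One pattern that guarantees `cleanup()` runs even when an error occurs is a context manager. The `StudyLLM` class is stubbed out below purely so the sketch is self-contained; the wrapper itself is a suggestion, not part of the module's API.

```python
from contextlib import contextmanager

class StudyLLM:
    # Minimal stub of the real class, for illustration only.
    def __init__(self):
        self.closed = False

    def cleanup(self):
        # In the real module this closes the Weaviate client connection.
        self.closed = True

@contextmanager
def study_llm_session():
    """Yield a StudyLLM instance and always clean it up afterwards."""
    llm = StudyLLM()
    try:
        yield llm
    finally:
        llm.cleanup()
```

Usage: `with study_llm_session() as llm: ...` ensures the connection is closed even if an exception is raised mid-session.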
MIT (see LICENSE)