An AI-powered study companion that helps students understand lecture material through intelligent question answering, slide summarization, PDF summaries, and flashcard generation. Built with LangChain, Hugging Face Transformers, and Gradio, and fully powered by open-source LLMs running on your local GPU.
Drag and drop any academic or lecture PDF, whether slides or notes.
Use natural language to ask:
- "Summarize Slide 4"
- "What is Linear Regression?"
- "What is covered under Artificial Neural Networks?"
Supports fuzzy slide matching and page number detection.
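
As a rough illustration of page number detection, the sketch below pulls a slide or page number out of a free-text query with a regular expression. The function name `extract_page_number` and the accepted keywords are assumptions made for this example, not necessarily what `slide_mapper.py` does.

```python
import re
from typing import Optional

# Hypothetical pattern: matches "slide 4", "Slide #4", "page 10", "pg 3".
_PAGE_REF = re.compile(r"\b(?:slide|page|pg)\s*#?\s*(\d+)\b", re.IGNORECASE)


def extract_page_number(query: str) -> Optional[int]:
    """Return the slide/page number mentioned in the query, or None."""
    match = _PAGE_REF.search(query)
    return int(match.group(1)) if match else None
```

A query with no explicit number returns `None`, which lets the caller fall back to fuzzy matching or ordinary question answering.
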
Click one button to get a concise overview of the entire document.
Automatically generates Q&A-style flashcards from your material, perfect for revision and quizzes.
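
One way to turn raw model output into flashcards is to ask the LLM for `Q:` / `A:` pairs and parse them afterwards. The sketch below assumes that output format; the format and the name `parse_flashcards` are illustrative assumptions, not the project's confirmed behavior.

```python
import re
from typing import List, Tuple

# Assumed output format: "Q: <question>" followed by "A: <answer>", repeated.
_CARD = re.compile(r"Q:\s*(.+?)\s*A:\s*(.+?)(?=\s*Q:|\Z)", re.DOTALL)


def parse_flashcards(llm_output: str) -> List[Tuple[str, str]]:
    """Split LLM output into (question, answer) pairs."""
    return [(q.strip(), a.strip()) for q, a in _CARD.findall(llm_output)]
```
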
Queries like "Summarize Slide 5" are matched against actual page content using fuzzy string matching, so a slide reference resolves to the right page even when the wording is not exact.
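
The matching idea can be sketched as follows. To keep the example dependency-free it uses `difflib.SequenceMatcher` from the standard library as a stand-in for RapidFuzz, and it assumes that the first line of each page acts as its title. The function name and the threshold value are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List, Optional, Tuple


def best_matching_page(query: str, pages: List[str], threshold: float = 0.5) -> Optional[int]:
    """Return the 1-based number of the page whose title best matches the query."""
    best: Tuple[float, Optional[int]] = (0.0, None)
    for number, text in enumerate(pages, start=1):
        stripped = text.strip()
        # Assumption: the first non-empty line of a page is its title.
        title = stripped.splitlines()[0] if stripped else ""
        score = SequenceMatcher(None, query.lower(), title.lower()).ratio()
        if score > best[0]:
            best = (score, number)
    # Reject weak matches so unrelated queries do not map to a random page.
    return best[1] if best[0] >= threshold else None
```

The threshold is a tuning knob: a higher value avoids false matches, a lower one tolerates more typos.
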
| Layer | Tools Used |
|---|---|
| Frontend | Gradio (Python UI) |
| LLM | Hugging Face Transformers (Phi-2 / Mistral 7B / Falcon) |
| Framework | LangChain |
| Vector DB | FAISS |
| PDF Parsing | PyMuPDF (via LangChain loaders) |
| Embeddings | SentenceTransformers (MiniLM-L6-v2) |
| Fuzzy Matching | RapidFuzz |
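
The stack in the table implies a retrieve-then-generate flow: split the PDF into chunks, embed them, find the chunks closest to the question, and pass them to the LLM as context. The toy sketch below shows only that flow. It swaps in word-count vectors for the MiniLM embeddings and a sorted list for the FAISS index, and every function name in it is hypothetical.

```python
import math
import re
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for sentence embeddings)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: List[str], k: int = 2) -> List[str]:
    """Return the k chunks most similar to the query (stand-in for a vector search)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: List[str]) -> str:
    """Combine the retrieved chunks and the question into a single prompt."""
    joined = "\n\n".join(context)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{joined}\n\nQuestion: {query}\nAnswer:"
    )
```

In the real pipeline the prompt returned by a step like `build_prompt` would be sent to the local model; here it is just a string.
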
ai_tutor/
├── app.py
├── ui/
│   └── interface.py
├── modules/
│   ├── document_loader.py
│   ├── vector_store.py
│   ├── rag_pipeline.py
│   ├── llm_interface.py
│   ├── slide_mapper.py
│   ├── summarizer.py
│   └── flashcards.py
├── data/
│   └── (your PDFs here)
├── requirements.txt
└── README.md
git clone https://github.com/your-username/ai-study-assistant.git
cd ai-study-assistant
python -m venv venv
source venv/bin/activate # or venv\Scripts\activate on Windows
pip install -r requirements.txt
python -m ui.interface

- "Summarize Slide 3"
- "What is supervised learning?"
- "What is covered on page 10?"
- "Generate flashcards for the regression topic"
- "Give a summary of this entire PDF"
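
The example queries above fall into a few distinct kinds of request. A minimal keyword-based router might look like the sketch below. The handler names it returns are hypothetical, and the actual app may dispatch queries differently.

```python
import re


def route_query(query: str) -> str:
    """Classify a query into one of four hypothetical handler names."""
    q = query.lower()
    if "flashcard" in q:
        return "flashcards"
    # An explicit "slide N" or "page N" reference goes to the slide mapper.
    if re.search(r"\b(?:slide|page)\s*\d+", q):
        return "slide_lookup"
    # A summary request that names the whole document.
    if "summar" in q and any(word in q for word in ("entire", "whole", "pdf", "document")):
        return "full_summary"
    # Everything else is treated as a retrieval-backed question.
    return "qa"
```
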
- microsoft/phi-2 (recommended for speed and accuracy)
- mistralai/Mistral-7B-Instruct-v0.1
- tiiuae/falcon-rw-1b (lightweight)
You can switch models in llm_interface.py.
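
Switching models could be as simple as changing one key. The sketch below is an assumption about how such a switch might be organized, not the contents of the actual file: the short keys and function names are hypothetical, while the model IDs come from the list above. The `transformers` import is deferred so the lookup works even before the library is installed.

```python
# Model IDs from the supported-models list; the short keys are hypothetical.
MODEL_IDS = {
    "phi-2": "microsoft/phi-2",
    "mistral": "mistralai/Mistral-7B-Instruct-v0.1",
    "falcon": "tiiuae/falcon-rw-1b",
}


def resolve_model(name: str = "phi-2") -> str:
    """Map a short key to its Hugging Face model ID."""
    try:
        return MODEL_IDS[name]
    except KeyError:
        raise ValueError(f"Unknown model {name!r}; choose from {sorted(MODEL_IDS)}") from None


def load_generator(name: str = "phi-2"):
    """Build a text-generation pipeline; weights are downloaded on first use."""
    from transformers import pipeline  # lazy import, requires transformers

    return pipeline("text-generation", model=resolve_model(name))
```

Calling `load_generator("falcon")` would then load the lightweight model instead of the default.
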
This project runs entirely offline on your own PC, with no API keys or cloud calls required. Perfect for privacy and local LLM experimentation.
- LangChain chains and document loaders
- RAG (Retrieval-Augmented Generation) pipeline
- Fine-tuning of LLM prompts
- Embedding & similarity search with FAISS
- PDF text extraction using PyMuPDF
- RapidFuzz for slide number matching
- Gradio UI design
Deshan Senanayake
BSc (Hons) Artificial Intelligence & Data Science
Robert Gordon University (via IIT, Sri Lanka)
Feel free to open an issue for any problems you find.
- Export flashcards as .csv or .txt
- Add MCQ quiz generator
- Add feedback loop to improve flashcards
- Deploy to Hugging Face Spaces or Streamlit Cloud
- Voice input (Whisper integration)
MIT License: free to use, modify, and share with credit.


