SyllabusRAG

SyllabusRAG is a retrieval-augmented generation (RAG) project for asking questions about a university syllabus.
It reads a syllabus file, builds a searchable knowledge base, and uses a language model to answer questions from it.

How It Works ⚡

The syllabus data is stored in a file called syllabus.json.
Each course is read and turned into a simple text summary.
These summaries are saved in knowledge_base.txt.
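
As a rough illustration of the summary step, here is a minimal sketch. It assumes syllabus.json holds a list of course objects, and the field names (code, title, prerequisites, objectives) are guesses for illustration only; the real schema in this repository may differ.

```python
import json
from pathlib import Path


def summarize_course(course: dict) -> str:
    """Flatten one course record into a single line of text."""
    # Field names are assumed for illustration; adjust them to the real schema.
    code = course.get("code", "UNKNOWN")
    title = course.get("title", "Untitled")
    prereqs = ", ".join(course.get("prerequisites", [])) or "none"
    objectives = "; ".join(course.get("objectives", [])) or "not listed"
    return (
        f"Course {code}: {title}. "
        f"Prerequisites: {prereqs}. "
        f"Objectives: {objectives}."
    )


def build_knowledge_base(json_path: str, out_path: str) -> list[str]:
    """Read the syllabus JSON and write one summary per line."""
    # Assumes the top level of the JSON file is a list of course dicts.
    courses = json.loads(Path(json_path).read_text(encoding="utf-8"))
    summaries = [summarize_course(c) for c in courses]
    Path(out_path).write_text("\n".join(summaries), encoding="utf-8")
    return summaries
```

Writing one summary per line keeps the knowledge base easy to inspect by hand and easy to reload for indexing.
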
The summaries are converted into vector embeddings using sentence-transformers.
The vectors are stored in a FAISS index for fast similarity search.
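
To show the idea behind the vector search without any heavy dependencies, the sketch below swaps in two stand-ins: a toy word-count "embedding" in place of a sentence-transformers model, and a brute-force cosine search in place of a FAISS index. The names embed, cosine and TinyIndex are hypothetical and are not taken from main.ipynb.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real sentence encoder)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class TinyIndex:
    """Brute-force nearest-neighbour index (stand-in for a FAISS index)."""

    def __init__(self, texts):
        self.texts = list(texts)
        self.vectors = [embed(t) for t in self.texts]

    def search(self, query: str, k: int = 1):
        """Return the k best (score, text) pairs, highest score first."""
        q = embed(query)
        scored = sorted(
            ((cosine(q, v), t) for v, t in zip(self.vectors, self.texts)),
            key=lambda pair: pair[0],
            reverse=True,
        )
        return scored[:k]
```

A real encoder captures meaning rather than exact word overlap, and FAISS avoids scanning every vector, but the query flow is the same: embed the question, score it against the stored vectors, and return the closest summaries.
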
An instruction-tuned language model (google/gemma-7b-it) is loaded from Hugging Face.
When you ask a question:
- If you include a course code, the system retrieves that course directly.
- If not, it finds the closest course by comparing your question with the vector index.
The AI generates an answer based only on the matched course information.
If nothing relevant is found, it clearly responds that it cannot answer.
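
The question-handling logic above can be sketched roughly as follows. This is an illustration under several assumptions, not the notebook's actual code: the course-code pattern is inferred from the single example BECE309L, the 0.2 score threshold is arbitrary, and fallback_search and generate are hypothetical callables standing in for the vector index and the language model.

```python
import re

# Assumed code shape: four letters, three digits, optional trailing letter (e.g. BECE309L).
COURSE_CODE = re.compile(r"\b[A-Z]{4}\d{3}[A-Z]?\b")

REFUSAL = "I cannot answer that from the syllabus."


def retrieve(question, summaries_by_code, fallback_search, min_score=0.2):
    """Return the matched course summary, or None when nothing relevant is found."""
    # 1) Direct lookup when the question names a known course code.
    match = COURSE_CODE.search(question.upper())
    if match and match.group() in summaries_by_code:
        return summaries_by_code[match.group()]
    # 2) Otherwise use similarity search; fallback_search is assumed to
    #    return a list of (score, summary) pairs, best match first.
    hits = fallback_search(question)
    if hits and hits[0][0] >= min_score:
        return hits[0][1]
    return None


def answer(question, context, generate):
    """Build a grounded prompt and call the model, or refuse when there is no context."""
    if context is None:
        return REFUSAL
    prompt = (
        "Answer using only the course information below.\n\n"
        f"Course information: {context}\n\n"
        f"Question: {question}\nAnswer:"
    )
    return generate(prompt)
```

Checking for a course code first avoids a similarity search when the question already names its target, and returning a fixed refusal keeps the model from answering without matched course information.
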

Folder Structure 📂

.
├── README.md
├── data
│   └── syllabus.json        # Course syllabus data in JSON format
├── main.ipynb               # Main implementation notebook
├── report/
│   └── report.pdf           # Project report
├── requirements.txt         # Python dependencies
└── training_data/
    ├── knowledge_base.txt   # Processed text summaries
    └── faiss_index/
        ├── index.faiss      # FAISS vector index
        └── index.pkl        # Index metadata

Requirements 🛠️

The project's dependencies are listed in requirements.txt. If the file is missing, create it with the following content:

transformers
torch
accelerate
bitsandbytes
langchain
langchain-community
sentence-transformers
faiss-cpu
langchain-huggingface

Then run this command in a notebook cell (in a terminal, drop the leading !):

!pip install -r requirements.txt

Setup 🚀

Open Google Colab (recommended) or Jupyter Notebook with GPU support.
Clone or download this repository and upload main.ipynb along with syllabus.json.
Create a free Hugging Face account, accept the Gemma license on the google/gemma-7b-it model page, and copy your access token.
In Colab, click the 🔑 icon on the left, add a new secret named HF_TOKEN and paste your token.
Open main.ipynb and run all cells.
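
For reference, a notebook cell could read the HF_TOKEN secret along these lines. This is a sketch rather than the notebook's actual code: get_hf_token is a hypothetical helper, and the fallback to an HF_TOKEN environment variable (useful in plain Jupyter) is an assumption added here.

```python
import os


def get_hf_token():
    """Return the Hugging Face token from Colab secrets, else from the environment."""
    try:
        # google.colab is only importable inside a Colab runtime.
        from google.colab import userdata
    except ImportError:
        return os.environ.get("HF_TOKEN")
    try:
        return userdata.get("HF_TOKEN")
    except Exception:
        # Secret missing or notebook access not granted; fall back to the environment.
        return os.environ.get("HF_TOKEN")
```

Outside Colab, setting the HF_TOKEN environment variable before starting Jupyter gives the same result.
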

This will:

- Install required dependencies ⚙️
- Process the syllabus data 📑
- Build a FAISS index 📊
- Load the model 🤖
- Enable direct Q&A from the syllabus 💬

Example 💡

ask_question("What is the title for BECE309L?")
ask_question("What are the prerequisites for BECE309L?")
ask_question("What are the objectives of the Artificial Intelligence and Machine Learning course?")

✅ After completing these steps, your setup will be ready to query the syllabus!

Project Report 📊

For detailed analysis and evaluation results, check out the comprehensive project report:

PDF: report/report.pdf

The report includes qualitative and quantitative analysis, performance metrics and experimental results.

Limitations ⚠️

The model is very basic and only lightly fine-tuned on a small dataset.
Data pre-processing is raw and automated, not carefully cleaned or standardized.
Because of this, the model can sometimes give inaccurate or incomplete answers.
It works best for specific, direct questions about the syllabus.
This project should be seen as a beginner-friendly starting point, not a production-ready system.
