
Genie 🧞

The Central Intelligence & Utility Hub for the IIIT Community.

Genie is a unified student dashboard that combines essential campus utilities with an AI-powered knowledge assistant. It serves as a personalized "start page" for IIIT Hyderabad students, integrating draggable widgets with a RAG-based chatbot that can intelligently query faculty research, thesis repositories, and lab information.


📋 Problem Statement

IIIT Hyderabad's information ecosystem is fragmented, and navigating it is time-consuming.

Students struggle to find:

  • Which professors work on specific research areas (Computer Vision, NLP, Robotics, etc.)
  • Lab groups and research centres relevant to their interests
  • Previous thesis work and research projects for guidance
  • Quick access to timetables and campus events

Information is scattered across multiple websites, PDFs, and databases with no unified, intelligent search interface.


💡 Our Solution

Genie provides a single, intelligent interface that:

  1. Centralizes campus utilities in a customizable, widget-based dashboard
  2. Enables natural language queries over IIIT's knowledge base using RAG (Retrieval-Augmented Generation)
  3. Indexes scraped data from thesis repositories, research centres, and faculty information
  4. Provides cited, accurate responses grounded in real IIIT data

🛠️ Tech Stack

| Layer | Technology |
|---|---|
| Frontend | SvelteKit (Svelte 5) + TailwindCSS 4 |
| Backend | FastAPI (Python 3.13) |
| AI/RAG Framework | LlamaIndex |
| LLM | Gemma 3 (gemma-3-27b-it) via the Google Gemini API |
| Vector Database | ChromaDB |
| Embeddings | BAAI/bge-small-en-v1.5 (HuggingFace) |
| Package Managers | uv (Python), Bun/npm (Node) |
| Task Runner | Just |

✨ Key Features

🧩 Widget Dashboard (UI Design)

Our dashboard implements a fully draggable and resizable widget system:

  • Drag & Drop Layout: Click and drag widgets to reposition them anywhere on screen
  • Resizable Widgets: Grab corner/edge handles to resize any widget
  • Collision Detection: Widgets automatically push each other to prevent overlapping
  • Snap-to-Grid: Optional grid snapping for precise alignment (toggle in Edit Mode)
  • Persistent Layout: Widget positions saved to localStorage, restored on refresh
  • Z-Index Management: Clicked widgets automatically come to front
  • Smooth Animations: CSS transitions for fluid drag/resize feedback
  • Dark/Light Mode: Full theme support with instant toggle
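
The snapping and collision rules above can be illustrated with a small geometric sketch. The dashboard itself is written in Svelte/TypeScript; the Python below only shows the logic, and the `Rect` shape, the 20 px grid size, and the "push downward" policy are all assumptions made for illustration, not the project's actual implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Rect:
    """Axis-aligned widget rectangle in pixels (hypothetical shape)."""
    x: int
    y: int
    w: int
    h: int


def snap(value: int, grid: int = 20) -> int:
    """Round a coordinate to the nearest grid line (ties round up)."""
    return (value + grid // 2) // grid * grid


def overlaps(a: Rect, b: Rect) -> bool:
    """True if the two rectangles share interior area (touching edges do not count)."""
    return (a.x < b.x + b.w and b.x < a.x + a.w
            and a.y < b.y + b.h and b.y < a.y + a.h)


def push_down(moved: Rect, others: List[Rect]) -> None:
    """Move every widget that collides with `moved` to just below it.

    This is a single pass: it does not cascade, so pushed widgets may
    still overlap each other and would need a further pass.
    """
    for other in others:
        if overlaps(moved, other):
            other.y = moved.y + moved.h
```

For example, dropping a 100×100 widget at the origin onto one placed at (50, 50) moves the second widget to `y = 100`, and `snap(47)` returns `40` on a 20 px grid.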

Available Widgets:

| Widget | Description |
|---|---|
| Oracle Chat | RAG-powered AI assistant for IIIT queries |
| Timetable | Live class schedule with countdown to next class |
| Events | Real-time campus events feed |
| Kanban Board | Personal task management |
| Theme Studio | Customize dashboard appearance |

🧠 The Oracle (AI Chatbot)

The Oracle is a RAG-powered chatbot that answers questions about IIIT:

Example Queries:

  • "Who works on Computer Vision at IIIT?"
  • "Tell me about CVIT lab"
  • "What theses have been written on image segmentation?"
  • "Which professor should I approach for deep learning research?"

Capabilities:

  • Context-Aware Responses: Uses retrieved documents to ground answers
  • Fuzzy Name Matching: Handles typos, case variants, and abbreviations ("cvit" → "CVIT", "PK" → "Prof. Ponnurangam")
  • Source Citations: Every answer includes document sources
  • Chat History: Maintains conversation context across messages
  • Markdown Rendering: Formatted responses with clickable links
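
One way fuzzy name matching like this could work is an alias lookup followed by approximate string matching. The sketch below uses only the standard library's `difflib`; the `ALIASES` and `KNOWN_NAMES` tables and the `resolve_name` helper are hypothetical and are not taken from the project's code.

```python
import difflib
from typing import Optional

# Hypothetical lookup tables; the real mappings are not shown in this README.
ALIASES = {"pk": "Prof. Ponnurangam"}
KNOWN_NAMES = ["CVIT", "Prof. Ponnurangam"]


def resolve_name(query: str, cutoff: float = 0.6) -> Optional[str]:
    """Map a user-typed name to a canonical one, or None if nothing is close."""
    key = query.strip().lower()
    # 1. Exact alias hit (abbreviations such as "PK").
    if key in ALIASES:
        return ALIASES[key]
    # 2. Case-insensitive approximate match against the known names.
    lowered = {name.lower(): name for name in KNOWN_NAMES}
    matches = difflib.get_close_matches(key, list(lowered), n=1, cutoff=cutoff)
    return lowered[matches[0]] if matches else None
```

With these tables, `resolve_name("cvit")` and the typo `resolve_name("cvti")` both resolve to `"CVIT"`, while an unrelated string returns `None`. The `cutoff` value trades recall for precision and would need tuning on real names.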

Data Pipeline

Data Sources (Pre-scraped)

| File | Content | Records |
|---|---|---|
| theses.json | MS/PhD thesis abstracts with PDF links | ~1,700 |
| research_projects.json | Ongoing research projects | 57 |
| research_centres.json | Labs and research groups | 28 |
| labs.json | Lab descriptions and faculty | 10+ |
| faculty.csv | Faculty profiles | Sample data |

Ingestion Pipeline (ingest.py)

The ingestion script processes all data sources and indexes them into ChromaDB:

# Full pipeline
python ingest.py --clear    # Clear existing + ingest all

# Selective ingestion
python ingest.py --source faculty   # Only faculty data
python ingest.py --source labs      # Only lab data
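
A command-line interface with these two flags could be declared with `argparse` roughly as follows. This is a sketch of the interface shown above, not the contents of `ingest.py`; the `build_parser` helper and the help strings are assumptions, and only the `faculty` and `labs` source names are confirmed by this README.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the --clear and --source flags (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        prog="ingest.py",
        description="Index data sources into the vector database.",
    )
    parser.add_argument(
        "--clear",
        action="store_true",
        help="Remove the existing index before ingesting.",
    )
    parser.add_argument(
        "--source",
        default=None,
        help="Ingest a single source (e.g. faculty, labs); default is all sources.",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"clear={args.clear} source={args.source or 'all'}")
```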

Pipeline Steps:

  1. Load Embedding Model - Downloads and caches BAAI/bge-small-en-v1.5
  2. Parse Data Sources - Faculty CSV, Labs JSON, Theses JSON, etc.
  3. Create Documents - Converts raw data to LlamaIndex Document objects with metadata
  4. Chunk Documents - Splits into 512-character chunks with 50-char overlap
  5. Generate Embeddings - Creates vector representations
  6. Store in ChromaDB - Persists to local vector database
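
Step 4 can be pictured as a sliding window over the text. The function below is a minimal character-based sketch using the sizes quoted above (512 and 50); the real script may instead rely on a LlamaIndex splitter, which can measure chunks differently, so treat `chunk_text` as a hypothetical stand-in.

```python
from typing import List


def chunk_text(text: str, size: int = 512, overlap: int = 50) -> List[str]:
    """Split text into windows of `size` characters sharing `overlap` characters."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap  # how far each window advances
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reached the end of the text
    return chunks
```

The overlap means the last 50 characters of one chunk reappear at the start of the next, so a sentence cut at a chunk boundary is still retrievable from at least one chunk.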

RAG Architecture (rag.py)

User Query → Embedding → ChromaDB Similarity Search → Top-K Documents
                                    ↓
                        LLM (Gemini) + Context → Response

  • Retrieval: The 10 most similar chunks are retrieved
  • Context Window: 3,000-token memory buffer for chat history
  • Response Generation: Model served through the Gemini API, with a custom system prompt
  • Source Extraction: Automatic parsing of citations from response
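
The retrieval step in the diagram can be sketched without any external library. In the real system ChromaDB performs the similarity search and LlamaIndex assembles the prompt; the pure-Python version below only illustrates the idea, and the cosine metric, the `(text, source, vector)` tuple layout, and the prompt wording are assumptions.

```python
import math
from typing import List, Sequence, Tuple

Chunk = Tuple[str, str, Sequence[float]]  # (text, source, embedding), hypothetical layout


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: Sequence[float], chunks: List[Chunk], k: int = 10) -> List[Chunk]:
    """Return the k chunks whose embeddings are most similar to the query."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[2]), reverse=True)
    return ranked[:k]


def build_prompt(question: str, retrieved: List[Chunk]) -> str:
    """Number each retrieved chunk so the model can cite it by index."""
    context = "\n".join(
        f"[{i}] {text} (source: {source})"
        for i, (text, source, _) in enumerate(retrieved, start=1)
    )
    return (
        "Answer using only the context below and cite sources by number.\n\n"
        f"{context}\n\nQuestion: {question}"
    )
```

Numbering the chunks in the prompt is one simple way to make citations recoverable from the model's answer afterwards.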

🚀 Getting Started

Prerequisites

  • Python 3.13+
  • uv - Python package manager
  • Bun or Node.js 18+ - JavaScript runtime
  • Just (optional) - Task runner

Installation

# 1. Clone the repository
git clone https://github.com/PPLP-HackIIIT-26/genie.git
cd genie

# 2. Backend Setup
cd backend
uv sync                     # Install Python dependencies
cp .env.example .env        # Create environment file
# Edit .env and add your GOOGLE_API_KEY

# 3. Ingest Data (first time only - creates vector DB)
uv run python ingest.py --clear

# 4. Start Backend Server
uv run uvicorn main:app --reload --port 8000

# 5. Frontend Setup (new terminal)
cd ../frontend
bun install                  # or: npm install
bun run dev                  # or: npm run dev

# 6. Open http://localhost:5173

Environment Variables

Create backend/.env:

# Required - Get from Google AI Studio
GOOGLE_API_KEY=your_gemini_api_key_here

# Optional (these are defaults)
LLM_PROVIDER=gemini
LLM_MODEL=gemma-3-27b-it
EMBEDDING_MODEL=BAAI/bge-small-en-v1.5
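
On the Python side, these variables could be read with the defaults listed above roughly as follows. This is a sketch, not the project's `config.py`: the `Settings` class and `load_settings` helper are hypothetical, and the code assumes the variables are already present in the process environment (loading the `.env` file itself is left to whatever mechanism the backend uses).

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container mirroring the variables in backend/.env."""
    google_api_key: str
    llm_provider: str
    llm_model: str
    embedding_model: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from `env` (defaults to os.environ), applying documented defaults."""
    env = os.environ if env is None else env
    key = env.get("GOOGLE_API_KEY", "")
    if not key:
        raise RuntimeError("GOOGLE_API_KEY is required; set it in backend/.env")
    return Settings(
        google_api_key=key,
        llm_provider=env.get("LLM_PROVIDER", "gemini"),
        llm_model=env.get("LLM_MODEL", "gemma-3-27b-it"),
        embedding_model=env.get("EMBEDDING_MODEL", "BAAI/bge-small-en-v1.5"),
    )
```

Failing fast on a missing key surfaces configuration mistakes at startup rather than on the first chat request.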

Quick Start with Just

If you have Just installed:

just dev    # Starts both frontend and backend in parallel

Project Structure

genie/
├── backend/
│   ├── main.py              # FastAPI application & routes
│   ├── rag.py               # RAG service (LlamaIndex + ChromaDB)
│   ├── config.py            # Configuration & system prompt
│   ├── ingest.py            # Data ingestion pipeline
│   ├── data/                # Scraped JSON/CSV data
│   │   ├── theses.json      # 1,700+ thesis records
│   │   ├── research_centres.json
│   │   ├── research_projects.json
│   │   └── faculty.csv
│   └── chroma_db/           # Persisted vector database (gitignored)
│
├── frontend/
│   ├── src/
│   │   ├── routes/+page.svelte    # Main dashboard page
│   │   └── lib/
│   │       ├── ChatWidget.svelte           # Oracle AI chat
│   │       ├── TimetableWidget.svelte      # Class schedule
│   │       ├── EventsWidget.svelte         # Campus events
│   │       ├── KanbanWidget.svelte         # Task board
│   │       ├── DraggableResizableWidget.svelte  # Widget container
│   │       └── stores/widgetStore.ts       # Layout state management
│   └── package.json
│
├── Justfile                 # Task runner commands
└── README.md

📚 References & Open Source

Frameworks & Libraries Used

Data Sources

  • IIIT Hyderabad Thesis Repository (scraped)
  • IIIT Research Centres directory
  • Faculty information pages

No external starter templates were used; this project was built from scratch.


AI Tools Disclosure

| Tool | Usage |
|---|---|
| Claude (Anthropic) | Code assistance, debugging, architecture design |
| GitHub Copilot | Code autocompletion |
| Google Gemini API | Production LLM for the chatbot |

All AI-generated code was reviewed, tested, and modified as needed. The core architecture, data pipeline design, and UI/UX were human-designed.


⚠️ Important Notes

  • No API keys committed - Use .env file (gitignored)
  • Pre-scraped data included - JSON files in backend/data/
  • Vector DB is local - ChromaDB persists in backend/chroma_db/ (gitignored)
  • Requires a Gemini API key - Available free from Google AI Studio

👥 Team

PPLP - Built for HackIIIT 2026


📄 License

MIT © 2026 Genie Team