
CIS Advisor Backend

Gemini-Powered Academic Q&A API with RAG

This backend powers the CIS Advisor Chatbot for the University of Delaware Graduate Computer Science program.

It provides:

  • A secure backend layer for interacting with the Google Gemini API
  • A Retrieval-Augmented Generation (RAG) pipeline for grounded answers
  • Strict domain constraints to prevent off-topic responses
  • Zero exposure of API keys to the client

At no point does the frontend directly communicate with Gemini or the embedding system.


Model Versioning

Initial version (2025):

  • gemini-2.5-flash

Current version (February 2026):

  • gemini-2.5-flash-lite

If this model is deprecated or rate-limited in the future, update the model identifier in the Gemini client configuration accordingly.
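A swap then amounts to updating a single identifier wherever the Gemini client is configured. A minimal sketch, assuming the shape of the official `@google/generative-ai` SDK (the actual file location and variable names in this repo may differ):

```javascript
// Central place for the model identifier so a deprecation swap is a
// one-line change. The commented client setup below follows the
// @google/generative-ai SDK, not necessarily this repo's exact code.
const GEMINI_MODEL = "gemini-2.5-flash-lite";

// const { GoogleGenerativeAI } = require("@google/generative-ai");
// const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);
// const model = genAI.getGenerativeModel({ model: GEMINI_MODEL });
```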


Core Goals

  • Securely serve Gemini responses (no API keys in client code)
  • Ground responses using UD CIS program data via RAG
  • Constrain the model to only answer UD Grad CS questions
  • Reject irrelevant or out-of-scope queries deterministically
  • Output HTML-formatted responses for frontend rendering
  • Continuously verify embedding integrity via a golden Q&A benchmark

High-Level Architecture

Client
  │
  ▼
Node / Express API (this repo)
  │
  ├─ Fetch dataset & embeddings from MongoDB
  │
  ├─ Compute similarity via Python FastAPI embedding backend (optional)
  │
  ├─ Run golden Q&A embedding verification
  │
  └─ Send retrieved context + query to Gemini
          ▼
       Gemini API

This design cleanly separates:

  • LLM orchestration (Node backend)
  • Embedding + retrieval logic (Python backend)
  • Dataset persistence (MongoDB)
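The separation above can be sketched as a single orchestration function with each concern injected, keeping the Node layer a thin coordinator. Function names here are illustrative, not the repo's actual API:

```javascript
// Orchestration order from the architecture diagram:
// MongoDB fetch -> Python similarity retrieval -> Gemini call.
// Each dependency is injected so a layer can be swapped or stubbed.
async function answerQuery(query, { fetchEmbeddings, retrieveContext, askGemini }) {
  const dataset = await fetchEmbeddings();                 // MongoDB
  const context = await retrieveContext(query, dataset);   // Python FastAPI backend
  return askGemini(query, context);                        // Gemini API
}
```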

RAG Workflow Diagram

The following diagram illustrates how the Retrieval-Augmented Generation (RAG) pipeline processes a user query and produces a grounded answer.



How It Works

Endpoints

| Endpoint | Method | Description | Admin Key Required |
|---|---|---|---|
| `/` | GET | Basic server health check | No |
| `/api/data-source-json` | GET | Returns the full Q&A dataset currently stored in MongoDB | No |
| `/api/ask-gemini` | POST | Runs the RAG pipeline and queries Gemini for a grounded answer | No |
| `/api/add-data-source` | POST | Adds new Q&A entries and generates embeddings | Yes |
| `/api/regenerate-embeddings` | PUT | Regenerates embeddings for the entire dataset | Yes |
| `/api/q-and-a/:id` | DELETE | Deletes a specific Q&A pair by its ID | Yes |
| `/api/clear-data-source` | DELETE | Deletes all dataset entries and embeddings | Yes |

For all admin endpoints, the request body must include:

{
  "key": "your_admin_key"
}

Request & Response Examples

GET /api/data-source-json

Response

[
	{
		"_id": "6421f5e7abc1234567890abc",
		"id": "p0",
		"Question": "Which courses are required for CIS?",
		"Answer": "The required courses are ...",
		"Category": "Program Requirements",
		"Notes": ""
	}
]

POST /api/ask-gemini

Request Body

{
	"query": "How do I request an admissions deferment when I cannot attend during my expected enrollment term?"
}

Response

{
	"answer": "<p>Submit your admission deferral request to the CIS Graduate Academic Advisor II for review.</p>"
}
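A frontend would call this endpoint rather than Gemini directly. A hypothetical client-side helper (the route and body shape come from this README; everything else, including the same-origin URL, is an assumption):

```javascript
// Builds the request for POST /api/ask-gemini. Kept as a pure function so
// the URL and payload shape are easy to verify; pass the result to fetch.
function buildAskRequest(query) {
  return {
    url: "/api/ask-gemini",
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ query }),
    },
  };
}

// Usage:
// const { url, options } = buildAskRequest("Which courses are required for CIS?");
// const { answer } = await (await fetch(url, options)).json(); // HTML string
```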

PUT /api/regenerate-embeddings

Regenerates embeddings for all Q&A entries currently stored in MongoDB.

This endpoint is primarily used when:

  • The embedding model changes
  • Dataset entries are modified outside the normal ingestion pipeline
  • The golden benchmark detects embedding drift
  • Embedding corruption is suspected

Request Body

{
	"key": "your_admin_key"
}

Response

{
	"message": "Embeddings successfully regenerated."
}

DELETE /api/q-and-a/:id

Deletes a specific Q&A entry by its ID.

URL Params

id=[string]

Request Body

{
	"key": "your_admin_key"
}

Example request:

DELETE /api/q-and-a/p0

DELETE /api/clear-data-source

Deletes all Q&A entries and their embeddings from MongoDB.

Request Body

{
	"key": "your_admin_key"
}

Retrieval-Augmented Generation (RAG)

RAG responsibilities are encapsulated in a dedicated class that handles:

  • Query preprocessing
  • Context retrieval via embeddings from MongoDB
  • Prompt construction and constraint enforcement
  • Gemini request orchestration
  • Deterministic rejection of out-of-scope queries
  • Embedding integrity verification via a golden Q&A pair

The RAG pipeline is centralized behind a single abstraction rather than scattered across route handlers.
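An illustrative skeleton of that abstraction is below. Method names and the similarity threshold are assumptions for the sketch; the repo's actual class is documented in `api/RAGClass.md`:

```javascript
// Sketch of the single RAG abstraction: preprocessing and deterministic
// out-of-scope rejection shown as pure methods; retrieval and the Gemini
// call would be additional async methods on the same class.
class RAGPipeline {
  constructor({ similarityThreshold = 0.75 } = {}) {
    this.similarityThreshold = similarityThreshold; // assumed value
  }

  // Normalize whitespace before embedding the query.
  preprocess(query) {
    return query.trim().replace(/\s+/g, " ");
  }

  // Deterministic rejection: if the best retrieved match scores below the
  // threshold, refuse the query before Gemini is ever called.
  isOutOfScope(bestSimilarity) {
    return bestSimilarity < this.similarityThreshold;
  }
}
```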


Golden Q&A Embedding Verification

To ensure long-term embedding correctness, the system supports a golden benchmark check.

A known question–answer pair from the dataset is embedded and evaluated at runtime to confirm that vector similarity and retrieval behavior remain stable.

Purpose

  • Detect silent embedding drift
  • Catch accidental dataset corruption
  • Trigger automatic embedding regeneration when similarity falls below an acceptable threshold

This provides a deterministic signal that the vector store no longer reflects the source dataset accurately.
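The core of such a check can be sketched with cosine similarity: re-embed `GOLDEN_QUESTION` at runtime and compare against its stored vector. The 0.99 threshold below is an illustrative assumption, not this repo's actual value:

```javascript
// Cosine similarity between two equal-length embedding vectors.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Golden check: a freshly computed embedding of GOLDEN_QUESTION should be
// nearly identical to the stored one; a low score signals drift/corruption.
function goldenCheckPassed(storedVec, freshVec, threshold = 0.99) {
  return cosineSimilarity(storedVec, freshVec) >= threshold;
}
```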


Embedding Backend

  • Implemented using Python + FastAPI

Responsible for:

  • Generating embeddings for new queries or updated dataset entries
  • Performing similarity search
  • Returning the most relevant context

Repository:

https://github.com/Hairum-Qureshi/embedding-python-backend

The Node backend invokes this service as part of the RAG pipeline before any request is sent to Gemini.

For detailed documentation on the RAG orchestration layer:

https://github.com/Hairum-Qureshi/cis-advisor-backend/blob/main/api/RAGClass.md


Prompt Strategy

Gemini is explicitly instructed to:

  • Answer only University of Delaware Graduate CS questions
  • Reject unrelated or general-knowledge queries
  • Use only retrieved context when forming answers
  • Output HTML (no headers, frontend-safe markup)

This is a prompt-control mechanism, not a complete safety system.
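The constraints above might be assembled into a prompt along these lines; the actual wording in this repo will differ:

```javascript
// Sketch of prompt construction enforcing the README's constraints:
// scope restriction, context-only answers, and HTML output.
function buildPrompt(context, query) {
  return [
    "You are an advisor for the University of Delaware Graduate CS program.",
    "Answer ONLY using the context below. Reject unrelated or",
    "general-knowledge questions.",
    "Respond in frontend-safe HTML without header tags.",
    "",
    "Context:",
    context,
    "",
    "Question: " + query,
  ].join("\n");
}
```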


Environment Variables

| Variable | Required | Description |
|---|---|---|
| `GEMINI_API_KEY` | Yes | Google Generative AI API key |
| `PYTHON_SERVER_URL` | Yes | Base URL of the Python FastAPI embedding service |
| `MONGO_URI` | Yes | MongoDB connection string |
| `ADMIN_KEY` | Yes | Key to authorize admin endpoints |
| `GOLDEN_QUESTION` | Yes | Verbatim question from the dataset used as an embedding benchmark |
| `GOLDEN_ANSWER` | Yes | Verbatim answer corresponding to `GOLDEN_QUESTION` |
| `PORT` | No | Local dev port (default: 3000) |

.env Example

GEMINI_API_KEY=your_key_here
PYTHON_SERVER_URL=http://localhost:8000
MONGO_URI=mongodb+srv://user:password@cluster.mongodb.net/dbname
ADMIN_KEY=supersecret
GOLDEN_QUESTION="How do I request an admissions deferment?"
GOLDEN_ANSWER="Submit your admission deferral request to the CIS Graduate Academic Advisor II for review."
PORT=3000

Installation & Local Development

git clone <repo-url>
cd api
npm install
npm run dev

API will be available at:

http://localhost:3000

If running locally, ensure the Python embedding backend is running before calling /api/ask-gemini.


Future Improvements

  • The id property predates the shift to a database-backed RAG architecture. The system can be simplified by removing this field and relying entirely on MongoDB's native _id property for record identity and retrieval.

  • Merge the Python embedding backend into this repo instead of maintaining it separately, so everything lives in one place. This would require updating the Vercel configuration, and since the current deployment works, the migration is deferred until that process is better understood.

About

A simple Node.js/Express.js backend built for my capstone's CIS Advisor chatbot site project. Its original purpose was to safeguard the API key so it is never exposed in frontend JavaScript. It is now also responsible for handling all AI-related requests in the backend, including the RAG pipeline.
