r6mez/BioTrek

BioTrek

An AI-powered knowledge engine for NASA space biology research. Ask questions about 600+ research papers, get answers with auto-generated charts, and browse an insights dashboard — all through a conversational interface backed by RAG.

Built for the Build a Space Biology Knowledge Engine challenge.


Features

  • Conversational RAG chatbot — ask natural language questions; answers are grounded in 600+ ingested NASA papers with source citations
  • Automatic chart rendering — detects chartable data in AI responses (JSON blocks, Markdown tables, CSV, bullet lists, [CHART:type] directives) and renders Line, Bar, Pie, Area, or Scatter charts automatically
  • Source citations panel — slide-out sidebar with matched documents, content snippets, and direct links to original PDFs/HTML files
  • Chat history — create, rename, load, and delete past conversations with full message persistence per user
  • Insights dashboard — publication statistics with interactive Recharts visualizations across 9 research domains
  • Off-topic gating — dual-signal filtering (vector distance threshold + LLM refusal pattern matching) suppresses hallucinated sources on irrelevant queries
  • Auth — JWT + refresh token rotation, proactive silent refresh 2 min before expiry, 401-interceptor fallback, role-based access control
  • Dark/light mode — theme toggle persisted to localStorage
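
To make the automatic chart rendering feature concrete, here is a minimal sketch of one detection path: reading a `[CHART:type]` directive and converting the first Markdown table in a response into row objects of the kind Recharts accepts as `data`. The function names, the `ChartSpec` shape, and the bar-chart fallback are assumptions for illustration; the repository's actual detector is not shown in this README and may work differently.

```typescript
// Hypothetical types and names, for illustration only.
type ChartType = "line" | "bar" | "pie" | "area" | "scatter";

interface ChartSpec {
  type: ChartType;
  data: Array<Record<string, string | number>>;
}

const CHART_TYPES: string[] = ["line", "bar", "pie", "area", "scatter"];

// Read an optional [CHART:type] directive; fall back to "bar" (an assumed default).
function parseDirective(text: string): ChartType {
  const match = /\[CHART:(\w+)\]/i.exec(text);
  const requested = match ? match[1].toLowerCase() : "";
  return CHART_TYPES.indexOf(requested) !== -1 ? (requested as ChartType) : "bar";
}

// Split one Markdown table row ("| a | b |") into trimmed cell strings.
function splitRow(line: string): string[] {
  return line
    .trim()
    .replace(/^\||\|$/g, "")
    .split("|")
    .map((cell) => cell.trim());
}

// Find the first Markdown table in the text and convert it to row objects.
function extractChart(text: string): ChartSpec | null {
  const lines = text.split("\n").map((line) => line.trim());
  const separator = /^\|?\s*:?-{3,}:?\s*(\|\s*:?-{3,}:?\s*)*\|?$/;
  for (let i = 0; i + 1 < lines.length; i++) {
    if (lines[i].charAt(0) !== "|" || !separator.test(lines[i + 1])) continue;
    const headers = splitRow(lines[i]);
    const data: ChartSpec["data"] = [];
    for (let j = i + 2; j < lines.length && lines[j].charAt(0) === "|"; j++) {
      const cells = splitRow(lines[j]);
      const row: Record<string, string | number> = {};
      headers.forEach((header, k) => {
        const raw = cells[k] ?? "";
        const num = Number(raw);
        // Keep numeric cells as numbers so a chart axis can use them directly.
        row[header] = raw !== "" && isFinite(num) ? num : raw;
      });
      data.push(row);
    }
    return data.length > 0 ? { type: parseDirective(text), data } : null;
  }
  return null;
}
```

A response without a table yields `null`, in which case the message would simply render as Markdown. The other formats listed above (JSON blocks, CSV, bullet lists) would each need their own parser feeding the same `ChartSpec` shape.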

Architecture

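The application runs as a single Node.js process, with no separate Python application service. The entire RAG pipeline runs in TypeScript via LangChain.js and external APIs (HuggingFace for embeddings, Groq for the LLM); the only other local process is the ChromaDB server started during setup.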
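
As an illustration of a pipeline step that can run entirely in-process, the off-topic gating listed under Features could be sketched roughly as follows. The threshold value, the refusal phrases, the `RetrievedDoc` shape, and the choice to treat either signal alone as sufficient are all assumptions; the README only states that a vector distance threshold and LLM refusal pattern matching are combined.

```typescript
// Assumed shape of a retrieved document: a source label plus its vector distance.
interface RetrievedDoc {
  source: string;
  distance: number; // smaller means closer to the query
}

// Hypothetical cutoff and phrases, for illustration only.
const DISTANCE_THRESHOLD = 0.8;
const REFUSAL_PATTERNS: RegExp[] = [
  /\bi (?:don't|do not) have (?:enough )?information\b/i,
  /\boutside (?:the|my) scope\b/i,
  /\bnot (?:related|relevant) to\b/i,
];

// Signal 1: even the best match is too far away. Signal 2: the answer reads as a refusal.
function isOffTopic(docs: RetrievedDoc[], answer: string): boolean {
  const bestDistance =
    docs.length > 0 ? Math.min(...docs.map((doc) => doc.distance)) : Infinity;
  const tooFar = bestDistance > DISTANCE_THRESHOLD;
  const refused = REFUSAL_PATTERNS.some((pattern) => pattern.test(answer));
  return tooFar || refused;
}

// Return no citations for off-topic queries so unrelated sources are never shown.
function citationsFor(docs: RetrievedDoc[], answer: string): RetrievedDoc[] {
  return isOffTopic(docs, answer) ? [] : docs;
}
```

Using two signals covers the case where the vector store returns a superficially close chunk but the model still declines to answer, and the reverse case where the model answers from general knowledge with no close document.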


Tech Stack

Backend

| Layer | Technology |
| --- | --- |
| Framework | NestJS 11 + TypeScript |
| ORM | TypeORM → PostgreSQL |
| Auth | JWT + refresh token rotation (Passport) |
| Embeddings | HuggingFace Inference API (BAAI/bge-small-en-v1.5) |
| LLM | Groq API (llama-3.3-70b-versatile, temp 0.3) |
| Vector store | ChromaDB via LangChain.js |
| RAG | @langchain/core · @langchain/community · @langchain/groq |
| Doc parsing | pdf-parse (PDF) |
| API docs | Swagger / OpenAPI at /docs |
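
The Auth row pairs with the proactive silent refresh listed under Features, which fires 2 minutes before the access token expires. A minimal client-side sketch of that timing is shown here; the helper names are hypothetical, and it assumes the login response exposes the token's expiry as an epoch timestamp in milliseconds, which this README does not confirm.

```typescript
// Only the 2-minute lead time comes from the feature list; the rest is assumed.
const REFRESH_LEAD_MS = 2 * 60 * 1000;

// Milliseconds to wait before refreshing; 0 means "refresh immediately".
function msUntilRefresh(expiresAtMs: number, nowMs: number): number {
  return Math.max(0, expiresAtMs - nowMs - REFRESH_LEAD_MS);
}

// Schedule a single refresh call. On success, the caller would schedule
// the next one using the expiry of the newly issued token.
function scheduleRefresh(
  expiresAtMs: number,
  refresh: () => void,
  nowMs: number = Date.now(),
): ReturnType<typeof setTimeout> {
  return setTimeout(refresh, msUntilRefresh(expiresAtMs, nowMs));
}
```

If the timer never fires, for example because the tab was suspended, the 401-interceptor fallback from the feature list would catch the expired token on the next request.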

Frontend

| Layer | Technology |
| --- | --- |
| Framework | React 19 + TypeScript |
| Build | Vite 7 |
| Styling | Tailwind CSS + shadcn/ui + Radix UI |
| Charts | Recharts 3 |
| 3D / Shaders | Three.js (custom GLSL fragment shader) |
| Markdown | react-markdown + rehype-highlight |
| Routing | React Router 7 |
| Toasts | Sonner |

Setup

Prerequisites

  • Node.js ≥ 18
  • PostgreSQL ≥ 14
  • ChromaDB (pip install chromadb) + Python 3.9+
  • Groq API key (free tier available)

1. Clone

git clone <repo-url>
cd BioTrek

2. PostgreSQL

Linux:

sudo apt update && sudo apt install postgresql postgresql-contrib
sudo systemctl start postgresql && sudo systemctl enable postgresql
sudo -u postgres psql -c "CREATE DATABASE biotrek;"

macOS:

brew install postgresql@16 && brew services start postgresql@16
createdb biotrek

Windows: Download the installer from postgresql.org, then create a database named biotrek owned by the default postgres user.

3. ChromaDB

pip install chromadb
chroma run --path ./back-end/data/database   # keep running in a separate terminal

4. Backend

cd back-end
cp env-example-relational .env

Update the .env file with your PostgreSQL credentials, Groq API key, and any other overrides, then run:

npm install
npm run migration:run
npm run seed:run:relational
npm run start:dev

Once the dev server is running:

  • API: http://localhost:3000/api/v1
  • Swagger: http://localhost:3000/docs

First run: if no ChromaDB collection is found, the vector store builds automatically from the documents in CHATBOT_DATA_PATH. This calls the HuggingFace API to embed all chunks and may take a few minutes.
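
The first-run behaviour can be pictured as a check-then-build step. The sketch below is an illustration under stated assumptions, not the project's code: the `VectorStoreLike` interface, `ensureVectorStore`, and the default chunk size and overlap are hypothetical, whereas BioTrek itself uses LangChain.js with ChromaDB and the HuggingFace API for this step.

```typescript
// Hypothetical minimal interface standing in for the real vector store client.
interface VectorStoreLike {
  hasCollection(name: string): Promise<boolean>;
  addChunks(name: string, chunks: string[]): Promise<void>;
}

// Split text into fixed-size chunks that overlap, so context spans chunk borders.
function chunkText(text: string, size: number, overlap: number): string[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error("require size > 0 and 0 <= overlap < size");
  }
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += size - overlap) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last chunk reached the end
  }
  return chunks;
}

// Reuse an existing collection; otherwise chunk every document and store the chunks.
async function ensureVectorStore(
  store: VectorStoreLike,
  name: string,
  documents: string[],
  size: number = 1000, // assumed default, not from the README
  overlap: number = 200, // assumed default, not from the README
): Promise<"reused" | "built"> {
  if (await store.hasCollection(name)) return "reused";
  const chunks: string[] = [];
  for (const doc of documents) {
    chunks.push(...chunkText(doc, size, overlap));
  }
  await store.addChunks(name, chunks);
  return "built";
}
```

Because the build is skipped when a collection already exists, only the first start pays the embedding cost; later starts reuse the stored vectors.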

5. Frontend

cd front-end
npm install
npm run dev

App: http://localhost:5173

To override the API base URL:

# front-end/.env
VITE_API_URL=http://localhost:3000/api/v1
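
For context, Vite exposes variables prefixed with `VITE_` to client code on `import.meta.env`. A minimal sketch of how a frontend might resolve the base URL is shown here; the helper names and the fallback value are assumptions (the fallback simply mirrors the local API address given above), not code taken from the repository.

```typescript
// Assumed fallback: the local backend address from the setup steps.
const DEFAULT_API_URL = "http://localhost:3000/api/v1";

// Pick VITE_API_URL when set and non-empty, and strip trailing slashes.
function resolveApiUrl(env: Record<string, string | undefined>): string {
  const raw = (env["VITE_API_URL"] ?? "").trim();
  const base = raw !== "" ? raw : DEFAULT_API_URL;
  return base.replace(/\/+$/, "");
}

// Join the base URL and a path with exactly one slash between them.
function endpoint(env: Record<string, string | undefined>, path: string): string {
  return resolveApiUrl(env) + "/" + path.replace(/^\/+/, "");
}
```

In a Vite app this would be called as `resolveApiUrl(import.meta.env)`. Taking the environment as a parameter keeps the helper testable outside the bundler.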

Scripts

Backend (back-end/)

| Script | Description |
| --- | --- |
| npm run start:dev | Dev server with hot-reload |
| npm run build | Compile TypeScript to dist/ |
| npm run start:prod | Run compiled output |
| npm run migration:generate | Generate a new TypeORM migration |
| npm run migration:run | Apply pending migrations |
| npm run migration:revert | Revert last migration |
| npm run seed:run:relational | Seed the database |
| npm run lint | ESLint |

Frontend (front-end/)

| Script | Description |
| --- | --- |
| npm run dev | Dev server with HMR |
| npm run build | Production build (outputs to dist/) |
| npm run preview | Preview production build locally |
| npm run lint | ESLint |

Thanks for checking out BioTrek! Feel free to explore the code, run it locally, and reach out if you have any questions or feedback.
