🩺 MediScan AI

Smart Medical Report Analyzer

Upload any blood test or lab report → Get AI-powered health insights in seconds.

⚠️ Demo purposes only. Not a substitute for professional medical advice.

📖 Table of Contents

⚡ How to Run
Overview
Live Demo Flow
Features
Architecture
Tech Stack
Project Structure
Getting Started
API Reference
Analysis Pipeline
Database Schema
Health Scoring
Known Issues & Limitations
Roadmap

⚡ How to Run

TL;DR — Two terminals, four commands, and you're live.

Step 1 — Clone & configure secrets

git clone https://github.com/Autonomous-Drone-Target-Tracking-System/MediScan-AI_Smart-Report-Analyzer.git
cd MediScan-AI_Smart-Report-Analyzer

# Copy the env template and fill in your API keys
copy .env.example .env         # Windows
# cp .env.example .env         # macOS / Linux

Open .env and set:

GROQ_API_KEY=<your Groq key>        # https://console.groq.com/keys
OCR_SPACE_API_KEY=<your OCR key>    # https://ocr.space/ocrapi  (free tier OK)
FRONTEND_URL=http://localhost:3000
NEXT_PUBLIC_API_URL=http://localhost:8000

Step 2 — Start the Backend (Terminal 1)

cd backend

# Create & activate virtual environment
python -m venv venv
venv\Scripts\activate          # Windows
# source venv/bin/activate     # macOS / Linux

# Install dependencies
pip install -r requirements.txt

# Run the API server
uvicorn main:app --reload --host 0.0.0.0 --port 8000

✅ Backend ready at http://localhost:8000 · Swagger docs at http://localhost:8000/docs

Step 3 — Start the Frontend (Terminal 2)

cd frontend

# Copy frontend env
copy .env.local.example .env.local      # Windows
# cp .env.local.example .env.local      # macOS / Linux

# Install dependencies
npm install

# Start the dev server
npm run dev

✅ Frontend ready at http://localhost:3000

Step 4 — Use the App

1. Open http://localhost:3000
2. Click "Analyze My Report"
3. Drag & drop a PDF or image of a blood/lab report
4. Click "Analyze Report"
5. Wait ~10–15 seconds → dashboard with your health score & AI insights

💡 No Tesseract? The app works fine without it — OCR.space handles scanned documents, and pdfplumber handles digital PDFs. Tesseract is only a local fallback.

🌟 Overview

MediScan AI is a full-stack web application, built for a hackathon, that turns raw medical lab reports (PDFs or images) into clear, actionable health insights using OCR and large language models.

A user uploads their blood test or lab report, and within seconds receives:

A health score (0–100)
Color-coded risk classification for each biomarker (Normal / Moderate / Critical)
Plain-English AI explanations for every marker
Personalized recommendations generated by Groq LLM
A persistent dashboard they can revisit at any time

No sign-up. No medical degree required. Just clarity.

🎬 Live Demo Flow

1. Visit http://localhost:3000
2. Click "Analyze My Report" → Upload page
3. Drag & drop a PDF or image of a lab report
4. Click "Analyze Report"
5. ⏱️  ~10–15 seconds later → redirected to your personal Dashboard
6. See your health score, biomarker table, AI summary, and recommendations

✨ Features

Feature	Description
📄 Smart OCR Extraction	pdfplumber (text PDFs) → OCR.space API (scanned) → Tesseract (local fallback)
🧠 AI Interpretation	Groq LLM generates plain-English explanations for every biomarker
📊 Risk Dashboard	Health score gauge, color-coded biomarker cards, risk badges
🛡️ Rule-Based Validation	Clinical reference ranges engine — grounds AI output in facts
⚡ Results in Seconds	Upload to full dashboard in roughly 10–15 seconds
🔬 30+ Biomarkers	Hemoglobin, LDL, Blood Sugar, TSH, Vitamin D, Creatinine, and more
💾 Persistent Reports	All analyses stored in SQLite — revisit any report via dashboard URL
📱 Fully Responsive	Mobile-first design, works on phones, tablets, and desktops
🔒 No Login Required	Zero friction — upload and analyze immediately

🏗️ Architecture

┌─────────────────────────────────────────────────────────────────┐
│                        USER BROWSER                             │
│                    Next.js 16 Frontend                          │
│         Landing → Upload → Dashboard (per report_id)            │
└──────────────────────┬──────────────────────────────────────────┘
                       │ HTTP (axios)
                       ▼
┌────────────────────────────────────────────────────────────────┐
│                    FastAPI Backend :8000                       │
│                                                                │
│  POST /api/upload          POST /api/analyze/{id}              │
│  GET  /api/report/{id}     GET  /docs (Swagger)                │
│                                                                │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │                  Analysis Pipeline                      │   │
│  │                                                         │   │
│  │  1. OCR Router ──────────────────────────────────────┐  │   │
│  │     ├── pdfplumber (text-based PDFs)                 │  │   │
│  │     ├── OCR.space API (scanned PDFs & images)        │  │   │
│  │     └── Tesseract + OpenCV (local fallback)          │  │   │
│  │                                                      │  │   │
│  │  2. Medical Parser ──── regex + keyword matching     │  │   │
│  │     └── Extracts: name, value, unit, ref range       │  │   │
│  │                                                      │  │   │
│  │  3. Risk Engine ───────────────────────────────────┐ │  │   │
│  │     ├── Classifies: Normal / Moderate / Critical   │ │  │   │
│  │     └── Calculates: Health Score (0–100)           │ │  │   │
│  │                                                    │ │  │   │
│  │  4. Groq LLM ──────────────────────────────────── ◄┘ │  │   │
│  │     ├── Per-biomarker plain-English explanations     │  │   │
│  │     ├── Overall health summary                       │  │   │
│  │     └── Personalized recommendations                 │  │   │
│  │                                                      │  │   │
│  │  5. SQLite Persistence ──────────────────────────────┘  │   │
│  └─────────────────────────────────────────────────────────┘   │
└────────────────────────────────────────────────────────────────┘
                       │
                       ▼
              SQLite (medical.db)
         reports + biomarkers tables
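
The fallback order shown for the OCR Router can be sketched as a chain of extractors tried in turn. This is only an illustration, not the code in ocr_router.py: the function name `extract_with_fallback`, the `MIN_CHARS` threshold, and the extractor callables are all assumptions made for the example.

```python
from typing import Callable, Iterable, Optional

# Hypothetical threshold: treat very short output as a failed extraction.
MIN_CHARS = 20

# An extractor takes a file path and returns extracted text (or None/"").
Extractor = Callable[[str], Optional[str]]


def extract_with_fallback(
    path: str, extractors: Iterable[tuple[str, Extractor]]
) -> tuple[str, str]:
    """Try each (name, extractor) in order; return (name, text) of the first usable result."""
    errors = []
    for name, extractor in extractors:
        try:
            text = extractor(path)
        except Exception as exc:  # one failing backend should not abort the chain
            errors.append(f"{name}: {exc}")
            continue
        if text and len(text.strip()) >= MIN_CHARS:
            return name, text
        errors.append(f"{name}: no usable text")
    raise RuntimeError("All OCR backends failed: " + "; ".join(errors))
```

Passing the backends as an ordered list keeps the priority (pdfplumber, then OCR.space, then Tesseract) in one place and makes it easy to drop a backend when its API key or binary is missing.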

🛠️ Tech Stack

Backend

Technology	Version	Purpose
FastAPI	0.136	REST API framework
Uvicorn	0.47	ASGI server
pdfplumber	0.11.9	Text extraction from digital PDFs
OCR.space API	—	Cloud OCR for scanned documents
pytesseract	0.3.13	Local OCR fallback
OpenCV	4.13	Image preprocessing for OCR
Groq SDK	1.2.0	LLM inference (llama3 models)
SQLite3	built-in	Lightweight persistence
Pydantic	2.13	Data validation & schemas
python-multipart	0.0.29	File upload handling
python-dotenv	1.2.2	Environment variable loading

Frontend

Technology	Version	Purpose
Next.js	16.2.6 (Turbopack)	React framework with SSR/CSR
TypeScript	5	Type safety
framer-motion	—	Animations & micro-interactions
Recharts	—	Health score gauge & charts
axios	—	HTTP client for API calls
Lucide React	—	Icon library
Poppins + Inter	Google Fonts	Typography
Vanilla CSS	—	Custom design system with CSS tokens

📁 Project Structure

Hackethon/
├── .env                          # Root env file (shared API keys)
├── README.md
│
├── backend/
│   ├── main.py                   # FastAPI app entry point, CORS, lifespan
│   ├── requirements.txt          # Python dependencies
│   ├── medical.db                # SQLite database (auto-created)
│   │
│   ├── routes/
│   │   ├── upload.py             # POST /api/upload
│   │   └── analyze.py            # POST /api/analyze/{id}, GET /api/report/{id}
│   │
│   ├── services/
│   │   ├── pipeline.py           # Main orchestrator: OCR→Parse→Classify→AI→DB
│   │   ├── ocr_router.py         # Routes to PDF or image extractor
│   │   ├── pdf_service.py        # pdfplumber + OCR.space PDF fallback
│   │   ├── ocr_service.py        # OCR.space API + Tesseract local fallback
│   │   ├── medical_parser.py     # Regex-based biomarker extraction
│   │   ├── risk_engine.py        # Clinical range classification + health score
│   │   └── ai_service.py         # Groq LLM: explanations, summary, recommendations
│   │
│   ├── db/
│   │   ├── database.py           # SQLite connection factory
│   │   ├── init_db.py            # Schema creation & migrations
│   │   └── crud.py               # All DB read/write operations
│   │
│   ├── models/
│   │   └── schemas.py            # Pydantic request/response models
│   │
│   └── uploads/                  # Uploaded files (gitignored)
│
└── frontend/
    ├── .env.local                 # NEXT_PUBLIC_API_URL
    ├── app/
    │   ├── layout.tsx             # Root layout (fonts, metadata)
    │   ├── globals.css            # Design system (CSS tokens, utilities)
    │   ├── page.tsx               # Landing page
    │   ├── upload/
    │   │   └── page.tsx           # Upload page (drag-and-drop)
    │   └── dashboard/
    │       └── [reportId]/
    │           └── page.tsx       # Results dashboard (dynamic route)
    │
    └── components/
        └── Navbar.tsx             # Sticky responsive navbar

🚀 Getting Started

Prerequisites

Requirement	Notes
Python 3.11+	Backend runtime
Node.js 18+	Frontend runtime
Tesseract OCR	Optional local fallback (a Windows installer is available for download)
Groq API Key	Free at console.groq.com
OCR.space API Key	Free at ocr.space — optional

Backend Setup

# 1. Navigate to backend
cd backend

# 2. Create and activate virtual environment
python -m venv venv

# Windows
venv\Scripts\activate

# macOS/Linux
source venv/bin/activate

# 3. Install dependencies
pip install -r requirements.txt

# 4. Start the development server
uvicorn main:app --reload --host 0.0.0.0 --port 8000

The API will be available at:

Base URL: http://localhost:8000
Swagger UI: http://localhost:8000/docs
ReDoc: http://localhost:8000/redoc

Frontend Setup

# 1. Navigate to frontend
cd frontend

# 2. Install dependencies
npm install

# 3. Start the development server
npm run dev

The frontend will be available at http://localhost:3000

Environment Variables

Create a .env file in the project root (Hackethon/.env):

# ── AI (Groq) ──────────────────────────────────────────────────────────────────
GROQ_API_KEY=your_groq_api_key_here

# ── OCR (OCR.space) ────────────────────────────────────────────────────────────
# Get a free key at https://ocr.space/OCRAPI
# Leave empty to use Tesseract-only fallback
OCR_SPACE_API_KEY=your_ocrspace_key_here

# ── CORS ───────────────────────────────────────────────────────────────────────
FRONTEND_URL=http://localhost:3000

Create a .env.local file inside frontend/:

# ── API URL ─────────────────────────────────────────────────────────────────────
NEXT_PUBLIC_API_URL=http://localhost:8000

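💡 Groq Model Note: Ensure you use a currently supported model in ai_service.py.
As of 2026, llama-3.3-70b-versatile is the recommended choice. Check any alternative, such as llama3-70b-8192, against Groq's current model list before relying on it. The llama3-8b-8192 model has been decommissioned.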
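
One way the backend could read the variables listed above is sketched below. It is an assumption-laden example rather than the project's actual startup code: the `Settings` class and `load_settings` function are hypothetical names, and the choice to fail early on a missing Groq key is a design suggestion.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    groq_api_key: str
    ocr_space_api_key: str  # empty string means "skip OCR.space"
    frontend_url: str


def load_settings(env=None) -> Settings:
    """Read the documented variables; fail early if the Groq key is missing."""
    env = os.environ if env is None else env
    groq_key = env.get("GROQ_API_KEY", "").strip()
    if not groq_key:
        raise RuntimeError("GROQ_API_KEY is not set")
    return Settings(
        groq_api_key=groq_key,
        ocr_space_api_key=env.get("OCR_SPACE_API_KEY", "").strip(),
        frontend_url=env.get("FRONTEND_URL", "http://localhost:3000"),
    )
```

With python-dotenv installed, calling `load_dotenv()` before `load_settings()` would copy the values from .env into the process environment first.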

📡 API Reference

`POST /api/upload`

Upload a medical report file.

Request: multipart/form-data

Field	Type	Description
`file`	File	PDF, JPG, or PNG — max 20 MB

Response:

{
  "report_id": 5,
  "document_url": "/uploads/abc123.pdf",
  "message": "File uploaded successfully"
}
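
The documented upload limits (PDF, JPG, or PNG, max 20 MB) could be enforced with a check like the one below. This is a sketch, not the logic in routes/upload.py: `validate_upload` is a hypothetical helper, accepting `.jpeg` is an assumption, and 20 MB is interpreted here as 20 × 1024 × 1024 bytes.

```python
from pathlib import Path

MAX_UPLOAD_BYTES = 20 * 1024 * 1024  # assumption: "20 MB" means 20 MiB
ALLOWED_EXTENSIONS = {".pdf", ".jpg", ".jpeg", ".png"}  # .jpeg is an assumption


def validate_upload(filename: str, size_bytes: int) -> None:
    """Raise ValueError if the file type or size falls outside the documented limits."""
    ext = Path(filename).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"Unsupported file type: {ext or '(none)'}")
    if size_bytes <= 0:
        raise ValueError("File is empty")
    if size_bytes > MAX_UPLOAD_BYTES:
        raise ValueError("File exceeds the 20 MB limit")
```

Checking the extension alone does not prove the content type, so a production version would likely also inspect the file's leading bytes.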

`POST /api/analyze/{report_id}`

Run the full analysis pipeline on an uploaded report.

Response: AnalysisResult

{
  "report_id": 5,
  "health_score": 70,
  "biomarkers": [
    {
      "marker_id": 12,
      "marker_name": "Hemoglobin",
      "extracted_value": 10.2,
      "unit": "g/dL",
      "risk_category": "Moderate",
      "ai_explanation": "Your hemoglobin is slightly below the normal range..."
    }
  ],
  "ai_summary": "Your report shows mild anemia with otherwise normal metabolic markers...",
  "recommendations": [
    "Consult a hematologist about your hemoglobin levels",
    "Consider iron-rich foods like spinach and lentils"
  ]
}
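
A client consuming this response might parse it into typed objects as sketched below. The field names come from the JSON above. The dataclasses, `parse_analysis_result`, and `flagged` are hypothetical and use only the standard library, whereas the backend itself defines its schemas with Pydantic in models/schemas.py.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Biomarker:
    marker_id: int
    marker_name: str
    extracted_value: Optional[float]
    unit: Optional[str]
    risk_category: str
    ai_explanation: Optional[str]


@dataclass
class AnalysisResult:
    report_id: int
    health_score: int
    biomarkers: list[Biomarker]
    ai_summary: Optional[str]
    recommendations: list[str]


def parse_analysis_result(payload: str) -> AnalysisResult:
    """Convert the JSON body of /api/analyze or /api/report into dataclasses."""
    data = json.loads(payload)
    return AnalysisResult(
        report_id=data["report_id"],
        health_score=data["health_score"],
        biomarkers=[Biomarker(**b) for b in data.get("biomarkers", [])],
        ai_summary=data.get("ai_summary"),
        recommendations=list(data.get("recommendations", [])),
    )


def flagged(result: AnalysisResult) -> list[str]:
    """Names of markers that are not classified as Normal."""
    return [b.marker_name for b in result.biomarkers if b.risk_category != "Normal"]
```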

`GET /api/report/{report_id}`

Retrieve a previously analyzed report from the database.

Response: Same AnalysisResult schema as above.

Error (404): Report not found or analysis not yet run.

🔬 Analysis Pipeline

The pipeline in services/pipeline.py runs 7 sequential steps:

Step 1: OCR Extraction
    ├── pdfplumber  → text-based PDFs (fastest, most accurate)
    ├── OCR.space   → scanned PDFs and images (cloud, handles tables well)
    └── Tesseract   → local fallback with OpenCV preprocessing

Step 2: Biomarker Parsing  (medical_parser.py)
    └── Regex + keyword matching to extract:
        name, numeric value, unit, reference range

Step 3: Risk Classification  (risk_engine.py)
    ├── Normal   → within reference range (or <10% deviation)
    ├── Moderate → 10–25% outside reference range
    └── Critical → >25% outside reference range

Step 4: Health Score Calculation  (risk_engine.py)
    └── Start at 100, deduct 15 per Critical, 7 per Moderate (min: 0)

Step 5: AI Explanations  (ai_service.py → Groq)
    └── Per-biomarker plain-English explanation

Step 6: AI Summary & Recommendations  (ai_service.py → Groq)
    ├── Overall health narrative
    └── Prioritized, actionable recommendations

Step 7: Persist to Database  (db/crud.py)
    └── health_score, biomarkers, ai_summary, recommendations → SQLite
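
Step 2 (regex-based parsing) can be illustrated with a small sketch. It assumes each result sits on one line in the form "name value unit low - high", which real lab reports often do not follow. The pattern `LINE_RE` and the helpers `parse_line` and `parse_report` are hypothetical and are not taken from medical_parser.py.

```python
import re
from typing import Optional

# Assumed line layout: "<name>[:] <value> [<unit>] [<low> - <high>]"
LINE_RE = re.compile(
    r"^\s*(?P<name>[A-Za-z][A-Za-z ]*?)\s*:?\s+"
    r"(?P<value>\d+(?:\.\d+)?)\s*"
    r"(?P<unit>[A-Za-zµ%/]+)?\s*"
    r"(?:(?P<low>\d+(?:\.\d+)?)\s*-\s*(?P<high>\d+(?:\.\d+)?))?\s*$"
)


def parse_line(line: str) -> Optional[dict]:
    """Return name, value, unit, and reference range for one line, or None if it does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    low, high = m.group("low"), m.group("high")
    has_range = low is not None and high is not None
    return {
        "name": m.group("name").strip(),
        "value": float(m.group("value")),
        "unit": m.group("unit"),
        "ref_range": (float(low), float(high)) if has_range else None,
    }


def parse_report(text: str) -> list[dict]:
    """Parse every line of OCR text and keep only the lines that look like results."""
    return [row for row in map(parse_line, text.splitlines()) if row is not None]
```

Lines without a numeric value, such as patient details, are skipped rather than raising, so noisy OCR output does not stop the pipeline.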

🗄️ Database Schema

-- Users (placeholder, no auth currently)
CREATE TABLE users (
    user_id    INTEGER PRIMARY KEY AUTOINCREMENT,
    created_at TEXT    DEFAULT (datetime('now'))
);

-- Medical Reports
CREATE TABLE reports (
    report_id            INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id              INTEGER REFERENCES users(user_id),
    upload_timestamp     TEXT    DEFAULT (datetime('now')),
    document_url         TEXT    NOT NULL,
    overall_health_score INTEGER DEFAULT 100,
    ai_summary           TEXT,
    recommendations      TEXT    -- JSON array stored as string
);

-- Biomarker Results
CREATE TABLE biomarkers (
    marker_id       INTEGER PRIMARY KEY AUTOINCREMENT,
    report_id       INTEGER NOT NULL REFERENCES reports(report_id) ON DELETE CASCADE,
    marker_name     TEXT    NOT NULL,
    extracted_value REAL,
    unit            TEXT,
    risk_category   TEXT    CHECK(risk_category IN ('Normal','Moderate','Critical')) DEFAULT 'Normal',
    ai_explanation  TEXT
);
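
The sketch below shows how a report might be written against a trimmed copy of this schema using Python's built-in sqlite3 module. The helpers `connect` and `save_report` are hypothetical, not the functions in db/crud.py. One detail worth noting: SQLite leaves foreign key enforcement off by default, so `ON DELETE CASCADE` only takes effect after `PRAGMA foreign_keys = ON` is run on each connection.

```python
import json
import sqlite3

# Trimmed copy of the schema above, keeping only the columns this example uses.
SCHEMA = """
CREATE TABLE reports (
    report_id            INTEGER PRIMARY KEY AUTOINCREMENT,
    document_url         TEXT NOT NULL,
    overall_health_score INTEGER DEFAULT 100,
    recommendations      TEXT
);
CREATE TABLE biomarkers (
    marker_id     INTEGER PRIMARY KEY AUTOINCREMENT,
    report_id     INTEGER NOT NULL REFERENCES reports(report_id) ON DELETE CASCADE,
    marker_name   TEXT NOT NULL,
    risk_category TEXT CHECK(risk_category IN ('Normal','Moderate','Critical')) DEFAULT 'Normal'
);
"""


def connect(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row
    # Foreign keys are off by default in SQLite; enable them per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    return conn


def save_report(conn, document_url, score, recommendations, markers) -> int:
    """Insert one report plus its (name, risk) markers; return the new report_id."""
    cur = conn.execute(
        "INSERT INTO reports (document_url, overall_health_score, recommendations) "
        "VALUES (?, ?, ?)",
        (document_url, score, json.dumps(recommendations)),  # JSON array as text
    )
    report_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO biomarkers (report_id, marker_name, risk_category) VALUES (?, ?, ?)",
        [(report_id, name, risk) for name, risk in markers],
    )
    conn.commit()
    return report_id
```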

📊 Health Scoring

The health score (0–100) is calculated deterministically from biomarker risk classifications:

Risk Level	Deduction	Trigger Condition
Critical	−15 points	Value >25% outside reference range
Moderate	−7 points	Value 10–25% outside reference range
Normal	−0 points	Value within range (or <10% deviation)

Example:

13 biomarkers found:
  → 3 Critical  = 3 × 15 = 45 points deducted
  → 2 Moderate  = 2 ×  7 = 14 points deducted
  → 8 Normal    = 0 deducted

Health Score = max(0, 100 - 59) = 41
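
The classification thresholds and the scoring rule can be expressed in a few lines of Python. This is a sketch under stated assumptions, not the code in risk_engine.py: deviation is measured relative to the nearest reference bound (the README does not say which baseline is used), and `classify`, `health_score`, and `DEDUCTIONS` are hypothetical names.

```python
DEDUCTIONS = {"Critical": 15, "Moderate": 7, "Normal": 0}


def classify(value: float, low: float, high: float) -> str:
    """Classify a value by how far it lies outside the [low, high] reference range."""
    if low <= value <= high:
        return "Normal"
    bound = low if value < low else high
    if bound == 0:
        # Assumption: relative deviation is undefined, so treat as Critical.
        return "Critical"
    deviation = abs(value - bound) / abs(bound) * 100  # percent beyond nearest bound
    if deviation < 10:
        return "Normal"
    if deviation <= 25:
        return "Moderate"
    return "Critical"


def health_score(categories) -> int:
    """Start at 100 and subtract the deduction for each marker, never going below 0."""
    return max(0, 100 - sum(DEDUCTIONS[c] for c in categories))


# Reproduces the worked example: 3 Critical, 2 Moderate, 8 Normal.
example = ["Critical"] * 3 + ["Moderate"] * 2 + ["Normal"] * 8
print(health_score(example))  # 41
```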

⚠️ Known Issues & Limitations

Issue	Impact	Fix / Workaround
`llama3-8b-8192` model decommissioned	AI explanations fail silently	Update model in `ai_service.py` to `llama-3.3-70b-versatile`
No user authentication	All reports are public by `report_id`	Add JWT auth for production
SQLite not suitable for production	Single-writer lock, no concurrent writes	Migrate to PostgreSQL for deployment
OCR.space free tier limits	500 API calls/month	Use Tesseract fallback or upgrade plan
Report history not shown in UI	Users must know their `report_id`	Add a History page

🗺️ Roadmap

Fix Groq model → update to llama-3.3-70b-versatile
Report History page — list all past uploads with timestamps
Export to PDF — print-friendly dashboard report
Authentication — user accounts via NextAuth / Supabase
Trend Analysis — compare reports over time (same user)
Deploy — Vercel (frontend) + Railway/Render (backend)
PostgreSQL migration — replace SQLite for production
More biomarkers — expand the reference range database

👥 Team

Built for HackXcelarate 2K26 by the Optimus Devs.

📄 License

MIT License — see LICENSE for details.

Made with ❤️ and ☕ for HackXcelarate 2K26

🚀 Upload Your Report · 📖 API Docs
