rajantripathi/Multimodal-Agentic-AI-for-Biomedical-Insight-and-Decision-Support

Multimodal Agentic AI for Biomedical Insight and Decision Support

Disclaimer

This project is a research/competition prototype only and is not for clinical diagnosis, treatment, or medical decision-making.

What This Is

This repository provides a local, offline-capable prototype for multimodal biomedical insight generation using isolated modality agents and a verifier layer.

Architecture

Case Input
  |-- vision_agent ------|
  |-- ehr_agent ---------|
  |-- genomics_agent ----|--> verifier_agent --> final JSON + audit trail
  |-- literature_agent --|

Rules:

  • Agents do not communicate with each other.
  • Verifier is the only consumer of agent outputs.
  • All outputs are validated against strict Pydantic schemas and are JSON-serializable.
  • No invented citations. Literature evidence must come from retrieved local corpus documents.
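
The rules above can be illustrated with a minimal sketch. It is not the repository's implementation: it uses standard-library dataclasses as a stand-in for the Pydantic schemas, and every name in it (AgentOutput, run_agents, verify, the demo agents, the PMC ids) is hypothetical. It shows agents running in isolation on the same case, with the verifier as the only consumer of their outputs and the only place where citations are checked against the local corpus.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class AgentOutput:
    """Hypothetical output record; the project itself uses Pydantic schemas."""
    agent: str
    summary: str
    citations: list = field(default_factory=list)


def run_agents(case, agents):
    """Run every agent on the same case; no agent sees another agent's output."""
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        futures = [pool.submit(agent, case) for agent in agents]
        return [f.result() for f in futures]


def verify(outputs, corpus_ids):
    """Sole consumer of agent outputs: keep only citations found in the local corpus."""
    findings, audit = [], []
    for out in outputs:
        kept = [c for c in out.citations if c in corpus_ids]
        for c in out.citations:
            if c not in corpus_ids:
                audit.append(f"{out.agent}: dropped citation {c} (not in local corpus)")
        findings.append(asdict(AgentOutput(out.agent, out.summary, kept)))
    return {"findings": findings, "audit_trail": audit}


# Two stand-in agents (assumed behavior, for illustration only).
def ehr_agent(case):
    return AgentOutput("ehr_agent", f"reviewed record for {case['id']}")


def literature_agent(case):
    return AgentOutput("literature_agent", "retrieved 1 document", ["PMC1", "PMC999"])


result = verify(run_agents({"id": "case_01"}, [ehr_agent, literature_agent]), {"PMC1"})
print(json.dumps(result, indent=2))
```

In this sketch the unknown citation is dropped and the drop is recorded in the audit trail, so the final JSON never carries a reference that the local corpus cannot back.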

Local Install (Baseline)

python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

GPU VM Setup (Ubuntu 22.04)

  1. Bootstrap base environment:
bash scripts/bootstrap_vm.sh
  2. Install CUDA-matched torch/torchvision (example, adjust for your CUDA version):
source venv/bin/activate
pip install --index-url https://download.pytorch.org/whl/cu121 torch torchvision
  3. Verify GPU:
python scripts/check_gpu.py
  4. Configure runtime:
cp .env.example .env
# edit .env and set USE_REAL_* flags, HF_TOKEN, model ids, DEVICE=cuda

Runtime Configuration

Environment variables are loaded from .env (via python-dotenv).

Key flags:

  • USE_REAL_VISION=true|false
  • USE_REAL_EHR=true|false
  • USE_REAL_LITERATURE=true|false
  • DEVICE=cuda|cpu
  • VISION_MODEL_ID
  • EHR_MODEL_ID
  • LIT_EMBED_MODEL_ID
  • INFERENCE_TIMEOUT_SECONDS
  • LITERATURE_CORPUS_PATH (default data/processed/papers.jsonl)
  • LIT_EMBED_CACHE_PATH (default data/processed/lit_embeddings.npz)

The default mode remains backward compatible: every USE_REAL_* flag defaults to false, so the pipeline behaves as it did before the real-model paths were added.
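
As a rough illustration of how these variables might be read, the sketch below parses the flags from a mapping such as os.environ. The helper names (env_flag, load_settings) and the DEVICE default of cpu are assumptions, not taken from the repository; the two path defaults are the ones listed above. In the real project, python-dotenv would populate os.environ from .env before code like this runs.

```python
import os


def env_flag(name, default=False, environ=None):
    """Read a true|false flag; anything other than 'true' (case-insensitive) is False."""
    env = os.environ if environ is None else environ
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() == "true"


def load_settings(environ=None):
    """Collect runtime settings into a plain dict (hypothetical helper)."""
    env = os.environ if environ is None else environ
    return {
        "use_real_vision": env_flag("USE_REAL_VISION", environ=env),
        "use_real_ehr": env_flag("USE_REAL_EHR", environ=env),
        "use_real_literature": env_flag("USE_REAL_LITERATURE", environ=env),
        # Assumption: fall back to CPU when DEVICE is unset.
        "device": env.get("DEVICE", "cpu"),
        "literature_corpus_path": env.get(
            "LITERATURE_CORPUS_PATH", "data/processed/papers.jsonl"
        ),
        "lit_embed_cache_path": env.get(
            "LIT_EMBED_CACHE_PATH", "data/processed/lit_embeddings.npz"
        ),
    }


defaults = load_settings({})
gpu = load_settings({"USE_REAL_VISION": "true", "DEVICE": "cuda"})
print(defaults["use_real_vision"], gpu["use_real_vision"], gpu["device"])
```

Passing the environment in as a mapping keeps the parsing testable without touching the real process environment.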

Data Acquisition and Normalization

Create canonical directories:

mkdir -p data/raw data/processed

Literature (Europe PMC):

python scripts/download_literature_europepmc.py \
  --query "biomedical malignancy risk factors" \
  --page-size 100 --pages 2 \
  --output data/processed/papers.jsonl
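
Once a JSONL corpus exists, retrieval over it could look like the following sketch of a TF-IDF search, the fallback technique named for the literature agent. This is an assumption-laden illustration, not the project's code: the id and text field names are hypothetical (the actual papers.jsonl schema is not documented here), and the functions tokenize, load_corpus, and tfidf_search are invented for the example. Because results are (id, score) pairs drawn from the loaded documents, every returned citation exists in the local corpus.

```python
import json
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def load_corpus(lines):
    """Parse JSONL lines; the 'id' and 'text' field names are assumptions."""
    return [json.loads(line) for line in lines if line.strip()]


def tfidf_search(query, docs, top_k=3):
    """Rank docs by cosine similarity of TF-IDF vectors; return (id, score) pairs."""
    tokenized = [tokenize(d["text"]) for d in docs]
    n = len(docs)
    df = Counter(term for toks in tokenized for term in set(toks))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

    def vector(tokens):
        # Terms unseen in the corpus carry no weight and are ignored.
        counts = Counter(t for t in tokens if t in idf)
        return {t: c * idf[t] for t, c in counts.items()}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
            sum(w * w for w in b.values())
        )
        return dot / norm if norm else 0.0

    q = vector(tokenize(query))
    scored = [(d["id"], cosine(q, vector(toks))) for d, toks in zip(docs, tokenized)]
    scored = [s for s in scored if s[1] > 0]  # drop documents with no overlap
    return sorted(scored, key=lambda s: s[1], reverse=True)[:top_k]


# Made-up sample records, for illustration only.
sample = [
    '{"id": "doc-1", "text": "Smoking is a risk factor for lung malignancy."}',
    '{"id": "doc-2", "text": "Dietary fiber and gut microbiome diversity."}',
]
hits = tfidf_search("malignancy risk factors", load_corpus(sample))
print(hits)
```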

Vision manifest normalization:

python scripts/normalize_vision_manifest.py \
  --input data/raw/vision_manifest.csv \
  --output data/processed/vision_manifest.csv \
  --base-dir data/raw

EHR manifest normalization:

python scripts/normalize_ehr_manifest.py \
  --input data/raw/ehr_manifest.csv \
  --output data/processed/ehr_manifest.csv

Genomics QC report:

python scripts/genomics_qc.py data/raw/*.csv --output data/processed/genomics_qc.json

Model Download and Prewarm

python scripts/prewarm_models.py

Build the literature embedding cache (needed only when USE_REAL_LITERATURE=true):

python scripts/build_literature_embeddings.py

Run

CLI pipeline:

python -m orchestrator.run sample_cases/case_01
python -m demo.run_case sample_cases/case_01

Streamlit UI:

streamlit run apps/streamlit_app.py

Deployment

Systemd unit template:

  • deploy/systemd/agentic-health.service

Snapshot environment for reproducibility:

bash scripts/snapshot_env.sh

Tests

pytest -q
python -m eval.smoke_test

Notes on Real Model Integration

  • Vision agent: optional Hugging Face image-classification path with timeout and heuristic fallback.
  • EHR agent: optional Hugging Face zero-shot path with timeout and rule-based fallback.
  • Literature agent: embedding retrieval path with timeout and TF-IDF fallback.
  • Genomics agent: retains the deterministic, robust CSV/QC baseline.
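
The timeout-plus-fallback pattern described in these notes can be sketched as follows. This is one possible shape under stated assumptions, not the agents' actual code: run_with_fallback, slow_model, and rule_based are hypothetical names, and the timeout value would come from something like INFERENCE_TIMEOUT_SECONDS.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def run_with_fallback(primary, fallback, payload, timeout_seconds):
    """Try the model path; on timeout or error, return the fallback result instead.

    Returns (result, source) so the caller can record which path produced it.
    """
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(primary, payload)
    try:
        return future.result(timeout=timeout_seconds), "model"
    except FutureTimeout:
        return fallback(payload), "fallback (timeout)"
    except Exception as exc:
        return fallback(payload), f"fallback ({type(exc).__name__})"
    finally:
        # Do not block on a slow call; the worker thread finishes in the background.
        pool.shutdown(wait=False)


# Stand-ins for a slow model call and a rule-based baseline (illustrative only).
def slow_model(text):
    time.sleep(0.5)
    return "model label"


def rule_based(text):
    return "high" if "fever" in text.lower() else "low"


print(run_with_fallback(slow_model, rule_based, "Fever for 3 days", timeout_seconds=0.1))
```

One caveat of this design: a Python thread cannot be killed, so a timed-out model call keeps running until it returns. Where that matters, running inference in a separate process is a common alternative.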
