A lightweight Retrieval-Augmented Generation (RAG) assistant for clinical diagnosis, built on annotated medical notes from MIMIC-IV-Ext-DiReCT and deployable in both GPU (fast) and CPU (slow) modes.
Try the model live (CPU deployment):
RAGnosis on Hugging Face Spaces
| Layer | Details |
|---|---|
| Model | Nous-Hermes-2-Mistral-7B-DPO (GPU) / BioMistral-7B (CPU) |
| Dataset | MIMIC-IV-Ext-DiReCT |
| Retriever | FAISS + SentenceTransformers (all-MiniLM-L6-v2) |
| Frontend | Gradio (via Hugging Face Spaces) |
| Backend | PyTorch + Transformers + BitsAndBytes |
- Top-k retrieval from real, annotated clinical notes
- Explainable diagnosis using structured logic and LLMs
- Based on real diagnostic chains from MIMIC-IV-Ext-DiReCT
- Clean Gradio UI for free-text medical queries
- Supports GPU for fast inference, with a CPU fallback
- Parse annotated `.json` samples and knowledge graphs
- Chunk clinical facts into `retrieval_corpus.csv`
- Embed chunks using Sentence-BERT
- Save embeddings into `faiss_index.bin`
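The parsing and chunking steps can be sketched with the standard library alone. This is a minimal illustration, not the repository's code: the JSON layout (a flat mapping of section name to text), the `text` column name, and the sentence-based splitter are all assumptions, and the real DiReCT samples are likely structured differently. The embedding and FAISS steps are left out because they need third-party libraries.

```python
import csv
import json
import re
from pathlib import Path


def chunk_note(note: dict) -> list[str]:
    """Split each section of a (hypothetical) flat note into sentence-sized chunks."""
    chunks = []
    for section, text in note.items():
        # Naive sentence split: break on whitespace that follows ., ! or ?
        for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
            if sentence:
                chunks.append(f"{section}: {sentence}")
    return chunks


def build_corpus(sample_dir: Path, out_csv: Path) -> int:
    """Write one chunk per CSV row and return the number of chunks written."""
    rows = []
    for path in sorted(sample_dir.glob("*.json")):
        note = json.loads(path.read_text(encoding="utf-8"))
        rows.extend(chunk_note(note))
    with out_csv.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["text"])  # assumed column name
        writer.writerows([row] for row in rows)
    return len(rows)
```

Each row of the resulting CSV would then be embedded and added to the vector index in the two remaining steps.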
- Query is embedded using MiniLM
- Top-k chunks are returned using FAISS index
- Query and context are merged into a prompt
- Model (Mistral-7B) generates the diagnosis
- Output is parsed and shown in Gradio UI
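The retrieval and prompt-assembly steps above can be sketched as follows. To stay dependency-free, this example swaps in a brute-force cosine search for the FAISS index and uses hand-written toy vectors in place of MiniLM embeddings. The function names and the prompt wording are hypothetical, not taken from `app.py`.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], chunk_vecs: list[list[float]], k: int = 3) -> list[int]:
    """Indices of the k chunks most similar to the query, best first."""
    ranked = sorted(
        range(len(chunk_vecs)),
        key=lambda i: cosine(query_vec, chunk_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, chunks: list[str]) -> str:
    """Merge the retrieved context and the user query into one prompt string."""
    context = "\n".join(f"- {c}" for c in chunks)
    return (
        "Use the clinical context to suggest a diagnosis.\n\n"
        f"Context:\n{context}\n\n"
        f"Query: {query}\nDiagnosis:"
    )


# Toy data standing in for the real corpus and its embeddings.
chunks = [
    "wheezing with prolonged expiration",
    "orthopnea and leg edema",
    "fracture of the wrist",
]
vecs = [[1.0, 0.1, 0.0], [0.2, 1.0, 0.0], [0.0, 0.0, 1.0]]

hits = top_k([0.9, 0.3, 0.0], vecs, k=2)
prompt = build_prompt("shortness of breath", [chunks[i] for i in hits])
```

In the real pipeline the query vector would come from the same embedding model used to build the index, since vectors from different models are not comparable.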
| Feature | Hugging Face (CPU) | Local/Colab (GPU) |
|---|---|---|
| Model Used | BioMistral/BioMistral-7B | Nous-Hermes-2-Mistral-7B-DPO |
| Speed | ~500 seconds per query | <10 seconds per query |
| Accuracy | Good | Great (instruction-tuned) |
| Setup | Ready to use (slow) | Requires CUDA but runs much faster |
| Hosting | Free (Hugging Face Spaces) | Free (Colab, Kaggle, local CUDA) |
For real-time use, prefer running the CUDA version from a clone of the GitHub repository. The Hugging Face version is for preview and demo purposes only.
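One way to switch between the two modes is to pick the model from whether CUDA is available. The sketch below is an assumption about how that could look, not the logic in `app.py`. The model names are copied from the comparison table, and the GPU name may need an organisation prefix to resolve on the Hugging Face Hub.

```python
def cuda_available() -> bool:
    """Return True only if PyTorch is installed and reports a CUDA device."""
    try:
        import torch
    except ImportError:
        return False
    return torch.cuda.is_available()


# Names as they appear in the comparison table (assumed, not verified Hub IDs).
CPU_MODEL = "BioMistral/BioMistral-7B"
GPU_MODEL = "Nous-Hermes-2-Mistral-7B-DPO"


def pick_model(use_gpu: bool) -> tuple[str, str]:
    """Return (model name, device string) for the chosen mode."""
    return (GPU_MODEL, "cuda") if use_gpu else (CPU_MODEL, "cpu")


model_name, device = pick_model(cuda_available())
print(f"Loading {model_name} on {device}")
```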
git clone https://github.com/asadsandhu/RAG-Diagnostic-Assistant.git
cd RAG-Diagnostic-Assistant
pip install -r requirements.txt
python app.py

Required files are included:
- `retrieval_corpus.csv`
- `faiss_index.bin`
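A quick pre-flight check can confirm that both data files are present before launching the app. This is a hypothetical helper, not part of the repository, and it assumes the files sit in the repository root.

```python
from pathlib import Path

# File names as listed in this README.
REQUIRED_FILES = ("retrieval_corpus.csv", "faiss_index.bin")


def missing_files(repo_dir: Path) -> list[str]:
    """Return the required files that are not present in repo_dir."""
    return [name for name in REQUIRED_FILES if not (repo_dir / name).is_file()]


missing = missing_files(Path("."))
if missing:
    print("Missing required files:", ", ".join(missing))
else:
    print("All required files found.")
```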
RAG-Diagnostic-Assistant/
├── app.py                  # Deployable backend using Gradio
├── RAGnosis.ipynb          # Notebook version of the pipeline
├── faiss_index.bin         # FAISS vector index
├── retrieval_corpus.csv    # Processed clinical chunks
├── requirements.txt        # Dependencies
├── assets/
│   └── demo.png            # Sample UI screenshot
└── README.md
- Combines annotated diagnostic chains (in `samples/`) and structured graphs (in `diagnostic_kg/`)
- Captures how clinicians move from symptom → rationale → diagnosis
- Original repo: DiReCT GitHub
Query: "patient is experiencing shortness of breath"
LLM Output:
"Shortness of breath is a common symptom that can be caused by a variety of respiratory conditions. The differential diagnosis for shortness of breath includes asthma, chronic obstructive pulmonary disease (COPD), congestive heart failure, pneumonia, and pneumothorax. In order to determine the cause of the shortness of breath, it is important to consider the patient's medical history, physical examination findings, and diagnostic testing results. For example, if the patient has a history of asthma and is experiencing wheezing and a prolonged expiratory phase on examination, this would suggest asthma as the cause of the shortness of breath. On the other hand, if the patient has a history of congestive heart failure and is experiencing orthopnea, crackles on auscultation, and a history of edema, this would suggest congestive heart failure as the cause of the shortness of breath."
Read the full blog post explaining RAGnosis, the dataset structure, pipeline design, and tradeoffs:
Read on Medium
Built by Asad Ali, AI Developer & NLP Researcher
- LinkedIn
- Medium
- GitHub
- Hugging Face
MIT License. See LICENSE for details.
- MIT-LCP for the MIMIC-IV dataset
- DiReCT team for annotated clinical reasoning data
- Hugging Face Transformers & Gradio
- Facebook Research for FAISS
- Nous Research for the Nous-Hermes-2-Mistral-7B-DPO model
Disclaimer: This tool is for academic and demo purposes only. It is not intended for clinical use.
