FreedomIntelligence/Awesome-LLM-Patient-Simulators

Awesome-LLM-Patient-Simulators ✨

This repository provides a curated collection of research papers on Large Language Models (LLMs) for patient simulation in healthcare, covering:

  • 🏥 Virtual patient interactions
  • 🎓 Clinical training applications
  • 🔍 Diagnostic reasoning systems
  • 🧠 Mental health simulations

🔍 Fundamental Research 📚

| Paper | Published in | Focus Area | Description | Data/Demo |
| --- | --- | --- | --- | --- |
| Simulator Limitations and Their Effects on Decision-Making | Sage 1996/10 | High-Fidelity Simulation Principles | Outlines core principles of high-fidelity patient simulation (dynamic case design, structured feedback, and behavioral assessment) that remain foundational for modern LLM-based patient simulation systems in medical education. | - |
| Lessons Learned from the Usability Evaluation of a Simulated Patient Dialogue System | Springer Link 2021/05 | Virtual Patient Dialogue System Usability | Developed a virtual patient dialogue system for medical students to practice with. | Data, Demo |
| The simulated patient method: Design and application in health services research | Science Direct 2021/12 | Simulated Patient Method Design (Health Services) | Explores the design and application of the simulated patient method in health services research, particularly in pharmacy practice. | - |
| Patient-Drama: A Literature Review of Simulated Patient Experiences in Medical Education and Training | Lesley 2022/04 | Simulated Patient Experience | Examines simulated patient experiences through the lens of drama therapy. Four specific types of simulated patient experience are examined: peer role play, peer simulation, standardized patient experiences, and clinical scenario drama. | - |
| The Role of ChatGPT, Generative Language Models, and Artificial Intelligence in Medical Education | JMIR Med Educ 2023/03 | LLMs in Medical Education | Explores the potential application of ChatGPT and other generative language models in medical education, including virtual patient simulation. | - |
| ChatGPT and Other Large Language Models in Medical Education - Scoping Literature Review | Springer 2024/11 | LLMs in Medical Education | A scoping review of the first year of research on large language models (LLMs) in medical education, covering 145 studies. Most studies focused on the ability of LLMs to pass medical examinations; a few discussed teaching uses such as simulated patient cases, and very few actually implemented teaching experiments. | - |
| From ChatGPT to DeepSeek: Can LLMs Simulate Humanity? | arXiv 2025/02 | LLM Human Simulation Capabilities | Analyses LLMs' ability to simulate human behavior, discussing key limitations (e.g., bias, lack of incentives) and future alignment strategies. | Data (Appendix) |

🤒 Virtual Patient Development (Historical Context)

| Paper | Published in | Focus Area | Description |
| --- | --- | --- | --- |
| Virtual Patient Simulation at U.S. and Canadian Medical Schools | ResearchGate (Orig. Acad Med) 2007 | VP Adoption in Medical Schools | Examines the adoption, implementation, and educational impact of virtual patient simulations in medical schools across the U.S. and Canada. |
| High-fidelity patient simulation: considerations for effective learning. | ResearchGate (Orig. J Contin Educ Health Prof) 2010/09 | High-Fidelity Simulation Design | Examines high-fidelity patient simulation in medical education, emphasizing key design and implementation factors that optimize learning outcomes. |
| Virtual patient simulation: knowledge gain or knowledge loss? | BMC Medical Education 2010/09 | VP Effectiveness (Clinical Skills) | Explores the use of virtual patient simulations in medical education, demonstrating their effectiveness in improving clinical decision-making skills. |
| Virtual patient simulation: what do students make of it? A focus group study | BMC Medical Education 2010/12 | VP Effectiveness (Student Perspective) | Evaluates the effectiveness of virtual patient simulations in medical education, finding that they enhance clinical reasoning skills among students. |
| Effectiveness of patient simulation in nursing education: meta-analysis. | Nurse Education Today 2015/01 | VP Effectiveness (Nursing Meta-Analysis) | Identifies the best available evidence about the effects of patient simulation in nursing education through a meta-analysis. |
| Guidelines for the design of a virtual patient for psychiatric interview training | Springer Link 2020/07 | VP Design Guidelines (Psychiatry) | Specific guidelines for designing psychiatric virtual patients... can be directly applied to LLM-driven patient simulation systems... |

🛠️ Methodology ⚙️

This section categorizes papers based on their core methodological contribution to LLM-based patient simulation.

🤖 Agent Systems & Simulation Environments

Focus on systems using multiple interacting agents or complex agent behaviors.

| Paper | Submitted in | Description | Method | Code/Project/Data |
| --- | --- | --- | --- | --- |
| Towards Conversational Diagnostic AI | arXiv 2024/01 | Introduces AMIE, an AI-driven system powered by LLMs that simulates virtual patient-doctor interactions for clinical education... | LLM, Agent (Implicit), Dialogue System | Resource |
| Agent Hospital: A Simulacrum of Hospital with Evolvable Medical Agents | arXiv 2024/05 | Introduces a simulacrum of a hospital called Agent Hospital that simulates the entire process of treating illness... all patients, nurses, and doctors are LLM-powered autonomous agents. | LLM, Agent System, Simulation Environment | Dataset_MedQA |
| AI Hospital: Benchmarking Large Language Models in a Multi-agent Medical Interaction Simulator | arXiv 2024/06 (Revised) | Introduces AI Hospital, a multi-agent framework for evaluating large language models in simulated medical interactions. | LLM, Agent System, Benchmark, Evaluation Framework | Project, Dataset_MVME, Dataset_MedQA, Dataset_PubMedQA, Dataset_MedMCQA |
| Towards a Client-Centered Assessment of LLM Therapists by Client Simulation | arXiv 2024/06 | Proposes ClientCAST, a novel LLM-based client simulation framework to assess the performance of AI therapists... | LLM, Agent (Client Simulation), Evaluation Framework | Project, Dataset_AnnoMI |
| AIPatient: Simulating Patients with EHRs and LLM Powered Agentic Workflow | arXiv 2024/10 | Presents a novel agentic workflow combining EHR knowledge graphs with a Reasoning-RAG architecture. | LLM, Agent System, RAG, Knowledge Graph, EHR Integration | Code, Data (MIMIC-III) |
| LLMs Can Simulate Standardized Patients via Agent Coevolution | arXiv 2024/12 | Introduces the EvoPatient framework, which utilizes multi-agent coevolution to enable LLMs to simulate standardized patients... | LLM, Agent System, Coevolution | Project |
| Scaffolding Empathy: Training Counselors with Simulated Patients and Utterance-level Performance Visualizations | arXiv 2025/02 | Introduces an LLM-based multi-agent system (SimPatient) with dynamic cognitive modeling and real-time feedback to enhance motivational interviewing (MI) training... | LLM, Agent System, Cognitive Modeling, Feedback Visualization | - |
| Self-Evolving Multi-Agent Simulations for Realistic Clinical Interactions | arXiv 2025/03 | Introduces MedAgentSim, an open-source simulated clinical environment with doctor, patient, and measurement agents designed to evaluate... LLM performance... | LLM, Agent System, Simulation Environment, Evaluation | Dataset_MedQA |

🗣️ NLP & Dialogue Systems

Focus on natural language processing techniques and the architecture of dialogue systems, including hybrid approaches.

| Paper | Submitted in | Description | Method | Code/Project/Data |
| --- | --- | --- | --- | --- |
| Artificial intelligence in virtual standardized patients: Combining natural language understanding and rule based dialogue management... | Medical Teacher 2022/11 | Developed a novel hybrid dialogue system using ASR, hybrid AI, and automated speech generation, enabling artificially intelligent VSPs... | NLP, Dialogue System, Rule-based AI, ASR, Speech Generation | - |

🧬 Generative Models & Data Integration

Focus on the use of generative models for tasks like data augmentation, image generation, or integrating structured data.

| Paper | Submitted in | Description | Method | Code/Project/Data |
| --- | --- | --- | --- | --- |
| Automated Generation of High-Quality Medical Simulation Scenarios Through Integration of Semi-Structured Data and Large Language Models | arXiv 2024/05 | Introduces a semi-structured data and LLM-integrated framework to automate the generation of medical simulation scenarios... | LLM, Structured Data Integration, RAG (Implicit), Scenario Generation | - |
| MedDiT: A Knowledge-Controlled Diffusion Transformer Framework for Dynamic Medical Image Generation in Virtual Simulated Patient | arXiv 2024/08 | Introduces MedDiT, a knowledge-controlled diffusion transformer framework designed to dynamically generate medical images for virtual simulated patients. | Transformer, Diffusion Models, Generative Models, Knowledge Control, Image Generation | - |
| Augmenting Insufficiently Accruing Oncology Clinical Trials Using Generative Models: Validation Study | JMIR 2025/03 | Explores the use of generative models to augment oncology clinical trials with insufficient patient accrual, validating their effectiveness in simulating additional patient data to improve trial outcomes. | Generative Models, Data Augmentation | Data (Appendix) |

💬 Prompt Engineering & Chatbots

Focus on the direct application of LLMs via carefully crafted prompts or chatbot interfaces as the primary contribution.

| Paper | Submitted in | Description | Method | Code/Project/Data |
| --- | --- | --- | --- | --- |
| A Generative Pretrained Transformer (GPT)–Powered Chatbot as a Simulated Patient to Practice History Taking... | JMIR 2024/01 | Employs a GPT-powered chatbot as a simulated patient, using meticulously crafted prompts to replicate realistic clinical interactions... | LLM, Prompt Engineering, Chatbot | Data (Appendix) |
| Application of Large Language Models in Medical Training Evaluation - Using ChatGPT as a Standardized Patient: Multimetric Assessment | JMIR 2025/01 | Optimized ChatGPT for medical standardized patient simulation via prompt engineering, enhancing clinical accuracy and realism. | LLM, Prompt Engineering, Evaluation | Data (Appendix) |

🏗️ Frameworks, Platforms & Workflows

Focus on integrated systems, platforms, specific workflows, or hardware integrations for patient simulation.

| Paper | Submitted in | Description | Method | Code/Project/Data |
| --- | --- | --- | --- | --- |
| Creating Virtual Patients using Robots and Large Language Models: A Preliminary Study with Medical Students | ACM 2024/03 | Presents a virtual patient platform combining the social robot Furhat with LLMs (GPT-3.5-turbo) to enhance clinical reasoning training... | LLM, Generative Models, Robotics Integration, Dialogue System | - |
| Leveraging Large Language Model as Simulated Patients for Clinical Education | arXiv 2024/04 | Introduces CureFun, an integrated framework that utilizes LLMs as simulated patients for clinical education, enhancing student-patient dialogues... | LLM, Framework, Evaluation | - |
| RasPatient Pi: A Low-Cost Customizable LLM-Based Virtual Standardized Patient Simulator | Springer Link 2025 (Year inferred) | Presents RasPatient Pi, a low-cost, customizable virtual standardized patient simulator based on Large Language Models (LLMs). | LLM, Hardware Integration (Raspberry Pi), Framework | Project |

🏥 Applications 🩺

🩺 General Medical Education

| Paper | Published in | Description | Key Feature/System | Code/Project/Data |
| --- | --- | --- | --- | --- |
| Virtual patient simulation: what do students make of it? A focus group study. | BMC Med Educ 2010 | Evaluates virtual patient simulations in medical training, concluding that they significantly improve clinical reasoning skills... | Virtual Patient Simulation | - |
| Using virtual standardized patients to accurately assess information gathering skills in medical students | Medical Teacher 2019/06 | Developed a virtual standardized patient (VSP) system that allows students to practice their history-taking skills and receive immediate feedback. | Virtual Standardized Patient (VSP) System, Automated Assessment | - |
| The Role of Large Language Models in Medical Education: Applications and Implications | JMIR 2023/08 | Shows LLMs can simulate patient encounters for medical education but lack nonverbal communication and may exhibit biases. | LLM for Patient Encounters | - |
| Virtual Standardized LLM-AI Patients for Clinical Practice | Annual Review of CyberTherapy and Telemedicine 2024/01 | Explores the use of virtual standardized AI patients powered by LLMs to enhance clinical practice and therapy training. | LLM-AI Patients | - |
| Increasing Realism and Variety of Virtual Patient Dialogues for Prenatal Counseling Education Through a Novel Application of ChatGPT: Exploratory Observational Study | JMIR 2024/02 | Explored the effectiveness of generative AI (ChatGPT) in developing realistic virtual standardized patient dialogues to teach prenatal counseling skills. | Virtual Patient Dialogues | - |
| Enhancing communication and clinical reasoning in medical education: Building virtual patients with generative AI | Future Healthcare Journal 2024/04 | Demonstrates the viability of using LLM technology, specifically GPT-3.5 and GPT-4, to create realistic virtual patients. | Generative AI (GPT-3.5, GPT-4) for VP Creation | - |
| Utilizing generative conversational artificial intelligence to create simulated patient encounters... anaesthesia training | Postgraduate Medical Journal 2024/04 | Explores the use of generative conversational artificial intelligence to create simulated patients for medical education; 'Convai' was used... | Generative Conversational AI (Convai) | - |
| Simulating Diverse Patient Populations Using Patient Vignettes and Large Language Models | ACL Anthology 2024/05 | Demonstrates how LLMs like GPT-4 can simulate diverse patient populations via role-prompted vignettes, enabling cost-effective testing... | LLM (GPT-4), Patient Vignettes, Role-Prompting | - |
| Roleplay-doh: Enabling Domain-Experts to Create LLM-simulated Patients via Eliciting and Adhering to Principles | arXiv 2024/07 | Introduces Roleplay-doh, a tool that enables domain experts to collaboratively create realistic AI patient simulations by providing qualitative feedback. | Roleplay-doh Tool, Collaborative Creation, Qualitative Feedback | Project |
| Creating virtual patients using large language models: scalable, global, and low cost | PubMed (Cureus) 2024/07 | Presents a low-cost, AI-driven virtual patient (LLM-VP) system using LLMs like GPT-4 for clinical reasoning training... | LLM-VP System (GPT-4 based) | - |
| A Language Model-Powered Simulated Patient With Automated Feedback for History Taking: Prospective Study | JMIR 2024/08 | Presents a language model-powered simulated patient designed to enhance history-taking skills... providing automated feedback... | LLM Simulated Patient, Automated Feedback | - |
| Enhancing Medical Interview Skills Through AI-Simulated Patient Interactions: Nonrandomized Controlled Trial | JMIR 2024/09 | Demonstrates that AI-simulated patient interactions effectively enhance medical interview skills in a nonrandomized controlled trial... | AI-Simulated Patient Interactions | - |
| Enhancing Patient-Centric Communication: Leveraging LLMs to Simulate Patient Perspectives | arXiv 2025/01 | Explores LLM-based patient simulation via persona prompts, showing higher accuracy for educated groups but biases against underserved populations. | LLM, Persona Prompts | - |
| MedSimAI: Simulation and Formative Feedback Generation to Enhance Deliberate Practice in Medical Education | arXiv 2025/03 | Introduces MedSimAI, an AI-driven platform designed to enhance medical education by providing realistic simulations and formative feedback. | MedSimAI Platform, Simulation, Formative Feedback | - |

👩‍⚕️ Nurse Training

| Paper | Published in | Description | Key Feature/System |
| --- | --- | --- | --- |
| Evaluation of Large Language Model Generated Dialogues for an AI Based VR Nurse Training Simulator | Springer Link 2024/06 | Evaluates the efficacy of LLM-generated dialogues (GPT-3.5, Bard, ClaudeAI) for VR nurse training simulators. | LLM (GPT-3.5, Bard, ClaudeAI), VR, Nursing |
| Assessing the efficacy of ChatGPT as a virtual patient in nursing simulation training: A study on nursing students' experience | Science Direct 2024/07 | Assessed ChatGPT's effectiveness in nursing simulation, which holds promise in enriching clinical learning. | ChatGPT, VR, Nursing |

🧠 Psychiatric Medicine

| Paper | Published in | Description | Key Feature/System | Code/Project/Data |
| --- | --- | --- | --- | --- |
| Embodied Virtual Patients as a Simulation-Based Framework for Training Clinician-Patient Communication Skills... Psychiatric and Geriatric Care | Frontiers 2022/06 | Explores the application of virtual patients in psychiatric and geriatric care, especially for training doctor-patient communication skills. | Embodied Virtual Patients | - |
| LLM-empowered Chatbots for Psychiatrist and Patient Simulation: Application and Evaluation | arXiv 2023/05 | Demonstrates ChatGPT's feasibility in powering psychiatric patient and doctor chatbots, validated through real clinician-patient evaluations... | LLM (ChatGPT), Chatbots (Patient & Doctor Simulation) | - |
| Feasibility assessment of using ChatGPT for training case conceptualization skills in psychological counseling | ScienceDirect 2024/8-12 | Investigates the feasibility and effectiveness of using ChatGPT for training case conceptualization skills in psychological counseling. (Case conceptualization refers to the comprehensive analysis and assessment of each client.) | LLM (ChatGPT), Assessment, Psychological counseling | - |
| PATIENT-Ψ: Using Large Language Models to Simulate Patients for Training Mental Health Professionals | arXiv 2024/10 | Introduces PATIENT-Ψ, an LLM-based patient simulation system for training mental health professionals in cognitive behavioral therapy. | PATIENT-Ψ System, LLM, CBT Training | - |
| PSYCHE: A Multi-faceted Patient Simulation Framework for Evaluation of Psychiatric Assessment Conversational Agents | arXiv 2025/01 | Presents PSYCHE, a multi-faceted patient simulation framework designed to evaluate psychiatric assessment conversational agents... | PSYCHE Framework, Evaluation Framework | - |
| A large language model digital patient system enhances ophthalmology history taking skills | npj Digit. Med. 2025/04 | Developed and evaluated an LLM-based digital patient system (LLMDP) via a randomized controlled trial, demonstrating significant improvements in history-taking skills and empathy for ophthalmology trainees. | LLM-SP, Eye manifestations, Education | - |

⚕️ Clinical Medicine

| Paper | Published in | Description | Key Feature/System | Data/Appendix |
| --- | --- | --- | --- | --- |
| ChatGPT Interactive Medical Simulations for Early Clinical Education: Case Study | JMIR Med Educ. 2023/11 | Models ChatGPT 3.5's ability to perform interactive clinical simulations and shows this tool's benefit to medical education. | LLM Patient Simulation, Structured Feedback | - |
| Large language models improve clinical decision making of medical students through patient simulation and structured feedback: a randomized controlled trial | PubMed Central (BMJ HCI) 2024/11 | Demonstrates that AI-simulated medical history conversations, particularly when combined with structured feedback, can enhance clinical decision-making skills in medical students. | LLM Patient Simulation, Structured Feedback | Data (Appendix) |
| Virtual Patients Using Large Language Models: Scalable, Contextualized Simulation of Clinician-Patient Dialogue With Feedback | JMIR 2025/04 | Shows that VPs powered by large language models (LLMs) can generate authentic dialogues, accurately represent patient preferences, and provide personalized feedback on clinical performance. | LLM Patient Simulation, Evaluation | - |

💬 Patient Communication Simulation

| Paper | Published in | Description | Key Feature/System |
| --- | --- | --- | --- |
| Integrating GPT-Based AI into Virtual Patients to Facilitate Communication Training Among Medical First Responders: Usability Study of Mixed Reality Simulation | JMIR 2025/03 | Studies how integrating GPT-based AI into a mixed reality (MR) VP could support communication training of medical first responders (MFRs). | LLM, Mixed Reality |
| Beyond the Script: Testing LLMs for Authentic Patient Communication Styles in Healthcare | arXiv 2025/03 | Proposes the use of Large Language Models (LLMs) to simulate authentic patient communication styles; the resulting VPs are evaluated by medical professionals. | LLM, Authentic Communication Style Simulation |
| Modeling Challenging Patient Interactions: LLMs for Medical Communication Training | arXiv 2025/03 | Proposes the use of Large Language Models (LLMs) to simulate authentic patient communication styles, specifically personas derived from the Satir model... multilingual applicability. | LLM, Persona Simulation (Satir), Multilingual |
| AI Standardized Patient Improves Human Conversations in Advanced Cancer Care | arXiv 2025/05 | Introduces an AI-based standardized patient system called SOPHIE (Standardized Online Patient for Healthcare Interaction Education) to improve the serious illness communication (SIC) capabilities of healthcare professionals in advanced cancer care. | LLM, Persona Simulation (Satir), Virtual Avatar |

☑️ Evaluation Methods & Effects 🚨

| Paper | Published in | Focus | Description | Project/Benchmark |
| --- | --- | --- | --- | --- |
| ChatGPT: when the artificial intelligence meets standardized patients in clinical training | Journal of Translational Medicine 2023/07 | Expert Evaluation (Human-Centric) | Collected 10 patient histories related to clinical training and education using ChatGPT; senior physicians then evaluated them to verify the accuracy of the information ChatGPT generated while simulating a standardized patient (SP). | - |
| Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena | arXiv 2023/10 | LLM Evaluation | Explores using strong LLMs as judges to evaluate chat assistants on more open-ended questions, examines the usage and limitations of LLM-as-a-judge, and proposes solutions to mitigate some of them. | Data |
| Analyzing evaluation methods for large language models in the medical field: a scoping review | BMC Med Inform Decis Mak 2024/11 | LLM Evaluation Review (Medical) | Reviews studies on LLM evaluations in the medical field and analyzes the research methods used in these studies. | - |
| NPHardEval: Dynamic Benchmark on Reasoning Ability of Large Language Models via Complexity Classes | arXiv 2024/02 (Revised) | LLM Reasoning Benchmark | Introduces NPHardEval, a dynamic benchmark grounded in computational complexity classes... to rigorously evaluate LLMs' reasoning abilities... | NPHardEval Project |
| NPHardEval4V: A Dynamic Reasoning Benchmark of Multimodal Large Language Models | arXiv 2024/03 | Multimodal LLM Reasoning Benchmark | Introduces NPHardEval4V, a dynamic multimodal benchmark designed to evaluate the pure reasoning abilities of MLLMs... revealing significant gaps between MLLM and LLM reasoning... | NPHardEval4V Project |
| Assessing the efficacy of ChatGPT as a virtual patient in nursing simulation training... | ScienceDirect (Nurse Educ Pract) 2024/07 | ChatGPT Effectiveness Assessment (Nursing) | Assessed ChatGPT's effectiveness in nursing simulation by collecting acceptability, accessibility, and engagement ratings and assessing students' interaction skills. | - |
| PersonaEval: Benchmarking LLMs on Role-Playing Evaluation Tasks | ICLR 2024/10 | Role-Playing Evaluation, Benchmark | Introduces PersonaEval, a benchmark designed to assess the effectiveness of LLMs in role-playing evaluation tasks. | - |
| PingPong: A Benchmark for Role-Playing Language Models with User Emulation and Multi-Model Evaluation | arXiv 2025/04 | Role-Playing Language Models Benchmark | Introduces a benchmark for evaluating the role-playing capabilities of language models. | Project |

⚠️ Challenges & Limitations 🚨

| Paper | Published in | Focus Area | Description |
| --- | --- | --- | --- |
| A Survey of Large Language Models in Medicine: Progress, Application, and Challenge | arXiv 2024/07 (Revised) | LLM in Medicine Survey | Reviews the applications of LLMs in medicine, covering the challenges of patient simulation (such as dynamic interaction and realism). |

🎭 Bias & Ethical Considerations

| Paper | Published in | Focus Area | Description |
| --- | --- | --- | --- |
| Ethical Considerations in the Use of Artificial Intelligence and Machine Learning in Health Care: A Comprehensive Review | PubMed Central (Cureus) 2024/06 | AI/ML Ethics in Healthcare | Explores the multifaceted ethical considerations in the use of AI and ML in health care, including privacy, algorithmic bias, transparency, validation, and professional responsibility. |

📊 Datasets 💾

This section lists datasets and project repositories mentioned in the papers that are either generated by the simulation process or used for evaluation and grounding. Note that project repositories may contain various resources including code, configuration files, and potentially simulation data or logs. Appendices often contain supplementary information like prompts or evaluation rubrics rather than reusable datasets.

Simulation Dialogue, Configuration & Resources

| Dataset/Resource | Paper Year | Language | Data Type | Project/Code | Link |
| --- | --- | --- | --- | --- | --- |
| PG-logs-eval | 2021/05 | French | Dialogue Logs | - | Link |
| medicinenet-diseases.json | 2024/01 | English | Disease List | Code Repo | Link |
| ClientCAST Framework | 2024/06 | English | Simulation Framework | Project | Project Link |
| EvoPatient Framework | 2024/12 | English | Simulation Framework | Project | Project Link |
| RasPatient Pi | 2025 (Year inferred) | English | Simulation Framework | Project | Project Link |
| Roleplay-doh Tool | 2024/07 | English | Simulation Tool | Project | Project Link |
| AIPatient Workflow | 2024/10 | English | Agentic Workflow | Code | Code Link |

Evaluation & Grounding Datasets

| Dataset | Task Type | Language | Access | Used In Paper (Year) | Link | Project (if applicable) |
| --- | --- | --- | --- | --- | --- | --- |
| MIMIC-III | EHR Data | English | Restricted | 2024/10 (AIPatient) | Link | - |
| MedQA (USMLE) | Medical QA | English | Public | 2024/05, 2024/06, 2025/03 | Link | AI Hospital |
| MVME | Medical Vision/Language | Multimodal | Public | 2024/06 (AI Hospital) | Link | AI Hospital |
| PubMedQA | Biomedical QA | English | Public | 2024/06 (AI Hospital) | Link | AI Hospital |
| MedMCQA | Medical QA (India) | English | Public | 2024/06 (AI Hospital) | Link | AI Hospital |
| AnnoMI | Motivational Interviewing | English | Public | 2024/06 (ClientCAST) | Link | ClientCAST |
| NPHardEval | LLM Reasoning Benchmark | English | Public | 2024/02 | Project Link | NPHardEval |
| NPHardEval4V | MLLM Reasoning Benchmark | Multimodal | Public | 2024/03 | Project Link | NPHardEval4V |
