A comprehensive AI-powered medical chatbot system that provides conversational health triage, symptom assessment, doctor appointment booking, and medical history tracking.
- Multi-turn intelligent conversations about symptoms
- Emergency detection with instant 000 call simulation
- Severity assessment (Emergency, Urgent, Routine)
- Personalized medical guidance and recommendations
- Browse nearby hospitals and clinics (5 seeded hospitals in Sydney)
- View available doctors with specializations (12 doctors across specialties)
- Select date and time slots based on doctor availability
- Automatic appointment confirmations
- Real-time notifications
- Appointment booking confirmations (immediate)
- 1-day reminder before appointment
- 1-hour reminder before appointment
- Real-time notification updates on dashboard
- Complete consultation history
- Symptom tracking over time
- Diagnosis records with confidence levels
- Medication history
- Download records as TXT or PDF
- ChatGPT-style conversational interface
- Responsive design for all devices
- Sidebar chat history
- Intuitive 4-step appointment booking flow
- Beautiful gradient themes
- Image uploads (X-rays, photos)
- Document uploads (medical reports)
- Voice recording for symptom description
- Secure signup and login system
- Password hashing with SHA-256
- Session management with tokens
- Per-user data isolation
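The password-hashing and session-token scheme above can be sketched with the standard library. Helper names here are illustrative, not the project's actual API:

```python
import hashlib
import secrets

def hash_password(password: str) -> str:
    """Return the SHA-256 hex digest of a password (the scheme described above)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

def verify_password(password: str, stored_hash: str) -> bool:
    """Constant-time comparison against the stored digest."""
    return secrets.compare_digest(hash_password(password), stored_hash)

def create_session_token() -> str:
    """Generate an unguessable token to identify a logged-in session."""
    return secrets.token_hex(32)
```

Note that unsalted SHA-256 is acceptable for a demo only; see the security notes later in this document about using bcrypt in production.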
- Backend: FastAPI, Python 3.8+
- AI/LLM: Mistral 7B Instruct via Ollama
- Orchestration: LangChain, LangGraph state machines
- Database: SQLite with SQLAlchemy ORM
- Frontend: HTML5, CSS3, Vanilla JavaScript
- Authentication: Email/password with token-based sessions
Feature coverage (implemented endpoints and flows):
- Authentication: signup/login endpoints (
/auth/signup,/auth/login) inmain.py. - Patient management: register, get, and update patient profiles (
/patients/*). - Conversational triage:
/chat/triageaccepts messages + media and returns a structured triage outcome. - Medical history:
/users/{user_id}/medical-historyand/consultations/savefor storing consultations. - Appointments: hospitals list, doctors list, book/cancel appointments, reminders background task.
- Notifications: user notifications endpoints.
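As a quick way to exercise the triage endpoint from Python, here is a hedged sketch; the payload field names are assumptions, so check the Pydantic models in `main.py` for the real schema:

```python
import json
import urllib.request
from typing import Optional

def build_triage_payload(user_id: int, message: str, media: Optional[list] = None) -> dict:
    """Assemble a request body for /chat/triage (field names are assumed)."""
    return {
        "user_id": user_id,
        "message": message,
        "media": media or [],   # e.g. [{"type": "image", "name": "xray.png"}]
        "history_limit": 5,     # last 5 interactions, per the docs above (assumed field)
    }

def post_triage(payload: dict, base_url: str = "http://localhost:8000") -> dict:
    """POST the payload to the running server and return the parsed JSON response."""
    req = urllib.request.Request(
        f"{base_url}/chat/triage",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

With the server running, a call like `post_triage(build_triage_payload(1, "I have a sharp chest pain"))` should return the structured triage outcome.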
Testing and verification:
- Unit / integration helpers: `test_workflow.py` exercises the workflow start-to-finish (note: it expects a running Ollama model for live LLM calls).
- API checks & quick scripts: `quick_test.py` and `test_api.py` show simple health checks and sample triage calls.
Edge cases handled in code:
- Missing or invalid patient: `404` responses and appropriate checks.
- LLM JSON parsing errors: fallbacks that ask clarifying questions instead of returning dangerous assertions.
- Emergency detection via keyword matching (immediate escalation): reduces the risk of incorrect triage for clear emergencies.
- Media processing: `/chat/triage` accepts media attachments and includes metadata in stored interactions.
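The JSON-parsing fallback described above can be sketched as follows; the fallback message and function name are illustrative, not the project's actual code:

```python
import json
import re

# Safe default: ask a clarifying question rather than guess a diagnosis.
SAFE_FALLBACK = {
    "type": "question",
    "message": "Could you describe your symptoms in a bit more detail?",
}

def safe_parse_llm_json(raw: str) -> dict:
    """Parse LLM output as JSON, with progressively looser fallbacks."""
    # 1. Try strict JSON first.
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        pass
    # 2. LLMs often wrap JSON in markdown fences or prose; grab the first {...} span.
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match:
        try:
            return json.loads(match.group(0))
        except json.JSONDecodeError:
            pass
    # 3. Give up safely: return a clarifying question instead of an assertion.
    return dict(SAFE_FALLBACK)
```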
Limitations & verification notes:
- Real clinical validation is required before any production deployment. This project is a prototype for decision support and triage only.
- The LLM is not an approved medical device; the system includes multiple disclaimers and escalation paths to clinicians.
This project intentionally integrates multiple modern technologies to maximize performance, scalability, maintainability, and safety. The following are the primary technologies used and how they contribute to the system:
- FastAPI — production-grade ASGI web framework used for the REST API (`main.py`). FastAPI provides automatic OpenAPI docs, async-friendly performance, and strong typing via Pydantic.
- SQLAlchemy (ORM) + SQLite — robust persistence layer (`database.py`) for patient profiles, consultations, interactions, and events. SQLAlchemy provides maintainable models and migrations-ready patterns.
- Ollama via LangChain (`langchain_ollama`) — local LLM serving that keeps inference on-prem or on local hardware, improving privacy and reducing inference network latency compared to remote APIs. Implemented in `llm_wrapper.py`.
- LangGraph workflow engine — explicit state-graph orchestration for multi-step medical reasoning (triage → pathway → action → finalize) in `workflow.py`. This enables easy testing and deterministic routing.
- LangChain (and prompt engineering) — structured system prompts (see `config.py`) and a few-shot approach to improve the reliability of LLM outputs and make them parseable (JSON responses expected by the service).
- Frontend single-file UIs (HTML/CSS/Vanilla JS) — lightweight, dependency-free client pages: `chat_ui.html`, `medical_history.html`, `auth.html`. They provide UX features such as media upload, an appointment booking UI, and an emergency modal.
- Safety & validation utilities — explicit emergency keyword detection, confidence thresholds (see `config.py`), and fallback JSON parsing with safeties in `llm_wrapper.py` to reduce hallucination risk.
Measured / observed effects (from project test comments and in-code notes):
- First LLM inference: ~5–10s (model load to VRAM), as noted in `test_workflow.py` and its printed output. Subsequent inferences are faster once the model is warmed.
- Emergency detection short-circuits expensive prompts and immediately returns high-confidence escalations (instant on match), reducing average triage latency for clear emergencies.
Notes about data provenance and assumptions:
- Performance numbers and the "5–10s" inference note come from the project's test files (comments in `test_workflow.py` and `quick_test.py`). For precise latency numbers on your hardware, run `quick_test.py` and collect timing metrics.
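To collect your own timing metrics, a small standard-library timer can wrap any call. This helper is generic and not part of the project; substitute a real triage request for the dummy workload:

```python
import time
from typing import Any, Callable, Tuple

def timed(fn: Callable[[], Any]) -> Tuple[float, Any]:
    """Run fn() and return (elapsed_seconds, result)."""
    start = time.perf_counter()
    result = fn()
    return time.perf_counter() - start, result

# Example with a dummy workload; to measure LLM latency, pass a lambda that
# performs a triage request and compare the first (cold) call to warmed calls.
elapsed, _ = timed(lambda: sum(range(1_000_000)))
print(f"call took {elapsed:.3f}s")
```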
This project uses an LLM-centered agent broken into three functional responsibilities. The implementation is in llm_wrapper.py and orchestrated by workflow.py and main.py.
Perception (input stage):
- Inputs accepted: free-text patient message, optional media (images, audio, files), and recent conversation history (last 5 interactions). The chat UI attaches media and the API stores a short media summary.
- Emergency keyword detection: `EMERGENCY_KEYWORDS` in `config.py` are checked in `MedicalLLMWrapper._check_emergency_keywords()` for immediate escalation (low-latency decision path).
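A minimal sketch of that keyword check; the keyword list here is illustrative, and the real list lives in `config.py` as `EMERGENCY_KEYWORDS`:

```python
# Illustrative subset; the project's real list is EMERGENCY_KEYWORDS in config.py.
EMERGENCY_KEYWORDS = [
    "chest pain", "can't breathe", "unconscious", "severe bleeding", "stroke",
]

def check_emergency_keywords(message: str) -> bool:
    """Case-insensitive substring match; any hit triggers immediate escalation."""
    text = message.lower()
    return any(keyword in text for keyword in EMERGENCY_KEYWORDS)
```

Because this runs before any LLM call, clear emergencies skip the expensive inference path entirely.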
Decision-making (reasoning / policy stage):
- Conversational triage: `perform_triage()` constructs a JSON-instruction prompt for the LLM to either ask targeted follow-up questions or provide a full assessment with severity, reasoning, suggested actions, and confidence.
- Care pathway selection: `recommend_care_pathway()` uses a separate system prompt to map triage to structured care pathways (`CARE_PATHWAY_SYSTEM_PROMPT` in `config.py`).
- Execution plan: `execute_action()` generates concrete action steps (book appointment, call emergency services, OTC suggestions) based on the care pathway.
- Confidence and escalation policy: `CONFIDENCE_THRESHOLD` in `config.py` determines whether low-confidence cases are escalated to a clinician. LangGraph routes low-confidence or emergency cases to the `escalate` node.
Interaction (output / dialog stage):
- JSON-first responses: the LLM is asked to return structured JSON so downstream code can parse and present consistent information to the user. Parsing fallback is implemented to handle malformed outputs.
- Safety & disclaimers: each finalized action plan includes a medical disclaimer (see `workflow._node_finalize`), and `llm_wrapper.validate_response()` contains heuristics for detecting risky diagnosis language.
Benefits observed in the codebase:
- Deterministic routing: by separating triage, pathway, and action into nodes, the system improves traceability and makes it easier to test each step independently.
- Fall-back and safety: emergency keywords short-circuit prompts for instant escalation; parsing fallbacks ensure the system asks clarifying questions instead of returning potentially unsafe or misleading diagnostics.
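The routing logic can be sketched without LangGraph as a plain function, which also shows why each step is testable in isolation. Node names and the threshold value below are assumptions; see `workflow.py` and `config.py` for the real ones:

```python
# Dependency-free sketch of the triage → pathway → action → finalize routing,
# with the escalation rule described above. The real implementation uses
# LangGraph nodes in workflow.py.

CONFIDENCE_THRESHOLD = 0.7  # assumed value; the real one is in config.py

def route_after_triage(state: dict) -> str:
    """Decide the next node from the triage result."""
    if state.get("is_emergency"):
        return "escalate"
    if state.get("confidence", 0.0) < CONFIDENCE_THRESHOLD:
        return "escalate"
    return "care_pathway"

def trace_pipeline(state: dict) -> list:
    """Return the node sequence a given triage state would visit."""
    path = ["triage", route_after_triage(state)]
    if path[-1] == "care_pathway":
        path += ["action", "finalize"]
    return path
```

Keeping routing as a pure function of state is what makes each branch deterministic and unit-testable.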
The project followed an Agile Scrum methodology to enable iterative development, rapid prototyping, and continuous improvement. The work was divided into three sprints under Stage 2, each lasting one week, focusing on progressively building and refining system functionality.
Sprint 1 (Week 10: Oct 13 – Oct 19)
Goal: Establish project foundations and implement proof-of-concept functionalities. Key Deliverables:
Frontend project setup (web/mobile) and repository workflow
Initial LLM triage pipeline and care flow integration
Session and case management (frontend)
Multimodal input handling (text/audio/image)
Emergency escalation design
Coding framework, tooling, and CI/CD workflow setup
✅ Outcome: All foundational components were successfully completed, ensuring project readiness for full-stack integration.
Sprint 2 (Week 11: Oct 20 – Oct 26)
Goal: Implement and test the full end-to-end care flow pipeline. Key Deliverables:
Reminder & follow-up engine
Emergency escalation workflow (EMS API)
Summary generation and export (PDF)
Safety guardrails for reasoning
Core AI agent logic implementation
🧩 Outcome: Completed all backend workflows and integrated safety and persistence features, preparing for deployment.
Sprint 3 (Week 12: Oct 27 – Nov 2)
Goal: System hardening and deployment for final demo. Key Deliverables:
Backend integration with AI agent and LLM alignment
Containerisation using Docker & CI/CD (GitHub Actions)
Logging, observability, and performance optimisation
Final UI/UX polish and LLM refinement
Application deployment and hosting for prototype demonstration
🚀 Outcome: Finalised and deployed a fully functional prototype with improved reliability, usability, and maintainability.
Overall Agile Outcome: The iterative sprint-based approach enabled continuous integration, regular testing, and frequent team reviews. This structure allowed the team to adapt quickly to new requirements, ensure feature completeness, and deliver a stable, demo-ready healthcare assistant prototype.
Before you begin, ensure you have the following installed:
- Python 3.8 or higher
  - Download from: https://www.python.org/downloads/
  - Verify: `python --version`
- Ollama (for running the local LLM)
  - Download from: https://ollama.ai/download
  - Verify: `ollama --version`
# Install all required packages
pip install -r requirements.txt

What gets installed:
- `fastapi` - Web framework
- `uvicorn` - ASGI server
- `sqlalchemy` - Database ORM
- `pydantic` - Data validation
- `email-validator` - Email validation
- `langchain` - LLM orchestration
- `langchain-community` - LangChain integrations
- `langgraph` - Workflow state machines
- `python-multipart` - File upload support
- `python-dotenv` - Environment variables
- `aiofiles` - Async file operations
- `requests` - HTTP library
Visit https://ollama.ai/download and download the installer for Windows.
Run the installer - it will:
- Install Ollama
- Start Ollama service automatically
- Add Ollama to your system PATH
Open a new terminal/PowerShell window and run:
ollama pull mistral:7b-instruct

This downloads the Mistral 7B Instruct model (~4.1 GB). Wait for completion:
pulling manifest
pulling 61e88e884507... 100% ▕████████████████▏ 4.1 GB
pulling 43070e2d4e53... 100% ▕████████████████▏ 11 KB
pulling e6836092461f... 100% ▕████████████████▏ 42 B
pulling ed11eda7790d... 100% ▕████████████████▏ 30 B
pulling f9b1e3196ecf... 100% ▕████████████████▏ 483 B
verifying sha256 digest
writing manifest
removing any unused layers
success
ollama list

Expected output:
NAME ID SIZE MODIFIED
mistral:latest 61e88e884507 4.1 GB 2 minutes ago
Run the seed script to populate with sample hospitals and doctors:
python seed_data.py

Expected output:
Seeding database with sample hospitals and doctors...
✅ Successfully seeded 5 hospitals and 12 doctors!
What gets seeded:
5 Hospitals in Sydney:
- Sydney General Hospital (2.5km away)
- Royal North Shore Hospital (5.2km)
- Westmead Hospital (8.1km)
- Prince of Wales Hospital (6.8km)
- Liverpool Hospital (12.3km)
12 Doctors across specialties:
- General Medicine (5 doctors)
- Cardiology (2 doctors)
- Pediatrics (1 doctor)
- Neurology (1 doctor)
- Orthopedics (1 doctor)
- Oncology (1 doctor)
- Dermatology (1 doctor)
python main.py

Expected output:
============================================================
🏥 Medical Llama - AI Health Assistant
============================================================
🚀 Server starting...
📍 Access the application at:
http://localhost:8000
📄 Available pages:
• Login/Signup: http://localhost:8000/
• Dashboard: http://localhost:8000/dashboard
• Chat: http://localhost:8000/chat
• Medical History: http://localhost:8000/history
🔧 API Documentation:
http://localhost:8000/docs
============================================================
- Open your web browser
- Navigate to: http://localhost:8000
- Create a new account (Sign Up)
- Start using Medical Llama!
- Click "Sign Up" tab
- Fill in the form:
  - Email
  - Password
  - First Name
  - Last Name
  - Age
  - Gender
- Click "Create Account"
- Automatically redirected to dashboard
- Click "Start Consultation" on dashboard
- Type your symptoms
- Answer AI's follow-up questions
- Receive severity assessment and recommendations
When severity is "Urgent":
- Click "📅 Book Doctor Appointment" button
- Step 1: Select a hospital
- Step 2: Choose a doctor
- Step 3: Pick date and time
- Step 4: Confirm booking
On dashboard you'll see:
- Upcoming Appointments - All scheduled visits
- Notifications - Booking confirmations and reminders
In Chat:
- Click "📄 Download TXT" for plain text
- Click "📑 Download PDF" for formatted document
In Medical History:
- Click "Download All as TXT"
- Click "Download All as PDF"
# Check if Ollama is running
ollama list
# Start Ollama if needed
ollama serve
# Pull model again
ollama pull mistral

# Reseed the database
python seed_data.py

# Restart server
python main.py

# Activate virtual environment
.\venv\Scripts\Activate.ps1

# Reinstall dependencies
pip install -r requirements.txt

# Find and kill process using port 8000
netstat -ano | findstr :8000
taskkill /PID <PID> /F

IMPORTANT: This application is for EDUCATIONAL AND DEMONSTRATION PURPOSES ONLY.
- ❌ NOT a substitute for professional medical advice
- ❌ NOT intended to diagnose, treat, cure, or prevent disease
- ❌ NOT a replacement for qualified healthcare providers
- ✅ Always seek advice from qualified healthcare professionals
- ✅ In a real emergency, call 000 (Australia) or your local emergency number
This is a demonstration/educational project:
- NOT intended for real medical use
- Passwords hashed with SHA-256 (use bcrypt for production)
- No HTTPS encryption in development mode
- Simple session tokens (use JWT for production)
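As a sketch of the suggested hashing upgrade: bcrypt requires a third-party package, so this example uses salted PBKDF2 from the standard library to stay dependency-free. Function names are illustrative:

```python
import hashlib
import hmac
import os

def hash_password(password: str, *, iterations: int = 200_000) -> str:
    """Salted PBKDF2-SHA256 hash, encoded as 'iterations$salt$digest'."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return f"{iterations}${salt.hex()}${digest.hex()}"

def verify_password(password: str, stored: str) -> bool:
    """Re-derive the digest with the stored salt and compare in constant time."""
    iterations, salt_hex, digest_hex = stored.split("$")
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode(), bytes.fromhex(salt_hex), int(iterations)
    )
    return hmac.compare_digest(digest.hex(), digest_hex)
```

Unlike unsalted SHA-256, each stored hash gets a unique random salt and a tunable work factor, which defeats precomputed rainbow-table attacks.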
try again/
├── main.py # FastAPI application
├── database.py # Database models
├── workflow.py # LangGraph workflow
├── config.py # Configuration
├── seed_data.py # Database seeding
├── requirements.txt # Dependencies
├── medical_llama.db # SQLite database (auto-created)
├── auth.html # Login/Signup page
├── dashboard.html # Dashboard
├── chat_ui.html # Chat interface
└── medical_history.html # Medical records
Once the server is running, visit:
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
- Python 3.8+ installed
- Ollama installed and running
- Mistral model pulled (`ollama pull mistral`)
- Dependencies installed (`pip install -r requirements.txt`)
- Database seeded (`python seed_data.py`)
- Server started (`python main.py`)
- Browser opened to http://localhost:8000
- Account created
- First consultation completed
- Appointment booked
- Notifications received
- Real SMS/Email notifications (Twilio/SendGrid)
- Video consultation scheduling
- Prescription management
- Lab results integration
- Mobile app (React Native)
- Multi-language support
- Wearable device integration
This project is provided as-is for educational purposes.
For issues:
- Check Troubleshooting section
- Review API docs at http://localhost:8000/docs
- Check browser console for errors (F12)
- Review server logs in terminal
Remember: This is a demonstration project. Always consult real healthcare professionals for medical advice!
Version: 1.0.0
Last Updated: November 2, 2025