binnisha/Rural_Telehealth_NLP_Backend

Low-Bandwidth Multilingual Telehealth Platform 🏥

Powered by MediBridge (AI-Driven Backend Engine)



🌍 The Problem

India has 1.4 billion people. 65% live in rural areas. And yet - the entire digital healthcare revolution has been built for urban India.

Here is the reality on the ground:

  • 🗣️ A farmer in Bihar speaks Bhojpuri. The nearest specialist is in Delhi and speaks English.
  • 👩‍⚕️ A tribal woman in Jharkhand speaks Santali. No telehealth app supports her language.
  • 📶 A village in Rajasthan has 2G internet. Every existing telehealth platform buffers, crashes, or simply doesn't load.
  • 📋 A rural ASHA worker fills patient forms in Hindi. The hospital system only accepts English records.

Practo, 1mg, Apollo 247, Tata Health - these platforms are excellent. For urban India. For English speakers. For 4G users.

They were never designed for the 800 million who don't fit that profile.

This platform is.


💡 What We're Building - And Why It's Different

The Low-Bandwidth Multilingual Telehealth Platform is a healthcare communication infrastructure built from the ground up for rural India.

At its core is MediBridge - a custom AI backend engine that does something that, to our knowledge, no existing telehealth platform in India currently offers:

A patient speaks in their native Indian language. MediBridge transcribes the speech, translates it into English, and delivers it to a doctor in real time, even on a 2G connection.

That's the entire value proposition. Simple to explain. Hard to build. And we're building it.


🆚 How We Stand Out Against Market Leaders

| Feature | Practo | 1mg | Apollo 247 | Our Platform |
|---|---|---|---|---|
| Multilingual Voice Input | | | | ✅ 22 Languages |
| Low-Bandwidth Optimized | | | | ✅ Designed for 2G (benchmarks planned) |
| AI Medical Translation | | | | ✅ IndicTrans2 |
| Rural-First Design | | | | ✅ Core Focus |
| DISHA Compliant Backend | ⚠️ | ⚠️ | ⚠️ | ✅ By Design |
| Open Source Backend | | | | ✅ GitHub |

The gap in the market is real. The technology to fill it now exists. This platform connects the two.


🔬 The Novelty - What Makes This Genuinely New

Most telehealth platforms are booking systems with a video call bolted on. We are building a language infrastructure layer for Indian healthcare.

Three things make this novel:

1. IndicTrans2 in a Medical Context: IndicTrans2 by AI4Bharat is an open-source translation model covering all 22 scheduled Indian languages, and is among the most accurate Indic translation models available today. To our knowledge, no production telehealth system currently uses it. We are integrating it specifically for medical terminology, where accuracy isn't a nice-to-have; it's life-critical.

2. Whisper + IndicTrans2 Pipeline: OpenAI Whisper handles speech-to-text for many Indian languages and accents. MediBridge chains Whisper → IndicTrans2 into a single real-time API pipeline. To our knowledge, this specific pipeline does not yet exist as a production healthcare tool.

3. Low-Bandwidth as a Design Constraint, Not an Afterthought: Every architectural decision - payload size, API response structure, audio compression - is made with a 2G connection in mind. We will document and publish our bandwidth benchmarks openly. This is measurable, verifiable, and reproducible by anyone.
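
To make the 2G constraint concrete, a rough transfer-time estimate helps when choosing an audio format. The sketch below is illustrative only: the 40 kbit/s uplink, the 10% protocol overhead, and the function names are assumptions for the example, not measured figures or part of MediBridge's code.

```python
def payload_bytes(duration_s: float, bitrate_kbps: float) -> int:
    """Size of an audio clip in bytes for a given duration and bitrate."""
    return int(duration_s * bitrate_kbps * 1000 / 8)


def upload_seconds(size_bytes: int, link_kbps: float, overhead: float = 0.10) -> float:
    """Estimated upload time over a link, losing `overhead` to protocol framing.

    Both `link_kbps` and `overhead` are assumed inputs; real 2G throughput
    varies widely and should be measured, not taken from this sketch.
    """
    if link_kbps <= 0:
        raise ValueError("link_kbps must be positive")
    if not 0 <= overhead < 1:
        raise ValueError("overhead must be in [0, 1)")
    effective_bps = link_kbps * 1000 * (1 - overhead)
    return size_bytes * 8 / effective_bps


# 10 seconds of speech: uncompressed 16 kHz, 16-bit mono PCM is 256 kbit/s;
# 16 kbit/s stands in for a compressed speech codec (an assumed figure).
raw = payload_bytes(10, 256)        # 320000 bytes
compressed = payload_bytes(10, 16)  # 20000 bytes

ASSUMED_2G_UPLINK_KBPS = 40  # hypothetical effective uplink
raw_time = upload_seconds(raw, ASSUMED_2G_UPLINK_KBPS)
compressed_time = upload_seconds(compressed, ASSUMED_2G_UPLINK_KBPS)
```

Under these assumed numbers the uncompressed clip takes roughly 71 seconds to upload and the compressed one about 4.4 seconds, which is why audio compression is treated as an architectural decision rather than a later optimization.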


⚙️ How It Works - Platform Workflow

PATIENT SIDE                    MEDIBRIDGE ENGINE                 DOCTOR SIDE
─────────────                   ─────────────────                 ───────────
Patient speaks          →       1. Audio received by FastAPI  →
in Tamil / Hindi /              2. Whisper transcribes audio
Bhojpuri / any of               3. IndicTrans2 translates          English text
22 Indian languages     →       4. Text stored in PostgreSQL  →    displayed to
via mobile interface            5. JWT-secured delivery            doctor in
                                6. Encrypted per DISHA             real time

In simple terms:

  1. Patient opens app → taps microphone → speaks in their language
  2. Audio sent to MediBridge backend via API
  3. Whisper converts speech → text (in patient's language)
  4. IndicTrans2 translates → English
  5. Doctor sees clean English text of what patient said
  6. Entire exchange stored securely, DISHA compliant
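
The steps above can be sketched as a small orchestration function. This is a minimal sketch, not the project's implementation: `run_pipeline`, `ConsultationMessage`, and the two callables are hypothetical names, and the speech-to-text and translation stages are passed in as plain functions so the flow can be exercised without loading Whisper or IndicTrans2.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stage signatures: each takes its input plus a language code.
Transcriber = Callable[[bytes, str], str]  # (audio, source_lang) -> transcript
Translator = Callable[[str, str], str]     # (text, source_lang) -> English text


@dataclass
class ConsultationMessage:
    source_lang: str
    transcript: str    # text in the patient's language (step 3)
    english_text: str  # translated text shown to the doctor (steps 4-5)


def run_pipeline(
    audio: bytes,
    source_lang: str,
    transcribe: Transcriber,
    translate: Translator,
) -> ConsultationMessage:
    """Chain speech-to-text and translation for one patient utterance."""
    if not audio:
        raise ValueError("empty audio payload")
    transcript = transcribe(audio, source_lang).strip()
    english_text = translate(transcript, source_lang).strip()
    return ConsultationMessage(source_lang, transcript, english_text)


# Usage with stand-in stages; real ones would wrap Whisper and IndicTrans2.
def fake_transcribe(audio: bytes, lang: str) -> str:
    return " mujhe bukhar hai "


def fake_translate(text: str, lang: str) -> str:
    return "I have a fever"


message = run_pipeline(b"\x00\x01", "hi", fake_transcribe, fake_translate)
```

Keeping the two stages behind simple callables also makes it easy to swap model sizes or add a fallback without touching the route handlers.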

🏗️ MediBridge - Backend Architecture

MediBridge is the engine under the hood. Here's how it's structured:

medibridge-backend/
├── app/
│   ├── main.py            ← FastAPI server entry point
│   ├── routes/            ← All API endpoint definitions
│   │   ├── transcribe.py  ← Speech-to-text endpoints
│   │   ├── translate.py   ← IndicTrans2 translation endpoints
│   │   ├── auth.py        ← Login, JWT token management
│   │   └── patients.py    ← Patient record management
│   ├── models/            ← PostgreSQL database models
│   ├── services/          ← Core AI pipeline logic
│   │   ├── whisper.py     ← Whisper integration
│   │   └── indictrans.py  ← IndicTrans2 integration
│   └── utils/             ← Encryption, helpers, validators
├── tests/                 ← Full test suite
├── .env                   ← Environment variables (never committed)
├── Dockerfile             ← Container configuration
├── requirements.txt       ← All dependencies
└── README.md

🛠️ Tech Stack - After All, Every Choice Has a Reason

| Layer | Technology | Why |
|---|---|---|
| Backend Framework | FastAPI (Python) | Async, fast, auto-docs, production-grade |
| Speech-to-Text | OpenAI Whisper | Strong multilingual speech recognition, handles Indian accents |
| Translation Engine | IndicTrans2 by AI4Bharat | Open-source model covering all 22 scheduled Indian languages |
| Database | PostgreSQL | Reliable, ACID compliant, production standard |
| Authentication | JWT | Stateless, scalable, industry standard |
| Encryption | AES-256 | Medical data protection, DISHA requirement |
| Containerization | Docker | Consistent deployment, easy scaling |
| Cloud Deployment | AWS EC2 + S3 | Reliable, scalable, industry standard |
| API Documentation | Swagger UI (built-in) | Auto-generated, always up to date |

⚡ Getting Started

# Clone the repository
git clone https://github.com/binnisha/Rural_Telehealth_NLP_Backend
cd Rural_Telehealth_NLP_Backend

# Create and activate environment
conda create -n medical-ai python=3.10
conda activate medical-ai

# Install dependencies
pip install -r requirements.txt

# Run the server
uvicorn app.main:app --reload

Visit http://127.0.0.1:8000/docs for live interactive API documentation.


📊 Build Progress

| Phase | Description | Status |
|---|---|---|
| ✅ Phase 1 | Backend foundation, FastAPI server, core endpoints | Complete |
| 🔄 Phase 2 | OpenAI Whisper integration, /transcribe endpoint | In Progress |
| ⏳ Phase 3 | IndicTrans2 integration, translation pipeline | Upcoming |
| ⏳ Phase 4 | PostgreSQL database, patient records | Upcoming |
| ⏳ Phase 5 | JWT authentication, role-based access | Upcoming |
| ⏳ Phase 6 | AES-256 encryption, DISHA compliance | Upcoming |
| ⏳ Phase 7 | Docker + AWS deployment | Upcoming |
| ⏳ Phase 8 | Bandwidth benchmarking, performance docs | Upcoming |

🔒 Compliance & Security

This platform is being built with compliance with DISHA (the Digital Information Security in Healthcare Act, a draft Indian law) as a core requirement, not an afterthought:

  • All patient data encrypted at rest using AES-256
  • All data in transit secured via TLS
  • Role-based access control - patients and doctors see only what they should
  • No patient data stored without explicit consent flows
  • Full audit trail of all consultations
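
To illustrate how stateless, role-carrying tokens can back the access-control point above, here is a minimal HS256 sketch using only the Python standard library. It is an assumption-laden illustration, not the project's auth code: `issue_token`, `verify_token`, the `role` claim, and the 15-minute lifetime are hypothetical, and a real deployment would more likely rely on a maintained JWT library than on hand-rolled signing.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: bytes) -> bytes:
    return hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()


def issue_token(subject: str, role: str, secret: bytes,
                ttl_s: int = 900, now: Optional[float] = None) -> str:
    """Create a signed token carrying the user's id, role, and expiry."""
    issued = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": subject, "role": role, "iat": issued, "exp": issued + ttl_s}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input, secret))}"


def verify_token(token: str, secret: bytes, required_role: str,
                 now: Optional[float] = None) -> dict:
    """Return the claims if signature, expiry, and role all check out."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        signature = _b64url_decode(signature_b64)
    except ValueError:
        raise PermissionError("malformed token") from None
    expected = _sign(f"{header_b64}.{payload_b64}", secret)
    # Constant-time comparison avoids leaking signature bytes via timing.
    if not hmac.compare_digest(expected, signature):
        raise PermissionError("bad signature")
    claims = json.loads(_b64url_decode(payload_b64))
    current = time.time() if now is None else now
    if current >= claims["exp"]:
        raise PermissionError("token expired")
    if claims["role"] != required_role:
        raise PermissionError("role not allowed")
    return claims
```

Because the role travels inside the signed payload, an endpoint can check it without a database lookup, which keeps responses small and fast on slow links.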

📄 License

MIT License - see LICENSE file for details

About

AI-powered telehealth backend enabling multilingual speech-to-English medical translation for rural healthcare access using OpenAI Whisper and IndicTrans2.
