A powerful, multimodal AI assistant designed to empower individuals with vision and hearing impairments. Built with Gradio, Modal, Google Gemini, ElevenLabs, and Hugging Face models.
- Scene Description: Detects objects and describes the scene using OwlViT (see the detection sketch after this list).
- Smart OCR: Reads text from documents, signs, and screens using Google Gemini 2.5 Flash.
- Text Simplification: Summarizes complex text into easy-to-understand language.
- Text-to-Speech (TTS): Reads out the simplified text using ElevenLabs (Bella voice).
- Haptic Feedback: Maps detected sounds (e.g., "car horn", "siren") to vibration patterns (simulated).
- Live Captioning: Real-time transcription of speech using Distil-Whisper (see the transcription sketch after this list).
- Emotion Detection: Identifies the speaker's emotion (e.g., "Happy", "Sad", "Angry") using DistilHuBERT.
- Speaker Diarization: Identifies who is speaking (e.g., "SPEAKER_00", "SPEAKER_01") using pyannote.audio.
- Low Latency: Optimized for real-time interaction with parallel processing.
- MCP (Model Context Protocol): Connects to external tools like Calendar, Email, and Maps (Mock implementation).
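As a pointer to how the scene-description feature could be wired up, here is a minimal sketch using the Hugging Face `transformers` implementation of OwlViT; the candidate labels, score threshold, and `describe_scene` helper are illustrative, not the app's actual code:

```python
# Sketch: zero-shot object detection with OwlViT (labels and threshold are illustrative).
import torch
from PIL import Image
from transformers import OwlViTProcessor, OwlViTForObjectDetection

processor = OwlViTProcessor.from_pretrained("google/owlvit-base-patch32")
model = OwlViTForObjectDetection.from_pretrained("google/owlvit-base-patch32")

def describe_scene(image: Image.Image, labels: list[str]) -> list[str]:
    """Return human-readable detections for the given candidate labels."""
    inputs = processor(text=[labels], images=image, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # Rescale boxes to the original image size and keep confident detections.
    target_sizes = torch.tensor([image.size[::-1]])  # (height, width)
    results = processor.post_process_object_detection(
        outputs, threshold=0.1, target_sizes=target_sizes
    )[0]
    return [
        f"{labels[label]} ({score:.0%})"
        for score, label in zip(results["scores"].tolist(), results["labels"].tolist())
    ]

# Example: describe_scene(Image.open("street.jpg"), ["a car", "a person", "a traffic light"])
```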
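Similarly, a minimal sketch of the live-captioning step with the `transformers` ASR pipeline; the chunk length and `caption` helper are illustrative:

```python
# Sketch: speech-to-text with Distil-Whisper via the transformers ASR pipeline.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="distil-whisper/distil-large-v2",
    chunk_length_s=15,  # process audio in short chunks to keep latency low
)

def caption(audio_path: str) -> str:
    """Transcribe an audio file (ffmpeg must be installed for decoding)."""
    return asr(audio_path)["text"]

# Example: print(caption("meeting.wav"))
```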
- Frontend: Gradio (Python)
- Backend: Modal (Serverless GPU inference)
- AI Models:
  - Vision: `google/gemini-2.5-flash-image`, `google/owlvit-base-patch32`
  - Hearing: `distil-whisper/distil-large-v2`
  - Emotion: `BilalHasan/distilhubert-finetuned-ravdess` (ONNX)
  - Diarization: `pyannote/speaker-diarization-3.1` (see the diarization sketch below)
  - TTS: ElevenLabs API (see the TTS sketch below)
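For the diarization model listed above, a minimal sketch with `pyannote.audio`; the model is gated on Hugging Face, so the `HF_TOKEN` configured during setup must have been granted access, and the `who_spoke_when` helper is illustrative:

```python
# Sketch: speaker diarization with pyannote.audio (gated model, HF_TOKEN required).
import os
from pyannote.audio import Pipeline

pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1",
    use_auth_token=os.environ["HF_TOKEN"],
)

def who_spoke_when(audio_path: str) -> list[str]:
    """Return '[start-end] SPEAKER_XX' lines for each speech turn."""
    diarization = pipeline(audio_path)
    return [
        f"[{turn.start:.1f}s-{turn.end:.1f}s] {speaker}"
        for turn, _, speaker in diarization.itertracks(yield_label=True)
    ]
```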
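And for the TTS entry, a sketch against the public ElevenLabs REST endpoint; `BELLA_VOICE_ID` is a placeholder for the Bella voice ID from your ElevenLabs account, and the `speak` helper is illustrative:

```python
# Sketch: text-to-speech via the ElevenLabs REST API.
# BELLA_VOICE_ID is a placeholder: look up the actual voice ID in your account.
import os
import requests

BELLA_VOICE_ID = "your_voice_id_here"

def speak(text: str, out_path: str = "speech.mp3") -> str:
    """Synthesize `text` and save the MP3 audio returned by the API."""
    response = requests.post(
        f"https://api.elevenlabs.io/v1/text-to-speech/{BELLA_VOICE_ID}",
        headers={"xi-api-key": os.environ["ELEVENLABS_API_KEY"]},
        json={"text": text},
        timeout=30,
    )
    response.raise_for_status()
    with open(out_path, "wb") as f:
        f.write(response.content)
    return out_path
```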
```bash
git clone https://github.com/yourusername/accessibility-companion.git
cd accessibility-companion
```

It is recommended to use a virtual environment (Conda or venv).

```bash
pip install -r requirements.txt
```

Note: you also need `ffmpeg` installed on your system.
Create a `.env` file in the root directory:

```
GEMINI_API_KEY=your_gemini_key
ELEVENLABS_API_KEY=your_elevenlabs_key
HF_TOKEN=your_huggingface_token
```
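The app can then read these keys at startup; a minimal sketch, assuming `python-dotenv` is among the dependencies:

```python
# Sketch: loading the API keys at startup (assumes python-dotenv is installed).
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current directory

keys = {name: os.getenv(name)
        for name in ("GEMINI_API_KEY", "ELEVENLABS_API_KEY", "HF_TOKEN")}
missing = [name for name, value in keys.items() if not value]
if missing:
    raise RuntimeError(f"Missing keys in .env: {', '.join(missing)}")
```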
You need a Modal account. Authenticate first:

```bash
modal setup
```

Then deploy the backend functions:

```bash
modal deploy modal_app.py
```

Finally, start the Gradio frontend:

```bash
python app.py
```

Open your browser at http://localhost:7860.
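For orientation, a GPU function in `modal_app.py` might look roughly like the sketch below; the function name, GPU type, and image contents are assumptions, not the repo's actual definitions:

```python
# Sketch: a Modal GPU function, roughly the shape modal_app.py might take.
# The function name, GPU type, and image contents are illustrative.
import modal

app = modal.App("accessibility-companion")

image = (
    modal.Image.debian_slim()
    .apt_install("ffmpeg")
    .pip_install("transformers", "torch", "accelerate")
)

@app.function(gpu="T4", image=image)
def transcribe(audio_bytes: bytes) -> str:
    """Run Distil-Whisper on the GPU and return the transcript."""
    import tempfile
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition",
                   model="distil-whisper/distil-large-v2", device=0)
    with tempfile.NamedTemporaryFile(suffix=".wav") as f:
        f.write(audio_bytes)
        f.flush()
        return asr(f.name)["text"]

# The Gradio frontend can then call this with transcribe.remote(audio_bytes).
```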
- `app.py`: Main Gradio application (frontend & orchestration).
- `modal_app.py`: Modal backend definitions (GPU inference).
- `utils.py`: Helper functions for TTS and text processing.
- `requirements.txt`: Python dependencies.
Pull requests are welcome! Please open an issue first to discuss changes.
MIT License