
NaniVision 👁️❤️

Empowering Independence Through Intelligent Vision.

NaniVision is an AI-powered assistive technology project designed to provide independence to individuals with visual impairments. Inspired by my grandmother, this project leverages computer vision to act as a digital pair of eyes, identifying people and detecting obstacles in real-time.


🔗 Live Demo

Check out the demo video here: Demo Video


🌟 Inspiration

The idea for NaniVision sparked when I realized my Nani (grandmother) struggled to navigate her surroundings independently due to vision loss. I wanted to bridge the gap between complex AI research and real-world accessibility. This project is built using industry-standard tools to solve a fundamental human challenge: safe and confident mobility.

🚀 Key Features

  • Face Recognition: Distinguishes between known family members and unknown individuals to provide a sense of security and social context (see the sketch after this list).
  • Object Detection & Navigation: Identifies obstacles, furniture, and hazards in the user's path to assist in safe movement.
  • High-Performance Backend: Optimized core logic to process video streams and categorize environmental factors with low latency.
  • Modular Audio Pipeline: Designed to receive audio tracks, making it ready for seamless integration with Voice Commands and Text-to-Speech (TTS) engines.
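
To make the detection flow concrete, here is a minimal sketch of the face-recognition and obstacle-detection paths above. It is illustrative only, not the repository's actual code: it assumes the open-source face_recognition and Ultralytics YOLO libraries, a webcam as a stand-in for the WebRTC stream, and a hypothetical known_faces/nani.jpg reference photo.

```python
# Illustrative sketch only: library choices and file paths are assumptions,
# not the repository's actual implementation.
import cv2
import face_recognition
from ultralytics import YOLO

# Enroll a known family member from a reference photo (hypothetical path).
nani_image = face_recognition.load_image_file("known_faces/nani.jpg")
known_encodings = face_recognition.face_encodings(nani_image)
known_names = ["Nani"] * len(known_encodings)

obstacle_model = YOLO("yolov8n.pt")  # small general-purpose object detector

cap = cv2.VideoCapture(0)  # stand-in: in the real system frames arrive over WebRTC
ret, frame = cap.read()
if ret:
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # face_recognition expects RGB

    # Path 1 - face recognition: known family member vs. unknown person.
    for encoding in face_recognition.face_encodings(rgb):
        matches = face_recognition.compare_faces(known_encodings, encoding)
        label = known_names[matches.index(True)] if True in matches else "unknown person"
        print(f"Person ahead: {label}")

    # Path 2 - object detection: obstacles and hazards in the user's path.
    for box in obstacle_model(frame)[0].boxes:
        print(f"Obstacle: {obstacle_model.names[int(box.cls)]} ({float(box.conf):.2f})")

cap.release()
```

In the real pipeline the frame would come from the WebRTC track and the printed labels would be handed to the audio pipeline rather than the console.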

Future Targets

  • Multimodal Voice-to-Command Integration: While command detection from raw text is currently implemented, the next phase involves integrating an end-to-end audio pipeline to process raw audio via Speech-to-Text (STT) and feed it into our existing logic.
  • Command-Driven Execution: We are transitioning from "always-on" detection to an intentional command-driven system. This will allow users to toggle specific modes (e.g., "Find my family" or "Detect obstacles") to reduce cognitive load and optimize processing.
  • Real-Time Audio Feedback Loop: Converting backend detection results into real-time, natural-sounding voice responses for the user via advanced Text-to-Speech (TTS) engines (see the sketch after this list).
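
Since command detection from text is already implemented, the command-driven mode toggling and spoken feedback targeted above might look like the following sketch. Everything here is an assumption for illustration: the mode names, the handle_command helper, and the choice of the pyttsx3 engine are hypothetical, not the project's actual design.

```python
# Illustrative sketch only: mode names, commands, and the pyttsx3 engine
# are assumptions, not the project's actual design.
import pyttsx3

# Map recognized text commands (today from the existing text-command logic,
# later from an STT front end) to toggleable detection modes.
MODES = {"find my family": "face recognition", "detect obstacles": "obstacle detection"}
active_modes: set[str] = set()

tts = pyttsx3.init()

def speak(text: str) -> None:
    """Speak a backend result aloud (the real-time feedback loop)."""
    tts.say(text)
    tts.runAndWait()

def handle_command(command: str) -> None:
    """Toggle a detection mode from a recognized text command."""
    mode = MODES.get(command.lower().strip())
    if mode is None:
        speak("Sorry, I did not understand that command.")
    elif mode in active_modes:
        active_modes.remove(mode)
        speak(f"{mode} turned off.")
    else:
        active_modes.add(mode)
        speak(f"{mode} turned on.")

handle_command("Find my family")    # -> speaks "face recognition turned on."
handle_command("Detect obstacles")  # -> speaks "obstacle detection turned on."
```

Toggling modes this way keeps each detector idle until the user asks for it, which is exactly the reduction in cognitive load and processing described above.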

About

Inspired by my Nani, I built this solution for the visually impaired. It enables real-time person identification and navigation assistance, using WebRTC and object detection/face-recognition models to help users recognize those around them and move safely through their environment.
