Description
Implement AI-powered accessibility features including voice commands for drawing ("draw a red circle"), speech-to-text annotations, gesture recognition for hands-free drawing, and screen reader optimizations. Make ResCanvas accessible to users with different abilities.
Current workflow
The canvas currently requires mouse or touchscreen input, with limited support for screen readers or alternative input methods.
Proposed solution
Integrate speech recognition (Web Speech API, Whisper) for voice commands and dictation. Add webcam-based gesture recognition (MediaPipe, TensorFlow.js). Enhance screen reader support with descriptive announcements of canvas state.
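The voice-command path boils down to mapping a transcribed utterance to a canvas action. A minimal sketch of that intent-parsing step is below; the command grammar, vocabulary, and action names are illustrative, not part of the existing codebase.

```python
import re

# Illustrative vocabulary; the real voice_processing_service.py would
# likely use a richer NLP pipeline rather than keyword sets.
COLORS = {"red", "blue", "green", "black", "yellow"}
SHAPES = {"circle", "square", "line", "rectangle"}
TOOLS = {"pen", "eraser", "brush"}

def parse_command(text):
    """Map a transcribed utterance to an action dict, or None if unrecognized."""
    lowered = text.lower()
    words = set(re.findall(r"[a-z]+", lowered))
    if "draw" in words:
        shape = next((s for s in SHAPES if s in words), None)
        color = next((c for c in COLORS if c in words), None)
        if shape:
            return {"action": "draw", "shape": shape, "color": color or "black"}
    if "select" in words:
        tool = next((t for t in TOOLS if t in words), None)
        if tool:
            return {"action": "select_tool", "tool": tool}
    m = re.search(r"(?:change|set) color to (\w+)", lowered)
    if m and m.group(1) in COLORS:
        return {"action": "set_color", "color": m.group(1)}
    if "undo" in words:
        return {"action": "undo"}
    if "redo" in words:
        return {"action": "redo"}
    return None
```

For example, `parse_command("draw a red circle")` yields `{"action": "draw", "shape": "circle", "color": "red"}`, which the frontend can dispatch like any toolbar event.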
Technical requirements
Files to create:
- frontend/src/components/Accessibility/VoiceCommandPanel.jsx
- frontend/src/components/Accessibility/GestureRecognition.jsx
- frontend/src/hooks/useVoiceCommands.js
- frontend/src/hooks/useGestureControl.js
- frontend/src/services/speechService.js
- backend/services/voice_processing_service.py
Files to modify:
- Canvas.js
- Toolbar.js
- Room.jsx
- frontend/src/styles/accessibility.css
Backend:
- routes/voice_commands.py
- config.py
Skills
Speech recognition, NLP intent parsing, computer vision, gesture recognition, React accessibility, ARIA, WebRTC
Key features
- Voice commands for drawing tools ("select pen", "change color to blue")
- Speech-to-text for canvas annotations
- Gesture recognition (hand tracking for drawing in air)
- Eye tracking integration for cursor control
- Screen reader optimizations with canvas state descriptions
- Keyboard-only navigation enhancements
- High contrast and colorblind modes
- Undo/redo via voice
- Voice feedback for actions
Challenges
Accent/language variations, noisy environments, gesture accuracy, privacy concerns with camera/microphone, latency for real-time commands, command disambiguation
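Command disambiguation in particular can be prototyped cheaply before any NLP model is involved: fuzzy-match the noisy transcript against the known command set and fall back to suggestions when confidence is low. A sketch using only the Python standard library (`difflib`); the command list and cutoff values are assumptions.

```python
import difflib

# Illustrative canonical commands; the real set would come from the parser.
KNOWN_COMMANDS = ["select pen", "select eraser", "undo", "redo",
                  "change color to blue", "change color to red"]

def disambiguate(transcript, cutoff=0.6):
    """Return (best_match, alternatives) for a noisy transcript, or
    (None, suggestions) when nothing clears the confidence cutoff."""
    matches = difflib.get_close_matches(transcript.lower(), KNOWN_COMMANDS,
                                        n=3, cutoff=cutoff)
    if matches:
        return matches[0], matches[1:]
    # Below cutoff: surface weaker suggestions so the UI can ask the user.
    suggestions = difflib.get_close_matches(transcript.lower(), KNOWN_COMMANDS,
                                            n=3, cutoff=0.3)
    return None, suggestions
```

A misheard transcript such as `"selct pen"` still resolves to `"select pen"`, while an unrecognizable one returns `None` plus candidate commands for a voice-feedback prompt ("did you mean...?").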
Getting started
Implement a Web Speech API wrapper, add a voice command parser, integrate MediaPipe Hands for gesture tracking, and enhance ARIA labels.
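On the backend side, `routes/voice_commands.py` would expose the parser over HTTP. A hypothetical Flask blueprint sketch is below; the URL prefix, payload shape, and stub response are assumptions about the project's API, not its actual contract.

```python
from flask import Blueprint, jsonify, request

# Hypothetical blueprint for routes/voice_commands.py; prefix is assumed.
voice_bp = Blueprint("voice_commands", __name__, url_prefix="/api/voice")

@voice_bp.route("/command", methods=["POST"])
def handle_command():
    payload = request.get_json(silent=True) or {}
    transcript = (payload.get("transcript") or "").strip()
    if not transcript:
        return jsonify({"error": "empty transcript"}), 400
    # The real service would call the NLP intent parser here;
    # this stub echoes the transcript with a placeholder action.
    return jsonify({"transcript": transcript, "action": "noop"}), 200
```

Keeping the route thin (validate, delegate to `voice_processing_service.py`, return JSON) keeps the intent-parsing logic unit-testable without a running server.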
Tests
- Unit: voice command parsing ("draw circle" → correct action)
- Integration: gesture draws a valid stroke on the canvas
- Accessibility: screen reader announces canvas state changes
- UI: voice panel with microphone activity indicator
- Performance: command processing < 300 ms
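The 300 ms budget can be enforced as a plain unit test around the parsing step. A self-contained sketch; the inline parser here is a deliberately tiny stand-in for the real one, so only the timing pattern carries over.

```python
import re
import time

def parse_command(text):
    # Stand-in parser, just enough to time; the real parser is richer.
    m = re.search(r"draw (?:a )?(\w+) (circle|square|line)", text.lower())
    if m:
        return {"action": "draw", "color": m.group(1), "shape": m.group(2)}
    return None

start = time.perf_counter()
action = parse_command("draw a red circle")
elapsed_ms = (time.perf_counter() - start) * 1000
```

The same harness works for the end-to-end path (speech service → parser → dispatch) by moving the `perf_counter()` calls around the full pipeline.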
Resources
Web Speech API, OpenAI Whisper, MediaPipe Hands, TensorFlow.js HandPose, WCAG guidelines, ARIA best practices