
AI-Enhanced Accessibility: Voice-to-Canvas and Gesture Recognition #30

@bchou9

Description


Implement AI-powered accessibility features, including voice commands for drawing ("draw a red circle"), speech-to-text annotations, gesture recognition for hands-free drawing, and screen reader optimizations, to make ResCanvas accessible to users of all abilities.

Current workflow

The canvas currently requires mouse or touchscreen input, with limited support for screen readers or alternative input methods.

Proposed solution

Integrate speech recognition (Web Speech API, OpenAI Whisper) for voice commands and dictation. Add webcam-based gesture recognition using computer vision (MediaPipe, TensorFlow.js). Enhance screen reader support with descriptive announcements of canvas state.
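
The voice command side can be split into a browser speech layer and a pure intent parser; the parser is the testable core. A minimal sketch (all names hypothetical, e.g. `parseVoiceCommand`) of mapping a transcript to a structured drawing action:

```javascript
// Hypothetical intent parser: maps a spoken transcript such as
// "draw a red circle" to a structured action the canvas can execute.
const COLORS = ['red', 'blue', 'green', 'black', 'yellow'];
const SHAPES = ['circle', 'square', 'rectangle', 'line'];
const TOOLS = ['pen', 'eraser', 'brush'];

function parseVoiceCommand(transcript) {
  const words = transcript.toLowerCase().split(/\s+/);
  // Return the first word from `list` that appears in the transcript.
  const has = (list) => list.find((w) => words.includes(w)) || null;

  if (words.includes('undo')) return { action: 'undo' };
  if (words.includes('redo')) return { action: 'redo' };
  if (words.includes('select')) {
    const tool = has(TOOLS);
    if (tool) return { action: 'selectTool', tool };
  }
  if (words.includes('color')) {
    const color = has(COLORS);
    if (color) return { action: 'setColor', color };
  }
  if (words.includes('draw')) {
    const shape = has(SHAPES);
    if (shape) return { action: 'draw', shape, color: has(COLORS) || 'black' };
  }
  // Unrecognized commands are surfaced rather than guessed at.
  return { action: 'unknown', transcript };
}
```

Keeping the parser free of browser APIs means it can be unit-tested directly, while the Web Speech API wrapper only feeds it transcripts.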

Technical requirements

Files to create:

  • frontend/src/components/Accessibility/VoiceCommandPanel.jsx
  • frontend/src/components/Accessibility/GestureRecognition.jsx
  • frontend/src/hooks/useVoiceCommands.js
  • frontend/src/hooks/useGestureControl.js
  • frontend/src/services/speechService.js
  • backend/services/voice_processing_service.py

Files to modify:

  • Canvas.js
  • Toolbar.js
  • Room.jsx
  • frontend/src/styles/accessibility.css

Backend:

  • routes/voice_commands.py
  • config.py

Skills

Speech recognition, NLP intent parsing, computer vision, gesture recognition, React accessibility, ARIA, WebRTC

Key features

  • Voice commands for drawing tools ("select pen", "change color to blue")
  • Speech-to-text for canvas annotations
  • Gesture recognition (hand tracking for drawing in air)
  • Eye tracking integration for cursor control
  • Screen reader optimizations with canvas state descriptions
  • Keyboard-only navigation enhancements
  • High contrast and colorblind modes
  • Undo/redo via voice
  • Voice feedback for actions
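
For hand-tracking-based drawing, raw fingertip positions from a landmark model tend to jitter. A small smoothing filter between the tracker and the stroke renderer helps; this sketch (hypothetical `createLandmarkSmoother`, not tied to any specific MediaPipe API) applies an exponential moving average to each incoming point:

```javascript
// Hypothetical jitter filter for hand-landmark coordinates: an
// exponential moving average blends each new point with the previous
// smoothed point before it is mapped to a canvas stroke.
function createLandmarkSmoother(alpha = 0.4) {
  let prev = null; // last smoothed point
  return function smooth(point) {
    if (!prev) {
      prev = { x: point.x, y: point.y }; // first point passes through
    } else {
      prev = {
        x: alpha * point.x + (1 - alpha) * prev.x,
        y: alpha * point.y + (1 - alpha) * prev.y,
      };
    }
    return prev;
  };
}
```

A lower `alpha` smooths more aggressively at the cost of added lag, which matters for the real-time latency target below.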

Challenges

Accent and language variation, noisy environments, gesture accuracy, privacy concerns around camera/microphone access, latency for real-time commands, and command disambiguation.
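
One way to address disambiguation of misheard commands is to snap the transcript to the nearest known command by edit distance, rejecting matches that are too far off. A sketch (hypothetical `disambiguate` helper; the threshold is an assumption to tune):

```javascript
// Standard Levenshtein edit distance via dynamic programming.
function levenshtein(a, b) {
  const dp = Array.from({ length: a.length + 1 }, (_, i) => [i]);
  for (let j = 1; j <= b.length; j++) dp[0][j] = j;
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      dp[i][j] = Math.min(
        dp[i - 1][j] + 1, // deletion
        dp[i][j - 1] + 1, // insertion
        dp[i - 1][j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1) // substitution
      );
    }
  }
  return dp[a.length][b.length];
}

// Hypothetical disambiguation: pick the closest known command, or
// null when even the best match exceeds the distance threshold.
function disambiguate(transcript, commands, maxDistance = 3) {
  let best = null;
  let bestDist = Infinity;
  for (const cmd of commands) {
    const d = levenshtein(transcript.toLowerCase(), cmd);
    if (d < bestDist) { best = cmd; bestDist = d; }
  }
  return bestDist <= maxDistance ? best : null;
}
```

Returning `null` for distant matches lets the UI ask the user to repeat the command instead of executing a wrong guess.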

Getting started

Implement a Web Speech API wrapper, add a voice command parser, integrate MediaPipe for gesture recognition, and enhance ARIA labels.
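
Since a `<canvas>` element is opaque to assistive technology, the screen reader enhancements hinge on generating text descriptions of state changes. A sketch of a hypothetical `describeCanvasAction` helper that builds the announcement string (the action/state shapes here are assumptions, not an existing ResCanvas API):

```javascript
// Hypothetical helper that turns a canvas state change into a short
// sentence for assistive technology, e.g.
// "Red circle added. 3 shapes on canvas."
function describeCanvasAction(action, state) {
  const count = state.shapeCount;
  const noun = count === 1 ? 'shape' : 'shapes';
  switch (action.type) {
    case 'add':
      return `${capitalize(action.color)} ${action.shape} added. ${count} ${noun} on canvas.`;
    case 'undo':
      return `Last action undone. ${count} ${noun} on canvas.`;
    case 'clear':
      return 'Canvas cleared.';
    default:
      return `Canvas updated. ${count} ${noun} on canvas.`;
  }
}

function capitalize(s) {
  return s.charAt(0).toUpperCase() + s.slice(1);
}
```

In a React component, this string could be rendered into a visually hidden element with `aria-live="polite"` so screen readers announce each change without stealing focus.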

Tests

  • Unit: voice command parsing ("draw circle" → correct action)
  • Integration: gesture draws a valid stroke on canvas
  • Accessibility: screen reader announces canvas state changes
  • UI: voice panel with microphone indicator
  • Performance: command processing < 300 ms

Resources

Web Speech API, OpenAI Whisper, MediaPipe Hands, TensorFlow.js HandPose, WCAG guidelines, ARIA best practices
