A cutting-edge desktop automation system that combines gesture recognition, voice commands, and AI-powered automation to create an intuitive and powerful computer control experience.
- Voice-Activated AI Assistant: Say "hey jarvis" to activate voice commands
- Desktop Automation: Take screenshots, open applications, control system settings
- Natural Language Processing: Understand and execute complex commands
- Persistent Console Interface: Interactive AI agent with real-time feedback
- 8 Gesture Types: Up, Down, Left, Right, Stop, Undo, Redo, None
- Machine Learning Model: Trained on 750+ gesture samples
- Real-time Processing: Instant gesture recognition and response
- Customizable Mappings: Configurable key bindings for each gesture
- Air Controller: Gesture-based gaming controls
- Adaptive AI: Self-improving gesture recognition system
- Data Collection: Built-in gesture training and data collection tools (a landmark-capture sketch follows this list)
- Model Training: Automated machine learning pipeline
- Responsive Design: Works on desktop and mobile devices
- Real-time Feedback: Live gesture detection and status updates
- Process Management: Start/stop individual components
- Professional UI: Clean, modern interface with dark theme
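To give a sense of how the data-collection step of the gesture pipeline might work, here is a minimal sketch assuming MediaPipe Hands and a per-label CSV layout. The LABEL value and DATA/ path are hypothetical; the project's actual tool is virtual-controlls/adaptive-ai/collect_gestures.py.

```python
# Hypothetical landmark-capture loop; the project's real tool is
# virtual-controlls/adaptive-ai/collect_gestures.py.
import csv
import cv2
import mediapipe as mp

LABEL = "up"  # gesture being recorded in this session (hypothetical)

hands = mp.solutions.hands.Hands(max_num_hands=1)
cap = cv2.VideoCapture(0)

# Assumes a DATA/ directory already exists
with open(f"DATA/{LABEL}.csv", "a", newline="") as f:
    writer = csv.writer(f)
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # MediaPipe expects RGB; OpenCV captures BGR
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            lm = results.multi_hand_landmarks[0].landmark
            # Flatten the 21 (x, y, z) hand landmarks into one row plus the label
            writer.writerow([v for p in lm for v in (p.x, p.y, p.z)] + [LABEL])
        cv2.imshow("collect", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break

cap.release()
cv2.destroyAllWindows()
```

Each row of such a CSV (63 coordinates plus a label) can feed a standard classifier, which matches the "750+ gesture samples" training setup described above.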
- Python 3.11 or higher
- Windows 10/11 (primary support)
- Webcam for gesture recognition
- Microphone for voice commands
1. Clone the Repository

```
git clone https://github.com/vigneshbs33/29-Codekrafters.git
cd 29-Codekrafters
```

2. Install Dependencies

```
# Install MediaPipe and core dependencies
python install_mediapipe.py

# Activate virtual environment
mediapipe_env\Scripts\activate

# Install additional requirements
pip install -r requirements.txt
pip install -r Agentic-AI/requirements.txt
```
3. Install Agentic AI Dependencies

```
cd Agentic-AI
pip install SpeechRecognition PyAudio pycaw mss
cd ..
```
4. Activate Virtual Environment

```
mediapipe_env\Scripts\activate
```
5. Run the Flask Application

```
python app.py
```
6. Access the Web Interface
   - Open your browser and go to http://localhost:5000
   - The interface will show all available controls
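The web interface also starts and stops the individual components (see Process Management above). The following is a minimal sketch of how such start/stop endpoints could be wired with Flask and subprocess; the route names and the scripts mapping are illustrative assumptions, not the actual contents of app.py.

```python
# Hypothetical sketch of component start/stop endpoints; route names
# and script paths are assumptions, not the project's actual app.py.
import subprocess
from flask import Flask, jsonify

app = Flask(__name__)
processes = {}  # component name -> subprocess.Popen

@app.route("/start/<name>")
def start(name):
    scripts = {"air-controller": "virtual-controlls/air-controller.py"}
    if name in scripts and name not in processes:
        processes[name] = subprocess.Popen(["python", scripts[name]])
    return jsonify(running=list(processes))

@app.route("/stop/<name>")
def stop(name):
    proc = processes.pop(name, None)
    if proc:
        proc.terminate()  # ask the component process to exit
    return jsonify(running=list(processes))

if __name__ == "__main__":
    app.run(port=5000)
```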
- Start Gesture Recognition: Click "Start Gesture Recognition"
- Available Gestures (each is dispatched to a key press, as sketched after this list):
  - Up: Navigate up or increase values
  - Down: Navigate down or decrease values
  - Left: Navigate left or previous
  - Right: Navigate right or next
  - Stop: Stop the current action
  - Undo: Undo the last action
  - Redo: Redo the last action
  - None: No action
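A minimal sketch of that dispatch, assuming pyautogui and the key_mappings.json format shown later in this README; the file path and the dispatch function are illustrative, not the project's actual code.

```python
# Hypothetical dispatch from a predicted gesture label to a key press,
# using the key_mappings.json format shown in the configuration example below.
import json
import pyautogui

with open("virtual-controlls/adaptive-ai/key_mappings.json") as f:
    mappings = json.load(f)  # e.g. {"up": "w", "undo": "ctrl+z", ...}

def dispatch(gesture: str) -> None:
    key = mappings.get(gesture)
    if key is None:  # covers "none" and any unmapped labels
        return
    if "+" in key:
        pyautogui.hotkey(*key.split("+"))  # combos such as ctrl+z
    else:
        pyautogui.press(key)               # single keys such as w or space

dispatch("undo")  # presses Ctrl+Z
```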
- Activate: Click "Call Agent (Opens Console)"
- Wake Word: Say "hey jarvis" to activate (a minimal listener sketch follows this list)
- Commands:
- "take a screenshot"
- "open chrome"
- "increase volume"
- "what time is it"
- And many more...
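A minimal sketch of how a wake-word loop might work, using the SpeechRecognition package (and PyAudio for the microphone) installed earlier and Google's free recognizer. The real agent lives in Agentic-AI/agent_cmds.py, so treat this as an assumption-laden illustration rather than its actual implementation.

```python
# Hypothetical wake-word loop; the real agent is Agentic-AI/agent_cmds.py.
import speech_recognition as sr

recognizer = sr.Recognizer()

def listen() -> str:
    with sr.Microphone() as source:
        recognizer.adjust_for_ambient_noise(source, duration=0.5)
        audio = recognizer.listen(source)
    try:
        return recognizer.recognize_google(audio).lower()
    except sr.UnknownValueError:  # speech was unintelligible
        return ""

while True:
    if "hey jarvis" in listen():
        print("Listening for a command...")
        command = listen()
        print(f"Heard: {command}")  # hand off to the command dispatcher
```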
- Start: Click "Start Air Controller"
- Gaming Controls: Use gestures for gaming input
- Customizable: Configure gesture-to-key mappings
```
Paraso/
├── app.py                       # Main Flask application
├── requirements.txt             # Python dependencies
├── install_mediapipe.py         # MediaPipe installation script
├── templates/                   # Web interface templates
│   ├── index.html               # Main interface
│   ├── css/                     # Stylesheets
│   └── js/                      # JavaScript files
├── Agentic-AI/                  # AI voice assistant
│   ├── agent_cmds.py            # Main AI agent script
│   ├── requirements.txt         # AI dependencies
│   └── README.md                # AI documentation
├── virtual-controlls/           # Gesture recognition system
│   ├── adaptive-ai/             # Machine learning components
│   │   ├── DATA/                # Training data
│   │   ├── models/              # Trained ML models
│   │   └── collect_gestures.py  # Data collection tool
│   └── air-controller.py        # Gaming controller
└── mediapipe_env/               # Virtual environment
```
Edit virtual-controlls/adaptive-ai/key_mappings.json to customize gesture-to-key mappings:
```json
{
  "up": "w",
  "down": "s",
  "left": "a",
  "right": "d",
  "stop": "space",
  "undo": "ctrl+z",
  "redo": "ctrl+y"
}
```

The AI agent supports natural language commands for the categories below; a minimal routing sketch follows the list:
- System Control: Volume, brightness, power management
- Application Launch: Open programs, websites, files
- Screenshots: Capture screen, specific windows, or regions
- Information: Time, date, system status
- Custom Actions: User-defined automation tasks
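As a rough illustration of how such commands might be routed, here is a minimal keyword dispatcher. The handler names, matching rules, and use of mss for screenshots are assumptions for the sketch, not the agent's actual implementation.

```python
# Hypothetical keyword router for the command categories above;
# handler names and matching rules are illustrative assumptions.
import datetime
import mss

def take_screenshot() -> None:
    with mss.mss() as sct:
        sct.shot(output="screenshot.png")  # captures the primary monitor

def tell_time() -> None:
    print(datetime.datetime.now().strftime("It is %H:%M"))

HANDLERS = {
    "screenshot": take_screenshot,
    "time": tell_time,
}

def route(command: str) -> None:
    for keyword, handler in HANDLERS.items():
        if keyword in command:
            handler()
            return
    print(f"Sorry, I don't know how to: {command}")

route("take a screenshot")  # writes screenshot.png
route("what time is it")    # prints the current time
```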
This project is Team Codekrafters' entry in the Hackademia 2025 hackathon.
- Team Name: Codekrafters
- Team Captain: @vigneshbs33
- Repository: 29-Codekrafters
- Follow hackathon rules and guidelines
- Maintain code quality and documentation
- Test features thoroughly before committing
- Respect the 60% AI-generated code limit
This project is developed for the Hackademia 2025 hackathon hosted by National College Jayanagar.
- Multi-language voice support
- Advanced gesture combinations
- Cloud-based AI processing
- Mobile app companion
- Integration with smart home devices
- Custom gesture training interface
- Performance optimization
- Cross-platform support
- ModuleNotFoundError: Ensure the virtual environment is activated
- Camera Access: Grant camera permissions to the application (a quick webcam check follows this list)
- Microphone Issues: Check microphone permissions and settings
- Gesture Recognition: Ensure good lighting and clear hand movements
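For camera problems, a quick way to verify that Python can reach the webcam at all (a generic OpenCV check, independent of this project's code):

```python
# Quick diagnostic: verify Python can open the default webcam.
import cv2

cap = cv2.VideoCapture(0)  # try 1, 2, ... if you have multiple cameras
if cap.isOpened():
    ok, _ = cap.read()
    print("Webcam OK" if ok else "Webcam opened but returned no frames")
else:
    print("Could not open webcam; check permissions and other apps using it")
cap.release()
```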
For issues and questions, please refer to the hackathon organizers or create an issue in the repository.
Built by Team Codekrafters for Hackademia 2025