Skip to content

emanueleielo/VaibeVoice

Folders and files

NameName
Last commit message
Last commit date

Latest commit

ย 

History

10 Commits
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 

Repository files navigation

VaibVoice - AI-Powered Voice Transcription

Installation โ€ข Features โ€ข Usage โ€ข Technologies โ€ข Contributing โ€ข License

img.png

๐ŸŽ™๏ธ What is VaibVoice?

VaibVoice is an advanced AI-powered voice transcription application that converts your speech into intelligently formatted text. Designed to boost productivity, VaibVoice eliminates the need for manual typing while automatically formatting your content based on context.

๐Ÿš€ Problem Solved

  • No More Typing: Speak naturally and let VaibVoice handle the text conversion
  • Smart Formatting: Automatically formats emails, messages, and prompts appropriately
  • Time Saving: Reduce the time spent on writing and formatting content
  • Accessibility: Makes content creation accessible for everyone, including those with typing difficulties

๐Ÿ‘ฅ Who Is It For?

  • Professionals who need to create content quickly and efficiently
  • Individuals with typing difficulties or RSI (Repetitive Strain Injury)
  • Anyone looking to boost their writing productivity
  • Content creators, programmers, and business professionals

โœจ Features

๐ŸŽค Real-Time Voice Transcription

  • Powered by OpenAI's GPT-4o models for highly accurate transcription
  • Support for multiple languages
  • Customizable activation key for recording

๐Ÿง  Intelligent Formatting

  • Automatically detects content type (email, message, prompt)
  • Formats text appropriately based on context
  • Removes formatting instructions from the final output

๐Ÿ–ฅ๏ธ Intuitive User Interface

  • Clean, modern dashboard with usage statistics
  • Interactive playground for testing transcription
  • Comprehensive history management
  • Customizable settings

โš™๏ธ Personalization Options

  • Language selection for transcription
  • Configurable recording hotkey
  • Customizable notification sounds
  • Selection of AI models for transcription and formatting

๐Ÿ› ๏ธ Technologies

Backend

  • Python 3.8+
  • FastAPI for RESTful API
  • SQLite for data storage
  • OpenAI API for transcription and formatting

Frontend

  • React with TypeScript
  • Tailwind CSS for styling
  • Vite as build tool
  • shadcn/ui components

๐Ÿ“‹ Installation

Prerequisites

  • Python 3.8 or higher
  • Node.js and npm
  • OpenAI API key

Step-by-Step Installation

  1. Clone the repository

    git clone https://github.com/yourusername/vaibvoice.git
    cd vaibvoice
  2. Install Python dependencies

    pip install -r requirements.txt
  3. Run the application

    python run.py

    The script will automatically:

    • Install any missing Python dependencies
    • Build the frontend if it hasn't been built already (requires Node.js)
  4. Configure your OpenAI API key

    • Navigate to Settings in the application
    • Enter your OpenAI API key
    • Save your settings

๐ŸŽฎ Usage

Initial Setup

  1. Set your OpenAI API key in the Settings page
  2. Select your preferred language for transcription
  3. Configure your recording hotkey (default is Alt)

Basic Usage

  1. Place your cursor where you want the transcription to appear
  2. Press and hold your configured hotkey
  3. Speak clearly into your microphone
  4. Release the key when finished speaking
  5. Wait for processing to complete
  6. Your formatted text will appear at the cursor position

Smart Formatting Tips

For best results, start your dictation with instructions like:

  • "This is an email to my colleague, format it professionally..."
  • "Format this as a casual message to my friend..."
  • "This is a prompt for an AI system..."

The AI will understand your instructions, format accordingly, and remove the instructions from the final text.

๐Ÿ—๏ธ Project Architecture

Directory Structure

vaibvoice/
โ”œโ”€โ”€ gui/                  # Frontend React application
โ”œโ”€โ”€ vaibvoice/            # Backend Python application
โ”‚   โ”œโ”€โ”€ api/              # FastAPI routes and endpoints
โ”‚   โ”œโ”€โ”€ core/             # Core functionality (recording, transcription)
โ”‚   โ”œโ”€โ”€ db/               # Database models and repositories
โ”‚   โ””โ”€โ”€ models/           # Data models
โ”œโ”€โ”€ run.py                # Main entry point
โ””โ”€โ”€ requirements.txt      # Python dependencies

Data Flow

  1. User activates recording with the configured hotkey
  2. Audio is captured and sent to the OpenAI API
  3. Transcription is processed and formatted
  4. Formatted text is returned to the user interface
  5. Transcription is saved to the history database

๐Ÿ‘ฅ Contributing

We welcome contributions to VaibVoice! Here's how you can help:

Development Setup

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/amazing-feature
  3. Make your changes
  4. Commit your changes: git commit -m 'Add some amazing feature'
  5. Push to the branch: git push origin feature/amazing-feature
  6. Open a Pull Request

Future Roadmap

  • Voice activation mode
  • Export functionality for transcription history
  • Additional language support
  • Mobile application
  • Cloud synchronization

๐Ÿ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.


About

๐ŸŽ™๏ธ VaibVoice โ€“ Real-time AI voice transcription with smart formatting and full customization.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors