VaibVoice - AI-Powered Voice Transcription

Installation • Features • Usage • Technologies • Contributing • License

🎙️ What is VaibVoice?

VaibVoice is an advanced AI-powered voice transcription application that converts your speech into intelligently formatted text. Designed to boost productivity, VaibVoice eliminates the need for manual typing while automatically formatting your content based on context.

🚀 Problem Solved

No More Typing: Speak naturally and let VaibVoice handle the text conversion
Smart Formatting: Automatically formats emails, messages, and prompts appropriately
Time Saving: Reduce the time spent on writing and formatting content
Accessibility: Makes content creation accessible for everyone, including those with typing difficulties

👥 Who Is It For?

Professionals who need to create content quickly and efficiently
Individuals with typing difficulties or RSI (Repetitive Strain Injury)
Anyone looking to boost their writing productivity
Content creators, programmers, and business professionals

✨ Features

🎤 Real-Time Voice Transcription

Powered by OpenAI's GPT-4o models for highly accurate transcription
Support for multiple languages
Customizable activation key for recording

🧠 Intelligent Formatting

Automatically detects content type (email, message, prompt)
Formats text appropriately based on context
Removes formatting instructions from the final output

🖥️ Intuitive User Interface

Clean, modern dashboard with usage statistics
Interactive playground for testing transcription
Comprehensive history management
Customizable settings

⚙️ Personalization Options

Language selection for transcription
Configurable recording hotkey
Customizable notification sounds
Selection of AI models for transcription and formatting

🛠️ Technologies

Backend

Python 3.8+
FastAPI for RESTful API
SQLite for data storage
OpenAI API for transcription and formatting

Frontend

React with TypeScript
Tailwind CSS for styling
Vite as build tool
shadcn/ui components

📋 Installation

Prerequisites

Python 3.8 or higher
Node.js and npm
OpenAI API key

Step-by-Step Installation

Clone the repository

git clone https://github.com/yourusername/vaibvoice.git
cd vaibvoice

Install Python dependencies
```
pip install -r requirements.txt
```
Run the application
```
python run.py
```
The script will automatically:
- Install any missing Python dependencies
- Build the frontend if it hasn't been built already (requires Node.js)
Configure your OpenAI API key
- Navigate to Settings in the application
- Enter your OpenAI API key
- Save your settings

🎮 Usage

Initial Setup

Set your OpenAI API key in the Settings page
Select your preferred language for transcription
Configure your recording hotkey (default is Alt)

Basic Usage

Place your cursor where you want the transcription to appear
Press and hold your configured hotkey
Speak clearly into your microphone
Release the key when finished speaking
Wait for processing to complete
Your formatted text will appear at the cursor position

Smart Formatting Tips

For best results, start your dictation with instructions like:

"This is an email to my colleague, format it professionally..."
"Format this as a casual message to my friend..."
"This is a prompt for an AI system..."

The AI will understand your instructions, format accordingly, and remove the instructions from the final text.

🏗️ Project Architecture

Directory Structure

vaibvoice/
├── gui/                  # Frontend React application
├── vaibvoice/            # Backend Python application
│   ├── api/              # FastAPI routes and endpoints
│   ├── core/             # Core functionality (recording, transcription)
│   ├── db/               # Database models and repositories
│   └── models/           # Data models
├── run.py                # Main entry point
└── requirements.txt      # Python dependencies

Data Flow

User activates recording with the configured hotkey
Audio is captured and sent to the OpenAI API
Transcription is processed and formatted
Formatted text is returned to the user interface
Transcription is saved to the history database

👥 Contributing

We welcome contributions to VaibVoice! Here's how you can help:

Development Setup

Fork the repository
Create a feature branch: git checkout -b feature/amazing-feature
Make your changes
Commit your changes: git commit -m 'Add some amazing feature'
Push to the branch: git push origin feature/amazing-feature
Open a Pull Request

Future Roadmap

Voice activation mode
Export functionality for transcription history
Additional language support
Mobile application
Cloud synchronization

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
gui		gui
vaibvoice		vaibvoice
LICENSE		LICENSE
README.md		README.md
img.png		img.png
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt
run.py		run.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

VaibVoice - AI-Powered Voice Transcription

🎙️ What is VaibVoice?

🚀 Problem Solved

👥 Who Is It For?

✨ Features

🎤 Real-Time Voice Transcription

🧠 Intelligent Formatting

🖥️ Intuitive User Interface

⚙️ Personalization Options

🛠️ Technologies

Backend

Frontend

📋 Installation

Prerequisites

Step-by-Step Installation

🎮 Usage

Initial Setup

Basic Usage

Smart Formatting Tips

🏗️ Project Architecture

Directory Structure

Data Flow

👥 Contributing

Development Setup

Future Roadmap

📄 License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

VaibVoice - AI-Powered Voice Transcription

🎙️ What is VaibVoice?

🚀 Problem Solved

👥 Who Is It For?

✨ Features

🎤 Real-Time Voice Transcription

🧠 Intelligent Formatting

🖥️ Intuitive User Interface

⚙️ Personalization Options

🛠️ Technologies

Backend

Frontend

📋 Installation

Prerequisites

Step-by-Step Installation

🎮 Usage

Initial Setup

Basic Usage

Smart Formatting Tips

🏗️ Project Architecture

Directory Structure

Data Flow

👥 Contributing

Development Setup

Future Roadmap

📄 License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages