🎤 Smart Audio Transcript

Turn your voice into text with AI-powered transcription

A modern, feature-rich speech-to-text application that combines the power of Google's Gemini AI with a beautiful web interface and floating recording controls. Perfect for content creators, developers, students, and anyone who needs fast, accurate transcription.

✨ Features

🎯 Smart Recording

One-click recording with customizable hotkeys
Pause & resume without losing your audio
Multiple-microphone support with easy device switching

🤖 AI-Powered Transcription

Google Gemini AI for accurate, context-aware transcription
Multi-language support for 120+ languages
Customizable AI prompts for specialized use cases

🖥️ Modern Interface

Beautiful UI with dark theme and responsive design
Floating recording overlay that stays on top while you work
System tray integration for background operation
Real-time status and audio feedback

⚡ Productivity Features

Auto-paste transcribed text directly to your active application
Auto-save to clipboard: transcribed text is automatically copied to the clipboard
Background operation: keeps running in the system tray

🚀 Quick Start

1. Install Dependencies

pip install -r requirements.txt

2. Get Your API Key

Get a free Google Gemini API key

3. Run the App

python run.py

4. Add Your API Key (in the app)

In the left sidebar, click API Keys
Paste your key into the Gemini API Key field
The key is saved automatically (you can change it anytime)

Tip: Alternatively, you can create a .env file with GEMINI_API_KEY=your_key.
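As a rough illustration of the .env option, the sketch below shows one way an app could look up GEMINI_API_KEY. The function name load_gemini_key, its parameters, and the rule that a real environment variable wins over the file are assumptions made for this example; the project may read the file differently (for instance through a dotenv library).

```python
import os
from pathlib import Path


def load_gemini_key(env_path=".env", environ=None):
    """Return the Gemini API key from the environment or a .env file, else None.

    Hypothetical helper: the name and the lookup order are assumptions.
    """
    environ = os.environ if environ is None else environ
    # Assumption: a real environment variable takes precedence over the file.
    if environ.get("GEMINI_API_KEY"):
        return environ["GEMINI_API_KEY"]
    path = Path(env_path)
    if not path.is_file():
        return None
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blanks, comments, and lines that are not KEY=VALUE pairs.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() == "GEMINI_API_KEY":
            # Drop optional surrounding quotes; treat an empty value as missing.
            return value.strip().strip("'\"") or None
    return None
```

Calling load_gemini_key() with no arguments checks the process environment first and then a .env file in the current directory.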

🎮 How It Works

Start Recording - Press Ctrl+Shift+Space to start or stop recording
Speak Naturally - The floating overlay shows recording status
AI Processing - Gemini AI transcribes with context awareness
Auto-Paste - Text appears in your active application and is saved to clipboard

🛠️ Configuration

Hotkeys

Toggle Mode: Press once to start, press again to stop (default: Ctrl+Shift+Space)
Hold Mode: Hold to record, release to stop (default: Ctrl)
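The difference between the two modes can be sketched as a small state machine. This is an illustration only: the class HotkeyController and its methods are hypothetical names, not the project's code, and the actual key capture (done by a hotkey library in a real app) is left out.

```python
class HotkeyController:
    """Tracks recording state for toggle and hold hotkey modes (hypothetical)."""

    def __init__(self, mode="toggle"):
        if mode not in ("toggle", "hold"):
            raise ValueError("mode must be 'toggle' or 'hold'")
        self.mode = mode
        self.recording = False
        self._key_down = False

    def on_press(self):
        # Ignore keyboard auto-repeat: only the first press event counts.
        if self._key_down:
            return self.recording
        self._key_down = True
        if self.mode == "toggle":
            # Toggle mode flips the state on every distinct press.
            self.recording = not self.recording
        else:
            # Hold mode records for as long as the key stays down.
            self.recording = True
        return self.recording

    def on_release(self):
        self._key_down = False
        if self.mode == "hold":
            self.recording = False
        return self.recording
```

In toggle mode a release leaves the state unchanged, so recording continues until the next press; in hold mode the release is what stops it.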

Audio Settings

Silence Threshold: Adjust sensitivity for your environment
Microphone Selection: Choose your preferred input device
Ambient Calibration: Automatic noise floor detection
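To make the silence threshold and ambient calibration settings more concrete, here is a minimal sketch of one common approach: measure the RMS level of a few chunks of background noise and set the threshold slightly above it. The function names and the margin value of 1.5 are assumptions for this example; the README does not say how the app computes its noise floor.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of PCM sample values."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def calibrate_noise_floor(ambient_chunks, margin=1.5):
    """Derive a silence threshold from chunks captured while nobody speaks.

    The margin (an assumed default) keeps ordinary background noise
    below the threshold.
    """
    levels = [rms(chunk) for chunk in ambient_chunks]
    if not levels:
        raise ValueError("need at least one ambient chunk")
    return margin * (sum(levels) / len(levels))


def is_silent(chunk, threshold):
    """A chunk counts as silence when its RMS level is below the threshold."""
    return rms(chunk) < threshold
```

Raising the margin makes the detector more tolerant of a noisy room, at the cost of possibly treating quiet speech as silence.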

AI Customization

Custom Prompts: Tailor transcription for your specific needs
Language Preservation: Maintains original scripts and accents
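One way to picture these two options is as pieces of the instruction text sent along with the audio. The sketch below is an assumption about how such a prompt could be assembled; DEFAULT_PROMPT, build_prompt, and the wording of each instruction are hypothetical and not taken from the project.

```python
# Hypothetical base instruction; the app's real default prompt is not documented here.
DEFAULT_PROMPT = "Transcribe the audio accurately."


def build_prompt(custom_instructions="", preserve_language=True):
    """Combine the base prompt with optional language and custom instructions."""
    parts = [DEFAULT_PROMPT]
    if preserve_language:
        parts.append(
            "Keep the speaker's original language and script; do not translate."
        )
    custom = custom_instructions.strip()
    if custom:
        # User-supplied instructions go last so they can refine the defaults.
        parts.append(custom)
    return "\n".join(parts)
```

For example, build_prompt("Format the result as meeting notes.") would append that sentence after the two default instructions.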

🏗️ Architecture

Smart Audio Transcript uses a modern hybrid architecture:

🌐 Web Interface (Eel) - Settings and configuration
🎯 Native Overlay (CustomTkinter) - Floating recording controls
🗂️ System Tray (pystray) - Background management
🎤 Core Engine (Python) - Audio processing & AI integration

This gives you the best of both worlds: a modern web UI for settings and responsive native controls for recording.
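A hybrid layout like this needs some way for several front ends to drive one engine. The sketch below shows a generic pattern for that, using standard-library queues and a worker thread. It is an assumption about how such components could communicate, not a description of the project's code; engine_worker, run_demo, and the command strings are hypothetical, and Eel, CustomTkinter, and pystray are deliberately left out.

```python
import queue
import threading


def engine_worker(commands, events):
    """Core engine loop: consume commands from any UI and publish status events."""
    recording = False
    while True:
        cmd = commands.get()
        if cmd == "quit":
            events.put("stopped")
            break
        if cmd == "toggle":
            recording = not recording
            events.put("recording" if recording else "idle")


def run_demo():
    """Send a few commands to the engine and collect the resulting events."""
    commands, events = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=engine_worker, args=(commands, events))
    worker.start()
    # In a real app these would come from the overlay, tray menu, or web UI.
    for cmd in ("toggle", "toggle", "quit"):
        commands.put(cmd)
    worker.join()
    collected = []
    while not events.empty():
        collected.append(events.get())
    return collected
```

Because every front end only puts messages on the command queue, none of them needs to know about the others, and the engine never touches UI code directly.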

🤝 Contributing

We welcome contributions! Here's how you can help:

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

Made with ❤️ for the open source community

Transform your voice into text with the power of AI

Repository: ahmed0x77/Voice-Transcriber-GUI
