A powerful command-line tool for transcribing voice notes and automatically generating structured meeting minutes in Notion. Perfect for engineers, project managers, and professionals who want to streamline their meeting documentation workflow.
This tool is part of Flocode's open-source initiative to empower civil and structural engineers with practical AI-powered tools. As a free community resource, you're welcome to take this tool, modify it, and make it your own!
Feature | Description |
---|---|
High-Quality Transcription | Utilizes Groq (whisper-large-v3-turbo) and OpenAI Whisper for accurate speech-to-text conversion |
AI-Powered Summarization | Leverages Google Gemini (with OpenAI fallback) to generate structured meeting minutes |
Notion Integration | Automatically creates formatted entries in your Notion database with proper markdown rendering |
Real-Time Notifications | Desktop notifications keep you informed of processing status |
Local Backups | Maintains local markdown backups of every transcription for your records |
Secure Configuration | All API keys managed securely through environment variables |
Drag & Drop Processing | Simply drag audio files onto the batch script for instant processing |
Queue Management | Add multiple files and URLs to a processing queue for batch operations |
Before you begin, ensure you have the following installed:
1. Clone the repository:
git clone https://github.com/your-username/whisper_2.0.git
cd whisper_2.0
2. Install dependencies:
uv sync
3. Install FFmpeg:
Windows (with Chocolatey)
choco install ffmpeg
macOS (with Homebrew)
brew install ffmpeg
Linux (with apt)
sudo apt update && sudo apt install ffmpeg
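After installing, it can help to confirm that FFmpeg is actually reachable from Python. The following is a minimal sketch that uses only the standard library; the helper names `find_ffmpeg` and `ffmpeg_status` are illustrative and are not part of this repository.

```python
import shutil
from typing import Optional


def find_ffmpeg(executable: str = "ffmpeg") -> Optional[str]:
    """Return the full path of the FFmpeg binary, or None if it is not on PATH."""
    return shutil.which(executable)


def ffmpeg_status(executable: str = "ffmpeg") -> str:
    """Return a one-line, human-readable result of the PATH lookup."""
    path = find_ffmpeg(executable)
    if path is None:
        return f"{executable} not found on PATH"
    return f"{executable} found at {path}"


print(ffmpeg_status())
```

If the lookup fails right after installation, opening a new terminal usually refreshes PATH.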
1. Create your .env file:
cp .env.example .env
2. Configure your API keys:
Legend: Required | Optional (recommended) | Optional (nice-to-have)
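A quick way to catch a missing key before running the pipeline is to check the environment up front. The sketch below is illustrative only: the variable names in `REQUIRED_KEYS` and `OPTIONAL_KEYS`, and which of them count as required, are assumptions, so substitute the names listed in your own .env.example.

```python
import os
from typing import List, Mapping, Sequence

# ASSUMPTION: these names are placeholders for illustration;
# use the variable names from your own .env.example instead.
REQUIRED_KEYS = ("GROQ_API_KEY", "NOTION_TOKEN", "NOTION_DATABASE_ID")
OPTIONAL_KEYS = ("OPENAI_API_KEY", "GEMINI_API_KEY")


def missing_keys(
    env: Mapping[str, str], names: Sequence[str] = REQUIRED_KEYS
) -> List[str]:
    """Return the names that are unset or blank in `env`, in the order given."""
    return [name for name in names if not env.get(name, "").strip()]


# Report what is missing from the current process environment.
print("missing required keys:", missing_keys(os.environ))
print("missing optional keys:", missing_keys(os.environ, OPTIONAL_KEYS))
```

This only inspects variables already present in the process environment, so the .env file has to be loaded first by whatever mechanism the project uses.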
Verify everything is configured correctly:
uv run python tests/test_voice_system.py
Drag & Drop Processing:
- Create a desktop shortcut to quick_process.bat
- Drag your audio file onto the shortcut
- Get notified of successful completion
Perfect for managing multiple files:
uv run python scripts/process_voice_notes.py --interactive
Available commands:
- `add <file_or_url>` - Add to processing queue
- `queue` - Show current queue
- `process` - Process next item
- `p` - Process all items
- `clear` - Clear queue
- Direct file paths work too!
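To make the command set concrete, here is a rough sketch of how such a queue could be driven. It is not the code in scripts/process_voice_notes.py: the class name `VoiceNoteQueue` and the `process_item` callback are hypothetical, and treating a bare path as "add it to the queue" is an assumption about what "direct file paths work too" means.

```python
from collections import deque
from typing import Callable, Deque


class VoiceNoteQueue:
    """In-memory FIFO queue driven by one text command at a time."""

    def __init__(self, process_item: Callable[[str], object]) -> None:
        self._items: Deque[str] = deque()
        self._process_item = process_item  # hypothetical pipeline entry point

    def handle(self, line: str) -> str:
        """Execute one command line and return a status message."""
        line = line.strip()
        command, _, argument = line.partition(" ")
        argument = argument.strip()
        if command == "add":
            if not argument:
                return "usage: add <file_or_url>"
            self._items.append(argument)
            return f"queued: {argument}"
        if line == "queue":
            return "\n".join(self._items) or "(queue is empty)"
        if line == "process":
            return self._process_next()
        if line == "p":
            count = 0
            while self._items:
                self._process_next()
                count += 1
            return f"processed {count} item(s)"
        if line == "clear":
            self._items.clear()
            return "queue cleared"
        if line:
            # ASSUMPTION: anything else is a file path or URL to enqueue.
            self._items.append(line)
            return f"queued: {line}"
        return "(nothing to do)"

    def _process_next(self) -> str:
        """Pop the oldest item and hand it to the processing callback."""
        if not self._items:
            return "(queue is empty)"
        item = self._items.popleft()
        self._process_item(item)
        return f"processed: {item}"
```

An interactive loop would call `handle(input("> "))` repeatedly and print each returned message.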
For direct processing:
uv run python scripts/process_voice_notes.py /path/to/your/audio.mp3
Step 1: Audio Capture
- Use VoiceMeeter Banana to record both desktop audio and microphone input into a single audio file for transcription.
Step 2: Instant Processing
- Create a desktop shortcut to quick_process.bat
- After your meeting ends, drag the audio file directly onto the shortcut
- The script handles everything automatically:
  - Transcribes the entire conversation
  - Generates structured meeting minutes
  - Saves to Notion with proper formatting
  - Creates local markdown backup
  - Sends you a completion notification
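As an illustration of the local backup step, the sketch below writes the minutes and the raw transcript to a timestamped markdown file. The function name `write_markdown_backup`, the file naming scheme, and the document layout are all assumptions made for this example, not the format the tool actually produces.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def write_markdown_backup(
    backup_dir: Path,
    audio_name: str,
    minutes_md: str,
    transcript: str,
    when: Optional[datetime] = None,
) -> Path:
    """Write minutes plus transcript to a timestamped .md file and return its path."""
    when = when or datetime.now()
    backup_dir.mkdir(parents=True, exist_ok=True)
    stem = Path(audio_name).stem  # "standup.mp3" -> "standup"
    target = backup_dir / f"{when:%Y-%m-%d_%H%M%S}_{stem}.md"
    body = (
        f"# Meeting minutes: {stem}\n\n"
        f"{minutes_md.strip()}\n\n"
        f"## Transcript\n\n{transcript.strip()}\n"
    )
    target.write_text(body, encoding="utf-8")
    return target
```

Writing the backup before calling any remote API means a failed upload never costs you the transcription.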
For a more traditional approach, use select_and_process.bat to browse and select files through a Windows dialog.
Legal Notice: Always obtain proper consent before recording conversations. Comply with local laws and regulations.
graph TD
A[Audio File] --> B[Queue Management]
B --> C[Audio Chunking]
C --> D[Transcription<br/>Groq/OpenAI Whisper]
D --> E[AI Summarization<br/>Gemini/GPT-4]
E --> F[Notion Integration<br/>Formatted Markdown]
E --> G[Local Backup<br/>Markdown File]
F --> H[Success Notification]
G --> H
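The paired providers in the diagram (Groq/OpenAI Whisper, Gemini/GPT-4) suggest a primary-then-fallback pattern. The sketch below shows one generic way to express that. The helper `first_successful` is hypothetical, and the provider callables stand in for real API clients, which are deliberately not shown.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def first_successful(
    providers: Sequence[Tuple[str, Callable[[], T]]],
) -> Tuple[str, T]:
    """Call each provider in order; return (name, result) from the first that succeeds."""
    errors: List[str] = []
    for name, call in providers:
        try:
            return name, call()
        except Exception as exc:  # a real tool would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

Usage would look like `first_successful([("groq", transcribe_with_groq), ("openai", transcribe_with_openai)])`, where both callables are placeholders you supply.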
Process Flow:
- Queue Management - Files and URLs organized in processing queue
- Audio Processing - Files chunked for optimal API handling
- Transcription - High-quality speech-to-text conversion
- Summarization - AI-powered meeting minutes generation
- Dual Storage - Notion database + local markdown backup
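The chunking step can be pictured as planning time windows over the recording before any audio is cut. This is a minimal sketch under stated assumptions: the 600-second chunk length and 5-second overlap are arbitrary illustrative defaults, and `plan_chunks` is a hypothetical helper rather than the tool's actual chunking logic.

```python
from typing import List, Tuple


def plan_chunks(
    duration_s: float, chunk_s: float = 600.0, overlap_s: float = 5.0
) -> List[Tuple[float, float]]:
    """Split [0, duration_s] into (start, end) windows of at most chunk_s seconds.

    Consecutive windows overlap by overlap_s so that a word cut at one
    boundary appears whole in the neighbouring chunk.
    """
    if chunk_s <= 0 or not 0 <= overlap_s < chunk_s:
        raise ValueError("need chunk_s > 0 and 0 <= overlap_s < chunk_s")
    if duration_s <= 0:
        return []
    chunks: List[Tuple[float, float]] = []
    start = 0.0
    while True:
        end = min(start + chunk_s, duration_s)
        chunks.append((start, end))
        if end >= duration_s:
            return chunks
        # Step back by the overlap so the next window re-covers the boundary.
        start = end - overlap_s
```

Each planned window could then be extracted with FFmpeg's `-ss` (start) and `-t` (duration) options and sent to the transcription API separately.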
We welcome contributions from the engineering community! This is an open-source Flocode initiative.
- Found a bug? Open an issue
- Have an idea? Start a discussion
- Want to contribute? Submit a pull request
This project is licensed under the MIT License - see the LICENSE file for details.
Free for commercial and personal use.
Empowering engineers with practical AI tools, one voice note at a time.
James