joreilly86/whisper_2.0

🎤 Voice Note Transcription & Notion Integration

Python 3.12+ | MIT License | Flocode Community | Open Source

A powerful command-line tool for transcribing voice notes and automatically generating structured meeting minutes in Notion. Perfect for engineers, project managers, and professionals who want to streamline their meeting documentation workflow.


🌊 About Flocode

This tool is part of Flocode's open-source initiative to empower civil and structural engineers with practical AI-powered tools. It is a free community resource: you're welcome to take this tool, modify it, and make it your own! 🚀


✨ Features

| Feature | Description |
| --- | --- |
| 🎤 High-Quality Transcription | Utilizes Groq (whisper-large-v3-turbo) and OpenAI Whisper for accurate speech-to-text conversion |
| 🤖 AI-Powered Summarization | Leverages Google Gemini (with OpenAI fallback) to generate structured meeting minutes |
| 📝 Notion Integration | Automatically creates formatted entries in your Notion database with proper markdown rendering |
| 🔔 Real-Time Notifications | Desktop notifications keep you informed of processing status |
| 💾 Local Backups | Maintains local markdown backups of every transcription for your records |
| 🔒 Secure Configuration | All API keys managed securely through environment variables |
| 🚀 Drag & Drop Processing | Simply drag audio files onto the batch script for instant processing |
| 📊 Queue Management | Add multiple files and URLs to a processing queue for batch operations |

🚀 Getting Started

📋 Prerequisites

Before you begin, ensure you have the following installed:

  • Python 3.12+
  • uv: Fast Python package installer
    pip install uv
  • FFmpeg: Required for audio processing

📦 Installation

1. Clone the repository:

git clone https://github.com/your-username/whisper_2.0.git
cd whisper_2.0

2. Install dependencies:

uv sync

3. Install FFmpeg:

🪟 Windows (with Chocolatey)
choco install ffmpeg
🍎 macOS (with Homebrew)
brew install ffmpeg
🐧 Linux (with apt)
sudo apt update && sudo apt install ffmpeg
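
After installing, it can help to confirm that FFmpeg is actually visible on your PATH before processing any audio. The following is a minimal sketch using only the Python standard library; the function name `ffmpeg_available` is illustrative and is not part of this repository.

```python
import shutil


def ffmpeg_available(executable: str = "ffmpeg") -> bool:
    """Return True if the named executable can be found on PATH."""
    return shutil.which(executable) is not None


# Report the result without raising, so this is safe to run anywhere.
if ffmpeg_available():
    print("FFmpeg found on PATH")
else:
    print("FFmpeg not found; install it before processing audio")
```

On Windows you may need to open a new terminal after installation so that the updated PATH is picked up.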

⚙️ Configuration

1. Create your .env file:

cp .env.example .env

2. Configure your API keys:

| Service | Environment Variable | Purpose | Required |
| --- | --- | --- | --- |
| OpenAI | OPENAI_API_KEY | Transcription & Summarization fallback | ✅ |
| Groq | GROQ_API_KEY | Fast transcription (recommended) | 🔄 |
| Google | GEMINI_API_KEY | Enhanced summarization | 🔄 |
| Notion | NOTION_API_KEY | Database integration | ✅ |
| Notion | NOTION_DATABASE_ID | Target database | ✅ |
| 🏢 | COMPANY_NAME | Personalized minutes | ⚪ |
| 🏢 | COMPANY_SHORTHAND | Company abbreviation | ⚪ |

Legend: ✅ Required | 🔄 Optional (recommended) | ⚪ Optional (nice-to-have)
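
If you want a quick sanity check of your keys before running the tool, something like the sketch below may be useful. It only inspects the process environment, so it assumes the variables from your .env file have already been loaded or exported; how the project itself loads them is not shown here. The variable names come from the table above, while `missing_vars` and the two tuples are hypothetical names for illustration.

```python
import os

# Names taken from the configuration table; the grouping is this sketch's own.
REQUIRED = ("OPENAI_API_KEY", "NOTION_API_KEY", "NOTION_DATABASE_ID")
RECOMMENDED = ("GROQ_API_KEY", "GEMINI_API_KEY")


def missing_vars(names, env=None):
    """Return the names that are unset or blank in env (defaults to os.environ)."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name, "").strip()]


# Print a short report instead of exiting, so the check has no side effects.
for label, names in (("required", REQUIRED), ("recommended", RECOMMENDED)):
    missing = missing_vars(names)
    print(f"Missing {label} variables: {', '.join(missing) if missing else 'none'}")
```

Passing a plain dictionary as `env` makes the helper easy to exercise without touching your real environment.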

🧪 Test Your Setup

Verify everything is configured correctly:

uv run python tests/test_voice_system.py

πŸ’» Usage

🎯 Quick Start (Recommended)

Drag & Drop Processing:

  1. Create a desktop shortcut to quick_process.bat
  2. Drag your audio file onto the shortcut
  3. Get notified of successful completion ✨

πŸ”„ Interactive Mode

Perfect for managing multiple files:

uv run python scripts/process_voice_notes.py --interactive

Available commands:

  • add <file_or_url> - Add to processing queue
  • queue - Show current queue
  • process - Process next item
  • p - Process all items
  • clear - Clear queue
  • Direct file paths work too!
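
To make the command list concrete, here is a rough sketch of how such a queue could be driven. It is not the code in scripts/process_voice_notes.py: `handle_command` and its status messages are hypothetical, `process_item` stands in for the real pipeline, and the sketch assumes that a bare file path is simply added to the queue.

```python
from collections import deque


def handle_command(line, queue, process_item):
    """Apply one interactive command to the queue and return a status message.

    process_item is a caller-supplied callable that handles a single file
    path or URL; it is a stand-in for the real transcription pipeline.
    """
    text = line.strip()
    parts = text.split(maxsplit=1)
    if not parts:
        return "empty input"
    command = parts[0].lower()
    argument = parts[1].strip() if len(parts) > 1 else ""

    if command == "add":
        if not argument:
            return "usage: add <file_or_url>"
        queue.append(argument)
        return f"queued {argument}"
    if not argument:
        if command == "queue":
            return ", ".join(queue) if queue else "queue is empty"
        if command == "process":
            if not queue:
                return "queue is empty"
            process_item(queue.popleft())
            return "processed 1 item"
        if command == "p":
            count = len(queue)
            while queue:
                process_item(queue.popleft())
            return f"processed {count} item(s)"
        if command == "clear":
            queue.clear()
            return "queue cleared"
    # Assumption: anything else is a direct file path or URL, so queue it as-is.
    queue.append(text)
    return f"queued {text}"
```

A `deque` is used so that taking the next item from the front of the queue stays cheap as the queue grows.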

⚑ Command-Line

For direct processing:

uv run python scripts/process_voice_notes.py /path/to/your/audio.mp3

🎯 My Preferred Workflow

🎙️ Recording Setup with VoiceMeeter Banana

Step 1: Audio Capture

  • Use VoiceMeeter Banana to record both desktop audio and microphone input into a single audio file for transcription.

Step 2: Instant Processing

  1. Create a desktop shortcut to quick_process.bat
  2. After your meeting ends, drag the audio file directly onto the shortcut
  3. The script handles everything automatically:
    • ✅ Transcribes the entire conversation
    • ✅ Generates structured meeting minutes
    • ✅ Saves to Notion with proper formatting
    • ✅ Creates local markdown backup
    • ✅ Sends you a completion notification

🎛️ Alternative: GUI File Selection

For a more traditional approach, use select_and_process.bat to browse and select files through a Windows dialog.

βš–οΈ Legal Notice: Always obtain proper consent before recording conversations. Comply with local laws and regulations.

🔧 How It Works

graph TD
    A[📁 Audio File] --> B[🔄 Queue Management]
    B --> C[✂️ Audio Chunking]
    C --> D[🎤 Transcription<br/>Groq/OpenAI Whisper]
    D --> E[🤖 AI Summarization<br/>Gemini/GPT-4]
    E --> F[📝 Notion Integration<br/>Formatted Markdown]
    E --> G[💾 Local Backup<br/>Markdown File]
    F --> H[🔔 Success Notification]
    G --> H
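
The diagram pairs each AI stage with two providers (Groq/OpenAI Whisper for transcription, Gemini/GPT-4 for summarization). One common way to wire up such a pairing is an ordered fallback, sketched below. This illustrates the pattern only and is not the repository's implementation: `first_successful` is a hypothetical helper, and the two stub functions stand in for real API clients.

```python
def first_successful(providers, payload):
    """Try each (name, callable) pair in order.

    Returns (name, result) from the first provider that does not raise.
    Raises RuntimeError listing every failure if all providers fail.
    """
    errors = []
    for name, call in providers:
        try:
            return name, call(payload)
        except Exception as exc:  # broad on purpose: any failure triggers fallback
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Stubs standing in for real clients (assumptions for this example only).
def groq_stub(path):
    raise ConnectionError("rate limited")


def openai_stub(path):
    return f"transcript of {path}"


provider, text = first_successful(
    [("groq", groq_stub), ("openai", openai_stub)], "meeting.mp3"
)
print(provider, "->", text)
```

Collecting the error from every failed provider makes it easier to see why the fallback was needed, or why nothing worked.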

Process Flow:

  1. 📋 Queue Management - Files and URLs organized in processing queue
  2. 🎵 Audio Processing - Files chunked for optimal API handling
  3. 📝 Transcription - High-quality speech-to-text conversion
  4. 🧠 Summarization - AI-powered meeting minutes generation
  5. 💾 Dual Storage - Notion database + local markdown backup
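
The chunking step can be pictured as splitting a recording into time spans that each fit within an API's upload limit. The sketch below only computes the span boundaries. The 600-second default and the optional overlap are assumptions made for illustration; this README does not state the chunk length the tool uses or whether chunks overlap.

```python
def chunk_spans(total_seconds, chunk_seconds=600.0, overlap_seconds=0.0):
    """Return (start, end) pairs in seconds that together cover the recording.

    chunk_seconds and overlap_seconds are illustrative defaults, not the
    values used by the project.
    """
    if chunk_seconds <= 0:
        raise ValueError("chunk_seconds must be positive")
    if not 0 <= overlap_seconds < chunk_seconds:
        raise ValueError("overlap_seconds must be >= 0 and < chunk_seconds")

    spans = []
    start = 0.0
    while start < total_seconds:
        end = min(start + chunk_seconds, total_seconds)
        spans.append((start, end))
        if end >= total_seconds:
            break
        # Step back by the overlap so words at a boundary appear in both chunks.
        start = end - overlap_seconds
    return spans


# A 25-minute recording split into 10-minute chunks.
for start, end in chunk_spans(1500):
    print(f"{start:.0f}s to {end:.0f}s")
```

Each span could then be cut from the source file with FFmpeg and sent for transcription separately, with the resulting text joined in order.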

🤝 Contributing

We welcome contributions from the engineering community! This is an open-source Flocode initiative.

  • 🐛 Found a bug? Open an issue
  • 💡 Have an idea? Start a discussion
  • 🔧 Want to contribute? Submit a pull request

📜 License

This project is licensed under the MIT License - see the LICENSE file for details.

Free for commercial and personal use. πŸŽ‰


🌊 Built with ❤️ for the Flocode Community

Empowering engineers with practical AI tools, one voice note at a time.

James 🌊


Flocode Newsletter

About

Audio transcription tool and meeting minutes generator for Professional Engineers.
