patrikx3/meet-assistant

Donate for PatrikX3 / P3X Contact Corifeus / P3X Corifeus @ Facebook Uptime ratio (90 days)

🎙️🧠 P3X Meet Assistant — live meeting transcription with OpenAI GPT-4o Transcribe, GPU speaker diarization, and 10-language support v2026.4.123

🌌 Bugs are evident™ - MATRIX️
🚧 This project is under active development!
📢 We welcome your feedback and contributions.

NodeJS LTS is supported

🛠️ Built on NodeJs version

v24.14.1

📝 Description

Meet Assistant

Real-time AI speech-to-text for meetings and conversations. Captures speaker audio, transcribes it live using OpenAI GPT-4o Transcribe, and auto-labels each utterance by voice (Speaker 1, Speaker 2, ...). Ships with 10 European languages out of the box.

Quickstart

pip install p3x-meet-assistant
export OPENAI_API_KEY=sk-...
p3x-meet-assistant

Open http://localhost:8088. That's the whole thing — the wheel bundles the full web UI, so no Node.js, no git clone, no build step.

Features

  • Live transcription via OpenAI GPT-4o Transcribe, a high-accuracy speech-to-text model
  • Automatic speaker diarization — colored Speaker 1 / 2 / 3 ... labels based on voice fingerprint
  • 10 languages: English, Hungarian, German, French, Spanish, Italian, Portuguese, Dutch, Polish, Czech
  • Browser-based UI — Dark / Light theme, adjustable font size, one-click transcript export
  • System-audio capture on Linux (PulseAudio / PipeWire) or any browser-tab audio via MIC / TAB buttons
  • Rolling prompt context keeps proper nouns, jargon, and acronyms consistent across chunks
  • Session auto-save to sessions/YYYY-MM-DD-HH-MM.txt
  • Distributed as a single pip-installable Python package
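
The rolling prompt context listed above can be pictured as a bounded buffer of recent transcript text that is handed to the transcription call as the prompt for the next chunk. The following is only a sketch of that idea: the RollingPrompt class, its method names, and the 400-character budget are hypothetical and are not taken from the project's code.

```python
from collections import deque


class RollingPrompt:
    """Hypothetical helper: keeps the tail of the transcript as prompt context."""

    def __init__(self, max_chars: int = 400) -> None:
        self.max_chars = max_chars  # assumed budget, not a project setting
        self._chunks: deque[str] = deque()
        self._size = 0

    def add(self, text: str) -> None:
        """Record the text of a finished chunk."""
        text = text.strip()
        if not text:
            return
        self._chunks.append(text)
        self._size += len(text)
        # Drop the oldest chunks until the buffer fits the budget again,
        # but always keep the newest chunk.
        while self._size > self.max_chars and len(self._chunks) > 1:
            self._size -= len(self._chunks.popleft())

    def prompt(self) -> str:
        """Text to pass as context when transcribing the next chunk."""
        return " ".join(self._chunks)
```

Because names and acronyms from the last few chunks stay in the prompt, the model is more likely to spell them the same way in the next one.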

One language at a time — why

Auto-detect mode (trying two languages and picking the best) produces far more hallucinations than explicitly selecting a single language. Pick what you're actually hearing and accuracy jumps dramatically.

Platforms — works on Linux, macOS, and Windows

Meet Assistant runs on any OS with Python 3.10+ and a modern browser. The only feature that's Linux-specific is the optional server-side system-audio capture. Everywhere else you use the browser's built-in audio capture — same APIs Google Meet uses.

| Platform | Install | Microphone | System / tab audio | GPU diarization |
| --- | --- | --- | --- | --- |
| Linux (desktop) | pip install 'p3x-meet-assistant[linux-capture]' + sudo apt install portaudio19-dev | MIC button | ✓ Auto-captured via PulseAudio / PipeWire (plus TAB button) | ✓ NVIDIA CUDA |
| macOS | pip install p3x-meet-assistant | MIC button | TAB button (Chrome/Edge; share a tab with audio) | ✓ CPU or Apple Silicon GPU (Metal) |
| Windows | pip install p3x-meet-assistant | MIC button | TAB button | ✓ NVIDIA CUDA or CPU |
| Cloud server | pip install p3x-meet-assistant | None (no local audio) | Browser capture from the user's machine | Optional CPU diarization |

macOS specifics

  • The standard pip install p3x-meet-assistant works. No Homebrew needed for the default setup.
  • For meetings in Google Meet / Zoom / Teams, use the TAB button in the browser — it works identically to how you'd share audio in a Meet call.
  • To capture system audio outside a browser tab (e.g. a desktop Zoom app), install BlackHole or Loopback to create a virtual audio device, then select it as the browser's microphone input.
  • For GPU speaker diarization on Apple Silicon, the [gpu] extra installs torch; it runs on the Metal backend automatically.

Windows specifics

  • The standard pip install p3x-meet-assistant works on Windows 10/11.
  • Open PowerShell or Command Prompt and run p3x-meet-assistant.
  • TAB button captures any browser-tab audio (the same permission flow as Meet's "Share a tab" with the "Share audio" checkbox).
  • If you want system-wide capture, tools like VB-Audio Cable or Voicemeeter expose a virtual microphone that routes all system audio into the browser.
  • NVIDIA GPU diarization works out of the box via the [gpu] extra.

Requirements — bare minimum

  • Python 3.10+
  • A modern browser (Chrome, Firefox, Edge)
  • An OpenAI API key

Node.js is not required when installing from PyPI — the wheel ships the pre-built frontend.

Requirements — optional extras

  • Linux system-audio capture: portaudio19-dev package + the [linux-capture] pip extra
  • Speaker diarization: any NVIDIA GPU with ~500 MB VRAM (GTX 1650 / RTX 2060 and up) + the [gpu] pip extra. CPU fallback works but is slower.

No GPU is fine: the app degrades gracefully. With the [gpu] extra installed, diarization falls back to the CPU; without it, you lose speaker labels but everything else works.

Install from PyPI

The recommended path for anyone who just wants to use Meet Assistant. The Python wheel bundles the pre-built frontend, so there's no Node.js, no build step, no git clone — just pip install and go.

1. One-time setup — create a virtual environment

Skip straight to step 2 if you already use a venv or a managed environment like pipx, poetry, or uv.

python3 -m venv ~/.venvs/meet-assistant
source ~/.venvs/meet-assistant/bin/activate

Installing into the system Python works too, but a venv keeps dependencies isolated. On some modern Linux distros (Ubuntu 24.04+, Debian 12+) system-wide pip install is blocked by PEP 668 — a venv (or pipx) is required.

2. Install the package

Pick the variant that matches your hardware. All four commands install the same core package; the optional extras pull in additional wheels for features you want.

| Command | What you get | Wheel size | Recommended for |
| --- | --- | --- | --- |
| pip install p3x-meet-assistant | Cloud transcription (GPT-4o) + browser audio capture | ~300 kB + deps (~40 MB) | Laptops, macOS/Windows, cloud servers |
| pip install 'p3x-meet-assistant[gpu]' | Base install + GPU speaker diarization (resemblyzer + torch) | ~700 MB total | Workstations with any NVIDIA GPU |
| pip install 'p3x-meet-assistant[linux-capture]' | Base install + server-side PulseAudio / PipeWire capture (SpeechRecognition + PyAudio) | ~40 MB + system portaudio | Linux desktops that want system-audio capture |
| pip install 'p3x-meet-assistant[all]' | Everything together | ~700 MB | Full local workstation install |

Linux users with [linux-capture] or [all] need the PortAudio dev headers before pip install:

sudo apt install portaudio19-dev

3. Provide your OpenAI API key

Get a key at https://platform.openai.com/api-keys, then either:

Option A — environment variable (quickest):

export OPENAI_API_KEY=sk-...

Add it to ~/.bashrc / ~/.zshrc if you want it permanent.

Option B — .env file in your working directory:

cd ~/my-meetings                  # wherever you run the command from
echo "OPENAI_API_KEY=sk-..." > .env

Meet Assistant automatically loads .env from the current working directory on startup.
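
To make the .env behaviour concrete, here is a minimal stdlib-only sketch of what such a loader typically does. The function name load_env_file is hypothetical, and the rule that an already exported variable wins over the file is an assumption; the project may use a library and different precedence.

```python
import os
from pathlib import Path


def load_env_file(path: Path = Path(".env")) -> dict[str, str]:
    """Hypothetical loader: read KEY=VALUE lines and export the ones not already set."""
    loaded: dict[str, str] = {}
    if not path.is_file():
        return loaded
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip().strip("'\"")
        # Assumption: a variable already exported in the shell takes precedence.
        if key and key not in os.environ:
            os.environ[key] = value
            loaded[key] = value
    return loaded
```

Calling load_env_file() from the directory that holds your .env would then make OPENAI_API_KEY visible through os.environ for the rest of the process.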

4. Run it

p3x-meet-assistant

Open http://localhost:8088 in your browser. Pick a language from the top dropdown, then choose an audio source:

  • Click MIC to transcribe your microphone
  • Click TAB to share a browser tab with audio (Google Meet, YouTube, a Facebook stream — anything with "Share audio" enabled)
  • On Linux with [linux-capture] installed, the server auto-detects the system speaker monitor and starts transcribing immediately

Every transcript is appended to sessions/YYYY-MM-DD-HH-MM.txt in your working directory automatically.
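
The session file name follows directly from the start time of the session. As an illustration only (session_path and append_line are hypothetical names, and the real server may organise this differently), the naming and append behaviour could be sketched like this:

```python
from datetime import datetime
from pathlib import Path


def session_path(started: datetime, base_dir: Path = Path("sessions")) -> Path:
    """Build sessions/YYYY-MM-DD-HH-MM.txt from the session start time."""
    return base_dir / started.strftime("%Y-%m-%d-%H-%M.txt")


def append_line(path: Path, line: str) -> None:
    """Append one transcript line, creating the sessions directory on first use."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(line.rstrip("\n") + "\n")
```

With this scheme a session started at 09:05 on 2026-04-01 maps to sessions/2026-04-01-09-05.txt, relative to the directory you launched the command from.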

Upgrade or uninstall

pip install --upgrade p3x-meet-assistant            # latest stable
pip install 'p3x-meet-assistant==2026.4.109'        # pin to a specific release
pip uninstall p3x-meet-assistant                    # remove

Release notes for every version: https://github.com/patrikx3/meet-assistant/releases.

What gets installed

The wheel contains:

  • meet_assistant/ — the Python package (FastAPI server, OpenAI client, diarizer, state manager)
  • meet_assistant/dist/ — the pre-built Vite frontend (HTML, JS, CSS, Font Awesome fonts)
  • Entry point: p3x-meet-assistant → meet_assistant.cli:main

What's not in the wheel (excluded by MANIFEST.in): secure/, agents/, .claude/, .vscode/, AGENTS.md, CLAUDE.md, source-only configs, the dev launcher, and any tokens. Safe to install from PyPI.

Troubleshooting a pip install

| Symptom | Fix |
| --- | --- |
| externally-managed-environment (PEP 668) | Use a venv (python3 -m venv) or pipx install p3x-meet-assistant |
| Could not build wheels for PyAudio on Linux | Install portaudio19-dev: sudo apt install portaudio19-dev |
| Could not build wheels for PyAudio on macOS | brew install portaudio then retry |
| No module named 'torch' at runtime | Install the [gpu] extra or skip diarization |
| Port 8088 already in use | Run with p3x-meet-assistant --port 9000 (or any free port) |
| No OpenAI API key found | Set OPENAI_API_KEY in your shell or .env in the working directory |

Install from source (development workflow)

Only needed if you want to hack on the code itself.

git clone https://github.com/patrikx3/meet-assistant.git
cd meet-assistant

# Linux only — for PulseAudio capture
sudo apt install portaudio19-dev

# Python venv
python3 -m venv venv

# Pick ONE based on your hardware:
./venv/bin/pip install -r requirements.txt        # full local with GPU
./venv/bin/pip install -r requirements-cloud.txt  # cloud-only, no GPU

# Frontend build
yarn install
yarn build:web

# Dev launcher (auto-reload)
./meet-assistant-web.py --dev

Want diarization later on an already-installed source checkout? Just add:

./venv/bin/pip install resemblyzer

Command-line options

| Flag | Default | Description |
| --- | --- | --- |
| --port PORT | 8088 | Web server port |
| --host HOST | 0.0.0.0 | Web server host |
| --dev | off | Auto-reload on Python file changes |
| --device INDEX | 11 | PyAudio device index for the speaker monitor (Linux only) |
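
For readers who want to see how these flags and defaults fit together, here is a sketch of an equivalent argparse definition. It mirrors the documented flags only; build_parser is a hypothetical name and this is not the project's actual cli module.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser that mirrors the documented flags and defaults."""
    parser = argparse.ArgumentParser(prog="p3x-meet-assistant")
    parser.add_argument("--port", type=int, default=8088, help="Web server port")
    parser.add_argument("--host", default="0.0.0.0", help="Web server host")
    parser.add_argument(
        "--dev", action="store_true", help="Auto-reload on Python file changes"
    )
    parser.add_argument(
        "--device",
        type=int,
        default=11,
        metavar="INDEX",
        help="PyAudio device index for the speaker monitor (Linux only)",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"would serve on {args.host}:{args.port} (dev={args.dev})")
```

Note that the default host 0.0.0.0 listens on all interfaces; passing --host 127.0.0.1 restricts the server to the local machine.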

Speaker diarization

If the diarizer loaded successfully (check the startup console output), every transcribed line is prefixed with a Speaker N label, color-coded in the UI. Clusters live in memory for the session — click the Clear button to wipe them and start fresh.

  • Runs on GPU (CUDA) automatically, falls back to CPU if no GPU is available
  • Adds ~20 ms per chunk on a modern NVIDIA card — imperceptible
  • Language-independent (voice fingerprint, not words)
  • Tuning knob: SIMILARITY_THRESHOLD in web/diarizer.py. Lower = more merging, higher = more splitting. Default: 0.75.

Troubleshooting the clustering:

  • Same person gets split across multiple speakers → lower the threshold to ~0.65
  • Different people collapse into one speaker → raise the threshold to ~0.82
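
The effect of the threshold is easiest to see in code. The sketch below is an illustrative online clustering over voice embeddings using cosine similarity and one running centroid per speaker. It is not the project's diarizer: the SpeakerClusters class and its methods are hypothetical, and only the SIMILARITY_THRESHOLD name and its 0.75 default come from the text above.

```python
import math

SIMILARITY_THRESHOLD = 0.75  # default quoted above


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SpeakerClusters:
    """Hypothetical online clustering: one running centroid per speaker."""

    def __init__(self, threshold: float = SIMILARITY_THRESHOLD) -> None:
        self.threshold = threshold
        self.centroids: list[list[float]] = []
        self.counts: list[int] = []

    def assign(self, embedding: list[float]) -> int:
        """Return a 1-based speaker number for this voice embedding."""
        best, best_sim = -1, -1.0
        for i, centroid in enumerate(self.centroids):
            sim = cosine(embedding, centroid)
            if sim > best_sim:
                best, best_sim = i, sim
        if best >= 0 and best_sim >= self.threshold:
            # Similar enough: merge into that speaker by updating its running mean.
            n = self.counts[best]
            self.centroids[best] = [
                (c * n + e) / (n + 1) for c, e in zip(self.centroids[best], embedding)
            ]
            self.counts[best] = n + 1
            return best + 1
        # Not similar enough to anyone seen so far: open a new speaker.
        self.centroids.append(list(embedding))
        self.counts.append(1)
        return len(self.centroids)

    def clear(self) -> None:
        """Forget all speakers, as the Clear button does for a session."""
        self.centroids.clear()
        self.counts.clear()
```

In this sketch two embeddings with a cosine similarity of 0.8 share a speaker at the default 0.75 but are split into two speakers at 0.82, which matches the tuning advice: lower the threshold to merge more, raise it to split more.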

How audio is captured

Linux with PulseAudio / PipeWire (default on a local workstation): The app auto-detects your speakers' monitor source and records everything that plays on them — meeting audio, video calls, YouTube, Facebook streams, anything audible. No routing setup needed.

Any OS (or cloud deployment): Click the MIC button to transcribe your microphone, or the TAB button to share a browser tab with audio (identical to Google Meet's "Share tab with audio" feature). Uses the standard getUserMedia / getDisplayMedia browser APIs.

Use with Google Meet / Zoom / Teams / Facebook Live

Start Meet Assistant, then join your call or open your stream as normal. On Linux, system audio is captured automatically. On other platforms, click TAB and select the meeting tab with "Share audio" enabled.

Development

# Frontend dev server with HMR (port 5173, proxies /ws to :8088)
npm run dev

# Production build
npm run build:web

# Backend with auto-reload on file changes
./meet-assistant-web.py --dev

VS Code: open the project and press F5 — preset launch configs are wired up.

Project structure

meet-assistant-web.py         # Entry point
web/
  __init__.py                 # Package init, .env auto-load, audio bootstrap
  audio.py                    # PulseAudio source detection
  engines.py                  # OpenAI GPT-4o Transcribe wrapper + hallucination filter
  diarizer.py                 # Speaker diarization (resemblyzer on CUDA)
  state.py                    # App state, WebSocket broadcast, capture loop
  server.py                   # FastAPI app, routes, WebSocket handler
  src/                        # Frontend source (Vite)
    index.html
    main.js
    style.css
  dist/                       # Built frontend (gitignored)
requirements.txt              # Full deps with GPU diarization
requirements-cloud.txt        # Lean deps, cloud-only (no diarization)
.env.example                  # Template for your API key

Troubleshooting

| Symptom | Fix |
| --- | --- |
| No OpenAI API key found | Set OPENAI_API_KEY in .env or export it in your shell |
| No monitor source found | You're not on PulseAudio/PipeWire: use the MIC or TAB browser buttons |
| Diarizer unavailable | Install resemblyzer: ./venv/bin/pip install resemblyzer, or ignore if you don't want speaker labels |
| One person tagged as multiple speakers | Lower SIMILARITY_THRESHOLD in web/diarizer.py to ~0.65 |
| Multiple people collapsed into one speaker | Raise SIMILARITY_THRESHOLD to ~0.82 |
| Too many hallucinations on silent audio | Already filtered; see _is_hallucination in web/engines.py |

License

MIT


🌐 Meet Assistant SaaS — meeting.corifeus.com

Don't want to install anything? Try the hosted version at meeting.corifeus.com — full meeting workflow built for European businesses, no setup, no API key, no command line.

What the hosted version offers:

  • 21-language live translation during the meeting
  • AI summaries, action items, decisions, attendees, key quotes auto-generated after every meeting
  • Custom vocabulary — your client / company / industry terms corrected automatically (Pro+ tier)
  • Searchable meeting library — find any decision or promise across all your past meetings
  • Shareable read-only links — send a clean meeting summary to a client or teammate, no signup needed on their end
  • One-click email summary after each meeting
  • Premium engine on every plan — no downgraded model, ever
  • EU billing — Stripe Tax + VAT-compliant + EUR-priced (Solo €19.99 / Pro €39.99 / Business €99.99 per month, no lock-in)
  • GDPR-compliant by default — browser-language auto-detection, no tracking cookies, your meetings stored encrypted

Try the live demo (1 minute free, no signup) or browse the public sample meeting at meeting.corifeus.com/sample.


Corifeus Network

AI-powered network & email toolkit — free, no signup.

Web · network.corifeus.com MCP · npm i -g p3x-network-mcp

  • AI Network Assistant — ask in plain language, get a full domain health report
  • Network Audit — DNS, SSL, security headers, DNSBL, BGP, IPv6, geolocation in one call
  • Diagnostics — DNS lookup & global propagation, WHOIS, reverse DNS, HTTP check, my-IP
  • Mail Tester — live SPF/DKIM/DMARC + spam score + AI fix suggestions, results emailed (localized)
  • Monitoring — TCP / HTTP / Ping with alerts and public status pages
  • MCP server — 17 tools exposed to Claude Code, Codex, Cursor, any MCP client
  • Install: claude mcp add p3x-network -- npx p3x-network-mcp
  • Try: "audit example.com", "why do my emails land in spam? test me@example.com"
  • Source: patrikx3/network · patrikx3/network-mcp
  • Contact: patrikx3.com · donate

❤️ Support Our Open-Source Project

If you appreciate our work, consider ⭐ starring this repository or 💰 making a donation to support server maintenance and ongoing development. Your support means the world to us—thank you!


🌍 About My Domains

All my domains, including patrikx3.com, corifeus.eu, and corifeus.com, are developed in my spare time. While you may encounter minor errors, the sites are generally stable and fully functional.


📈 Versioning Policy

Version Structure: We follow a Major.Minor.Patch versioning scheme:

  • Major: 📅 Corresponds to the current year.
  • Minor: 🌓 Set as 4 for releases from January to June, and 10 for July to December.
  • Patch: 🔧 Incremental, updated with each build.
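
As a worked example of the scheme above, the version string can be derived from the build date and a patch counter. The helper name version_for is hypothetical and only illustrates the rule as stated.

```python
from datetime import date


def version_for(build_date: date, patch: int) -> str:
    """Year as major, 4 (January to June) or 10 (July to December) as minor."""
    minor = 4 if build_date.month <= 6 else 10
    return f"{build_date.year}.{minor}.{patch}"
```

For instance, build 123 made in April 2026 yields 2026.4.123, while the first build in July 2026 would be 2026.10.1.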

🚨 Important Changes: Any breaking changes are prominently noted in the readme to keep you informed.

P3X-MEET-ASSISTANT Build v2026.4.123
