A lightweight, local-first AI “panelist” built for a live Beer Driven Devs event.
Bubbles is not a product. It’s theatre with guardrails.
It listens (imperfectly) to a live discussion, maintains a rough sense of the room, and speaks only when explicitly invited by the moderator. It has a beer-themed animated presence on a tablet and can be dramatically “overfilled” (overflow animation) if we need to cut it off.
If it works: great. If it doesn’t: we thank it for its service and move on.
*Figure: It worked!*

Bubbles is:

- A fun hack
- A local AI demo
- A moderated participant in a live panel
- An exploration of AI presence in human discussion

Bubbles is not:

- A general-purpose conversational system
- An enterprise architecture reference
- An autonomous agent
- A replacement for humans
The system has three parts:

**Host (Laptop)**
- Records mic input (via OBS)
- Runs Whisper STT
- Maintains rolling transcript
- Periodically summarises discussion
- Generates responses via local LLM
- Converts responses to speech (TTS)
- Broadcasts state updates (SignalR)
**Bubbles Display (Tablet – .NET MAUI)**
- Fullscreen animated beer UI
- Shows states: Idle / Listening / Thinking / Speaking / Overflow / Disabled (see the sketch after this section)
- No AI logic
**Moderator Control (Phone or MAUI mini-app)**
- Trigger AI
- Cancel response
- Disable AI
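The states shown on the tablet map naturally to a small shared enum, broadcast over SignalR so the display and moderator apps stay in sync. A minimal sketch, assuming illustrative names (`PanelistState`, `StateChanged`, `StateBroadcaster`) — `BubblesHub` exists in the repo, but its actual members may differ:

```csharp
// Sketch only: enum and method names are assumptions, not the repo's API.
using Microsoft.AspNetCore.SignalR;

public enum PanelistState
{
    Idle, Listening, Thinking, Speaking, Overflow, Disabled
}

public class BubblesHub : Hub { }

// Somewhere in the coordination layer, a state change is pushed to
// every connected display and moderator client:
public class StateBroadcaster
{
    private readonly IHubContext<BubblesHub> _hub;
    public StateBroadcaster(IHubContext<BubblesHub> hub) => _hub = hub;

    public Task SetStateAsync(PanelistState state) =>
        _hub.Clients.All.SendAsync("StateChanged", state.ToString());
}
```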
Ground rules:

- No autonomous interjections
- Short responses (≤150 words)
- Self-deprecating humour only
- Never attack individuals
- Must be killable instantly (see the cancellation sketch after this list)
- Failure is acceptable
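The "killable instantly" rule is the one guardrail worth sketching: a shared `CancellationTokenSource` threaded through every STT/LLM/TTS call lets the moderator abort an in-flight response immediately. A sketch under assumed names, not the repo's actual API:

```csharp
// Sketch only: illustrates the "killable instantly" rule.
using System.Threading;

public class KillSwitch
{
    private CancellationTokenSource _cts = new();

    // Passed into STT/LLM/TTS calls so an in-flight response can be aborted.
    public CancellationToken Token => _cts.Token;

    // Called from the moderator's Cancel/Disable actions.
    public void Kill()
    {
        _cts.Cancel();
        _cts = new CancellationTokenSource(); // re-arm for the next trigger
    }
}
```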
A typical run of the show:

1. Panel runs normally
2. STT and summary update in the background
3. Moderator triggers Bubbles
4. Bubbles “thinks”
5. Response is generated and spoken
6. Return to listening
7. If needed → overflow → disabled
> “Looks like we’ve overfilled it. Bubbles, thanks for joining us tonight.”
Repository layout:

```
/docs
  software-spec.md            # Original requirements
  hardware-setup.md           # Hardware configuration
  prompts.md                  # LLM prompt templates
  local-pipeline-guide.md     # 🆕 Local AI pipeline architecture & developer guide
  example-implementations.md  # 🆕 Real implementation examples (Whisper, Ollama, etc.)
/src
  /API                        # ASP.NET Core coordination layer
    /Services
      /Interfaces             # Service abstractions (STT, LLM, TTS)
      /Implementations        # Mock & real implementations
      AIPanelistOrchestrator.cs    # Main pipeline coordinator
      TranscriptBufferService.cs   # Rolling transcript buffer
    /Controllers
      PanelistController.cs   # REST API endpoints
    /Hubs
      BubblesHub.cs           # SignalR hub for state broadcasting
  /Bubbles                    # .NET MAUI display app (tablet)
  /ModeratorApp               # .NET MAUI control app (phone)
  /Shared                     # Shared models and state management
```

To run:

```
cd src/API
dotnet run
```

The API starts with mock implementations of the service abstractions by default for easy testing; a sketch of plausible interface shapes follows.
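The `/Services/Interfaces` folder is what makes mock and real services interchangeable. The actual contracts aren't reproduced here; a plausible minimal shape, with all names assumed, might be:

```csharp
// Sketch only: plausible shapes for the abstractions in /Services/Interfaces.
// The real interface names and members may differ.
public interface ISpeechToTextService
{
    // Emits transcript fragments as audio is captured.
    IAsyncEnumerable<string> TranscribeAsync(CancellationToken ct);
}

public interface ILlmService
{
    Task<string> SummariseAsync(string transcript, CancellationToken ct);
    Task<string> GenerateResponseAsync(string summary, CancellationToken ct);
}

public interface ITextToSpeechService
{
    Task<byte[]> SynthesiseAsync(string text, CancellationToken ct);
}
```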
Via REST API:

```
curl -X POST http://localhost:5141/api/panelist/trigger
```

Via Moderator App:

- Set the panelist state to "Listening" - the API will automatically trigger a response
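Behind that endpoint, `PanelistController` presumably hands off to `AIPanelistOrchestrator` (both are real files in the tree above). A minimal sketch — the `TriggerResponse` method name is an assumption:

```csharp
// Sketch only: a plausible shape for POST /api/panelist/trigger.
using Microsoft.AspNetCore.Mvc;

[ApiController]
[Route("api/panelist")]
public class PanelistController : ControllerBase
{
    private readonly AIPanelistOrchestrator _orchestrator;
    public PanelistController(AIPanelistOrchestrator orchestrator)
        => _orchestrator = orchestrator;

    [HttpPost("trigger")]
    public IActionResult Trigger()
    {
        _orchestrator.TriggerResponse(); // assumed method name
        return Accepted();
    }
}
```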
The system works end-to-end with mock implementations:
- ✅ Continuous mock transcription every 5 seconds (sketched after this list)
- ✅ Periodic summary generation every 45 seconds
- ✅ Full response generation pipeline (Thinking → Speaking → Listening)
- ✅ State broadcasting via SignalR
- ✅ Cancel/disable functionality
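As a flavour of that mock cadence, a fake STT source emitting a canned line every 5 seconds might look like this — not the repo's actual mock, and it repeats the hypothetical interface sketched earlier to stay self-contained:

```csharp
// Sketch only: a fake STT source matching the 5-second cadence above.
using System.Runtime.CompilerServices;

public interface ISpeechToTextService
{
    IAsyncEnumerable<string> TranscribeAsync(CancellationToken ct);
}

public class MockSpeechToTextService : ISpeechToTextService
{
    private static readonly string[] Lines =
    {
        "Mock transcript line one.",
        "Mock transcript line two.",
    };

    public async IAsyncEnumerable<string> TranscribeAsync(
        [EnumeratorCancellation] CancellationToken ct)
    {
        var i = 0;
        while (!ct.IsCancellationRequested)
        {
            await Task.Delay(TimeSpan.FromSeconds(5), ct); // throws when cancelled
            yield return Lines[i++ % Lines.Length];
        }
    }
}
```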
Real implementations are now available! 🎉
The system includes production-ready implementations:
- Whisper.net - Local STT with Windows audio capture (NAudio)
- Ollama - Local LLM inference for summaries and responses
- Azure Cognitive Services - High-quality text-to-speech
- Windows Audio - Real audio device management and playback
See SETUP_REAL_SERVICES.md for complete setup instructions. A minimal Whisper.net sketch follows.
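For a taste of the real STT path, here is minimal Whisper.net usage per that library's documented API; the model file name, audio clip, and how this gets wired into the service layer are assumptions:

```csharp
// Sketch only: basic Whisper.net transcription of a 16kHz mono WAV file.
using Whisper.net;

using var factory = WhisperFactory.FromPath("ggml-base.bin"); // assumed model path
using var processor = factory.CreateBuilder()
    .WithLanguage("en")
    .Build();

using var audio = File.OpenRead("panel-clip.wav"); // hypothetical sample clip
await foreach (var segment in processor.ProcessAsync(audio))
{
    Console.WriteLine($"{segment.Start} -> {segment.End}: {segment.Text}");
}
```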
Quick Example - Full Local Stack:

1. Install Ollama: https://ollama.ai/download
2. Pull a model:

   ```
   ollama pull llama2
   ```

3. Start Ollama:

   ```
   ollama serve
   ```

4. Edit src/API/appsettings.json:

   ```json
   {
     "Ollama": {
       "Endpoint": "http://localhost:11434",
       "Model": "llama2"
     },
     "AIPanelist": {
       "SttServiceType": "Whisper",
       "LlmServiceType": "Ollama",
       "TtsServiceType": "Mock",
       "AudioDeviceServiceType": "Windows",
       "AudioPlaybackServiceType": "Windows"
     }
   }
   ```

5. Run:

   ```
   dotnet run
   ```
The system will:
- Capture audio from your microphone
- Transcribe with Whisper (auto-downloads model on first run)
- Generate summaries and responses with Ollama (see the sketch below)
- Broadcast states via SignalR
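The Ollama step boils down to a POST against its documented `/api/generate` endpoint. A hedged sketch of what the LLM service's call might look like; the prompt and wiring are illustrative:

```csharp
// Sketch only: one non-streaming call to Ollama's /api/generate endpoint.
using System.Net.Http.Json;
using System.Text.Json;

var http = new HttpClient { BaseAddress = new Uri("http://localhost:11434") };

var reply = await http.PostAsJsonAsync("/api/generate", new
{
    model = "llama2",
    prompt = "Summarise this panel discussion in two sentences: ...",
    stream = false
});
reply.EnsureSuccessStatusCode();

using var doc = JsonDocument.Parse(await reply.Content.ReadAsStringAsync());
Console.WriteLine(doc.RootElement.GetProperty("response").GetString());
```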
Configuration is simple: edit appsettings.json to switch between mock and real services - no code changes required. A sketch of how that switch might be wired follows.
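One plausible way those `ServiceType` settings get honoured is a small switch at startup in `Program.cs`. The repo's actual wiring may differ, and `WhisperSpeechToTextService` / `MockSpeechToTextService` are the hypothetical types sketched earlier:

```csharp
// Sketch only: selecting an implementation from configuration at startup.
var builder = WebApplication.CreateBuilder(args);

var sttType = builder.Configuration["AIPanelist:SttServiceType"];

builder.Services.AddSingleton<ISpeechToTextService>(_ => sttType switch
{
    "Whisper" => new WhisperSpeechToTextService(), // hypothetical real service
    _         => new MockSpeechToTextService(),    // default: mock
});

builder.Build().Run();
```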
The current audio capture uses standard Windows input devices (microphones). This works great for in-person events but won't capture audio from remote participants in Teams/Zoom calls - their audio comes through your speakers as output, not input.
Current Workarounds:
- Use a physical setup where remote audio plays through speakers and is picked up by a room mic
- Use virtual audio cable software (VB-Cable, VoiceMeeter) to route system audio to a virtual input device
Future Enhancement (TODO):
Add WASAPI loopback capture support to WindowsAudioDeviceService. This would expose system audio as a selectable input device, allowing direct capture of remote meeting audio without physical workarounds. The loopback device should be disabled by default and opt-in via the moderator app.
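For reference, NAudio already ships the building block this TODO describes: `WasapiLoopbackCapture` records whatever the default output device is playing. A sketch of the capture loop — integrating it into `WindowsAudioDeviceService` as an opt-in device is the actual enhancement:

```csharp
// Sketch only: raw WASAPI loopback capture via NAudio.
using NAudio.Wave;

using var capture = new WasapiLoopbackCapture(); // default render device
capture.DataAvailable += (_, e) =>
{
    // e.Buffer holds system-output audio (e.g. remote Teams/Zoom voices)
    // in capture.WaveFormat; feed it into the STT pipeline from here.
};
capture.StartRecording();
Console.ReadLine(); // capture until Enter is pressed
capture.StopRecording();
```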
Documentation:

- SETUP_REAL_SERVICES.md - Quick start guide for real AI services ⭐
- local-pipeline-guide.md - Architecture and developer guide
- example-implementations.md - Complete implementation examples and code
This project uses Qwen3-TTS for voice synthesis. A few things to note:
The TTS server supports voice cloning from a reference audio sample. You'll need:

- A .wav file of the voice you want to clone (~10-30 seconds of clear speech)
- A transcript of that audio

Configure via environment variables TTS_REF_AUDIO and TTS_REF_TEXT, or edit the defaults in server.py.
See src/QwenAPI/setup-guide.md for detailed setup instructions.
For the full experience, you can pre-generate:

- Intro.wav - Intro audio (place in src/API/wwwroot/audio/)
- Filler phrases - Audio clips for natural pauses (place in src/API/wwwroot/audio/filler-phrases/)
On Windows, PyTorch CUDA has issues running natively. The AppHost is configured to run the TTS server via WSL:
- Set up WSL with Ubuntu (e.g., Ubuntu-22.04)
- Copy src/QwenAPI/server.py and requirements.txt to your WSL environment
- Create a Python venv and install dependencies
- Update the path in AppHost.cs to match your WSL setup
If you're on Linux or macOS, edit AppHost.cs - there's a commented-out AddPythonApp option you can use instead of the WSL executable (sketched below).
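For orientation, the Aspire hosting call referenced above looks roughly like this in an AppHost; the resource name and paths here are assumptions, not the repo's commented-out values:

```csharp
// Sketch only: hosting the Python TTS server via Aspire.Hosting.Python.
var builder = DistributedApplication.CreateBuilder(args);

builder.AddPythonApp("qwen-tts", "../QwenAPI", "server.py"); // assumed paths

builder.Build().Run();
```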
Because we joked about inviting an AI to the panel…
And then realised it would be fun.
Cheers 🍻

