A base for your portfolio piece to land your next AI engineering job. AI-powered voice transcription with Whisper and LLM cleaning. Browser-based recording interface with FastAPI backend.
📺 Recommended Video Tutorial: For project structure and API details, watch the full tutorial on YouTube: https://youtu.be/WUo5tKg2lnE
Agentic Branch: Switch to the branch checkpoint-agentic-openrouter to build on the agentic demo from the full video on YouTube: https://youtu.be/uR_lvAZFBw0
Features:
- 🎤 Browser-based voice recording
- 🔊 English Whisper speech-to-text (runs locally)
- 🤖 LLM cleaning (removes filler words, fixes errors)
- 🔌 OpenAI API-compatible (works with Ollama, LM Studio, OpenAI, or any OpenAI-compatible API)
- 📋 One-click copy to clipboard
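To give a feel for what the LLM cleaning step does, here is a hedged, stdlib-only sketch of filler-word removal. The real app sends the transcript to an LLM; this naive regex pass (with an assumed filler-word list) is only an illustration of the goal, not the app's implementation:

```python
import re

# Common English filler words the LLM cleaning step is meant to remove.
# This list and regex pass are a rough stand-in for what the LLM does;
# a real LLM also fixes grammar and transcription errors, which regex cannot.
FILLERS = r"\b(um+|uh+|erm*|like|you know|sort of|kind of)\b"

def rough_clean(transcript: str) -> str:
    """Strip filler words and collapse the leftover whitespace."""
    cleaned = re.sub(FILLERS, "", transcript, flags=re.IGNORECASE)
    return re.sub(r"\s{2,}", " ", cleaned).strip()

print(rough_clean("Um, so the meeting is, uh, at three."))
```

Note the trade-off: a regex cannot tell filler "like" from legitimate "like", which is exactly why the app delegates cleaning to an LLM.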
Note that the vanilla version uses a small language model running on your CPU, so the LLM may not follow the cleaning system prompt reliably, depending on the transcript. The challenge for you is to extend this portfolio app, advance the solution, and make it your own.
For example:
- Modify it for a specific industry
- Add GPU acceleration + stronger local LLM
- Use a cloud AI model
- Real-time transcription/LLM streaming
- Multi-language support beyond English
📚 Need help and want to learn more?
Full courses on AI Engineering are available at https://www.skool.com/ai-engineer
This project is devcontainer-first. The easiest way to get started:
- Click "Reopen in Container" in VS Code
- Or: Cmd/Ctrl+Shift+P → "Dev Containers: Reopen in Container"
- Wait ~5-10 minutes for the initial build and model download
VS Code automatically:
- Builds and starts both containers (app + Ollama)
- Installs Python and Node.js dependencies
- Downloads the Ollama model
- Creates `backend/.env` with working defaults
Skip to Running the App.
The devcontainer is the easiest supported setup method for beginners. If you choose to install manually, you'll need:
- Python 3.12+, Node.js 24+, uv, and an LLM server (Ollama or LM Studio)
- Copy `backend/.env.example` to `backend/.env` and configure
- Install dependencies with `uv sync` (backend) and `npm install` (frontend)
- Start your LLM server and pull models:
```shell
ollama pull llama3.1:8b
```
For detailed setup, use the devcontainer above.
Open two terminals and run:
Terminal 1 - Backend:
```shell
cd backend
uv run uvicorn app:app --reload --host 0.0.0.0 --port 8000 --timeout-keep-alive 600
```
Note: `--timeout-keep-alive 600` sets a 10-minute keep-alive timeout so long audio files can finish processing
Terminal 2 - Frontend:
```shell
cd frontend
npm run dev
```
Browser: Open http://localhost:3000
This app is compatible with any OpenAI API-format LLM provider:
- Ollama (default - works out of the box in devcontainer)
- LM Studio (local alternative)
- OpenAI API (cloud-based)
- Any other OpenAI-compatible API
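"OpenAI-compatible" means every provider above accepts the same `/chat/completions` request shape, so only the base URL (and key) changes. A minimal sketch of building such a request with the standard library; the base URLs shown are the providers' usual local defaults and the cleaning prompt is an assumption, not the app's exact wording:

```python
import json

# Typical base URLs for OpenAI-compatible providers (verify against your setup):
PROVIDERS = {
    "ollama": "http://localhost:11434/v1",   # Ollama's OpenAI-compatible endpoint
    "lmstudio": "http://localhost:1234/v1",  # LM Studio's default server port
    "openai": "https://api.openai.com/v1",   # OpenAI cloud API
}

def chat_request(base_url: str, model: str, transcript: str) -> tuple[str, str]:
    """Build the URL and JSON body for an OpenAI-style /chat/completions call."""
    body = json.dumps({
        "model": model,
        "messages": [
            # Illustrative system prompt; the app's real prompt may differ.
            {"role": "system", "content": "Clean this transcript: remove filler words, fix errors."},
            {"role": "user", "content": transcript},
        ],
    })
    return f"{base_url}/chat/completions", body

url, body = chat_request(PROVIDERS["ollama"], "llama3.1:8b", "um hello")
print(url)
```

Switching providers is then just swapping the base URL (and API key) while the payload stays identical, which is why a single `backend/.env` covers all four options.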
The devcontainer automatically creates backend/.env with working Ollama defaults. No configuration needed to get started.
To use a different provider, edit backend/.env:
- `LLM_BASE_URL` - API endpoint
- `LLM_API_KEY` - API key
- `LLM_MODEL` - Model name
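The three variables above map directly onto client configuration. A hedged sketch of how a backend might read them, with assumed Ollama defaults (the app's actual default values may differ):

```python
import os

def load_llm_config() -> dict:
    """Read LLM settings from the environment, falling back to local Ollama.
    The defaults below are assumptions for illustration, not the app's exact values."""
    return {
        "base_url": os.getenv("LLM_BASE_URL", "http://localhost:11434/v1"),
        "api_key": os.getenv("LLM_API_KEY", "ollama"),  # Ollama ignores the key's value
        "model": os.getenv("LLM_MODEL", "llama3.1:8b"),
    }

# Example: pointing the app at a different model by overriding one variable.
os.environ["LLM_MODEL"] = "gpt-4o-mini"
print(load_llm_config()["model"])
```

Because everything funnels through these three values, swapping Ollama for LM Studio or OpenAI never requires a code change, only an edit to `backend/.env`.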
Container won't start or is very slow:
Configure Docker Desktop resources:
- Open Docker Desktop → Settings → Resources
- Set CPUs to maximum available (8+ cores recommended)
- Set Memory to at least 16GB
- Click Apply & Restart
Expected specs: Modern laptop/desktop with 8+ CPU cores and 16GB RAM. More CPU = faster LLM responses.
Microphone not working:
- Use Chrome or Firefox (Safari may have issues)
- Check browser permissions: Settings → Privacy → Microphone
Backend fails to start:
- Check Whisper model downloads: `~/.cache/huggingface/`
- Ensure enough disk space (models are ~150MB)
LLM errors:
- Make sure Ollama service is running (it auto-starts with devcontainer)
- Check the model is downloaded (it is pulled automatically during devcontainer setup)
- Transcription still works without LLM (raw Whisper only)
LLM is slow:
- See "Container won't start or is very slow" section above for Docker resource configuration
- Fallback option: Switch to another model (edit `LLM_MODEL` in `backend/.env`). ⚠️ Trade-off: a 3B model is faster but significantly worse at cleaning transcripts
- Best alternative: Use a cloud API like OpenAI for much faster responses with excellent quality (edit `.env`)
Cannot access localhost:3000 or localhost:8000 from host machine:
- Docker Desktop: Go to Settings → Resources → Network
- Enable "Use host networking" (may require Docker Desktop restart)
- Restart the frontend and backend servers
Port already in use:
- Backend: Change the port with `--port 8001`
- Frontend: Edit `vite.config.js` and change `port: 3000`