
AI Transcript App

A base for your portfolio piece to land your next AI engineering job. AI-powered voice transcription with Whisper and LLM cleaning. Browser-based recording interface with FastAPI backend.

📺 Recommended Video Tutorial: For project structure and API details, watch the full tutorial on YouTube: https://youtu.be/WUo5tKg2lnE

Agentic Branch: Switch to the branch checkpoint-agentic-openrouter to build on the agentic demo from the full video on YouTube: https://youtu.be/uR_lvAZFBw0

Features:

  • 🎤 Browser-based voice recording
  • 🔊 English Whisper speech-to-text (runs locally)
  • 🤖 LLM cleaning (removes filler words, fixes errors)
  • 🔌 OpenAI API-compatible (works with Ollama, LM Studio, OpenAI, or any OpenAI-compatible API)
  • 📋 One-click copy to clipboard

Note that the vanilla version uses a smaller language model running on your CPU, so the AI may not follow system prompts reliably, depending on the transcript. The challenge is to take this portfolio app further and make it your own.

For example:

  • Modify it for a specific industry
  • Add GPU acceleration + stronger local LLM
  • Use a cloud AI model
  • Real-time transcription/LLM streaming
  • Multi-language support beyond English
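To make the cleaning step concrete: part of what the LLM does is remove spoken fillers. A deterministic, rule-based sketch of that idea (illustrative only, not the app's actual code) could even serve as a fallback when the small model ignores its prompt:

```python
import re

# Common spoken fillers; phrases like "you know" are context-dependent,
# which is exactly why the app delegates this job to an LLM instead of rules.
FILLERS = {"um", "uh", "er", "ah", "you know"}

def strip_fillers(transcript: str) -> str:
    """Remove standalone filler words/phrases and tidy leftover punctuation."""
    # Longest phrases first so "you know" is matched before shorter words.
    alternation = "|".join(re.escape(f) for f in sorted(FILLERS, key=len, reverse=True))
    # Optionally consume a comma before and after each filler.
    cleaned = re.sub(rf",?\s*\b(?:{alternation})\b,?", "", transcript, flags=re.IGNORECASE)
    return re.sub(r"\s{2,}", " ", cleaned).strip()

print(strip_fillers("Um, so I think, uh, we should ship it, you know?"))
# → "so I think we should ship it?"
```

Rules like these break on fillers that double as real words ("like", "so"), which is the gap the LLM cleaning pass is meant to close.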

📚 Need help and want to learn more?

Full courses on AI Engineering are available at https://www.skool.com/ai-engineer


Quick Start

🚀 Dev Container (Recommended)

This project is devcontainer-first. The easiest way to get started:

1. Prerequisites

  • Docker Desktop (or another Docker engine)
  • VS Code with the Dev Containers extension

2. Open in Dev Container

  • Click "Reopen in Container" in VS Code
  • Or: Cmd/Ctrl+Shift+P → "Dev Containers: Reopen in Container"
  • Wait ~5-10 minutes for initial build and model download

VS Code automatically:

  1. Builds and starts both containers (app + Ollama)
  2. Installs Python and Node.js dependencies
  3. Downloads the Ollama model
  4. Creates backend/.env with working defaults

Skip to Running the App.


🛠️ Manual Installation

The devcontainer is the easiest supported setup for beginners. If you install manually instead, you'll need to:

  • Install Python 3.12+, Node.js 24+, uv, and an LLM server (Ollama or LM Studio)
  • Copy backend/.env.example to backend/.env and configure it
  • Install dependencies with uv sync (backend) and npm install (frontend)
  • Start your LLM server and pull a model: ollama pull llama3.1:8b

For detailed setup, use the devcontainer above.


Running the App

Open two terminals and run:

Terminal 1 - Backend:

cd backend
uv run uvicorn app:app --reload --host 0.0.0.0 --port 8000 --timeout-keep-alive 600

Note: --timeout-keep-alive 600 sets a 10-minute timeout for long audio processing

Terminal 2 - Frontend:

cd frontend
npm run dev

Browser: Open http://localhost:3000


Configuration

OpenAI API Compatibility

This app is compatible with any OpenAI API-format LLM provider:

  • Ollama (default - works out of the box in devcontainer)
  • LM Studio (local alternative)
  • OpenAI API (cloud-based)
  • Any other OpenAI-compatible API
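Concretely, "OpenAI-compatible" means the provider accepts a POST to /chat/completions with the standard messages payload, so one code path serves all four options above. A standard-library sketch of that request shape (illustrative names, not the app's actual code):

```python
import json
import urllib.request

def build_chat_request(base_url: str, api_key: str, model: str,
                       system_prompt: str, transcript: str) -> urllib.request.Request:
    """Build a POST for an OpenAI-style /chat/completions endpoint."""
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},  # cleaning instructions
            {"role": "user", "content": transcript},       # raw Whisper output
        ],
    }
    return urllib.request.Request(
        base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Local servers like Ollama ignore the key but accept the header.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

# The same call shape works whether base_url points at Ollama, LM Studio, or OpenAI:
req = build_chat_request("http://localhost:11434/v1", "ollama",
                         "llama3.1:8b", "Remove filler words.", "um, hello there")
print(req.full_url)  # → http://localhost:11434/v1/chat/completions
```

Sending the request (e.g. with urllib.request.urlopen) returns the standard choices[0].message.content response shape on any of these providers.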

The devcontainer automatically creates backend/.env with working Ollama defaults. No configuration needed to get started.

To use a different provider, edit backend/.env:

  • LLM_BASE_URL - API endpoint
  • LLM_API_KEY - API key
  • LLM_MODEL - Model name
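For example, a backend/.env pointing at a local Ollama server might look like this (illustrative values, assuming Ollama's default port and its OpenAI-compatible /v1 path; Ollama ignores the API key, so any placeholder works — check backend/.env.example for the exact variable names your checkout expects):

```
LLM_BASE_URL=http://localhost:11434/v1
LLM_API_KEY=ollama
LLM_MODEL=llama3.1:8b
```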

Troubleshooting

Container won't start or is very slow:

⚠️ This app runs an LLM on CPU and requires adequate Docker resources.

Configure Docker Desktop resources:

  1. Open Docker Desktop → Settings → Resources
  2. Set CPUs to maximum available (8+ cores recommended)
  3. Set Memory to at least 16GB
  4. Click Apply & Restart

Expected specs: Modern laptop/desktop with 8+ CPU cores and 16GB RAM. More CPU = faster LLM responses.

Microphone not working:

  • Use Chrome or Firefox (Safari may have issues)
  • Check browser permissions: Settings → Privacy → Microphone

Backend fails to start:

  • Check that the Whisper model downloaded correctly: look in ~/.cache/huggingface/
  • Ensure enough free disk space (the models are ~150MB)

LLM errors:

  • Make sure Ollama service is running (it auto-starts with devcontainer)
  • Check that the model is downloaded (the devcontainer pulls it automatically; for manual installs, run ollama pull llama3.1:8b)
  • Transcription still works without LLM (raw Whisper only)
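The last bullet describes a graceful-degradation pattern worth keeping in any extension you build: if LLM cleaning fails, return the raw Whisper text instead of an error. A minimal sketch with illustrative names (not the app's actual code):

```python
from typing import Callable

def clean_transcript(raw_text: str, llm_clean: Callable[[str], str]) -> dict:
    """Try LLM cleaning; fall back to the raw transcript on any failure.

    Injecting llm_clean (any text -> text callable, e.g. a chat-completions
    call) keeps the fallback logic testable without a running LLM server.
    """
    try:
        return {"text": llm_clean(raw_text), "cleaned": True}
    except Exception:
        # Server down, model missing, timeout: still return something useful.
        return {"text": raw_text, "cleaned": False}

def broken_llm(_text: str) -> str:
    raise ConnectionError("Ollama is not running")

print(clean_transcript("um, hello world", broken_llm))
# → {'text': 'um, hello world', 'cleaned': False}
```

Returning a flag alongside the text also lets the frontend tell the user whether they are seeing cleaned or raw output.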

LLM is slow:

  • See "Container won't start or is very slow" section above for Docker resource configuration
  • Fallback option: Switch to a smaller model (edit LLM_MODEL in backend/.env)
    • ⚠️ Trade-off: a 3B-parameter model is faster but significantly worse at cleaning transcripts
  • Best alternative: Use a cloud API like OpenAI for instant responses with excellent quality (edit .env)

Cannot access localhost:3000 or localhost:8000 from host machine:

  • Docker Desktop: Go to Settings → Resources → Network
  • Enable "Use host networking" (may require Docker Desktop restart)
  • Restart the frontend and backend servers

Port already in use:

  • Backend: Change port with --port 8001
  • Frontend: Edit vite.config.js and change the server port (default 3000)

About

Recording, transcribing and cleaning up transcripts all locally
