TwinTutor - Voice AI Challenge

A full-stack application that combines YouTube video transcripts with AI-powered tutoring, featuring a quad-view interface with video player, live transcript, AI chat, and voice interface.

🏗️ Architecture

Backend: FastAPI (Python) with Gemini AI integration
Frontend: React + TypeScript with Vite
AI Services: Google Gemini, ElevenLabs TTS

📋 Prerequisites

Python 3.13+
Node.js 18+
npm or yarn

🚀 Setup Instructions

1. Install Backend Dependencies

# Using uv (recommended):
uv sync

# To add new packages:
uv add <package-name>

2. Install Frontend Dependencies

npm install

3. Environment Variables

Create a .env file in the root directory with the following variables:

Backend (.env):

GEMINI_API_KEY=your_gemini_api_key_here
ELEVENLABS_API_KEY=your_elevenlabs_api_key_here  # For TTS audio generation
ELEVENLABS_VOICE_ID=21m00Tcm4TlvDq8ikWAM  # Optional, defaults to Rachel voice
ELEVENLABS_SST_API_KEY=your_elevenlabs_api_key_here
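
A quick way to catch a missing key before starting the backend is a small check like the one below. This is a minimal sketch, not code from the repository: the helper names missing_keys and voice_id are hypothetical, and it assumes that every backend key except ELEVENLABS_VOICE_ID is required, which the list above suggests but does not state outright. It also assumes the variables have already been loaded into the process environment; how main.py reads the .env file is not shown in this README.

```python
import os

# Assumption: these three keys are required; only ELEVENLABS_VOICE_ID is
# marked optional in the README.
REQUIRED_KEYS = ("GEMINI_API_KEY", "ELEVENLABS_API_KEY", "ELEVENLABS_SST_API_KEY")

# Default voice ID documented in the README (Rachel voice).
DEFAULT_VOICE_ID = "21m00Tcm4TlvDq8ikWAM"


def missing_keys(env=None):
    """Return the required keys that are absent or blank in env."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]


def voice_id(env=None):
    """Return the configured voice ID, falling back to the documented default."""
    env = os.environ if env is None else env
    return env.get("ELEVENLABS_VOICE_ID") or DEFAULT_VOICE_ID
```

Calling missing_keys() with no arguments inspects the current environment; an empty list means all assumed-required keys are present.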

Frontend (.env or .env.local):

VITE_API_BASE_URL=http://localhost:8000  # Backend API URL (defaults to http://localhost:8000)
VITE_GEMINI_API_KEY=your_gemini_api_key_here  # Optional, only if using direct Gemini calls
VITE_ELEVENLABS_API_KEY=your_elevenlabs_api_key_here  # For voice cloning (frontend feature)
VITE_YOUTUBE_API_KEY=your_youtube_api_key_here  # Optional, for enhanced YouTube features

Note: The frontend is connected to the backend API. Loading a video automatically initializes a backend session, and the chat bot sends questions to the backend's /api/ask endpoint, which answers using the video transcript as context.

4. Run the Application

Terminal 1 - Backend:

python main.py
# Or: uvicorn main:app --reload

The backend will run on http://localhost:8000

Terminal 2 - Frontend:

npm run dev

The frontend will run on http://localhost:5173 (or another port if 5173 is taken)

📡 API Endpoints

Backend API (FastAPI)

GET / - Health check

POST /api/init-video - Initialize a video session

{
  "video_url": "https://www.youtube.com/watch?v=..."
}

POST /api/ask - Ask a question to the AI tutor

{
  "session_id": "uuid-here",
  "question": "What is this video about?"
}
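
The two POST endpoints can be exercised from a script as well as from the frontend. The sketch below uses only the Python standard library and the request bodies documented above. The helper names (build_request, post_json, init_video, ask) are hypothetical, and the response format is an assumption: this README does not document what either endpoint returns, so the code simply decodes whatever JSON comes back.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # default backend address from this README


def build_request(path, payload, base_url=BASE_URL):
    """Build a JSON POST request for a backend endpoint without sending it."""
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(path, payload, base_url=BASE_URL):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(build_request(path, payload, base_url)) as resp:
        return json.loads(resp.read().decode("utf-8"))


def init_video(video_url, base_url=BASE_URL):
    """POST /api/init-video with the documented request body."""
    return post_json("/api/init-video", {"video_url": video_url}, base_url)


def ask(session_id, question, base_url=BASE_URL):
    """POST /api/ask with the documented request body."""
    return post_json("/api/ask", {"session_id": session_id, "question": question}, base_url)
```

With the backend running, init_video(...) would be called first and its response inspected for the session identifier, which is then passed to ask(...). The name of that field in the response is not specified here, so check the actual output before relying on it.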

🎯 Features

Video Player: YouTube video playback with transcript integration
Live Transcript: Real-time transcript display
AI Chat Bot: Context-aware chat using Gemini AI
Voice Interface: Voice cloning and text-to-speech with ElevenLabs
Quad-View Layout: Four-panel interface for optimal learning experience

🔧 Development

Backend Development

# Run with auto-reload
uvicorn main:app --reload

Frontend Development

# Run dev server
npm run dev

# Build for production
npm run build

# Preview production build
npm run preview

📝 Notes

The backend uses in-memory session storage (sessions are lost on server restart)
The call_elevenlabs_tts function generates audio files and saves them to static/audio/
Audio files are served via FastAPI's static file mounting at /static/audio/
Frontend and backend can each run on their own, but the integrated features (session initialization and transcript-based chat) require the frontend to reach the backend at the URL set in VITE_API_BASE_URL
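
To make the first note concrete, here is a minimal sketch of what in-memory session storage typically looks like. It is an illustration under stated assumptions, not the code in main.py: the SessionStore class, its method names, and the stored fields (video_url, transcript) are all hypothetical. The point is that the sessions live in an ordinary dictionary, so a new process starts with an empty one.

```python
import uuid


class SessionStore:
    """Hypothetical in-memory store: data lives only as long as the process."""

    def __init__(self):
        self._sessions = {}

    def create(self, video_url, transcript):
        """Store a new session and return its generated UUID string."""
        session_id = str(uuid.uuid4())
        self._sessions[session_id] = {
            "video_url": video_url,
            "transcript": transcript,
        }
        return session_id

    def get(self, session_id):
        """Return the session dict, or None if the ID is unknown."""
        return self._sessions.get(session_id)
```

A session ID created before a restart will no longer resolve afterwards, so a client in that situation needs to call /api/init-video again. Persisting sessions would require swapping the dictionary for a database or cache.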

🐛 Troubleshooting

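Module not found errors: Make sure all dependencies are installed (uv sync for the backend, npm install for the frontend)
API key errors: Verify your .env file has all required keys
CORS issues: The backend should handle CORS, but if requests from the browser are blocked, check that FastAPI's CORSMiddleware allows the frontend origin (for example http://localhost:5173)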
Port conflicts: Change ports in main.py (backend) or vite.config.ts (frontend)
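
Before changing a port, it can help to confirm that the port is really occupied. The following is a small standard-library sketch; the function name port_in_use is hypothetical, and it only checks whether something accepts TCP connections on the local machine, not which program owns the port.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0
```

Running port_in_use(8000) and port_in_use(5173) before starting the two servers shows whether the default backend and frontend ports are free.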

