An audio transcriber application that captures audio and transforms it into text using OpenAI's Whisper API.
A two-part series of Medium articles explains everything in this project:
- Part 1: Basic requirements, flowchart diagrams, and core UI components implementation
- Part 2: Integrating the OpenAI SDK through a Next.js Server Action
Feel free to read those articles before exploring the code to get more context about this project.
AI Transcribe is an application that transcribes audio into text and insights. It captures audio through your device's microphone, processes it, and returns accurate transcriptions within seconds. The project demonstrates how to build a complete audio transcription system with a modern UI using Next.js. The demo video below shows it in action.
Screen.Recording.2025-04-26.at.00.41.11.mp4
- Real-time Audio Capture: Records audio through your device's microphone
- Audio Processing: Chunks and processes audio for optimal transcription performance
- OpenAI Whisper Integration: Utilizes the powerful Whisper API for accurate speech-to-text conversion
- Modern UI: Built with a clean, responsive interface using Tailwind CSS and Shadcn UI
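The chunking step listed above can be pictured with a minimal sketch. It assumes raw mono PCM samples in a Float32Array and a fixed chunk duration; `chunkSamples` and its parameters are hypothetical names, and the project itself may chunk encoded recorder output instead.

```typescript
// Hypothetical helper: split raw PCM samples into fixed-duration chunks.
// Assumes mono audio; the real project may chunk differently.
function chunkSamples(
  samples: Float32Array,
  sampleRate: number,
  chunkSeconds: number,
): Float32Array[] {
  const chunkSize = Math.floor(sampleRate * chunkSeconds);
  if (chunkSize <= 0) {
    throw new Error("sampleRate * chunkSeconds must be at least 1 sample");
  }
  const chunks: Float32Array[] = [];
  for (let start = 0; start < samples.length; start += chunkSize) {
    // subarray returns a view on the same buffer, so no samples are copied.
    // The end index is clamped, so the last chunk may be shorter.
    chunks.push(samples.subarray(start, start + chunkSize));
  }
  return chunks;
}
```

Because the chunks are views rather than copies, this keeps memory use flat even for long recordings; copy with `slice` instead if a chunk has to outlive the source buffer.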
- Frontend: Next.js 15, React 19, Tailwind CSS
- UI Components: Shadcn UI components
- API Integration: OpenAI API, Next.js Server Action
- State Management: React Hooks
- Styling: Tailwind CSS with class-variance-authority
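To illustrate how the API integration layer might fit together, here is a rough sketch of a server action that forwards recorded audio to OpenAI's transcription endpoint. The project uses the OpenAI SDK; this sketch swaps in plain `fetch` only to stay dependency-free. The names `transcribeAudio` and `buildTranscriptionForm`, the `audio` form field, and the default filename are assumptions, not the project's actual code.

```typescript
// In a Next.js app this would live in a file that starts with the
// "use server" directive (for example under src/actions).
const TRANSCRIPTION_URL = "https://api.openai.com/v1/audio/transcriptions";

// Hypothetical helper: build the multipart body the endpoint expects.
function buildTranscriptionForm(audio: Blob, filename = "recording.webm"): FormData {
  const form = new FormData();
  form.append("file", audio, filename);
  form.append("model", "whisper-1");
  return form;
}

// Hypothetical action: expects the recording under an "audio" form field.
async function transcribeAudio(formData: FormData): Promise<string> {
  const apiKey = process.env.OPENAI_API_KEY;
  if (!apiKey) throw new Error("OPENAI_API_KEY is not set");

  const audio = formData.get("audio");
  if (!(audio instanceof Blob)) throw new Error("No audio file was provided");

  const response = await fetch(TRANSCRIPTION_URL, {
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}` },
    body: buildTranscriptionForm(audio),
  });
  if (!response.ok) {
    throw new Error(`Transcription failed with status ${response.status}`);
  }
  const data = (await response.json()) as { text: string };
  return data.text;
}
```

Running the request on the server keeps the API key out of the browser, which is the main reason to route the call through a server action.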
- Node.js 18.17 or higher
- An OpenAI API key
- Clone the repository:
    git clone https://github.com/ifindev/ai-transcribe.git
    cd ai-transcribe
- Install dependencies:
    npm install
- Create a .env.local file in the root directory with the following variable:
    OPENAI_API_KEY=your-openai-api-key
- Start the development server:
    npm run dev
- Open http://localhost:3000 in your browser to use the application.
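If the key is missing from .env.local, the failure can otherwise surface late as an authentication error from the API. A small guard like the one below, with the hypothetical name `requireEnv`, is one way to fail fast with a clearer message; it is a sketch, not something the project is known to include.

```typescript
// Hypothetical helper: read a required environment variable or fail fast.
type Env = Record<string, string | undefined>;

function requireEnv(name: string, env: Env = process.env): string {
  // Treat an empty or whitespace-only value the same as a missing one.
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`Missing environment variable ${name}; add it to .env.local`);
  }
  return value;
}
```

Calling `requireEnv("OPENAI_API_KEY")` in server-side code works without extra setup, since Next.js loads .env.local automatically.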
- Click on the record button to start capturing audio
- Speak clearly into your microphone
- The application will process your speech and display the transcription in real time
- You can pause, resume, or stop the recording at any time
- Review your transcription history in the recordings list
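The record, pause, resume, and stop controls described above behave like a small state machine. The sketch below models it as a pure transition function; the state and action names are assumptions chosen for illustration and may not match the project's hooks.

```typescript
type RecorderState = "idle" | "recording" | "paused";
type RecorderAction = "start" | "pause" | "resume" | "stop";

// Allowed transitions; any other state/action pair leaves the state unchanged.
const transitions: Record<RecorderState, Partial<Record<RecorderAction, RecorderState>>> = {
  idle: { start: "recording" },
  recording: { pause: "paused", stop: "idle" },
  paused: { resume: "recording", stop: "idle" },
};

function nextRecorderState(state: RecorderState, action: RecorderAction): RecorderState {
  return transitions[state][action] ?? state;
}
```

A function with this shape can be passed directly to React's `useReducer`, which keeps invalid sequences such as pausing before recording from changing the UI state.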
- src/app: Next.js app router pages and layouts
- src/components: Reusable UI components
- src/hooks: Custom React hooks for audio recording and processing
- src/modules: Feature-specific modules (workspace, etc.)
- src/services: Service layer for external API integrations
- src/actions: Server actions for API requests
- src/models: TypeScript type definitions
- src/utils: Utility functions
- src/libs: Third-party library instantiation
- Support for multiple languages
- Speaker identification
- Searchable transcription history
- Export options (PDF, Word, etc.)
- Automatic summarization using AI
Make sure to grant microphone access when your browser prompts for it. If you accidentally deny it, you may need to reset the permission in your browser settings.
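As a sketch of how an app could surface a denied permission to the user, the helper below maps the error names that browsers report from `getUserMedia` to readable messages. The function name and message wording are hypothetical; only the error names come from the browser API.

```typescript
// Hypothetical helper: turn a getUserMedia failure into a user-facing message.
function describeMicrophoneError(error: unknown): string {
  const name =
    typeof error === "object" && error !== null && "name" in error
      ? String((error as { name: unknown }).name)
      : "";
  switch (name) {
    case "NotAllowedError":
      return "Microphone access was denied. Re-enable it in your browser's site settings.";
    case "NotFoundError":
      return "No microphone was found on this device.";
    case "NotReadableError":
      return "The microphone is in use by another application or could not be read.";
    default:
      return "Could not start recording.";
  }
}

// Browser usage (not executed here):
// try {
//   const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
// } catch (error) {
//   console.error(describeMicrophoneError(error));
// }
```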
For best results:
- Use a high-quality microphone
- Minimize background noise
- Speak clearly and at a moderate pace
- Position the microphone close to the speaker
MIT License
Copyright (c) 2025 - Muhammad Arifin