AI Transcribe

An audio transcriber application that captures audio and transforms it into text using OpenAI's Whisper API.

[Screenshot of the application, 2025-04-25]

Detailed Write-up

A two-part series of Medium articles explains this project in detail. Feel free to read those articles before exploring the code to get more context about this project.

Overview

AI Transcribe is an application focused on transcribing audio into text and insights. It captures audio through your device's microphone, processes it, and returns accurate transcriptions within seconds. The project demonstrates how to build a complete audio transcription system with a modern UI using Next.js. The demo video below shows it in action.

[Demo video: Screen.Recording.2025-04-26.at.00.41.11.mp4]

Key Features

  • Real-time Audio Capture: Records audio through your device's microphone
  • Audio Processing: Chunks and processes audio for optimal transcription performance
  • OpenAI Whisper Integration: Utilizes the powerful Whisper API for accurate speech-to-text conversion
  • Modern UI: Built with a clean, responsive interface using Tailwind CSS and Shadcn UI
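
As a rough illustration of the chunking idea above, recorded parts can be grouped into batches that each stay under an upload limit. This is only a sketch, not the project's actual code: `batchBySize`, `Sized`, and `MAX_UPLOAD_BYTES` are hypothetical names, and the 25 MB default is assumed from OpenAI's documented file-size limit for audio uploads. It works on any objects with a `size` in bytes, such as `Blob`s.

```typescript
// Assumed limit: OpenAI documents a 25 MB cap on audio file uploads.
const MAX_UPLOAD_BYTES = 25 * 1024 * 1024;

// Anything with a byte size, e.g. a Blob.
interface Sized {
  size: number;
}

// Hypothetical helper: groups parts, in order, into batches whose total
// size never exceeds maxBytes.
function batchBySize<T extends Sized>(
  parts: T[],
  maxBytes: number = MAX_UPLOAD_BYTES,
): T[][] {
  const batches: T[][] = [];
  let current: T[] = [];
  let currentBytes = 0;

  for (const part of parts) {
    if (part.size > maxBytes) {
      throw new RangeError(
        `Part of ${part.size} bytes exceeds the ${maxBytes}-byte limit`,
      );
    }
    // Close the current batch when adding this part would overflow it.
    if (current.length > 0 && currentBytes + part.size > maxBytes) {
      batches.push(current);
      current = [];
      currentBytes = 0;
    }
    current.push(part);
    currentBytes += part.size;
  }

  if (current.length > 0) {
    batches.push(current);
  }
  return batches;
}
```

Grouping by size alone does not guarantee that every batch is independently decodable audio; that depends on the recording format, which this README does not specify.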

Tech Stack

  • Frontend: Next.js 15, React 19, Tailwind CSS
  • UI Components: Shadcn UI components
  • API Integration: OpenAI API, Next.js Server Actions
  • State Management: React Hooks
  • Styling: Tailwind CSS with class-variance-authority
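
To illustrate how the OpenAI API integration listed above might look, here is a minimal sketch that posts an audio file to OpenAI's transcription endpoint with `fetch`. The endpoint URL and the `whisper-1` model name come from OpenAI's API reference; `buildTranscriptionForm`, `transcribeAudio`, and the `recording.webm` filename are hypothetical and not taken from this repository, which may well use the official `openai` client instead.

```typescript
// Endpoint and model name as documented in OpenAI's API reference.
const TRANSCRIPTION_URL = "https://api.openai.com/v1/audio/transcriptions";
const MODEL = "whisper-1";

// Hypothetical helper: builds the multipart body for one audio file.
function buildTranscriptionForm(audio: Blob, filename: string): FormData {
  const form = new FormData();
  form.append("file", audio, filename);
  form.append("model", MODEL);
  return form;
}

// Hypothetical function. In a Next.js project it would be exported from a
// file that starts with the "use server" directive, so the API key never
// reaches the browser.
async function transcribeAudio(
  audio: Blob,
  apiKey: string,
  fetchImpl: typeof fetch = fetch,
): Promise<string> {
  const response = await fetchImpl(TRANSCRIPTION_URL, {
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}` },
    body: buildTranscriptionForm(audio, "recording.webm"),
  });

  if (!response.ok) {
    throw new Error(
      `Transcription request failed with status ${response.status}`,
    );
  }

  // With the default response format the API returns JSON with a "text" field.
  const data = (await response.json()) as { text: string };
  return data.text;
}
```

The `fetchImpl` parameter exists only so the function can be exercised with a stub in tests; normal callers would omit it.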

Getting Started

Prerequisites

  • Node.js 18.18 or higher (the minimum version supported by Next.js 15)
  • An OpenAI API key

Environment Setup

  1. Clone the repository:

    git clone https://github.com/ifindev/ai-transcribe.git
    cd ai-transcribe
  2. Install dependencies:

    npm install
  3. Create a .env.local file in the root directory with the following variables:

    OPENAI_API_KEY=your-openai-api-key
    
  4. Start the development server:

    npm run dev
  5. Open http://localhost:3000 in your browser to use the application.
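
Next.js loads `.env.local` into `process.env` on the server, and variables without the `NEXT_PUBLIC_` prefix are not exposed to the browser. A missing key from step 3 is easier to diagnose if the code fails fast when it reads the variable. The sketch below shows one way to do that; `requireEnv` is a hypothetical helper, not part of this repository.

```typescript
// Hypothetical helper: returns a required variable or throws a clear error.
// The environment is passed in so the function is easy to test.
function requireEnv(
  name: string,
  env: Record<string, string | undefined>,
): string {
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Server-side usage in a Next.js app:
// const apiKey = requireEnv("OPENAI_API_KEY", process.env);
```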

Using the Application

  1. Click on the record button to start capturing audio
  2. Speak clearly into your microphone
  3. The application will process your speech and display the transcription in real time
  4. You can pause, resume, or stop the recording at any time
  5. Review your transcription history in the recordings list
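
The record, pause, resume, and stop controls described above can be modeled as a small state machine. The following is a sketch under assumed names (`RecorderState`, `RecorderAction`, `nextRecorderState`); the project's actual hooks may be organized differently.

```typescript
type RecorderState = "idle" | "recording" | "paused" | "stopped";
type RecorderAction = "start" | "pause" | "resume" | "stop";

// Hypothetical reducer: returns the next state, ignoring actions that
// are not valid in the current state.
function nextRecorderState(
  state: RecorderState,
  action: RecorderAction,
): RecorderState {
  switch (action) {
    case "start":
      return state === "idle" || state === "stopped" ? "recording" : state;
    case "pause":
      return state === "recording" ? "paused" : state;
    case "resume":
      return state === "paused" ? "recording" : state;
    case "stop":
      return state === "recording" || state === "paused" ? "stopped" : state;
  }
}
```

Because the function has the shape of a reducer, it could be passed to React's `useReducer` with `"idle"` as the initial state, and each button would dispatch one of the four actions.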

Project Structure

  • src/app: Next.js app router pages and layouts
  • src/components: Reusable UI components
  • src/hooks: Custom React hooks for audio recording and processing
  • src/modules: Feature-specific modules (workspace, etc.)
  • src/services: Service layer for external API integrations
  • src/actions: Server actions for API requests
  • src/models: TypeScript type definitions
  • src/utils: Utility functions
  • src/libs: Third-party library instantiation

Future Enhancements

  • Support for multiple languages
  • Speaker identification
  • Searchable transcription history
  • Export options (PDF, Word, etc.)
  • Automatic summarization using AI

Troubleshooting

Microphone Access Issues

Make sure to grant microphone access permission when prompted by your browser. If you accidentally denied it, you may need to reset permissions in your browser settings.
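
When `navigator.mediaDevices.getUserMedia` fails, the browser rejects with a `DOMException` whose `name` indicates the cause. As a sketch of how an app might turn those names into user-facing hints, the helper below maps the common ones to messages. `describeMicrophoneError` and the message wording are hypothetical and not taken from this repository.

```typescript
// Hypothetical helper: maps a getUserMedia error name to a readable hint.
function describeMicrophoneError(errorName: string): string {
  switch (errorName) {
    case "NotAllowedError":
      return "Microphone permission was denied. Reset it in your browser's site settings.";
    case "NotFoundError":
      return "No microphone was found. Check that one is connected.";
    case "NotReadableError":
      return "The microphone could not be read. Another application may be using it.";
    default:
      return "Could not access the microphone.";
  }
}

// Browser usage (showMessage is a hypothetical UI function):
// try {
//   const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
// } catch (err) {
//   const name = err instanceof DOMException ? err.name : "";
//   showMessage(describeMicrophoneError(name));
// }
```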

Transcription Quality Issues

For best results:

  • Use a high-quality microphone
  • Minimize background noise
  • Speak clearly and at a moderate pace
  • Position the microphone close to the speaker

License

MIT License

Copyright (c) 2025 - Muhammad Arifin
