PT - Audio Transcription Tool

An audio transcription tool built with Next.js and the OpenAI Whisper API. It transcribes audio files and generates AI summaries of the transcripts.

✨ Features

  • 🎯 Supports both file upload and URL input
  • 🎙️ Supports Xiaoyuzhou podcast transcription
  • 📝 High-quality audio transcription using OpenAI Whisper API
  • 📊 AI-powered content summarization
  • 🎨 Modern UI design
  • 💾 Download transcripts and summaries
  • 🎵 Built-in audio player
  • 🖥️ CLI tool support (pt command)
  • 📋 SRT subtitle format output
  • 🔄 Chunked processing for large audio files
  • ⚡ Parallel transcription for better performance
  • 📤 Multiple output formats (text, JSON, markdown, SRT)

📦 CLI Installation

Install via npm

npm install -g @winterfx/pt

Configure API Key

Choose one of the following methods:

Option 1: Environment Variable (Recommended)

# Add to ~/.zshrc or ~/.bashrc
export API_KEY="your-api-key"
export BASE_URL="https://api.openai.com/v1"  # optional

Option 2: Config File (~/.pt/.env)

mkdir -p ~/.pt
cat > ~/.pt/.env << 'EOF'
API_KEY=your-api-key
BASE_URL=https://api.openai.com/v1
EOF

Option 3: Current Directory (.env)

cat > .env << 'EOF'
API_KEY=your-api-key
BASE_URL=https://api.openai.com/v1
EOF
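
With three possible locations for the key, it can be useful to see which of them are actually populated on a given machine. The helper below is a minimal sketch: `pt_config_sources` is a hypothetical name, not part of the `pt` CLI, and it only inspects the three locations listed above. It makes no claim about which location `pt` prefers when several are present.

```shell
# Hypothetical helper, not shipped with pt.
# Prints every documented location where an API_KEY is currently defined.
pt_config_sources() {
  # Option 1: variable set in the current shell environment
  if [ -n "${API_KEY:-}" ]; then
    echo "environment"
  fi
  # Options 2 and 3: the user-level and current-directory .env files
  for f in "$HOME/.pt/.env" "./.env"; do
    if [ -f "$f" ] && grep -q '^API_KEY=' "$f"; then
      echo "$f"
    fi
  done
}
```

Run `pt_config_sources` from the directory where you plan to call `pt`. Empty output means none of the three locations defines a key.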

🚀 Web App Development

Prerequisites

  • Node.js 18+
  • OpenAI API Key
  • FFmpeg (required for audio processing)

Installing FFmpeg

# macOS
brew install ffmpeg

# Linux (Ubuntu/Debian)
sudo apt-get install ffmpeg

# Windows
choco install ffmpeg
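
After installing, you may want to confirm that FFmpeg is on your `PATH` before starting the app. The function below is a small sketch (`check_ffmpeg` is a hypothetical name, not part of this project) that prints the first line of `ffmpeg -version` or reports that the binary is missing.

```shell
# Hypothetical helper: print the installed FFmpeg version line,
# or a hint on stderr if ffmpeg cannot be found.
check_ffmpeg() {
  if command -v ffmpeg >/dev/null 2>&1; then
    ffmpeg -version | head -n 1
  else
    echo "ffmpeg not found on PATH" >&2
    return 1
  fi
}
```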

Installation

  1. Clone the repository:
git clone https://github.com/yourusername/podcast-transcription.git
cd podcast-transcription
  2. Install dependencies:
npm install
# or
yarn install
# or
pnpm install
  3. Configure environment variables: Create a .env.local file and add:
API_KEY=your_openai_api_key
BASE_URL=your_endpoint
  4. Start the development server:
npm run dev
# or
yarn dev
# or
pnpm dev

Visit http://localhost:3000 to view the app.

Docker Deployment

  1. Build the Docker image:
docker build -t podcast-transcription .
  2. Run the container:
docker run -p 3000:3000 podcast-transcription
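
The `docker run` command above does not pass any API settings into the container. The sketch below forwards them from the current shell with Docker's `-e` flag. It assumes the containerized app reads `API_KEY` and `BASE_URL` from its environment at runtime, which this README does not confirm; `run_pt_container` is a hypothetical wrapper name.

```shell
# Hypothetical wrapper: run the image with API settings from the current shell.
# Assumption: the app inside the container reads API_KEY and BASE_URL at runtime.
run_pt_container() {
  docker run -p 3000:3000 \
    -e API_KEY="${API_KEY:?API_KEY must be set}" \
    -e BASE_URL="${BASE_URL:-https://api.openai.com/v1}" \
    podcast-transcription
}
```

If you already keep these values in `.env.local`, Docker's `--env-file .env.local` option is an alternative to the two `-e` flags.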

🖥️ CLI Tool

The project includes a command-line tool pt for transcribing audio files directly from the terminal.

CLI Usage

pt <input> [options]

Arguments:

  • <input> - Local file path or audio URL

Options:

  • -s, --summary - Generate AI summary after transcription
  • -l, --language <lang> - Language code: auto, en, zh, etc. (default: auto)
  • -o, --output <file> - Output file path (default: stdout)
  • --output-format <format> - Output format: text, json, markdown, srt (default: text)
  • -q, --quiet - Suppress progress output

CLI Examples

# Transcribe a local audio file
pt /path/to/podcast.mp3

# Transcribe with AI summary
pt podcast.mp3 --summary

# Generate SRT subtitles
pt podcast.mp3 --output-format srt -o subtitles.srt

# JSON output with summary
pt podcast.mp3 --summary --output-format json -o result.json

# Transcribe from URL
pt https://example.com/audio.mp3 --summary
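
The documented options can also be combined in a shell loop to process many files at once. The function below is a sketch rather than a built-in feature: `transcribe_dir` is a hypothetical name, and it uses only the flags listed under Options (`-q`, `-l`, `--output-format`, `-o`).

```shell
# Hypothetical helper: transcribe every .mp3 in a directory
# to an .srt file with the same base name.
#   $1 = directory (default: current directory)
#   $2 = language code (default: auto)
transcribe_dir() {
  dir="${1:-.}"
  lang="${2:-auto}"
  for f in "$dir"/*.mp3; do
    [ -e "$f" ] || continue   # skip when the glob matches nothing
    pt "$f" -q -l "$lang" --output-format srt -o "${f%.mp3}.srt"
  done
}
```

For example, `transcribe_dir ./episodes en` would write `episode1.srt` next to `episode1.mp3`, and so on for each file.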

Running the CLI

# Via npm script (the -- keeps npm from consuming pt's options)
npm run pt -- <input> [options]

# Or link the local checkout so pt is available globally
npm link
pt <input> [options]

🤝 Contributing

Pull Requests and Issues are welcome!

📄 License

MIT License - See LICENSE file for details.
