
Transform articles and PDFs into audio and key insights. Perfect for learning on the go!


mostofashakib/SnapListen


SnapListen

Transform articles and PDFs into audio with AI-powered summarization. Perfect for learning on the go!


✨ Features

  • 📄 PDF Support: Upload and process PDF documents with intelligent text extraction
  • 🎵 AI Audio Generation: Convert text to natural-sounding speech using OpenAI TTS
  • 📝 Smart Summarization: Get key insights and golden nuggets from your content
  • ⚡ Progressive Loading: Large files are processed in chunks for optimal performance
  • 🎛️ Audio Controls: Play, pause, stop, and adjust playback speed (1x to 2x)
  • 📊 Timeline: Visual progress bar with current time and duration
  • 🎨 Modern UI: Beautiful, responsive interface built with shadcn/ui and Tailwind CSS

🚀 Quick Start

Prerequisites

  • Node.js 18+
  • npm or yarn
  • OpenAI API key

Installation

  1. Clone the repository

    git clone <your-repo-url>
    cd SnapListen
  2. Install dependencies

    npm install
  3. Set up environment variables

    cp .env.example .env.local

    Add your OpenAI API key to .env.local:

    OPENAI_API_KEY=your_openai_api_key_here
    
  4. Run the development server

    npm run dev
  5. Open your browser and navigate to http://localhost:3000

📖 How to Use

Method 1: Paste Text

  1. Click the "Paste Text" tab
  2. Copy and paste your article text into the text area
  3. Click "Convert to Audio"
  4. Wait for processing to complete
  5. Use the audio player controls to listen

Method 2: Upload PDF

  1. Click the "Upload PDF" tab
  2. Drag and drop a PDF file or click to browse
  3. Wait for PDF processing (large files are automatically chunked)
  4. The extracted text will appear in the text area
  5. Click "Convert to Audio" to generate audio
  6. Use the audio player controls to listen

Audio Player Controls

  • ▶️ Play/Pause: Start or pause audio playback
  • ⏹️ Stop: Stop playback and reset to beginning
  • 🎚️ Speed Control: Adjust playback speed (1x, 1.25x, 1.5x, 1.75x, 2x)
  • 📊 Timeline: Click anywhere on the progress bar to seek to that position
  • ⏱️ Time Display: Shows current time and total duration
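
The seek and time-display behavior described above can be sketched with two small pure functions. This is an illustrative sketch, not the code in AudioPlayer.tsx: the names `seekTimeFromClick` and `formatTime` are hypothetical, and the `m:ss` display format is an assumption.

```typescript
// Hypothetical helper: maps a click on the progress bar to a playback
// position in seconds. clickX and barLeft are in the same coordinate space
// (for example, viewport pixels).
export function seekTimeFromClick(
  clickX: number,
  barLeft: number,
  barWidth: number,
  duration: number,
): number {
  // Guard against an unmeasured bar or a duration that is not loaded yet.
  if (barWidth <= 0 || !Number.isFinite(duration)) return 0;
  // Clamp so clicks slightly outside the bar map to the start or the end.
  const ratio = Math.min(1, Math.max(0, (clickX - barLeft) / barWidth));
  return ratio * duration;
}

// Hypothetical helper: formats seconds as m:ss for the time display.
export function formatTime(seconds: number): string {
  if (!Number.isFinite(seconds)) return "0:00";
  const total = Math.max(0, Math.floor(seconds));
  const minutes = Math.floor(total / 60);
  const rest = total % 60;
  return `${minutes}:${rest.toString().padStart(2, "0")}`;
}
```

In a click handler, the result would typically be assigned to the `currentTime` property of the audio element, using the bar's `getBoundingClientRect()` for `barLeft` and `barWidth`.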

🛠️ Technical Details

Architecture

  • Frontend: Next.js 14 with App Router, React 18, TypeScript
  • UI Components: shadcn/ui with Radix UI primitives
  • Styling: Tailwind CSS with custom design system
  • PDF Processing: pdf-parse with intelligent chunking
  • AI Services: OpenAI GPT-4o-mini for text processing, OpenAI TTS for speech generation
  • Audio: HTML5 Audio API with progressive loading

File Processing

  • PDF Support: Extracts text from PDFs using pdf-parse
  • Text Chunking: Automatically splits large texts into 4000-character chunks
  • Progressive Loading: Processes chunks sequentially to avoid API limits
  • Error Handling: Comprehensive error handling with user-friendly messages
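
As a rough illustration of the chunking step, the sketch below splits text into pieces of at most 4000 characters. The function name `chunkText` and the choice to break at the last space are assumptions made for this example; the actual logic in process-pdf/route.ts may differ.

```typescript
// Hypothetical helper: splits text into chunks of at most maxChunkSize
// characters, preferring to break at a space so words stay intact.
export function chunkText(text: string, maxChunkSize = 4000): string[] {
  if (maxChunkSize <= 0) {
    throw new RangeError("maxChunkSize must be positive");
  }
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    let end = Math.min(start + maxChunkSize, text.length);
    if (end < text.length) {
      // Back up to the last space inside the window, if there is one.
      const lastSpace = text.lastIndexOf(" ", end);
      if (lastSpace > start) end = lastSpace;
    }
    const chunk = text.slice(start, end).trim();
    if (chunk.length > 0) chunks.push(chunk);
    start = end;
  }
  return chunks;
}
```

Breaking at whitespace keeps each chunk readable on its own, which matters when every chunk is sent to a speech model separately.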

API Endpoints

  • POST /api/summarize - Generate AI summary from text
  • POST /api/tts - Convert text to speech with chunking support
  • POST /api/process-pdf - Extract and process text from PDF files
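
A client call to the text endpoints might look like the sketch below. The request and response shapes are not documented here, so the `{ text }` JSON body is an assumption, and `buildJsonPost`, `postText`, and `FetchLike` are hypothetical names. The fetch function is passed in explicitly so the sketch does not depend on a particular runtime.

```typescript
// Assumed request shape: a JSON body with a single "text" field.
interface JsonPostInit {
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Minimal structural type that the standard fetch function satisfies.
type FetchLike = (
  url: string,
  init: JsonPostInit,
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Hypothetical helper: builds the POST options for a text payload.
export function buildJsonPost(text: string): JsonPostInit {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text }),
  };
}

// Hypothetical helper: posts text to an endpoint such as /api/summarize
// and returns the parsed JSON response, throwing on non-2xx statuses.
export async function postText(
  fetchFn: FetchLike,
  endpoint: string,
  text: string,
): Promise<unknown> {
  const response = await fetchFn(endpoint, buildJsonPost(text));
  if (!response.ok) {
    throw new Error(`Request to ${endpoint} failed with status ${response.status}`);
  }
  return response.json();
}
```

In the browser this could be called as `postText(fetch, "/api/summarize", articleText)`. Check the route handlers for the field names they actually expect.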

🔧 Configuration

Environment Variables

OPENAI_API_KEY=your_openai_api_key_here
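
A missing or placeholder key is easier to diagnose if the server fails early with a clear message. The sketch below shows one way to do that; `requireOpenAiKey` is a hypothetical helper and is not part of the project.

```typescript
// Hypothetical helper: returns the API key or throws a descriptive error.
// Pass in process.env (or any object with the same shape).
export function requireOpenAiKey(
  env: Record<string, string | undefined>,
): string {
  const key = env.OPENAI_API_KEY?.trim();
  // Treat the placeholder from .env.example as "not set".
  if (!key || key === "your_openai_api_key_here") {
    throw new Error("OPENAI_API_KEY is missing. Add it to .env.local.");
  }
  return key;
}
```

An API route could call `requireOpenAiKey(process.env)` before creating the OpenAI client.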

Customization

  • Chunk Size: Modify maxChunkSize in /app/api/process-pdf/route.ts (default: 4000)
  • TTS Voice: Change voice in /app/api/tts/route.ts (default: "alloy")
  • Speed Options: Modify SPEED_OPTIONS in /components/AudioPlayer.tsx
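
For example, the speed options could be defined and cycled as in the sketch below. The values match the speeds listed under Audio Player Controls, but the exact shape of `SPEED_OPTIONS` in AudioPlayer.tsx is an assumption, and `nextSpeed` is a hypothetical helper.

```typescript
// Assumed shape of the constant: the playback rates offered in the UI.
export const SPEED_OPTIONS = [1, 1.25, 1.5, 1.75, 2] as const;
export type PlaybackSpeed = (typeof SPEED_OPTIONS)[number];

// Hypothetical helper: returns the next speed in the list, wrapping around
// after the last one. Unknown speeds fall back to the first option.
export function nextSpeed(current: number): PlaybackSpeed {
  const index = SPEED_OPTIONS.indexOf(current as PlaybackSpeed);
  return SPEED_OPTIONS[(index + 1) % SPEED_OPTIONS.length];
}
```

The chosen value would then be applied through the audio element's `playbackRate` property.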

📁 Project Structure

SnapListen/
├── app/
│   ├── api/
│   │   ├── process-pdf/route.ts    # PDF processing endpoint
│   │   ├── summarize/route.ts      # AI summarization endpoint
│   │   └── tts/route.ts           # Text-to-speech endpoint
│   ├── globals.css                # Global styles
│   ├── layout.tsx                 # Root layout
│   └── page.tsx                   # Main page
├── components/
│   ├── ui/                        # shadcn/ui components
│   ├── ArticleInput.tsx          # Input component
│   ├── AudioPlayer.tsx           # Audio player component
│   └── Summary.tsx               # Summary display component
├── lib/
│   └── utils.ts                  # Utility functions
├── types/
│   └── index.ts                  # TypeScript type definitions
└── README.md

🚨 Troubleshooting

Common Issues

PDF Processing Fails

  • Ensure the PDF contains extractable text (not just images)
  • Check file size (very large files may take longer)
  • Verify that the OpenAI API key is set correctly

Audio Playback Issues

  • Check browser audio permissions
  • Ensure stable internet connection for TTS generation
  • Try refreshing the page if audio doesn't load

TypeScript Errors

  • Run npm install to ensure all dependencies are installed
  • Check that @types/pdf-parse is installed for PDF processing

Performance Tips

  • For very large PDFs, processing may take several minutes
  • The app automatically chunks large files to optimize performance
  • Audio generation starts immediately after the first chunk is processed
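
The sequential, start-early behavior described in these tips can be sketched as a simple loop that awaits each chunk in turn and reports the first result as soon as it is ready. `processSequentially` and its parameters are hypothetical names for illustration; the `synthesize` callback stands in for whatever call produces audio for one chunk.

```typescript
// Hypothetical helper: processes chunks one at a time (never in parallel)
// and invokes onFirstResult as soon as the first chunk is done, so playback
// can begin while the remaining chunks are still being generated.
export async function processSequentially<T>(
  chunks: string[],
  synthesize: (chunk: string, index: number) => Promise<T>,
  onFirstResult?: (result: T) => void,
): Promise<T[]> {
  const results: T[] = [];
  for (let i = 0; i < chunks.length; i++) {
    // Awaiting inside the loop keeps only one request in flight.
    const result = await synthesize(chunks[i], i);
    if (i === 0) onFirstResult?.(result);
    results.push(result);
  }
  return results;
}
```

Keeping one request in flight at a time trades total throughput for predictable load on the API, which suits rate-limited services.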

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

📞 Support

If you encounter any issues or have questions, please:

  1. Check the troubleshooting section above
  2. Search existing issues in the repository
  3. Create a new issue with detailed information about your problem

Happy listening with SnapListen! 🎧
