Transform articles and PDFs into audio with AI-powered summarization. Perfect for learning on the go!
- 📄 PDF Support: Upload and process PDF documents with intelligent text extraction
- 🎵 AI Audio Generation: Convert text to natural-sounding speech using OpenAI TTS
- 📝 Smart Summarization: Get key insights and golden nuggets from your content
- ⚡ Progressive Loading: Large files are processed in chunks for optimal performance
- 🎛️ Audio Controls: Play, pause, stop, and adjust playback speed (1x to 2x)
- 📊 Timeline: Visual progress bar with current time and duration
- 🎨 Modern UI: Beautiful, responsive interface built with shadcn/ui and Tailwind CSS
- Node.js 18+
- npm or yarn
- OpenAI API key
- Clone the repository
  git clone <your-repo-url>
  cd SnapListen
- Install dependencies
  npm install
- Set up environment variables
  cp .env.example .env.local
  Add your OpenAI API key to .env.local:
  OPENAI_API_KEY=your_openai_api_key_here
- Run the development server
  npm run dev
- Open your browser
  Navigate to http://localhost:3000
- Click on "Paste Text" tab
- Copy and paste your article text into the text area
- Click "Convert to Audio"
- Wait for processing to complete
- Use the audio player controls to listen
- Click on "Upload PDF" tab
- Drag and drop a PDF file or click to browse
- Wait for PDF processing (large files are automatically chunked)
- The extracted text will appear in the text area
- Click "Convert to Audio" to generate audio
- Use the audio player controls to listen
- ▶️ Play/Pause: Start or pause audio playback
- ⏹️ Stop: Stop playback and reset to beginning
- 🎚️ Speed Control: Adjust playback speed (1x, 1.25x, 1.5x, 1.75x, 2x)
- 📊 Timeline: Click anywhere on the progress bar to seek to that position
- ⏱️ Time Display: Shows current time and total duration
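The timeline and time display described above come down to two small calculations. The following is a minimal sketch, not the project's actual code: `seekTimeFromClick` and `formatTime` are hypothetical helper names, and the `m:ss` display format is an assumption.

```typescript
// Hypothetical helpers for the player UI; names are illustrative only.

// Map a click on the progress bar to a playback position in seconds.
// clickX and barLeft are horizontal pixel coordinates, barWidth is the
// rendered width of the bar, duration is the audio length in seconds.
function seekTimeFromClick(
  clickX: number,
  barLeft: number,
  barWidth: number,
  duration: number
): number {
  if (barWidth <= 0 || !isFinite(duration) || duration <= 0) {
    return 0;
  }
  // Clamp to [0, 1] so clicks slightly outside the bar stay in range.
  const ratio = Math.min(1, Math.max(0, (clickX - barLeft) / barWidth));
  return ratio * duration;
}

// Format a number of seconds as m:ss for the time display.
function formatTime(seconds: number): string {
  const safe = isFinite(seconds) && seconds > 0 ? Math.floor(seconds) : 0;
  const minutes = Math.floor(safe / 60);
  const secs = safe % 60;
  return minutes + ":" + (secs < 10 ? "0" : "") + secs;
}
```

In a browser, the result could be assigned to the audio element's `currentTime`, with `barLeft` and `barWidth` taken from the progress bar's `getBoundingClientRect()`.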
- Frontend: Next.js 14 with App Router, React 18, TypeScript
- UI Components: shadcn/ui with Radix UI primitives
- Styling: Tailwind CSS with custom design system
- PDF Processing: pdf-parse with intelligent chunking
- AI Services: OpenAI GPT-4o-mini for text processing and OpenAI TTS for speech generation
- Audio: HTML5 Audio API with progressive loading
- PDF Support: Extracts text from PDFs using pdf-parse
- Text Chunking: Automatically splits large texts into 4000-character chunks
- Progressive Loading: Processes chunks sequentially to avoid API limits
- Error Handling: Comprehensive error handling with user-friendly messages
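The chunking step above can be sketched as follows. This is an illustration under stated assumptions rather than the code in `route.ts`: the function name `chunkText` is hypothetical, and preferring whitespace as the break point is an assumption. Only the `maxChunkSize` name and the 4000-character default come from this README.

```typescript
// Hypothetical helper: split text into chunks of at most maxChunkSize
// characters, preferring to break at a space so words are not cut in half.
function chunkText(text: string, maxChunkSize: number = 4000): string[] {
  if (maxChunkSize <= 0) {
    throw new Error("maxChunkSize must be positive");
  }
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    let end = Math.min(start + maxChunkSize, text.length);
    if (end < text.length) {
      // Search backwards from the window edge for the last space.
      const lastSpace = text.lastIndexOf(" ", end);
      if (lastSpace > start) {
        end = lastSpace;
      }
      // Otherwise there is no usable space: fall back to a hard cut.
    }
    const chunk = text.slice(start, end).trim();
    if (chunk.length > 0) {
      chunks.push(chunk);
    }
    start = end;
  }
  return chunks;
}
```

Each chunk can then be sent to the speech endpoint one at a time, in order, which keeps individual requests small.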
- POST /api/summarize: Generate AI summary from text
- POST /api/tts: Convert text to speech with chunking support
- POST /api/process-pdf: Extract and process text from PDF files
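As a rough illustration of calling the text-based endpoints from client code, the sketch below builds the options for a JSON POST. The request shape is an assumption: this README does not document the payload, so the `text` field and the `buildTextRequest` helper name are hypothetical. Check the route handlers under `app/api/` for the fields they actually expect.

```typescript
// Assumed request shape: a JSON body with a single `text` field.
// The real handlers may expect different or additional fields.
interface JsonPostInit {
  method: "POST";
  headers: { "Content-Type": string };
  body: string;
}

// Hypothetical helper: build the options for a JSON POST carrying text.
function buildTextRequest(text: string): JsonPostInit {
  const trimmed = text.trim();
  if (trimmed.length === 0) {
    throw new Error("text must not be empty");
  }
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text: trimmed }),
  };
}
```

The result can be passed as the second argument to `fetch`, for example `fetch("/api/summarize", buildTextRequest(articleText))`. The PDF endpoint takes a file upload, so it would need a different request body that is not covered here.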
OPENAI_API_KEY=your_openai_api_key_here
- Chunk Size: Modify maxChunkSize in /app/api/process-pdf/route.ts (default: 4000)
- TTS Voice: Change voice in /app/api/tts/route.ts (default: "alloy")
- Speed Options: Modify SPEED_OPTIONS in /components/AudioPlayer.tsx
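For orientation, a speed-options constant could look like the sketch below. The values match the speeds listed under the audio controls, but the exact shape of `SPEED_OPTIONS` in `AudioPlayer.tsx` is an assumption, and `nextSpeed` is a hypothetical helper that is not part of the project.

```typescript
// Assumed shape; the actual constant in AudioPlayer.tsx may differ.
const SPEED_OPTIONS: readonly number[] = [1, 1.25, 1.5, 1.75, 2];

// Hypothetical helper: step to the next speed, wrapping back to the first
// option after the last one. Speeds not in the list also map to the first.
function nextSpeed(current: number): number {
  const index = SPEED_OPTIONS.indexOf(current);
  return SPEED_OPTIONS[(index + 1) % SPEED_OPTIONS.length];
}
```

In a browser, the chosen value would be applied through the audio element's `playbackRate` property.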
SnapListen/
├── app/
│ ├── api/
│ │ ├── process-pdf/route.ts # PDF processing endpoint
│ │ ├── summarize/route.ts # AI summarization endpoint
│ │ └── tts/route.ts # Text-to-speech endpoint
│ ├── globals.css # Global styles
│ ├── layout.tsx # Root layout
│ └── page.tsx # Main page
├── components/
│ ├── ui/ # shadcn/ui components
│ ├── ArticleInput.tsx # Input component
│ ├── AudioPlayer.tsx # Audio player component
│ └── Summary.tsx # Summary display component
├── lib/
│ └── utils.ts # Utility functions
├── types/
│ └── index.ts # TypeScript type definitions
└── README.md
PDF Processing Fails
- Ensure the PDF contains extractable text (not just images)
- Check file size (very large files may take longer)
- Verify OpenAI API key is correctly set
Audio Playback Issues
- Check browser audio permissions
- Ensure stable internet connection for TTS generation
- Try refreshing the page if audio doesn't load
TypeScript Errors
- Run npm install to ensure all dependencies are installed
- Check that @types/pdf-parse is installed for PDF processing
- For very large PDFs, processing may take several minutes
- The app automatically chunks large files to optimize performance
- Audio generation starts immediately after the first chunk is processed
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Next.js for the amazing React framework
- shadcn/ui for beautiful UI components
- OpenAI for AI and TTS services
- pdf-parse for PDF text extraction
If you encounter any issues or have questions, please:
- Check the troubleshooting section above
- Search existing issues in the repository
- Create a new issue with detailed information about your problem
Happy listening with SnapListen! 🎧