A Python utility that automatically converts PDF documents into narrated video presentations with synchronized slides and audio.
This tool takes a PDF document as input and performs the following transformations:
- Extracts text content from the PDF
- Uses GPT-4o to generate slide content and voice-over scripts
- Creates a PowerPoint presentation with bullet points
- Generates text-to-speech audio narration
- Combines slide images and audio into a complete video presentation
It is intended for quickly turning documentation, articles, or educational materials into video content without manual editing.
- Automated Content Processing: Extracts and chunks PDF text for optimal processing
- AI-Powered Summarization: Uses OpenAI's GPT-4o to create concise slide bullets and detailed voice-over scripts
- Professional Presentation Generation: Creates structured PowerPoint presentations
- Natural Text-to-Speech: Converts scripts to audio using gTTS (Google Text-to-Speech)
- Complete Video Production: Combines slides and audio into a synchronized video
Python packages:
- PyMuPDF (fitz): For PDF text extraction
- OpenAI: For AI-powered content generation
- gTTS: For text-to-speech conversion
- python-pptx: For PowerPoint generation
- moviepy: For video creation
- pdf2image: For converting slides to images
- pydantic: For data validation
- python-dotenv: For environment variable management
External tools:
- LibreOffice: Required for converting PowerPoint to PDF
- Poppler: Required by pdf2image for PDF processing
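Because LibreOffice and Poppler are external programs rather than Python packages, a missing install usually only surfaces late in a run. The following is a minimal sketch of an up-front check, not part of the project itself. It assumes LibreOffice is reachable on `PATH` as `soffice` or `libreoffice` and that Poppler provides `pdftoppm`; the helper name `missing_tools` is hypothetical.

```python
import shutil
from typing import Callable, Dict, List, Optional

# Executable names are assumptions about a typical install; on some systems
# (for example macOS app bundles) LibreOffice may not be on PATH at all.
REQUIRED_TOOLS: Dict[str, List[str]] = {
    "LibreOffice": ["soffice", "libreoffice"],
    "Poppler": ["pdftoppm"],
}


def missing_tools(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the names of required tools with no executable found on PATH."""
    missing = []
    for name, candidates in REQUIRED_TOOLS.items():
        # A tool counts as present if any of its candidate executables resolves.
        if not any(which(candidate) for candidate in candidates):
            missing.append(name)
    return missing
```

Calling `missing_tools()` before starting a conversion returns an empty list when both tools are found, and otherwise the names of the ones to install.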
- Clone the repository:

      git clone https://github.com/yourusername/pdf-to-video-converter.git
      cd pdf-to-video-converter

- Install required Python packages:

      pip install PyMuPDF openai gTTS python-pptx moviepy pdf2image pydantic python-dotenv

- Install external dependencies:
  - LibreOffice: https://www.libreoffice.org/download/
  - Poppler:
    - On Ubuntu: `sudo apt-get install poppler-utils`
    - On macOS: `brew install poppler`
    - On Windows: Download from http://blog.alivate.com.au/poppler-windows/
- Create a `.env` file with your OpenAI configuration:

      ENDPOINT=https://api.openai.com/v1  # Or your custom endpoint
      TOKEN=your_openai_api_key
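For orientation, the `ENDPOINT` and `TOKEN` values could be read and validated roughly as follows. This is a sketch under assumptions, not the project's actual code: `Settings`, `load_settings`, and `make_client` are hypothetical names, and passing the values to `OpenAI(base_url=..., api_key=...)` assumes version 1.x or later of the `openai` package.

```python
import os
from typing import Mapping, NamedTuple, Optional


class Settings(NamedTuple):
    endpoint: str
    token: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read ENDPOINT and TOKEN, failing early if either is missing or empty."""
    if env is None:
        # python-dotenv loads key=value pairs from .env into os.environ.
        from dotenv import load_dotenv

        load_dotenv()
        env = os.environ
    missing = [key for key in ("ENDPOINT", "TOKEN") if not env.get(key)]
    if missing:
        raise RuntimeError("Missing settings in .env: " + ", ".join(missing))
    return Settings(endpoint=env["ENDPOINT"], token=env["TOKEN"])


def make_client(settings: Settings):
    """Build an OpenAI client (assumes the openai>=1.0 client interface)."""
    from openai import OpenAI

    return OpenAI(base_url=settings.endpoint, api_key=settings.token)
```

Passing an explicit mapping to `load_settings` skips the `.env` lookup, which keeps the validation easy to exercise without touching the real environment.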
- Import the main function and provide a path to your PDF:

      from pdf_to_video import main
      main("/path/to/your/document.pdf")

- The script will generate:
  - `voice.mp3`: The narration audio file
  - `presentation.pptx`: The PowerPoint presentation
  - `final_video.mp4`: The complete presentation video
- Adjust `max_chunk_chars` in `chunk_text()` to modify how the PDF content is split
- Modify the prompt in `generate_chunk_content()` to change how the AI creates slides and scripts
- Update the presentation styling in `generate_presentation()` for different visual designs
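To make the effect of `max_chunk_chars` concrete, here is a minimal sketch of what a chunking function with that parameter could look like. Only the names `chunk_text` and `max_chunk_chars` and the 2500-character default come from this README; the body is an assumption that packs whitespace-separated words greedily, and the project's real splitting logic may differ.

```python
from typing import List


def chunk_text(text: str, max_chunk_chars: int = 2500) -> List[str]:
    """Split text into chunks of at most max_chunk_chars, breaking on whitespace."""
    if max_chunk_chars <= 0:
        raise ValueError("max_chunk_chars must be positive")
    chunks: List[str] = []
    current = ""
    for word in text.split():
        # A single word longer than the limit is hard-split into pieces.
        while len(word) > max_chunk_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:max_chunk_chars])
            word = word[max_chunk_chars:]
        candidate = word if not current else current + " " + word
        if len(candidate) <= max_chunk_chars:
            current = candidate
        else:
            # The word does not fit: close the current chunk and start a new one.
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks
```

In a sketch like this, a smaller `max_chunk_chars` yields more chunks, and since each chunk is sent to the model separately, more generated slide content with shorter scripts per chunk.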
- Text Extraction: The PDF is processed to extract all text content
- Content Chunking: Text is divided into manageable chunks (default 2500 chars)
- AI Processing: Each chunk is sent to GPT-4o to generate slide bullet points and voice-over script
- Audio Generation: The complete script is converted to speech using gTTS
- Presentation Creation: A PowerPoint presentation is created with the bullet points
- Slide Conversion: Slides are converted to images via LibreOffice and pdf2image
- Video Assembly: Slides and audio are combined into the final video using moviepy
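The slide conversion step in the list above can be sketched as follows. This is an illustration under assumptions rather than the project's actual code: it assumes LibreOffice is available on `PATH` as `soffice`, and the function names, output file names, and `dpi` default are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List


def build_convert_command(pptx_path: str, out_dir: str) -> List[str]:
    """Build the headless LibreOffice command that exports a .pptx to PDF."""
    return [
        "soffice", "--headless",
        "--convert-to", "pdf",
        "--outdir", out_dir,
        pptx_path,
    ]


def slides_to_images(pptx_path: str, out_dir: str, dpi: int = 150) -> List[str]:
    """Convert each slide to a PNG file and return the image paths in order."""
    # Imported here because pdf2image is third-party and needs Poppler installed.
    from pdf2image import convert_from_path

    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_convert_command(pptx_path, out_dir), check=True)

    # LibreOffice writes <stem>.pdf into the output directory.
    pdf_path = Path(out_dir) / (Path(pptx_path).stem + ".pdf")
    image_paths = []
    for index, page in enumerate(convert_from_path(str(pdf_path), dpi=dpi), start=1):
        target = Path(out_dir) / f"slide_{index:03d}.png"
        page.save(target, "PNG")
        image_paths.append(str(target))
    return image_paths
```

Going through PDF as an intermediate format is what makes both external tools necessary: LibreOffice renders the slides, and Poppler (via pdf2image) rasterizes the resulting pages into images that moviepy can place on the timeline.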
- This project uses OpenAI's GPT-4o for natural language processing
- Text-to-speech provided by Google's TTS service