vhar-astro/storyboardgenerator

Storyboard Generator 🎬

An AI-powered storyboard generator that turns a single input image or sketch into a sequential, frame-by-frame storyboard using the Google Gemini API.

Features

  • 🎨 Generate storyboards from sketches or images - Transform a single image into a complete storyboard sequence
  • 🌍 Multilingual support - Works with prompts in Russian, English, and other languages
  • Flexible frame rates - Customize duration and FPS to match your needs
  • 📄 Multiple output formats - Export as PDF storyboard or numbered JPEG sequence
  • 🏷️ Professional labels - Each frame includes frame number, timestamp, and description
  • 🎨 AI-Powered Image Generation - Uses Gemini 2.5 Flash Image for real frame generation
  • 💰 Cost-effective - Uses efficient Gemini models (gemini-flash-latest for text, gemini-2.5-flash-image for images)

Installation

Prerequisites

Setup

  1. Clone or navigate to the repository:

    cd ~/projects/storyboardgen/storyboardgenerator
  2. Create a virtual environment (recommended):

    python3 -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies:

    pip install -r requirements.txt
  4. Configure API key:

    cp .env.example .env
    # Edit .env and add your Gemini API key

    Or set it as an environment variable:

    export GEMINI_API_KEY="your_api_key_here"
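As a rough illustration of how a script could pick up the key configured above, here is a minimal sketch. The function name `resolve_api_key` and the precedence (an explicit value first, then the `GEMINI_API_KEY` environment variable) are assumptions for illustration, not the project's confirmed behavior.

```python
import os


def resolve_api_key(cli_value=None, env=None):
    """Return an API key from an explicit value or GEMINI_API_KEY.

    Hypothetical helper: the precedence shown here is an assumption.
    """
    env = os.environ if env is None else env
    # Prefer the explicit value, otherwise fall back to the environment.
    key = (cli_value or env.get("GEMINI_API_KEY", "")).strip()
    if not key:
        raise RuntimeError(
            "No Gemini API key found: set GEMINI_API_KEY or pass a key explicitly"
        )
    return key
```

Passing `env` explicitly makes the helper easy to test without touching the real environment.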

Usage

Basic Command Structure

python main.py -i INPUT_IMAGE -p "PROMPT" -d DURATION -f FPS [OPTIONS]

Required Arguments

  • -i, --input - Path to input image/sketch (JPEG, PNG, etc.)
  • -p, --prompt - Text description of the action/scene
  • -d, --duration - Scene duration in seconds
  • -f, --fps - Frames per second (FPS) rate

Optional Arguments

  • -o, --output-format - Output format: jpeg, pdf, or both (default: both)
  • -n, --name - Project name for output files (default: storyboard)
  • --output-dir - Output directory (default: output)
  • --frames-per-page - Frames per page in PDF: 1, 2, or 4 (default: 2)
  • --no-labels - Generate frames without labels
  • --api-key - Provide API key directly (alternative to .env file)
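The flags listed above map naturally onto `argparse`. The sketch below is one way such a parser might be declared; the function name `build_parser`, the help strings, and the argument types (`float` for duration, `int` for FPS) are assumptions, and the real `src/cli.py` may differ.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented flags (illustrative sketch)."""
    parser = argparse.ArgumentParser(prog="main.py")
    # Required arguments
    parser.add_argument("-i", "--input", required=True, help="Path to input image/sketch")
    parser.add_argument("-p", "--prompt", required=True, help="Description of the action/scene")
    parser.add_argument("-d", "--duration", type=float, required=True, help="Scene duration in seconds")
    parser.add_argument("-f", "--fps", type=int, required=True, help="Frames per second")
    # Optional arguments with the documented defaults
    parser.add_argument("-o", "--output-format", choices=["jpeg", "pdf", "both"], default="both")
    parser.add_argument("-n", "--name", default="storyboard")
    parser.add_argument("--output-dir", default="output")
    parser.add_argument("--frames-per-page", type=int, choices=[1, 2, 4], default=2)
    parser.add_argument("--no-labels", action="store_true")
    parser.add_argument("--api-key", default=None)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```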

Examples

Example 1: Person Approaching Flowers

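Generate a 5-second scene at 3 FPS (15 frames total). The prompt is in Russian and means "A person walks up to the flowers, picks one, and holds it up to their nose":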

python main.py \
  -i ../Untitled.jpeg \
  -p "Человек подходит к цветам, срывает 1 и подносит к носу" \
  -d 5 \
  -f 3 \
  -n flowers_scene

Output:

  • 15 frames (5 seconds × 3 FPS)
  • Files: output/flowers_scene/flowers_scene_frame_001.jpg through flowers_scene_frame_015.jpg
  • PDF: output/flowers_scene_storyboard.pdf

Example 2: Mouse Going to Sleep

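Generate a 40-second scene at 8 FPS (320 frames total). The prompt is in Russian and means "A little mouse is lying down and getting ready to sleep. She stretches and scratches her ear, then settles in more comfortably and falls asleep":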

python main.py \
  -i ../mice.jpeg \
  -p "Мышка лежит и готовится ко сну. Она потягивается и чешет ушко, затем укладывается поудобнее и засыпает." \
  -d 40 \
  -f 8 \
  -n mouse_sleep \
  -o pdf

Output:

  • 320 frames (40 seconds × 8 FPS)
  • PDF: output/mouse_sleep_storyboard.pdf

Example 3: Custom Output Options

Generate with specific output settings:

python main.py \
  -i sketch.jpg \
  -p "Action scene with character running" \
  -d 10 \
  -f 8 \
  -n action_scene \
  -o both \
  --frames-per-page 4 \
  --output-dir my_storyboards

Output Format

JPEG Sequence

When using JPEG output, frames are saved as:

output/
└── project_name/
    ├── project_name_frame_001.jpg
    ├── project_name_frame_002.jpg
    ├── project_name_frame_003.jpg
    └── ...
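The layout above can be reproduced with a small path-building helper. This is a sketch only: `frame_paths` is a hypothetical name, and the three-digit zero padding is inferred from the tree shown here rather than confirmed from the source code.

```python
from pathlib import Path


def frame_paths(project_name, total_frames, output_dir="output", pad=3):
    """Return the expected JPEG path for each frame, numbered from 1."""
    folder = Path(output_dir) / project_name
    return [
        folder / f"{project_name}_frame_{n:0{pad}d}.jpg"
        for n in range(1, total_frames + 1)
    ]


if __name__ == "__main__":
    for path in frame_paths("project_name", 3):
        print(path.as_posix())
```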

Each frame includes:

  • Frame number (e.g., "Frame 1 of 15")
  • Timestamp (e.g., "Time: 0.00s")
  • Scene description (truncated if long)
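To make the label contents concrete, the sketch below builds the three text lines for a frame. The helper name `build_label_lines`, the 60-character truncation limit, and the timestamp convention (frame 1 starts at 0.00s) are assumptions chosen to match the examples in the list above.

```python
def build_label_lines(frame_number, total_frames, fps, description, max_chars=60):
    """Return the label text lines for one frame (illustrative sketch)."""
    # Frame 1 is assumed to start at time zero.
    timestamp = (frame_number - 1) / fps
    if len(description) > max_chars:
        # Truncate long descriptions and mark the cut with an ellipsis.
        description = description[: max_chars - 3].rstrip() + "..."
    return [
        f"Frame {frame_number} of {total_frames}",
        f"Time: {timestamp:.2f}s",
        description,
    ]
```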

PDF Storyboard

The PDF output includes:

  • Professional title page with project name and frame count
  • Frames arranged in a grid (1, 2, or 4 per page)
  • Borders around each frame
  • Consistent formatting throughout
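The grid arrangement amounts to splitting the frame list into pages of 1, 2, or 4 items. A minimal sketch of that grouping step follows; `paginate` is a hypothetical helper, and the actual PDF layout code in `output_handler.py` is not shown in this README.

```python
def paginate(frames, frames_per_page=2):
    """Split a list of frames into pages of 1, 2, or 4 frames each."""
    if frames_per_page not in (1, 2, 4):
        raise ValueError("frames_per_page must be 1, 2, or 4")
    return [
        frames[start:start + frames_per_page]
        for start in range(0, len(frames), frames_per_page)
    ]
```

With 15 frames and 4 frames per page this yields four pages, the last holding the remaining three frames.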

Project Structure

storyboardgenerator/
├── src/
│   ├── storyboard_generator.py  # Core generation logic
│   ├── output_handler.py        # PDF and JPEG export
│   └── cli.py                   # Command-line interface
├── examples/                    # Example scripts
├── output/                      # Generated storyboards (created automatically)
├── main.py                      # Main entry point
├── requirements.txt             # Python dependencies
├── .env.example                 # Environment variable template
├── .gitignore                   # Git ignore rules
└── README.md                    # This file

Technical Details

Models Used

  • Text/Multimodal Processing: gemini-flash-latest - Fast and cost-effective for understanding scenes and generating descriptions
  • Image Generation: gemini-2.5-flash-image - State-of-the-art AI image generation with visual consistency

Frame Generation Process

  1. Input Analysis: The input image and prompt are analyzed by Gemini
  2. Frame Breakdown: AI generates detailed descriptions for each frame
  3. Image Generation: Each frame is generated using Gemini 2.5 Flash Image model, maintaining visual consistency with the reference image
  4. Fallback Protection: If image generation fails for a frame, the system automatically falls back to a brightness-variation method
  5. Labeling: Professional labels are added to each frame
  6. Export: Frames are compiled into PDF and/or JPEG sequence
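The generate-then-fall-back pattern in steps 3 and 4 can be sketched independently of any API. In the example below, `generate` and `fallback` are hypothetical callables standing in for the Gemini image call and the brightness-variation method; neither reflects the project's actual function signatures.

```python
def generate_frames(descriptions, generate, fallback):
    """Produce one frame per description, using the fallback when generation fails."""
    frames = []
    for index, description in enumerate(descriptions, start=1):
        try:
            frame = generate(description, index)
        except Exception:
            # Any failure in the primary generator triggers the fallback path.
            frame = fallback(description, index)
        frames.append(frame)
    return frames


if __name__ == "__main__":
    def flaky(description, index):
        # Stand-in generator that fails on even-numbered frames.
        if index % 2 == 0:
            raise RuntimeError("generation failed")
        return f"generated:{index}"

    print(generate_frames(["a", "b", "c"], flaky, lambda d, i: f"fallback:{i}"))
```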

Calculation

Total Frames = Duration (seconds) × FPS

Examples:

  • 5 seconds × 3 FPS = 15 frames
  • 40 seconds × 8 FPS = 320 frames
  • 10 seconds × 24 FPS = 240 frames
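The same arithmetic in code, as a standalone sketch. The names `total_frames` and `frame_timestamps` are illustrative, and rounding fractional results to the nearest whole frame is an assumption; the project's own `calculate_frames` method may handle this differently.

```python
def total_frames(duration, fps):
    """Return the number of frames for a scene: duration (seconds) x FPS."""
    if duration <= 0 or fps <= 0:
        raise ValueError("duration and fps must be positive")
    # Rounding is an assumption for non-integer products.
    return round(duration * fps)


def frame_timestamps(duration, fps):
    """Return the start time in seconds of each frame, beginning at 0."""
    return [i / fps for i in range(total_frames(duration, fps))]
```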

Troubleshooting

API Key Issues

If you get an API key error:

# Check if .env file exists and contains your key
cat .env

# Or set it directly in your shell
export GEMINI_API_KEY="your_actual_key"

Image Loading Errors

Ensure your input image:

  • Exists at the specified path
  • Is in a supported format (JPEG, PNG, etc.)
  • Is readable (check file permissions)
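These three checks can be run before calling the generator. The sketch below uses only the standard library; `check_input_image` is a hypothetical helper, and the extension set is an assumption, since the README only says "JPEG, PNG, etc.".

```python
import os
from pathlib import Path

# Assumed set of extensions; the tool may accept more formats.
SUPPORTED_SUFFIXES = {".jpg", ".jpeg", ".png"}


def check_input_image(path):
    """Return a list of human-readable problems; an empty list means the file looks usable."""
    p = Path(path)
    if not p.is_file():
        return [f"File not found: {p}"]
    problems = []
    if p.suffix.lower() not in SUPPORTED_SUFFIXES:
        problems.append(f"Unrecognized extension: {p.suffix or '(none)'}")
    if not os.access(p, os.R_OK):
        problems.append(f"File is not readable: {p}")
    return problems
```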

Memory Issues with Large Storyboards

For very large storyboards (e.g., 320+ frames):

  • Consider generating in batches
  • Use JPEG output instead of PDF
  • Reduce image resolution if needed
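One way to approach batching is to process frame numbers in fixed-size chunks and write each chunk to disk before starting the next. The generator below only computes the chunks; `frame_batches` and the batch size of 50 are illustrative assumptions, and the CLI itself does not document a batching option.

```python
def frame_batches(total_frames, batch_size):
    """Yield ranges of 1-based frame numbers, each at most batch_size long."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(1, total_frames + 1, batch_size):
        end = min(start + batch_size - 1, total_frames)
        yield range(start, end + 1)


if __name__ == "__main__":
    for batch in frame_batches(320, 50):
        print(f"frames {batch[0]}-{batch[-1]}")
```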

Advanced Usage

Using as a Python Module

from src.storyboard_generator import StoryboardGenerator
from src.output_handler import OutputHandler

# Initialize
generator = StoryboardGenerator(api_key="your_key")
output_handler = OutputHandler("output")

# Load image
image = generator.load_image("input.jpg")

# Generate frame descriptions
duration, fps = 5, 3
total_frames = generator.calculate_frames(duration=duration, fps=fps)
descriptions = generator.generate_frame_descriptions(image, "Scene description", total_frames)

# Generate and label frames
frames = []
for i in range(total_frames):
    timestamp = i / fps  # seconds from the start of the scene
    frame = generator.generate_frame_image(image, descriptions[i], i + 1, total_frames)
    labeled = generator.add_frame_labels(frame, i + 1, total_frames, timestamp, descriptions[i])
    frames.append(labeled)

# Export
output_handler.save_jpeg_sequence(frames, "my_project")
output_handler.save_pdf(frames, "my_project")

Contributing

Contributions are welcome! Please feel free to submit issues or pull requests.

License

This project is provided as-is for educational and commercial use.

Credits

Support

For issues, questions, or feature requests, please open an issue on the repository.


Happy Storyboarding! 🎬✨
