Skip to content

Latest commit

 

History

History

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
 
 
 
 
 
 
 
 

README.md

8mm Scene Detection and Clip Extraction

Detect scenes on low-resolution video and apply precise cuts to high-resolution originals. Optimized for both black & white and color 8mm film footage with automatic detection and adaptive processing.

Features

  • Dual-resolution workflow: Detect scenes on 480p, cut 4K originals
  • B&W optimization: Luma-only detection, higher thresholds, longer minimum scene length
  • Color support: Full RGB analysis for color footage
  • Auto-detection: Automatically determine if footage is B&W or color
  • Mixed archives: Process collections with both B&W and color reels
  • Adaptive thresholds: Automatically adjust sensitivity based on footage type
  • Contrast enhancement: Optional boost for low-contrast B&W footage
  • Fast, lossless cutting: Uses FFmpeg stream copy (no re-encoding)
  • Archival quality: Keep 4K originals while processing at lower resolution

Installation

# Install dependencies
pip install scenedetect[opencv] opencv-python numpy

# Make script executable (optional)
chmod +x scene_cut.py

Quick Start

Basic Examples

# Single B&W video
python scene_cut.py -i video_4k.mp4 -o clips_4k/ --bw

# Single color video
python scene_cut.py -i video_4k.mp4 -o clips_4k/ --color

# Directory with auto-detection
python scene_cut.py -d videos_4k/ -o clips_4k/ --auto-detect

Usage

Single Video Processing

B&W Footage

# Basic B&W processing (uses luma-only detection)
python scene_cut.py -i bw_video_4k.mp4 -o clips_4k/ --bw

# B&W with adaptive threshold
python scene_cut.py -i bw_video_4k.mp4 -o clips_4k/ --bw --adaptive

# B&W with contrast enhancement (for low-contrast footage)
python scene_cut.py -i bw_video_4k.mp4 -o clips_4k/ --bw --enhance-bw --adaptive

# B&W with custom threshold
python scene_cut.py -i bw_video_4k.mp4 -o clips_4k/ --bw --threshold 35

Color Footage

# Basic color processing
python scene_cut.py -i color_video_4k.mp4 -o clips_4k/ --color

# Color with custom threshold (more sensitive)
python scene_cut.py -i color_video_4k.mp4 -o clips_4k/ --color --threshold 25

# Color with adaptive threshold
python scene_cut.py -i color_video_4k.mp4 -o clips_4k/ --color --adaptive

With Pre-existing Low-Res Version

# Use existing 480p version for detection
python scene_cut.py -i video_4k.mp4 -l video_480p.mp4 -o clips_4k/ --bw

Directory Processing

B&W Archive

# Process directory of B&W videos
python scene_cut.py -d bw_videos_4k/ -o clips_4k/ --bw --adaptive

# B&W with enhancement for grainy/low-contrast footage
python scene_cut.py -d bw_videos_4k/ -o clips_4k/ \
    --bw \
    --adaptive \
    --enhance-bw \
    --threshold 30

Color Archive

# Process directory of color videos
python scene_cut.py -d color_videos_4k/ -o clips_4k/ --color --threshold 27

Mixed Archive (B&W + Color)

Option 1: Auto-detect (Recommended)

# Automatically detect B&W vs color for each video
python scene_cut.py -d videos_4k/ -o clips_4k/ \
    --auto-detect \
    --adaptive \
    --enhance-bw

Option 2: Use footage map

Create footage_map.json:

{
  "section_001.mp4": "bw",
  "section_002.mp4": "bw",
  "section_003.mp4": "color",
  "section_004.mp4": "bw",
  "section_005.mp4": "color"
}

Then run:

python scene_cut.py -d videos_4k/ -o clips_4k/ \
    --mixed \
    --footage-map footage_map.json \
    --adaptive

Create Low-Res Versions for Future Use

# Create 480p versions and extract clips from both resolutions
python scene_cut.py -d videos_4k/ \
    -ld videos_480p/ \
    -o clips_4k/ \
    -ol clips_480p/ \
    --create-lowres \
    --bw \
    --adaptive

# Later, re-cut with different threshold using existing 480p
python scene_cut.py -d videos_4k/ \
    -ld videos_480p/ \
    -o clips_4k_v2/ \
    --bw \
    --threshold 35

Testing and Dry Runs

# Test detection without cutting videos
python scene_cut.py -d videos_4k/ -o clips_4k/ \
    --bw \
    --threshold 30 \
    --dry-run \
    --save-timestamps

# This creates timestamps.json with scene information
# Review timestamps before committing to cuts

Command-Line Options

Input Options

-i, --input           Single high-resolution video file
-d, --directory       Directory containing high-resolution videos
-l, --lowres          Low-res video file (for -i) or directory (for -d)
-ld, --lowres-dir     Directory to save/find low-res versions
--create-lowres       Create and save low-res versions (requires -ld)
--lowres-height       Height for low-res videos (default: 480)

Output Options

-o, --output          Output directory for high-res clips (required)
-ol, --output-lowres  Output directory for low-res clips (optional)
--save-timestamps     Save timestamps as JSON files
--dry-run            Detect scenes but don't cut videos

Footage Type Options

--bw                 Black and white footage (luma-only detection)
--color              Color footage (default, analyzes all channels)
--mixed              Mixed B&W and color (requires --footage-map)
--auto-detect        Auto-detect B&W vs color for each video
--footage-map        JSON file mapping filenames to footage types

Detection Parameters

-t, --threshold      Scene detection threshold (15-40, default: 27)
--adaptive          Auto-adjust threshold based on footage type
--enhance-bw        Enhance contrast for B&W footage
-m, --min-duration  Minimum clip duration in seconds (default: 2.0)

Understanding Footage Types

Black & White (--bw)

When processing B&W footage, the following optimizations are automatically applied:

  1. Luma-only detection (luma_only=True)

    • Only analyzes brightness (Y) channel
    • Ignores color (U, V) channels
    • Prevents false cuts from color noise/artifacts
  2. Higher threshold (default: 32 vs 27 for color)

    • B&W has more subtle scene changes
    • Higher threshold prevents cuts on grain/flicker
  3. Longer minimum scene length (15 frames)

    • Avoids false cuts on film grain
    • Reduces flicker-induced false positives
  4. Adaptive threshold (--adaptive)

    • Multiplies threshold by 1.2× for B&W
    • Example: 32 → 38.4
  5. Optional enhancement (--enhance-bw)

    • Boosts contrast by 20%
    • Normalizes brightness histogram
    • Helps detection on low-contrast footage

Color (--color)

Standard scene detection for color footage:

  • Analyzes all RGB channels
  • Lower threshold (default: 27)
  • Detects both color and brightness changes

Auto-Detection (--auto-detect)

Automatically determines footage type by:

  1. Sampling 10 frames from each video
  2. Measuring average color saturation
  3. Classifying as B&W if saturation < 25
  4. Applying appropriate detection settings

Requires opencv-python for analysis.

Recommended Settings

Clean B&W 8mm Footage

python scene_cut.py -d videos/ -o clips/ \
    --bw \
    --threshold 32 \
    --adaptive

Grainy/Low-Contrast B&W Footage

python scene_cut.py -d videos/ -o clips/ \
    --bw \
    --threshold 30 \
    --enhance-bw \
    --adaptive

Color 8mm Footage

python scene_cut.py -d videos/ -o clips/ \
    --color \
    --threshold 27

Mixed Archive (Best for Most Cases)

python scene_cut.py -d videos/ -o clips/ \
    --auto-detect \
    --adaptive \
    --enhance-bw

Threshold Guidelines

For B&W Footage

  • 28-30: Very sensitive, more cuts (clean footage)
  • 32-35: Balanced (recommended for most B&W 8mm)
  • 35-40: Less sensitive (very grainy or noisy footage)

For Color Footage

  • 23-25: Very sensitive, more cuts
  • 27-30: Balanced (recommended for most color 8mm)
  • 30-35: Less sensitive (grainy footage)

Finding the Right Threshold

# Test different thresholds with dry-run
for threshold in 25 30 35; do
    python scene_cut.py -i test_video.mp4 -o test_clips/ \
        --bw \
        --threshold $threshold \
        --dry-run \
        --save-timestamps
    # Review timestamps_*.json files
done

Output Structure

output_directory/
├── video_001/
│   ├── video_001_clip_001_bw.mp4
│   ├── video_001_clip_002_bw.mp4
│   └── video_001_clip_003_bw.mp4
├── video_002/
│   ├── video_002_clip_001_color.mp4
│   └── video_002_clip_002_color.mp4
└── all_timestamps.json  (if --save-timestamps used)

Clip filenames include:

  • Source video stem
  • Clip number (padded to 3 digits)
  • Footage type suffix (_bw or _color)

Typical Workflow

1. Initial Processing

# Process 4K archive with auto-detection
python scene_cut.py -d /Volumes/8mm/4k_sections/ \
    -o ./clips_4k/ \
    --auto-detect \
    --adaptive \
    --enhance-bw \
    --save-timestamps

2. Create 480p Archive for Fast Processing

# Create 480p versions for future use
python scene_cut.py -d /Volumes/8mm/4k_sections/ \
    -ld ./sections_480p/ \
    -o ./clips_4k/ \
    -ol ./clips_480p/ \
    --create-lowres \
    --auto-detect \
    --adaptive

3. Re-cut with Different Settings (Fast)

# Use existing 480p for detection, cut 4K with new threshold
python scene_cut.py -d /Volumes/8mm/4k_sections/ \
    -ld ./sections_480p/ \
    -o ./clips_4k_v2/ \
    --bw \
    --threshold 35

4. For CLIP Embeddings Pipeline

# Create both 4K (archival) and 480p (for embeddings)
python scene_cut.py -d /Volumes/8mm/4k_sections/ \
    -ld ./sections_480p/ \
    -o ./clips_4k/ \
    -ol ./clips_480p/ \
    --create-lowres \
    --auto-detect \
    --adaptive

# Use clips_480p for CLIP embeddings
# Keep clips_4k for final export

Troubleshooting

Too Many Short Clips (Over-segmentation)

Problem: Getting hundreds of 1-2 second clips from grain/flicker

Solutions:

# Increase threshold
python scene_cut.py -i video.mp4 -o clips/ --bw --threshold 35

# Increase minimum duration
python scene_cut.py -i video.mp4 -o clips/ --bw --min-duration 3.0

# Use both
python scene_cut.py -i video.mp4 -o clips/ --bw --threshold 35 --min-duration 3.0

Missing Scene Changes (Under-segmentation)

Problem: Long clips that should be split

Solutions:

# Lower threshold
python scene_cut.py -i video.mp4 -o clips/ --bw --threshold 28

# For low-contrast B&W, use enhancement
python scene_cut.py -i video.mp4 -o clips/ --bw --threshold 30 --enhance-bw

Auto-Detection Not Working

Problem: opencv-python not installed or detection failing

Solution:

# Install opencv
pip install opencv-python

# Or use manual footage type
python scene_cut.py -i video.mp4 -o clips/ --bw  # or --color

FFmpeg Not Found

Problem: ffmpeg not in PATH

Solution:

# macOS
brew install ffmpeg

# Ubuntu/Debian
sudo apt-get install ffmpeg

# Windows
# Download from https://ffmpeg.org/download.html

Performance Notes

Processing Time (100 clips)

Stage Resolution Time
Scene detection 480p ~5-10 min
Scene detection 4K ~15-25 min
Cutting (stream copy) 4K ~2-3 min
Total (480p detection) - ~7-13 min
Total (4K detection) - ~17-28 min

Recommendation: Always detect on 480p for 3-4× speedup with identical results.

Disk Space

For 100 three-minute 4K sections:

  • 4K sections: ~50-60 GB
  • 480p sections: ~15-20 GB
  • 4K clips: ~50-60 GB (same as sections)
  • 480p clips: ~15-20 GB

Total with both: ~130-160 GB

Advanced Usage

Custom FFmpeg Parameters

Edit the cut_video_with_timestamps() function to add custom FFmpeg options:

cmd = [
    'ffmpeg',
    '-ss', str(ts['start']),
    '-i', str(video_path),
    '-t', str(ts['duration']),
    '-c:v', 'libx264',     # Re-encode instead of copy
    '-crf', '18',          # High quality
    '-preset', 'slow',     # Better compression
    str(output_file)
]

Integration with Other Tools

# Generate embeddings from 480p clips
python scene_cut.py -d videos_4k/ -ld sections_480p/ -o clips_4k/ -ol clips_480p/ --create-lowres
python generate_embeddings.py -d clips_480p/ -o embeddings.npy

# Cluster by embeddings
python cluster_videos.py -e embeddings.npy -d clips_4k/

Technical Details

Scene Detection Algorithm

Uses PySceneDetect's ContentDetector:

  • Compares consecutive frames using color/luma histograms
  • Threshold determines how different frames must be
  • B&W mode uses only luminance histogram (luma-only)
  • Color mode uses RGB histograms

Timestamp Transfer

  1. Detect scenes on 480p video → get timestamps
  2. Apply exact timestamps to 4K video using FFmpeg stream copy
  3. Result: Fast detection, lossless 4K cuts

B&W Enhancement Filter Chain

scale=-1:480,eq=contrast=1.2:brightness=0.05,normalize
  • scale: Resize to 480p
  • eq: Boost contrast 20%, lift shadows 5%
  • normalize: Spread histogram to full range

License

This script uses:

Support

For issues or questions:

  1. Check PySceneDetect docs: https://scenedetect.com/
  2. Verify FFmpeg installation: ffmpeg -version
  3. Test with dry-run to check detection quality

Changelog

v1.0.0

  • Initial release
  • B&W and color support
  • Auto-detection
  • Luma-only mode for B&W
  • Adaptive thresholds
  • Contrast enhancement