```bash
pip install uv
uv sync
uv run python lt1.py
```

Use with the official LTX-2 models and the full Gemma text encoder from the main LTX page. This repository is under active development and a lot of features are quite broken, but the basics should work well. If you need some help getting it going, I will try...
This repository is organized as a monorepo with three main packages:
- ltx-core - Core model implementation, inference stack, and utilities
- ltx-pipelines - High-level pipeline implementations for text-to-video, image-to-video, and other generation modes
- ltx-trainer - Training and fine-tuning tools for LoRA, full fine-tuning, and IC-LoRA
Each package includes its own README with comprehensive documentation:
- LTX-Core README - Core model implementation, inference stack, and utilities
- LTX-Pipelines README - High-level pipeline implementations and usage guides
- LTX-Trainer README - Training and fine-tuning documentation with detailed guides
Download from Lightricks/LTX-2 on HuggingFace:
| File | Description |
|---|---|
| ltx-2-19b-dev.safetensors | Main 19B dev checkpoint |
| ltx-2-19b-distilled.safetensors | Distilled model |
| ltx-2-19b-distilled-lora-384.safetensors | Distilled LoRA |
| ltx-2-spatial-upscaler-x2-1.0.safetensors | 2x spatial upscaler |
| Model | Link |
|---|---|
| gemma-3-12b-it-qat-q4_0-unquantized | google/gemma-3-12b-it-qat-q4_0-unquantized |
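If you'd rather script the downloads, here is a minimal sketch using huggingface_hub; it assumes the files sit at the root of each repo, and the `local_dir` targets match the directory layout shown below:

```python
# Sketch: fetch the LTX-2 checkpoints and the full Gemma text encoder.
# Assumption: the filenames above sit at the root of each HuggingFace repo.
from huggingface_hub import hf_hub_download, snapshot_download

# Individual LTX-2 checkpoints (add the optional files you need)
for filename in [
    "ltx-2-19b-dev.safetensors",
    "ltx-2-19b-distilled-lora-384.safetensors",
    "ltx-2-spatial-upscaler-x2-1.0.safetensors",
]:
    hf_hub_download(repo_id="Lightricks/LTX-2", filename=filename, local_dir="weights")

# Gemma is gated: accept the license on HuggingFace and log in
# (e.g. `huggingface-cli login`) before running this.
snapshot_download(
    repo_id="google/gemma-3-12b-it-qat-q4_0-unquantized",
    local_dir="gemma-3-12b-it-qat-q4_0-unquantized",
)
```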
I have created a repository with all the interpolation files in one place here: https://huggingface.co/maybleMyers/interpolate/tree/main
Download from GSean/GIMM-VFI on HuggingFace:
| File | Description |
|---|---|
| gimmvfi_r_arb.pt | GIMM-VFI-R (RAFT-based) |
| gimmvfi_r_arb_lpips.pt | GIMM-VFI-R-P (RAFT + Perceptual) |
| gimmvfi_f_arb.pt | GIMM-VFI-F (FlowFormer-based) |
| gimmvfi_f_arb_lpips.pt | GIMM-VFI-F-P (FlowFormer + Perceptual) |
| flowformer_sintel.pth | FlowFormer optical flow (also on Google Drive) |
| raft-things.pth | RAFT optical flow (also from princeton-vl/RAFT) |
| File | Link |
|---|---|
| bim_vfi.pth | Google Drive |
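Alternatively, the consolidated interpolation repo mentioned above can be mirrored straight into the expected folder. A sketch using huggingface_hub; it assumes maybleMyers/interpolate holds the files listed in these tables:

```python
# Sketch: mirror the consolidated interpolation repo into the folder the
# GUI expects. Assumption: the repo contains the checkpoints listed above.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="maybleMyers/interpolate",
    local_dir="GIMM-VFI/pretrained_ckpt",
)
```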
Place VFI checkpoints in `GIMM-VFI/pretrained_ckpt/`
| Model | Link |
|---|---|
| RealESRGAN_x2plus.pth | GitHub Release |
| RealESRGAN_x4plus.pth | GitHub Release |
| 003_realSR_BSRGAN_DFOWMFC_s64w8_SwinIR-L_x4_GAN.pth | SwinIR GitHub Release |
| basicvsr_plusplus_reds4.pth | OpenMMLab |
Place upscaler checkpoints in `GIMM-VFI/pretrained_ckpt/`
| Model | Link |
|---|---|
| ZoeDepth (Intel/zoedepth-nyu-kitti) | HuggingFace (auto-downloaded by transformers) |
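No manual download is needed for ZoeDepth; it lands in the HuggingFace cache the first time it is used. A minimal sketch with the transformers pipeline (the input filename is hypothetical):

```python
# Sketch: first use triggers the automatic download of Intel/zoedepth-nyu-kitti.
from transformers import pipeline

depth = pipeline("depth-estimation", model="Intel/zoedepth-nyu-kitti")
result = depth("frame0001.png")    # hypothetical input frame
result["depth"].save("depth.png")  # PIL image of the predicted depth map
```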
The default directory structure expected by lt1.py:
```
ltx/
├── weights/                                      # LTX-2 core models
│   ├── ltx-2-19b-dev.safetensors                 # Main checkpoint
│   ├── ltx-2-19b-distilled.safetensors           # Distilled model (optional)
│   ├── ltx-2-19b-distilled-lora-384.safetensors  # Distilled LoRA
│   └── ltx-2-spatial-upscaler-x2-1.0.safetensors # Spatial upscaler
│
├── gemma-3-12b-it-qat-q4_0-unquantized/          # Text encoder
│   ├── config.json
│   ├── model-00001-of-00005.safetensors
│   ├── model-00002-of-00005.safetensors
│   ├── model-00003-of-00005.safetensors
│   ├── model-00004-of-00005.safetensors
│   ├── model-00005-of-00005.safetensors
│   └── ...
│
├── GIMM-VFI/pretrained_ckpt/                     # Interpolation & upscaler models
│   ├── gimmvfi_r_arb.pt                          # GIMM-VFI-R
│   ├── gimmvfi_r_arb_lpips.pt                    # GIMM-VFI-R-P
│   ├── gimmvfi_f_arb.pt                          # GIMM-VFI-F
│   ├── gimmvfi_f_arb_lpips.pt                    # GIMM-VFI-F-P
│   ├── flowformer_sintel.pth                     # Required for FlowFormer variants
│   ├── raft-things.pth                           # Required for RAFT variants
│   ├── bim_vfi.pth                               # BiM-VFI model
│   ├── RealESRGAN_x2plus.pth                     # 2x upscaler
│   ├── RealESRGAN_x4plus.pth                     # 4x upscaler
│   ├── 003_realSR_BSRGAN_DFOWMFC_s64w8_SwinIR-L_x4_GAN.pth # SwinIR 4x
│   └── basicvsr_plusplus_reds4.pth               # BasicVSR++ video upscaler
│
├── lora/                                         # Custom LoRAs (optional)
│   └── your-lora.safetensors
│
└── outputs/                                      # Generated videos
```
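Before launching, you can sanity-check the layout with a short script. This is a sketch, not part of the repo; extend `required` with whichever optional files you actually use:

```python
# Sketch: verify the expected file layout before launching lt1.py.
from pathlib import Path

root = Path(".")  # repository root
required = [
    "weights/ltx-2-19b-dev.safetensors",
    "weights/ltx-2-19b-distilled-lora-384.safetensors",
    "weights/ltx-2-spatial-upscaler-x2-1.0.safetensors",
    "gemma-3-12b-it-qat-q4_0-unquantized/config.json",
    "GIMM-VFI/pretrained_ckpt/gimmvfi_r_arb.pt",
]
missing = [p for p in required if not (root / p).exists()]
print("All required files found." if not missing else f"Missing: {missing}")
```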
All model paths in the GUI can be customized. Use the Save Defaults button to persist your settings:
| Tab | Button | Settings saved |
|---|---|---|
| Generation | Save Defaults | LTX checkpoint, Gemma path, spatial upscaler, VAE, distilled LoRA, LoRA folder, all generation parameters |
Settings are saved to `ui_configs/` as JSON files and automatically loaded on startup.
Note: The Post-Processing tab (interpolation/upscaling) does not have a Save Defaults button. These models use hardcoded paths in `GIMM-VFI/pretrained_ckpt/`. You can override paths per-session using the "Custom Model Path" fields, but they won't persist.
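To see exactly what was persisted, the JSON files can be inspected directly. A minimal sketch; the filenames and keys are whatever the GUI writes:

```python
# Sketch: pretty-print every saved defaults file in ui_configs/.
import json
from pathlib import Path

for cfg in Path("ui_configs").glob("*.json"):
    print(f"--- {cfg.name} ---")
    print(json.dumps(json.loads(cfg.read_text()), indent=2))
```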
CUDA out of memory
- Enable CPU Offloading in Model Settings
- Enable Block Swap for DiT and Text Encoder to reduce VRAM usage
- Reduce DiT Blocks in GPU (try 10-15 for 24GB VRAM)
- Reduce Text Encoder Blocks in GPU (try 4-6)
- Lower resolution or frame count
- Use FP8 quantized checkpoints (`ltx-2-19b-dev-fp8.safetensors`)
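Before tuning block swap, it helps to know how much VRAM is actually free. A minimal check via PyTorch:

```python
# Sketch: report free vs. total VRAM so you can size block-swap settings.
import torch

free, total = torch.cuda.mem_get_info()  # bytes on the current device
print(f"free: {free / 2**30:.1f} GiB / total: {total / 2**30:.1f} GiB")
```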
CUDA version mismatch / not detected
- Ensure CUDA >= 12.7 is installed
- Check that the PyTorch CUDA version matches the system: `python -c "import torch; print(torch.cuda.is_available(), torch.version.cuda)"`
- Reinstall PyTorch with the correct CUDA version from pytorch.org
FileNotFoundError: Checkpoint not found
- Verify model paths in the GUI match actual file locations
- Check that models are downloaded completely (not corrupted/partial)
- Use absolute paths if relative paths fail
Error loading Gemma text encoder
- Ensure you have the full unquantized Gemma model, not GGUF format
- Accept the license on HuggingFace before downloading
- Check that the `gemma-3-12b-it-qat-q4_0-unquantized` folder contains `config.json` and the model files
GIMM-VFI / BiM-VFI model errors
- Ensure all checkpoints are in `GIMM-VFI/pretrained_ckpt/`
- For FlowFormer variants, verify `flowformer_sintel.pth` is present
- For RAFT variants, verify `raft-things.pth` is present
Black or corrupted output video
- Check that input image/video dimensions are divisible by 32
- Frame count must be a multiple of 8 plus 1 (e.g., 9, 17, 25, 33...); see the helper below
- Try reducing inference steps or changing the seed
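A small helper that snaps arbitrary values to valid dimensions and frame counts. This is a sketch of the arithmetic above, not part of the repo:

```python
# Sketch: snap inputs to the constraints above.
def snap_dim(x: int) -> int:
    """Round down to a multiple of 32 (minimum 32)."""
    return max(32, (x // 32) * 32)

def snap_frames(n: int) -> int:
    """Round down to a multiple of 8, plus 1 (minimum 9)."""
    return max(9, ((n - 1) // 8) * 8 + 1)

print(snap_dim(1300), snap_frames(100))  # -> 1280 97
```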
Very slow generation
- Enable block swap to trade speed for VRAM
- Disable CPU offloading if you have sufficient VRAM
- Use distilled model for faster inference (fewer steps needed)
Prompt enhancement not working
- Verify Gemma model path is correct
- Check "Enhance Prompt" is enabled
- Gemma requires significant VRAM; enable text encoder block swap
uv sync fails
- Update uv: `pip install -U uv`
- Clear the cache: `uv cache clean`
- Try with a fresh venv: `uv venv && uv sync`
Import errors / missing modules
- Run from the project root directory
- Ensure the virtual environment is activated, or run via `uv run python lt1.py`
- Check Python version >= 3.12
Gradio UI not loading
- Check for port conflicts (default 7860)
- Try specifying a different port in the launch options (see the sketch below)
- Disable any VPN/proxy that might block localhost
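If the default port is taken, Gradio accepts an explicit one at launch. This is a sketch of the standard Gradio API; whether lt1.py exposes it as a flag or needs a small edit is an assumption:

```python
# Sketch: standard Gradio way to bind a non-default port.
import gradio as gr

with gr.Blocks() as demo:
    gr.Markdown("port test")

demo.launch(server_port=7861)  # default is 7860
```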
Interpolation produces artifacts
- Try a different model variant (RAFT vs FlowFormer)
- Use perceptual variants (-P) for better quality
- Reduce interpolation multiplier for fast motion scenes
Upscaler produces blurry results
- SwinIR-L generally produces sharper results than RealESRGAN
- BasicVSR++ is optimized for video temporal consistency
- Check input video quality; upscalers can't recover lost detail
```bash
# Clear PyTorch cache
rm -rf ~/.cache/torch

# Clear HuggingFace cache (redownloads models)
rm -rf ~/.cache/huggingface

# Check GPU memory usage
nvidia-smi

# Monitor GPU during generation
watch -n 1 nvidia-smi
```