maybleMyers/ltx


To run the GUI:

pip install uv  
uv sync  
uv run python lt1.py

Use it with the official LTX-2 models and the full Gemma text encoder from the main LTX page. This repository is under active development and a lot of features are quite broken, but the basics should work well. If you need some help getting it going, I will try...

These are working settings to generate a 609-frame video on a 24 GB GPU.

LTX-2 Video Generator

This repository is organized as a monorepo with three main packages:

  • ltx-core - Core model implementation, inference stack, and utilities
  • ltx-pipelines - High-level pipeline implementations for text-to-video, image-to-video, and other generation modes
  • ltx-trainer - Training and fine-tuning tools for LoRA, full fine-tuning, and IC-LoRA

Each package has its own README and documentation. See the Documentation section below.

📚 Documentation

Each package includes comprehensive documentation.

Model Links

LTX-2 Core Models (Lightricks)

Download from Lightricks/LTX-2 on HuggingFace:

| File | Description |
| --- | --- |
| ltx-2-19b-dev.safetensors | Main 19B dev checkpoint |
| ltx-2-19b-distilled.safetensors | Distilled model |
| ltx-2-19b-distilled-lora-384.safetensors | Distilled LoRA |
| ltx-2-spatial-upscaler-x2-1.0.safetensors | 2x spatial upscaler |
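
One way to fetch these is the huggingface_hub CLI; a sketch, assuming it is installed (pip install -U "huggingface_hub[cli]") and that the filenames above match the current repo layout:

# Download the main checkpoint and the spatial upscaler into ./weights
huggingface-cli download Lightricks/LTX-2 ltx-2-19b-dev.safetensors --local-dir weights
huggingface-cli download Lightricks/LTX-2 ltx-2-spatial-upscaler-x2-1.0.safetensors --local-dir weights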

Text Encoder (Gemma)

| Model | Link |
| --- | --- |
| gemma-3-12b-it-qat-q4_0-unquantized | google/gemma-3-12b-it-qat-q4_0-unquantized |
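
Gemma is a gated model, so accept the license on HuggingFace first. A download sketch (again assuming the huggingface_hub CLI); the target folder matches the layout lt1.py expects:

# Log in once, then pull the full unquantized model into the expected folder
huggingface-cli login
huggingface-cli download google/gemma-3-12b-it-qat-q4_0-unquantized --local-dir gemma-3-12b-it-qat-q4_0-unquantized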

Video Frame Interpolation

I have created a repository with all the interpolation files in one place here: https://huggingface.co/maybleMyers/interpolate/tree/main
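
Since everything is collected in that one repo, a single download into the expected checkpoint folder should cover the interpolation models listed below (a sketch, assuming the huggingface_hub CLI):

# Pull the whole interpolation collection straight into GIMM-VFI/pretrained_ckpt/
huggingface-cli download maybleMyers/interpolate --local-dir GIMM-VFI/pretrained_ckpt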

GIMM-VFI

Download from GSean/GIMM-VFI on HuggingFace:

| File | Description |
| --- | --- |
| gimmvfi_r_arb.pt | GIMM-VFI-R (RAFT-based) |
| gimmvfi_r_arb_lpips.pt | GIMM-VFI-R-P (RAFT + perceptual) |
| gimmvfi_f_arb.pt | GIMM-VFI-F (FlowFormer-based) |
| gimmvfi_f_arb_lpips.pt | GIMM-VFI-F-P (FlowFormer + perceptual) |
| flowformer_sintel.pth | FlowFormer optical flow (also on Google Drive) |
| raft-things.pth | RAFT optical flow (also from princeton-vl/RAFT) |

BiM-VFI

| File | Link |
| --- | --- |
| bim_vfi.pth | Google Drive |

Place VFI checkpoints in GIMM-VFI/pretrained_ckpt/

Upscalers

| Model | Link |
| --- | --- |
| RealESRGAN_x2plus.pth | GitHub Release |
| RealESRGAN_x4plus.pth | GitHub Release |
| 003_realSR_BSRGAN_DFOWMFC_s64w8_SwinIR-L_x4_GAN.pth | SwinIR GitHub Release |
| basicvsr_plusplus_reds4.pth | OpenMMLab |

Place upscaler checkpoints in GIMM-VFI/pretrained_ckpt/

Depth Estimation

| Model | Link |
| --- | --- |
| ZoeDepth (Intel/zoedepth-nyu-kitti) | HuggingFace (auto-downloaded by transformers) |
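
For reference, the auto-download happens through the standard transformers depth-estimation pipeline. A minimal sketch, not part of the GUI (frame.png is a placeholder input):

# ZoeDepth is pulled from the HuggingFace Hub on first use of the pipeline
from transformers import pipeline
from PIL import Image

depth = pipeline("depth-estimation", model="Intel/zoedepth-nyu-kitti")
result = depth(Image.open("frame.png"))    # any RGB frame
result["depth"].save("frame_depth.png")    # PIL image of the predicted depth map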

Model Directory Structure

The default directory structure expected by lt1.py:

ltx/
├── weights/                                    # LTX-2 core models
│   ├── ltx-2-19b-dev.safetensors               # Main checkpoint
│   ├── ltx-2-19b-distilled.safetensors         # Distilled model (optional)
│   ├── ltx-2-19b-distilled-lora-384.safetensors   # Distilled LoRA
│   └── ltx-2-spatial-upscaler-x2-1.0.safetensors  # Spatial upscaler
│
├── gemma-3-12b-it-qat-q4_0-unquantized/        # Text encoder
│   ├── config.json
│   ├── model-00001-of-00005.safetensors
│   ├── model-00002-of-00005.safetensors
│   ├── model-00003-of-00005.safetensors
│   ├── model-00004-of-00005.safetensors
│   ├── model-00005-of-00005.safetensors
│   └── ...
│
├── GIMM-VFI/pretrained_ckpt/                   # Interpolation & upscaler models
│   ├── gimmvfi_r_arb.pt                        # GIMM-VFI-R
│   ├── gimmvfi_r_arb_lpips.pt                  # GIMM-VFI-R-P
│   ├── gimmvfi_f_arb.pt                        # GIMM-VFI-F
│   ├── gimmvfi_f_arb_lpips.pt                  # GIMM-VFI-F-P
│   ├── flowformer_sintel.pth                   # Required for FlowFormer variants
│   ├── raft-things.pth                         # Required for RAFT variants
│   ├── bim_vfi.pth                             # BiM-VFI model
│   ├── RealESRGAN_x2plus.pth                   # 2x upscaler
│   ├── RealESRGAN_x4plus.pth                   # 4x upscaler
│   ├── 003_realSR_BSRGAN_DFOWMFC_s64w8_SwinIR-L_x4_GAN.pth  # SwinIR 4x
│   └── basicvsr_plusplus_reds4.pth             # BasicVSR++ video upscaler
│
├── lora/                                       # Custom LoRAs (optional)
│   └── your-lora.safetensors
│
└── outputs/                                    # Generated videos
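
If you are setting up from scratch, the empty layout can be created up front and the checkpoints dropped in afterwards (folder names as above):

# Create the expected directory layout in the project root
mkdir -p weights gemma-3-12b-it-qat-q4_0-unquantized GIMM-VFI/pretrained_ckpt lora outputs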

Saving Custom Model Paths

All model paths in the GUI can be customized. Use the Save Defaults button to persist your settings:

| Tab | Button | What it saves |
| --- | --- | --- |
| Generation | Save Defaults | LTX checkpoint, Gemma path, spatial upscaler, VAE, distilled LoRA, LoRA folder, all generation parameters |

Settings are saved to ui_configs/ as JSON files and automatically loaded on startup.
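
Because they are plain JSON, the saved defaults can be inspected or edited by hand. A small sketch for listing them, assuming you run it from the project root (the key names inside each file are whatever the GUI writes, nothing here assumes a particular schema):

# Print every saved-defaults file and its contents
import json, pathlib

for cfg in pathlib.Path("ui_configs").glob("*.json"):
    print(cfg.name, json.loads(cfg.read_text()))   # each file is a plain dict of saved settings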

Note: The Post-Processing tab (interpolation/upscaling) does not have a Save Defaults button. These models use hardcoded paths in GIMM-VFI/pretrained_ckpt/. You can override paths per-session using the "Custom Model Path" fields, but they won't persist.

Troubleshooting

CUDA / GPU Issues

CUDA out of memory

  • Enable CPU Offloading in Model Settings
  • Enable Block Swap for DiT and Text Encoder to reduce VRAM usage
  • Reduce DiT Blocks in GPU (try 10-15 for 24GB VRAM)
  • Reduce Text Encoder Blocks in GPU (try 4-6)
  • Lower resolution or frame count
  • Use FP8 quantized checkpoints (ltx-2-19b-dev-fp8.safetensors)

CUDA version mismatch / not detected

  • Ensure CUDA >= 12.7 is installed
  • Check that the PyTorch CUDA version matches your system: python -c "import torch; print(torch.cuda.is_available(), torch.version.cuda)" (expanded below)
  • Reinstall PyTorch with correct CUDA version from pytorch.org
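
A slightly fuller version of the one-liner above, in case you also want free/total VRAM (plain PyTorch, nothing project-specific):

# Print the PyTorch build, the CUDA runtime it was compiled against, and VRAM on GPU 0
import torch

print("torch", torch.__version__, "| cuda build", torch.version.cuda)
if torch.cuda.is_available():
    free, total = torch.cuda.mem_get_info(0)   # bytes on GPU 0
    print(torch.cuda.get_device_name(0),
          f"- {free / 1024**3:.1f} GiB free of {total / 1024**3:.1f} GiB")
else:
    print("CUDA not detected - check the driver and the PyTorch build")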

Model Loading Errors

FileNotFoundError: Checkpoint not found

  • Verify model paths in the GUI match actual file locations
  • Check that models are downloaded completely (not corrupted/partial)
  • Use absolute paths if relative paths fail

Error loading Gemma text encoder

  • Ensure you have the full unquantized Gemma model, not GGUF format
  • Accept the license on HuggingFace before downloading
  • Check the gemma-3-12b-it-qat-q4_0-unquantized folder contains config.json and model files

GIMM-VFI / BiM-VFI model errors

  • Ensure all checkpoints are in GIMM-VFI/pretrained_ckpt/
  • For FlowFormer variants, verify flowformer_sintel.pth is present
  • For RAFT variants, verify raft-things.pth is present

Generation Issues

Black or corrupted output video

  • Check that input image/video dimensions are divisible by 32
  • Frame count must be a multiple of 8 plus 1 (e.g., 9, 17, 25, 33...) — a quick check for both constraints is sketched below
  • Try reducing inference steps or changing the seed
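
A quick way to sanity-check a planned resolution and frame count against those rules (plain Python, not part of the GUI):

# Returns True only if width/height are divisible by 32 and frames == 8*n + 1
def valid_settings(width, height, frames):
    ok_spatial = width % 32 == 0 and height % 32 == 0
    ok_frames = frames % 8 == 1
    return ok_spatial and ok_frames

print(valid_settings(1280, 704, 121))   # True
print(valid_settings(1280, 720, 120))   # False: 720 and 120 both break the rules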

Very slow generation

  • Enable block swap to trade speed for VRAM
  • Disable CPU offloading if you have sufficient VRAM
  • Use distilled model for faster inference (fewer steps needed)

Prompt enhancement not working

  • Verify Gemma model path is correct
  • Check "Enhance Prompt" is enabled
  • Gemma requires significant VRAM; enable text encoder block swap

Installation Issues

uv sync fails

  • Update uv: pip install -U uv
  • Clear cache: uv cache clean
  • Try with fresh venv: uv venv && uv sync

Import errors / missing modules

  • Run from the project root directory
  • Ensure virtual environment is activated: uv run python lt1.py
  • Check Python version >= 3.12

Gradio UI not loading

  • Check for port conflicts (default 7860)
  • Try specifying a different port in launch options (one approach is sketched below)
  • Disable any VPN/proxy that might block localhost
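
If another app is holding 7860, Gradio respects the GRADIO_SERVER_PORT environment variable, so (assuming lt1.py uses Gradio's default launch behavior) something like this should work:

# Launch the GUI on port 7861 instead of the default 7860
GRADIO_SERVER_PORT=7861 uv run python lt1.py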

Video Frame Interpolation Issues

Interpolation produces artifacts

  • Try a different model variant (RAFT vs FlowFormer)
  • Use perceptual variants (-P) for better quality
  • Reduce interpolation multiplier for fast motion scenes

Upscaler produces blurry results

  • SwinIR-L generally produces sharper results than RealESRGAN
  • BasicVSR++ is optimized for video temporal consistency
  • Check input video quality - upscalers can't recover lost detail

Common Fixes

# Clear PyTorch cache
rm -rf ~/.cache/torch

# Clear HuggingFace cache (redownloads models)
rm -rf ~/.cache/huggingface

# Check GPU memory usage
nvidia-smi

# Monitor GPU during generation
watch -n 1 nvidia-smi
