Troubleshooting Guide

This guide covers common issues and solutions when training with LTX-Video-Trainer.

🔧 VRAM and Memory Issues

Memory management is crucial for successful training, especially with larger models like LTXV 13B.

LTXV 13B Memory Requirements

When training with the LTXV 13B model, you must enable gradient checkpointing:

optimization:
  enable_gradient_checkpointing: true  # Required for LTXV 13B

Note: Gradient checkpointing trades training speed for memory savings. It's essential for training LTXV 13B on consumer GPUs.
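If you script against the model directly in Python, the same setting can be applied through the standard 🤗 Diffusers API. A minimal sketch, assuming a recent diffusers release (the model id is illustrative, and this is not the trainer's exact code path):

# Hedged sketch: enabling gradient checkpointing on a Diffusers transformer
from diffusers import LTXVideoTransformer3DModel

transformer = LTXVideoTransformer3DModel.from_pretrained(
    "Lightricks/LTX-Video", subfolder="transformer"  # illustrative model id
)
# Recompute activations during the backward pass instead of storing them
transformer.enable_gradient_checkpointing()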

Memory Optimization Techniques

1. Enable 8-bit Text Encoder

Load the text encoder in 8-bit precision to save GPU memory during training:

quantization:
  load_text_encoder_in_8bit: true

This setting is also available in all data preparation scripts:

# Dataset preprocessing with 8-bit text encoder
python scripts/preprocess_dataset.py dataset.json \
    --resolution-buckets "768x768x25" \
    --load_text_encoder_in_8bit

# Caption generation with 8-bit quantization
python scripts/caption_videos.py videos/ \
    --output dataset.json \
    --use-8bit
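Under the hood, 8-bit loading is typically done with bitsandbytes via 🤗 Transformers. A minimal sketch of the general pattern, assuming transformers and bitsandbytes are installed (the model id is illustrative):

# Hedged sketch: loading a T5-style text encoder in 8-bit precision
from transformers import BitsAndBytesConfig, T5EncoderModel

text_encoder = T5EncoderModel.from_pretrained(
    "Lightricks/LTX-Video",  # illustrative model id
    subfolder="text_encoder",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
)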

2. Reduce Batch Size

Lower the batch size if you encounter out-of-memory errors:

data:
  batch_size: 1  # Start with 1 and increase gradually

3. Use Lower Resolution

Reduce spatial or temporal dimensions to save memory:

# Smaller spatial resolution
python scripts/preprocess_dataset.py dataset.json \
    --resolution-buckets "512x512x49"

# Fewer frames
python scripts/preprocess_dataset.py dataset.json \
    --resolution-buckets "768x768x25"  # 25 frames instead of 49

4. Memory-Optimized Configuration

Use the low VRAM configuration as a starting point:

# Based on configs/ltxv_2b_lora_low_vram.yaml
model_source: "LTXV_2B_0.9.6_DEV"

data:
  batch_size: 1

optimization:
  enable_gradient_checkpointing: true
  optimizer_type: "adamw8bit"  # 8-bit optimizer

quantization:
  load_text_encoder_in_8bit: true

Memory Usage Guidelines

Sequence Length Calculation:

sequence_length = (H/32) * (W/32) * ((F-1)/8 + 1)

Where:

  • H = Height, W = Width, F = Number of frames
  • 32 = VAE spatial downsampling factor
  • 8 = VAE temporal downsampling factor

Examples:

  • 768x768x25: sequence_length = 24 × 24 × 4 = 2,304
  • 768x448x89: sequence_length = 24 × 14 × 12 = 4,032
  • 512x512x49: sequence_length = 16 × 16 × 7 = 1,792
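The same calculation as a small Python helper (a direct transcription of the formula above):

def sequence_length(height: int, width: int, frames: int) -> int:
    # 32 = VAE spatial downsampling factor, 8 = VAE temporal downsampling factor
    return (height // 32) * (width // 32) * ((frames - 1) // 8 + 1)

assert sequence_length(768, 768, 25) == 2304
assert sequence_length(512, 512, 49) == 1792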

Memory Requirements by Model:

  • LTXV 2B: ~16-40GB VRAM (depending on resolution and batch size)
  • LTXV 13B: ~40GB+ VRAM (requires gradient checkpointing)

⚠️ Common Usage Issues

Issue: "No module named 'ltxv_trainer'" Error

Solution: Ensure you're in the correct environment and have installed dependencies:

# Reinstall if needed
uv sync

# Activate virtual environment
source .venv/bin/activate
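You can also confirm that the package resolves from the active interpreter:

# Quick import check from the active environment
import importlib.util

spec = importlib.util.find_spec("ltxv_trainer")
print("ltxv_trainer installed:", spec is not None)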

Issue: Slow Training Speed

Optimizations:

  1. Disable gradient checkpointing (if you have enough VRAM):

    optimization:
      enable_gradient_checkpointing: false
  2. Increase batch size (if memory allows):

    data:
      batch_size: 2  # Or higher
  3. Use compiled models (experimental):

    optimization:
      use_torch_compile: true
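Conceptually, this flag corresponds to wrapping the model with torch.compile. A minimal, self-contained sketch of the general PyTorch API (not the trainer's exact code path):

import torch

model = torch.nn.Linear(8, 8)    # stand-in for the transformer
compiled = torch.compile(model)  # JIT-compiles the forward pass on first call
out = compiled(torch.randn(1, 8))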

Issue: Poor Quality Validation Outputs

Solutions:

  1. Use Image-to-Video Validation Instead of Text-to-Video:

    • For more reliable validation, use image-to-video (first-frame conditioning) rather than text-to-video. This is supported via the images field in your validation config (see ValidationConfig in config.py):
      validation:
        prompts:
          - "a professional portrait video of a person with blurry bokeh background"
        images:
          - "/path/to/first_frame.png"  # One image per prompt
    • This approach provides a stronger conditioning signal and typically results in higher quality validation outputs.
  2. Note on Diffusers Inference Quality:

    • The default inference pipeline in 🤗 Diffusers is suboptimal for LTXV models: it does not include STG (Spatio-Temporal Guidance) or other inference-time tricks that improve video quality.
    • For best results, use validation videos to track training progress, but for actual quality testing, export your LoRA and test it in ComfyUI using the recommended workflow: 👉 ComfyUI-LTXVideo
  3. Other Tips:

    • Check caption quality: Review and, if needed, manually edit captions for accuracy.
    • Adjust LoRA rank: Try higher values for lora.rank (e.g., 32, 64, 128) for more capacity (see the PEFT sketch after this list):
      lora:
        rank: 64
    • Increase training steps: Train longer if needed:
      optimization:
        steps: 2000
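As a reference for the rank tip above: LoRA adapters are commonly constructed with 🤗 PEFT, where the rank maps to the r parameter. A minimal sketch, assuming peft is installed (the target module names are illustrative, not the trainer's exact list):

from peft import LoraConfig

lora_config = LoraConfig(
    r=64,           # LoRA rank: higher rank = more capacity, more VRAM
    lora_alpha=64,  # scaling factor, often set equal to the rank
    target_modules=["to_q", "to_k", "to_v", "to_out.0"],  # illustrative attention projections
)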

Issue: LoRA Checkpoint Fails to Load in ComfyUI

Cause: LoRA checkpoints trained with this trainer are saved in Diffusers format, but ComfyUI expects a different format with diffusion_model prefixes instead of transformer prefixes.

Solution: Convert your checkpoint from Diffusers to ComfyUI format using the conversion script:

# Convert from Diffusers to ComfyUI format
python scripts/convert_checkpoint.py your_lora.safetensors --to-comfy --output_path your_lora_comfy.safetensors

What this does:

  • Converts transformer prefixes to diffusion_model prefixes
  • Maintains safetensors format for security
  • Creates a new file with _comfy suffix (if no output path specified)

After conversion:

  • Load the converted .safetensors file in ComfyUI
  • The LoRA should now load without errors
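To sanity-check a converted file, you can inspect its key prefixes with the safetensors library (the file name is a placeholder):

from safetensors import safe_open

with safe_open("your_lora_comfy.safetensors", framework="pt") as f:
    keys = list(f.keys())

print(keys[:3])
print("all ComfyUI-style keys:", all(k.startswith("diffusion_model") for k in keys))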

For more details on checkpoint conversion, see the Utility Scripts Reference.


🔍 Debugging Tools

Monitor GPU Memory Usage

Track memory usage during training:

# Watch GPU memory in real-time
watch -n 1 nvidia-smi

# Log memory usage to file
nvidia-smi --query-gpu=memory.used,memory.total --format=csv --loop=5 > memory_log.csv
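From inside a training script, PyTorch's own counters report the same information:

import torch

torch.cuda.reset_peak_memory_stats()
# ... run one training step here ...
peak_gib = torch.cuda.max_memory_allocated() / 2**30
print(f"Peak GPU memory allocated: {peak_gib:.2f} GiB")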

Verify Preprocessed Data

Decode latents to visually inspect the preprocessed videos:

python scripts/decode_latents.py dataset/.precomputed/latents \
    --output-dir debug_output

Compare decoded videos with originals to ensure quality.


💡 Best Practices

Before Training

  • Test preprocessing with a small subset first
  • Verify all video files are accessible
  • Check available GPU memory
  • Review configuration against hardware capabilities

During Training

  • Monitor GPU memory usage
  • Check loss convergence regularly
  • Review validation samples periodically
  • Save checkpoints frequently

After Training

  • Test trained model with diverse prompts
  • Convert to ComfyUI format if needed
  • Document training parameters and results
  • Archive training data and configs

🆘 Getting Help

If you're still experiencing issues:

  1. Check logs: Review console output and log files for error details
  2. Search issues: Look through GitHub issues for similar problems
  3. Provide details: When reporting issues, include:
    • Hardware specifications (GPU model, VRAM)
    • Configuration file used
    • Complete error message
    • Steps to reproduce the issue

🤝 Join the Community

Have questions, want to share your results, or need real-time help? Join our community Discord server to connect with other users and the development team!

  • Get troubleshooting help
  • Share your training results and workflows
  • Stay up to date with announcements and updates

We look forward to seeing you there!