Musubi Tuner WAN 2.2 GUI

A graphical user interface for WAN 2.2 LoRA training, built on top of the kohya-ss/musubi-tuner project. This GUI simplifies the configuration, execution, and monitoring of LoRA training workflows for WAN 2.2 models.

Overview

This fork provides an intuitive visual interface for training LoRA adapters on WAN 2.2 models, supporting both high-noise and low-noise DiT architectures. The GUI handles complex parameter configuration and provides real-time training monitoring.

Note: This GUI is experimental and optimized for Windows systems. Other platforms may require adjustments.

Features

🎛️ Configuration Tabs

Model Paths & Dataset

  • Dataset configuration (TOML file) selection
  • High/Low Noise DiT model training toggles
  • VAE, CLIP, and T5 text encoder path configuration
  • Output directory and LoRA filename specification
  • Built-in file browsers for easy navigation

Training Parameters

  • Core settings: learning rate, epochs, save frequency, seed
  • LoRA configuration: network dimension (rank) and alpha
  • Optimizer selection (adamw8bit, prodigy, etc.)
  • Learning rate scheduler options

Advanced Settings

  • Memory optimizations: mixed precision, gradient checkpointing, data loader settings
  • WAN 2.2 specific: DiT model offloading and block swapping for VRAM efficiency
  • Flow matching: timestep sampling and distribution controls
  • Attention mechanisms: xFormers, Flash Attention, SDPA support
  • Logging integration: TensorBoard and Weights & Biases

Run & Monitor

  • Real-time console output and progress tracking
  • Live loss visualization (requires matplotlib)
  • VRAM usage monitoring (requires pynvml)
  • Training control: start, stop, and command preview

🔧 Additional Features

  • Settings Management: Save/load configurations, auto-save on exit
  • Input Validation: Required field checking with visual feedback
  • Command Generation: View and copy generated training commands
  • Resume Training: Checkpoint loading and advanced resumption options

Installation

Prerequisites

  • Python 3.10 or higher
  • NVIDIA GPU (recommended for optimal performance)
  • 12GB+ VRAM (24GB+ recommended for high-resolution training)
  • 32GB+ system RAM

Setup

  1. Clone the Repository

    git clone https://github.com/PGCRT/musubi-tuner_Wan2.2_GUI.git
    cd musubi-tuner_Wan2.2_GUI
  2. Create Virtual Environment

    python -m venv venv
    .\venv\Scripts\activate
    
  3. Install Dependencies

    # Install PyTorch with CUDA support
    pip install torch torchvision --index-url https://download.pytorch.org/whl/cu124
    
    # Install Musubi Tuner
    pip install -e .
    
    # Optional: Install GUI enhancement packages
    pip install matplotlib pynvml tensorboard
    
  4. Configure Weights & Biases (Optional)

    pip install wandb
    wandb login
    # Enter your API key when prompted
  5. Download Models

    • WAN 2.2 High Noise: https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/blob/main/split_files/diffusion_models/wan2.2_t2v_high_noise_14B_fp16.safetensors
    • WAN 2.2 Low Noise: https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/blob/main/split_files/diffusion_models/wan2.2_t2v_low_noise_14B_fp16.safetensors
    • T5 Text Encoder: https://huggingface.co/Kijai/WanVideo_comfy/blob/main/umt5-xxl-enc-bf16.safetensors
    • VAE: https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/blob/main/split_files/vae/wan_2.1_vae.safetensors

    Important: Use fp16 models only. Scaled models are not supported.

Usage

Launching the GUI

Windows: Double-click LAUNCH_GUI.bat

Other Platforms:

python musubi_tuner_gui.py

Basic Workflow

EDIT your .toml dataset file in the dataset folder. IMPORTANT: don't use backslashes ("\") when you paste your dataset path; use forward slashes ("/") instead (in Notepad++, press CTRL+H to replace all backslashes). A minimal example configuration is shown below.
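
As a rough sketch, a minimal dataset .toml could look like the following. The layout follows the general musubi-tuner dataset configuration format, but the paths and values here are illustrative placeholders, and field names or defaults may differ between versions, so check the dataset configuration documentation for the options your setup needs.

    # Example dataset .toml (illustrative values only; adjust to your data)
    [general]
    resolution = [960, 544]        # [width, height] used for training
    caption_extension = ".txt"     # caption files sit next to each image/video
    batch_size = 1
    enable_bucket = true

    [[datasets]]
    image_directory = "C:/training/my_dataset/images"   # forward slashes, even on Windows
    cache_directory = "C:/training/my_dataset/cache"
    num_repeats = 1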

Launch the ACCELERATE config (last tab, follow the instructions; this only needs to be done once).

  1. Configure Paths

    • Set your dataset TOML file path
    • Specify model paths (DiT, VAE, text encoders)
    • Choose output directory and LoRA filename
  2. Set Training Parameters

    • Configure learning rate, epochs, and network dimensions
    • Select optimizer and scheduler options
    • Adjust advanced settings as needed
  3. Enable Caching (First Run)

    • Check latents and/or text encoder caching options
    • Required for initial training or when dataset changes
  4. Start Training

    • Switch to "Run & Monitor" tab
    • Click "Start Training" to begin
    • Monitor progress via console, graph, and VRAM usage
  5. Save Configuration

    • Use "Save Settings" to save your configuration
    • Settings are also auto-saved when the GUI closes
  • IMPORTANT: Use caching on your first run or when adding new data to your dataset to speed up initialization.
  • Monitor VRAM usage to optimize memory settings with blocks_to_swap, batch size, and resolution.

Dual Model Training Methods

Here are the four distinct methods for training both the High Noise and Low Noise models.


1. Manual Sequential Training

This is a fully manual workflow where you run two separate training jobs yourself, one for each model.

  • How to Configure:

    1. Part 1: Check only Train High Noise Model and run the training to completion.
    2. Part 2: Return, uncheck Train High Noise Model, check only Train Low Noise Model, and start the second training.
  • Key Attributes:

    • VRAM Usage: Moderate. Very friendly to lower-VRAM systems.
    • Training Speed: Good. Each session runs at full speed.
    • Best Use Case: The highly recommended alternative to Method 4 for users with limited VRAM. It is predictable, efficient, and avoids that method's crippling speed loss.

2. Sequential (Two Separate Trainings)

This method automatically runs two training sessions back-to-back when you set different LoRA parameters for each model.

  • How to Configure:

    1. Go to the "Model Paths & Dataset" tab and check both Train High Noise Model and Train Low Noise Model.
    2. Go to the "Training Parameters" tab and enter different values for Network Dimension (Rank) or Network Alpha for the high noise model.
  • Key Attributes:

    • VRAM Usage: Moderate. Only one DiT model is loaded into VRAM at a time.
    • Training Speed: Good. Each session runs at full speed.
    • Best Use Case: Required when you intentionally want different LoRA ranks or alphas for each model while using the same dataset and learning rate, or when you want to automate training both models overnight.

3. Combined (Single Run - Very High VRAM Usage)

This is the fastest and most efficient method, running a single unified process that trains both models simultaneously by keeping them both in VRAM. Similar to AI Toolkit when both models are selected with LOW VRAM set to "OFF".

  • How to Configure:

    1. Go to the "Model Paths & Dataset" tab and check both Train High Noise Model and Train Low Noise Model.
    2. In the "Training Parameters" tab, leave the high noise model's Network Dimension and Network Alpha blank.
    3. In the "Advanced Settings" tab, ensure Offload Inactive DiT Model is UNCHECKED.
  • Key Attributes:

    • VRAM Usage: Very High. Requires enough VRAM to hold both DiT models (too much for consumer-grade GPUs).
    • Training Speed: Fastest. The most time-efficient method.
    • Best Use Case: The default and highly recommended method for users with sufficient VRAM.

4. Combined (Single Run - VRAM Saving Mode)

This method attempts a combined run on low-VRAM systems by constantly swapping the inactive model between VRAM and system RAM, causing a severe performance bottleneck. Similar to AI Toolkit when both models are selected with LOW VRAM set to "ON".

  • How to Configure:

    1. Go to the "Model Paths & Dataset" tab and check both Train High Noise Model and Train Low Noise Model.
    2. In the "Training Parameters" tab, leave the high noise model's Network Dimension and Network Alpha blank.
    3. In the "Advanced Settings" tab, CHECK the Offload Inactive DiT Model option.
  • Key Attributes:

    • VRAM Usage: Low. Drastically reduces VRAM consumption.
    • Training Speed: Slow. The performance cost is crippling.
    • Best Use Case:

      WARNING: This option has no real practical use case and should be avoided. The speed penalty is so significant that it is almost always slower than running two separate trainings (Method 1 or 2).


License

Licensed under the Apache License 2.0, consistent with the original Musubi Tuner project. See the LICENSE file for full details.

Acknowledgments

Built upon the excellent work of the kohya-ss/musubi-tuner project. Special thanks to the original contributors and the broader AI training community.


Need Help? Check the Issues page or refer to the original Musubi Tuner documentation for additional guidance.
