LoRA Trainer: LoRA training node in weight adapter scheme #8446

KohakuBlueleaf · 2025-06-06T21:35:40Z

As the title says, this PR modifies #7032, the LoRA Training Node created by @yoland68, to use the new weight adapter scheme.

The old PR is far behind master, and I had to sync the fork to use the weight adapter scheme, so I opened a new PR.

What this PR does:

Add "Train Module" for LoRA and add a training-related API to the weight adapter implementation
Add "Load LoRA Model", which uses the state_dict from the training node directly, so it runs the inference-specific implementation instead of the much slower training-specific one
Add gradient checkpointing in training, which makes training consume almost the same VRAM as inference
Unload all other modules before training to ensure there is enough VRAM
Use a tqdm progress bar in LoRA training
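
To make the split between a training path and an inference path concrete, here is a minimal pure-Python sketch of the standard LoRA update. It is not the code in this PR: the names `up`, `down`, `alpha`, `forward_training` and `forward_merged` are hypothetical, and reading "inference-specific" as "merge the low-rank delta into the weight once" is an assumption. The sketch only shows that both paths produce the same output.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python product of a (n x k) and b (k x m)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a: Matrix, b: Matrix) -> Matrix:
    """Element-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(a: Matrix, s: float) -> Matrix:
    """Multiply every entry of a by the scalar s."""
    return [[x * s for x in row] for row in a]


def forward_training(weight: Matrix, up: Matrix, down: Matrix,
                     alpha: float, x: Matrix) -> Matrix:
    """Adapter kept separate: y = W x + (alpha / rank) * up (down x).

    The base weight is untouched, so only `up` and `down` need gradients.
    """
    rank = len(down)  # down has shape (rank x in_features)
    low_rank = matmul(up, matmul(down, x))
    return add(matmul(weight, x), scale(low_rank, alpha / rank))


def forward_merged(weight: Matrix, up: Matrix, down: Matrix,
                   alpha: float, x: Matrix) -> Matrix:
    """Adapter folded into the weight: y = (W + (alpha / rank) * up down) x."""
    rank = len(down)
    merged = add(weight, scale(matmul(up, down), alpha / rank))
    return matmul(merged, x)


if __name__ == "__main__":
    W = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0]]   # 2 x 3 base weight
    down = [[0.5, -1.0, 0.25]]                # rank 1 x 3
    up = [[2.0], [-4.0]]                      # 2 x rank 1
    x = [[1.0], [2.0], [4.0]]                 # column vector input
    print(forward_training(W, up, down, 2.0, x))
    print(forward_merged(W, up, down, 2.0, x))
```

In the merged form the delta is computed once and every later forward pass is a single matrix product, which is one plausible reason an inference path would be faster than a path that keeps the adapter separate for gradients.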

Next steps:

Finish "Train Module" for the other algorithms, then add algorithm selection
Add objective selection. Currently the training node parameterizes the output to x0-pred and then computes the loss on x0/x0-pred directly, which may not be optimal for models trained with a different objective
Add better dataset implementation
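
As a rough illustration of the x0-pred point above, the sketch below converts two textbook model outputs (noise prediction and rectified-flow velocity) into an x0 estimate and takes an MSE loss against the clean sample. The function names and the scalar `alpha`, `sigma`, `t` conventions are assumptions for illustration, not the training node's actual code.

```python
from typing import List


def x0_from_eps(x_t: List[float], eps_pred: List[float],
                alpha: float, sigma: float) -> List[float]:
    """Assumes x_t = alpha * x0 + sigma * eps, so x0 = (x_t - sigma * eps) / alpha."""
    return [(xt - sigma * e) / alpha for xt, e in zip(x_t, eps_pred)]


def x0_from_velocity(x_t: List[float], v_pred: List[float], t: float) -> List[float]:
    """Assumes x_t = (1 - t) * x0 + t * noise and v = noise - x0, so x0 = x_t - t * v."""
    return [xt - t * v for xt, v in zip(x_t, v_pred)]


def mse(a: List[float], b: List[float]) -> float:
    """Mean squared error between two equally sized vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    x0 = [1.0, -2.0]
    eps = [0.5, 0.25]
    alpha, sigma = 0.5, 2.0
    x_t = [alpha * a + sigma * e for a, e in zip(x0, eps)]

    eps_pred = [0.75, 0.25]  # a slightly wrong noise prediction
    loss_x0 = mse(x0_from_eps(x_t, eps_pred, alpha, sigma), x0)
    loss_eps = mse(eps_pred, eps)

    # For a noise-prediction model the x0-space loss is the eps-space loss
    # rescaled by (sigma / alpha) ** 2, i.e. an implicit per-timestep weight.
    print(loss_x0, loss_eps * (sigma / alpha) ** 2)
```

The rescaling factor in the last comment is one way to see why a loss taken in x0 space might not suit every objective: it weights timesteps differently from a loss taken on the model's native output.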

example:

For more details: Comfy-Org/rfcs#26

* Allow disabling pe in flux code for some other models.
* Initial Hunyuan3Dv2 implementation. Supports the multiview, mini, turbo models and VAEs.
* Fix orientation of hunyuan 3d model.
* A few fixes for the hunyuan3d models.
* Update frontend to 1.13 (Comfy-Org#7331)
* Add backend primitive nodes (Comfy-Org#7328)
* Add backend primitive nodes
* Add control after generate to int primitive
* Nodes to convert images to YUV and back. Can be used to convert an image to black and white.
* Update frontend to 1.14 (Comfy-Org#7343)
* Native LotusD Implementation (Comfy-Org#7125)
* draft pass at a native comfy implementation of Lotus-D depth and normal est
* fix model_sampling kludges
* fix ruff
---------
Co-authored-by: comfyanonymous <[email protected]>
* Automatically set the right sampling type for lotus.
* support output normal and lineart once (Comfy-Org#7290)
* [nit] Format error strings (Comfy-Org#7345)
* ComfyUI version v0.3.27
* Fallback to pytorch attention if sage attention fails.
* Add model merging node for WAN 2.1
* Add Hunyuan3D to readme.
* Support more float8 types.
* Add CFGZeroStar node. Works on all models that use a negative prompt but is meant for rectified flow models.
* Support the WAN 2.1 fun control models. Use the new WanFunControlToVideo node.
* Add WanFunInpaintToVideo node for the Wan fun inpaint models.
* Update frontend to 1.14.6 (Comfy-Org#7416) Cherry-pick the fix: Comfy-Org/ComfyUI_frontend#3252
* Don't error if wan concat image has extra channels.
* ltxv: fix preprocessing exception when compression is 0. (Comfy-Org#7431)
* Remove useless code.
* Fix latent composite node not working when source has alpha.
* Fix alpha channel mismatch on destination in ImageCompositeMasked
* Add option to store TE in bf16 (Comfy-Org#7461)
* User missing (Comfy-Org#7439)
* Ensuring a 401 error is returned when user data is not found in multi-user context.
* Returning a 401 error when provided comfy-user does not exists on server side.
* Fix comment. This function does not support quads.
* MLU memory optimization (Comfy-Org#7470) Co-authored-by: huzhan <[email protected]>
* Fix alpha image issue in more nodes.
* Fix problem.
* Disable partial offloading of audio VAE.
* Add activations_shape info in UNet models (Comfy-Org#7482)
* Add activations_shape info in UNet models
* activations_shape should be a list
* Support 512 siglip model.
* Show a proper error to the user when a vision model file is invalid.
* Support the wan fun reward loras.
---------
Co-authored-by: comfyanonymous <[email protected]>
Co-authored-by: Chenlei Hu <[email protected]>
Co-authored-by: thot experiment <[email protected]>
Co-authored-by: comfyanonymous <[email protected]>
Co-authored-by: Terry Jia <[email protected]>
Co-authored-by: Michael Kupchick <[email protected]>
Co-authored-by: BVH <[email protected]>
Co-authored-by: Laurent Erignoux <[email protected]>
Co-authored-by: BiologicalExplosion <[email protected]>
Co-authored-by: huzhan <[email protected]>
Co-authored-by: Raphael Walker <[email protected]>

github-actions · 2025-06-11T17:14:30Z

(Automated Bot Message) CI Tests are running, you can view the results at https://ci.comfy.org/?branch=8446%2Fmerge

razrien · 2025-06-14T05:40:28Z

Well this certainly looks cool as heck

agarzon · 2025-07-30T18:33:40Z

@KohakuBlueleaf can you please share a workflow showing how to use this newly merged feature? Thanks.

yoland68 and others added 30 commits March 26, 2025 17:30

Feat: Add basic LoRA training support

225a196

For more details: Comfy-Org/rfcs#26

Fix ruff errors

2cd3c8a

Add remaining patch

f03ece1

Refactor import statements in nodes_train.py

bfc2f17

Reorganized and cleaned up import statements, removing unused imports and adding specific module imports for better clarity and organization.

Remove empty spaces

0edc48a

Move allow batch execution logic to different PR

b87f55e

Expand supported image file extensions in LoadImageSetNode

d58ad2d

Weight Adapter Scheme

6fb4cc0

Initial impl

4774c32

LoRA load/calculate_weight LoHa/LoKr/GLoRA load

Utilize new weight adapter in lora.py

c40686e

For calculate_weight I implemented a temporary fallback mechanism for development

lint

8431747

Merge branch 'comfyanonymous:master' into kbl-new-lora

c792fad

Fix import error

726fdfc

Fix typing syntax error

a220e5c

Use correct v list

ff05027

Remove unused import

889f947

Finalize the modularized weight adapter impl

e8f3bc5

* LoRA/LoHa/LoKr/GLoRA working well
* Removed TONS of code in lora.py

Add scheme of TrainBase class

14c2085

Basic train base impl of lora

bffbed8

linting

68c9e79

Merge branch 'yo-lora-trainer' into weight-adapter-train

aadc6c2

Utilize weight adapter scheme in basic training node

5098e94

Merge branch 'master' into weight-adapter-train

4616faa

Fix missed import in merging

dc74839

linting

c8bd95a

Merge branch 'master' into weight-adapter-train

3d3d14d

weight adapter fixes for training node

9c0cf36

Updates of training logic

5e43ec9

* use separate dtype for trainable weight
* force "training module only" before training
* disable gradient after training
* ensure same dtype after training

Add lora model loader for onfly usage

1870402

KohakuBlueleaf added 3 commits June 7, 2025 05:09

Add gradient checkpointing

c246a1d

Use tqdm for training loop

b3b36e5

check if need to disable pbar

1baa1bd

KohakuBlueleaf requested review from Kosinkadink, christian-byrne, comfyanonymous, ltdrdata, pythongosssss, robinjhuang, webfiltered and yoland68 as code owners June 6, 2025 21:35

KohakuBlueleaf added 3 commits June 10, 2025 22:10

Merge branch 'comfyanonymous:master' into weight-adapter-train

8cf3b53

Update lora.py

b8757c5

Update nodes_train.py

4dcd698

yoland68 added the Run-CI-Test and Core labels Jun 11, 2025

comfyanonymous reviewed Jun 11, 2025

comfy/ldm/cosmos/blocks.py (outdated, resolved)

comfyanonymous reviewed Jun 12, 2025

comfy_extras/nodes_train.py (outdated, resolved)

comfyanonymous reviewed Jun 12, 2025

comfy_extras/nodes_train.py (outdated, resolved)

KohakuBlueleaf added 4 commits June 13, 2025 11:34

Fix typo

23523f5

Use encoded latents as input

218c3e3

Correct dtype handling and better default arg

31c8cc9

Remove grad ckpt from model impl

bbcc65e

we apply grad ckpt in train node directly
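
For readers unfamiliar with the idea behind "grad ckpt", here is a toy pure-Python sketch of gradient checkpointing on a chain of scalar `tanh` layers. It is an illustration under simplifying assumptions, not the training node's code: all names are hypothetical, and the "peak" value is only a rough count of stored activations. The forward pass keeps just the input of each segment, and the backward pass recomputes the activations inside a segment right before backpropagating through it.

```python
import math
from typing import List, Tuple


def layer(w: float, x: float) -> float:
    """One toy layer: x -> tanh(w * x)."""
    return math.tanh(w * x)


def backward_segment(ws: List[float], acts: List[float],
                     grad_out: float) -> Tuple[List[float], float]:
    """Backprop through the layers `ws`, given their len(ws) + 1 activations.

    Returns the weight gradients and the gradient w.r.t. the segment input.
    """
    grads = [0.0] * len(ws)
    g = grad_out
    for i in reversed(range(len(ws))):
        d = 1.0 - acts[i + 1] ** 2      # derivative of tanh at this layer
        grads[i] = g * d * acts[i]      # d(output) / d(w_i)
        g = g * d * ws[i]               # pass the gradient to the layer input
    return grads, g


def grads_full(ws: List[float], x: float) -> Tuple[List[float], int]:
    """Ordinary backprop: every activation is stored during the forward pass."""
    acts = [x]
    for w in ws:
        acts.append(layer(w, acts[-1]))
    grads, _ = backward_segment(ws, acts, 1.0)
    return grads, len(acts)


def grads_checkpointed(ws: List[float], x: float,
                       segment: int) -> Tuple[List[float], int]:
    """Checkpointed backprop: store segment inputs only, recompute the rest."""
    checkpoints = []
    h = x
    for start in range(0, len(ws), segment):
        checkpoints.append(h)           # keep only the input of each segment
        for w in ws[start:start + segment]:
            h = layer(w, h)

    grads = [0.0] * len(ws)
    g, peak = 1.0, 0
    for idx in reversed(range(len(checkpoints))):
        start = idx * segment
        seg_ws = ws[start:start + segment]
        acts = [checkpoints[idx]]       # recompute this segment's activations
        for w in seg_ws:
            acts.append(layer(w, acts[-1]))
        seg_grads, g = backward_segment(seg_ws, acts, g)
        grads[start:start + len(seg_ws)] = seg_grads
        peak = max(peak, len(checkpoints) + len(acts))  # rough storage count
    return grads, peak


if __name__ == "__main__":
    ws = [0.5 + 0.1 * i for i in range(16)]
    full, stored_full = grads_full(ws, 0.3)
    ckpt, stored_ckpt = grads_checkpointed(ws, 0.3, segment=4)
    print(max(abs(a - b) for a, b in zip(full, ckpt)), stored_full, stored_ckpt)
```

The gradients match the ordinary backward pass, while fewer activations are held at once in exchange for one extra forward computation per segment. In a PyTorch code base the equivalent building block is `torch.utils.checkpoint.checkpoint`; whether the train node uses exactly that is not stated here.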

comfyanonymous merged commit 520eb77 into Comfy-Org:master Jun 13, 2025
5 checks passed

comfyui-wiki mentioned this pull request Jun 15, 2025

LoRA Training node docs Comfy-Org/embedded-docs#31

Open

adlerfaulkner pushed a commit to LucaLabsInc/ComfyUI that referenced this pull request Oct 16, 2025

LoRA Trainer: LoRA training node in weight adapter scheme (Comfy-Org#…

5c3536b

…8446)
