
Conversation

@neph1
Contributor

@neph1 neph1 commented Apr 12, 2025

Still have some cleaning up to do, but this PR is back (I might migrate this and force-push, or it may eventually be squashed).
It does things and produced something. I will do a longer run tomorrow, but I have no idea how to run inference. Is there an example script somewhere I can modify? Found it.

video = video.permute(0, 2, 1, 3, 4).contiguous() # [B, F, C, H, W] -> [B, C, F, H, W]
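
For context, a minimal illustration of why this permute is needed: the dataloader yields frames-first batches, while the 3D VAE expects channels-first input (the shapes below are made up for the example):

import torch

video = torch.randn(1, 49, 3, 480, 720)             # [B, F, C, H, W] as loaded
video = video.permute(0, 2, 1, 3, 4).contiguous()   # [B, C, F, H, W] for the VAE
assert video.shape == (1, 3, 49, 480, 720)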


compute_posterior = False
Contributor Author


So far I've only made it work with compute_posterior=False.
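
For reference, a rough sketch of how the two compute_posterior paths typically differ, assuming a diffusers AutoencoderKL-style VAE (the function name and dict keys are illustrative, not the exact trainer code):

import torch
# Import path may differ across diffusers versions.
from diffusers.models.autoencoders.vae import DiagonalGaussianDistribution

def encode_video(vae, video, compute_posterior: bool):
    if compute_posterior:
        # Sample from the posterior right away.
        latents = vae.encode(video).latent_dist.sample()
        return {"latents": latents}
    # Otherwise keep the raw moments (mean and logvar concatenated along the
    # channel dim) and defer sampling to the training step.
    moments = vae.encode(video).latent_dist.parameters
    return {"latents": moments}

# Later, when compute_posterior is False, the training step would do roughly:
#   posterior = DiagonalGaussianDistribution(moments)
#   latents = posterior.sample()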

Contributor

@a-r-r-o-w a-r-r-o-w left a comment


Hey @neph1, thanks for the PR and testing with some runs! I know there's some duplication of code at the moment, but I plan to address that in the future with something else. For now, let's keep the duplication.

I'll also try to help with launching some training runs to verify the PR once the changes look similar to the Wan/CogView control implementations (#338).

@neph1
Contributor Author

neph1 commented Apr 13, 2025

Oh, I did follow the new implementation, I just kept it on the same branch. At least I believe it follows what is in main now.
Not sure it works, though. I ran 1000 steps of canny training on a Simpsons dataset I found. I think I managed to make the conversion script handle x_embedder, but the output is only noise. Sadly I can't run inference with plain diffusers due to VRAM.

Got a bit disheartened, but I'll get back to it.
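
For anyone following along: handling x_embedder in the conversion script usually comes down to the standard control-variant trick of widening the input projection so it can take the control latents concatenated on the channel dim, with the new weight slices zero-initialized. A minimal sketch, assuming the patch embedding is a Conv3d (module and attribute names are illustrative, not the exact HunyuanVideo code):

import torch
import torch.nn as nn

def expand_x_embedder(proj: nn.Conv3d, extra_in_channels: int) -> nn.Conv3d:
    # Build a wider projection and copy the pretrained weights into the
    # original channel slots; the added channels start at zero so the model
    # initially ignores the control input.
    new_proj = nn.Conv3d(
        proj.in_channels + extra_in_channels,
        proj.out_channels,
        kernel_size=proj.kernel_size,
        stride=proj.stride,
        bias=proj.bias is not None,
    )
    with torch.no_grad():
        new_proj.weight.zero_()
        new_proj.weight[:, : proj.in_channels] = proj.weight
        if proj.bias is not None:
            new_proj.bias.copy_(proj.bias)
    return new_proj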

@neph1 neph1 force-pushed the control-lora-trainer-hunyuan branch from 2a06b9c to 9be0e0c on April 13, 2025 at 17:36
latents = moments.to(dtype=dtype)

return {self.output_names[0]: latents}
latents_mean = torch.tensor(vae.latent_channels)
Contributor


@neph1 These changes seem incorrect to me and will cause worse generations. The previous implementation, which did not perform this normalization, was correct, I think.

Was this modified from Wan? If so, it's incorrect because they are different models and preprocess latents differently.
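
To spell out the distinction (config attribute names follow diffusers; this is an illustration of the point above, not code from the PR):

# Wan-style VAEs store per-channel statistics and normalize the encoded
# latents with them (the exact sign/scale convention should be checked
# against the Wan pipeline):
#   latents_mean = torch.tensor(vae.config.latents_mean).view(1, -1, 1, 1, 1)
#   latents_std = torch.tensor(vae.config.latents_std).view(1, -1, 1, 1, 1)
#   latents = (latents - latents_mean) / latents_std
#
# HunyuanVideo's VAE does not expose such statistics; its latents are only
# scaled by a single scalar:
#   latents = latents * vae.config.scaling_factor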

@a-r-r-o-w
Contributor

Also, I'm a bit more free now. I was working on a major upcoming feature for speeding up training and inference, and it's nearing completion. If you'd like me to take over the PR, make the relevant changes, and do a long run to validate correctness, please do let me know.

@neph1
Contributor Author

neph1 commented Apr 18, 2025

By all means, if you have the time. 👍

