
Conversation

@Beinsezii (Contributor) commented Aug 8, 2025

What does this PR do?

Saw some Flux.DEV LoRAs in our DB with keys like:

transformer.single_transformer_blocks.0.attn.to_k.alpha : Tensor @ torch.Size([])
transformer.single_transformer_blocks.0.attn.to_k.lora_down.weight : Tensor @ torch.Size([10, 3072])
transformer.single_transformer_blocks.0.attn.to_k.lora_up.weight : Tensor @ torch.Size([3072, 10])
transformer.single_transformer_blocks.0.attn.to_q.alpha : Tensor @ torch.Size([])
transformer.single_transformer_blocks.0.attn.to_q.lora_down.weight : Tensor @ torch.Size([8, 3072])
transformer.single_transformer_blocks.0.attn.to_q.lora_up.weight : Tensor @ torch.Size([3072, 8])
transformer.single_transformer_blocks.0.attn.to_v.alpha : Tensor @ torch.Size([])
transformer.single_transformer_blocks.0.attn.to_v.lora_down.weight : Tensor @ torch.Size([8, 3072])
transformer.single_transformer_blocks.0.attn.to_v.lora_up.weight : Tensor @ torch.Size([3072, 8])
transformer.single_transformer_blocks.0.norm.linear.alpha : Tensor @ torch.Size([])
transformer.single_transformer_blocks.0.norm.linear.lora_down.weight : Tensor @ torch.Size([9, 3072])
transformer.single_transformer_blocks.0.norm.linear.lora_up.weight : Tensor @ torch.Size([9216, 9])
transformer.single_transformer_blocks.0.proj_mlp.alpha : Tensor @ torch.Size([])
transformer.single_transformer_blocks.0.proj_mlp.lora_down.weight : Tensor @ torch.Size([9, 3072])
transformer.single_transformer_blocks.0.proj_mlp.lora_up.weight : Tensor @ torch.Size([12288, 9])
transformer.single_transformer_blocks.0.proj_out.alpha : Tensor @ torch.Size([])
transformer.single_transformer_blocks.0.proj_out.lora_down.weight : Tensor @ torch.Size([7, 15360])
transformer.single_transformer_blocks.0.proj_out.lora_up.weight : Tensor @ torch.Size([3072, 7])

So basically PEFT layers but with Kohya adapter names. This might be a mistake on the trainer's part, but after poking around in the converter for a bit I figured out it can be an easy one-line fix, so that's what I've done here. I don't have the civit.ai URLs at the moment, so I don't have a public link to the weights.
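For illustration, a minimal sketch of the renaming idea (this is not the actual diffusers converter code, and the helper name is made up): map the Kohya-style lora_down/lora_up/alpha keys onto the lora_A/lora_B names PEFT expects, keeping the scalar alphas aside.

def kohya_keys_to_peft(state_dict):
    # Sketch only: rename Kohya-style adapter keys to PEFT-style names and
    # collect the scalar alpha values separately so scaling can be derived later.
    converted, alphas = {}, {}
    for key, value in state_dict.items():
        if key.endswith(".alpha"):
            alphas[key[: -len(".alpha")]] = value
        elif key.endswith(".lora_down.weight"):
            converted[key.replace(".lora_down.weight", ".lora_A.weight")] = value
        elif key.endswith(".lora_up.weight"):
            converted[key.replace(".lora_up.weight", ".lora_B.weight")] = value
        else:
            converted[key] = value
    return converted, alphas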

The proj_out layers still fail, but so does every other PEFT LoRA with proj_out layers on current main, so I think that's an unrelated bug.

Before submitting

Who can review?

@sayakpaul

@Beinsezii (Contributor Author)

Bisect shows the proj_out problem was introduced in commit bc34fa8, so I can open an issue for that if need be.

@sayakpaul (Member)

Can you show an example state dict? The changes you're introducing might be backwards-breaking.

@Beinsezii (Contributor Author)

> The changes you're introducing might be backwards-breaking.

I assumed this would be impossible because lora_down and lora_up aren't read by PEFT anywhere. The diffusers loader mixin has a check for lora_down.weight that is hardcoded to use the SD1/XL UNet converter, which for Flux models results in an empty rank dict and later an index error because there are no UNet blocks:

# check with first key if is not in peft format
first_key = next(iter(state_dict.keys()))
if "lora_A" not in first_key:
    state_dict = convert_unet_state_dict_to_peft(state_dict)
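
To illustrate with one of the keys from the listing above (sketch only, variable names are mine):

first_key = "transformer.single_transformer_blocks.0.attn.to_k.lora_down.weight"
# "lora_A" is not in this key, so the branch above fires and the whole state dict is
# routed through the SD1/XL UNet converter even though a Flux LoRA has no UNet blocks.
needs_conversion = "lora_A" not in first_key  # True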

> Can you show an example state dict?

https://huggingface.co/Beinsezii/peft_kohya_lora/blob/main/pytorch_lora_weights.safetensors
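
For reference, a listing like the one in the PR description can be produced with safetensors (a sketch; the path is wherever the file above is saved locally):

from safetensors import safe_open

# Print every key with its tensor shape, matching the format of the listing above.
with safe_open("pytorch_lora_weights.safetensors", framework="pt") as f:
    for key in sorted(f.keys()):
        print(f"{key} : Tensor @ {f.get_tensor(key).shape}")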

@sayakpaul (Member)

I understand now. Thanks!

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@sayakpaul sayakpaul merged commit 3c0531b into huggingface:main Aug 8, 2025
11 checks passed
@Beinsezii (Contributor Author)

Nice, looks like a8e4797 fixed the proj_out issue too.

@sayakpaul (Member)

Yes, hopefully, we will not run into those nasty issues for a while :)

@Beinsezii Beinsezii deleted the beinsezii/flux_lora_peft_layers_kohya_names branch August 8, 2025 19:57