
Conversation


@kashif kashif commented May 18, 2025

What does this PR do?

This PR adds an implementation of Latent Perceptual Loss (LPL) for training Stable Diffusion XL models, based on the paper "Boosting Latent Diffusion with Perceptual Objectives" (Berrada et al., 2025). LPL is a perceptual loss computed in the latent space of the VAE; it improves the quality and consistency of generated images by bridging the disconnect between the diffusion model and the autoencoder decoder.
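As a rough illustration of the idea (not the code in this PR), LPL runs both the model's predicted latents and the reference latents through the early decoder layers and penalizes differences between the intermediate features. The toy decoder layers and per-layer normalization below are assumptions made for the sketch:

```python
import numpy as np

def lpl_loss(pred_latents, target_latents, decoder_layers):
    """Toy latent perceptual loss (sketch): push both latents through the
    same decoder layers and average the normalized feature differences."""
    loss = 0.0
    f_pred, f_tgt = pred_latents, target_latents
    for layer in decoder_layers:
        f_pred, f_tgt = layer(f_pred), layer(f_tgt)
        # Normalize features before comparing, as perceptual losses
        # typically do, so no single layer's scale dominates the sum.
        n_pred = f_pred / (np.linalg.norm(f_pred, axis=-1, keepdims=True) + 1e-8)
        n_tgt = f_tgt / (np.linalg.norm(f_tgt, axis=-1, keepdims=True) + 1e-8)
        loss += np.mean((n_pred - n_tgt) ** 2)
    return loss / len(decoder_layers)

# Stand-in "decoder": two random nonlinear layers (purely illustrative,
# a real VAE decoder's upsampling blocks would be used instead).
rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(8, 8)), rng.normal(size=(8, 8))
layers = [lambda x: np.tanh(x @ W1), lambda x: np.tanh(x @ W2)]

z = rng.normal(size=(2, 8))          # "target" latents
print(lpl_loss(z, z, layers))        # identical latents -> 0.0
print(lpl_loss(z, z + 0.5, layers))  # perturbed latents -> positive
```

The key point is that the gradient reaches the diffusion model through the decoder's feature maps, not just through a pixel- or latent-space MSE.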

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.


@sayakpaul sayakpaul left a comment

Thank you! Very clean stuff!

Should we

  1. Credit the first author for helping with a reference implementation?
  2. Apply LoRA on the UNet instead of full fine-tuning?
  3. Rename lpl_sdxl.py to train_sdxl_lpl.py?

Maybe also update the PR description with a visual example from a full fine-tuning run?

```
# ... other training arguments ...
```

### Key Parameters


Should also include a full representative training command. Currently we only show the LPL-specific flags; there is no complete command from which a full training run can be launched (dataset_name, batch_size, etc.).
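For reference, such a command might look like the sketch below. The script name, flag names, and dataset are illustrative assumptions, not the PR's confirmed interface:

```shell
# Hypothetical launch command; the LPL flags and dataset are
# placeholders, not the actual arguments defined in this PR.
accelerate launch train_sdxl_lpl.py \
  --pretrained_model_name_or_path="stabilityai/stable-diffusion-xl-base-1.0" \
  --dataset_name="lambdalabs/naruto-blip-captions" \
  --resolution=1024 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4 \
  --learning_rate=1e-5 \
  --max_train_steps=10000 \
  --use_lpl \
  --lpl_weight=1.0 \
  --output_dir="sdxl-lpl-finetune"
```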



3 participants