Conversation

@Koratahiu
Contributor

This draft implements the Conditional Embedding Perturbation (CEP) strategy proposed in the paper:
Slight Corruption in Pre-training Data Makes Better Diffusion Models (NeurIPS 2024 spotlight)

This method aims to improve the generation quality and diversity of diffusion models by mitigating the impact of "perfect" overfitting to training pairs. The paper demonstrates theoretically that standard training can cause the generated distribution to collapse to the empirical distribution of the training data.

CEP addresses this by introducing slight, dimension-scaled noise to the conditional embeddings (e.g., text encoder outputs) during training. Training against these perturbed conditions forces the model to learn a smoother conditional manifold, reducing the distance to the true data distribution and discouraging memorization.
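Concretely, the idea amounts to training the denoiser on a perturbed condition rather than the clean one. The notation below is a paraphrase of that idea for clarity, not the paper's exact formulation:

$$
\tilde{c} = c + \delta, \qquad
\mathcal{L}_{\text{CEP}} = \mathbb{E}_{x_0,\, \epsilon,\, t}\left[\left\| \epsilon - \epsilon_\theta(x_t, t, \tilde{c}) \right\|_2^2\right]
$$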

Implementation Details

  • Adds a perturbation term $\delta$ to the text embeddings before they are passed to the model.
  • The noise is sampled from a uniform distribution and scaled by the embedding dimension, so the corruption remains "slight" regardless of architecture (SD 1.5, SDXL, or Flux); see the sketch after this list.
  • All supported models are covered, and the option is exposed in the UI.
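
For reference, a minimal PyTorch sketch of the perturbation described above. The function name `apply_cep`, the exact noise range $U(-\gamma, \gamma)$, and the division by the last embedding dimension are assumptions based on the bullets; the actual implementation in this PR may differ.

```python
import torch

def apply_cep(text_embeds: torch.Tensor, gamma: float = 1.0) -> torch.Tensor:
    """Add slight, dimension-scaled uniform noise to conditional embeddings.

    Assumed form: delta ~ U(-gamma, gamma) / d, where d is the last embedding
    dimension. Applied during training only; inference uses clean embeddings.
    """
    d = text_embeds.shape[-1]
    # Uniform noise in [-gamma, gamma], divided by the embedding dimension so
    # the corruption stays "slight" for SD 1.5, SDXL, and Flux alike.
    delta = (torch.rand_like(text_embeds) * 2.0 - 1.0) * gamma / d
    return text_embeds + delta

# Illustrative training-loop usage (names are hypothetical):
# text_embeds = apply_cep(text_embeds, gamma=cep_gamma)
# noise_pred = unet(noisy_latents, timesteps, encoder_hidden_states=text_embeds)
```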

Usage

  • Enable Conditional Embedding Perturbation (CEP) (the option appears below timestep shifting)
  • Set CEP Gamma to 1

TODO

  • To be tested
