Questions about CFG, loss behavior, and data usage in Qwen-Image-Edit distillation #69

@ThreeSRR

Hi, thanks for open-sourcing Qwen-Image-Lightning. We are distilling Qwen-Image-Edit with DMD2, and we have a few quick questions about training details:

  1. CFG during training
    How do you set real_scale_guidance and fake_scale_guidance in your experiments?
    We use real_scale_guidance = 4 and fake_scale_guidance = 1, but late in training the generated images show color artifacts and oil-painting-like textures. Is this expected, or would you recommend different values or a scheduling strategy?

  2. Loss behavior
    In our experiments, the generator loss keeps decreasing, while the critic loss keeps increasing.
    Is this behavior normal in DMD2 distillation, or does it indicate instability or imbalance between the generator and critic?

  3. Training data for edit distillation
    When distilling the editing model Qwen-Image-Edit, do you include text-to-image data, or only edit-specific data?
    If T2I data is included, is there a recommended mixing ratio?
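
To make question 1 concrete, the sketch below shows how we understand the two scales to enter the distribution-matching gradient in a DMD2-style setup. It is a simplified illustration under our own assumptions, not code from this repository: `cfg_predict`, `dmd_gradient`, and the toy denoiser signature are hypothetical names, plain Python lists stand in for latent tensors, and the normalization follows the DMD formulation as we understand it.

```python
from typing import Callable, List, Optional

Vector = List[float]
# Hypothetical denoiser signature (an assumption for this sketch):
# (noisy latents, conditioning or None) -> predicted clean latents.
Denoiser = Callable[[Vector, Optional[str]], Vector]


def cfg_predict(model: Denoiser, x_t: Vector, cond: str, scale: float) -> Vector:
    """Classifier-free guidance: uncond + scale * (cond - uncond)."""
    pred_cond = model(x_t, cond)
    if scale == 1.0:
        # A scale of 1 reduces to the plain conditional prediction.
        return pred_cond
    pred_uncond = model(x_t, None)
    return [u + scale * (c - u) for c, u in zip(pred_cond, pred_uncond)]


def dmd_gradient(
    real_model: Denoiser,
    fake_model: Denoiser,
    x: Vector,
    x_t: Vector,
    cond: str,
    real_scale_guidance: float = 4.0,
    fake_scale_guidance: float = 1.0,
    eps: float = 1e-8,
) -> Vector:
    """Distribution-matching gradient for a generator sample x (noised to x_t)."""
    # Teacher ("real") prediction, guided with real_scale_guidance.
    pred_real = cfg_predict(real_model, x_t, cond, real_scale_guidance)
    # Critic ("fake") prediction, guided with fake_scale_guidance.
    pred_fake = cfg_predict(fake_model, x_t, cond, fake_scale_guidance)
    # Normalize by the mean absolute gap between the sample and the teacher prediction.
    norm = sum(abs(a - b) for a, b in zip(x, pred_real)) / len(x) + eps
    return [(f - r) / norm for f, r in zip(pred_fake, pred_real)]
```

In this reading, a larger `real_scale_guidance` moves the teacher target further from the unconditional prediction, which is why we suspect it may be related to the over-saturated colors we see.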

Thanks a lot for your help!
