Questions about CFG, loss behavior, and data usage in Qwen-Image-Edit distillation #69

Hi, thanks for open-sourcing **Qwen-Image-Lightning**. We are distilling **Qwen-Image-Edit** with DMD2, and we have a few quick questions about training details:

1. **CFG during training**  
   How do you set `real_scale_guidance` and `fake_scale_guidance` in your experiments?  
   We use `real_scale_guidance = 4` and `fake_scale_guidance = 1`, but in the later stages of training the generated images develop color artifacts and oil-painting-like textures. Is this expected, or do you recommend different values or a guidance schedule?

2. **Loss behavior**  
   In our experiments, the generator loss keeps decreasing, while the critic loss keeps increasing.  
   Is this behavior normal in DMD2 distillation, or does it indicate instability or imbalance between the generator and critic?

3. **Training data for edit distillation**  
   When distilling the editing model Qwen-Image-Edit, do you include **text-to-image** data, or only edit-specific data?  
   If T2I data is included, is there a recommended mixing ratio?
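
For reference on question 1, here is a minimal sketch of how the two scales would enter the distribution-matching term, assuming the standard classifier-free guidance formula. `apply_cfg` and `guided_gap` are hypothetical names for illustration, not functions from the Qwen-Image-Lightning code, and the sign and weighting of the final term depend on the prediction parameterization (noise, velocity, or x0), so please correct us if your setup differs.

```python
def apply_cfg(cond, uncond, scale):
    """Standard classifier-free guidance: uncond + scale * (cond - uncond).

    With scale == 1 this returns the conditional prediction unchanged.
    Works on floats, NumPy arrays, or torch tensors (arithmetic only).
    """
    return uncond + scale * (cond - uncond)


def guided_gap(real_cond, real_uncond, fake_cond, fake_uncond,
               real_scale_guidance=4.0, fake_scale_guidance=1.0):
    """Hypothetical helper: gap between the guided fake (critic) prediction
    and the guided real (teacher) prediction at the same noisy input.

    Assumption: this gap is what drives the generator update in DMD-style
    training; the exact sign and normalization are not taken from any repo.
    """
    real = apply_cfg(real_cond, real_uncond, real_scale_guidance)
    fake = apply_cfg(fake_cond, fake_uncond, fake_scale_guidance)
    return fake - real
```

With `real_scale_guidance = 4` the teacher prediction is extrapolated well past its conditional output, while `fake_scale_guidance = 1` leaves the critic prediction unguided, so the gap includes the full guidance offset. Our question is whether that offset is the likely source of the oversaturated colors late in training.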

Thanks a lot for your help!

