Diffusion model setting when trained on whole network #13

@FelixFeiyu

Description

Hi,

I tried to reproduce the work with a three-layer CNN on CIFAR-10, as described in the paper. I chose `autoencoder.Latent_AE_cnn_big` as the `ae_model` in `ae_ddpm.yaml`. The classification accuracy of the reconstructed CNN reaches a comparable level of 79%. However, once training of the diffusion network starts, the accuracy drops to 10%.

Are the diffusion model and the other settings I used correct?

```yaml
name: ae_ddpm
ae_model:
  _target_: core.module.modules.autoencoder.Latent_AE_cnn_big
  in_dim: 39882 # 2048

model:
  arch:
    _target_: core.module.wrapper.ema.EMA
    model:
      _target_: core.module.modules.od_unet.AE_CNN_bottleneck
      in_dim: 52

beta_schedule:
  start: 1e-4
  end: 2e-2
  schedule: linear
  n_timestep: 1000

model_mean_type: eps
model_var_type: fixedlarge
loss_type: mse

train:
  split_epoch: 30000
  optimizer:
    _target_: torch.optim.AdamW
    lr: 1e-3
    weight_decay: 2e-6

  ae_optimizer:
    _target_: torch.optim.AdamW
    lr: 1e-3
    weight_decay: 2e-6

  lr_scheduler:
```
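For reference, the `beta_schedule` values above (`schedule: linear`, `start: 1e-4`, `end: 2e-2`, `n_timestep: 1000`) are the standard DDPM settings. A minimal sketch of what such a linear schedule computes, assuming the usual DDPM formulation (this is an illustration of the config values, not this repository's exact code):

```python
import torch

def linear_beta_schedule(start: float = 1e-4, end: float = 2e-2, n_timestep: int = 1000):
    """Standard DDPM linear beta schedule matching the config above."""
    # Noise variances beta_t increase linearly from `start` to `end`.
    betas = torch.linspace(start, end, n_timestep)
    alphas = 1.0 - betas
    # Cumulative product \bar{alpha}_t controls how much signal survives at step t.
    alphas_cumprod = torch.cumprod(alphas, dim=0)
    return betas, alphas_cumprod

betas, alphas_cumprod = linear_beta_schedule()
# alphas_cumprod decays toward ~0 by t = n_timestep, so x_T is close to
# pure noise, which is what the eps-prediction objective (model_mean_type: eps)
# is trained against.
```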

Thank you
