Diffusion model setting when trained on whole network #13

@FelixFeiyu

Description

Hi,

I tried to reproduce the work with a three-layer CNN on CIFAR-10, as described in the paper. I chose `autoencoder.Latent_AE_cnn_big` as the `ae_model` in `ae_ddpm.yaml`. The classification accuracy of the reconstructed CNN reaches a comparable level of 79%. However, once training of the diffusion network starts, the accuracy drops to 10%.

Are the diffusion model and the other settings I used correct?

```yaml
name: ae_ddpm
ae_model:
  _target_: core.module.modules.autoencoder.Latent_AE_cnn_big
  in_dim: 39882 # 2048

model:
  arch:
    _target_: core.module.wrapper.ema.EMA
    model:
      _target_: core.module.modules.od_unet.AE_CNN_bottleneck
      in_dim: 52

beta_schedule:
  start: 1e-4
  end: 2e-2
  schedule: linear
  n_timestep: 1000

model_mean_type: eps
model_var_type: fixedlarge
loss_type: mse

train:
  split_epoch: 30000
  optimizer:
    _target_: torch.optim.AdamW
    lr: 1e-3
    weight_decay: 2e-6

  ae_optimizer:
    _target_: torch.optim.AdamW
    lr: 1e-3
    weight_decay: 2e-6

  lr_scheduler:
```
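For reference, the `beta_schedule` values above (`schedule: linear`, `start: 1e-4`, `end: 2e-2`, `n_timestep: 1000`) are the standard DDPM settings. A minimal sketch of what such a linear schedule computes, assuming the usual DDPM formulation (this is an illustration of the config values, not this repository's exact code):

```python
import torch

def linear_beta_schedule(start: float = 1e-4, end: float = 2e-2, n_timestep: int = 1000):
    """Standard DDPM linear beta schedule matching the config above."""
    # Noise variances beta_t increase linearly from `start` to `end`.
    betas = torch.linspace(start, end, n_timestep)
    alphas = 1.0 - betas
    # Cumulative product \bar{alpha}_t controls how much signal survives at step t.
    alphas_cumprod = torch.cumprod(alphas, dim=0)
    return betas, alphas_cumprod

betas, alphas_cumprod = linear_beta_schedule()
# alphas_cumprod decays toward ~0 by t = n_timestep, so x_T is close to
# pure noise, which is what the eps-prediction objective (model_mean_type: eps)
# is trained against.
```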

Thank you
