DreamBooth checkpoint saving issues #1068
Hopefully I'm not just missing something, but I haven't been able to identify the issue. When training with DreamBooth on Colab with checkpoint saving set to every 500 steps, more often than not I get the "Converting to CKPT ..." printout, but then, instead of saving the checkpoint, it prints "Killed" before "Done, resuming training..." and doesn't save anything that round. Here's a shot of my run that just finished up. I wasn't able to grab a new one before it cleared everything, but it ended up saving both the 3000-step checkpoint and the 3000-step final checkpoint. I truly appreciate you making this so easy and available, and thank you in advance for any help!
Replies: 1 comment
Free Colab doesn't have enough RAM for the intermediate CKPT conversions, which is why they get killed, but the final checkpoint will be saved successfully.
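If you want to see how much memory is left before a conversion kicks in, something like the sketch below might help. It is not part of the notebook: the function names are made up for illustration, it assumes a Linux runtime (which Colab is) where `/proc/meminfo` reports `MemAvailable`, and the threshold you pass in is your own guess, since the RAM the conversion actually needs isn't stated here.

```python
import re


def parse_mem_available_kib(meminfo_text: str) -> int:
    """Return the MemAvailable value (in KiB) from /proc/meminfo-style text."""
    match = re.search(r"^MemAvailable:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemAvailable not found in the given text")
    return int(match.group(1))


def enough_ram(meminfo_text: str, needed_gib: float) -> bool:
    """True if at least `needed_gib` GiB is reported as available.

    `needed_gib` is a hypothetical threshold chosen by the caller,
    not a measured requirement of the CKPT conversion.
    """
    available_gib = parse_mem_available_kib(meminfo_text) / (1024 ** 2)
    return available_gib >= needed_gib
```

In a Colab cell you could read the file with `open("/proc/meminfo").read()` and pass the text to `enough_ram` along with whatever threshold you want to try. If it keeps returning False partway through training, raising the save interval so fewer intermediate conversions run is one thing to try, since the final checkpoint still gets saved.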