Replies: 2 comments
-
It's fine. I got good results with 8 GB VRAM too. Training time is around 6 hours with 100+ pictures.
-
I also have an 8 GB card, and in a training session on Colab with a batch size of 5 (haven't tried higher, but it probably works too) I didn't notice much difference in visual accuracy. A higher batch size just means training is faster: the higher the batch size, the quicker the epochs. An epoch is completed when every image in the dataset has been trained on once, so if you have 10 images, a batch size of 1 takes 10 steps per epoch, while a batch size of 5 completes an epoch every 2 steps. Your iterations-per-second rate will drop because every iteration now does 5x more work, but overall it's faster if you look at the seconds-per-epoch metric (see the short sketch below).

I'm still experimenting, but in the end I don't think batch size matters that much; the learning rate is the real value you should play around with. Use the txt2img preview feature on the Training tab and set a fixed seed for all your previews (I like 5 for portraits). This ensures all your training previews are generated from the same base, and you'll see that tweaking the LR gives you very distinct results. 5e-3 is faster but iterates too quickly over smaller details; 5e-4 is slower (it takes more steps to change the output significantly), but the details come out more fine-tuned.

EDIT: These LR values are for Textual Inversion; for Hypernetworks, adjust accordingly.
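A minimal sketch of the arithmetic above (plain Python, not webui code). The per-step times are made-up placeholders just to illustrate why fewer-but-slower steps can still mean a shorter epoch:

```python
import math

def steps_per_epoch(num_images: int, batch_size: int) -> int:
    """One epoch = every image seen once, so steps = ceil(images / batch)."""
    return math.ceil(num_images / batch_size)

def seconds_per_epoch(num_images: int, batch_size: int, secs_per_step: float) -> float:
    """Each step processes `batch_size` images: fewer (but slower) steps per epoch."""
    return steps_per_epoch(num_images, batch_size) * secs_per_step

if __name__ == "__main__":
    images = 10
    # Hypothetical per-step times: a bigger batch makes each step slower,
    # but usually not 5x slower, so the epoch still finishes sooner.
    for batch, step_time in [(1, 1.0), (5, 3.5)]:
        print(f"batch={batch}: {steps_per_epoch(images, batch)} steps/epoch, "
              f"~{seconds_per_epoch(images, batch, step_time):.1f} s/epoch")
```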
-
With 8 GB VRAM I can only train embeddings/hypernetworks with a batch size of 1.
Can I expect good results?
Would the results be higher quality if I rented a GPU just for training?