

FurkanGozukara edited this page Oct 27, 2025 · 1 revision

How To Do Stable Diffusion LORA Training By Using Web UI On Different Models - Tested SD 1.5, SD 2.1



Our Discord: https://discord.gg/HbqgGaZVmr. This is the ultimate guide to LoRA training. If I have been of assistance to you and you would like to show your support for my work, please consider becoming a patron on 🥰 https://www.patreon.com/SECourses

Playlist of Stable Diffusion Tutorials, #Automatic1111 and Google Colab Guides, DreamBooth, Textual Inversion / Embedding, #LoRA, AI Upscaling, Pix2Pix, Img2Img:

https://www.youtube.com/playlist?list=PL_pbwdIyffsmclLl0O144nQRnezKlNdx3

Welcome to the ultimate beginner's guide to training with #StableDiffusion models using Automatic1111 Web UI. In this video, we will walk you through the entire process of setting up and training a Stable Diffusion model, from installing the LoRA extension to preparing your training set and tuning your training parameters. We'll also cover advanced training options and show you how to generate new images using your trained model. By the end of this video, you'll have a solid understanding of how to use Stable Diffusion to train your own custom models and generate high-quality images.

You should watch these two videos prior to this one if you don't have sufficient knowledge about Stable Diffusion or Automatic1111 Web UI:

1 - Easiest Way to Install & Run Stable Diffusion Web UI on PC by Using Open Source Automatic Installer - https://youtu.be/AZg6vzWHOTA

2 - How to Use SD 2.1 & Custom Models on Google Colab for Training with Dreambooth & Image Generation - https://youtu.be/AZg6vzWHOTA

00:00:00 Introduction speech

00:01:07 How to install the LoRA extension to the Stable Diffusion Web UI

00:02:36 Preparation of training set images by properly sized cropping

00:02:54 How to crop images using Paint .NET, an open-source image editing software

00:05:02 What is Low-Rank Adaptation (LoRA)

00:05:35 Starting preparation for training using the DreamBooth tab - LoRA

00:06:50 Explanation of all training parameters, settings, and options

00:08:27 How many training steps equal one epoch

00:09:09 Save checkpoints frequency

00:09:48 Save a preview of training images after certain steps or epochs

00:10:04 What is batch size in training settings

00:11:56 Where to set LoRA training in SD Web UI

00:13:45 Explanation of Concepts tab in training section of SD Web UI

00:14:00 How to set the path for training images

00:14:28 Classification Dataset Directory

00:15:22 Training prompt - how to set what to teach the model

00:15:55 What is Class and Sample Image Prompt in SD training

00:17:57 What are Image Generation settings and why we need classification image generation in SD training

00:19:40 Starting the training process

00:21:03 How and why to tune your Class Prompt (generating generic training images)

00:22:39 Why we generate regularization generic images by class prompt

00:23:27 Recap of the setting up process for training parameters, options, and settings

00:29:23 How much GPU, CPU, and RAM the class regularization image generation uses

00:29:57 Training process starts after class image generation completed

00:30:04 Displaying the generated class regularization images folder for SD 2.1

00:30:31 The speed of the training process - how many seconds per iteration on an RTX 3060 GPU

00:31:19 Where LoRA training checkpoints (weights) are saved

00:32:36 Where training preview images are saved and our first training preview image

00:33:10 When we will decide to stop training

00:34:09 How to resume training after training has crashed or you close it down

00:36:49 Lifetime vs. session training steps

00:37:54 After 30 epochs, resembling images start to appear in the preview folder

00:38:19 The command line printed messages are incorrect in some cases

00:39:05 Training step speed, a certain number of seconds per iteration (IT)

00:39:44 How I'm picking a checkpoint to generate a full model .ckpt file

00:40:23 How to generate a full model .ckpt file from a LoRA checkpoint .pt file

00:41:17 Generated/saved file name is incorrect, but it is generated from the correct selected .pt file

00:42:01 Doing inference (generating new images) using the text2img tab with our newly trained and generated model

00:42:47 The results of SD 2.1 Version 768 pixel model after training with the LoRA method and teaching a human face

00:44:38 Setting up the training parameters/options for SD version 1.5 this time

00:48:35 Re-generating class regularization images since SD 1.5 uses 512 pixel resolution

00:49:11 Displaying the generated class regularization images folder for SD 1.5

00:50:16 Training of Stable Diffusion 1.5 using the LoRA methodology and teaching a face has been completed and the results are displayed

00:51:09 The inference (text2img) results with SD 1.5 training

00:51:19 You have to do more inference with LoRA since it has less precision than DreamBooth

00:51:39 How to give more attention/emphasis to certain keywords in the SD Web UI

00:52:51 How to generate more than 100 images

00:54:46 How to check PNG info to see used prompts and settings

00:55:24 How to upscale using AI models

00:56:12 Fixing face image quality, especially eyes, with GFPGAN visibility

00:56:32 How to batch post-process

00:57:00 Where batch-generated images are saved

Video Transcription

  • 00:00:00 Greetings everyone. Welcome to the most  beginner-friendly guide for how to do training on Stable  

  • 00:00:06 Diffusion models by using Automatic1111 web UI.  In this tutorial I will train portrait images of  

  • 00:00:12 my brother by using Low-Rank Adaptation, also  known as the LoRA training method, on the Stable  

  • 00:00:18 Diffusion 2.1 768 pixels model. If you do not  have prior knowledge, please watch these two  

  • 00:00:25 videos on our channel. On our channel, go to  the playlist section and in here you see we  

  • 00:00:32 have Stable Diffusion DreamBooth playlist And in  here first watch easiest way to install and run  

  • 00:00:41 Stable Diffusion web UI on PC. So this will teach  you how to install web UI on PC and how to run it.  

  • 00:00:50 And then watch how to use Stable Diffusion version  2.1 and different models in the web UI. This will  

  • 00:00:57 teach you how to download and install different  models and use them with the web UI. After that,  

  • 00:01:03 you are ready to watch this tutorial and  follow me. To be able to train with LoRA,  

  • 00:01:10 you need to go to the extensions tab  here and install DreamBooth extension,  

  • 00:01:14 check for updates. And if you don't know how to  install from available, first go to available tab,  

  • 00:01:22 load from and in here search DreamBooth. Since  I am currently hiding installed extensions,  

  • 00:01:30 it is not showing. But when I disable it, you  see DreamBooth is already installed but it has  

  • 00:01:36 updates. So I am going to update it. OK, so  for updating, we just click apply and restart  

  • 00:01:42 UI and it updates. Now, when we check for updates,  you see we have the latest version. And, as I  

  • 00:01:48 said, for installation, go to available. And when  I check this installed, it shows the DreamBooth is  

  • 00:01:54 here and click install. OK, that's all. After  that, and after you restart your application,  

  • 00:02:00 you may need to do a full restart for DreamBooth  tab to appear. You will get this tab. OK, and once  

  • 00:02:08 you are in here and from the models, you have the  version 2.1. You are ready to follow me. You see,  

  • 00:02:18 current Stable Diffusion checkpoint is 2.1.  OK, first of all, before starting our training,  

  • 00:02:24 we need to prepare our images. Since I am going  to use 768 pixels version, I need to set my images  

  • 00:02:33 as 768 pixels. My images are inside this  folder. I haven't set their resolution yet. So  

  • 00:02:44 first I will show you how you can crop them with  an open source, a free software: Paint .NET. Let  

  • 00:02:51 me show you Paint .NET. OK, this is the Paint  .NET and you can download paint dot net from its  

  • 00:02:58 official website in here. It's an open source .NET  based software. Alternatively, you can use this  

  • 00:03:04 website, which is free to resize and crop your  images. But I prefer Paint .NET. I will show how  

  • 00:03:10 to crop one of the images. So I am going to drag  and drop this image into here. Click open. And  

  • 00:03:16 in here you see there is rectangle select, and in  here I click fixed ratio. I set it one hundred and  

  • 00:03:23 one hundred, like this. Then I am selecting the  image like this as I want. I click ctrl-C to copy,  

  • 00:03:30 I click ctrl-R for resize. I am typing something  smaller and clicking enter. Then I am clicking  

  • 00:03:38 ctrl-V expanding. Then I am clicking ctrl-R  again to resize. And I am exactly resizing as 768  

  • 00:03:48 pixels. Then I save it with one hundred percent  quality. You can also use PNG or JPG images.  

  • 00:03:58 Alternatively, you can use Birme dot net as well.  However, you may not trust this website. It is up  

  • 00:04:06 to you. So I will select two images from here,  upload them. And in here I am going to select  

  • 00:04:12 768 pixels like this: 768 pixels. OK, and you can  just set where you want cut to be like this. Then  

  • 00:04:23 you need to click save as zip. It will download a  zip like this. You can click it and you can then  

  • 00:04:31 extract them into your folder and overwrite the  existing files. And they will be exactly, let me  

  • 00:04:39 show you, 768 pixels. I will open one with  Paint .NET, and you see they are 768 pixels. So this is  

  • 00:04:48 the way you need to prepare your images. OK, all  images are cropped by 768 pixels and 768 pixels.  
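
The manual Paint .NET steps above (square-crop, then resize to exactly 768x768) can also be scripted. A minimal sketch using Pillow; the function name and usage paths are illustrative, not part of any tool mentioned in the video:

```python
# Square-crop the center of an image and resize it to 768x768, mirroring the
# crop-then-resize workflow shown with Paint .NET. Requires Pillow.
from PIL import Image

def center_crop_resize(img: Image.Image, size: int = 768) -> Image.Image:
    """Crop the largest centered square, then resize it to size x size."""
    side = min(img.width, img.height)
    left = (img.width - side) // 2
    top = (img.height - side) // 2
    square = img.crop((left, top, left + side, top + side))
    return square.resize((size, size), Image.LANCZOS)

# Hypothetical usage:
#   out = center_crop_resize(Image.open("brother_01.jpg"))
#   out.save("brother_01_768.jpg", quality=100)  # 100% quality, as in the video
```

Note this always takes the center square; for portraits where the face is off-center, manual cropping as shown in the video gives better control.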

  • 00:04:56 Now we are ready to do training. So go to our  Stable Diffusion web UI. So you may wonder what  

  • 00:05:05 is LoRA? LoRA is a low-rank adaptation for faster  text to image diffusion fine tuning. It uses both  

  • 00:05:12 UNET and CLIP. It is faster than DreamBooth. Also,  its checkpoints are much smaller than the full  

  • 00:05:18 checkpoint of DreamBooth. When you do a checkpoint  with DreamBooth, it generates full .ckpt file.  

  • 00:05:27 However, LoRA generates much smaller  files. And when you are done,  

  • 00:05:32 you can generate the full checkpoint file. To  do training we go to the DreamBooth tab here  

  • 00:05:38 and we first need to generate our model. I am  going to use my brother. I need to select the  

  • 00:05:45 source checkpoint. I have selected version  2.1 like this: I am using the EMA version  

  • 00:05:51 and I am not going to click this. This is not  necessary. OK, and just click Create button.  

  • 00:06:01 After you click Create button, you see it is  downloading the necessary files from the internet  

  • 00:06:08 like this. So you need to wait this download.  If you are not seeing anything on the web UI,  

  • 00:06:14 always check the running command line  window to see what is happening like this:  

  • 00:06:22 OK, the model has been generated. You see  checkpoint successfully extracted to Models  

  • 00:06:27 DreamBooth my brother working. We can also check  it from our installed folder. Let's go to C drive  

  • 00:06:33 and I have installed in StableDiffusion web UI,  in Models and in here in StableDiffusion. And now  

  • 00:06:42 no, not in StableDiffusion in DreamBooth folder,  and in here you see there is working directory and  

  • 00:06:47 there is my brother directory, as you can see.  OK, let's return back to our interface. In here  

  • 00:06:54 you see there is LoRA Weight. So this defines what  percentage of LoRA Weight should be applied to the  

  • 00:07:01 UNET when training or creating a checkpoint, and  it is same for Text Weight. Setting this as 1 may  

  • 00:07:09 cause overtraining or overtuning. However, since  we are going to generate our portrait images,  

  • 00:07:19 our own portrait images, and we are just teaching  one face, this is fine for now. You can pick this  

  • 00:07:26 half model. It will enable FP16 Precision, which  results in a smaller checkpoint with minimal loss  

  • 00:07:33 in quality. But we don't need this for LoRA,  since the checkpoints are already low size.  
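
The low-rank idea behind those tiny checkpoints can be sketched in a few lines. A NumPy illustration of how a LoRA update sits on top of a frozen weight; the dimensions and rank here are assumptions for the sketch, not the extension's defaults:

```python
import numpy as np

# Illustrative sizes: a 768-dim projection (as in SD 2.1's text dims) and LoRA rank 4.
d, r = 768, 4
rng = np.random.default_rng(0)

W = rng.normal(size=(d, d))          # frozen base weight (e.g. a UNet attention projection)
A = rng.normal(size=(r, d)) * 0.01   # trainable low-rank down-projection
B = np.zeros((d, r))                 # trainable up-projection, zero-initialized

lora_weight = 1.0                    # the "LoRA Weight" setting: fraction of the update applied
W_eff = W + lora_weight * (B @ A)    # effective weight at inference / when merging a checkpoint

full_params = W.size                 # what a full DreamBooth checkpoint stores for this layer
lora_params = A.size + B.size        # what the LoRA checkpoint stores instead
print(full_params, lora_params)      # 589824 vs 6144
```

Only `A` and `B` are trained and saved, which is why a LoRA checkpoint is megabytes rather than gigabytes, and why merging back into a full model is a simple addition.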

  • 00:07:39 And when you click this, checkpoints will be saved  to a subdirectory in the selected checkpoints  

  • 00:07:44 folder. So I am going to click training wizard  person. OK, and let's set up our parameters.  

  • 00:07:52 This is really important. So how many training  steps we want to do for each image? How many  

  • 00:07:59 images do I have? I have 16 images. Let me show once  again: total: 16. OK, since I am going to compare  

  • 00:08:08 checkpoints quality, I am going to set this very  high because I will early terminate the training  

  • 00:08:16 or I will decide whether I have trained enough or  not. OK, I am setting max training steps as zero,  

  • 00:08:23 pause after n epochs as zero, and the amount of time to pause  between epochs as zero. By the way, one epoch equals 16  

  • 00:08:31 steps because I have 16 images. OK, and I am not  going to set any pause between epochs. So  

  • 00:08:40 use lifetime steps, epochs when saving. Let's  say you have stopped or paused your training  

  • 00:08:47 and then continue it at a later time. So use  lifetime means that it will consider your previous  

  • 00:08:55 training epochs steps as well. However, if you  unclick this, it will use only this session of  

  • 00:09:02 training steps and epochs when saving. So I am  just unclicking it. This is really important.  
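
The step and epoch bookkeeping in this section reduces to simple arithmetic. A sketch using the video's numbers (16 training images, batch size 1, checkpoint every 10 epochs):

```python
# Steps-per-epoch and checkpoint-frequency arithmetic from the video's setup.
num_images = 16
batch_size = 1
steps_per_epoch = num_images // batch_size          # 16 steps = 1 epoch

save_every_epochs = 10
save_every_steps = steps_per_epoch * save_every_epochs
print(steps_per_epoch, save_every_steps)            # 16, 160
```

So "save every 10 epochs" and "save every 160 steps" are the same schedule for this dataset, matching what is said later in the walkthrough.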

  • 00:09:10 The save checkpoint frequency by n steps. OK,  since I didn't click this, it will check by n  

  • 00:09:18 steps. Since I have 16 images. If I set this 16,  it will save checkpoint after each epoch. OK,  

  • 00:09:30 if it is confusing for you, you can just click  this and you can set this 10. So it will save  

  • 00:09:35 checkpoints every 10 epochs. In this case, since  I have 16 images, it will be after 160 training  

  • 00:09:47 steps. OK, this is fine. I will also save a  preview of the image after each checkpoint  

  • 00:09:53 so that I can decide whether that checkpoint is  good or not. I will explain this in the video,  

  • 00:10:01 so don't worry about that, You will understand  it. Batch size: OK, how many images to process  

  • 00:10:07 at once per training step? We are going  to process one image per training step and we will  

  • 00:10:16 do same for classifier regularization images to  generate at once. If you have more than one GPU,  

  • 00:10:24 you can increase batch size to process them  in parallel I suppose. Learning rate and  

  • 00:10:31 other rates. I am not going to touch them, but  you can try to obtain better learning rates or  

  • 00:10:38 encoder rates. You can also scale the learning  rate, but I am just leaving them as default. OK,  

  • 00:10:47 image processing: This is important. Since I  am using 768 pixel version, I am setting it  

  • 00:10:55 as 768. However, if you use another version,  like version 1.5, in that case you need to  

  • 00:11:04 use 512 pixels. So this resolution depends on  your Stable Diffusion model, version and type.  
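
The resolution rule stated here can be written as a small lookup. The model labels below are informal names for this sketch, not official model identifiers:

```python
# Native training resolution by base model family, per the video's guidance:
# SD 1.4/1.5 and 512-based 2.x models train at 512, the 2.1 768 model at 768.
NATIVE_RESOLUTION = {
    "sd-1.4": 512,
    "sd-1.5": 512,
    "sd-2.x-base-512": 512,
    "sd-2.1-768": 768,
}

def training_resolution(model: str) -> int:
    """Return the image size to set under Image Processing for a base model."""
    return NATIVE_RESOLUTION[model]
```

A fine-tune of any custom checkpoint inherits the resolution of its base model, so the same lookup applies to derived models.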

  • 00:11:13 I am not going to do any cropping since I have  cropped. Apply horizontal flip. It means that  

  • 00:11:21 the images will be flipped as well, so it will  add more variation to your images. You can set  

  • 00:11:28 this. Do we have a pretrained VAE Name or Path? No,  we don't. We will use the base model VAE.  

  • 00:11:35 When you watch my previous videos, you will learn  what a VAE is and how to set one. OK, concept list:  

  • 00:11:42 I am not going to use any concept list as well,  since I am just going to train for teaching one  

  • 00:11:50 portrait image. And advanced tab: OK,  this is important. This is where we set  

  • 00:11:55 our training methodology. We are going to use  the LoRA methodology. So, use 8bit Adam:  

  • 00:12:03 enable this to save VRAM. If your graphics  card does not have much VRAM, or let's say you  

  • 00:12:10 have encountered not enough VRAM problem while  training, you can set this. I am going to set  

  • 00:12:16 this for now because I'm not sure how much VRAM  it is going to take. And you can also set FP 16.  

  • 00:12:24 So, mixed precision: you probably want this to be  FP16. If using Xformers, you definitely want this  

  • 00:12:32 to be FP16. And if you have watched my previous  videos, you know that we are already using  

  • 00:12:38 Xformers to speed up our inference and training.  So I am going to set this. Memory Attention:  

  • 00:12:45 I am going to set this to Xformers. My graphics card  is an RTX 3060 and it supports that. OK, Don't Cache  

  • 00:12:55 Latents: When I hover my mouse over that, you  see, a tooltip appears and explains to me  

  • 00:13:03 what that checkbox does. When this box is  checked, latents will not be cached. When latents  

  • 00:13:10 are not cached, you will save a bit of VRAM but  train slightly slower. So for lower VRAM usage,  

  • 00:13:17 I am checking this. Train Text Encoder. Enabling  this will provide better results and editability  

  • 00:13:24 but cost more VRAM. Yes, we are setting this.  I am not changing any of the default parameters,  

  • 00:13:33 and I am not changing these parameters either.  So we are done with the parameters tab. By the way,  

  • 00:13:41 I will make this an arbitrarily high number  because I will stop the training myself.  

  • 00:13:47 OK, concepts. This is really important. This is  the part where we are setting what to teach. OK,  

  • 00:13:56 maximum training steps is minus one. It will  never end. OK, data set directory. Path to  

  • 00:14:04 the directory with input images. So to get the  directory of my input images, I am right-clicking  

  • 00:14:11 one of the images and clicking Properties. And  in here you see it shows the location.  

  • 00:14:16 I am copying this like this. Alternatively,  you can also click the search bar here and  

  • 00:14:24 select entire path and copy. And I am pasting  it here. Classification data set directory. OK,  

  • 00:14:32 this is the path to a directory with classification  regularization images. So let's also set a path  

  • 00:14:39 for this to understand what it is. However,  you shouldn't set them inside here because in that  

  • 00:14:47 case it is using, I think, all of the images  in all of the folders. So let's say, brother  

  • 00:14:57 classification folder. OK, let's enter here.  Copy the path like this. File words: OK,  

  • 00:15:08 we are not going to use any file words in this  training because we are not training a hypernetwork or,  

  • 00:15:16 let me show you what it was, an embedding.  Therefore, you can just leave this empty. But  

  • 00:15:24 prompts: this is where we need to enter a unique  prompt to teach the face of or the other thing  

  • 00:15:33 that we want to teach to the model. So I will give  it a unique name as my brother. OK, like this,  

  • 00:15:43 you can give any unique name. It should be unique  enough that it won't be in the original training  

  • 00:15:51 data set. So to be sure, you can just also expand  it like this. Class Prompt. Now this is important.  

  • 00:15:57 What am I teaching to the model? I am teaching  face of a man. So I will say face of a man.  

  • 00:16:06 OK, like this. Classification image, negative  prompt. OK, you may give a negative prompt to  

  • 00:16:16 generate better quality classification images.  These images will be generated to improve your  

  • 00:16:23 training success. They will be automatically  generated. So let's enter a good negative  

  • 00:16:28 prompt here. OK, I have a decent negative prompt I  have prepared previously. I have used ChatGPT to  

  • 00:16:36 expand some of the famous negative prompts. I will  put this into the comment section of the video.  

  • 00:16:44 So, don't worry, you will be able to copy and  paste it. So I'm copying and pasting it in here. So  

  • 00:16:51 I will explain to you what this is as well. So  this class prompt and classification images, what  

  • 00:16:58 are they? Sample Image Prompt. This is important.  Why? Because during the training we want to see  

  • 00:17:05 how the training is going on. And for that we  will generate sample images. So the sample images  

  • 00:17:12 will be like this: First we will give the instance  prompt so that we will be able to see the face of  

  • 00:17:22 the person we are teaching. Then you can append  here some of the good keywords to obtain better  

  • 00:17:30 results. But if you just want to see how much  the model has learned, you can leave this only  

  • 00:17:37 with the instance prompt. So you will get a better  idea of how much the face has been learned by the  

  • 00:17:47 model. Sample prompt template. We don't need  a prompt template right now. Sample image:  

  • 00:17:53 negative prompt. You can just copy and paste  it in here as well. Okay, Image Generation. So  

  • 00:18:00 when doing training we will generate, let's  say, generic images, generic face of a man,  

  • 00:18:08 images to make our training more generalized.  Okay, to improve its success rate. So I will  

  • 00:18:20 generate 10 times my input count, like this. And  these are the other things that you need to  

  • 00:18:30 set for generating template images, we may say,  or generic images, we may say. Number of samples  

  • 00:18:36 to generate: one. Okay, it looks good. Okay, you  can just enter 10 times that. It's up to you.  
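
The "10 times the input" rule used here for class (regularization) images is one line of arithmetic:

```python
# Number of class/regularization images to generate, following the video's
# rule of thumb: 10x the number of instance (training) images.
num_instance_images = 16
multiplier = 10
num_class_images = num_instance_images * multiplier
print(num_class_images)  # 160
```

This matches the 160 class images the training run generates later in the video before the actual LoRA training begins.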

  • 00:18:43 And we are ready to do training. But I will also  show you one other thing in settings. In settings  

  • 00:18:53 when you go to training section here, you can  reduce VRAM usage by clicking this checkbox. Okay,  

  • 00:19:03 I think it will probably reduce your training  speed. And you can also turn on pin memory for  

  • 00:19:10 data loader. Make training slightly faster but  increase memory usage. So you may play with these  

  • 00:19:15 settings to obtain the best possible training  speed. I have 12 GB VRAM. So I may open this,  

  • 00:19:28 but I won't enable that. Also, the others are  just fine. Okay, and let's click the training button,  

  • 00:19:39 since we are ready. Okay, when we click training  button, first it will generate the generic  

  • 00:19:49 face of a man images. Why face of  a man? Because we have entered our  

  • 00:19:58 class prompt as face of a man. And where  will they be saved? I think it will  

  • 00:20:02 save them in here, brother classification. So it is  starting to generate our generic face images to  

  • 00:20:13 add more variety to our data set. So you see,  it is generating face of a man with the given  

  • 00:20:20 prompt I have. You can also improve this prompt  with adding other, let's say, styling prompts,  

  • 00:20:29 more quality prompts, anything you want. If  you watch my previous videos on the channel,  

  • 00:20:37 on the playlist, you will understand what I  mean. This is actually same as doing inference  

  • 00:20:43 text2img from here to generate images. It is  exactly doing that to improve variety of our  

  • 00:20:57 classification training. You see they  are really bad quality right now,  

  • 00:21:04 So maybe we should tune our class prompt.  

  • 00:21:12 To do that I will just cancel the training  with clicking cancel button. You see training  

  • 00:21:20 cancelled. And you see that these are  the images it has generated. I have  

  • 00:21:27 deleted all of them, and I will modify  the class prompt by adding some keywords. Okay, to  

  • 00:21:39 decide what to enter, I have moved to text2image  tab and I have typed: portrait photo of a man,  

  • 00:21:46 HDR, 8K and sharp. The keywords that you will find  from real photos of people. And I have entered my  

  • 00:21:56 negative prompt as well and this is the image that  has been generated. It is pretty good. So I am  

  • 00:22:02 returning to my DreamBooth tab and in here  I am changing my class prompt like this  

  • 00:22:10 and now I am clicking train again. Now it will  generate 160 class images for training. Basically  

  • 00:22:23 the generic images to improve my training quality.  Let's see what kind of results we are going to  

  • 00:22:30 get. We should get results same as we got in  text2image tab actually. Yeah, it's a decent  

  • 00:22:37 face photo of a male. Why are we doing this? As  I said, to increase the variance and variation. When  

  • 00:22:51 you have different styles and variations of  photos, it will prevent overtraining and it will  

  • 00:23:01 force the model to learn the face of the person you  want to teach. OK, so you see, now we are getting  

  • 00:23:09 really decent quality face images of male persons  and it will help our model to learn better.  

  • 00:23:21 OK, meanwhile, the training is going on. I  mean, the image generation is going on. Let's  

  • 00:23:28 quickly recap. First, we generated our model with  a unique name like this, and we have selected the  

  • 00:23:35 source checkpoint. You can use any  model on the Internet that you want to teach as the source checkpoint. OK,  

  • 00:23:42 It will work exactly the same as version 2.1.  The only thing that may differ: if your model  

  • 00:23:51 is based on stable diffusion 1.5 or 1.4, they  use 512 pixel size images. Therefore, the only  

  • 00:24:02 thing that you need to change is the image  size, which is where was it? Let me show you.  

  • 00:24:14 Here image processing resolution. So, if you use  a checkpoint model based on 1.5, 1.4 or 512 pixel  

  • 00:24:24 based 2.x version, then you need to change this to  512 pixel. But if you are using 2.1 version based  

  • 00:24:35 model which has native 768 pixel resolution,  then you leave it at 768. Other than that,  

  • 00:24:46 we are going to parameters: training steps per  image. I have set this very big number because  

  • 00:24:51 I will stop the training myself at a certain  point. I will show you when I will stop it and  

  • 00:24:58 how I will decide to stop training. And this is  important. I will save and generate images every  

  • 00:25:07 10 epochs. One epoch  happens when it processes all of the images  

  • 00:25:14 in my training folder. I have 16 images in my  training folder, which is here. It will also put,  

  • 00:25:23 I think, flipped images there, so it will be 32  images. We will see that when training starts,  

  • 00:25:29 currently still generating the generic images that  I have requested, like this. OK, so I will be able  

  • 00:25:39 to decide whether model has learned enough so that  I can stop and start using the model or not. OK,  

  • 00:25:48 so these save previews and save checkpoints are  really important to see the progress of training.  

  • 00:25:56 The batch size is, I think, related to how many  GPUs you have, or if you have a very strong GPU  

  • 00:26:04 that can process in parallel two images at the  same time. If it has enough VRAM memory, you can  

  • 00:26:12 also increase this. But if your graphic card can  only process one image at a time, then you should  

  • 00:26:19 leave both of these as one. I didn't change any  of the learning rates or other things. I did leave  

  • 00:26:28 them default. I have also applied horizontal flip:  it randomly decides to flip images horizontally so  

  • 00:26:36 that it will add more variation to the learning  data set. I don't have any VAE or concept list.  

  • 00:26:46 I am using LoRA because this way it will use  less VRAM than DreamBooth. And the saved files  

  • 00:26:57 will be 1000 times smaller than DreamBooth's,  because DreamBooth generates full-size model  

  • 00:27:04 files for checkpoints. However, this will generate  minimal files. Then from those files, after we  

  • 00:27:12 are satisfied with the training process, we will  generate full model. I am using 8bit Adam to save  

  • 00:27:20 VRAM. I am using mixed precision memory attention  and I didn't change any other parameters. And in  

  • 00:27:27 the concepts I did set my data set directory and  the classification data set directory. I have  

  • 00:27:33 already shown you them. We are not using any file  words because we are not doing a general concept  

  • 00:27:40 training. It is not in the scope of this video.  I may make another video to train a hypernetwork or  

  • 00:27:50 textual embeddings. The instance prompt. This is  really important because this keyword is being  

  • 00:28:00 taught to our model. So when I do inference  with the new model, the tuned model, it will  

  • 00:28:09 know that this keyword is the face of my brother  pictures. Therefore, this is really important.  

  • 00:28:18 This is the generic class prompt. I already  have explained that. And these are the arbitrary  

  • 00:28:26 numbers. Actually, this is the arbitrary number  I have entered. I didn't change the other things.  

  • 00:28:33 These are only affecting the images generated  in here. None other than that. So this part is  

  • 00:28:43 only important for these images. So it will  generate 160 images in this folder. You see,  

  • 00:28:50 it is also generating same named text files and  it is saving the description of the input. You  

  • 00:29:00 could also modify these descriptions, but  I think it is not very important for LoRA  

  • 00:29:04 training. It is important for hypernetwork and  especially for text embeddings. Now I will pause  

  • 00:29:12 the video until the image generation is done  and the training has started. OK, meanwhile,  

  • 00:29:19 the class image generation: So far, we are almost  at 50 percent. It says there is still 20 minutes  

  • 00:29:27 remaining. Approximately. It is using 95 percent  of my GPU. It is using almost 9 gigabytes of my  

  • 00:29:37 GPU and it is using about 20 percent of CPU. So  these are the values that it is using for just  

  • 00:29:45 class image generation. And let's see how much  it will use for training. And the class image  

  • 00:29:53 generation speed is 14.58 seconds / IT. OK, the  training process has started. After generating all  

  • 00:30:04 of the images. Let me show you them. Once you have  generated these generic images, you don't have to  

  • 00:30:11 generate them once again. You can stop and restart  training and use these base generic images.  

  • 00:30:19 However, an error occurred. So my web interface  is not getting updated anymore, unfortunately.  

  • 00:30:29 But the training is going on, as you  can see. Currently it is two iterations,  

  • 00:30:34 actually two seconds per iteration as a  speed. It has done 145 iterations so far.  
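
The timing figures quoted here can be turned into a rough schedule estimate. A sketch using the observed numbers (roughly 2 s/iteration on the RTX 3060, 16 images doubled to 32 by horizontal flips, batch size 1); the flip-doubling assumption comes from the video's own remark about 32 images:

```python
# Back-of-the-envelope training schedule from the speeds observed in the video.
seconds_per_iteration = 2.0          # observed on an RTX 3060
steps_per_epoch = 16 * 2             # horizontal flips roughly double the set to 32
epoch_seconds = steps_per_epoch * seconds_per_iteration

checkpoint_every_epochs = 10
minutes_per_checkpoint = epoch_seconds * checkpoint_every_epochs / 60.0
print(epoch_seconds, round(minutes_per_checkpoint, 1))  # 64.0 10.7
```

So at this speed a new checkpoint and preview image arrive roughly every ten minutes, which is why letting it run and inspecting previews periodically is practical.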

  • 00:30:47 And let's see how much VRAM. Oh, you see,  my entire VRAM is almost full. It says  

  • 00:30:56 that there is allocated and reserved, but I am  seeing the full VRAM usage in my graphic card.  

  • 00:31:04 And after 10 epochs we are supposed to get our  first training output to see. OK, it says that  

  • 00:31:15 you see, LoRA weights successfully saved to C  Stable Diffusion web UI, which is my folder.  

  • 00:31:20 Inside models LoRA. So let's go there and check it  out. In Stable Diffusion web UI in models in LoRA.  

  • 00:31:33 OK, so you see, this is the checkpoint file it  has generated and it is only three megabytes. So  

  • 00:31:40 I can generate a checkpoint file even for every epoch. However, if we were using DreamBooth  

  • 00:31:48 instead of LoRA, each checkpoint would be at minimum 4, 5 or 6 gigabytes, depending on  

  • 00:31:56 the model that you are using as the initial checkpoint. So for version 2.1, it would be equal  

  • 00:32:07 to minimum five gigabytes, because the base model is five gigabytes. If we were using DreamBooth  

  • 00:32:13 instead of LoRA, every checkpoint would  be five gigabytes. But now we are only getting a  

  • 00:32:20 three-megabyte checkpoint, even smaller than 1/1000 of that. OK, so we should also have gotten the first  

  • 00:32:34 image output of our training. Where is it saved, if you are wondering? Inside the  

  • 00:32:42 DreamBooth folder, in the my brother model folder, in samples. Yes, so this is the first sample image it has  

  • 00:32:50 generated after ten epochs. In this folder, as time passes, we are going to see images that  

  • 00:32:59 will be similar to my brother's sample images.  Let me show you our training data set images.  

  • 00:33:09 Let me show you once again. So once we get images  as close as possible to our training data set,  

  • 00:33:17 then we will generate a checkpoint model file, a full model file, from that  

  • 00:33:27 LoRA file. Which file exactly? Let me open the C folder once again  

  • 00:33:37 to explain better. In here, in models, in LoRA. So once we get a good image, we are  

  • 00:33:46 going to take the corresponding file in here, you see the file name is 160, and we will  

  • 00:33:54 generate a full model checkpoint from that, and then we will be able to generate images of  

  • 00:34:01 the person we trained it for. OK, so now I will pause the video until we get some good results.  

  • 00:34:10 Oh, by the way, it seems like. Yes, yes, the  process has stopped. So therefore I have to  

  • 00:34:17 continue. Probably an error has occurred. Yeah,  an error occurred. So what we need to do is:  

  • 00:34:26 OK, let me show you how to continue from there. So I am refreshing the web interface. This  

  • 00:34:34 error may occur from time to time. And in here I go back to the DreamBooth tab. And in here, you see,  

  • 00:34:41 let's refresh. We have my brother as LoRA  model. And OK, so let's click load params and  

  • 00:35:04 see if it will load. I hope it loads. OK, it says loading. I may need to restart the application.  

  • 00:35:04 Yeah, probably I need to restart the application.  You see, when you play with the web UI while  

  • 00:35:10 training, these kind of errors may occur.  So I will now restart the application. OK,  

  • 00:35:16 after I close the command line, you see connection  error occurred. So let's go back to our stable  

  • 00:35:24 diffusion web UI folder and double-click webui-user.bat. OK, I have restarted the web UI. Now refresh  

  • 00:35:33 and go to the DreamBooth and pick the LoRA model.  And let's click load params. Please specify model  

  • 00:35:41 to load. So you see, the my brother model is now here. After the restart, after I click load params  

  • 00:35:49 and it loads the saved config, I click train, and it should continue from where it left off. OK, you see,  

  • 00:35:57 it says: Concept requires 160 images. It has loaded the same class images, so it is not  

  • 00:36:03 regenerating the classification images. It is loading the weights from where it left off, I think.  

  • 00:36:12 So the number of examples: 16. Number of batches per epoch: 16. Correct. Number of epochs: one  

  • 00:36:19 million, as we have set, so the total optimization steps are 16 million. OK, everything looks correct.  
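
As a quick sanity check, the numbers in that log line are consistent with each other. A minimal sketch of the step math, with the values from the video and assuming a batch size of 1 (so batches per epoch equals the image count):

```python
# Sanity check of the step math printed in the console (values from the video).
num_images = 16
batches_per_epoch = num_images          # assuming batch size 1
epochs = 1_000_000                      # the wizard's effectively-unlimited default

total_steps = batches_per_epoch * epochs
assert total_steps == 16_000_000        # matches "total optimization steps: 16 million"

# The reverse direction converts a step counter back into epochs:
def steps_to_epochs(steps: int, batches_per_epoch: int) -> int:
    return steps // batches_per_epoch

print(steps_to_epochs(640, batches_per_epoch))  # 40 epochs after 640 steps
```

The same conversion explains the epoch counts mentioned later in the video (480 steps is 30 epochs, 5600 steps is 350 epochs).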

  • 00:36:28 And, yes, it is now continuing where it left off. This is great. So if an error occurs,  

  • 00:36:37 this is how you are going to continue your  training. So you can further optimize your  

  • 00:36:44 model from any checkpoint. And in  here, if you see, let me zoom in.  

  • 00:36:50 You see, OK, I did zoom too much. Training step  is 23. This is the current session. And, you see,  

  • 00:36:59 this is the lifetime count. This is different from what you set in here: use lifetime steps,  

  • 00:37:06 epochs when saving. We didn't check that. So  we are only taking into account the current  

  • 00:37:13 session steps for saving and previewing the  checkpoints. But this is the lifetime. OK,  

  • 00:37:20 now I will pause the video again. OK, so you see, after the second save it continues the  

  • 00:37:28 training. So sometimes errors may happen, even  though they shouldn't. If an error happens,  

  • 00:37:36 just restart the application, just as I have shown, and continue training. So the samples are  

  • 00:37:44 getting produced. I hope it doesn't take too much  time to teach my brother face into the model.  

  • 00:37:55 OK, it has been only 30 epochs so far and we have already got a somewhat  

  • 00:38:02 similar picture in the third one. You see, this is the 30th epoch, after 480 steps  

  • 00:38:09 in total. And this is my brother. You see,  there is a similarity, as you can see. OK,  

  • 00:38:18 I have noticed another mistake. You see, the command line interface is displaying that the LoRA  

  • 00:38:25 weights have been saved to my brother, underscore 160 dot pt. However, it is correctly saving into the  

  • 00:38:33 folder. So this printed message is incorrect, but the saved file names themselves are correct,  

  • 00:38:40 as you can see. So it says the 30th epoch has been done, but actually it is the 40th epoch:  

  • 00:38:47 yes, since we have 16 images, when you divide the step count by 16, it is 40 epochs. And these are the  

  • 00:38:54 images generated so far. You see, they start to resemble more and more as the training continues.  

  • 00:39:00 OK, it has been over 1423 steps so far, and the training speed is about 1.60 seconds per  

  • 00:39:14 iteration. So far it is going well; we are getting closer to our target image, as you can see here.  

  • 00:39:24 OK, It has been over 5600 steps so  far, which makes 350 epochs. And for  

  • 00:39:37 this tutorial I am now going to cancel the  training and I will generate a checkpoint  

  • 00:39:45 based on what looks to me like the best epoch, which is sample 2400. You can  

  • 00:39:57 continue to do training until you are satisfied  with the results. But these results are just  

  • 00:40:03 a preview of what it has learned. With a good  prompt you can obtain much better photos. It  

  • 00:40:11 also depends on your data set quality. If  you prepare better images than in this example,  

  • 00:40:18 you can still obtain better results. I think  this is a decent one. And let's generate our  

  • 00:40:26 model checkpoint. So how are we going to generate our model checkpoint to use later?  

  • 00:40:32 You see, there is the LoRA model selector, and now we are going to generate a checkpoint from our 2400 snapshot.  

  • 00:40:42 I am entering the model name here  that I want to give. Let's say, my  

  • 00:40:48 brother test one, and I am clicking generate ckpt  file. OK, it is generating. You can see that it is  

  • 00:40:58 loading LoRA from the selected checkpoint from  here and applying weight. As you can see here:  

  • 00:41:07 LoRA weight: what percentage of the LoRA weight should be applied to the UNET when training  

  • 00:41:12 or creating a checkpoint. It is applying the text encoder weight as well. And then it is saving. However,  

  • 00:41:19 the saved file name is not correct. You see, it has appended the latest training step number, however  

  • 00:41:28 it has loaded 2400. So I think there is a small mistake in the web UI. So where is it saved?  

  • 00:41:39 It is saved inside the Stable Diffusion installation folder, then models, then the Stable Diffusion  

  • 00:41:46 folder. OK, I am going to rename this to the name that I want. Let's say LoRA.  

  • 00:41:54 And you see it also has a YAML file, which has to have the same name. OK, I renamed them with F2.  

  • 00:42:02 OK, now I can go to txt2img and generate new images based on our training. How are we going to do  

  • 00:42:10 that? First click refresh here, and then my new model appears here. It is now loading. You  

  • 00:42:19 can see the loading from this command window here:  Loading config and loading other parameters. OK,  

  • 00:42:27 the model has been loaded. Now what is the prompt that we are going to use? The prompt  

  • 00:42:35 is the instance prompt we have given in here, which is my brother face. OK, this is our  

  • 00:42:42 unique keyword, and then we will append the other keywords that we want. Even though our model has  

  • 00:42:51 learned the face very well, as soon as we add new keywords to obtain different styles of  

  • 00:43:01 the learned face, it produces totally different images. Unfortunately, no matter how many times  

  • 00:43:09 I have tried, all my attempts have failed. It always produces different faces, not the face  

  • 00:43:17 it has learned. If I only give my instance prompt, yes, it produces the face of my brother. But then  

  • 00:43:26 what is the purpose of the training, if I am not able to modify it, change the style, produce  

  • 00:43:33 different styles. Therefore, now I will do another  training with SD version 1.5. And let's see the  

  • 00:43:41 difference between SD version 2.1 and 1.5 when  we are doing face training. Since SD version 1.5  

  • 00:43:51 requires 512 pixel resolution, I am recropping the images, as you can see. I am cropping them again,  

  • 00:44:03 and I have removed some of the very old images.  Actually, I only removed two of them. Okay,  

  • 00:44:10 this is how I am setting up the  images for SD version 1.5 training. Okay, yes,  

  • 00:44:21 like this. Save as zip, open the downloaded file and extract the images into the folder.  
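
The manual recropping step can also be scripted. A minimal sketch using Pillow, where the folder names and the .jpg extension are my assumptions, not something shown in the video:

```python
# Hypothetical sketch: center-crop and resize training images to the 512x512
# resolution that SD 1.5 expects. Folder names here are examples only.
from pathlib import Path
from PIL import Image, ImageOps

def prepare_images(src_dir: str, dst_dir: str, size: int = 512) -> int:
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    count = 0
    for path in sorted(Path(src_dir).glob("*.jpg")):
        img = Image.open(path).convert("RGB")
        # ImageOps.fit center-crops to the target aspect ratio, then resizes
        img = ImageOps.fit(img, (size, size), Image.LANCZOS)
        img.save(dst / path.name)
        count += 1
    return count
```

For the earlier SD 2.1 run you would pass size=768 instead, since that model's native resolution is 768 pixels.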

  • 00:44:33 Okay, here, like this, I will overwrite and  the training set is now ready. Okay, for 1.5,  

  • 00:44:40 I am first changing the model to 1.5. Then I am going to the DreamBooth tab. And in here we are  

  • 00:44:49 going to generate a new model. Let's say,  okay, brother, SD 15, like this. And the  

  • 00:44:58 source checkpoint will be 1.5 because we are  starting a new model. Let's generate the model  

  • 00:45:06 like this: Okay, it is preparing  the model file for training.  

  • 00:45:12 Actually, everything is the same. Then I am clicking training wizard (person). It will  

  • 00:45:17 set the parameters. Oh, I think I clicked it too early. Yeah, I didn't wait for the process to finish.  

  • 00:45:26 Okay, now I will set again. Okay, now it is set  and the model is also set. All right, you see,  

  • 00:45:33 the model has arrived here for training. Okay, I  am just doing the same things. By the way, we now  

  • 00:45:41 need to generate new class images to improve the accuracy. Also, this apply horizontal  

  • 00:45:53 flip option means that at runtime it will sometimes feed horizontally flipped copies of the images into the  

  • 00:46:01 training. It won't generate new images in the folder; it flips them at runtime. Okay,  

  • 00:46:09 I am selecting LoRA. I am using 8bit Adam with FP16, xFormers, and Don't Cache Latents.  

  • 00:46:25 Okay, I am not changing other things because it is already learning well,  

  • 00:46:29 but we weren't able to generate  good images. I think it was due to the  

  • 00:46:34 version, SD version 2.1. Okay, here is the path for our VAE. And classification: we now need to make  

  • 00:46:48 new classification images, so I will just make another folder for them. Okay, like this. Let's enter here.  

  • 00:46:58 All right, we are leaving this empty because we  are only teaching one face. I will give the same  

  • 00:47:05 name as the model name to the instance prompt. Now the class prompt. It is important to decide the class  

  • 00:47:14 prompt, so I will do a few tests here. Okay, with a simple prompt such as face photo of a man,  

  • 00:47:21 8K HDR, smooth, sharp focus and cinematography, we get decent faces. So this will be our class  

  • 00:47:32 prompt. Okay, let's go back to our DreamBooth  training, and the class prompt will be like this:  

  • 00:47:39 Classification image negative prompt: let's also copy and paste it. By the way, don't worry,  

  • 00:47:45 I will provide these as comments. Okay, so the sample image prompt will be the same as before. Okay,  

  • 00:47:55 and should I provide a negative prompt for the samples? Yes, let's also provide it. Okay,  

  • 00:48:03 how many class images do we want this time? How many images do we have in the folder? Let's check it out  

  • 00:48:12 once again. Okay, we have 14, so I will generate just 140. Okay, I'm not touching  

  • 00:48:23 this. The parameters are set. Everything is looking good. Okay, let's start another training.  
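
The class-image count follows the same ratio used in the first run (16 instance images, 160 class images). A tiny sketch of that rule of thumb; the 10x multiplier is just the ratio chosen in this video, not a fixed rule:

```python
# Rule of thumb used in this video: roughly 10 class (regularization) images
# per instance image. The per_image multiplier is this video's choice.
def class_image_count(instance_images: int, per_image: int = 10) -> int:
    return instance_images * per_image

print(class_image_count(14))  # 140, as entered for the SD 1.5 run
print(class_image_count(16))  # 160, as in the earlier SD 2.1 run
```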

  • 00:48:35 So you see, since we don't have any concept images  now, it is going to generate first our class  

  • 00:48:43 images, as before. But this time we are using  512 pixel resolution. This is really important  

  • 00:48:50 because our base model is now version 1.5 and  it is using 512 pixel as native resolution.  

  • 00:49:00 Generating class images is now much faster, you see, because the resolution is now lower.  

  • 00:49:12 Okay so, the classification training set has been  completed and now the training has started and so  

  • 00:49:20 far we are at training step 50. You see, it is much faster now than before, because we are  

  • 00:49:29 working with 0.44, meaning 44% of the pixel count compared to before. Before we were working at  

  • 00:49:41 768 pixels; now we are working at 512 pixels.  Therefore, it is more than two times faster.  
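
The 0.44 figure comes straight from the resolution ratio: the work scales roughly with the pixel count, which is the square of the edge length. A quick check:

```python
# Pixel-count ratio between 512px and 768px training resolutions.
ratio = (512 / 768) ** 2
print(f"{ratio:.2f}")      # ~0.44: a 512px image has about 44% of the pixels
print(f"{1 / ratio:.2f}")  # ~2.25: hence "more than two times faster"
```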

  • 00:49:50 New model training checkpoints are  also getting saved under the models  

  • 00:49:57 LoRA folder. As you can see here  with the name that I have given.  

  • 00:50:01 Also, the new folder under DreamBooth has been  generated for the new training in here, brother  

  • 00:50:09 SD15. And in here we can see the samples it is generating: so far, nothing resembling him at all.  

  • 00:50:17 Okay, so the training has been completed. I let it run during the night while I was sleeping, so it  

  • 00:50:23 generated many checkpoints. And now I am going to use this particular checkpoint to generate our  

  • 00:50:31 .ckpt file from here, the same way as before. I have selected the model,  

  • 00:50:41 selected the checkpoint, given a name, and  then clicked generate ckpt file. Now  

  • 00:50:51 I am going to load the newly generated ckpt file. To do that, just click refresh.  

  • 00:50:58 And it should come. And yes, it has arrived. It is now loading that checkpoint model.  

  • 00:51:05 It is done. And now we can do our tests. OK, I  have generated over 600 images and some of them  

  • 00:51:15 are really good and really resemble the face we taught it. So the key thing is that you need to  

  • 00:51:23 generate more images with LoRA, because I think it is not as precise as DreamBooth. The prompt I have  

  • 00:51:31 used is portrait photo of brother SD 15, which is my instance prompt, with weight 1.2. A 1.2 weight  

  • 00:51:40 means that it will give more importance to this  keyword. On the official page of Automatic1111  

  • 00:51:48 Stable Diffusion web UI wiki, under features, you can see attention/emphasis, and it explains  

  • 00:51:55 how you can give more attention to each word.  You can use parentheses like this, or you  

  • 00:52:02 can directly set the weight like this. So it is totally up to you which way to use. I have  

  • 00:52:10 given more importance to the prompt instance and  I have also written photo of brother SD 15. And  

  • 00:52:17 then I have used generic keywords to generate images as close as possible to our instance prompt. You see:  

  • 00:52:26 8K HDR, smooth, sharp focus cinematic. I am going  to share all these keywords in the comments of the  

  • 00:52:34 video, and I have also entered a lengthy negative prompt. I have used Euler a as the sampling method  

  • 00:52:44 with 25 steps and the native resolution for SD 1.5  512 pixels. So how can you generate more than 100  

  • 00:52:55 images? Set the batch count to 100, then go to  bottom. Here. You will see the script section.  
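
One way to prepare input for the prompts-from-file option in that script section is to write one prompt per line into a text file. A minimal sketch; the style keywords and the prompts.txt file name are made-up examples, and the (keyword:1.2) emphasis syntax is the Automatic1111 one discussed above:

```python
# Hypothetical sketch: build a prompt list for the "Prompts from file or
# textbox" script; each line becomes one generation job.
base = "portrait photo of (brother SD 15:1.2), 8K HDR, smooth, sharp focus, cinematic"
styles = ["oil painting", "pencil sketch", "cyberpunk style"]  # example styles

lines = [base] + [f"{base}, {style}" for style in styles]
with open("prompts.txt", "w", encoding="utf-8") as f:
    f.write("\n".join(lines) + "\n")
```

Paste the file contents into the script's text box (or point it at the file), and the web UI will run each line in turn.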

  • 00:53:03 By default it is set to none, but you can go to  prompts from file or text box and you can just  

  • 00:53:10 copy and paste your prompt. So it will read  each line and will continue generating images  

  • 00:53:18 until all of the lines are executed. This way, you can generate many more images. Also,  

  • 00:53:27 there are other options that you can use here. For example, you can do X/Y plots. You can give  

  • 00:53:35 X values and Y values and, if you wonder what they are: separate values for the X axis, using commas.  

  • 00:53:42 You can play with these to, for example, generate images in different styles by having, let's say,  

  • 00:53:52 artist names or style names in your X values  and the Y values would be like your regular  

  • 00:54:00 prompt. Okay, so I have selected a few of the images and now I will show you how to upscale them. The  

  • 00:54:09 resemblance rate is not as good as DreamBooth,  unfortunately. So you can also do DreamBooth  

  • 00:54:16 training. The only difference in DreamBooth training compared to LoRA is in the advanced setup:  

  • 00:54:22 you just don't pick LoRA, and it will do  DreamBooth training. And also be careful that  

  • 00:54:27 when you are doing DreamBooth training it will generate 4 to 5 GB files at each checkpoint save.  

  • 00:54:35 So you may want to save less often, increasing the save-checkpoint interval from 10 epochs to maybe  

  • 00:54:43 50. It totally depends on your hard drive capacity. Okay, so the first thing I am going to do:  

  • 00:54:50 let's check out the PNG Info tab, in here, in pictures, in brother selected. Let's pick  

  • 00:54:57 one of them. So you see, the Web UI embeds the generation parameters as metadata. So if you can  

  • 00:55:05 get the original image that was generated by the Web UI, you can just use PNG Info to extract the  

  • 00:55:13 parameters from that image. And one other thing I am going to show you is extras. In the extras tab,  

  • 00:55:20 let's first try a single image. You can upscale  it. Okay, the best upscaling algorithm I have  

  • 00:55:28 found is R-ESRGAN 4x+; I pretty much like this one. And let's upscale to 3x the dimensions. The  

  • 00:55:39 first time you do this, it may download something in here. Since I have done it previously,  

  • 00:55:44 it didn't need to download the necessary models. Okay, the upscaling is done. As you can see,  

  • 00:55:51 this is the upscaled version. You can also apply GFPGAN visibility. GFPGAN will  

  • 00:56:00 improve the face of a human; it is another  model. Let's do that and see the difference.  

  • 00:56:10 Okay, it is getting done. And yes, now you see it looks more like a real human. This fixes the  

  • 00:56:21 eyes especially, making them much better if they are misaligned or not symmetric.  

  • 00:56:29 So you may want to apply  this as well, if you want.  

  • 00:56:34 And you can also do batch processing. For batch processing, just open the folder, select all with Ctrl-A,  

  • 00:56:41 and open. The images will be loaded like this.  Then it will apply all of the parameters you  

  • 00:56:48 set here when you click generate, and all of the images will be processed as a batch. So the  

  • 00:56:58 results of the batch generation will be saved  in a folder. If you want to open that folder,  

  • 00:57:03 just click this folder image here. You see open  images output directory and the batch processing  

  • 00:57:08 results will appear here. You can just directly  copy them, paste them and do whatever you want.  

  • 00:57:17 Okay, this is all for today. Please  ask any questions that you might have.  

  • 00:57:23 To improve the results, I suggest you use DreamBooth instead of LoRA training. Also,  

  • 00:57:31 you can improve your data set. Our data set was not very good: you see,  

  • 00:57:40 the images were captured at almost the same time, with the same poses. So if you add more variety to your data set,  

  • 00:57:48 you will more likely obtain better results. And also, please like, share and subscribe to our channel  

  • 00:57:56 if you have enjoyed this. And if you support us on Patreon, we would appreciate it very much. Currently,  

  • 00:58:02 so far, we have one Patreon supporter, as you can see. Thank you very much to our beloved supporter, by the way,  

  • 00:58:08 And I am hoping that you will support us as well.  Hopefully, more videos, more advanced videos will  

  • 00:58:15 come for Stable Diffusion. If you want to also  learn something about Stable Diffusion, let me  

  • 00:58:22 know by comments. And hopefully I will make videos  about them. Hopefully, see you in another video.
