
Z-Image Turbo LoRA training with AI Toolkit and Z-Image ControlNet Full Tutorial for Highest Quality



Z-Image Turbo LoRA training with Ostris AI Toolkit + Z-Image Turbo Fun ControlNet Union + 1-click download and installation of the very best Z-Image Turbo presets. In this tutorial, I will explain how to set up the Z-Image Turbo model properly on your local PC with SwarmUI, download the models, and use them at the highest quality via ready-made presets. Moreover, I will show how to install Z-Image Turbo Fun ControlNet Union to generate amazing-quality images with ControlNet preprocessors. Furthermore, I will show how to 1-click install AI Toolkit from Ostris and train Z-Image Turbo LoRAs with the highest-quality configs, made for every GPU tier: 8 GB, 12 GB, 24 GB, and so on. I did massive research to prepare these Z-Image Turbo training configurations.

👇 Links & Resources Mentioned:

Download SwarmUI & Models: [ https://www.patreon.com/posts/Download-SwarmUI-Models-114517862 ]

Ostris AI Toolkit (SECourses Version): [ https://www.patreon.com/posts/Ostris-AI-Toolkit-140089077 ]

Ultimate Batch Image Processing App: [ https://www.patreon.com/posts/Ultimate-Batch-Image-Processing-App-120352012 ]

SwarmUI with ComfyUI Backend Windows Tutorial: [ https://youtu.be/c3gEoAyL2IE ]

SwarmUI with ComfyUI Backend RunPod and Massed Compute Cloud Tutorial: [ https://youtu.be/bBxgtVD3ek4 ]

⏱️ Video Chapters:

00:00:00 Introduction to Z-Image Turbo Model

00:00:54 FP8 Scaled Version 5.7GB for Low VRAM

00:01:10 ControlNet Union with Z-Image Turbo

00:01:30 LoRA Training with Ostris AI Toolkit

00:02:00 Default vs Custom Training Preset Quality Comparison

00:03:00 RunPod Cloud Training Preview

00:03:40 MassedCompute Cloud Training Preview

00:04:16 Downloading Z-Image Models via SwarmUI

00:05:00 Z-Image Turbo Core Bundle & ControlNet Files

00:05:58 FP8 Scaled Model & Musubi Tuner Converter

00:07:13 Updating ComfyUI for Sage & Flash Attention

00:08:13 Updating SwarmUI & ControlNet Preprocessors

00:08:52 Updating & Importing Latest SwarmUI Presets

00:09:20 Generating with Quality 2 Fast Preset

00:10:48 Generating with Quality 1 Upscale Preset

00:11:35 Quality 1 vs Quality 2 Visual Comparison

00:12:13 Setting up ControlNet Input & Aspect Ratio

00:13:41 ControlNet Strength Settings & Canny Test

00:15:26 Using Depth Preprocessor with Z-Image

00:15:58 Coloring Lineart Drawings with ControlNet

00:16:58 Lineart Preprocessing Comparison

00:17:50 Ostris AI Toolkit Installation Prerequisites

00:19:12 Installing Ostris AI Toolkit on Windows

00:20:02 First Time UI Setup & Launching Interface

00:21:04 Loading Custom Training Configs

00:21:38 Creating a New Dataset Structure

00:22:24 Ultimate Batch Image Processing App Install

00:23:17 Dataset Prep Stage 1: Auto-Zooming with SAM2

00:26:08 Dataset Prep Stage 2: Resizing to Exact Resolution

00:28:12 How to Select Best Training Images

00:30:24 Importance of Emotions & Angles in Datasets

00:31:44 Z-Image Resolution & Aspect Ratio Rules

00:33:21 Configuring Training Parameters & Epochs

00:36:52 Resolution Impact on Training Speed

00:37:46 Starting the Training Job on Windows

00:38:39 Monitoring Training Progress & VRAM

00:39:43 Checkpoint Generation Settings

00:40:40 Resuming Training from Last Checkpoint

00:42:09 Training Speeds on RTX 5090 vs 4090 vs 3060

00:43:01 Training Quality: Default vs Custom Preset Comparison

00:44:21 Testing LoRAs with SwarmUI Grid Generator

00:46:04 Fixing ControlNet Error in Grid Generation

00:47:09 Comparing Generated LoRA Checkpoints

00:47:38 Using Trained LoRA with ControlNet Union

00:48:10 RunPod: GPU Selection & Template Setup

00:50:32 RunPod: Port 8675 Config & Initialization

00:51:36 RunPod: Uploading Installation Files

00:52:01 RunPod: One-Click Installation Command

00:54:07 RunPod: Starting AI Toolkit & Proxy Connection

00:54:38 RunPod: Uploading Dataset via Interface

00:55:32 RunPod: Starting the Training Job

00:56:24 RunPod: Speed & Cost Analysis

00:57:28 RunPod: Auto-Stop Command Setup

00:58:24 MassedCompute: GPU Selection & Coupon Code

01:00:16 MassedCompute: ThinLinc Client Setup

01:01:21 MassedCompute: Transferring Files to Shared Folder

01:02:55 MassedCompute: Installation Command

01:05:49 MassedCompute: Connecting via Public URL

01:06:54 MassedCompute: Starting Training Job

01:08:43 Downloading Checkpoints & Stopping Instance

🚀 Master Z-Image Turbo & LoRA Training: The Ultimate Guide!

In this comprehensive tutorial, I show you how to generate ultra-realistic images in seconds using the lightweight Z-Image Turbo model. We cover everything from 1-click installation on SwarmUI (ComfyUI backend) to mastering ControlNet Union for precise image control.

But that’s not all! I also reveal how to train your own high-quality Z-Image Turbo LoRAs using the Ostris AI Toolkit. I have developed a custom training preset that significantly outperforms the default settings—you have to see the comparison to believe it! Whether you are on a local PC, RunPod, or MassedCompute, this guide has you covered.

🔥 What You Will Learn:

Z-Image Turbo Setup: How to run this fast, 6GB model (FP8 included) on almost any GPU.

ControlNet Mastery: Use Canny, Depth, and Lineart to control your generations perfectly.

LoRA Training: Step-by-step guide to training your own Z-Image Turbo LoRAs locally, on RunPod, or on MassedCompute.

Video Transcription

  • 00:00:00 Greetings everyone. In this tutorial video, I will show you how to use the Z-Image Turbo model.

  • 00:00:06 It is a very fast, very lightweight model with which you can generate amazing high-quality,

  • 00:00:13 extremely realistic, or stylized images on your  local PC. The model is as small as 6 GB with  

  • 00:00:22 maximum quality. And it will run on literally  every GPU, and since it is a turbo model,  

  • 00:00:29 it only requires 9 steps. All these  images were locally generated ultra-fast,  

  • 00:00:36 like 10 seconds generation time, and they are  very high resolution as you are seeing right  

  • 00:00:41 now. I will explain all of these. And all of these images were generated in SwarmUI

  • 00:00:47 using the ComfyUI backend with our presets. 1 click to install, download, and use right away.

  • 00:00:54 You probably haven't seen this before, but I have Z-Image Turbo FP8 scaled. This is 5.7 GB in size,

  • 00:01:04 so it fits into all GPUs. But this is not  all. Furthermore, I will show you how to  

  • 00:01:10 use ControlNet with the Z-Image Turbo model. So you see, based on this input image,

  • 00:01:21 these images were generated.

  • 00:01:28 But we are not done yet. I will also show you  how to train your Z-Image Turbo model LoRAs as  

  • 00:01:36 you are seeing right now. By using the AI  Toolkit from Ostris, you will be able to  

  • 00:01:42 train amazing LoRAs fully locally, even on a very weak GPU with very low VRAM. I have

  • 00:01:50 researched it extensively and prepared amazing  presets for all GPUs with the highest quality.  

  • 00:01:57 We have presets for 8 GB GPUs to 24 GB GPUs,  and each one of them has a very high quality.

  • 00:02:05 For example, let me show you the default preset of  the Ostris AI Toolkit versus our preset. So this  

  • 00:02:14 left image was trained with the default preset  of the AI Toolkit that he shows in his tutorial,  

  • 00:02:21 and this is our preset. This is default preset;  this is our preset. This is default preset;  

  • 00:02:28 this is our preset. There is a massive quality  difference between default versus ours. And  

  • 00:02:34 these are not cherry-picked images; these are grid  generations. Another case: this is default preset,  

  • 00:02:39 and this is my preset. This is default preset,  and this is my preset. This is default preset,  

  • 00:02:45 and this is my preset. There is a  massive difference. So I will show  

  • 00:02:49 how to train with our preset on your PC.  This is default preset, and this is our  

  • 00:02:55 preset. Default preset, our preset. You  will get amazing quality training images.

  • 00:03:00 But we are not done yet. I will also show you how  to train on RunPod extremely efficiently. With RTX  

  • 00:03:07 4090, for example, you will be able to train with  amazing speed. I will also show how you can turn  

  • 00:03:15 off your pod automatically after training. All  of the installation is literally 1 click, and  

  • 00:03:21 my installers install the latest version of the application. I am not using any custom template, so it

  • 00:03:28 is not static. We are using the official PyTorch  template. Therefore, with any GPU, you will be  

  • 00:03:34 able to 1 click to install and use this amazing  trainer on RunPod. But we are not done yet. I will  

  • 00:03:40 also show how to install and use on MassedCompute,  our favorite GPU provider, which is very fast. So  

  • 00:03:48 you will be able to install it on MassedCompute  and access it from a public URL like this and  

  • 00:03:55 train it like it is in your PC. So easy, so fast,  and it will run on cloud, not on your computer.  

  • 00:04:03 And with our SwarmUI presets, you will be able to  use this model with highest quality. And with our  

  • 00:04:10 model downloader, you will be able to download the  necessary files with just 1 click. So let's begin.

  • 00:04:16 As a first step, we need to download Z-Image  models. For that, we are going to use our SwarmUI  

  • 00:04:22 auto-installer and model downloader. Download the  latest zip file as usual as in previous tutorials.  

  • 00:04:28 If you don't know how to install and use SwarmUI, follow this video; it will explain everything to you.

  • 00:04:33 The links will be in the description of the video. Then copy the zip file into your SwarmUI

  • 00:04:39 installation and use WinRAR's extract here. Overwrite all of the files; this is very important. You

  • 00:04:45 need to overwrite all the files. Then double-click the Windows start download models app .bat file to run it.

  • 00:04:51 This will start the model downloader. First  of all, we need to download Z-Image models.

  • 00:04:56 We have significantly upgraded our application.  We have moved to Gradio version 6. The initial  

  • 00:05:03 loading will take a little bit longer  than before, as you are seeing right now,  

  • 00:05:08 but the application will work much better after it has loaded. Okay, we are loaded right

  • 00:05:13 now. You see we have significantly improved  our interface. We have updated our bundles,  

  • 00:05:19 and you see there is Z-Image Turbo core bundle.  This bundle will download Z-Image Turbo BF16,  

  • 00:05:26 the Qwen 3 4-billion-parameter text encoder. This is necessary for Z-Image. Z-Image

  • 00:05:32 Turbo full ControlNet Union. We are going to  use this ControlNet Union model to be able to  

  • 00:05:38 generate images based on reference images we want, using ControlNet preprocessors. And

  • 00:05:44 this is the VAE that it uses. So you can click  download all models from here, or you can click  

  • 00:05:49 individually these buttons to download.  Download all models is the best approach.

  • 00:05:54 Alternatively, if you have a very low  VRAM GPU, in the image generation models,  

  • 00:05:59 you will see Z-Image Turbo models, and you  will see we have Z-Image Turbo FP8 scaled.  

  • 00:06:06 This is a model that I have made myself. With  our SECourses Musubi Tuner premium application,  

  • 00:06:13 now we can convert Z-Image models into FP8 scaled  as well. So let me start the application to show  

  • 00:06:20 you. This is 5.73 gigabytes in size, so it fits into 6 GB GPUs as well. However, don't

  • 00:06:28 be discouraged even if you don't have a powerful GPU. With SwarmUI using the ComfyUI backend, you

  • 00:06:33 can run any model on any GPU, literally. Because it will use your RAM to hold some part of

  • 00:06:39 the model, and it will work. Don't worry about  that. But to get maximum speed, if you want, you  

  • 00:06:45 can use FP8 scaled if you have like 6, 8, or 10 gigabytes of VRAM.
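For context, here is a minimal sketch of what per-tensor "FP8 scaled" weight conversion generally looks like in PyTorch. This is only an illustration of the technique; the actual SECourses Musubi Tuner converter may use a different scaling granularity or key naming, and the `.scale` suffix below is a hypothetical convention.

```python
# Minimal sketch of per-tensor FP8 (e4m3) "scaled" quantization, assuming
# PyTorch >= 2.1 for torch.float8_e4m3fn. Not the actual converter code.
import torch

FP8_MAX = 448.0  # largest finite value of float8_e4m3fn

def quantize_fp8_scaled(state_dict):
    out = {}
    for name, w in state_dict.items():
        # Quantize large floating-point weight matrices; keep the rest as-is.
        if w.is_floating_point() and w.ndim >= 2:
            scale = (w.abs().max().float() / FP8_MAX).clamp(min=1e-12)
            out[name] = (w.float() / scale).to(torch.float8_e4m3fn)
            out[name + ".scale"] = scale  # hypothetical key; stored for dequantization
        else:
            out[name] = w  # e.g. norms and biases stay in higher precision
    return out
```

At load time, each quantized weight is multiplied back by its stored scale, recovering a close approximation of the original BF16 tensor at roughly half the size.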

  • 00:06:52 So in our SECourses Musubi Tuner, we have  FP8 model converter, and you can convert  

  • 00:06:59 Qwen image and Z-Image models. This is how I  converted. Follow the model downloads on the  

  • 00:07:04 CMD window. Also on the very top, you will  see that they have been downloaded. Then you  

  • 00:07:09 are ready. So we can move to the next step. As the next step, you need to update your

  • 00:07:13 ComfyUI installation. Get the latest zip file  from here. Go to your ComfyUI installation,  

  • 00:07:19 again extract and overwrite all the files.  Then double click and run the Windows install  

  • 00:07:25 or update ComfyUI .bat file. This is important.  So it will update ComfyUI to the latest version,  

  • 00:07:31 which supports Z-Image models with ControlNet. So my ComfyUI installation was already up to date,

  • 00:07:38 but still it is updating everything. This  ComfyUI installation is perfect. It supports  

  • 00:07:42 Sage Attention, Flash Attention, xFormers,  Triton, everything that you need with RTX 5000  

  • 00:07:49 series GPUs support. So this is a very, very good  installation. Okay, it has been done. Moreover,  

  • 00:07:55 this ComfyUI installation supports special samplers and schedulers like bong_tangent and

  • 00:08:00 beta57. You need to use our ComfyUI installation to have these extra samplers and schedulers.

  • 00:08:07 As a final step, we need to update our  SwarmUI. So we have SwarmUI update .bat  

  • 00:08:13 file. This will also install ControlNet  preprocessors automatically for you,  

  • 00:08:18 so you won't need to install them manually. Both the update and the installation do this. You can

  • 00:08:23 read the announcements. You should always read the announcements that I make. You see the 3

  • 00:08:29 December 2025 version 108 announcement. If you  don't have it or if you are using somewhere else,  

  • 00:08:36 you can also manually install. You will have  a button here that will tell you to install  

  • 00:08:41 ControlNet. But since our installation is  doing that automatically, you don't need  

  • 00:08:45 it. Now we have everything ready. Now we need to  update our presets. To get the latest presets,  

  • 00:08:52 you can use the import feature as usual. You  see there is import, choose file, select the  

  • 00:08:58 latest preset from here and overwrite everything.  However, if you want a clean installation that  

  • 00:09:03 will delete all the existing presets and update  them to the latest version, which I recommend,  

  • 00:09:08 we have the Windows preset delete import file. Click yes, and it will back up your presets and update

  • 00:09:14 the presets like this. Everything is ready.  Then refresh your presets and sort by name.

  • 00:09:20 Okay, so how are we going to use the Z-Image Turbo model? First of all,

  • 00:09:24 quick tools, reset params to default. This is  important. Then we have 2 separate presets for  

  • 00:09:31 Z-Image. You see Quality 1 and Quality 2.  Quality 2 is faster. It is only 1 process;  

  • 00:09:38 it doesn't upscale. So let me demonstrate  both of them. Direct apply. And for example,  

  • 00:09:43 let's use this prompt. Okay. Copy paste your  prompt here. It is all set. You also need to  

  • 00:09:50 set your aspect ratio. I set the base resolution of this model, for the model download that I

  • 00:09:56 provide, to 1536 × 1536. By default, it was 1024. However, I think this model works better with this

  • 00:10:06 resolution. Sometimes you may get some mistakes, but generate more images, because it is so fast.

  • 00:10:11 Let me show you real time. So let's generate  8 images. Generate. After the model loaded,  

  • 00:10:16 let's see the speed. Moreover, you see the models that I upload for you have images

  • 00:10:22 and descriptions, so you can read these descriptions, and they are easier to use.

  • 00:10:27 Okay, the first generation started. The first one  will be slower than the other ones. Okay, first  

  • 00:10:32 one is done. So you see it was done in, let's  see, 12 seconds. The next one is done in 7.84  

  • 00:10:42 seconds. What about using the higher-quality preset? So let me demonstrate that. I will reuse

  • 00:10:48 parameters of this one so it will set the seed.  Then go back to preset and I will direct apply.  

  • 00:10:55 So this will activate the special refine upscale.  Moreover, this is important, you need to download  

  • 00:11:01 the upscale models from here. So we have other  models and image upscaling models. So download  

  • 00:11:09 these 3 models to have accurate upscaling. Okay,  then generate. Now this will upscale this image,  

  • 00:11:17 but we cannot upscale a lot because this model  has limitations. Also sometimes you can see  

  • 00:11:23 some hallucinations at the right or left side of  the image because of the high resolution that we  

  • 00:11:29 generate. Sometimes you won't see, sometimes you  will see. It depends. So let's have a comparison.  

  • 00:11:35 The left one is the Quality 2, which is way  faster to generate. And the right one is the  

  • 00:11:41 Quality 1 preset that we have. So you see how much realism and detail we add this way. It changes

  • 00:11:48 the image because this model is very sensitive to resolution and upscaling. However, it improves

  • 00:11:56 the quality significantly. So you can test both of  the presets and decide which one is working better  

  • 00:12:01 for you, which one is generating better images  for your use case, and decide and use that way.

  • 00:12:08 So how are we going to use the ControlNet? The ControlNet logic is the same. So let's reload this

  • 00:12:14 page, reset params to default. And you can use either of the presets; both of them work.

  • 00:12:20 Let's use the direct apply and Quality 2. Now  what is the difference? You need to provide a  

  • 00:12:26 ControlNet input image, which will be the base of  your image. For example, let's try this pose. Then  

  • 00:12:33 you need to select your ControlNet preprocessor.  This is not mandatory. If you are using already  

  • 00:12:38 a Canny image, you don't need to select this.  But if you are not using a preprocessed image,  

  • 00:12:43 you need to select it. Then click this preview  to see the preset output. If you have different  

  • 00:12:49 aspect ratio, it will show you a mangled preview. So let's click preview. You see this is not

  • 00:12:56 accurate. So how to make it accurate? You need to set the aspect ratio accurately

  • 00:13:01 according to your input image, your ControlNet input. How can you set the

  • 00:13:07 aspect ratio accurately? I have been telling  the SwarmUI developer to add this feature,  

  • 00:13:11 but he still hasn't added it. So you need to choose file, upload your reference image, click res,

  • 00:13:18 and use the exact aspect ratio or the closest.  Let's use the exact aspect ratio. Then disable  

  • 00:13:24 this init image. So we use this init image  to set our aspect ratio accordingly. Now  

  • 00:13:31 when I click the preview, it will show me  the accurate preview like this. I am ready.

  • 00:13:35 You don't set a ControlNet union type for the Z-Image Turbo model. This ControlNet is a little

  • 00:13:41 different from the previous ones that we know. But you can enable and test it; it won't make a

  • 00:13:46 difference. So we don't set it. The ControlNet strength matters, though. Let me show you what I mean by

  • 00:13:52 that. So let's write our prompt and let's set a  seed like this and generate. Now the ControlNet  

  • 00:14:00 strength will impact your output. I tested it and found that anything between 0.6 (60 percent) and 1

  • 00:14:09 (100 percent) works. So this is the 100 percent result. Let's see the 60 percent result. Okay,

  • 00:14:15 like this. By the way, you see this prompt and  this ControlNet is not very related, but it is  

  • 00:14:21 still following it pretty accurately. Okay, this time 60 percent didn't work well. So let's

  • 00:14:28 make it 70 percent. However, image quality increased, so you need to find a balance between

  • 00:14:33 both of them. We can change the prompt. Let me demonstrate. So let's make this 60 percent. A

  • 00:14:39 man wearing an amazing suit. Because this prompt is a much better match for our ControlNet image.

  • 00:14:48 You see this one. Okay, now 60 percent will generate a very accurate image as you are seeing

  • 00:14:54 right now. Yes. And I can choose the Tier 1  like this, direct apply and generate. Then it  

  • 00:14:59 will generate higher quality, higher resolution. What this does is first render the image,

  • 00:15:06 then do another render over it with upscaling to improve the quality. And yes. You see the same

  • 00:15:13 pose, different person definitely. However, it is  working really good as you are seeing right now.

  • 00:15:19 So this was the Canny Edge preprocessor. You can  use the Depth. Let's see the Depth preprocessing  

  • 00:15:26 from here. When you first time click preview,  it will download the necessary model if you  

  • 00:15:31 don't have it. So from the debug menu, you can see  whether it is downloading or not. And let's see,  

  • 00:15:37 preview is not done yet. We are still waiting.  Probably it is trying to download. Yes,  

  • 00:15:42 you see it has downloaded it here. I can see  it. And yes, the Depth has been generated.  

  • 00:15:47 Now generate again. So you can use either  way. You can use Canny, you can use Depth,  

  • 00:15:53 whichever is working best for your case. You  can use Lineart. If you have a Lineart drawing,  

  • 00:15:58 you can color it. Let me demonstrate you.  Then I will choose that image from here. X,  

  • 00:16:04 choose file. I also need accurate aspect ratio.  So let's choose the file from here. Resolution,  

  • 00:16:10 use the closest aspect ratio. And this is it.  Then disable init image. And I am not going to  

  • 00:16:17 use any ControlNet preprocessing. And let's try an amazing cel-shaded render of a dragon. Okay,

  • 00:16:25 let's see what we get. Since this is ControlNet  Union, it should be able to understand it. By  

  • 00:16:31 the way, this time we can also increase  the ControlNet strength if we want. Okay,  

  • 00:16:35 we are getting something. This was a very simple  prompt, but I think we will get a good image.  

  • 00:16:41 Yeah, pretty amazing. Pretty amazing result.  If I want more matching, more exact matching,  

  • 00:16:47 I can increase this strength. So this is another  generation. So as you increase the strength, it  

  • 00:16:52 will match the original image more closely. This one is looking like this. So this is the way. We can also

  • 00:16:58 set the preprocessing to like, let's see what we  have, Lineart standard preprocessing. So let's see  

  • 00:17:05 the preview. When you first time click preview,  it will download the model. And I am first time  

  • 00:17:10 clicking it, so it is taking time for preview  to appear. It is downloading the model. Okay,  

  • 00:17:16 this is another generation. Pretty good. So you  see from this simple Lineart, I can generate very,  

  • 00:17:22 very good images like these ones. Okay, it is  here. And this is the preview. So I can use  

  • 00:17:27 this preprocessing as well and compare the  result. And this is the result with Lineart  

  • 00:17:33 preprocessing. So you see this is entirely  different than this version. So you can use  

  • 00:17:37 either way. It is working really good, really  amazing. So I recommend you to test your case.

  • 00:17:43 Okay, now the part that you have been waiting  for. How to use Ostris AI Toolkit to train your  

  • 00:17:50 Z-Image Turbo LoRA models. First of all,  we need to download and install Ostris AI  

  • 00:17:56 Toolkit. Currently, Ostris AI Toolkit has a bug.  Therefore, you need to download this zip file:  

  • 00:18:04 AI Toolkit SECourses Version 2. Until he fixes that bug, I have a pull request which fixes the

  • 00:18:12 CPU offloading bug. So I will update this post. When you are watching this tutorial, if

  • 00:18:18 you don't see this AI Toolkit SECourses Version 2,  then you need to download this official version,  

  • 00:18:23 which installs from the official repository. This  will install from my forked repository. Then move  

  • 00:18:29 the zip file into the drive where you want to  install and extract all. Enter inside the folder.  

  • 00:18:36 So you see we have Windows Install, Windows First  Time UI Setup, Windows Start, and Windows Update  

  • 00:18:42 .bat files. But before starting installation,  you need to read the Windows requirements. We  

  • 00:18:48 have the classical requirements, but additionally,  you need to download Node.js. I am using Node.js  

  • 00:18:54 version 22.20.0. The direct link is here. You  just need to download it, start it, next. So  

  • 00:19:02 you see I already have. You just need to click  next, next, next, next. That is it. Nothing else.

  • 00:19:07 So after you have followed both of the  requirements, click Windows install .bat file  

  • 00:19:12 run. This will install the Ostris AI Toolkit with  Flash Attention, Sage Attention, xFormers with  

  • 00:19:19 Torch 2.8, CUDA 12.9. And my installation supports all of the GPUs starting from the GTX 1000 series to

  • 00:19:28 5000 series. Or on cloud GPUs, it supports  A100, H100, B200, whatever the GPU that you  

  • 00:19:36 use on cloud. So wait for initial installation to  be completed. Okay, so the installation has been  

  • 00:19:42 completed. You can scroll up and see if there are  any errors. There shouldn't be. This is important:  

  • 00:19:48 you need to have Python 3.10.11 installed.  Then press any key to continue. Second thing  

  • 00:19:55 is Windows First Time UI Setup. This is 1 time,  and you need to run this. Do not run this again  

  • 00:20:02 after you have made the initial installation. For  this to work, as I said, you need to have Node.js  

  • 00:20:09 installed. Otherwise, it will not work. Node.js  is a system-wide installation. It is not like a  

  • 00:20:14 virtual environment, so you need to install it as  I have shown you. You may get some warnings like  

  • 00:20:19 this. These are all unimportant. You see there are  warnings. You can just ignore all of them, and it  

  • 00:20:26 is done. You see setup completed successfully. Now  we are ready to use. Next time for updating, you  

  • 00:20:31 can use Windows update app .bat file. But since we  just installed, let's start the application. So it  

  • 00:20:38 will give us public URL and localhost URL. Public  URL is useful in MassedCompute. It doesn't work in  

  • 00:20:45 RunPod, but in MassedCompute it works. I will show  both of them hopefully. This is our interface.  

  • 00:20:51 Go to New Job and Show Advanced. There is no  preset saving and loading in the AI Toolkit yet.

  • 00:20:59 So how you are going to use my presets?  Enter inside the Z-Image Turbo LoRA  

  • 00:21:04 configs. According to your GPU, select  the configuration. Since I have RTX 5090,  

  • 00:21:10 I am going to use 24 GB. You see there  is no higher configuration because this  

  • 00:21:15 model fits into 24 GB. Quality 1 is better than  Quality 2. Quality 2 is better than Quality 3,  

  • 00:21:22 4, and it goes on. So let's open this Quality 1  config. Copy it. Paste it here. And Show Simple.  

  • 00:21:29 It will update everything except the dataset. So  you need to have your dataset from Datasets, New  

  • 00:21:38 Dataset. Give a name like My Dataset. Then upload  your images with drag and drop here. They will be  

  • 00:21:44 uploaded. Then you can type any caption: Caption  1, Caption 2, anything. I will explain everything,  

  • 00:21:51 don't worry. This is just explaining you how  it works. The dataset will be saved inside the  

  • 00:21:56 AI Toolkit, inside datasets here. You see My  Dataset. And the captions are image file names  

  • 00:22:03 with txt. So image file name and txt. This  is the format of the dataset system of the  

  • 00:22:10 AI Toolkit. I already have my dataset here. So you  see this is my dataset. I will copy this, paste it  

  • 00:22:17 into datasets folder. And when I go to datasets  and refresh this page, you see it will appear.
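To make that format concrete, here is a small sketch of scripting the caption files; the folder path and the caption text are placeholders, but the naming rule (same base name, `.txt` extension) matches what the AI Toolkit expects.

```python
# Sketch: an AI-Toolkit-style dataset is a folder of images where each image
# has a caption file with the same base name and a .txt extension.
from pathlib import Path

dataset = Path("ai-toolkit/datasets/My Dataset")  # placeholder path
for image in sorted(dataset.glob("*.jpg")):
    caption = image.with_suffix(".txt")           # img_001.jpg -> img_001.txt
    if not caption.exists():
        caption.write_text("ohwx man")            # example caption; can be anything
```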

  • 00:22:24 So how to prepare your training images dataset?  Now I will explain this part extremely carefully,  

  • 00:22:31 so watch all of it. To automatically prepare your  images, I recommend to use Ultimate Batch Image  

  • 00:22:38 Processing App. You see it is under Auxiliary  Tools section. So let's go to this link. I  

  • 00:22:44 recommend you to check out these screenshots,  read this post. Let's scroll down and let's  

  • 00:22:50 download the latest version. Then let's move  it into our Q drive, right click, extract here,  

  • 00:22:56 enter inside it. First of all, we need to  install. This is a pretty fast installation.  

  • 00:23:01 This application is very lightweight,  but it has so many features. Okay,  

  • 00:23:05 the installation has been completed. Scroll up  to see if there are any errors or not. Then close  

  • 00:23:11 this. Then let's start the application. Windows  Start application run. Why is this application  

  • 00:23:17 important? Because this will allow you to  batch preprocess your training images. You  

  • 00:23:23 can of course manually preprocess your images, but this makes it much easier and more accurate.

  • 00:23:30 So I have some sample images to demonstrate the power of this tool. I will copy this path

  • 00:23:37 and enter as an input folder. Then as an output  folder, let's output them into my other folder as  

  • 00:23:45 Preprocess Stage 1. Then the aspect ratio. If you  are going to generate images with 16 by 9 always,  

  • 00:23:55 you can set your aspect ratio accordingly. However, if you are not sure which aspect

  • 00:24:00 ratio you are going to use, I recommend you to use a square aspect ratio at 1328 × 1328

  • 00:24:08 pixels. This is the base resolution of the Qwen image model or Qwen image edit model. This works

  • 00:24:14 best. And with this aspect ratio and resolution,  you can still generate any aspect ratio. All the  

  • 00:24:20 images I have shown you in the beginning of the tutorial were trained at 1328 × 1328.

  • 00:24:27 Then there are several options. You can select  the classes from here to zoom them in. This is  

  • 00:24:33 extremely useful when you are training a person, because you want to zoom in on the person. What do I

  • 00:24:39 mean by that? You see in these images, there is a lot of extra space that can be zoomed

  • 00:24:46 in. For example, in this image, I can zoom in on myself a lot. So you can choose this,

  • 00:24:52 or there is a better one which is based on  SAM2. This takes anything as a prompt. Let's  

  • 00:25:00 say "person". You can set your batch size, GPU  IDs. These are all advanced stuff if you are  

  • 00:25:05 going to process a lot of images. So the default is good. Let's start processing. What this is going

  • 00:25:12 to do is zoom in on the class I have given without cropping any part of the

  • 00:25:18 class. So this will not make these images exactly this resolution or this aspect

  • 00:25:24 ratio. It will try to match this aspect ratio without cropping any part of the subject.
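As a rough illustration of that "zoom in without cutting the subject" behavior, here is a sketch of the geometry involved. This is my own illustration of the idea, not the app's actual code, and the function name is hypothetical.

```python
# Grow the subject's bounding box toward the target aspect ratio, clamped to
# the image borders, so the crop never cuts into the detected subject.
def zoom_crop(img_w, img_h, box, target_ar):
    """box = (x0, y0, x1, y1) around the subject; target_ar = width / height."""
    x0, y0, x1, y1 = box
    bw, bh = x1 - x0, y1 - y0
    # Expand whichever side is too short for the target aspect ratio.
    if bw / bh < target_ar:
        bw = bh * target_ar
    else:
        bh = bw / target_ar
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    # Clamp to the image; if the image is too small, the crop only approaches
    # the target ratio, which is why some outputs stay almost square.
    nx0 = max(0.0, min(cx - bw / 2, img_w - bw))
    ny0 = max(0.0, min(cy - bh / 2, img_h - bh))
    return (nx0, ny0, min(nx0 + bw, img_w), min(ny0 + bh, img_h))
```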

  • 00:25:30 So let's see what kind of images we are  getting. We are saving them inside here.  

  • 00:25:34 You see it has generated this subfolder. This  is important because in the second stage,  

  • 00:25:40 we are going to use this to make them exactly the same resolution. When I enter inside this folder, you

  • 00:25:48 can see that it has zoomed in the person. So this  is how it works. And when zooming in, it will not  

  • 00:25:55 crop any parts of the image. And also when zooming  in, it will try to match the aspect ratio that you  

  • 00:26:02 have given like this. Okay, the first stage has  been completed. Now the second stage is resizing  

  • 00:26:08 them into the exact resolution. This will crop  the subject if it is necessary. Like cropping the  

  • 00:26:14 body parts to match the exact resolution. So this takes the parent folder, not this folder. This is

  • 00:26:21 not the folder; this is the folder that I need to give. And I need to change the resolution that

  • 00:26:26 I want. So this will look for a subfolder named exactly like this. You can have multiple

  • 00:26:32 resolutions actually. For example, in the image  cropper, I can add here another resolution. Let's  

  • 00:26:38 say 16:9. So this is the resolution of 16:9 for the Qwen image model. Let's add it as 1744 × 992.

  • 00:26:48 Let's start processing. It will process this new  resolution as well. And I am going to see a folder  

  • 00:26:54 generated here in a minute when it is processed.  Okay, it is started processing. Now it will try  

  • 00:27:00 to match this aspect ratio. It may not match it  exactly. Why? Because it is not going to crop any  

  • 00:27:07 body parts. So you see this image cannot match  that aspect ratio. This is not a suitable image  

  • 00:27:13 for that. This is almost still square. However,  in the second tab, when I go to image resizer,  

  • 00:27:18 when I type it, you see I have given the parent  folder. Let's wait for this one to finish. Okay,  

  • 00:27:25 it is almost finished. By the way, if you use this  YOLO, it is faster than SAM2. So just delete this  

  • 00:27:32 and select your class from here. It supports many classes to focus on. Okay, it is done.

  • 00:27:38 Now I am going to make the output folder as Final  Images like this. And I will click Resize Images.  

  • 00:27:45 You can also resize without cropping, so it will pad and expand instead. So let's resize

  • 00:27:52 images. I recommend cropping; it is better. Then let's go back to our folder Final Images. Okay,

  • 00:27:59 in here you will see that it has cropped the body  parts, resized it into the exact resolution like  

  • 00:28:06 this. And these are the square images. They  are much more accurate than the other ones.
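For reference, the stage-2 "resize with cropping" step corresponds to a plain resize-then-center-crop. A minimal Pillow sketch (the resolution and file paths are just examples):

```python
# ImageOps.fit resizes and center-crops to the exact target resolution;
# ImageOps.pad would be the "resize without cropping" padding variant.
from PIL import Image, ImageOps

def resize_exact(src, dst, size=(1536, 1536)):
    img = Image.open(src).convert("RGB")
    ImageOps.fit(img, size, method=Image.LANCZOS).save(dst, quality=95)

resize_exact("stage1/photo_01.jpg", "final/photo_01.jpg")
```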

  • 00:28:12 Now I have my images ready. However, this is not  a very good collection of images. It is another  

  • 00:28:20 thing that you need to be careful of. I have used  these images to train the models that I have shown  

  • 00:28:26 you in the beginning of the tutorial. So when we  analyze these images, what do you see? I have full  

  • 00:28:32 body pose like this. I have half body pose. I  have very close shot. And when you have images,  

  • 00:28:39 what matters is that they have good lighting and good focus. These two are extremely

  • 00:28:46 important. It should be very clear. All of  these images are captured with my cheap phone,  

  • 00:28:52 so they are not taken with a professional camera.  For example, when we look at this image, you see  

  • 00:28:57 it is not even very high quality. This is how it looks. And this is a real image. This is a raw

  • 00:29:26 image. And when we look at the AI generated image,  as you can see, it is even higher quality than my  

  • 00:29:33 raw image. Therefore, you should add the highest-possible-quality images into your training dataset

  • 00:29:41 to get the maximum quality images. What else  is important? You should try to have different  

  • 00:29:48 outfits so it will not memorize your clothing. This is super important. Try to have different

  • 00:29:53 outfits, different times, different backgrounds. All of these will help. Whatever you repeat in

  • 00:29:59 your training dataset, the model will memorize  them. You don't want that. You want only yourself  

  • 00:30:06 or the subject if you are training a style, the  style, or an object, the object, to be repeated.  

  • 00:30:12 Nothing else. I will explain them in the style  and the item training, the product training part.  

  • 00:30:17 And one another thing is that you should add the  emotions that you want. If you want smiling, you  

  • 00:30:24 should add it. If you want laughing, you should  add it. So whatever the emotion you have will make  

  • 00:30:31 a 100 percent quality difference in your outputs. Try to have all the emotions you want. But that is

  • 00:30:39 not all. Also try to have all the angles you want. If you want to generate images that look down,

  • 00:30:46 you should have an image that looks down like this. Or from this angle, this angle.

  • 00:30:51 Whatever angle. So do not add the angles and  poses that you don't want after training. And  

  • 00:30:58 add the poses and the angles you want to generate  after training. So if we summarize again: have the  

  • 00:31:07 emotions, have the poses, have the angles, have  different backgrounds, have different clothings,  

  • 00:31:14 have highest possible quality lighting and focus.  Do not have blurry backgrounds. Do not have  

  • 00:31:21 fuzzy backgrounds. They will impact your output  quality. So in the AI world, whatever you give,  

  • 00:31:28 is what you get. And with this medium-quality dataset, I am able to generate amazing images.

  • 00:31:33 If I increase the number of images, the variety  in these images, I can get even better quality.

  • 00:31:38 Okay, now you understand how to prepare your dataset. There are a few tricky issues with

  • 00:31:44 Z-Image Turbo model training. The best quality for this model is 1536 pixels. So make your images

  • 00:31:53 1536 pixels if they are at a higher resolution. Not 1328 and not 1024. 1536 pixels. Another

  • 00:32:02 thing is that this model is extremely aspect ratio  dependent. So if you want to generate your images  

  • 00:32:10 with a certain aspect ratio, then you should prepare your images with that aspect ratio. What do I

  • 00:32:17 mean by that? For example, currently my images are all square like this. You see, like this. However,

  • 00:32:24 if I want 16:9 aspect ratio to generate images  after training, then I should set my aspect ratio  

  • 00:32:32 accordingly, like this. So if I use this image, this aspect ratio, for training instead of a square

  • 00:32:38 aspect ratio, this model will be able to generate better images in that aspect ratio. Therefore,

  • 00:32:45 I recommend you prepare your images in the aspect ratio you want to generate with after training. So

  • 00:32:51 you see this generation was square; therefore,  it is much more natural and accurate compared  

  • 00:32:56 to the 16:9 aspect ratio. I had to generate a lot of images to get accurate-looking 16:9 aspect

  • 00:33:04 ratio images. So decide your aspect ratio,  whichever the aspect ratio that you want to  

  • 00:33:09 generate your images after training, and based on  that aspect ratio, prepare your training images.
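Putting those two rules together (a roughly 1536 × 1536 pixel budget, trained in your target aspect ratio), here is a small helper sketch for picking a training resolution. Rounding to multiples of 32 is my assumption; the exact bucket sizes used by the tools may differ slightly (the tutorial itself uses 1744 × 992 for Qwen's 16:9, for example).

```python
import math

def training_resolution(aspect_ratio, base=1536, multiple=32):
    """Width/height for a target aspect ratio at roughly base*base pixels."""
    area = base * base
    w = math.sqrt(area * aspect_ratio)
    h = area / w
    snap = lambda v: max(multiple, round(v / multiple) * multiple)
    return snap(w), snap(h)

print(training_resolution(1.0))      # (1536, 1536) -> square training images
print(training_resolution(16 / 9))   # (2048, 1152) -> same pixel budget at 16:9
```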

  • 00:33:16 Okay, once the training images are ready,  let's go back to New Job, Show Advanced,  

  • 00:33:21 copy the configuration, Show Simple. Now, the training name determines the final file names that you are going to

  • 00:33:29 get after training. Since I have previously done a training, I will use this name so it will generate

  • 00:33:34 names like this so that I can show you how to find  the best checkpoint. And you will see that our  

  • 00:33:40 configuration is already using the Version 2 adapter. This may get updated later because Ostris is

  • 00:33:48 working on the adapter. And what is this? Because this model is a turbo model, it is a distilled model. Until

  • 00:33:56 the Alibaba team publishes the main model, we are using the distilled turbo model. So, to not break the

  • 00:34:04 distillation of the model, we are using a trick. This adapter model is merged with the turbo model

  • 00:34:10 automatically during training; we don't do that ourselves. That way it behaves like a main model. Therefore,

  • 00:34:16 our trained LoRA does not break the turbo model. This is not the highest quality. This is not the

  • 00:34:21 quality that we are going to get from the base  model, but it is working. And we don't change  

  • 00:34:25 this. It will download it into Hugging Face  cache. You don't change anything here. What  

  • 00:34:30 you can change here is save every N steps. This trainer doesn't work epoch-based;

  • 00:34:37 it works step-based. Therefore, I recommend you calculate your number of

  • 00:34:43 steps based on the number of training images. I have 28 images, and I recommend 200 epochs for

  • 00:34:49 this number of images. That makes 5,600 steps. If you want 10 checkpoints, make this 560. If you

  • 00:34:56 want 8 checkpoints, make it 700 steps. So based on these steps, it will save those checkpoints.

  • 00:35:03 And there is max step saves to keep. If I make this 3, it will keep only 3, and it will delete

  • 00:35:10 the previous ones. This is how it works. So I will  keep all of them. And you don't change anything  

  • 00:35:15 here. They are all set. In the dataset, you select  your training dataset. This is my dataset. And  

  • 00:35:21 you don't change anything else here from the  configuration. And that's it. You can also have  

  • 00:35:27 samples during training. Currently I set it to 250,000 steps, so they are never generated.

  • 00:35:33 But if you want them to be generated like every  100 steps or maybe 200 steps, you can generate  

  • 00:35:39 them and you can see them if they are good or  not. But I am not doing that. It is up to you.
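To summarize the step math above in one place (grounded in the 28-image example: 28 × 200 epochs = 5,600 total steps at batch size 1, saved every 560 or 700 steps depending on how many checkpoints you want):

```python
def training_plan(num_images, epochs=200, checkpoints=8):
    """Total steps = images x epochs (batch size 1); save interval divides it."""
    total_steps = num_images * epochs
    save_every = total_steps // checkpoints
    return total_steps, save_every

print(training_plan(28))                  # (5600, 700) -> saves at 700, 1400, ...
print(training_plan(28, checkpoints=10))  # (5600, 560)
```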

  • 00:35:46 Moreover, there is skip first sample, which skips the samples that would be generated right when

  • 00:35:53 training begins. I'm not using that either, but you can disable this and it will generate samples. And

  • 00:35:59 there is trigger word. I am only training with a  trigger word right now. Since I don't provide any  

  • 00:36:05 caption in my dataset, it is trained entirely  with this. But if you also provide captions  

  • 00:36:12 with your dataset like Test 1 caption, then  it will append the trigger word. I think it  

  • 00:36:18 is appended at the beginning. So it will become like "ohwx test 1" during the training. However,

  • 00:36:23 I don't recommend captions. And you see we are losing our configuration. This is annoying,

  • 00:36:30 I know that. The AI Toolkit really needs a save-and-load feature for configurations. Okay,

  • 00:36:36 I need to make the file name again like this.  Okay, and this was 700 steps. This was 5,600.  

  • 00:36:46 Okay. So I have compared all this. I have tested  all of them for you. I have done so many grid  

  • 00:36:52 tests to find everything. I did so many trainings.  For example, let me show you some of the trainings  

  • 00:36:59 that I have made. You see all these different  parameters I have tested. I have tested 1024,  

  • 00:37:06 1280, combination of the different resolutions  like these 3. And the best yielding resolution  

  • 00:37:14 for this model is 1536. However, if  you want to speed up your training,  

  • 00:37:19 if this becomes too slow for you, then you  need to disable this and use like 1280 or  

  • 00:37:25 just use 1024. Or you can enable all 3. The most speed comes with 1024. But the

  • 00:37:31 best quality is with 1536 for the Z-Image Turbo  model. And let's select our training dataset. I  

  • 00:37:39 am also not using any caption dropout. So these  are all my settings. And then click Create Job.

  • 00:37:46 Now this training is queued. Not automatically  started. So you can either click from here  

  • 00:37:52 to start or you can go to Training Queue and  click the play icon from here. And you see it  

  • 00:37:58 shows Queued, then you click start. And it will  start the queue processing. Let's see what is  

  • 00:38:04 happening in our VRAM. Currently I'm recording  video with my second GPU, so my initial GPU is  

  • 00:38:11 empty. I just need to close the SwarmUI; it will  become zero. Yes. So you can see the speed. I  

  • 00:38:16 have tested the speed. It is same on Windows and  Linux. This is surprising, but maybe it is good,  

  • 00:38:23 I don't know. Because I am not sure whether it is  fully utilizing the GPU or not. I don't see it is  

  • 00:38:28 fully utilizing on Windows. But it is fairly fast.  And I can speed up the training significantly with  

  • 00:38:34 dropping the resolution. So when you click this  icon, it will show you the training window like  

  • 00:38:39 this. You don't see the training on the started  CMD window. You will see it from here. When you  

  • 00:38:44 refresh this page, let's go to Dashboard, you  need to click this icon to see it. Don't forget  

  • 00:38:49 that. So it will show you some of the statistics,  and it will show you what is happening. This is  

  • 00:38:54 actually what is happening on the training. This  is the window that you are going to follow. It  

  • 00:38:59 will first cache then unload the text encoder.  I did set everything for you. These parameters  

  • 00:39:05 are really optimal. Let's say you have 100 images  as a training dataset. Then set it 10,000 steps.  

  • 00:39:11 I don't know how many steps you can do until the  model breaks, but I can say that up to 10,000  

  • 00:39:18 steps you are safe. Maybe even 20,000 steps. You  need to test. And since we are saving checkpoints,  

  • 00:39:23 we will compare. The checkpoints will appear  here. This is very useful because on RunPod and  

  • 00:39:29 MassedCompute, you can directly download them  from here. They will appear here. They will be  

  • 00:39:35 saved inside the folder. So it is started. But to  get checkpoints quickly, let me show you how they  

  • 00:39:43 will appear. So I will pause this and you see it  says that you can resume. But you can resume from  

  • 00:39:49 the last checkpoint. Therefore if there weren't  any checkpoints, it will start from the beginning.  

  • 00:39:55 However, if you had checkpoints, it will resume  from the last checkpoint. This is how it works.

  • 00:40:00 So I will modify and I will save checkpoints every  5 steps. So let's make this every 5 steps. Update  

  • 00:40:08 Job. Then click Play. And it will start again. So  I will start getting checkpoints here. Therefore  

  • 00:40:14 I can show you how they are appearing. When I  start it again, it will not cache the images and  

  • 00:40:22 text captions again since they were already cached. So where are they stored? When you

  • 00:40:27 go back into your AI Toolkit installation, in  the output, your trainings will be saved here.  

  • 00:40:33 When I enter inside this training, you see they  are named with the same name as the name that  

  • 00:40:39 I have set or the name I set here. So they will  be saved with that name. It shows the log, PID,  

  • 00:40:46 and it will save the checkpoints here. We will see  in a moment. You can also set the checkpoints from  

  • 00:40:52 the settings. You can set the training folder  path. So you see, this is the path where it  

  • 00:40:57 will save the checkpoints. You can change this  and it will save the checkpoints there. And you  

  • 00:41:03 see this is the datasets folder. So you can change  both of these and save settings. Let's go back to  

  • 00:41:08 Dashboard. Let's see our training. So we see the  steps. It will start from the first step. Okay,  

  • 00:41:14 it started. It is doing the steps. The speed  is 6.67 seconds for RTX 5090 with the very  

  • 00:41:22 high quality, highest quality. Okay, we got the first checkpoint. So it has appeared here.

  • 00:41:27 When I click this download, it will download. So  you can use this in RunPod or MassedCompute. And  

  • 00:41:32 when I go back to outputs, I will see the  checkpoint here. So it is saved like this.  

  • 00:41:37 Now I need to move this into my LoRA folder  and test it. And you see it also saved an  

  • 00:41:44 optimizer PT file. So now I can pause and continue. So let's get the next checkpoint. Yes,

  • 00:41:51 it is saved here. Now I will pause it and I will  resume it. Let's see how it continues. Okay,  

  • 00:41:57 click play again. Now it should resume from the  last checkpoint which was 10 steps. Let's see what  

  • 00:42:03 happens. So the speed was 6.7 seconds. I also show the speeds in the experiment speeds folder.

  • 00:42:09 You can see the speeds of different resolutions:  1024, 1280, combination mixed, 1536. These are all  

  • 00:42:18 speeds of the RTX 4090. Not 5090. This is 5090  speed. And this is RTX 3060 speed. So we have  

  • 00:42:26 the speeds. Okay, nice. So it is continuing  from the last checkpoint which was 10 steps.

  • 00:42:32 So this is how you can continue your training  with AI Toolkit. These things are not specific  

  • 00:42:38 to the Z-Image Turbo model. These apply to any model. But for each model,

  • 00:42:45 you need to research a new configuration. That  is important. If you use the default values,  

  • 00:42:50 you probably won't get the best results. Okay,  so this is the speed. I can increase the speed  

  • 00:42:54 by reducing the resolution. And what are the differences between the default versus our best

  • 00:43:01 training? I have it. So let me show you. This is the default configuration, which Ostris showed

  • 00:43:09 in his tutorial, and this is our config. This is  default; this is our config. Default, our config.  

  • 00:43:15 Default, our config. Our configuration yields much better results. Default, which Ostris showed,

  • 00:43:34 our config. You see there is a massive quality  difference between default versus ours. And  

  • 00:43:38 these are not cherry-picked images; these are  grid generations. Default, our config. Default,  

  • 00:43:44 our config. There is a huge difference between  default and our config. Default, our config.  

  • 00:43:48 Default, our config. You see there are 2 persons because we are generating at high resolution. So

  • 00:43:54 we just need to generate more images to get a good one. Default, our config. Another prompt. Default,

  • 00:43:59 our config. There is a massive difference.  Default versus our config. Default,  

  • 00:44:04 our config. And these are grid images. And you see it even learned my broken tooth. I have a

  • 00:44:10 broken tooth here. Maybe you noticed that. It learned it slightly. And this is a turbo model,

  • 00:44:16 not like a base model. So this is pretty good,  pretty accurate. So this is how we train.

  • 00:44:21 After training, how are you going to test them? It is the same as the other tests that

  • 00:44:27 I have shown in other tutorials. So you will  have checkpoints like this in the output folder  

  • 00:44:32 once the training is finished. Move them into your LoRA folder. So I have them in my LoRA folder.

  • 00:44:39 Then start your SwarmUI after putting them in. Or if it is already running, it is fine. Go to LoRAs,

  • 00:44:46 refresh. Then let's reset params to default.  Let's go to presets. Select our preset again.  

  • 00:44:52 Direct apply. And go to Tools and select  the Grid Generator. Let's say Test 1,  

  • 00:44:58 whatever name you want. From here select LoRA. Type your LoRA name like this so that it

  • 00:45:04 adds all the LoRAs. The last one goes here, so it is the last checkpoint. So they are

  • 00:45:10 also sorted like this. Then the prompt. You can use any prompt you want, like this.

  • 00:45:15 To separate prompts use this character. So each  prompt will be different. However, this is not a  

  • 00:45:21 proper prompt. So I am going to use the example  prompt which I have provided in the zip file.  

  • 00:45:27 When you go here you will see Test Grid Prompts  and Grid Format. Copy this. You can change this  

  • 00:45:32 according to your training. And generate grid. Now  this will generate a grid for me based on these  

  • 00:45:39 LoRA checkpoints so I can see them. So let's go  to here and see that in real time. Okay. So from  

  • 00:45:46 here. Okay, LoRA prompt. This is true. Sometimes you need to play with this to see. As the images

  • 00:45:52 are generated, they will appear. Do we have an error somewhere? Why did it fail? Okay, we forgot

  • 00:45:58 reset params to default. Therefore the generation failed. You see, it is because the ControlNet is still enabled.

  • 00:46:04 So always reset params to default. Don't forget  that. Okay. Then let's select the preset one more  

  • 00:46:11 time. Direct apply. Let's go back to Tools and generate grid. Otherwise you will get an

  • 00:46:16 error, as I just did, because the ControlNet was enabled. Now you just need to wait for

  • 00:46:22 processing to be finished. And we will be able to  compare the grid, the quality. So this is first.  

  • 00:46:28 As you can see, on the 5090 it is pretty fast. Every image takes like 8 seconds. I don't need to wait

  • 00:46:34 more. But you can see that it is very undertrained  in early steps. And it will get better trained up  

  • 00:46:40 to the last steps. We will see it. Even at early steps there is some resemblance. I prefer this

  • 00:46:46 over generating samples during the training.  It's a choice. I find this better because I  

  • 00:46:52 don't lose time with the training process. And  this is the most proper way of testing in my  

  • 00:46:58 opinion. Not using the samples generated during  the training. Sometimes they may be inaccurate.

  • 00:47:03 Okay, so the grid generation has been completed.  Let's refresh. Now compare the checkpoints. 700  

  • 00:47:09 steps, 1400 steps, 2100 steps. So you see, this is how it goes. Decide which one is best for you. I can

  • 00:47:18 say that the last one is the best. So you see, this is the very best. If you can't decide based on this,

  • 00:47:24 what you can do is you can make this Test 2 and  generate another grid. So this way compare until  

  • 00:47:31 you decide which one is working best. Moreover, trained LoRAs work with the ControlNet Union as

  • 00:47:38 well. The only thing is to set your ControlNet strength to 0.6, so it is 60 percent. And then

  • 00:47:45 type your prompt. With just this simple prompt,  "Photo of ohwx man wearing an amazing suit",  

  • 00:47:51 I am able to get amazing quality images with my  trained LoRA by using this reference image as a  

  • 00:47:59 ControlNet. So it fully works the same way as using the base model with our trained LoRAs.

  • 00:48:05 Now I will show you how to install and  use Ostris AI Toolkit on RunPod. Then  

  • 00:48:10 on MassedCompute. To use it on RunPod, always follow the RunPod instructions txt file that I

  • 00:48:17 have. Always. I have this file in all of my  applications. For RunPod and MassedCompute,  

  • 00:48:23 always follow them. So let's open this. First  of all, please register RunPod from this link.  

  • 00:48:28 I appreciate that. This enables me to do more  research on RunPod. This helps me significantly  

  • 00:48:36 because these trainings cost a huge amount of money. So you see I have spent 90 dollars on

  • 00:48:42 a single day for Z-Image research. And then 10  dollars. Once you are here, go to Billing and set  

  • 00:48:50 some credits. Pay with your card, whatever. Then  go to Pods. You can also use permanent storage,  

  • 00:48:57 which I use. I also have a dedicated tutorial for that. So you see we have a RunPod permanent

  • 00:49:04 network storage tutorial. But I will show on a  single pod this time. I recommend you to limit  

  • 00:49:10 your region to the US, starting from the bottom. These are the best GPUs. For this Ostris AI Toolkit,

  • 00:49:18 you can use RTX 4090. This is most performant  price option. If you want more speed, you can use  

  • 00:49:25 RTX 5090. The bigger ones are useless because it  fits into VRAM. So let's see where we have. Okay,  

  • 00:49:31 we have it here. Moreover, from additional  filters, select 100 RAM and NVME disk. Okay,  

  • 00:49:38 we don't have it. So we need to check again. US  NC2, NC1, MO2, MO1, MD1, KS. Okay, here. We have  

  • 00:49:50 it. Then change the template to whatever template the instructions file tells you. The instructions

  • 00:49:57 say to use this one. Then you need to select it. If you get an error here, like when I select the 5090,

  • 00:50:03 it will tell me that I need to use a different template. That is wrong. Why? Because I am installing the

  • 00:50:09 applications into a virtual environment. I am not using the template environment. That is why you should

  • 00:50:15 not believe whatever RunPod tells you. Use the template that I write in my instructions.

  • 00:50:24 So we are going to use this official PyTorch 2.2.0 template. This is super fast. Then click this edit

  • 00:50:32 template button and add a port here: 8675. This is super important. Otherwise you won't be able to connect to

  • 00:50:41 the Ostris AI Toolkit interface. And set your volume disk according to your needs.

  • 00:50:48 If you are going to keep many checkpoints, or if you are going to train a bigger model,

  • 00:50:51 you need a bigger disk. But for the Z-Image Turbo model, 200 GB is sufficient. Then deploy on demand. Since this

  • 00:50:58 is using the official template, it will be super fast to initialize. Sometimes it doesn't show here, so

  • 00:51:04 refresh to see the status and whether it is initialized or not. It should initialize very fast. Okay,

  • 00:51:11 details, telemetry, refresh. Okay, it is initialized. It took like 20 or 30 seconds,

  • 00:51:18 because this template is very lightweight. Then click Jupyter Lab. Sometimes Jupyter Lab may also

  • 00:51:23 not load. You need to refresh. If it doesn't load, delete the machine and get a new one.

  • 00:51:28 First of all, verify that the GPU is working: pip install nvitop, then type nvitop. You need to

  • 00:51:36 see your GPU listed like this; otherwise just delete the pod and move to a new one.
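
If you prefer a scriptable check over nvitop, here is a minimal sketch using PyTorch, which the official PyTorch template already ships with; treat it as an optional alternative, not a required step:

```python
# Minimal GPU sanity check for a freshly rented pod (assumes torch is installed).
import torch

if not torch.cuda.is_available():
    raise SystemExit("No CUDA GPU visible -- delete this pod and rent a new one.")

# Print the GPU name and total VRAM so you can confirm you got what you paid for.
props = torch.cuda.get_device_properties(0)
print(f"GPU: {props.name}, VRAM: {props.total_memory / 1024**3:.1f} GB")
```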

  • 00:51:43 Then upload the zip file into here like this. This is important. Wait for the upload to complete. It is uploading at

  • 00:51:48 the bottom, as you can see. Then right-click and extract the archive. Then click refresh. Okay,

  • 00:51:53 it is extracted. Open the RunPod instructions read txt file. Copy this command. Open a new terminal.

  • 00:52:01 Paste it and hit enter. That is it. This will install everything fully automatically, including

  • 00:52:06 Node.js and all the other libraries. You don't need to do anything else. Once the installation has

  • 00:52:12 been completed, we are going to use this command to start the application. If you restart your pod and

  • 00:52:19 want to use it again, you just need to run this start command. But since this is the first time, we just

  • 00:52:24 need it once the installation has been completed. The installation speed totally depends

  • 00:52:29 on the pod that you got. If it is a fast pod, if you are lucky, it will be fast. Otherwise

  • 00:52:35 it will take time. But since we did some filtering and selected a certain region,

  • 00:52:41 I can say that this pod is fast. There are major advantages to my installers on RunPod

  • 00:52:47 over using a RunPod template. They always install the latest version of the AI Toolkit,

  • 00:52:55 and they support all of the GPUs that are available on RunPod, not just certain types of GPUs. So installing

  • 00:53:02 the latest version, plus this GPU support, makes them much more advantageous, because you always use the

  • 00:53:09 latest version of the application. Moreover, the installation is fairly fast, since it is

  • 00:53:14 extremely optimized by myself. So the installation is getting completed. You can ignore these warning

  • 00:53:21 messages, as I have also explained in the Windows tutorial part. You need to watch the Windows tutorial

  • 00:53:27 part to learn it. Moreover, if you don't know how to install and use SwarmUI and ComfyUI on RunPod, I

  • 00:53:34 have an excellent up-to-date tutorial. The link will be in the description of the video.

  • 00:53:39 So watch it to learn how to use SwarmUI and ComfyUI on RunPod. I won't explain that part

  • 00:53:43 in this video. I will only show Ostris AI Toolkit usage on RunPod in this tutorial.

  • 00:53:50 So refer to that tutorial for the SwarmUI and ComfyUI side of things.

  • 00:53:55 Okay, the installation is almost completed. All right, the installation has been completed. Now

  • 00:54:01 we will run the starting command in a terminal. Copy-paste the starting command. Starting should

  • 00:54:07 be fairly fast. You see, it also gives us a network link like this, but that does not work on RunPod. So

  • 00:54:16 we need to connect through the RunPod proxy. Go back to My Pods, click your pod, and click HTTP service 8675.
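
For reference, the link that button opens follows RunPod's standard proxy URL pattern, so you can also construct it yourself; the pod ID below is a hypothetical placeholder:

```python
# RunPod's HTTP proxy URL pattern for a port exposed on a pod.
pod_id = "abc123xyz"  # hypothetical -- yours is shown on the My Pods page
port = 8675           # the port we added to the template earlier
print(f"https://{pod_id}-{port}.proxy.runpod.net")
```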

  • 00:54:25 It will open the AI Toolkit. And we have the interface. The rest is exactly the same. First

  • 00:54:31 of all, make your dataset. You can either upload your dataset into the AI Toolkit datasets folder,

  • 00:54:38 or you can click Datasets, type a name like "my dataset", and click Create. Then click Add Images, and you can drag

  • 00:54:45 and drop the files as I have shown in the Windows tutorial part. It will upload them to the RunPod pod.

  • 00:54:52 We will see the dataset here. Yes, you see the datasets. My dataset. Exactly the same. The upload

  • 00:54:59 will take some time because RunPod is slow. They will appear here once processed. Okay, let's

  • 00:55:07 see what's happening. If it doesn't work, you can also drag and drop them here like this. It will

  • 00:55:12 upload through the Jupyter Lab interface. Then you can refresh to see them. Yes, it is uploading. So

  • 00:55:19 let's use this way. Either way should work. We can just refresh. So you see, the dataset

  • 00:55:25 images will appear here. Then click New Job, Show Advanced. Again, this is the same as in the Windows tutorial

  • 00:55:32 part. Let's select our config, like this one, copy-paste it, click Show Simple, and select your dataset.

  • 00:55:38 I'm not going to repeat the Windows tutorial part. Create the job and start the training, so that we can see

  • 00:55:44 the training on RunPod. It should be fairly fast. First, it will download the necessary model,

  • 00:55:50 then it will start the training. Let's just wait and watch the logs. So you see, it is downloading the

  • 00:55:56 model into our workspace, fairly fast. We can see the speed. Okay, so the training has started.

  • 00:56:03 You can see the step speeds here. It will also show the step speed here after a while. Currently it is

  • 00:56:09 like 8 seconds per iteration. You need to wait a little bit. It is using the GPU at 100 percent. The memory

  • 00:56:14 usage is around 90 percent. So the RTX 4090 is very good on RunPod for price-performance. Therefore,

  • 00:56:24 I recommend it. If you want something faster, go with the RTX 5090. It's a little bit faster. Again,

  • 00:56:30 as I have shown in the Windows tutorial part, you can look at the speeds. These four are for the RTX

  • 00:56:36 4090. This is the RTX 5090, and this is the RTX 3060. So the speed is decent. It will take like 13 hours.

  • 00:56:42 Maybe it will take less, but let's say 13 hours. The cost would be like $10, maybe less:

  • 00:56:50 $0.62 per hour times 13 hours is about $8.
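
To sanity-check the budget yourself before renting, here is a back-of-the-envelope sketch; the step time, step count, and hourly rate are just the numbers from this example run, so plug in your own:

```python
# Rough training time/cost estimate; all inputs are example values from this run.
sec_per_step = 8.0    # observed seconds per iteration on the rented RTX 4090
total_steps = 6000    # planned training steps
usd_per_hour = 0.62   # pod hourly rate

hours = sec_per_step * total_steps / 3600
print(f"ETA: {hours:.1f} h, cost: ${hours * usd_per_hour:.2f}")
# 8 s/step * 6000 steps = 13.3 h -> about $8.27 at $0.62/h
```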

  • 00:56:59 If you want it faster, as I have shown in the tutorial, just lower the resolution. Training becomes about four times faster and takes a quarter of the time. That is the strategy,

  • 00:57:05 but it will lower the quality, so I don't recommend it. As I have shown in the Windows tutorial part,

  • 00:57:10 it lowers the quality significantly. Then you will get the checkpoints here, so you can download

  • 00:57:15 them from here, or from My Pods: go back to the AI Toolkit outputs, and they will appear inside there,

  • 00:57:23 so you can download them from there too. That is it. Then you can terminate your pod or

  • 00:57:28 stop your pod. These are the same. If you want to stop your pod after the training has finished,

  • 00:57:35 we also have a command for it. So you see, this command will stop your pod. How do you do it? You need to get

  • 00:57:41 your pod ID and paste it here. And this value is the delay in seconds. Let's stop our pod in 20 seconds. So copy this,

  • 00:57:49 open a new terminal, and paste it. Now, in 20 seconds, we will see that our pod has been stopped. Let's

  • 00:57:57 see. Okay. It should happen any second; I didn't count. We will see when the command has been executed.

  • 00:58:05 This way, you can go to sleep. Okay, it is stopped. So now, when I refresh this page, I should see that it is

  • 00:58:11 stopped. Yes. So it won't spend my money. If you have any questions, you can ask me.
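
The exact stop command ships in the instructions file; as a sketch of the same idea built on RunPod's public runpodctl CLI, assuming runpodctl is installed and authenticated on the pod, and with a hypothetical pod ID placeholder:

```python
# Sketch: stop a RunPod pod after a delay so you can walk away during training.
import subprocess
import time

POD_ID = "your-pod-id-here"  # hypothetical placeholder -- copy yours from My Pods
DELAY_SECONDS = 20           # matches the 20-second example above

time.sleep(DELAY_SECONDS)
subprocess.run(["runpodctl", "stop", "pod", POD_ID], check=True)
print("Stop command sent; GPU billing should halt once the pod is stopped.")
```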

  • 00:58:17 Now the MassedCompute part will begin. Okay, now I will show the MassedCompute part. For the

  • 00:58:24 MassedCompute part, we are going to follow the MassedCompute instructions. This is the same

  • 00:58:29 for all of my applications: always follow the instructions.txt file. Please use this link

  • 00:58:35 to register on MassedCompute. I appreciate that. Log in to your account after registration, go

  • 00:58:40 to Billing, and add some credits. Once you have the credits, go to Deploy. For the Z-Image Turbo version,

  • 00:58:49 my recommended GPU is the L40S. But you see, all of them are currently occupied, and they are hopefully

  • 00:58:57 going to add new GPUs soon, they told me. So what can we use alternatively? We can use the RTX 6000 Ada,

  • 00:59:06 but those are also all full. Yes, there are no RTX 6000 Ada GPUs. Therefore, as the cheapest option,

  • 00:59:13 which would take more time, we can use the RTX A6000 premium. This is the cheapest one. If you want

  • 00:59:19 speed, you can use an A100 or H100. So let's go with the cheapest option, the RTX A6000. Let's select

  • 00:59:27 the category Creator and select the SECourses image. So you see, currently this is $0.56 per hour.

  • 00:59:34 We are going to apply our coupon, SECourses, verify, and it's only 42 cents. Deploy. You see,

  • 00:59:41 I have selected the premium version. This premium version is the best one. It has the most RAM.

  • 00:59:46 Therefore, I recommend you pick this one if you are going to use the RTX A6000. However,

  • 00:59:53 my recommended GPU for the Z-Image Turbo version, as I said, is the L40S, or the RTX Pro 6000, if they

  • 01:00:01 are available. If they are not available, the RTX 6000 Ada, this GPU. If none of them is available,

  • 01:00:08 you can use an A100 or H100, depending on your budget, or the RTX A6000 in the premium version. Now we need

  • 01:00:16 to wait for the initialization. When you click the running instance and refresh this page,

  • 01:00:22 you will see it. Wait for the initialization to be completed. While waiting for the initialization,

  • 01:00:28 click Details, and you see there is the ThinLinc client. If you haven't installed it yet,

  • 01:00:34 we are going to use it, so download it for your platform. I am on Windows, so let's download this.

  • 01:00:40 Let's start it. Yes, next, accept, select the options, run the ThinLinc client. Once the ThinLinc

  • 01:00:47 client has started, click Options and go to Local Devices. Keep clipboard synchronization

  • 01:00:52 and drives enabled. Click Drives, then Details, and add a folder from your computer, like this one.

  • 01:00:59 You see there is Add and Remove, or you can copy-paste the path from here. Make sure that

  • 01:01:03 it has read and write permissions. Click OK, and OK again. Then you just need to wait for the

  • 01:01:08 initialization to be completed. Sometimes refresh the page to be sure. Okay, the machine has been

  • 01:01:14 initialized. Before connecting to it, I recommend you put your training images, let's copy them, into

  • 01:01:21 your shared folder. So my shared folder is here. Copy-paste them there. Moreover, also copy the

  • 01:01:29 downloaded installation zip file into your shared folder. Then you are ready. Then click this IP;

  • 01:01:36 it is copied. Copy-paste it here. You see there is the username. Copy the username, copy-paste it here,

  • 01:01:42 and enter the user password. You cannot transfer big files with the ThinLinc client. You need to use something like Google

  • 01:01:49 Drive, OneDrive, or Hugging Face. We have a Hugging Face upload and download notebook as well. So

  • 01:01:55 this is only for small files, like your training images or installation zip files. Remember

  • 01:02:01 this: big files will be very slow or will not work at all. Once you are on this screen, go to Home, go

  • 01:02:08 to Thin Drives, then the MassedCompute shared folder, and wait for synchronization to be completed. Sometimes it

  • 01:02:15 can take time, depending on your internet. Okay. Then select your installation zip file, and drag and

  • 01:02:22 drop it into the Downloads folder. Moreover, drag and drop your training images as well. This is

  • 01:02:29 not mandatory; we will be able to upload from the interface as in the Windows tutorial part, but you

  • 01:02:35 can have it ready. Then extract the installation in the Downloads folder. Do not run anything in the ThinLinc

  • 01:02:42 client drive, i.e., in the shared folder. Always copy files into Downloads. Enter the folder,

  • 01:02:48 double-click the MassedCompute instructions, copy this installation command, click the three-dots icon,

  • 01:02:55 open a terminal inside this extracted folder, and right-click and paste. This will do the entire

  • 01:03:02 installation of the AI Toolkit on MassedCompute. Now just wait for the installation. This will be

  • 01:03:07 really fast compared to RunPod. MassedCompute is super fast. Meanwhile, while it is installing,

  • 01:03:14 copy-pasting the training files during the installation will give you a speed-up; it will reduce

  • 01:03:20 your total time. So that's an advantage. But you see, this ThinLinc client is very slow for transferring files.

  • 01:03:28 It is really, really slow. Therefore, for big files, you need to use something like Google Drive,

  • 01:03:34 OneDrive, or a big cloud service like Hugging Face. Okay, the training images have been copied.

  • 01:03:40 So while the AI Toolkit is installing, I will copy this, enter the AI Toolkit folder, enter inside the AI

  • 01:03:47 Toolkit, and here, make a new folder named datasets, because it is not automatically generated.
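
If you would rather do this step in a script than in the file manager, here is a small sketch covering both the folder creation and the copy step mentioned next; every path is an assumption for a typical download location, so adjust to your machine:

```python
# Sketch: create AI Toolkit's datasets folder (not auto-generated) and copy the
# training images into it. All paths are illustrative assumptions.
import shutil
from pathlib import Path

toolkit_dir = Path.home() / "Downloads" / "ai-toolkit"      # hypothetical install path
images_dir = Path.home() / "Downloads" / "training_images"  # hypothetical image source

dataset_dir = toolkit_dir / "datasets" / "my_dataset"
dataset_dir.mkdir(parents=True, exist_ok=True)

for f in images_dir.glob("*"):
    if f.suffix.lower() in {".jpg", ".jpeg", ".png", ".webp", ".txt"}:
        shutil.copy2(f, dataset_dir)  # caption .txt files travel with the images
```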

  • 01:03:55 Copy-paste the dataset there. So when we start the application, our dataset will be ready. Still,

  • 01:04:00 as in the Windows tutorial part, you can use the interface to upload instead: from Datasets, click

  • 01:04:06 New Dataset, type a dataset name like test, and you will be able to upload from the interface

  • 01:04:12 as well. Okay, the installation is continuing. When you get this window, just click Cancel. Moreover,

  • 01:04:19 when you start Google Chrome, it may ask you for something like a login. Okay, it didn't

  • 01:04:25 ask. If you get that prompt, just click Cancel. You don't need to update the software installed on

  • 01:04:32 MassedCompute; just click Cancel on all of them. Moreover, I won't show you how to use SwarmUI on

  • 01:04:38 MassedCompute, because we have a fully up-to-date tutorial for MassedCompute. You see this one:

  • 01:04:44 the ComfyUI and SwarmUI on cloud GPUs tutorial. The link will be in the description of the video.

  • 01:04:48 So watch that to learn that part. I will just show how to use the AI Toolkit on MassedCompute,

  • 01:04:55 not how to use SwarmUI and ComfyUI or do the grid generation and the other stuff that I

  • 01:05:01 have shown in the Windows tutorial part. The biggest advantage of my installer is that it

  • 01:05:06 always installs the latest version of the AI Toolkit trainer. Moreover, it supports all of

  • 01:05:12 the GPUs, with the latest pre-compiled wheels for Flash Attention, xFormers, Sage Attention, and the matching

  • 01:05:18 Torch and CUDA versions. Therefore, my installers are really better than using the

  • 01:05:24 templates. The Node.js installation is all automatic. You may get some warnings; just

  • 01:05:30 ignore them, because it will work. My installer is fully optimized and made so easy that you

  • 01:05:37 just run these two lines of commands. It handles everything, all the setup, for you. Okay, so the

  • 01:05:43 installation has been completed. You can scroll up to see whether there are any errors. Then

  • 01:05:49 return to your folder, open the MassedCompute instructions txt file again, and copy this part.

  • 01:05:57 This is for starting. Then open the three dots here, open a terminal, and copy-paste it. We always run the

  • 01:06:03 commands inside the installed folder. This is super important. So it has been started. You can either

  • 01:06:10 use the local link like this, or, if you want to connect from your own computer, which I recommend,

  • 01:06:15 open the link like this. So you see, this is the public link, and now I can connect from my computer.

  • 01:06:22 Let's see. It says it is not secure; continue to the site. This is totally fine. And now,

  • 01:06:28 yes. So you see, it is running on MassedCompute, but I am connected from my computer. The dataset

  • 01:06:34 will be here, since I have copy-pasted it. Or I can click to create a dataset: I can type GG,

  • 01:06:40 click Create, then add images. I can drag and drop images from here to upload. However, copy-pasting

  • 01:06:47 from the disk is better than this, in my opinion. Okay, let's refresh. We don't need it. Then click

  • 01:06:54 New Job, as in the Windows tutorial part, Show Advanced, and select the configuration from the zip

  • 01:06:59 file, inside the Z-Image Turbo LoRA configs. Since this GPU is 24 GB, copy-paste that config, click Show Simple,

  • 01:07:06 give your training a name, whatever you want, and then select your dataset.

  • 01:07:12 As in the Windows tutorial part, you need to set your save-every-N-steps and the step count.

  • 01:07:17 Watch the Windows tutorial part; don't skip it. Then create the job and click play. It will

  • 01:07:22 first download the necessary models, then it will start the training. The checkpoints will then appear

  • 01:07:28 here, so I will be able to download them from here; or, on my machine, in the AI Toolkit installation,

  • 01:07:37 they will be inside the output folder. So they will be inside here. To download them from there,

  • 01:07:42 I can use my Jupyter Lab notebook, or you can use Google Drive, OneDrive, or the

  • 01:07:50 ThinLinc client. However, ThinLinc would be very slow, so downloading the checkpoints from this interface will probably be the fastest way.
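
Since ThinLinc chokes on big files, a Hugging Face round trip is one workable route for checkpoints; below is a minimal sketch with the huggingface_hub library, where the repo name, file name, and token are hypothetical placeholders:

```python
# Sketch: push a trained checkpoint to a private Hugging Face repo from the cloud
# machine, then pull it on your local PC. Requires: pip install huggingface_hub
from huggingface_hub import HfApi, hf_hub_download

token = "hf_..."                          # your Hugging Face access token
repo_id = "your-username/z-image-loras"   # hypothetical repo name

api = HfApi(token=token)
api.create_repo(repo_id, private=True, exist_ok=True)

# On the cloud machine: upload a checkpoint from the AI Toolkit output folder.
api.upload_file(
    path_or_fileobj="output/my_training/my_training_000002100.safetensors",  # hypothetical path
    path_in_repo="my_training_000002100.safetensors",
    repo_id=repo_id,
)

# On your local machine: download it back.
local_path = hf_hub_download(repo_id=repo_id,
                             filename="my_training_000002100.safetensors",
                             token=token)
print(local_path)
```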

  • 01:07:56 Let's wait until the training begins so

  • 01:08:02 we can see the speed. Okay, so the training has started. It is like 18 seconds per iteration. So

  • 01:08:10 it will take 30 hours for 6,000 steps on this GPU (18 s × 6,000 ≈ 108,000 s ≈ 30 h). It is only 42 cents per hour, so it would

  • 01:08:17 cost about $12 (30 h × $0.42 ≈ $12.60). However, it is up to you. You can rent a more powerful GPU, or use RunPod with a 4090

  • 01:08:25 or 5090, or you can reduce the training resolution and speed it up four or five times. It is

  • 01:08:32 totally up to you what you want to do, but this is how you do it. And as the checkpoints are generated,

  • 01:08:38 they will appear here, so you can download them and use them on your local computer right away. This

  • 01:08:43 is it. I hope you have enjoyed it. Don't forget to delete your machine once you have saved

  • 01:08:50 your generated checkpoints. Otherwise, if you only use stop, it will not stop the billing on MassedCompute.

  • 01:08:56 And the MassedCompute team told me that they will get a lot of new GPUs, hopefully very soon,

  • 01:09:02 and maybe there will be permanent storage as well. We will see. Keep watching. Thank you so much.
