DeepFloyd IF By Stability AI - Is It Stable Diffusion XL or Version 3? We Review and Show How To Use
Full tutorial link > https://www.youtube.com/watch?v=R2fEocf-MU8
I review the amazing new model DeepFloyd IF-I-XL by Stability AI and show how you can use it on a free Kaggle notebook, step by step. #DeepFloyd IF is claimed to be the most advanced image generative model out there, with an FID-30K score of 6.66, beating DALL·E 2, Imagen, Parti & more.
Our Discord server
https://bit.ly/SECoursesDiscord
If I have been of assistance to you and you would like to show your support for my work, please consider becoming a patron on 🥰
https://www.patreon.com/SECourses
Technology & Science: News, Tips, Tutorials, Tricks, Best Applications, Guides, Reviews
https://www.youtube.com/playlist?list=PL_pbwdIyffsnkay6X91BWb9rrfLATUMr3
Playlist of #StableDiffusion Tutorials, Automatic1111 and Google Colab Guides, DreamBooth, Textual Inversion / Embedding, LoRA, AI Upscaling, Pix2Pix, Img2Img
https://www.youtube.com/playlist?list=PL_pbwdIyffsmclLl0O144nQRnezKlNdx3
DeepFloyd IF GitHub repo
https://github.com/deep-floyd/IF
DeepFloyd IF Official Website
DeepFloyd IF Kaggle NoteBook
https://www.kaggle.com/furkangozukara/deepfloyd-if-4-3b-generator-of-pictures-video-vers
Generate your Hugging Face token
https://huggingface.co/settings/tokens
DeepFloyd IF License Agreement To Accept
https://huggingface.co/DeepFloyd/IF-I-XL-v1.0
Improved Kaggle Notebook file
https://www.patreon.com/posts/enhanced-if-file-82253574
Kandinsky 2.1 Tutorial
00:00:00 Introduction to Stability AI DeepFloyd IF
00:00:29 How DeepFloyd IF is built and how does it work
00:00:51 Architecture of the DeepFloyd IF model
00:01:10 What makes DeepFloyd IF model better
00:01:55 Strongest part of DeepFloyd IF
00:02:17 Comparison between DeepFloyd IF and other models
00:03:16 More detailed architecture of DeepFloyd IF
00:03:39 Minimum requirements to use DeepFloyd IF
00:04:18 How to register a free Kaggle account
00:04:35 How to use DeepFloyd IF on a free Kaggle notebook step by step
00:05:23 How to contact Kaggle support to activate your Kaggle account for GPU usage
00:05:40 Other Kaggle notebook settings
00:05:50 Start Kaggle session and installation
00:07:50 How to get your Hugging Face token
00:09:07 How to accept DeepFloyd IF license agreement
00:09:41 Continuing the installation of the DeepFloyd IF libraries on Kaggle
00:11:09 Starting image generation with DeepFloyd IF
00:12:55 Seeing our first images generated by DeepFloyd IF
00:14:45 Where the generated images are saved
00:15:15 DeepFloyd IF vs SD 1.5 Custom Model Rev Animated comparison
00:16:05 DeepFloyd IF vs Kandinsky 2.1 comparison
00:16:18 DeepFloyd IF vs Stable Diffusion 1.5 base model comparison
00:16:39 DeepFloyd IF vs Stable Diffusion 2.1 768px base model comparison
00:16:46 Text generation performance comparison of DeepFloyd IF with other models
00:17:16 How to disable IF watermark from generated images
00:17:43 Results of text written image generation
00:18:35 DeepFloyd IF vs other models text generation comparison
00:19:19 Experiments of 4 different prompts
00:20:45 How to download all of the images as a zip file. Utilize ChatGPT to get the code
00:22:00 Examples provided on DeepFloyd AI and testing them
00:22:16 How to generate multiple different images with same prompt by using random seeds
00:24:07 How to delete all generated images in the runtime folder of Kaggle
00:25:37 How to use the downloaded enhanced Kaggle notebook
IF-I-XL-v1.0
DeepFloyd-IF is a pixel-based text-to-image triple-cascaded diffusion model, that can generate pictures with new state-of-the-art for #photorealism and language understanding. The result is a highly efficient model that outperforms current state-of-the-art models, achieving a zero-shot FID-30K score of 6.66 on the COCO dataset.
Developed by: DeepFloyd, StabilityAI
Model type: pixel-based text-to-image cascaded diffusion model
Cascade Stage: I
Num Parameters: 4.3B
Language(s): primarily English and, to a lesser extent, other Romance languages
License: DeepFloyd IF License Agreement
Model Description: DeepFloyd-IF is a modular model composed of a frozen text encoder and three cascaded pixel diffusion modules, each designed to generate images of increasing resolution: 64x64, 256x256, and 1024x1024. All stages of the model utilize a frozen text encoder based on the T5 transformer to extract text embeddings, which are then fed into a UNet architecture enhanced with cross-attention and attention-pooling.
Training Data:
1.2B text-image pairs (based on LAION-A and few additional internal datasets)
Test/Valid parts of datasets are not used at any cascade and stage of training. The validation part of COCO helps to demonstrate "online" loss behaviour during training (to catch incidents and other problems), but the dataset is never used for training.
Training Procedure: IF-I-XL-v1.0 is a pixel-based diffusion cascade which uses T5-Encoder embeddings (hidden states) to generate 64px image. During training,
thumbnail by twitter @artimindArt
-
00:00:00 Greetings everyone. In this video, I will introduce you to the DeepFloyd IF model, which
-
00:00:05 is the most advanced image generative model out there with an FID-30K score of 6.66,
-
00:00:12 beating DALL·E 2, Imagen, Parti & more. It has been announced and released by Stability
-
00:00:20 AI yesterday as you are seeing right now. On their official website DeepFloyd.ai it
-
00:00:26 is explained like this: DeepFloyd IF is built with multiple neural models. DeepFloyd IF
-
00:00:34 generates high-resolution images in a cascading manner. The action kicks off with a base model
-
00:00:41 that produces low-resolution samples, which are then boosted by a series of upscale models
-
00:00:48 to compose stunning high-resolution images. And the most interesting part of DeepFloyd
-
00:00:53 IF is that it operates within the pixel space, as opposed to latent diffusion models, e.g. Stable Diffusion,
-
00:01:01 which we are accustomed to using and which depend on latent image representations. So this is
-
00:01:07 the architecture of their model. So what makes DeepFloyd IF model better? The DeepFloyd IF
-
00:01:14 4.3 billion parameter base model is the largest diffusion model in terms of number of effective
-
00:01:21 parameters of the U-Net. This is true; for example, Stable Diffusion 1.5 uses 859 million
-
00:01:30 parameters. So DeepFloyd IF is about 5 times larger than Stable Diffusion 1.5. A deep text
-
00:01:38 understanding is achieved by employing the large language model T5-XXL as a text encoder, using optimal
-
00:01:47 attention pooling, and utilizing additional attention layers in super-resolution models
-
00:01:53 to extract information from the text. So the strongest part of DeepFloyd IF is that it
-
00:01:59 is able to produce text amazingly. As you are seeing right now. It is able to write
-
00:02:07 text on generated images amazingly with amazing quality. The text that it can write on the
-
00:02:15 generated images is stunning. So they also released comparison between DeepFloyd IF and
-
00:02:21 other models as you are seeing right now. In this tutorial video, I will show you how
-
00:02:27 to run it on a Kaggle notebook. At the very bottom, they also provide some prompts and
-
00:02:33 the results that they have generated. Unfortunately, since the model size is very big, it requires
-
00:02:40 a minimum of 14 GB of VRAM. And since the majority of consumer-grade graphics cards are under
-
00:02:48 14 GB of VRAM, I will show how to use the DeepFloyd IF model on a free Kaggle notebook. All you
-
00:02:56 need to follow this tutorial is to register a Kaggle account and follow my steps. I will
-
00:03:02 also show you how you can turn off this IF watermark. So the link of their GitHub repository,
-
00:03:09 their official website, and the Kaggle notebook will be in the description of the video. On
-
00:03:16 their GitHub repository the architecture of the model is also clearly explained as you
-
00:03:22 are seeing right now. So in the first step, they generate a 64 by 64 pixel image, then
-
00:03:29 it is upscaled to 256 by 256 pixels, and in the final step, it is upscaled to 1024 by
-
00:03:37 1024 pixels. So the minimum requirement to use all DeepFloyd IF models without 8-bit
-
00:03:44 quantization is 24 GB of VRAM. Since Kaggle provides two GPUs for free to
-
00:03:52 use at the same time, we are able to load different parts of the model into each GPU
-
00:03:58 and use the model on a free Kaggle account. There is also Google Colab link on the GitHub
-
00:04:04 page as well if you want to test it out. And if you are ready, let's begin. To open the
-
00:04:10 Kaggle notebook, either click the link here or the link in the description of the video.
-
00:04:16 You need to have a Kaggle account. You can register for free from here. By the way, after
-
00:04:21 you have registered, you need to verify your phone number, and then you will be able to use the Kaggle
-
00:04:27 GPU. So I will sign in with my email. After you sign in, it will redirect you to the homepage,
-
00:04:33 then you need to click the link again. So to begin using the DeepFloyd IF 4.3 billion parameter
-
00:04:39 model, we need to click edit my copy here. Then it will generate a copy of the notebook
-
00:04:46 on your account. As you are seeing here. Running is pretty easy. The first thing is that we
-
00:04:52 need to start our session. In the right panel of the Kaggle, you will see notebook options.
-
00:04:59 You need to pick GPU T4 X2. This means it will give us two GPUs to use at the same time.
-
00:05:07 Turn on GPU T4. If you are not getting this option, then that means that your account
-
00:05:12 is not verified. This is a free account, you need to enter your phone number and verify
-
00:05:17 it. If you encounter any problem with your phone verification, then you should contact
-
00:05:22 Kaggle support. To do that, search for "Kaggle support" on Google. Click the contact
-
00:05:29 Kaggle support link here. And in here click "I cannot activate my account / I cannot verify my
-
00:05:35 phone number". Then you can use this form to get your account verified. Also language:
-
00:05:41 Python is selected here. You can select persistence or not. I am not using any persistence. Pin
-
00:05:46 to original environment. Internet on. These are important. Then let's start our session
-
00:05:52 from here. Kaggle is much better than free Google Colab because they are giving you 30
-
00:05:58 hours of GPU time every week, and you can see how much quota you have left, unlike
-
00:06:04 Google Colab. Okay, the session has started. Now I am ready to use the DeepFloyd IF model. You
-
00:06:12 see that Kaggle assigned me a 70 GB hard drive, 13 GB of RAM, and two graphics cards with
-
00:06:20 15 GB of VRAM each. First we will click this button to install the necessary dependencies.
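To put the VRAM figures mentioned so far in context, here is a rough back-of-the-envelope sketch. It assumes that the weights of the 4.3 billion parameter stage I model dominate memory use and that each parameter takes 4, 2, or 1 bytes in fp32, fp16, or 8-bit form. It ignores activations, the T5 text encoder, and the upscalers, so real usage will be higher; the function name is made up for this illustration.

```python
# Rough weight-memory estimate: parameter count x bytes per parameter.
# Activations, the text encoder and the upscaler stages are NOT included.

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Approximate size of the weights in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    stage_one_params = 4.3e9  # IF-I-XL parameter count from the model card
    for precision in BYTES_PER_PARAM:
        size = weight_memory_gb(stage_one_params, precision)
        print(f"{precision}: about {size:.1f} GB of weights")
```

This suggests why 8-bit loading and splitting the parts across two GPUs help the model fit into the free Kaggle hardware.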
-
00:06:27 While installation is going on, you will see an indicating icon here. Cell execution is
-
00:06:32 queued. You need to wait until this is completed. The first installation has been completed.
-
00:06:39 You can ignore all of the warnings and error messages here. As you are seeing right now.
-
00:06:45 You see it is no longer in the loading state. Now we will execute this cell. So I am clicking
-
00:06:51 here and the play icon appears. Click it. So for cell execution to appear, you need
-
00:06:57 to click somewhere around in the cell like this. Okay this is completed pretty quickly.
-
00:07:03 It also shows the available GPUs. The very nice feature of DeepFloyd IF is that it has
-
00:07:10 different parts so that you can load the different parts of the model into different GPUs. Therefore
-
00:07:17 in this code we will load these different parts in the different GPUs since we have
-
00:07:23 two GPUs available right now. Then click this play icon. You see you need to execute every
-
00:07:30 cell and after cell execution you will get a number like this. You need to see this number.
-
00:07:36 It says that the cell executed at 11:53 pm and took 225 seconds. The second cell executed in
-
00:07:45 only 11 seconds and now it is downloading the first model. Okay, the cell execution
-
00:07:50 is completed. Now the next one. As a next step, we need a Hugging Face token. Probably
-
00:07:55 this requirement will be removed later, but for now we need it. So either search Google for Hugging
-
00:08:01 Face or click the link in the description of the video. You need to register an account
-
00:08:06 here. It is free. After you have registered your account, go to the settings. In here you will
-
00:08:12 see access tokens. Click there, then click new token. I will choose write here and you can
-
00:08:18 give any name. I will type deep floyd like this, then generate token. Click the copy icon here.
-
00:08:25 Then execute this cell. Paste the token here. Click login. As a next step, you are
-
00:08:32 seeing device cuda 1. If you are running with only single GPU then you need to make this
-
00:08:39 0. However, we have two GPUs right now. How do I know? When I click here, I can see
-
00:08:44 I have two GPUs: GPU 1 and GPU 2. Therefore, I can use cuda 1. As I said, if you use only
-
00:08:51 a single GPU like the GPU P100, then you need to make this 0. However, since I have two GPUs,
-
00:08:58 I will load it onto cuda 1, which is actually the second GPU because the numbering starts from 0.
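The device choice described here can be sketched as a small helper. This is only an illustration and not the notebook's code: `pick_device` and `detect_gpu_count` are hypothetical names, and the only real library call used is `torch.cuda.device_count()`, which is imported lazily so the helper also works where PyTorch is not installed.

```python
def pick_device(gpu_count: int, preferred_index: int = 1) -> str:
    """Return a device string such as 'cuda:1', falling back to 'cuda:0' or 'cpu'.

    CUDA devices are numbered from 0, so 'cuda:1' is the second GPU.
    """
    if gpu_count <= 0:
        return "cpu"
    # Use the preferred GPU only if it actually exists; otherwise use the first one.
    index = preferred_index if 0 <= preferred_index < gpu_count else 0
    return f"cuda:{index}"


def detect_gpu_count() -> int:
    """Count visible CUDA GPUs, or return 0 when PyTorch is unavailable."""
    try:
        import torch
    except ImportError:
        return 0
    return torch.cuda.device_count()


if __name__ == "__main__":
    print(pick_device(detect_gpu_count()))
```

With two T4 GPUs this yields `cuda:1`, and with a single GPU such as the P100 it yields `cuda:0`, matching the manual change described above.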
-
00:09:05 Click it. Okay, we have an error because we need to accept the license they require.
-
00:09:13 So you need to open this URL. I will put this URL into the description. Type here anything
-
00:09:19 you want, like "test test", check "I accept the above license agreement", then click "Agree and access repository". Now
-
00:09:26 I can access the repository and the files as you are seeing right now. Then I will re-execute
-
00:09:32 the cell and you see it is downloading. Okay, the file is downloaded in two minutes and
-
00:09:39 loaded into our device. Now this is the next cell that we will execute. It is getting loaded
-
00:09:46 in eight bit because our VRAM is below 24 gigabytes. This model is getting loaded into
-
00:09:53 first GPU cuda 0. So the cuda 0 is the first GPU and cuda 1 is the second GPU. We will
-
00:10:00 see the VRAM usage increase on GPU 0, the first GPU, after this cell is executed. The cell execution
-
00:10:09 has been completed. It took about four minutes. There are some warnings, but you can ignore
-
00:10:14 them and now you see the VRAM usage of both GPUs is over 8 gigabytes. As a next step,
-
00:10:21 we will execute this cell. As I said, you need to have an execution number on all of the cells
-
00:10:28 like you are seeing. If you skip a cell execution then it won't work. Okay, this didn't take
-
00:10:36 too much time. Now GPU 2 is using over 11 gigabytes. We will load this model right now
-
00:10:44 into the first GPU. The download speed of Kaggle is awesome, as you are seeing. Okay,
-
00:10:50 the execution of the cell is completed. These are the VRAM usages and the RAM usage. Now
-
00:10:55 we will empty the unnecessary VRAM usage by executing this cell. Finally, we are ready
-
00:11:02 to use our prompts and generate images with the awesome DeepFloyd IF 4.3 billion parameter model. After
-
00:11:09 you've written your prompts, click play button, it will save it. Then for generating images
-
00:11:15 you need to play this cell. Okay, currently it is generating my first prompt. These are
-
00:11:21 the VRAM usages. This is the CPU usage and RAM usage. This is the disk space usage. It
-
00:11:28 is doing 150 steps, which is a pretty high number. So you can reduce the number of steps to
-
00:11:35 increase your image generation speed. This is the it/s value, which means how many
-
00:11:41 iterations it is doing every second and it is going to do 150 iterations for the first
-
00:11:49 image generation. So this model is a cascading model. So DeepFloyd works in a cascading manner
-
00:11:56 as you are seeing in here as well. Therefore it will do three generations. In the first
-
00:12:03 generation, it will generate a lower resolution 64 pixels, then in the second generation it
-
00:12:09 will upscale it into 256 pixels. Then it will upscale it into 1024 pixels. You see it completed
-
00:12:19 the first generation. Now it started the second generation. Ok, second generation also completed.
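The cascade just described (64, then 256, then 1024 pixels) and the step count can be turned into a little arithmetic. The sketch below is only bookkeeping under stated assumptions: the stage sizes and the 150 steps come from the video, while the iterations-per-second value in the demo is a made-up example rate, not a measurement.

```python
# Output side length in pixels of stages I, II and III, as described in the video.
STAGE_SIZES = [64, 256, 1024]


def upscale_factors(sizes):
    """Side-length ratio between each stage and the following one."""
    return [larger // smaller for smaller, larger in zip(sizes, sizes[1:])]


def estimated_seconds(steps: int, iterations_per_second: float) -> float:
    """Time for one stage, given its step count and the displayed it/s rate."""
    return steps / iterations_per_second


if __name__ == "__main__":
    print("upscale factors:", upscale_factors(STAGE_SIZES))
    # 2.0 it/s is a hypothetical rate chosen only to show the calculation.
    print("stage I seconds:", estimated_seconds(150, 2.0))
```

Halving the number of steps halves the estimated time for that stage, which is why reducing steps speeds up generation.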
-
00:12:26 Now it is starting the final part. You see every generated image is being saved in these
-
00:12:33 variables: pil_images 1, pil_images 2 and pil_images 3. It is saving every part in the
-
00:12:41 variables list. Ok generations have been completed. Now time to show and save. First execute this
-
00:12:48 cell. I see that the generation wasn't completed because this cell is still being executed
-
00:12:53 and now it is completed in 225 seconds, which is a long time. Ok now we are seeing the images.
-
00:13:00 This is the base image 64 pixels. This is the second stage image which is 256 pixels
-
00:13:08 and now let's see the final image, which will be 1024 by 1024, and here it is. Ok, we have waited
-
00:13:17 too long for this image. Now let's compare it with Stable Diffusion 1.5 rev animated
-
00:13:23 version. Ok looks like I had a spelling mistake so I will fix and rerun. Every time you make
-
00:13:29 a change here you need to click this play button then click again to generate image.
-
00:13:35 Ok looks like the prompt fixing didn't make much difference. Now I will test the same
-
00:13:41 prompt on Stable Diffusion 1.5 version, Stable Diffusion 2.1 version, the custom public model
-
00:13:49 rev animated version 11, and I will also test it on Kandinsky 2.1 version. Let's start with
-
00:13:56 rev animated version 11. I am going to use high resolution fix to get the same resolution
-
00:14:04 and I won't do any cherry-picking. I will use the first generated image so I am hitting
-
00:14:10 generate button. Automatic1111 is currently running on my local computer. Kandinsky 2.1
-
00:14:16 is currently running on Google Colab and DeepFloyd IF is running on Kaggle notebook. Let's also
-
00:14:23 start Kandinsky image generation. By the way: I have an amazing tutorial for Kandinsky 2.1
-
00:14:29 version on my channel and in the pinned comment, you can find the Colab link to open this Colab
-
00:14:36 for Kandinsky 2.1 version. Okay, the image generation is done with rev animated. Let's
-
00:14:43 compare them. So I will copy this image. Okay, looks like there is no copy option here so
-
00:14:50 it is saved inside our runtime. So if you wonder where our runtime is, you see in the
-
00:14:56 right panel of Kaggle there is output. This is your runtime. So here is our image. Let's
-
00:15:03 refresh it. It is named as 0.png. Okay, it looks like it will always start from the first number,
-
00:15:10 so we need to modify this for future generations as well. Okay, it can be like
-
00:15:15 this so we will add a counter here. Okay, let's start comparison. So I will download
-
00:15:20 this from here. Download. Okay, it is downloaded. Let's open a new window for comparison like
-
00:15:28 this: This is DeepFloyd image: this is the first image that rev animated generated with
-
00:15:34 the same prompt. Okay, here is the comparison with their native resolution: I can say that
-
00:15:39 rev animated is looking much better. Okay, let's also test with 1.5 pruned version which
-
00:15:45 is the official base version and this is the Kandinsky 2.1 version result. Kandinsky normally
-
00:15:52 supports 768 by 768, but we generated this image at 1024 by 1024. Let's do its comparison
-
00:16:02 first. So I pasted the image here. And you are seeing right now: DeepFloyd versus Kandinsky
-
00:16:08 2.1 version. Okay, the result of Stable Diffusion 1.5 also generated and it is looking terrible.
-
00:16:16 It seems to have a lot of repetition. Let's also see it. Okay, now you are seeing the
-
00:16:22 result of base Stable Diffusion 1.5 version. Okay, let's also test with Stable Diffusion
-
00:16:28 2.1 version 768 pixels. This time I won't use high resolution fix and I will just generate
-
00:16:35 at 1024 by 1024. Okay, this is the result. Let's compare and now we are seeing the result
-
00:16:41 of Stable Diffusion 2.1 version and the DeepFloyd IF model. DeepFloyd is said to be very strong
-
00:16:48 with text generation. So let's compare its text performance with other models as we did.
-
00:16:56 I am going to use natural language to do this experimentation. A teddy bear wearing a white
-
00:17:04 shirt that has SECourses text written on it. Also, you may have already noticed that the
-
00:17:11 generated images have an IF watermark, but we can disable it. To disable the watermark, we
-
00:17:17 are going to use this. We just need to put it like this: equal to false, equal to false
-
00:17:25 and equal to false. Okay, let's generate and see the real performance of DeepFloyd. So
-
00:17:31 after changing your prompt, you can also write multiple prompts here. I will show click,
-
00:17:36 play, and click generate. Meanwhile, I will do generation on other models as well. Okay,
-
00:17:42 the generation of DeepFloyd has been completed. You see it displays a number here. Let's see
-
00:17:47 the results. Okay, it is looking pretty accurate, but it has a letter error here. Let's see
-
00:17:55 the final result. Okay I need to fix the error here. Okay, fixed the script error. All right.
-
00:18:02 So this is the result of DeepFloyd. By the way, we still have the watermark. However, it
-
00:18:08 should have been removed. Oh, I see the mistake: we set disable watermark to false. It should
-
00:18:15 have been true. In the next example, I will do that. Okay, now let's compare the results.
-
00:18:22 It is pretty close. However, it doesn't look natural if you ask me. Okay, let's download
-
00:18:27 it. Okay, let's start with rev animated. The image of rev animated is much better quality.
-
00:18:33 However, it has totally unrelated text. The text of DeepFloyd is pretty accurate except for
-
00:18:40 one letter it is missing: the O here. SECourses. However, the quality is pretty bad. Okay,
-
00:18:48 now we are seeing SD 1.5 version. It doesn't even have any text on it. This is the result.
-
00:18:55 And now we are seeing the result of SD 2.1 version. It has some text, but it is also
-
00:19:01 totally unrelated. Okay, now we are seeing the result of Kandinsky 2.1 version. Kandinsky
-
00:19:07 also has some text, but it is also totally unrelated. So with written text, DeepFloyd
-
00:19:13 is very good, but the image quality is not very good. I will do some more tests now and
-
00:19:19 I will show you. Okay this time I am going to test 4 different prompts. The first prompt
-
00:19:26 is a terminator robot holding a banner that has we love programming text on it. Then I
-
00:19:32 am adding some beautifying keywords after it. The second prompt is a soap that
-
00:19:38 has text on it. This soap is amazing. Then adding other beautifying prompts. In the last
-
00:19:44 two prompts I am removing the beautifying prompts as you are seeing here. So let's test
-
00:19:50 it. I will copy paste it here and play. Okay, it is set. Let's hit generate. Okay, currently
-
00:19:57 it is generating the third prompt. While generation is in progress, we will see a cancel run button
-
00:20:04 here. That means that it is still running. Not finished yet. Okay, the image generation
-
00:20:10 has been completed. You see there is no longer a cancel run button. Now let's display the images.
-
00:20:16 The images are here. Let's save them. They will appear here. Okay, okay, images are saved.
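Because the saved files restart from the same number on every run, earlier images can be overwritten. A minimal sketch of a counter-based save is shown below. It is not the notebook's actual code: `next_free_index` and `save_images` are hypothetical helpers, and the images are only assumed to have a `save(path)` method, as PIL images do.

```python
import re
import tempfile
from pathlib import Path

_NUMBERED_PNG = re.compile(r"^(\d+)\.png$")


def next_free_index(out_dir) -> int:
    """Return one more than the highest N among existing 'N.png' files (0 if none)."""
    out_dir = Path(out_dir)
    if not out_dir.is_dir():
        return 0
    used = [
        int(match.group(1))
        for path in out_dir.iterdir()
        if (match := _NUMBERED_PNG.match(path.name))
    ]
    return max(used) + 1 if used else 0


def save_images(images, out_dir):
    """Save each image as the next unused 'N.png' and return the written paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    start = next_free_index(out_dir)
    paths = []
    for offset, image in enumerate(images):
        path = out_dir / f"{start + offset}.png"
        image.save(path)  # assumes a PIL-style save(path) method
        paths.append(path)
    return paths


if __name__ == "__main__":
    class _DummyImage:
        """Stand-in for a generated image so the demo needs no extra libraries."""

        def save(self, path):
            Path(path).write_bytes(b"placeholder")

    with tempfile.TemporaryDirectory() as demo_dir:
        print([p.name for p in save_images([_DummyImage()], demo_dir)])
        print([p.name for p in save_images([_DummyImage()], demo_dir)])
```

Scanning the folder for the highest existing number keeps the counter correct even after the session restarts and in-memory variables are lost.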
-
00:20:24 So let's download and see them. By the way each time this is starting from one. So we
-
00:20:30 should add a code here. Make it as a base something like this, then it will just get
-
00:20:36 increased. Okay, let's download all of them. You can also download entire directory if
-
00:20:42 you wish. How can you do that? We can write something like this. I will ask ChatGPT:
-
00:20:48 "give me a python code that will download Kaggle working directory as a zip". Okay, okay, here is
-
00:20:57 the code. Let's copy it. Let's execute it and see if it is working. Okay, we got the
-
00:21:04 zip. We click and we download it. Okay, working. Amazing. We don't need to click download one
-
00:21:10 by one, and here are all the generated files. This is the result of the first prompt, which has beautifying
-
00:21:16 words. Unfortunately, the text is not very good and it is not even related. The second
-
00:21:23 one is still totally unrelated. Below the image, it shows you the prompt you
-
00:21:30 are seeing. This is the third prompt. In the third prompt when we don't add beautifying
-
00:21:37 words, it is much better. I see that it is almost correct, but not exactly. Here is a soap
-
00:21:44 that has a text on it. This soap is amazing. Unfortunately, this is also not very correct.
-
00:21:50 So far, we don't have very good examples. It is taking a long time, so this definitely
-
00:21:55 needs some improvements. Okay, there are some examples provided on the DeepFloyd.ai website.
-
00:22:04 So let's try this: aerial photo of a beach the words "what if?" written in the sand.
-
00:22:10 Okay, this is our prompt. I click play. Also, if you want to generate different images with
-
00:22:18 the same prompt, you need to change the seed. So instead of manually executing this multiple
-
00:22:25 times with changing seed, we can write a simple script. We can put this script into a for
-
00:22:34 loop and decide how many times we want to generate different images with different seeds
-
00:22:39 with the same prompt. So I will copy this entire code. Then I will open a new chat here.
-
00:22:46 I will instruct ChatGPT: "modify the script below so that I can set it to loop 10
-
00:22:56 times and each time use a different random seed value". Then I will paste the code like
-
00:23:05 this and hit enter. Currently I am using free ChatGPT, so you can also use it like this.
-
00:23:12 Okay, here is our code. Let's copy it. Let's paste it here. You see that now it is wrapped in a
-
00:23:19 for loop. So I can change this to any value I want and it will loop that many times.
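The idea of wrapping generation in a loop with a fresh random seed each time could look roughly like this. It is a sketch under assumptions, not the code ChatGPT produced in the video: `generate` is a hypothetical callback standing in for whatever function actually runs DeepFloyd IF with a prompt and a seed.

```python
import random


def random_seeds(count: int, rng=None):
    """Return `count` random seed values in the 32-bit range."""
    if rng is None:
        rng = random.Random()
    return [rng.randrange(2**32) for _ in range(count)]


def generate_batch(prompts, iterations, generate, rng=None):
    """Run every prompt once per iteration, using a new random seed each iteration.

    `generate(prompt, seed)` is a placeholder for the real image generation call.
    Returns a list of (prompt, seed, result) tuples.
    """
    results = []
    for seed in random_seeds(iterations, rng):
        for prompt in prompts:
            results.append((prompt, seed, generate(prompt, seed)))
    return results


if __name__ == "__main__":
    def fake_generate(prompt, seed):
        # Stand-in that only describes what would be generated.
        return f"image for {prompt!r} with seed {seed}"

    for _, _, description in generate_batch(["a beach", "a soap"], 2, fake_generate):
        print(description)
```

Passing a seeded `random.Random` instance as `rng` makes the sequence of seeds repeatable, which is one way to avoid fully random seeds when a run needs to be reproduced.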
-
00:23:27 So I will loop with 10 and it should work exactly as we want. So let's try it. Let's
-
00:23:33 loop it. The image generation has been completed. There was a tiny bug and I have fixed it.
-
00:23:40 Now it is generating as many images as we want. You just need to change the parameter
-
00:23:46 here. It is using a random seed. If you don't want a random seed, you can make it like this.
-
00:23:53 Then it is also iterating every one of the prompts. So I have used two prompts like this:
-
00:24:00 I also changed the location of this parameter here and I added a new script that will clear
-
00:24:08 your working directory if you wish. I cleared it and let's see the images. Okay, so this
-
00:24:15 is the very first image it generated. This is the second upscaled and let's see the final
-
00:24:22 image with here and this script will save every file inside this working directory as
-
00:24:30 a zip file. Okay, here we are seeing the final images. So this is aerial photo of a beach
-
00:24:36 the words "what if?" written in the sand and it is looking very very good for this prompt.
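The two housekeeping scripts mentioned above, one that zips the working directory and one that clears it, can be sketched with the standard library alone. This is an illustrative version, not the code from the video: the function names are made up, and on Kaggle the directory would typically be `/kaggle/working`, while the demo below uses a temporary folder.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def zip_directory(src_dir, archive_path):
    """Write every file under src_dir into a zip, skipping the archive itself."""
    src_dir = Path(src_dir)
    archive_path = Path(archive_path)
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(src_dir.rglob("*")):
            if path.is_file() and path.resolve() != archive_path.resolve():
                archive.write(path, path.relative_to(src_dir).as_posix())
    return archive_path


def clear_directory(target_dir) -> int:
    """Delete everything inside target_dir (but not the folder) and count removals."""
    removed = 0
    for path in Path(target_dir).iterdir():
        if path.is_dir() and not path.is_symlink():
            shutil.rmtree(path)
        else:
            path.unlink()
        removed += 1
    return removed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as demo_dir:
        work = Path(demo_dir)
        (work / "0.png").write_bytes(b"placeholder")
        archive = zip_directory(work, work / "images.zip")
        with zipfile.ZipFile(archive) as zf:
            print(zf.namelist())
        print("removed entries:", clear_directory(work))
```

Skipping the archive while walking the folder matters because the zip is created inside the same directory it is compressing.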
-
00:24:43 The second prompt is an amazing intricate masterpiece space wallpaper that has the following
-
00:24:50 word SECourses. However, it has failed one more time. Also, we have only two images which
-
00:24:57 means that I had another error here and I will fix it as well. Okay, I have fixed the
-
00:25:03 problem. Now with different seeds we see different images. This is the first seed one and this
-
00:25:10 is the second random seed. You see with same prompt we got different images. I also fixed
-
00:25:17 the iteration count here so you just need to change iteration count here. Change it
-
00:25:23 to as many images as you want to generate. I have shared this modified notebook on our
-
00:25:31 Patreon page. You can download the improved notebook file from this post and then all
-
00:25:37 you need to do is go to File here and click import notebook. Select the downloaded notebook file
-
00:25:46 like this and then import and then it will import the updated notebook like this. This
-
00:25:53 is all for today. I hope you have enjoyed it. Please like, subscribe, and leave a comment. I
-
00:25:59 also have amazing other tutorial videos on my channel as you are seeing right now. Please
-
00:26:04 check them out as well. Also, in the description of the video and in the pinned comment section,
-
00:26:11 you will find our discord channel and Patreon link. These links will be also in the description
-
00:26:17 of the video. When you click the discord link, you will be directed to this page. You see
-
00:26:23 currently we have over 2250 members, over 300 online members. I am also expecting you
-
00:26:30 to join. Come here, let's chat, ask me any questions that you have. You can also run
-
00:26:36 DeepFloyd on your computer by using this jupyter notebook. However, I didn't show this in this
-
00:26:42 tutorial because, as you have seen, you need a GPU with over 14 GB of VRAM, and the majority of
-
00:26:50 consumer GPUs are currently under 14 GB. I have a 24 GB RTX 3090, but there are only two
-
00:27:00 models with over 12 GB: the RTX 3090 and RTX 4090. But if there is a request for this from
-
00:27:10 you, I may make a video for running DeepFloyd IF model on your computer as well. However,
-
00:27:16 Kaggle is awesome for running these models as long as there is a ready-made script.
-
00:27:24 Kaggle gives you 30 hours per week and you are able to use all of it and see the remaining
-
00:27:29 time you have. It is much better than Google Colab as I said. Hopefully see you in another
-
00:27:33 awesome video.
