Midjourney Level NEW Open Source Kandinsky 2.1 Beats Stable Diffusion Installation And Usage Guide

FurkanGozukara edited this page Oct 26, 2025 · 1 revision

Midjourney Level NEW Open Source Kandinsky 2.1 Beats Stable Diffusion - Installation And Usage Guide


Discord: https://bit.ly/SECoursesDiscord. Kandinsky 2.1 is truly exceptional, and it is on par with Midjourney. In this video, I will compare Kandinsky to Stable Diffusion and provide a comprehensive tutorial on installation and usage. If I have been of assistance to you and you would like to show your support for my work, please consider becoming a patron on 🥰 https://www.patreon.com/SECourses

Playlist of StableDiffusion Tutorials, Automatic1111 and Google Colab Guides, DreamBooth, Textual Inversion / Embedding, LoRA, AI Upscaling, Kandinsky 2.1, Pix2Pix, Img2Img:

https://www.youtube.com/playlist?list=PL_pbwdIyffsmclLl0O144nQRnezKlNdx3

Save-image code posted on GitHub Gist (further improved):

https://gist.github.com/FurkanGozukara/10bdc0435b708b26bd87a59b6c3d1bc7
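The core idea of the Gist — save each generated image under a fresh random filename so repeated cell runs never overwrite earlier outputs — can be sketched as below. This is an illustrative version only: the actual Gist script works on PIL images, while this sketch takes raw bytes, and the function name is made up here.

```python
# Sketch of the save-with-a-random-name idea from the Gist.
# The real script saves PIL images; this illustrative version takes raw bytes.
import uuid
from pathlib import Path

def save_with_random_name(data: bytes, out_dir: str = ".", ext: str = "png") -> Path:
    """Write data under a random hex filename and return the resulting path."""
    out = Path(out_dir) / f"{uuid.uuid4().hex}.{ext}"
    out.write_bytes(data)
    return out
```

Calling something like this after each generation keeps every result on disk instead of only displaying the latest one in the notebook.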

Hugging Face repo of Kandinsky :

https://huggingface.co/ai-forever/Kandinsky_2.1

GitHub repo of Kandinsky: https://github.com/ai-forever/Kandinsky-2

How to install Python and Git : https://youtu.be/AZg6vzWHOTA

Installing Jupyter : https://jupyter.org/install

00:00:00 Amazing Kandinsky 2.1 text-to-image free model that beats Stable Diffusion 2.1 and is on par with Midjourney

00:00:18 Same prompt comparison with Midjourney and Kandinsky 2.1

00:00:45 Comparison of Kandinsky 2.1 with Stable Diffusion 2.1 with same prompts

00:01:50 Why Kandinsky 2.1 is better than Stable Diffusion or Dall-E 2

00:02:15 How to install Kandinsky 2.1

00:04:13 How to install Kandinsky 2.1 notebooks

00:05:13 Downloading and loading Kandinsky 2.1 model files

00:06:04 Starting to test Kandinsky 2.1 in Jupyter Notebook

00:06:36 How to save generated images - updated script is posted on GitHub Gist

00:07:20 Another prompt comparison with Stable Diffusion 2.1 and Kandinsky 2.1

00:09:00 If you have low VRAM, you can use Kandinsky 2.0 instead of 2.1

00:09:12 How to restart and use later Kandinsky 2.1 again after initial run

00:09:40 Where the Kandinsky model files are downloaded

00:10:41 How to collapse and hide all outputs in a JupyterLab notebook

Revolutionizing Visual Art: The Emergence of Text-to-Image Generative AI Models

Introduction

Artificial intelligence (AI) has been making rapid advancements in recent years, with groundbreaking applications in various fields. One such application is the development of text-to-image generative AI models, which are transforming the way we visualize and create art. These models use natural language processing to generate realistic images from text prompts. Among the most notable models are Stable Diffusion, DALL-E, Midjourney, and Kandinsky. This article delves into these innovative AI models and their implications for art, design, and creative industries.

Stable Diffusion

Stable Diffusion is a generative model that utilizes a diffusion process to create high-quality images from textual descriptions. A latent diffusion model at its core, it inverts the process of gradually adding noise to images by learning to progressively remove it. This technique allows the model to learn complex patterns and generate images with finer details. Stable Diffusion has shown promising results, producing images with enhanced realism and diversity compared to previous models.
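The forward noising process described above can be illustrated in a few lines. The function below is a toy sketch of one noising step, x_t = sqrt(1 − β)·x_{t−1} + sqrt(β)·ε, not Stable Diffusion's actual implementation (which operates on latent tensors with a full noise schedule).

```python
# Toy sketch of one forward noising step in a diffusion process:
# x_t = sqrt(1 - beta) * x_{t-1} + sqrt(beta) * noise.
# Illustration of the idea only, not Stable Diffusion's code.
import math
import random

def add_noise_step(x, beta, rng):
    """Blend a signal with Gaussian noise; beta in [0, 1] sets the noise share."""
    keep = math.sqrt(1.0 - beta)
    mix = math.sqrt(beta)
    return [keep * v + mix * rng.gauss(0.0, 1.0) for v in x]

# Repeating this step drives any image toward pure noise; training teaches a
# network to undo the steps one at a time, which is what sampling runs in reverse.
```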

DALL-E

DALL-E, a creation by OpenAI, has garnered significant attention for its ability to generate a vast array of images based on textual prompts. This model is a variant of the GPT-3 language model, fine-tuned to generate images instead of text. DALL-E's success lies in its capacity to handle abstract concepts and create visually coherent images, even with unusual or imaginative prompts. Its versatility and creativity make DALL-E a valuable tool for artists and designers looking to explore new visual possibilities.

Midjourney

Midjourney is a generative AI model that focuses on producing intricate and visually appealing images from text descriptions. It employs a combination of unsupervised and supervised learning techniques to generate images with remarkable detail and texture. The model's strength lies in its ability to understand and depict complex scenes, making it particularly suitable for landscape and architectural visualization. Midjourney offers an innovative approach to digital art, providing artists with a unique tool to inspire and enhance their creations.

Kandinsky

Named after the famous abstract painter Wassily Kandinsky, this AI model aims to bridge the gap between text and abstract visual art. Kandinsky employs a combination of deep learning techniques and an extensive dataset of abstract art to generate images based on text prompts. The model is specifically designed to understand and interpret emotions, moods, and abstract concepts in order to create visually striking and evocative images. This groundbreaking technology has the potential to redefine the way we create and perceive abstract art.

Conclusion

Text-to-image generative AI models, such as Stable Diffusion, DALL-E, Midjourney, and Kandinsky, are revolutionizing the creative landscape by providing artists, designers, and other professionals with powerful new tools for turning ideas into images.

Video Transcription

  • 00:00:00 Hello everyone, I am excited to present to you  the latest publicly released text-to-image model  

  • 00:00:05 Kandinsky 2.1. While it shares similarities with  Stable Diffusion, Kandinsky 2.1 is significantly  

  • 00:00:13 better. Its ease of prompting and the quality of the generated images are on par with Midjourney. When  

  • 00:00:19 I say it is at the Midjourney level, I am not  exaggerating. For example, this is a Midjourney  

  • 00:00:25 output. The prompt is Lion, Jungle, Cartoon,  Ultra Realistic. This output is 512x512. And  

  • 00:00:33 this is the output of Kandinsky 2.1 version with  the same prompt. Here is the comparison. The left  

  • 00:00:40 one is the Midjourney output. The right one is  the Kandinsky 2.1 version output. For example,  

  • 00:00:45 a beautiful rose garden, awesome, intricate, HD,  fantastic. Nothing else. This is our prompt. The  

  • 00:00:52 left one is the output of the Stable Diffusion  2.1 version. And the right one is the output of  

  • 00:00:58 the Kandinsky 2.1. Here you see another prompt.  A fancy sports car. These are not cherry-picked.  

  • 00:01:04 These are the first results that I have got. Here  at the left side, we are seeing the results of  

  • 00:01:10 the Stable Diffusion 2.1 version. In the right  side, we are seeing the results of the Kandinsky  

  • 00:01:15 2.1 version. Now another very simple prompt. A  futuristic very advanced battle robot. I am not  

  • 00:01:21 using any negatives. The left one is the output  of the Stable Diffusion 2.1 version. And the right  

  • 00:01:27 one is the output of the Kandinsky 2.1 version.  The right one is very similar to the output that  

  • 00:01:33 we would get from Midjourney. However, the left  one is very primitive. Very simple. It is not even  

  • 00:01:40 like a battle robot. So this is the difference  of Kandinsky 2.1 version. And it is totally free  

  • 00:01:46 to use on your computer forever as you wish. So  according to the authors of Kandinsky 2.1 version,  

  • 00:01:52 it inherits best practices from Dall-E 2 and  latent diffusion while introducing some new  

  • 00:01:58 ideas. As a text and image encoder, it uses CLIP  model and diffusion image prior mapping between  

  • 00:02:04 latent spaces of CLIP modalities. According  to the authors, this approach increases the  

  • 00:02:09 visual performance of the model and unveils new  horizons in blending images and text-guided image  

  • 00:02:14 manipulation. Installation of Kandinsky is looking  pretty easy. It will be installed into our main  

  • 00:02:21 Python installation by executing this command.  To do that, I am opening my CMD window. First,  

  • 00:02:28 let me show you my default Python. When I type  Python, I see that it is 3.10.8 version. Then  

  • 00:02:35 paste the command which is pip install and the  GitHub repository URL like this. It will install  

  • 00:02:41 everything into my default Python installation.  If you don't know how to install Python,  

  • 00:02:47 Stable Diffusion, and other things, in this video,  I have shown everything. The link will be in the  

  • 00:02:52 description. The installation has been completed.  It has overridden some of my default installations  

  • 00:02:58 in my default Python folder. However, this  shouldn't affect my Stable Diffusion installation  

  • 00:03:04 because Stable Diffusion uses its own virtual  environment. For using Kandinsky in our computer,  

  • 00:03:11 we are going to utilize Jupyter Notebooks. To  install Jupyter Notebook, we will use Jupyter from  

  • 00:03:17 JupyterLab. It is so easy to install. Just open  CMD, type this command, and it will be installed.  

  • 00:03:24 To start JupyterLab, open a CMD window, copy-paste this command, or type jupyter-lab,  

  • 00:03:30 and you will see a JupyterLab is opened. It will  display the contents of the folder where it is  

  • 00:03:37 started. This is important. Therefore, I have  cloned the Kandinsky 2 into my C drive. To do  

  • 00:03:43 that, I have opened a CMD window in my C drive.  I am just typing git clone and pasting the URL,  

  • 00:03:50 and I will start my JupyterLab inside this  folder. Type CMD here and run the command again.  

  • 00:03:57 JupyterLab has started inside this folder, as you  can see, and now I can see the notebooks folder,  

  • 00:04:03 which is mentioned inside their GitHub repository.  When you enter inside notebooks folder, you will  

  • 00:04:10 see the notebooks that they have made. So it will  install the necessary scripts inside the started  

  • 00:04:18 JupyterLab. Let's start executing them one by one.  You see, currently, all of these are displaying  

  • 00:04:25 nothing in these square brackets. So when I  execute them, you will see there is a * argument.  

  • 00:04:32 Then when the execution of the cell is completed,  it will display a number. You see, they have made  

  • 00:04:39 this first one like this because we already  have installed it. So it is just skipped, and  

  • 00:04:46 it installed the CLIP GitHub repository. This is  why there is an error. Okay, it looks like nothing  

  • 00:04:51 is problematic. Let's continue with the second  cell. Just run this. Looks like we need to install  

  • 00:04:58 IPyWidgets. So I clicked this URL, and in here  it shows how to install it. Just run it in our  

  • 00:05:07 main Python installation folder. It is installed.  Then let's rerun this command line again. Okay,  

  • 00:05:14 now no errors. It is displayed as number three.  Now let's execute the next cell. So it will load  

  • 00:05:21 the Kandinsky into our CUDA. By the way, I have  RTX 3060 at the moment. I have purchased RTX 3090,  

  • 00:05:30 and I will hopefully make a video about it as  well. A review video. So it is going to download  

  • 00:05:36 the necessary model file from Hugging Face. So  it is 2.68 GB. It is downloading the file which  

  • 00:05:45 is hosted on their Hugging Face repo. Currently  it is only ckpt file. While executing a cell,  

  • 00:05:52 be careful that you will see a * icon on the  cell square brackets. You need to wait until  

  • 00:05:59 a number is written there instead of the *. All files have been downloaded. Now it is time to start testing.  

  • 00:06:06 I will first execute their default prompt. Let's  see. While executing the prompt, you will see the  

  • 00:06:14 progress bar like this. I also have opened SD 2.1  version as well for comparison. So this it/s may  

  • 00:06:21 not be displaying my best it/s currently. Okay,  the execution has been completed. After that,  

  • 00:06:28 you need to run this cell. Don't forget that. When  you click running this cell, it will display the  

  • 00:06:34 latest generated image. We generated only one  image because the batch size is one. So for  

  • 00:06:40 displaying and saving the generated image, I have  written a simple script. This Python script is  

  • 00:06:46 shared on my GitHub Gist repository. The link will  be in the description so you can copy and paste  

  • 00:06:52 it. It will generate a new random name every time  when you execute this cell and it will save the  

  • 00:06:57 generated image with that name. Let's execute and  see. So the image is generated. When you open it,  

  • 00:07:03 you can see the image. So for comparing it with  SD 2.1 version, I have written the same prompt,  

  • 00:07:10 made the sampling steps 100 and made the output  768. This is base 2.1 version. This is the SD  

  • 00:07:18 2.1 version. This time, I have given another  prompt of some fantastic intricate castle in  

  • 00:07:24 a forest with beautiful waterfall and trees. It  took 34 seconds with 100 number of steps. And  

  • 00:07:30 this is the generated image. So I have executed  the same prompt on SD 2.1 version, and this is  

  • 00:07:37 the generated image. Let's compare them. So here you see the comparison between SD 2.1 version and  

  • 00:07:45 Kandinsky 2.1 version. The difference is huge.  With the same prompt, we get an awesome image on  

  • 00:07:53 Kandinsky 2.1 version. However, on SD 2.1 version,  we are getting a very simple image. For getting  

  • 00:07:59 this kind of image with SD 2.1 version we have  to do a lot of prompting, prompt engineering,  

  • 00:08:06 but we can get awesome image with Kandinsky 2.1  version. When this comes to SD Web UI, I believe  

  • 00:08:14 that it will be amazing. So let's compare it with  only 20 number of steps. With number of steps 20,  

  • 00:08:22 it took only 11 seconds. And this is the image  we got. You see, it is still amazing with only  

  • 00:08:28 20 number of steps. And when we compare the  timing, the Kandinsky took 11 seconds and  

  • 00:08:34 Stable Diffusion 2.1 took 7 seconds. Still, Stable  Diffusion 2.1 is faster because it is using a lot  

  • 00:08:42 of optimization such as xFormers and maybe other  things. Kandinsky is just released. Therefore,  

  • 00:08:48 it is not optimized. I believe it will be much  better in future. Currently, my computer is able  

  • 00:08:55 to run both of them at the same time with 12GB  VRAM memory. If you get VRAM memory error with  

  • 00:09:02 Kandinsky, you can use Kandinsky 2.0 instead. It uses less VRAM. It is a smaller model. Probably,  

  • 00:09:08 it is also lower quality. It generates 512x512 resolution. So let's say you wanted  

  • 00:09:14 to restart your Kandinsky later, just close your  CMD window, open your Kandinsky folder, start CMD,  

  • 00:09:22 start JupyterLab, and now you don't have to  do installation part again. Just go to the  

  • 00:09:28 cell with the kandinsky2 import and click execute cell. Whatever you do in JupyterLab will also be  

  • 00:09:35 displayed on the CMD window. So import has been  completed. Just click the second cell. This time,  

  • 00:09:42 it won't redownload the files because files have  been downloaded in this particular folder. You can  

  • 00:09:48 see that folder inside your C drive, inside  TMP folder. And when you go inside there,  

  • 00:09:55 you will see the downloaded files. So the total  downloaded files size is 6.75GB for me right now.  

  • 00:10:03 You need to wait until this * icon turns into a  number. That means that it is still executing and  

  • 00:10:10 now it is executed. Also, you can see Python  3 idle in the left bottom of my screen. When  

  • 00:10:18 you are doing something, this part of the screen  will write busy. Let me show you. So this time,  

  • 00:10:23 I will generate amazing intricate futuristic  fantastic tank. Let's execute it. While executing,  

  • 00:10:29 you see it is displaying here busy. And then  let's display it. So yes, this is a really,  

  • 00:10:36 really amazing result compared to what we would get from SD 2.1 version with this simple  

  • 00:10:42 prompt. You can also hide all of the outputs from view: in here, you will see Collapse  

  • 00:10:50 all outputs. When you click it, it will hide all  of the outputs windows. So this is all for today.  

  • 00:10:56 I hope you have enjoyed. Please like, subscribe,  leave a comment, tell me your ideas about this  

  • 00:11:02 new Dall-E 2 Midjourney Stable Diffusion like  text to image generative model. I think it is  

  • 00:11:08 amazing. Also, if you join our YouTube channel, I  would appreciate that very much. Please also join  

  • 00:11:14 our Discord channel. You will find the Discord  channel link in the description and also in  

  • 00:11:18 the comment section. If you also support us on  Patreon, I would appreciate that very much. Your  

  • 00:11:24 Patreon support is significantly important for  me. Hopefully see you in another awesome video.
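The install-then-generate workflow walked through in the video can be sketched as below. The call names mirror the README of the ai-forever/Kandinsky-2 repo at the time (get_kandinsky2, generate_text2img); verify them against the current repo before relying on them, and note that the example prompt and output filename here are just placeholders.

```python
# Hedged sketch of the workflow shown in the video, following the kandinsky2
# package from github.com/ai-forever/Kandinsky-2 (call names mirror its README
# at the time; verify against the current repo before relying on them).

def generation_settings(steps: int = 100, size: int = 768) -> dict:
    """Settings used in the video: 100 sampling steps at 768x768 output.

    Dropping to 20 steps cut generation from ~34 s to ~11 s on an RTX 3060
    in the video, with little visible quality loss.
    """
    return {
        "num_steps": steps,
        "batch_size": 1,        # the video generates one image per run
        "guidance_scale": 4,
        "h": size,
        "w": size,
        "sampler": "p_sampler",
        "prior_cf_scale": 4,
        "prior_steps": "5",
    }

def main() -> None:
    # Installed beforehand into the default Python with:
    #   pip install git+https://github.com/ai-forever/Kandinsky-2.git
    from kandinsky2 import get_kandinsky2

    # First run downloads the ~2.68 GB checkpoint from Hugging Face;
    # later runs reuse the cached files (the video finds them under TMP).
    model = get_kandinsky2("cuda", task_type="text2img",
                           model_version="2.1", use_flash_attention=False)
    images = model.generate_text2img("a fancy sports car", **generation_settings())
    images[0].save("output.png")

if __name__ == "__main__":
    main()
```

As the video notes, restarting later only requires rerunning the import and model-loading cells; the weights are not downloaded again once cached.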
