How To Use Mochi 1 Open Source Video Generation Model On Your Windows PC, RunPod and Massed Compute
Full tutorial link > https://www.youtube.com/watch?v=iqBV7bCbDJY
Mochi 1 from Genmo is the newest state-of-the-art open source video generation model that you can use for free on your computer. This model is a breakthrough like the very first Stable Diffusion model, but this time for video generation. In this tutorial, I am going to show you how to use the Genmo Mochi 1 video generation model on your computer, on Windows, locally, with the most advanced and very easy to use SwarmUI. SwarmUI is as fast as ComfyUI but also as easy to use as the Automatic1111 Stable Diffusion web UI. Moreover, if you don't have a powerful GPU to run this model locally, I am going to show you how to use this model on the best cloud providers, RunPod and Massed Compute.
🔗 Public Open Access Article Used in Video
Amazing Ultra Important Tutorials with Chapters and Manually Written Subtitles / Captions
Stable Diffusion 3.5 Large How To Use Tutorial With Best Configuration and Comparison With FLUX DEV : https://youtu.be/-zOKhoO9a5s
FLUX Full Fine-Tuning / DreamBooth Tutorial That Shows A Lot Info Regarding SwarmUI Latest : https://youtu.be/FvpWy1x5etM
Full FLUX Tutorial - FLUX Beats Midjourney for Real : https://youtu.be/bupRePUOA18
Main Windows SwarmUI Tutorial (Watch To Learn How to Use)
How to install and use SwarmUI. You have to watch this to learn how to use it
Has 70 chapters and manually fixed captions : https://youtu.be/HKX8_F1Er_w
Cloud Tutorial (Massed Compute - RunPod - Kaggle)
If you don't have a powerful GPU, or you want to use a more powerful GPU, this is the tutorial you need
48 GB A6000 GPU is only 31 cents per hour on Massed Compute with our special coupon : https://youtu.be/XFUZof6Skkw
Free Kaggle Account Notebook for GPU-Poor
Installs latest version of SwarmUI on a free Kaggle account
Works with Dual T4 GPU at the same time
Supports SD 1.5, SDXL, SD3, FLUX, Stable Cascade and more :
Download from here : https://www.patreon.com/posts/106650931
00:00:00 Introduction to the tutorial
00:01:44 How to download, install and use Mochi 1 on Windows
00:03:59 How to update SwarmUI to the latest version to be able to use Mochi 1
00:04:17 How to start SwarmUI on Windows
00:04:27 How to set which GPU SwarmUI uses for generation
00:04:55 How to generate a video with Mochi 1, what are the best configurations
00:06:45 Where I have shared all the prompts I used to generate intro demo AI videos
00:07:30 Where to see step speed of your video generation and what are the speeds of RTX 3060 and RTX 3090
00:08:04 How to activate my primary GPU while also generating on my secondary GPU
00:08:25 Why queue system may not immediately start using your multiple GPUs and how to fix
00:09:45 How to solve out of memory error by enabling VAE tiling
00:10:02 Which parameters are best for VAE tile size and VAE tile overlap
00:10:53 How to use Mochi 1 and SwarmUI on Massed Compute cloud service - you don't need a GPU for this
00:11:13 How to apply our SECourses coupon to get a real 50% discount on the RTX A6000 GPU
00:11:37 How to connect initialized Massed Compute and start using Mochi 1
00:12:23 How to update SwarmUI to latest version on Massed Compute
00:12:51 How to start SwarmUI with public share to access from computer directly and use in computer browser
00:14:10 How to install and use Mochi 1 on RunPod with SwarmUI
00:16:45 How to monitor back-end loading of SwarmUI on RunPod
00:17:35 How to properly terminate your RunPod pod and Massed Compute instance to not lose any money
Repo : https://huggingface.co/genmo/mochi-1-preview
Model Architecture
Mochi 1 represents a significant advancement in open-source video generation, featuring a 10 billion parameter diffusion model built on our novel Asymmetric Diffusion Transformer (AsymmDiT) architecture. Trained entirely from scratch, it is the largest video generative model ever openly released. And best of all, it’s a simple, hackable architecture.
Alongside Mochi, we are open-sourcing our video VAE. Our VAE causally compresses videos to a 128x smaller size, with an 8x8 spatial and a 6x temporal compression to a 12-channel latent space.
An AsymmDiT efficiently processes user prompts alongside compressed video tokens by streamlining text processing and focusing neural network capacity on visual reasoning. AsymmDiT jointly attends to text and visual tokens with multi-modal self-attention and learns separate MLP layers for each modality, similar to Stable Diffusion 3. However, our visual stream has nearly 4 times as many parameters as the text stream via a larger hidden dimension. To unify the modalities in self-attention, we use non-square QKV and output projection layers. This asymmetric design reduces inference memory requirements. Many modern diffusion models use multiple pretrained language models to represent user prompts. In contrast, Mochi 1 simply encodes prompts with a single T5-XXL language model.
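Based on the compression factors quoted above (8x8 spatial, 6x temporal, 12 latent channels), the implied latent tensor shape can be sketched as below. This is my illustration, not code from the Mochi repo; the exact frame-count rounding is an assumption based on the 6k+1 frame convention discussed later in the tutorial.

```python
def mochi_latent_shape(frames: int, height: int, width: int):
    """Rough latent-tensor shape implied by the quoted compression factors.

    Assumption: frame counts follow the "multiple of 6 plus 1" convention,
    so the temporal length maps to (frames - 1) // 6 + 1. Spatial dims are
    divided by 8; the latent has 12 channels.
    """
    assert (frames - 1) % 6 == 0, "frame count should be a multiple of 6 plus 1"
    assert height % 8 == 0 and width % 8 == 0, "spatial dims should divide by 8"
    return (12, (frames - 1) // 6 + 1, height // 8, width // 8)

# e.g. a 49-frame 480x848 video:
print(mochi_latent_shape(49, 480, 848))  # (12, 9, 60, 106)
```

Multiplying out the per-pixel compression this way makes it clear why VRAM stays flat while only decode time grows with frame count.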
00:00:00 Greetings everyone. Today I am going to show you how to use the newest state-of-the-art
00:00:06 text-to-video model, Mochi 1 Preview. And I am going to use our newest favorite: the very easy,
00:00:14 very convenient, and very advanced SwarmUI. If you have been following me recently,
00:00:20 I am using SwarmUI for FLUX, for Stable Diffusion 3.5, and the newest models like Mochi 1, because
00:00:28 it has so many features. It uses ComfyUI as a backend, so it is super optimized,
00:00:36 super fast, and it gives us all the features that we need: wildcards, grid generation,
00:00:44 presets, image history, and anything else you need. And since it uses ComfyUI
00:00:50 as a backend, it is the best UI for
00:00:54 video generation models as well, and it is starting to support more
00:00:59 video models. Therefore, I really recommend you learn how to use SwarmUI. So today
00:01:05 I will show you how to use this amazing new Mochi 1 Preview in SwarmUI. This will be a public
00:01:13 tutorial; you will be able to access all the resources and links. Moreover, I will show how to
00:01:20 use it on a Windows computer with as little as a 12 GB GPU, and I will show how to use it on RunPod and Massed
00:01:28 Compute. So don't worry: even if you don't have a powerful GPU, you will be able to use this amazing model at amazing
00:01:36 prices on RunPod and on Massed Compute.
00:01:39 I don't know if you have a lower than 12 GB, GPU, it would work or not. First of all,
-
00:01:45 go to this page. This is a public post. The link will be in the description of the video.
-
00:01:49 This is the main
-
00:01:51 post that I use for SwarmUI tutorials.
-
00:01:53 And as I said, this is a public post. You can access this post from even private
-
00:01:58 window. You don't need to login or register or be subscriber of me. You will see that we have
-
00:02:04 these following tutorials. You can watch this latest one to learn how to install SwarmUI.
-
00:02:11 But I really recommend you to watch this cloud and main SwarmUI tutorials if you don't know
-
00:02:17 how to use SwarmUI. And in the very bottom of the post in the attachments, you will see the install
-
00:02:22 linux.sh file to install on RunPod. And you will see the windows installer.bat file. So as I said,
-
00:02:30 watch the tutorials to learn how to install SwarmUI. So how we are going to install and
-
00:02:36 use Genmo Mochi 1 preview video model. In this tutorial, I am going to use FP8 scaled
-
00:02:42 unified model. So click here and you will go to the ComfyUI Org Mochi preview repackaged.
-
00:02:49 Download the file from here and put it into your SwarmUI models Stable diffusion folder. That's it.
-
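If you are unsure where the downloaded file should go, a small sketch like this can build and check the target path. The `Models/Stable-Diffusion` folder name is my assumption of SwarmUI's default layout, and the model filename is hypothetical; adjust both to what your install actually shows. The sketch uses a temporary directory so it is safe to run as-is.

```python
import tempfile
from pathlib import Path

# Simulated layout; replace with your real SwarmUI install and download
# locations. "Models/Stable-Diffusion" is an assumed SwarmUI folder name,
# and the filename below is hypothetical.
root = Path(tempfile.mkdtemp())
swarmui_root = root / "SwarmUI"
downloaded = root / "Downloads" / "mochi_preview_fp8_scaled.safetensors"
downloaded.parent.mkdir(parents=True)
downloaded.write_bytes(b"...")  # stand-in for the real multi-GB download

target_dir = swarmui_root / "Models" / "Stable-Diffusion"
target_dir.mkdir(parents=True, exist_ok=True)
final_path = target_dir / downloaded.name
downloaded.rename(final_path)  # move the model into place

print(final_path.exists())  # True
```

After moving the file, refreshing the models list in SwarmUI (as shown in the video) should make it appear.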
00:02:55 However, you can also use our SwarmUI unified downloader. Go to here and we have the version
-
00:03:00 11. Click here to download. Extract the files into your SwarmUI installed folder. Like this.
-
00:03:07 Okay. Move the files into the main parent folder like this. You see my SwarmUI has been installed
-
00:03:13 here. So this is the main parent folder. This is the SwarmUI folder. And this is the folder where
-
00:03:18 you need to put the installer files. Then double click windows download model.bat file. More info
-
00:03:24 run anyway. And it will give you options to download all these models. If you remember,
-
00:03:29 I had introduced you FP8 FLUX development model. But the scaled version which works better and
-
00:03:35 really fast compared to the GGUF models. This model has been fixed and improved. Now it works
-
00:03:42 on RTX 4000 series as well like RTX 4090. I recommend you to check it out. How you gonna
-
00:03:49 download the Genmo Mochi model? You see the option 12. So just type 12 and hit enter and it will
-
00:03:55 start downloading the necessary model. Since I have previously downloaded it, it is ready.
-
00:04:00 Then enter inside your SwarmUI folder. Now this is very important. You need to first update your
-
00:04:05 SwarmUI to the very latest version. So double click update windows.bat file. It will update
-
00:04:11 the SwarmUI. Make sure that there are no errors and it is done. Then we double click the launch
-
00:04:18 windows.bat file and it will start. So the SwarmUI has been been started on my
-
00:04:24 local computer. So in the server in the back ends currently this is set for my the second
-
00:04:29 GPU. What I mean by that when I type nvidia-smi you can see my GPUs RTX 3090 TI and RTX 3060.
-
00:04:38 Let's also open the nvitop like this and you can see I am currently using 7000 megabytes. But it
-
00:04:45 is fine because this model is using very little amount of VRAM during the generation and in the
-
00:04:51 end. When it is reconstructing video it is using tiling. So go to the generate options and go to
-
00:04:57 the models. In here select the downloaded Mochi 1 preview FP8 scaled model. This is an all-in-one
-
00:05:05 model. It contains all the files that you need and click here display advanced options. Then
-
00:05:10 what are the recommended values? I recommend 40 steps however it may take a time depending on
-
00:05:17 your GPU. 20 is also fine. I recommend CFG scale 6. And this is the most crucial part. How many
-
00:05:24 frames you want to generate. The examples I have shown in the beginning are 121 frames.
-
00:05:31 So you may be wondering why this is 25 frame not 24. Because this is how the model works. It has
-
00:05:38 to be multiplication of the 6 plus 1. So you can increment this by 6 like this to 31 or 37 or 43.
-
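The two rules in this section — valid frame counts of the form "multiple of 6 plus 1", and playback duration being frames divided by fps — can be sketched as small helpers. The function names are mine, not part of SwarmUI, and the ~24 fps baseline in the last line is only what the tutorial's own numbers imply (49 frames described as roughly a 2-second video).

```python
def is_valid_frame_count(frames: int) -> bool:
    """Mochi frame counts must be a multiple of 6 plus 1 (25, 31, 37, ...)."""
    return frames >= 7 and (frames - 1) % 6 == 0

def video_seconds(frames: int, fps: int) -> float:
    """Playback duration: total frames divided by the chosen playback fps."""
    return frames / fps

print([n for n in range(25, 50) if is_valid_frame_count(n)])  # [25, 31, 37, 43, 49]
print(video_seconds(49, 8))   # 6.125 -> the roughly 6-second video from the tutorial
print(video_seconds(49, 24))  # ~2 seconds at the tutorial's implied native rate
```

Lowering the output fps stretches the same generated frames over more wall-clock seconds; it does not generate more motion.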
00:05:50 And as you increase the number of frames you can increase the number of frames you can increase the
-
00:05:52 number of frames it will increase your generation time. Not the VRAM usage but the generation time
-
00:05:58 will almost linearly increase. The fps, you can decide the fps of the video. If you set this as
-
00:06:05 8 you will get a 3 times longer video than your generated video. What I mean for example if I
-
00:06:13 generate 49 frames with the 8 fps I am going to get 6 second video. It is actually 2 second video
-
00:06:21 but it will be 6 second because of the frame rate. So let's make an example video generation and see
-
00:06:28 the speed. But keep in mind that currently I am recording I am using a lot of GPU power. You see
-
00:06:35 I am using 10% GPU utilization right now and using this amount of memory. So it will be slower than
-
00:06:43 you generate without recording anything. And I have shared all the prompts that I used to
-
00:06:47 generate the videos in the beginning. So they are here click this GitHub gist file and you will see
-
00:06:53 the prompts that I have used. So let's pick this prompt as an example and put it here. And click
-
00:06:59 the generate. Now as you increase the number of video frames at the end when it is decoding with
-
00:07:05 the VAE you may get out of VRAM error. You see currently this is still using my GPU ID 1. It
-
00:07:12 is generating this 49 frames video on my second GPU. And this is 3 times slower than my
-
00:07:21 RTX 3090 GPU. However it is working. So don't worry even if you have low VRAM GPU you will
-
00:07:28 get the result eventually. And what is the speed of this generation on this GPU if you wonder. Go
-
00:07:35 to the server. Go to the logs. Go to the debug. And you will see the step speed here. As I said
-
00:07:41 as you increase the number of frames the duration will linearly increase. So currently this is 26
-
00:07:49 seconds / it on RTX 3060 for 49 frames. So this will take around 500 seconds. Because it is 20
-
00:08:00 steps. 20 multiplied with 26. And meanwhile it is generating with my secondary GPU. I am going to
-
00:08:07 use my first GPU as well. So I will make this GPU ID 0 and save like this you see. Then I am going
-
00:08:15 to make another generation. Generate. Now this should start using my second GPU. Let's see if it
-
00:08:23 is going to start or not. Since it has a queue system it may not start. Therefore I may be needed to
-
00:08:30 order it to do 10 generation like this. Okay now it should start. And you see it is now generating
-
00:08:37 video on my both of the GPUs and the speed you see here. So on my RTX 3090 it is 9.39 seconds right
-
00:08:46 now. It would be faster if I was not doing anything else on my GPU. So it would
-
00:08:52 take let's say 9 seconds maybe 8 seconds with 20. It would take only like 3 minutes, 4 minutes,
-
00:09:00 3 minutes on my GPU. So if you have 4090 GPU it would take less than 1 minute for 20 steps and 49
-
00:09:10 frames. It is really fast. And it is really easy and convenient to use with the SwarmUI. And if you
-
00:09:16 increase the frame count don't worry. It will just almost linearly increase. The VRAM usage
-
00:09:22 will stay same. So you can also generate very long videos like 10 second video. The generation with
-
00:09:28 my RTX 3090 Ti has been completed. I clicked it here and interrupted all sessions. There is one
-
00:09:35 another thing that you can change which is the output file. However webp is also working fine.
-
00:09:41 You can try each one of them and see which one of them is working best for your case. So what
-
00:09:46 if. If you get. Out of memory error like this. Which you shouldn't get on Windows because of
-
00:09:52 the shared VRAM. How to fix it? The fix is very easy. Go to the advanced sampling here and set the
-
00:10:00 sample size. How? VAE tile size as equal as 160 and VAE overlap tile overlap as 96. And with this way
-
00:10:12 you can generate very long videos even 10 second video maybe even longer than 10 second video. I
-
00:10:17 have tested this. So it should work even on the lower end GPUs. And how did I come up with these
-
00:10:25 values. The developer of the SwarmUI has done some testing and he recommends VAE tile size as
-
00:10:33 160 and VAE tile overlap as 96. Also make sure that you set this on cloud machines as well if
-
00:10:43 you are going to generate longer videos. As I am going to show in the upcoming part of the video.
-
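As a rough sanity check on those tiling values, here is a generic sketch of how overlapped tiling covers one axis of the decode. The `tile - overlap` stride is the standard pattern for overlap-blended VAE decoding; I have not verified that SwarmUI's implementation matches it exactly, so treat this as an illustration of why 160/96 yields many small, memory-friendly decode passes.

```python
import math

def tile_count(extent: int, tile: int, overlap: int) -> int:
    """Number of tiles needed to cover `extent` pixels with overlapped tiles.

    Each new tile advances by (tile - overlap), the usual stride for
    overlap-blended tiled decoding. Generic illustration, not SwarmUI code.
    """
    if extent <= tile:
        return 1
    stride = tile - overlap
    return math.ceil((extent - tile) / stride) + 1

# With the recommended tile size 160 and overlap 96 (stride 64):
print(tile_count(480, 160, 96))  # 6 tiles along a 480-pixel axis
print(tile_count(848, 160, 96))  # 12 tiles along an 848-pixel axis
```

Smaller tiles mean more decode passes (slower) but a much lower peak VRAM footprint, which is why these settings matter for long videos.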
00:10:48 So now I am going to show how to use this amazing model with SwarmUI on RunPod and Massed Compute.
-
00:10:54 If you have not registered Massed Compute before please use this link to register. I appreciate it.
-
00:10:59 You see there is Massed Compute instructions here. You should always read here. After registration
-
00:11:04 setting up your billing. Go to the deploy here. And I recommend to use RTX A6000 GPU,
-
00:11:11 Creator, SECourses. These are all as usual as before. Verify. And deploy. That's it. You see
-
00:11:19 this is only 31 cents per hour for this amazing, very powerful GPU. 48 GB of RAM, 48 GB of VRAM,
-
00:11:27 256 GB of storage but this is many many times faster than RunPod. For both downloading and for
-
00:11:34 both disk speeds. So as usual I will connect with our ThinLinc client. Then I will use the unified
-
00:11:43 downloader through my synchronizing drive. So drag and drop into download folder from
-
00:11:50 my synchronizing drive. And right click and extract here. I will use it but you can also
-
00:11:56 manually download. So the instructions are here. Let's just open and copy this. And start a new
-
00:12:02 terminal. Paste. And all I need to do is just select option 12. You can also download other
-
00:12:08 models from here. And the download has started. Currently Hugging Face Limited single connection
-
00:12:14 downloads to 40 MB. So it is going to take like 6 minutes. It is fine.
-
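The 6-minute estimate follows directly from the capped transfer rate. Here is the back-of-the-envelope; the ~14 GB model size is my assumption for illustration, since the exact file size is not stated in the video.

```python
def download_minutes(size_gb: float, rate_mb_per_s: float) -> float:
    """Estimated download time in minutes at a fixed transfer rate."""
    return size_gb * 1024 / rate_mb_per_s / 60

# Hugging Face capping a single connection at ~40 MB/s, assumed ~14 GB file:
print(round(download_minutes(14, 40), 1))  # ~6 minutes
```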
00:12:20 Meanwhile let's also start SwarmUI. So first of all you need to update your SwarmUI. This
-
00:12:26 is the update icon here. Double click it. It will update the SwarmUI to the latest version. It is
-
00:12:33 pre-installed and the update will take only like 1 minute. Maybe lesser. Let's see. And the update
-
00:12:40 has been completed and the SwarmUI has started. So I can use it inside the ThinLinc client.
-
00:12:47 Like this. With these pre-downloaded models. However, if you want to use it faster on your PC,
-
00:12:54 it is way easier. Just close this and you will see that there is Run Cloudflare SwarmUI here.
-
00:13:01 You see here. Double click it. It will start the SwarmUI with a public link that you can use on
-
00:13:08 your computer. It is here. You see just open link. This is using the Cloudflared. Just refresh until
-
00:13:15 it arrives. And yes, now I will copy this link and open a page in my computer and you see I am using
-
00:13:22 the SwarmUI from my Windows computer very fast and this is running on the Massed Compute. So even
-
00:13:29 if you don't have any GPU, it will not make any difference because this is running on the Massed
-
00:13:34 Compute cloud computing. Now all I need to do is just wait for the Mochi model to be downloaded.
-
00:13:40 So the download has been completed. It took like 7 minutes. However, it could be way faster because
-
00:13:48 of that I just opened an issue on Hugging Face to make my scripts download way faster hopefully.
-
00:13:55 So now I need to refresh the models folder. Go to models. Click refresh icon and you will see
-
00:14:01 the all-in-one Mochi 1 preview. Click it. And then click display advanced options as in the
-
00:14:07 Windows tutorial part. And the rest is exactly the same. And now it is time for me to show you how to
-
00:14:14 use the use this model on the RunPod. So go to the RunPod
-
00:14:18 instructions. Follow the older tutorials if you don't know. Please use this link to register. I
-
00:14:24 appreciate that so that I can get more credits and I can test them. I am not getting paid
-
00:14:29 any money from them. I am using the credits to make more tutorials to you. Go to deploy. I don't
-
00:14:35 recommend community cloud recently because it is way slow. I am using US Texas 3. This is my
-
00:14:40 favorite. And I am going to show with RTX 4090. My newest template is, let me show you which one.
-
00:14:47 It is Torch 2.2.0. This is the best newest one. And set the volume disk as you wish. Let's set it
-
00:14:55 like 50 gigabytes it is sufficient. This is really important if you remember this is the port that
-
00:15:00 we need and set overrides. I am not going to show entire thing because it is always same with the
-
00:15:04 previous videos. Nothing has changed. Download the install linux.sh file. Click connect. Connect
-
00:15:10 to Jupyterlab. I feel like this pod is so slow because it took a lot of time for to
-
00:15:16 get prepared. So I will rent another pod. If this happens, I usually rent another pod to save my
-
00:15:23 time. So when you run multiple pods, you can calculate their starting speed and decide which
-
00:15:29 one of them is working better. And this one has better statistics here. So I will delete
-
00:15:34 the other one. Unfortunately, it is very very non-stable with the RunPod. Sometimes you
-
00:15:40 may get a good pod but it is very rare. But mostly your pods will be very bad. Recently.
-
00:15:47 Unfortunately. With Massed Compute, it is never the case. Okay connect to Jupyterlab.
-
00:15:52 This one initialized faster. Click this arrow. Select install and SwarmUI model file. Follow
-
00:15:59 the instructions here for install. Okay it will start installation. Meanwhile let's also extract
-
00:16:05 here archive and refresh and then open the RunPod instructions. You can also manually download.
-
00:16:11 By the way, this is not. I don't recommend anymore. Open a terminal. Copy paste and select the model.
-
00:16:17 So while installing we will also download the model to speed up our initialization part.
-
00:16:23 Everything is same as before. I agree, customize, modern dark, this, ComfyUI local. I'm not going to
-
00:16:30 download anything so it will be fast and you see meanwhile installing I am also downloading the
-
00:16:35 model and making everything ready. I am doing the multiple things at the same time. So the SwarmUI
-
00:16:41 installation has been completed. However, model is still getting downloaded. Moreover you need
-
00:16:46 to wait until backends are loaded on the server. To see what's happening. Go to logs. Go to debug and
-
00:16:52 watch here. On the Massed Compute we already have installed. So you don't wait anything. You see it
-
00:16:58 is trying to fix all the libraries at the moment. Okay the model has been downloaded. You can also
-
00:17:04 download manually with wget as in my another tutorial that I have shown. And we are still
-
00:17:10 waiting backends to be loaded. However, if this fails forever, if you see repeating messages here,
-
00:17:17 you need to restart your pod. Okay. So it was able to load the backend as you are seeing
-
00:17:23 right now. It took a lot of time to install. Let's go to the generate tab, click models,
-
00:17:28 refresh and select the model and it is same as on the Windows, display advanced options text to
-
00:17:35 video. So if you want to terminate and not spend your money with RunPod close from here. Stop pod.
-
00:17:43 This will still use your money, but it will be costing less and you can resume again without
-
00:17:50 installing again. And when you terminated, you will lose everything. With the Massed Compute,
-
00:17:55 even if you stop instance, you will still be fully charged. Therefore, you need to delete
-
00:18:00 your instance and delete everything permanently. I hope you have enjoyed. Please follow us on
-
00:18:07 Patreon. Please join our discord channel. We have over 9000 members. When you click here,
-
00:18:12 you will see it. Please also go to our GitHub repository and please Star our repository.
-
00:18:19 Fork it. Watch it. You see, we have a lot of followers, stars, and if you sponsor me,
-
00:18:24 I appreciate that. And our discord server has over 9000 members. So come and ask and chat
-
00:18:31 with me and chat with everyone. Hopefully see you in another amazing tutorial video.
