
LTX 2 and Z Image Base Full Tutorial Audio to Video Lip Sync ComfyUI SwarmUI Windows Cloud

FurkanGozukara edited this page Jan 31, 2026 · 1 revision

LTX 2 & Z Image Base Full Tutorial + Audio to Video Lip Sync + ComfyUI + SwarmUI + Windows + Cloud



LTX 2 is the newest state-of-the-art (SOTA) open-source video generation model, and this tutorial will show you how to use it in the very best and most performant way in both ComfyUI and SwarmUI. Moreover, the Z Image Base model has been published, and I will show how to use Z Image Base with the most amazing preset and workflow as well. Furthermore, this tutorial will show you how to install, update, set up, and download ComfyUI, SwarmUI, the models, presets, and workflows, both on Windows and on RunPod, Massed Compute, and SimplePod. Linux users can use the Massed Compute scripts and installers directly. This is a masterpiece, entire-lecture-level complete tutorial. This video will kickstart your AI journey 100x, both on local Windows and in the cloud.

📂 Resources & Links:

🤖 ComfyUI Installer and Presets Zip File With CUDA 13: [ https://www.patreon.com/posts/ComfyUI-Installers-105023709 ]

💻 SwarmUI Installer and Presets Zip File: [ https://www.patreon.com/posts/SwarmUI-Install-Presets-114517862 ]

🚀 Model Downloader Zip File: [ https://www.patreon.com/posts/Model-Downloader-114517862 ]

🚆 SECourses Musubi Trainer (Model Quantize and Train App): [ https://www.patreon.com/posts/SECourses-Musubi-Trainer-137551634 ]

🛠️ Image Comparison Slider Tool: [ https://www.patreon.com/posts/image-video-comparison-slider-app-133935178 ]

👋 SECourses Discord Channel for 24/7 Support: [ https://bit.ly/SECoursesDiscord ]

ℹ️ SimplePod Register: https://simplepod.ai/ref?user=secourses

ℹ️ SimplePod Template: https://dash.simplepod.ai/account/explore/100/ref-secourses/

ℹ️ RunPod Register: https://get.runpod.io/955rkuppqv4h

ℹ️ RunPod Template: https://get.runpod.io/SECourses_CU13

ℹ️ Massed Compute Register: https://bit.ly/SECoursesMassedCompute

⏱️ TIMESTAMPS

00:00:00 Intro: ComfyUI + SwarmUI presets, Z-Image, model downloader, cloud installs

00:00:28 Free prompt enhancement with Google AI Studio (prepared prompt file)

00:01:26 Demo: 45s audio-driven lip-sync image→video (LTX 2)

00:02:25 Quick demos: ComfyUI image→video, text→video, Z-Image base

00:03:43 Quick demos: SwarmUI LTX 2 image→video + Z-Image base

00:04:46 Install/update presets zip (v78+): extract & overwrite everything

00:05:11 Upgrade to CUDA 13 safely: delete venv then run ComfyUI update/install

00:06:00 Windows prereqs + the always-updated setup guide referenced in description

00:06:42 Install required node bundles: (1) SwarmUI extra nodes + (100) LTX audio

00:07:29 VRAM-optimized launcher: no-VRAM / cache / smart-memory / precision choices

00:09:36 Share one model library: configure extra_model_paths.yaml (no duplicates)

00:10:25 Model Downloader overview: set base path + one-click bundles for SwarmUI/ComfyUI

00:11:43 Download LTX 2 core bundle: multi-connection download + merge + hash verify

00:12:12 Low-VRAM path: GGUF distilled models vs recommended FP8-scaled defaults

00:14:16 URL Downloader: Civitai/HF links, folder targeting, and optional API keys

00:14:54 ComfyUI preset pack tour: LTX2 (I2V/T2V/audio) + Z-Image (base/2× upscale)

00:15:16 ComfyUI audio lip-sync preset: image + resolution + audio setup

00:16:58 Frames & prompting: 24fps math, run/stop frame count, lyrics/subtitles prompts

00:17:59 Quality/perf knobs: CRF, VRAM monitoring, and low-VRAM args recap

00:20:22 Review result + move to LTX 2 image→video preset workflow

00:21:50 Prompt enhancer workflow: drag prompt file into AI Studio (optionally add image)

00:24:11 Z-Image troubleshooting: disable Sage Attention and restart

00:24:32 Z-Image base + 2× upscale preset: when to use it and what to expect

00:26:47 Outputs & reproducibility: where renders save + drag PNG to reload metadata

00:27:57 Update SwarmUI via zip: get latest presets + utilities

00:28:27 SwarmUI setup: ComfyUI backend, passing args, and pointing to model folders

00:30:21 SwarmUI image→video: direct apply + init image workflow

00:31:17 Fix model load/caching issues: add --use-cache-none when needed

00:32:51 SwarmUI text→video + upscale: duration/frames, half-res then upscale rules

00:33:52 SwarmUI outputs: output_local/raw + metadata saved with generations

00:35:05 SwarmUI Z-Image: base vs 2× upscale comparison + speed notes

00:36:58 Image comparison slider tool: quick before/after inspection

00:37:51 RunPod start: template choice, CUDA/driver constraints, optional storage volume

00:43:30 RunPod Jupyter: upload ComfyUI zip, extract, install bundles (1+100)

00:46:28 RunPod: run Model Downloader, start ComfyUI, connect via exposed port

00:52:05 RunPod: switch to SwarmUI, map folders (case-sensitive), import presets

00:55:43 RunPod: download outputs as archive + stop vs terminate cost control

00:57:55 SimplePod: cheaper/faster alternative + persistent volume setup highlights

01:03:52 Massed Compute: deploy with coupon + connect via ThinLinc (shared folder setup)

01:08:17 Massed Compute: install bundles + download models (disk speed advantage)

01:10:48 Massed Compute: start ComfyUI, connect from PC, run lip-sync preset

01:14:26 Massed Compute: hook SwarmUI to ComfyUI backend + text→video demo

01:17:02 Wrap-up: recap providers + next steps

#comfyui #comfyuitutorial #runpod

Video Transcription

  • 00:00:00 Greetings everyone. Today I am going to show  you LTX 2 workflows, presets in ComfyUI. I

  • 00:00:07 have prepared all the configurations, one  click to install and use. Moreover, I will

  • 00:00:12 show Z-Image base model in ComfyUI. Moreover, I  will show you how to use the LTX 2 in our famous

  • 00:00:21 SwarmUI with presets, Z-Image base in SwarmUI  with presets. Furthermore, I will show you how

  • 00:00:28 to improve your prompts in Google AI Studio for  free with our prepared commands file. Moreover,

  • 00:00:36 downloading these preset models will be  one click with our interface updated model

  • 00:00:42 downloader. This is independent from both ComfyUI  and SwarmUI. It has both Z-Image and LTX 2 models.

  • 00:00:51 I also have prepared Z-Image Quant FP8 scaled  model for you and also working on NVFP4 models

  • 00:00:59 and NVFP4 generator. This application will have  everything SECourses Musubi Trainer. It will

  • 00:01:06 have all the options to quantize your models.  Also, I will make fresh installation on RunPod,

  • 00:01:13 Massed Compute, and SimplePod so that we will  refresh our memory to how to install and use

  • 00:01:20 ComfyUI and SwarmUI on cloud platforms in  addition to your local Windows computer.

  • 00:01:26 I am going to show you an amazing workflow  in ComfyUI which you can provide your audio,

  • 00:01:32 image, prompt and generate amazing audio driven

  • 00:01:35 video. It has lip sync and it has animation. Let me demonstrate.

  • 00:01:40 SECourses. Let's go. Late night screen glow. Ideas  on fire. Turning what if into something higher.

  • 00:01:48 From zero to flow. We learn it fast. Build it  ship it make the future last. New tools new

  • 00:01:53 rules we break the doubt. Take that prompt watch  the AI route. If you want the shortcuts done the

  • 00:01:57 right way. Hit play let's level up today. Hands  on the keyboard mind on the dream. Watch the

  • 00:02:02 model run like a laser beam. No fear no fluff just  clarity. One more step to your next victory. SECourses

  • 00:02:08 AI on the rise. Lighting up the dark  with electric skies. From prompts to pixels

  • 00:02:13 code to sound. We make it real we run this ground.  That's courses learn build repeat. Turn big ideas

  • 00:02:19 into something elite. If you want the best place  to start. That's courses straight to the heart.

  • 00:02:25 As you have seen it was amazing video. It was  45 seconds. After like 30 seconds 35 seconds

  • 00:02:32 it starts to degrade but this is what LTX 2  can do. This will be literally one click to

  • 00:02:39 install and use. I will show that. So here  another example in ComfyUI. This is image

  • 00:02:43 to video workflow. It is also one click to use  and I will show in SwarmUI as well. Let's see.

  • 00:02:50 This is the tutorial that you  have been waiting for so long.

  • 00:02:53 I mean amazing. You see it did a very  nice emotion at the end. The LTX video

  • 00:02:59 can generate up to 20 seconds very  good. It can generate up to 1 minute

  • 00:03:03 but after 20 seconds it starts to  degrade. Let's see text to video.

  • 00:03:13 You took a wrong turn partner.

  • 00:03:18 By the way these were quick demos that I  did. They are not perfect. They are not

  • 00:03:22 cherry picked. And this is the Z-Image workflow. This is amazing quality. So many people

  • 00:03:29 are not using it accurately, and as you see, this is the kind of quality Z-Image base can generate

  • 00:03:35 in ComfyUI. All is preset I will show. Now let  me show you the examples on the SwarmUI. This

  • 00:03:43 is image to video in SwarmUI. It is so easy to  use with our preset. I will show how to use.

  • 00:03:49 This is the tutorial that you  have been waiting for so long.

  • 00:03:58 As you have seen it is amazing.  These are quick demos so you can

  • 00:04:02 generate even much better videos. So here  another example in SwarmUI. This is video.

  • 00:04:12 Faith falls! Witness me!

  • 00:04:17 With video upscalers they would become much better  or with more tries or improving the text prompt.

  • 00:04:24 These are quick demos. And we have the Z-Image  base preset in SwarmUI as well. It is working

  • 00:04:30 amazing. Amazing quality, amazing realism and  it is following the prompts very well very good.

  • 00:04:36 So you should begin with upgrading your ComfyUI  if you haven't yet or download the latest zip

  • 00:04:41 file to obtain the all the presets that we  have. So the link will be in the description

  • 00:04:46 of the video. Download the latest zip file  version 78. It can be bigger version when

  • 00:04:51 you are watching this tutorial. Then copy  the downloaded zip file. Go to your ComfyUI

  • 00:04:56 installation or into a fresh folder. Both works.  Then you need to extract archive and overwrite all

  • 00:05:04 the files. This is really important. Overwrite  all. Then if you didn't upgrade to CUDA 13 yet,

  • 00:05:11 enter inside ComfyUI, delete your virtual  environment folder like this. This is proper

  • 00:05:17 way of upgrading to CUDA 13 if you haven't yet. Since my ComfyUI is open, it is preventing me

  • 00:05:23 from deleting the virtual environment folder. So let me delete all of them. Try again. Deleting

  • 00:05:29 virtual environment will not cause you to lose  any data. Then install or update ComfyUI run.

  • 00:05:36 This is not mandatory every time. This is  mandatory when you are upgrading from CUDA

  • 00:05:40 12.9. So I am using Python 3.10. You need to have  this installed in your computer. And this is the

  • 00:05:47 version that I recommend. This is same for fresh  installation and upgrade. We use the same file.

  • 00:05:54 If you are first time into generative AI you  really should follow Windows requirements

  • 00:06:00 tutorial. When you scroll down a little bit you will see this. We have an up-to-date video

  • 00:06:06 here so watch it and we have an up to date  post which you will follow while you are

  • 00:06:12 watching this tutorial video. It has all the  links instructions. It is fully up to date.

  • 00:06:17 Last time updated is 8 January 2026. I  am using exactly as in here so this is

  • 00:06:26 identical to my setup so therefore when  you watch this tutorial all of the AI

  • 00:06:31 applications will work on your PC perfectly  fine on your computer perfectly fine.

  • 00:06:36 So the ComfyUI installation and update has been  completed. To use the newest presets you need to

  • 00:06:42 use the Windows custom nodes bundles installer.  Double click it. This is a unified bundles

  • 00:06:49 installer that I have made for you. So you need  to install option 1 SwarmUI extra nodes to use

  • 00:06:55 the presets. Plus you need to install bundle 100  LTX audio to video bundle. So when you install

  • 00:07:02 these two you will be ready. So 1 and comma and  100 and hit enter. It will ask you to proceed.

  • 00:07:09 You can hit Y or enter and it will install the  necessary bundles and update them. So this is for

  • 00:07:16 both update and installation. This covers both  of them. You see it is updating my bundles as

  • 00:07:22 well and it is all getting updated. Okay all the  nodes have been installed or updated properly.

  • 00:07:29 Then I also have added  windows_run_vram_optimized.bat

  • 00:07:34 file. When you run this it will ask you to start  your installation with specific configuration.

  • 00:07:40 So if you are getting out of VRAM options  you can do this. You can start with no VRAM

  • 00:07:46 mode. When you do this you can even generate 1  minute videos on LTX 2. This will use very very

  • 00:07:52 minimal amount of VRAM memory. Actually let me demonstrate. So you see currently I am using

  • 00:07:57 2 megabytes of VRAM. So I will choose the option  5. And if you are also low on RAM memory you can

  • 00:08:04 start with no cache. So it is option 3. You can  also disable smart memory. So let's disable as

  • 00:08:12 well. And it will be GPU default or you can  also choose VAE on CPU. I will make it GPU.

  • 00:08:19 I am going to use Sage Attention. You shouldn't  use Sage Attention for Z-Image base model at the

  • 00:08:24 moment but for all other models you should  use it. Currently Z-Image base model is not

  • 00:08:29 working with Sage Attention. And the precision  I am going to just make it auto. And for this

  • 00:08:34 option just hit yes. Additional optimizations I  am just going to leave it auto. Disable pinned

  • 00:08:41 memory no. And it is all done. So you see it  shows you the configuration options. You can

  • 00:08:48 do this if you don't want to pick these every  time. Copy this windows run GPU and paste. So

  • 00:08:55 you see I have copied version. Right click edit in  Notepad++. So you can append these options to here

  • 00:09:04 like this. You see or like this. So this way you  can customize your starting bat file and use it

  • 00:09:13 to start directly with all these optimizations  you want. This is also how you remove the sage

  • 00:09:21 attention from run GPU here. So you can also  edit this file and remove the Sage Attention

  • 00:09:30 from here to start without Sage Attention. This  is how you customize your starting with ComfyUI.
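The customized startup described above boils down to appending a few arguments to the launch line in your copied .bat file. A minimal sketch, assuming the flag spellings from ComfyUI's own command line (`--novram`, `--cache-none`, `--disable-smart-memory`, `--use-sage-attention`); your exact .bat layout may differ:

```python
# Assemble the low-VRAM launch arguments discussed above into one string.
# Flag spellings follow ComfyUI's CLI; the selection mirrors the options the
# VRAM-optimized launcher asks about (an illustrative subset, not all of them).
flags = [
    "--novram",                # offload aggressively; slowest but lowest VRAM
    "--cache-none",            # don't cache models in system RAM (low-RAM machines)
    "--disable-smart-memory",  # force unloading between workflow stages
    "--use-sage-attention",    # skip this flag for Z-Image base (currently incompatible)
]
extra_args = " ".join(flags)
print(extra_args)
# Append extra_args to the python launch line inside your copied run-GPU .bat.
```

The same argument string can later be pasted into SwarmUI's ComfyUI backend "extra arguments" field, as the video notes.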

  • 00:09:36 Moreover to use the SwarmUI downloaded models you  need to do this. Copy extra model paths into your

  • 00:09:43 ComfyUI folder. Then right click and edit it.  Then select the models base path. This is where

  • 00:09:51 I have downloaded all of my models so therefore  when I give this it sees all the models inside

  • 00:09:57 my SwarmUI installation folder or wherever I  want. You can add multiple base paths to here.
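As a rough sketch, an entry in extra_model_paths.yaml looks like the fragment below. The section name and folder mappings here are illustrative, based on the example file ComfyUI ships — keep the mappings your copied file already contains, and only change base_path to the folder holding your SwarmUI (or other) models:

```yaml
# Illustrative extra_model_paths.yaml entry — section name is arbitrary,
# folder keys are examples; only base_path must point at your model library.
swarmui_models:
    base_path: D:/SwarmUI/Models
    checkpoints: Stable-Diffusion
    vae: VAE
    loras: Lora
```

You can add more sections like this one to map several model libraries at once, which is what lets ComfyUI and SwarmUI share a single central model folder without duplicates.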

  • 00:10:04 Okay so the ComfyUI started with the minimal  amount of VRAM memory usage. So let me show you

  • 00:10:11 45 seconds audio to video generation. I have the  preset here. How you are going to use them? You

  • 00:10:18 need to first download the models. To download the  models we are going to use the model downloader

  • 00:10:25 which is included in SwarmUI model downloader  zip file. This is also how we are going to

  • 00:10:29 install SwarmUI updated and get the presets.  So let's download this. The link will be in

  • 00:10:34 the description of the video. So move the SwarmUI  model downloader into your SwarmUI installation

  • 00:10:40 or into any drive make a fresh installation.  Right click. Then you need to extract here and

  • 00:10:46 overwrite all the files. This is super important.  Overwrite all the files so that it will update the

  • 00:10:53 utilities and everything. Then double click  Windows start download models application

  • 00:10:57 file. Independent from SwarmUI. You don't need to  install SwarmUI for this. This supports ComfyUI

  • 00:11:05 as well. So this is our new interface. This is  where you give the base model path. So you see

  • 00:11:11 currently it is seeing automatically my SwarmUI  models. You can change this into your ComfyUI

  • 00:11:17 models as well. So enter inside here. Go to  the models and copy the path. Copy paste it

  • 00:11:24 here. You can also remember settings. And ComfyUI  folder structure. Then you need to download the

  • 00:11:30 LTX 2 video core bundle or Z-Image core bundle  if you want to use them. But I am using SwarmUI

  • 00:11:38 mainly therefore I am downloading models there.  So refresh and it will return back to original

  • 00:11:43 since I didn't remember settings. Then I need  to click download all models of the LTX 2 core

  • 00:11:51 bundle. Then it will start downloading with 16  connections. You can also drag and drop this

  • 00:11:56 like this to expand or make it small or click  here. And it will start the downloads. It will

  • 00:12:01 do hash verification. It will download all the  models that I have prepared for you. You can

  • 00:12:06 also see their names here. If you have a very low  VRAM GPU alternatively go to the video generation

  • 00:12:12 models. Go to the LTX 2 video models and you  can download the LTX 2 distilled GGUF model.

  • 00:12:22 Which is here. So you can download GGUF Q4 model  if you are on very low RAM and VRAM. This is not

  • 00:12:30 recommended. Normally I use FP8 scaled. This is  also what the bundle downloads. And additionally

  • 00:12:35 you can also download FP4 mixed text encoder but  by default I am using this one. This is mixed FP8

  • 00:12:43 scaled. The name is like this because SwarmUI  uses this automatically and my presets are also

  • 00:12:49 set for this so don't worry the presets will  be one click. So follow the download process

  • 00:12:54 here. It will download all the original models  as you are seeing very fast. 100 megabytes per

  • 00:13:01 second on my computer. Then it will verify them  then you will be ready to start using them.

  • 00:13:06 Moreover since I am using the extra_model_paths.yaml file inside ComfyUI when I start my ComfyUI you

  • 00:13:12 see it adds all the models that I have given it.  So it sees all the models that I have in SwarmUI

  • 00:13:20 in ComfyUI. Therefore I am avoiding all the model  duplication. And I am using central model folder.

  • 00:13:28 You can also modify this file according to  your needs. Add more paths or more mapping.

  • 00:13:34 It works. SwarmUI also has this feature. You  can give another base model path and it see

  • 00:13:40 all of them. So you can have all of your models  inside ComfyUI and also use in SwarmUI as well.

  • 00:13:46 Since model downloader downloads with 16  connections it will merge all the downloaded

  • 00:13:52 pieces into single file at the end. So you will  see like this it is merging then it will verify

  • 00:13:59 the hash. So now it is verifying its hash value.  This is also working on cloud and Linux machines.

  • 00:14:05 This way we are ensuring that we have accurately  downloaded the models. Therefore we will never

  • 00:14:10 have any corrupted model at all. Moreover this  downloader application supports custom model

  • 00:14:16 downloads as well. So let's hide this. Go to the  URL downloader. In here you just need to give the

  • 00:14:23 model path. It can be CivitAI. It can be Hugging  Face. And you can select the folder where it will

  • 00:14:29 download so that you can download your custom  models this way on cloud machines or on your

  • 00:14:35 Windows computer very fast. You can also provide  your API keys to download even faster. This is

  • 00:14:42 very very useful especially on RunPod on Massed  Compute to download fast into the accurate folder.
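The multi-connection pattern described above — split the file into byte ranges, download them in parallel, merge the pieces, then verify the hash — can be sketched offline like this. This is an illustrative simulation of the idea, not the actual downloader's code:

```python
import hashlib

def split_ranges(total_size: int, connections: int = 16):
    """Inclusive byte ranges, one per connection, covering the file exactly."""
    chunk = total_size // connections
    ranges = []
    for i in range(connections):
        start = i * chunk
        # Last range absorbs the remainder so nothing is dropped.
        end = total_size - 1 if i == connections - 1 else start + chunk - 1
        ranges.append((start, end))
    return ranges

# Simulate: "download" each range of a payload, merge, then verify the hash.
payload = bytes(range(256)) * 100                    # stand-in for a model file
pieces = [payload[s:e + 1] for s, e in split_ranges(len(payload))]
merged = b"".join(pieces)

assert merged == payload                             # merge reproduced the file
print("sha256:", hashlib.sha256(merged).hexdigest()[:16], "... verified")
```

In the real tool each range would be an HTTP Range request, and the final sha256 is compared against a known-good value — which is why a corrupted download is caught before you ever load the model.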

  • 00:14:48 Okay so all the models have been downloaded.  Let's return back to our ComfyUI. Let's refresh.

  • 00:14:54 Then inside ComfyUI extracted zip file you will  see there are presets. So the newest presets are

  • 00:15:02 LTX 2 image to video, text to video, audio lip  sync and Z-Image base and Z-Image base with 2x

  • 00:15:09 upscale. For Z-Image base don't forget you need to  download same way Z-Image core bundle. Okay first

  • 00:15:16 I will show you LTX 2 audio to lip sync image  to video. Drag and drop it. So it is loaded. You

  • 00:15:24 shouldn't have any red or warnings here if you  have followed the steps that I have just shown

  • 00:15:29 you. This is extremely high quality and fully  optimized to work on low VRAM GPUs and systems

  • 00:15:36 as well. So first of all you need to pick your  image. Let's choose file to upload. I am going

  • 00:15:43 to use this anime image. Then you need to set  your width and height accurately. This is 9

  • 00:15:51 to 16 therefore I am going to use 720 to 1280.  If you are not sure how to set your width and

  • 00:16:00 height accurately I recommend you to use SwarmUI.  Unfortunately this preset doesn't exist in SwarmUI

  • 00:16:08 because it doesn't support it. It uses a lot  of custom nodes. Then quick tools reset params

  • 00:16:14 to default. I am going to use my LTX 2 image to  video. Then you can see that it will show you the

  • 00:16:21 resolution like this and init image. So choose  the same file in here. Resolution use closest

  • 00:16:27 aspect ratio. You see this image is 720 to 1280  pixel for this model. Okay it is set. Then you

  • 00:16:35 need to set your audio file which will be driving  audio. I am going to use this 45 seconds audio.
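The width/height selection described above — match the generation resolution to the source image's aspect ratio, e.g. 720×1280 for a 9:16 portrait — can be sketched like this. The candidate list below is illustrative, not the model's official resolution table:

```python
# Pick the candidate resolution whose aspect ratio is closest to the input
# image's. CANDIDATES is a hypothetical list for illustration only.
CANDIDATES = [(1280, 720), (720, 1280), (1024, 1024)]

def closest_resolution(img_w: int, img_h: int):
    target = img_w / img_h
    return min(CANDIDATES, key=lambda wh: abs(wh[0] / wh[1] - target))

# A 9:16 portrait source (e.g. 1080x1920) maps to 720x1280, as in the video:
print(closest_resolution(1080, 1920))  # -> (720, 1280)
```

This is also what SwarmUI's "use closest aspect ratio" option does for you automatically, which is why the video recommends it when you are unsure.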

  • 00:16:44 SECourses. Let's go.

  • 00:16:48 Okay this way you can verify your audio is  working. So you see there is audio length in

  • 00:16:53 seconds. So first set your audio length.  You can also set audio start index here.

  • 00:16:58 Then when you click this run and stop  immediately you will see that it is

  • 00:17:04 showing you how many frames you need here. So  I need 1081 frames for this video. After that

  • 00:17:11 you need to write your prompt. When you are  using audio this is how you should prompt it.

  • 00:17:19 Describe your audio in details plus write the full  spoken or singing like this. You see this is the

  • 00:17:30 lyrics of my audio. If it was a speaking it would  be the subtitles transcription of the speaking.
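The frame count the workflow reports follows the model's 24 fps timing: frames = seconds × 24 + 1, which is why the 45-second audio above needs 1081 frames. A minimal sketch of that arithmetic:

```python
FPS = 24  # LTX 2 generates at 24 frames per second

def frames_for_seconds(seconds: float) -> int:
    # One extra frame for the starting frame: seconds * fps + 1
    return int(seconds * FPS) + 1

print(frames_for_seconds(45))  # 45 s audio -> 1081 frames
print(frames_for_seconds(5))   # 5 s clip   -> 121 frames
print(frames_for_seconds(10))  # 10 s clip  -> 241 frames
```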

  • 00:17:38 So this is the way of it. Then I have after few  tries I have come up with this prompt. You can

  • 00:17:44 pause the video and read it. And like this. So  this is my prompt like this and run. Then all

  • 00:17:53 you need to do is just wait. You don't need to  change anything else. You can also change your

  • 00:17:59 final video quality from here. You see CRF this  is the video quality. If you make this very low

  • 00:18:05 it will be very big generated MP4 file. Let's also  monitor our VRAM usage and the speed. Currently we

  • 00:18:14 are running it on no VRAM. Therefore it will use  minimal absolutely minimal amount of VRAM. So we

  • 00:18:22 can see both of the screens like this. It will  be therefore slow but it will work on every GPU.

  • 00:18:30 Okay once the generation starts. So even if you  have a 6 gigabyte GPU 4 gigabyte GPU you can use

  • 00:18:37 all these presets with the way I have shown you  in the beginning of the video how to start your

  • 00:18:44 ComfyUI. This applies to SwarmUI as well. In  the SwarmUI you set the same arguments into

  • 00:18:51 your ComfyUI backend. And it is also using  my GPU very decent. I mean look at that. It

  • 00:18:57 is using like 500 watts. This is really really  good. It is using my GPU very well. And it is

  • 00:19:02 only using 5000 megabytes VRAM right now. So this  way you can generate very long videos on very low

  • 00:19:10 VRAM GPUs. And you can use the same arguments  in your backend in your SwarmUI. It will work

  • 00:19:16 exactly same. Remember when we started the  ComfyUI it has shown what parameters what

  • 00:19:24 arguments have been used like no VRAM cache none  disable smart memory. Use Sage Attention disable

  • 00:19:31 asynchronous offload. This way I have minimized  my VRAM usage. So it is only using 5 gigabytes

  • 00:19:39 of VRAM for 1081 frames at HD resolution. And it  is generating it right now. This workflow first

  • 00:19:49 do 8 steps for initial video generation then it  does 3 steps of the upscale. The upscale will

  • 00:19:56 be of course slower. And during the upscale  it uses more VRAM as expected since it is

  • 00:20:03 now bigger resolution. So you can reduce your  length if you get out of VRAM error or on your

  • 00:20:09 computer it may do even further optimization  since I have VRAM maybe it didn't do I am not

  • 00:20:15 sure. And the time is still very decent. I mean  very acceptable time even at these settings.

  • 00:20:24 Let's see it.

  • 00:20:27 SECourses. Let's go. Late night screen glow. Ideas  on fire. Turning what if into something higher.

  • 00:20:34 From zero to flow. We learn it fast. Build it  ship it make the future last. New tools new

  • 00:20:39 rules we break the doubt. Take that prompt watch  the AI route. If you want the shortcuts done the

  • 00:20:43 right way. Hit play let's level up today. Hands  on the keyboard mind on the dream. Watch the

  • 00:20:49 model run like a laser beam. No fear no fluff just  clarity. One more step to your next victory. SECourses

  • 00:20:54 AI on the rise. Lighting up the dark  with electric skies. From prompts to pixels

  • 00:20:59 code to sound. We make it real we run this ground.  That's courses learn build repeat. Turn big ideas

  • 00:21:06 into something elite. If you want the best place  to start. That's courses straight to the heart.

  • 00:21:11 Purely amazing. This time there wasn't any noise. There wasn't any text. There weren't

  • 00:21:16 any issues. This was even better than my initial  demo that I have shown you. So what about using

  • 00:21:22 the presets? They will be surely faster and they  will be using lesser VRAM since we will be doing

  • 00:21:29 lesser amount of frames. So let's begin with  the image to video. Drag and drop it. You can

  • 00:21:35 also open. So this is much more straightforward.  Choose and upload your image. Same way like this.

  • 00:21:43 Set your video height and width. I am going to  use the same like this. Type your prompt. And

  • 00:21:50 how you can make your prompt better? Now let  me show you. In the SwarmUI zip file you will

  • 00:21:56 see that there is LTX 2 enhance prompt feed for  LLMs. When you open it you will see its content

  • 00:22:04 like this. So to use this I am going to use Google  AI Studio. Then drag and drop the file like this.

  • 00:22:11 Enter your prompt and if you are also using an  image add your image as well. This will help

  • 00:22:18 it to also modify prompt according to the image.  And run. If you are not using image just text to

  • 00:22:25 video just type like this and it will enhance it  automatically for you. The logic is same. Okay it

  • 00:22:32 did enhance it. By the way using Google AI Studio  is free. Therefore just copy paste it and run.

  • 00:22:39 You can also change the duration from frame  count. This is 24 FPS model therefore when

  • 00:22:45 you divide this 24 it makes 5 second. So 5  multiplied with 24 plus 1 equal to our frame

  • 00:22:54 count. If I want it 10 seconds then I need to make  this 241 frames. So let's see the VRAM usage. It

  • 00:23:03 will use very minimal amount of VRAM since we  are only generating 121 frames. Okay let's see

  • 00:23:11 how much VRAM it will use. It is also using  very minimal amount of RAM memory too. During

  • 00:23:16 the upscale it will use more GPU obviously.  And the speed is also amazing even though I

  • 00:23:22 am running it with no VRAM command argument.  Okay with the upscale it is using 2.7. It is

  • 00:23:32 using 3 gigabytes of VRAM during the upscale.  This is amazing. I mean you will be able to

  • 00:23:37 generate 5 second HD videos with only 3 gigabytes  of VRAM memory. Can you believe that? Okay at the

  • 00:23:45 end it used some more for the VAE tiling but it  is fine. And we have the video. So let's see.

  • 00:23:53 Hello everyone welcome.

  • 00:23:55 It did some animation maybe changing the seed  will help or I need to change the prompt but it

  • 00:24:02 is working great so play with it to see. Then  I will restart my ComfyUI with proper starting

  • 00:24:11 and I will show Z-Image but for Z-Image I need  to remove the Sage Attention because currently

  • 00:24:18 Z-Image base is not working with it. The text to video is the same, so I am not going to show it. So let's

  • 00:24:24 run the application with full optimizations with  full speed and without Sage Attention. Okay the

  • 00:24:32 application started. For Z-Image we are going to  use the Z-Image with 2x upscale. I recommend this.

  • 00:24:39 This is working really amazing. Amazing quality.  Then type your prompt as you wish you know this

  • 00:24:46 is just regular prompting. Let's use this prompt  to see what we are going to get. Okay run. I am

  • 00:24:52 also working on Z-Image NVFP4 model a properly  made NVFP4 model so it will be 2x faster with

  • 00:25:01 almost same quality. Our SECourses Musubi Trainer  application is going to have all these convert to

  • 00:25:09 quant options FP8 int8 and FP4 so that you  will be also able to convert any model you

  • 00:25:16 want into FP8 scaled quantized FP8 it will be  amazing quality like original BF16 or NVFP4. It

  • 00:25:24 will be 100 percentage faster on Blackwell GPUs.  All of them. 5060 5070 5080 whatever 50 or newer

  • 00:25:34 versions you get in future. So this is going to  generate 4 megapixel resolution image. This has

  • 00:25:41 the best upscale. If you don't want to upscale  you just need to use the Z-Image base preset.

  • 00:25:48 Okay let's see our generated image. By default  I am doing 40 steps but you can reduce the step

  • 00:25:55 count to 20 as well. With LTX 2 the step  count is not changing. It is 8 and in the

  • 00:26:02 upscale it is 3. But with the Z-Image base  or with majority of the other models you can

  • 00:26:08 change the steps count according to your need  the speed. Currently it is using 18 gigabytes

  • 00:26:14 of VRAM but with optimizations you can run this  as low as 6 gigabyte GPUs. Since we are doing 4

  • 00:26:21 megapixel image it is taking some time but we  will get an amazing quality image. By the way

  • 00:26:26 I don't know how will this video prompt work but  you know the drill you can change the prompt as

  • 00:26:32 you wish. Okay it generated image let's open in  the new tab. So open image like this. Yeah for

  • 00:26:41 this prompt this is a very decent output. All the generated files will be saved inside your

  • 00:26:47 ComfyUI output folder. You see, they are all saved; even the videos are saved. To

  • 00:26:54 reload the settings of a video generation, do this: you see there is a PNG file saved alongside it; drag and drop it and it will

  • 00:26:59 load everything from that PNG with the settings. So for videos it saves a PNG file to store the

  • 00:27:06 metadata. The images are also saved like this with the ComfyUI metadata embedded. If you want to save

  • 00:27:12 images with another prefix, you can change the file name prefix from here. Or with the Z-Image preset

  • 00:27:18 you can change the save prefix from here. You see, it says ComfyUI; you can change it to anything you want.

  • 00:27:24 So this is all for ComfyUI for today. All other presets work with exactly the same logic. You can

  • 00:27:30 use all of our amazing presets that come with our zip file. You see, all of them are

  • 00:27:37 tested and working perfectly. You just need to download the bundles, like Z-Image, Flux 2, Wan, Qwen. We

  • 00:27:45 have bundles for every major model out there. And for SwarmUI I am going to use the same ComfyUI

  • 00:27:52 installation as the backend. So let's just close our ComfyUI. As usual, download the latest SwarmUI

  • 00:27:57 model downloader zip file and overwrite your previous files. Then you should update your SwarmUI with the Windows

  • 00:28:04 update SwarmUI script. This will install the necessary newer stuff and also update your SwarmUI to the latest

  • 00:28:12 version. I have developed an extension for SwarmUI to support LTX 2 image-to-video upscale. Normally

  • 00:28:20 SwarmUI does not support it, but we now do. You can see that my extension is

  • 00:28:27 compiled like this when you start it. Then go to the server backends and enter your backend the same

  • 00:28:34 as in the other videos. You see, a ComfyUI backend. Setting up the ComfyUI backend is easy: go

  • 00:28:41 to your ComfyUI installation's ComfyUI folder and copy its path. Edit it here. Actually, let me add a new one

  • 00:28:50 to show you. So let's refresh, add a ComfyUI self-starting backend, and edit it. Paste the path to main.py. Extra

  • 00:28:59 arguments: this is where you define your ComfyUI startup arguments, like --use-sage-attention, like

  • 00:29:06 --cache-none, like --novram. This way you can customize and optimize your ComfyUI backend.
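
As a sketch of how such startup arguments combine, here is a hypothetical helper that composes a launch command. The flag names are real ComfyUI CLI arguments; the helper itself, its defaults, and the path are illustrative assumptions, not part of the installer.

```python
# Sketch: composing a ComfyUI launch command from optional flags.
# The flags are real ComfyUI CLI arguments; which ones to enable for
# your GPU and model is something you should tune yourself.

def build_comfyui_command(main_py="ComfyUI/main.py",
                          sage_attention=True,
                          cache_none=False,
                          novram=False):
    """Return the argv list used to start a ComfyUI backend."""
    cmd = ["python", main_py]
    if sage_attention:
        cmd.append("--use-sage-attention")  # faster attention kernels
    if cache_none:
        cmd.append("--cache-none")          # drop model cache between runs
    if novram:
        cmd.append("--novram")              # aggressive VRAM offloading
    return cmd

print(build_comfyui_command(sage_attention=True, cache_none=True))
```

In SwarmUI these flags go into the backend's "extra arguments" field rather than a command line, but the combinations behave the same way.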

  • 00:29:14 Normally I just use Sage Attention. The only case where I don't use it at the moment is the Z-Image

  • 00:29:20 Base model, because it degrades the quality significantly; it is broken at the moment. So

  • 00:29:25 wait for your backend to start, and it has started. Quick tools, reset params to default. You need to

  • 00:29:31 get the latest presets, so either import from here, or my recommended way is the Windows preset

  • 00:29:38 delete-import script. Run it and hit yes: it will delete all of your previous presets and replace

  • 00:29:44 them with the new ones. So refresh and sort by name. All the newer presets have arrived, like Z-Image

  • 00:29:50 Base, LTX 2 text to video, LTX 2 upscale. When you use upscale you need to reduce your resolution

  • 00:29:57 to half. So let's make a demo of image to video: quick tools, reset params to default, click here

  • 00:30:02 and direct apply. Since I have downloaded the LTX 2 video core bundle, all the models

  • 00:30:09 are automatically downloaded and selected. I have spent a huge amount of time preparing these presets.

  • 00:30:14 Then, to use image to video, you need to choose your init image, like this one. Then you need to

  • 00:30:21 go to the image to video tab and set your video frame count. It is 121 frames right now, so it

  • 00:30:28 will be a 5-second video. Then enter your prompt, same as in ComfyUI, and you can generate as many

  • 00:30:36 videos as you want. Let's generate one. So this will generate an image to video. You can always

  • 00:30:41 follow the status from the server logs debug menu to see what is happening. Let's also write another

  • 00:30:48 prompt with an enhancement. So I am just going to use the Google AI Studio Gemini 3 Pro preview and

  • 00:30:56 the LTX 2 enhance template from SwarmUI, and type my prompt. So this is my prompt. Let's see the enhancement.

  • 00:31:05 Okay, you see it had an error. Sometimes you may get this because of how ComfyUI works; just

  • 00:31:11 hit generate again. This happens because the first time you run it, for some reason it

  • 00:31:17 fails to load the model properly. To prevent it you can add --cache-none. So the first

  • 00:31:25 time you click generate, if you get an error, just wait and click generate again. This happens with

  • 00:31:32 LTX 2 video; they haven't fixed it yet. I am expecting a fix hopefully soon. Don't worry, just hit generate

  • 00:31:39 again. You can also add optimizations like --disable-smart-memory, --cache-none or low

  • 00:31:49 VRAM (--lowvram). One of them will work for your case. Once the first generation is completed it will

  • 00:31:56 keep working. Okay, now it is generating image to video. It should also be pretty fast. Yes. Okay,

  • 00:32:02 you see the image has changed like this because of my prompt, so I need to modify

  • 00:32:09 my prompt, for example like this: a video of a talking man saying hello everyone. So if your

  • 00:32:17 image changes dramatically like this because of your prompt, you need to phrase it in a way

  • 00:32:24 that will not change the initial image. Let's see the new prompt's impact. You see, the

  • 00:32:30 first video was done in 55 seconds; the second one is done in 35 seconds. Let's see it.

  • 00:32:37 Hello everyone welcome.

  • 00:32:39 Yes, perfect quality, perfectly matching the initial image. There are no changes, so it is

  • 00:32:44 all about your prompt. Prompt enhance may break image to video; however, it works really well for

  • 00:32:51 text to video. Now let's see text to video. So, quick tools, reset params to default, and we

  • 00:32:57 have LTX 2 text to video, direct apply. Let's use our prompt like this. So this is the enhanced prompt.

  • 00:33:05 Change the resolution or aspect ratio. You can also change the duration for text to video. For

  • 00:33:12 image to video you change the frame count in the image to video tab; for text to video it is

  • 00:33:18 here. So let's make this 241 frames and that's it, generate. You can also upscale with text to

  • 00:33:27 video. How? For a text to video upscale you need to set a custom half resolution like 640 by 360,

  • 00:33:36 go to presets and additionally apply this LTX 2 upscale with direct apply, and let's generate

  • 00:33:44 another one. The first one will be the original HD; the second one will use the upscaler model of

  • 00:33:52 LTX 2. All SwarmUI generations will be saved inside SwarmUI output local raw,

  • 00:34:02 and they are categorized by the day you generated them. You see, all my images and videos

  • 00:34:08 are saved here with their metadata as well. Okay, the first video is generated; let's watch it.

  • 00:34:15 Faithfulls! Witness me!

  • 00:34:24 Yeah, for this prompt this is what it has generated. Now we have the upscaled version;

  • 00:34:29 let's watch it as well. So this way you can tune your prompt, try different prompts

  • 00:34:44 and see what you are getting; you can also compare. This was a 10-second video and it

  • 00:34:49 was generated in only 40 seconds. 40 seconds for a 10-second video. Amazing. I mean, this is

  • 00:34:56 241 frames. This would take forever on the Wan 2.2 model. It takes only 40 seconds on LTX 2.
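
The frame counts used throughout this tutorial (121, 241, and later 1081) follow LTX 2's 24 fps timing: frames = seconds × 24 + 1, where the extra frame is the initial one. A minimal sketch of the conversion, assuming the 24 fps default used by these presets:

```python
# Convert between video duration and LTX 2 frame count.
# Assumes the 24 fps default used by the presets in this tutorial;
# the "+1" accounts for the initial frame.

FPS = 24

def seconds_to_frames(seconds: float) -> int:
    return int(seconds * FPS) + 1

def frames_to_seconds(frames: int) -> float:
    return (frames - 1) / FPS

print(seconds_to_frames(10))    # 241 frames, as used for the 10-second clip
print(seconds_to_frames(5))     # 121 frames, the image-to-video default
print(frames_to_seconds(1081))  # 45.0 seconds, the lip-sync example later on
```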

  • 00:35:05 So what about Z-Image? For Z-Image I am going to go to backends, edit my backend and remove

  • 00:35:12 Sage Attention. Currently this is mandatory; it will probably be fixed soon. Then let's

  • 00:35:18 open another tab, reset params to default, and in the presets our base preset is this one. So

  • 00:35:25 let's use the same prompt; in this case I don't know what it will generate. Then I am also going

  • 00:35:33 to apply the upscale: with the same seed, let's also use the upscale. For the upscale we

  • 00:35:40 are going to use the image upscale preset, direct apply, and generate another one. The first one will

  • 00:35:45 be 1024x1024; the second image will be 2048x2048 with our image upscale preset. This image

  • 00:35:54 upscale preset works with every image generation. Okay, the first image is done; it is here, it is

  • 00:36:01 like this. Let's see its upscaled version too. So this is what it has generated as a base. I

  • 00:36:08 am doing 40 steps for the highest quality, but you can reduce this to 20 steps as well. There will

  • 00:36:15 be minimal impact on quality and it will be twice as fast. As I also said, I am working on an NVFP4

  • 00:36:24 version of the Z-Image Base model, which will be 100 percent faster with almost the

  • 00:36:30 same quality. Moreover, we have an FP8 scaled version of the Z-Image Base model, so you can download

  • 00:36:37 it with our downloader and use it if you are low on VRAM, like 6 gigabyte GPUs. However,

  • 00:36:44 this model is not very big: the FP8 scaled is only 5.6 gigabytes; the BF16 is only about

  • 00:36:51 12 gigabytes. Okay, the upscaled image is here. So this is the base image, and this is the upscaled

  • 00:36:58 image. Let's compare them with our image/video comparison slider application. The link

  • 00:37:04 to install this will also be in the description of the video. Start the image comparison, put in both

  • 00:37:11 files and go full screen. So on the left we see the original image and on the right

  • 00:37:17 we see the upscaled version. I mean, look at the quality difference. The upscaled version

  • 00:37:22 is much better, many times better. It has added an amazing amount of detail and quality, so I recommend

  • 00:37:30 you use the upscale preset of the Z-Image Base model to get really good images. The

  • 00:37:38 realism of Z-Image is already amazing, so you can type a realistic prompt to get a

  • 00:37:44 realistic image, like in this case. So it is all about your prompting to get whatever you want.

  • 00:37:51 Now I will show you how to use ComfyUI and SwarmUI on RunPod. It is actually exactly the same as before.

  • 00:37:58 Once you have downloaded the zip file, you will see that we have RunPod SimplePod ComfyUI

  • 00:38:03 instructions. RunPod and SimplePod work in exactly the same way; you just need to follow this

  • 00:38:09 file. You see, we have a RunPod section. So please use this link to register; I appreciate that. Once

  • 00:38:16 you have registered, log in if it doesn't log you in automatically. Then go to billing and add some credits. Then

  • 00:38:25 return to our instructions file; always use the template mentioned here. Currently

  • 00:38:32 we need to use this template for CUDA 13. This is based on the official template, don't

  • 00:38:37 worry; it just has some tweaks for you. Now, this is super important: if you are going to

  • 00:38:44 use your permanent storage, select your permanent storage from here, so everything you do will be

  • 00:38:49 saved 100 percent. If you are not going to use permanent storage, just don't select anything

  • 00:38:55 here. What you need to select here is additional filters and CUDA version 12.9. Unfortunately, they

  • 00:39:04 still haven't added CUDA 13 here; they need to add it and upgrade their Nvidia drivers, but they still

  • 00:39:11 haven't. Therefore we select 12.9. This should bring us machines that have Nvidia drivers

  • 00:39:19 that can run CUDA 13. Then select NVMe and 100 gigabytes. So when we apply these two filters

  • 00:39:27 unfortunately there aren't many machines we can use yet with the CUDA 13 ComfyUI. So these are the

  • 00:39:34 machines available. I am going to use an RTX Pro 6000; this is one of the best, as you know. Then

  • 00:39:39 you see it has selected our template. I am going to contact the RunPod team to upgrade their drivers

  • 00:39:45 so that CUDA 13 arrives, and once they upgrade their Nvidia drivers we will be able to use

  • 00:39:52 more machines. In the template you can edit and change your volume disk. If you are using your

  • 00:40:00 permanent network storage it will use whatever space you have there, but otherwise this will make a new

  • 00:40:07 storage. Making permanent storage is easy: go to the storage tab here, click new network volume,

  • 00:40:13 and select the server region. According to your selected server you will be limited to that region only, so

  • 00:40:19 this will limit your GPU availability. You can also click from here to see which regions have

  • 00:40:25 more GPUs, for example this one. Type your storage name and size, like 600

  • 00:40:32 gigabytes, and create the network volume. Then you will be able to use it, but I am not going to

  • 00:40:37 use it right now, so I am going to delete it to not waste any money. Okay, let's delete it.

  • 00:40:43 Then deploy on demand. Now you need to wait a little for the machine to start. It will take a

  • 00:40:51 while, like 5 minutes, or maybe 2 minutes depending on the region and the server. Unfortunately RunPod

  • 00:40:58 machines are slow; Massed Compute is much faster. We are now also supporting

  • 00:41:04 SimplePod, which is easy to use in the same way; it is also faster and has more GPUs available.

  • 00:41:10 I started 3 identical machines to see which one starts fastest. Once I verify the

  • 00:41:17 fastest one, I will delete the other ones. Unfortunately with RunPod it is like

  • 00:41:23 this; you don't have many options. You can also pick another GPU to make it start faster, for

  • 00:41:31 example an H200. This is a beast machine; it will work even faster than the RTX Pro 6000.

  • 00:41:38 So if you don't want to wait too long you can pick your machine accordingly, or you can use

  • 00:41:43 permanent storage to avoid this. So all of them are failing to start right now. Okay, this one started.

  • 00:41:51 Yeah, this is probably the H200; let's see, yes. You see, the H200 has a much better connection and

  • 00:41:58 disk speed. I just emailed the RunPod team; I think they will upgrade their Nvidia drivers,

  • 00:42:05 and therefore we will have more machines available. Of course, you can always use SimplePod and Massed

  • 00:42:13 Compute. They need to upgrade their Nvidia driver so that we can use any machine available. Yeah, I

  • 00:42:19 will continue with the H200 for this tutorial because it is already starting, so I will

  • 00:42:24 stop the other machines to not waste any time or money. It is a shame: the size of

  • 00:42:32 the template is 9 gigabytes, so startup should be nearly instant, but you see RunPod is very, very

  • 00:42:38 slow nowadays. They are degrading in quality, unfortunately. I hope they get better.

  • 00:42:44 We are giving them feedback; you should also give them feedback: email them, contact their support,

  • 00:42:50 tell them to upgrade their Nvidia driver and add more GPUs. Okay, this should start fairly fast.

  • 00:42:56 You can see the download process. It doesn't matter which GPU you select; my installers and

  • 00:43:02 template support all of them. What matters is the Nvidia driver, which we are going to see in

  • 00:43:08 telemetry. When it starts it will show the Nvidia driver here; it should be 575 plus.
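
The driver requirement can also be checked by parsing the version string. A minimal parsing sketch; the minimums (575 works now, 580+ preferred) come from this tutorial, not from NVIDIA documentation, and the version strings below are examples:

```python
# Check whether an NVIDIA driver version string meets a minimum major
# version. The minimums (575 works now, 580+ preferred) are from this
# tutorial; query the real string on a pod with
#   nvidia-smi --query-gpu=driver_version --format=csv,noheader

def driver_ok(version: str, minimum: int = 575) -> bool:
    major = int(version.split(".")[0])
    return major >= minimum

print(driver_ok("575.57.08"))            # True: works with the CUDA 13 build
print(driver_ok("550.54.15"))            # False: too old, pick another machine
print(driver_ok("580.65", minimum=580))  # True: preferred once drivers are upgraded
```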

  • 00:43:15 I just emailed them to upgrade all machines to 580 plus so that we can use any machine; hopefully they do.

  • 00:43:23 Okay, the machine started. When I go to connect, you see it shows connect on port 3001, the ComfyUI

  • 00:43:30 port (we didn't start it yet), and JupyterLab. So click the JupyterLab interface. When we go to

  • 00:43:36 telemetry we see the driver is 575. You can also try CUDA 12.8, but it may not

  • 00:43:44 work; a 575 driver is guaranteed to work in all cases. Hopefully they will upgrade and we

  • 00:43:52 will have it. Okay, so the JupyterLab interface started. I will just drag and drop my ComfyUI

  • 00:43:58 zip file into workspace and wait for the upload to be completed. You see, it is uploading right now.

  • 00:44:04 Then I will extract the archive and refresh. Yes, all done. Now open the RunPod SimplePod ComfyUI instructions,

  • 00:44:13 copy this command, open a new terminal, paste and hit enter. Then it will ask you which options to

  • 00:44:20 install: same as on Windows, 1 and 100. We are going to install these two bundles, hit enter, then

  • 00:44:26 hit enter again. Then it will install ComfyUI fully and properly for us. Now you need to download models.

  • 00:44:33 For downloading models we are going to use the SwarmUI model downloader; it is here, just drag and drop

  • 00:44:38 it here and refresh. It is still uploading; wait for the upload to be completed. This installation

  • 00:44:45 is extremely optimized and fast, but it depends on your machine, as always. By the way, you should

  • 00:44:52 also verify your GPU: open a new terminal, pip install nvitop, then type nvitop, and verify

  • 00:45:01 that the GPU is there and it is empty. Sometimes GPUs are stuck or frozen or something.

  • 00:45:08 Okay, the SwarmUI installer has also downloaded, so I will also install SwarmUI exactly the same as

  • 00:45:16 in the Windows tutorial part. For the SwarmUI installation we have the RunPod SimplePod SwarmUI

  • 00:45:24 install instructions. Just copy this entire command, open a new terminal and paste it.

  • 00:45:32 This will install SwarmUI very fast. Once you see the SwarmUI folder here you can start downloading

  • 00:45:38 models, or you can skip installing SwarmUI and just download the models into ComfyUI instead.

  • 00:45:44 This machine is a little bit slow at installation, unfortunately. Okay, the SwarmUI folder appeared, so we

  • 00:45:50 can begin downloading models: RunPod SimplePod model download instructions, copy this, open a new

  • 00:45:57 terminal, paste it. Meanwhile, both SwarmUI and ComfyUI are being installed; you can follow

  • 00:46:04 each of them. Unfortunately this machine is very slow. For the SwarmUI installation, you

  • 00:46:09 see it has generated a Cloudflared link: agree, customize, okay, next, just yourself, none (this

  • 00:46:17 is important because we are always going to use our custom ComfyUI installation), next, I

  • 00:46:22 am not going to download anything, next, yes I am sure, install, and SwarmUI is installed. So

  • 00:46:28 our model downloader started; let's open it from this Gradio link. It will automatically detect my

  • 00:46:34 SwarmUI installation and its correct folder. You see, it sees the models folder here.

  • 00:46:41 If you don't want to install SwarmUI and use it, go to ComfyUI, right click models, copy path,

  • 00:46:48 then put a leading slash and type it like this, so it starts with a slash; then enable the ComfyUI

  • 00:46:55 folder structure option and download the models. So let's download LTX 2 for the ComfyUI installation. I will

  • 00:47:04 also use the same folder for SwarmUI. Okay, let's wait for the ComfyUI installation to be completed;

  • 00:47:11 meanwhile the models are being downloaded. We can see them here; it is downloading them.

  • 00:47:17 Okay, our ComfyUI installation has been completed. Now let's start our ComfyUI. To do

  • 00:47:24 that, return to your RunPod SimplePod ComfyUI instructions txt file, and below you will

  • 00:47:31 see the run command. It is important to use it. Open a new terminal, copy, paste and hit enter.

  • 00:47:39 You see, we are starting with --use-sage-attention. You can remove this or add other additional

  • 00:47:45 arguments here. This is how we start the ComfyUI instance. The initial start may be a little bit

  • 00:47:51 slow, but you should see the appropriate messages here. We can see how much VRAM and RAM

  • 00:47:58 our machine has: we have 143 gigabytes of VRAM and 1.8 terabytes of RAM. Yes, this is

  • 00:48:07 a massive amount of RAM. It is using CUDA 13 and torch 2.9.1. As I said, my installers are

  • 00:48:15 properly made for every GPU out there, so you can use my installer with every GPU on cloud

  • 00:48:22 or on Windows. So ComfyUI has been started; when you see this, it means it has started.

  • 00:48:28 How to connect? Return to your pod instance and you will see that there is port 3001. When you

  • 00:48:35 open it you will see your ComfyUI interface, which is running on RunPod, and the rest is exactly the

  • 00:48:44 same as using it on your Windows computer. So my ComfyUI has started. All of the models have also

  • 00:48:51 been downloaded. Now I am ready to use presets. So let's drag and drop our LTX 2 audio lip sync image

  • 00:48:57 to video preset. It is loaded; you see there is no red, no error messages, it is perfect. So,

  • 00:49:06 choose file to upload. I will do the same as in the Windows part. So let's see how long it will

  • 00:49:13 take; this one is 45 seconds. Let's also copy-paste our prompt. I also need to set my duration; the

  • 00:49:21 rest is exactly the same as in the Windows tutorial part. You can also calculate it: 45 seconds equals

  • 00:49:29 45 multiplied by 24 plus 1 = 1081 frames. And we can hit run here. Now you can follow the CMD window

  • 00:49:41 here. Let's also see nvitop: pip install nvitop, then nvitop. So we can follow the VRAM usage

  • 00:49:51 from here. The initial loading of the model may be slow, but we will see. You will also get these

  • 00:49:58 messages; you can ignore them, it is still working perfectly. This is actually a ComfyUI mistake: on some

  • 00:50:05 models it shows these as unexpected, or some other messages, but it is working perfectly right now. This

  • 00:50:11 is also a ComfyUI mistake which they have to fix, but it is not fixed yet. By the way, on this machine we

  • 00:50:18 can also use the BF16 model; our unified downloader downloads the FP8. How can you download it

  • 00:50:24 on this machine? In the video generation models you can go to LTX 2 video models and

  • 00:50:30 download the BF16 version directly, because this machine has a huge amount of VRAM.

  • 00:50:37 So the initial loading of the model takes forever on RunPod disks because they are slow.

  • 00:50:43 Okay, model loaded and generation started. You see, it is almost instant even for 45 seconds:

  • 00:50:52 the first 8 steps took only 13 seconds. And now it is upscaling. It is using 48 gigabytes of

  • 00:50:59 VRAM. So if you have an RTX Pro 6000 in your Windows machine, you can run this model very fast.

  • 00:51:06 I mean, shame on Nvidia: they are still selling us the RTX 5090 with only 32 gigabytes of VRAM.

  • 00:51:15 Okay, this was also done very fast, so it is like 30 seconds to generate a 45-second video. We are

  • 00:51:22 almost done; it is now doing the VAE decoding. It is using 57 gigabytes of VRAM right

  • 00:51:29 now. Yeah, it is now combining the video; after that it will be completed. So it takes

  • 00:51:36 like 1 minute to generate a 45-second video on the H200, and it is already done, yes. Let me show.

  • 00:51:58 Okay, it is the same as in the Windows tutorial part; it is working perfectly, no issues, directly working.

  • 00:52:05 So how can you also use SwarmUI here? Since everything is ready, all I need to do is

  • 00:52:12 turn off ComfyUI so that SwarmUI can use the entire GPU. (To be clear, you wouldn't

  • 00:52:21 need this step if you hadn't run ComfyUI.) When you return to the

  • 00:52:26 prompts you will see that there is this command. Copy it and open a new terminal if you want to

  • 00:52:33 terminate your ComfyUI. This will terminate ComfyUI properly and you will see that

  • 00:52:39 it is using zero VRAM. Now, my SwarmUI has already started. I am going to go to server

  • 00:52:47 backends, ComfyUI self-starting. I need to copy the path of the ComfyUI main.py, so right click,

  • 00:52:56 copy path, go to your SwarmUI, put a leading slash like this and --use-sage-attention, and save. Now the

  • 00:53:07 SwarmUI will be ready. Normally we won't have any models here, because we didn't download them into

  • 00:53:14 SwarmUI; we downloaded them into the ComfyUI folder. So I can go to server configuration, go to

  • 00:53:21 the models folder here, copy the path, put a leading slash like this, and save. After that we may need to restart,

  • 00:53:30 but let's see. Okay, let's refresh. Yes, the model arrived, but it doesn't see the other model, because

  • 00:53:38 I need to change this into checkpoints; this is where it is downloaded in ComfyUI. Yes. Now

  • 00:53:46 I am ready to use everything. The VAE is here, let's see. Okay, it doesn't show, but let's check

  • 00:53:54 the Loras. Okay, the Lora folder name is also different; it must be something like this. By the way, you don't

  • 00:54:00 need any of this if you just download the models into your SwarmUI folder. So let's see if the Loras arrived.

  • 00:54:07 Okay: server configuration, checkpoints, Loras, VAE. Oh, I know why it is not working: because it is not

  • 00:54:15 Loras but Lora in ComfyUI. Yes, like this. And for the VAE folder we need to check inside models; it

  • 00:54:24 is, yeah, the correct VAE. So you can look up the exact folder names. The Lora folder is Lora here, let's see.

  • 00:54:32 Okay, this is empty. Lora is empty. Loras is empty. And it is Loras. So it is case sensitive on a Linux

  • 00:54:42 machine; therefore it is this. Yes. Save. And yes, now I can see the Loras. Now I can import

  • 00:54:51 the presets: SwarmUI model downloader, the preset 45, import, and it is ready. So now I can generate

  • 00:55:02 anything I want. Let's generate text to video. Let's sort by name; you can also

  • 00:55:09 search from here, like LTX: LTX 2 text to video, direct apply. So the models are selected, and the Loras

  • 00:55:18 are also selected. All I need to do is just type my prompt: a man talking to the camera and saying

  • 00:55:27 "this is amazing". This is just a simple prompt; you know the drill from the Windows tutorial part. You

  • 00:55:34 can follow the status from the logs debug menu. Now SwarmUI will work perfectly fine on RunPod.

  • 00:55:43 So how can you download the generations? Go to your output folder. In here I can download the

  • 00:55:50 entire output with right click and download as an archive. It will start to download like

  • 00:55:56 this. So you see, it has downloaded the entire output. Since our downloader also downloads

  • 00:56:02 the clip models, text encoder models and VAE models, you won't see any of them being downloaded here.

  • 00:56:08 We do everything fully automatically for you, in the fastest and most accurate way, both

  • 00:56:15 on Windows and on Linux. My Massed Compute instructions and Massed Compute install files are

  • 00:56:22 all for Linux, so if you are a Linux user you can use them right away. It is still loading the

  • 00:56:27 model for SwarmUI to generate. Okay, the generation has been completed; let's see it.

  • 00:56:33 This is amazing.

  • 00:56:35 Yeah, this was our simple prompt and we got it. So how do you download SwarmUI generations? Go to

  • 00:56:41 SwarmUI, go to output local. You can download the entire raw folder: download as an archive, and

  • 00:56:48 it will start downloading; save it and you have it. So how do you terminate your machine?

  • 00:56:53 Stop pod will reduce your cost per hour, but it will not make it zero. To

  • 00:57:00 make it zero, just terminate the pod, and everything is deleted. If we were using storage we would

  • 00:57:06 still have it here, but we are not using permanent storage, so all is gone.

  • 00:57:11 And one more thing before we proceed: the RunPod developer team replied to my emails and they

  • 00:57:18 are working on improving the Nvidia driver of the pods. You can see their answer here.

  • 00:57:27 So hopefully in the future all of the machines will work with our ComfyUI CUDA 13 installation. So

  • 00:57:35 if your machine doesn't work currently, you can select CUDA 12.9 (probably

  • 00:57:42 they will add 13 too) or CUDA 12.8, but this probably won't be necessary

  • 00:57:50 once they upgrade the Nvidia driver on all the machines. That is it for now.

  • 00:57:55 As a next step I will show how to use SimplePod. For SimplePod we are going to use the

  • 00:58:00 RunPod SimplePod ComfyUI instructions file. You see, the SimplePod section is here. What is the difference?

  • 00:58:06 Let's register from this link; I appreciate that. Then go to your dashboard and add some billing, like

  • 00:58:13 this. Then we are going to use this SimplePod template. This is faster than RunPod. You

  • 00:58:20 see, it is showing this use template option like this. This template already has all the features that

  • 00:58:29 we need. Now you need to rent a machine. We have the RTX 5090 very cheap on SimplePod; you

  • 00:58:38 see, it is only 47 cents, which is double the price on RunPod. The RTX Pro 6000 is only 72 cents, which

  • 00:58:47 is almost triple the price on RunPod. So let's go with the RTX Pro 6000 Blackwell GPU. This GPU

  • 00:58:54 has 96 gigabytes of VRAM and it is as fast as the RTX 5090, even faster than that. And

  • 00:59:04 then run. This is going to have 51 gigabytes of disk space, so you can edit it and increase

  • 00:59:12 your disk space according to your need, like 200 gigabytes. Then it will re-create the disk. When

  • 00:59:21 you are creating the disk for the first time you can also select the size there. So let's return and click

  • 00:59:27 here. The template is selected. You can edit and use: you see there is use template, and edit

  • 00:59:32 and use. Let's use edit and use, and there is also add persistent volume. SimplePod also

  • 00:59:38 has persistent volumes; let me show you that. Go to storage, add a new persistent volume, like

  • 00:59:44 200 gigabytes, and the name is 200. It is like 3 times cheaper than RunPod. Save. Let's return

  • 00:59:51 to our template selection, double click, and I am going to select add persistent volume

  • 00:59:57 and edit and use. Now it will let me select from here: 200, make this workspace (this is important).

  • 01:00:06 And the system is ready. So this is going to use my persistent volume. Save and use. And

  • 01:00:12 when I go to my servers you will see that it is going to start. Okay, we made a mistake; let's do it

  • 01:00:20 again: add persistent volume, edit and use, 200, workspace, okay, save and use. Oh, by the way,

  • 01:00:29 it changed the template but still hadn't selected a machine, so we are going to select this machine, for

  • 01:00:35 example, and run. So this one is going to use my persistent volume. The other one has also started

  • 01:00:42 here. They don't show the persistent volume, but you can tell; actually there is

  • 01:00:48 one hint of it: this one shows 200 gigabytes, and this one shows disk volume and volume. When you have disk

  • 01:00:54 volume and volume, it means that you are using your persistent volume. So the rest is the same. How?

  • 01:01:01 You see that there is a Jupyter interface direct link; go there. If it asks you to continue,

  • 01:01:08 continue, because direct means that it is using a direct connection. Then you see a RunPod-like

  • 01:01:14 interface has started. Upload your ComfyUI, SwarmUI, whatever you want. By the way, everything is much

  • 01:01:21 faster on SimplePod, that is the main difference, and everything is much cheaper on SimplePod. So

  • 01:01:27 you can use both SimplePod and RunPod. Then extract the archive. The rest is the same; let me just show you

  • 01:01:35 the start in the terminal, like this: 1 and 100. Let's see the installation speed. Everything

  • 01:01:43 is ready in this template. Okay, the installation is just amazingly fast, as you are seeing

  • 01:01:48 right now. Moreover, you can download and upload files much faster on SimplePod. You see, there is a

  • 01:01:55 file browser; when you click this it will let you download and upload files directly. Go

  • 01:02:00 to workspace and every file will be here; you can just select and download like this. Or you

  • 01:02:07 can just upload from here. The upload option allows you to upload a file or a folder. It is very, very fast

  • 01:02:14 compared to RunPod. This way you can upload and download much faster than on RunPod. So how do you

  • 01:02:22 terminate machines once you are done: go back to your my servers, there is a delete option, there

  • 01:02:29 is no stop option here. So let's delete. But all the data in your permanent storage

  • 01:02:37 will remain there even after we delete, exactly as on RunPod. By the way, on SimplePod we forgot

  • 01:02:44 to show one more thing, which was connecting to the ComfyUI interface. For connecting to the ComfyUI

  • 01:02:51 interface we are going to use port 3000. My template already exposes port 3000, so that

  • 01:03:00 you will be able to connect on port 3000 on SimplePod to your ComfyUI instance. For SwarmUI

  • 01:03:08 we are also using a Cloudflared link, as on RunPod. So it is actually the same as RunPod.

  • 01:03:14 So this is an alternative to RunPod, much faster, much cheaper, but the disadvantage

  • 01:03:19 is that it has a limited selection of GPUs. So when I go to rent a GPU docker you

  • 01:03:25 see 5090 and RTX Pro 6000 are the main GPUs that you can use. These two are actually the very best

  • 01:03:31 GPUs that you can use for cheap, but it is missing very specific GPUs like A100, H200, B200;

  • 01:03:38 still, it has the most commonly used amazing GPUs. So you were asking me to

  • 01:03:43 make Vast AI scripts, but instead of Vast AI I am providing you SimplePod: cheaper, better, faster.

  • 01:03:52 So as a next step, let's do it on Massed Compute. To follow along, open the Massed

  • 01:03:58 Compute instructions read txt file. Please use this link to register, I appreciate that. Once

  • 01:04:05 you registered and logged in go to billing and  add some credits as usual then go to deploy.

  • 01:04:15 And in here it will show you the available GPUs.  They have amazing prices so you can use H200 if

  • 01:04:23 it is available and apply our coupon. Let's see  if there is any availability. Oh actually there

  • 01:04:29 is availability, yes, nice. So you see currently it is 2.6 dollars per hour; with our coupon SECourses it

  • 01:04:36 becomes 1 dollar 95 cents. This is amazing: it is like 4 dollars on RunPod and 2 dollars here.
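
The two prices quoted above imply the SECourses coupon on Massed Compute is a 25% discount; a quick sanity check (the 25% rate is inferred from the prices, not an officially stated figure):

```python
# The 25% rate is inferred from $2.60/hr dropping to $1.95/hr; not an official figure.
list_price = 2.60    # quoted H200 hourly price
coupon_rate = 0.25   # inferred: 1 - 1.95 / 2.60
discounted = round(list_price * (1 - coupon_rate), 2)
print(discounted)  # 1.95
```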

  • 01:04:45 So we can rent this or you can rent RTX Pro 6000  if it is available. Oh they have sold out all

  • 01:04:52 the RTX Pro 6000s. They are adding new GPUs but  they are very popular. So let's go with this one.

  • 01:05:00 This is even better at training than inference. So if you are going to do training: since this GPU

  • 01:05:06 has HBM (high bandwidth memory), it is faster than the other GPUs like RTX Pro 6000

  • 01:05:15 or 5090. By the way, when you are deploying, don't forget to select category as Creator and select

  • 01:05:22 image as SECourses. This is our image; selecting it is mandatory. Then apply our coupon and

  • 01:05:28 pick any GPU you want. You need to pick a regular GPU; spots are not working, but all other GPUs

  • 01:05:34 are working like this. You see it is 41 cents. For example, L40S is 61 cents. Like H100 it

  • 01:05:44 is, okay this is a spot instance, yes, H100 is 1 dollar 76 cents, but I recommend H200, this is an amazing GPU.

  • 01:05:54 So wait for the GPU to be initialized, it will take a little while. To connect we are going to use the

  • 01:06:00 ThinLinc client as usual, so download the ThinLinc client from here for Windows, Linux or Mac. So just

  • 01:06:07 download it. Then double click and start, next, next, install; it will install and start.

  • 01:06:14 From this ThinLinc client you need to go to Options, Local Devices: enable just clipboard

  • 01:06:19 synchronization and drives, and add a shared folder with read and write permission. I am

  • 01:06:24 using this folder. Then move the downloaded  ComfyUI and SwarmUI files into your shared

  • 01:06:32 folder so that you will be able to access  and use them. You can also put your other

  • 01:06:37 files, but you cannot use this for big files. For big files you have to use another storage

  • 01:06:45 system like Hugging Face or Google Drive. Okay, Massed Compute is getting initialized.

  • 01:06:50 Machine initialization is now faster than before; it should be done in 5 to 10 minutes.

  • 01:06:56 Okay, so the machine has been initialized; let's connect to it. Copy this login URL. Okay, this was

  • 01:07:02 done. Paste it here, copy the password, paste it here, copy the username, paste it here, connect. Continue. Okay, it

  • 01:07:11 is connecting, then start. Okay, it is starting, and it has started. This is running in the cloud on

  • 01:07:19 Massed Compute. Go to home, go to thin drives. You can also log in to your Patreon from here and

  • 01:07:25 download the links. Go to Massed Compute; you see it is like your Windows computer, there is

  • 01:07:31 Chrome, you can login, download, upload, you can do whatever you want. Wait for it to list the files in

  • 01:07:38 your shared folder in here. So it will list all the files I have in here. This is synchronizing

  • 01:07:45 with my computer, but it is slow. So I am going to copy and paste all the files, version 78 and the SwarmUI

  • 01:07:54 latest version, into the downloads folder. Do not run anything in this shared folder. Always copy them

  • 01:08:01 into the downloads folder, then we are going to run them from there. So wait for the copy to be completed,

  • 01:08:08 it shows the progress here. The ComfyUI is done so  extract here and Massed Compute instructions read

  • 01:08:17 txt file. So to install we are going to use this command: copy it, open a new terminal and paste it.

  • 01:08:27 Then it will ask us which ones to install: 1 and 100, and hit enter. The installation on Massed

  • 01:08:36 Compute is like 10 to 100 times faster than on RunPod; the disk speed is like 100 times faster

  • 01:08:42 than RunPod on Massed Compute. Their disadvantage is that they don't have a permanent storage system

  • 01:08:48 yet, so every time you have to start from the beginning, but the speed is just mind-blowingly fast, so it

  • 01:08:56 takes like 2 minutes to set up and start on Massed Compute. So installation is almost completed, you

  • 01:09:02 see. Let's also download the files. To download the files we are going to use the SwarmUI model downloader,

  • 01:09:07 extract here. Massed Compute already has SwarmUI, so you don't need to install it, but you can if you

  • 01:09:14 want. So the Massed Compute model download command is here; I will copy it, open a terminal and

  • 01:09:23 paste. Okay, it is going to install and start the model downloader, this is really fast.

  • 01:09:29 I will download files into the SwarmUI folder, but I will make them usable by ComfyUI as well,

  • 01:09:39 as on Windows. So I will have access to the files from both applications. You see it is

  • 01:09:45 currently using this as the base path; this is where SwarmUI is installed on Massed Compute. So let's

  • 01:09:54 download the LTX 2 bundle, download all. The download is also really fast, so it started. Before I start

  • 01:10:03 the ComfyUI instance I am going to make a change. Okay, it is already installed. It took about 1

  • 01:10:08 minute. Go back to ComfyUI and you will see that there is an extra_model_paths.yaml file; open it and

  • 01:10:18 this is the base path; I am going to change this into this. Okay. Let me zoom in. Okay, can I zoom

  • 01:10:25 in? No. So this is the path where I want ComfyUI to read model files; save it. Okay, it is not saved

  • 01:10:34 yet, it is saving right now. Okay, it is saved. Then copy this and paste it into the ComfyUI folder.
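
The edit above just swaps the base path so that ComfyUI reads models from the SwarmUI install. A minimal sketch of what the edited file can look like, following the layout of ComfyUI's bundled extra_model_paths.yaml.example; the base_path and subfolder names below are assumptions, use the actual paths shown by the model downloader on your machine:

```yaml
# Hypothetical example - point base_path at your SwarmUI Models folder.
comfyui:
    base_path: /home/Ubuntu/apps/StableSwarmUI/Models/   # assumed install path
    checkpoints: Stable-Diffusion/
    diffusion_models: diffusion_models/
    loras: Lora/
    vae: VAE/
    clip: clip/
```

ComfyUI reads extra_model_paths.yaml from its own root folder at startup, which is why the transcript copies the file into the ComfyUI folder before launching.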

  • 01:10:42 Then return back to the Massed Compute instructions. I am first going to show ComfyUI usage, then we

  • 01:10:48 will see SwarmUI. Okay, then we are going to use this command to start it, so copy this and

  • 01:10:58 open a new terminal inside this folder, ComfyUI version 78. This is our main installation folder;

  • 01:11:06 paste. And it will start ComfyUI. You can see the messages in here. Always look at the

  • 01:11:14 CMD to see the accurate messages. You see it is loading everything properly with torch 2.9.1 and

  • 01:11:22 CUDA 13. It is loaded. How do we access this from our computer? For accessing it I am going to use this

  • 01:11:31 URL: copy the link, and in here type 8888 like this and hit enter, because it has started on this port as

  • 01:11:41 you can see. Oh, actually this is the port, not 8888, so let's make it like this, yes. So now I am able

  • 01:11:49 to access the ComfyUI running on Massed Compute from my Windows computer. The rest is the same as in

  • 01:11:58 the Windows tutorial part; I am going to just load my preset, for example audio to lip sync.

  • 01:12:05 Yes, perfectly loaded; this is running on Massed Compute. I am going to choose my demo files, like

  • 01:12:13 this one. Let's choose our audio file, this one. Then set this as in the Windows part: 45 seconds, 1081

  • 01:12:25 frames, and I need to type my prompt as usual. This is my prompt; run generate. Now let's
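
The 45 seconds / 1081 frames pairing here, and the 481 frames used for the shorter text-to-video clip later, are consistent with a 24 fps output where frames = seconds × 24 + 1. A quick sketch (the 24 fps rate is inferred from these numbers, not stated in the video):

```python
# frames = seconds * fps + 1; fps = 24 is inferred from the 45 s -> 1081 frames pairing.
def frames_for(seconds: int, fps: int = 24) -> int:
    return seconds * fps + 1

print(frames_for(45))  # 1081 (the 45-second lip-sync clip)
print(frames_for(20))  # 481  (a 20-second clip, matching the later example)
```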

  • 01:12:36 see the speed from the start: the model loading and everything. Okay, it is already loaded.

  • 01:12:43 It is going to start processing in a moment.  The speed of Massed Compute is unchallenged.

  • 01:12:49 Not SimplePod, not RunPod, nothing can match the disk speed of Massed Compute. It already

  • 01:12:55 started generating, so the whole process takes like 5 minutes once you have done it. You will get

  • 01:13:03 used to it; you can use Massed Compute all the time if you don't have a powerful GPU. This is one

  • 01:13:08 of my recommended cloud server providers. So we are almost done getting the 45-second HD video. It

  • 01:13:18 is taking less than 1 minute to generate. We can see the whole process live while

  • 01:13:25 I am recording. Everything is so transparent. So it is now VAE decoding. Let's see the output

  • 01:13:33 in here: it is going to do video combine. This is running on the cloud, not on my machine, but I am

  • 01:13:39 using it from my computer. And it is almost ready. Yes, it is ready, let's see it.

  • 01:13:55 So you can right click and save the video: right click and save

  • 01:14:00 preview. Or it is also saved in the output folder inside ComfyUI, so the

  • 01:14:09 video is here; I can copy this and move it into the ThinLinc shared folder, you see it is here,

  • 01:14:16 this is with audio. So save preview will also download the video to your computer.

  • 01:14:26 So how do we use SwarmUI with this ComfyUI installation on Massed Compute? I am going to

  • 01:14:32 close my ComfyUI and press Ctrl Alt D, it will minimize everything. You see we have the RunPod stable SwarmUI

  • 01:14:41 update script; this is actually the latest version of SwarmUI and it will update it. Alternatively, you can also

  • 01:14:47 use the new installer from here, but this one is working pretty well, so it will update SwarmUI.

  • 01:14:56 Then go to Server, Backends and change this. You should always use the newly installed ComfyUI,

  • 01:15:02 so it is here: ComfyUI. Ctrl L will select the path, Ctrl C, then edit this and paste it, and then

  • 01:15:12 main.py; in here it is like this, main.py, then in here --use-sage-attention, and save. We are

  • 01:15:23 ready, but we need to check the models. They are already here, since I have downloaded them into the

  • 01:15:54 SwarmUI folder; I need to check where the Lora folder is. So the Lora is here. Choose file.

  • 01:16:00 The presets are here, the amazing SwarmUI presets: import, refresh, sort by name.

  • 01:16:07 Now I can use the preset that I want, for example LTX 2 text to video, direct apply;

  • 01:16:16 always use quick tools, reset params to default, then direct apply. Then type your prompt: a spaceship

  • 01:16:23 going in space with cinematic background music. Okay, let's select our aspect ratio and generate.

  • 01:16:33 So the usage is exactly the same as in the Windows part. I am not repeating the details; please do not

  • 01:16:39 skip those parts. But this is running on the cloud machine; you can see that it will start

  • 01:16:46 the processing immediately on Massed Compute. This is fast and I can generate very long videos

  • 01:16:53 this way: advanced options, text to video. I can make this 481 frames. Let's play this one.

  • 01:17:02 So you see, it is working amazingly. Please like and subscribe. Join

  • 01:17:07 our Patreon to support us and get  these amazing presets. See you later.
