A problem regarding stable video diffusion, with inference results yielding black images #7399
                  
                    
DoloresChong started this conversation in General
              
-
I suspect that one or more operations require float32 precision. Could you try loading the pipeline in float32 with one of the CPU offload modes?
pipeline = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid", torch_dtype=torch.float32)
# Enable only one offload mode; sequential offload saves more VRAM but is slower.
pipeline.enable_model_cpu_offload()  # or: pipeline.enable_sequential_cpu_offload()
img = Image.open(r"genner_img\2024-03-11\23-09-10.jpg")
frames = pipeline(img, decode_chunk_size=1, generator=generator, output_type='np', height=256, width=256).frames[0]
export_to_video(frames, "generated.mp4", fps=7)
Sequential offload might take a long time, but I wonder whether it works this way.
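One way to test the precision hypothesis before re-running in float32 is to inspect the frames the pipeline returned. The sketch below is only a diagnostic aid, not part of diffusers: `diagnose_frames` and `black_threshold` are hypothetical names, and it assumes `frames` is the array obtained with `output_type='np'`, i.e. floats in [0, 1] with shape (num_frames, height, width, 3). A high NaN fraction would point toward a numerical problem such as half-precision overflow, while finite values near zero would suggest something else.

```python
import numpy as np


def diagnose_frames(frames, black_threshold=1e-3):
    """Summarize whether a stack of frames is NaN-filled or effectively black.

    Assumes float frames in [0, 1]; `black_threshold` is an arbitrary cutoff
    chosen for illustration.
    """
    arr = np.asarray(frames, dtype=np.float32)
    # Fraction of pixel values that are NaN.
    nan_fraction = float(np.isnan(arr).mean())
    # Only finite values are used to judge brightness.
    finite = arr[np.isfinite(arr)]
    max_value = float(finite.max()) if finite.size else float("nan")
    return {
        "nan_fraction": nan_fraction,
        "max_value": max_value,
        "all_black": bool(finite.size == 0 or max_value <= black_threshold),
    }
```

Calling `diagnose_frames(frames)` right after the pipeline call, before `export_to_video`, shows whether the black output is already present in the raw array rather than introduced while writing the video.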
            -
It could also be because SVD is not known to generate videos at such a low resolution.
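To try the resolution hypothesis without jumping straight back to full size, one option is to step through a few intermediate sizes. The helper below is a rough sketch under stated assumptions: the 1024x576 native size is taken from the SVD model card, rounding to a multiple of 64 is a conservative choice of mine (the pipeline itself checks for a smaller multiple), and `scaled_resolution` is a hypothetical name, not a diffusers function.

```python
NATIVE_WIDTH, NATIVE_HEIGHT = 1024, 576  # resolution listed on the SVD model card


def scaled_resolution(scale, multiple=64):
    """Scale the native size and round each side down to a multiple of `multiple`."""
    if not 0 < scale <= 1:
        raise ValueError("scale must be in (0, 1]")
    width = max(multiple, int(NATIVE_WIDTH * scale) // multiple * multiple)
    height = max(multiple, int(NATIVE_HEIGHT * scale) // multiple * multiple)
    return width, height


# Candidate sizes to try, largest first.
for scale in (1.0, 0.75, 0.5):
    print(scale, scaled_resolution(scale))
```

The resulting `(width, height)` pairs can be passed as the `width` and `height` arguments of the pipeline call to see at which size, if any, the frames stop coming out black.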
-
I ran into an issue with inference in Stable Video Diffusion. When I ran the sample code, it failed because of insufficient VRAM (video random access memory). I then reduced the output width and height, after which inference ran quickly. However, every frame I obtained was completely black, and I'm unsure why.
Sample code:
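import torch
from PIL import Image
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import export_to_video

generator = torch.manual_seed(42)  # assumed seed; the post does not show how generator was created
pipeline = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid", torch_dtype=torch.float16, variant="fp16"
)
pipeline.enable_model_cpu_offload()
img = Image.open(r"genner_img\2024-03-11\23-09-10.jpg")
frames = pipeline(img, decode_chunk_size=1, generator=generator, output_type='np', height=256, width=256).frames[0]
export_to_video(frames, "generated.mp4", fps=7)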