The DeepFuze lipsync workflow in ComfyUI works perfectly on my Mac with an M1 chip, but sadly it is very slow because it writes many temp files.
Correct me if I'm wrong, but I think that is because it does not process frames in memory; it passes them through files on disk instead.
So, does anyone else want a command line like this:
# --ref-video : the reference video
# --audio     : the input audio file
# --output    : the lip-synced, enhanced output video (very high quality)
python lipsync.py --ref-video path/to/video.mp4 \
                  --audio path/to/input_audio.wav \
                  --output path/to/output_lipsynced_and_enhanced.mp4
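For what it's worth, parsing those flags would only take a few lines of argparse. This skeleton is purely hypothetical (lipsync.py does not exist yet); it just mirrors the command above:

```python
# Hypothetical CLI skeleton for the proposed lipsync.py; flag names match the command above.
import argparse

def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="In-memory lip-sync + face enhancement")
    parser.add_argument("--ref-video", required=True, help="reference video to lip-sync")
    parser.add_argument("--audio", required=True, help="driving audio file (wav)")
    parser.add_argument("--output", required=True, help="path for the lip-synced output video")
    return parser.parse_args()

if __name__ == "__main__":
    args = parse_args()
    print(f"video={args.ref_video} audio={args.audio} output={args.output}")
```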
Key points:
- Uses GPU acceleration on Apple M1 chips via MPS (Metal Performance Shaders), with the ONNX models running through the CoreMLExecutionProvider
- Processes frames in memory to minimize disk I/O (this is the critical point, I think)
- Parallelizes operations where possible
- Uses ONNX Runtime for optimized model inference
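On the CoreMLExecutionProvider point: in ONNX Runtime, choosing that provider is just a session-creation argument. A minimal sketch, using the model file mentioned in step 4 below and the standard onnxruntime API:

```python
# Sketch: create an ONNX Runtime session that prefers CoreML on Apple Silicon,
# falling back to CPU if CoreML is unavailable.
import onnxruntime as ort

session = ort.InferenceSession(
    "wav2lip_gan.onnx",
    providers=["CoreMLExecutionProvider", "CPUExecutionProvider"],
)
print(session.get_providers())  # shows which providers were actually loaded
```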
How It Works:
1. Input Processing: loads the video frames and audio into memory
2. Audio Analysis: converts the audio into mel spectrograms
3. Face Detection: locates faces in each frame
4. Lip Synchronization: adjusts lip movements to match the audio, using wav2lip_gan.onnx
5. Face Enhancement: improves the visual quality of facial features, using gfpgan_1.4.onnx
6. Output Generation: combines the processed frames with the given audio to produce the final output video
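To make the in-memory idea concrete, here is a rough sketch of steps 1 and 2 with everything held as numpy arrays. The library choices (OpenCV, librosa) are my assumptions, not necessarily what DeepFuze uses, and steps 3-5 are left as a comment since their tensor shapes depend on the specific model exports:

```python
# Sketch of steps 1-2 of the in-memory pipeline; cv2/librosa are assumed libraries.
import cv2
import librosa
import numpy as np

def load_frames(path: str) -> list[np.ndarray]:
    """Step 1: decode every frame into RAM instead of writing temp files."""
    cap = cv2.VideoCapture(path)
    frames = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(frame)  # BGR uint8 array, kept in memory
    cap.release()
    return frames

def audio_to_mel(path: str, sr: int = 16000, n_mels: int = 80) -> np.ndarray:
    """Step 2: convert the driving audio into a mel spectrogram."""
    wav, _ = librosa.load(path, sr=sr)
    mel = librosa.feature.melspectrogram(y=wav, sr=sr, n_mels=n_mels)
    return librosa.power_to_db(mel)

frames = load_frames("path/to/video.mp4")
mel = audio_to_mel("path/to/input_audio.wav")
# Steps 3-5 (face detection, wav2lip_gan.onnx, gfpgan_1.4.onnx) would consume
# `frames` and `mel` directly as numpy arrays, with no temp files in between.
```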
From step 5, it would be easy to pipe the enhanced frames straight into another process for upscaling; see the sketch below.
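For instance (a sketch, not DeepFuze code): raw frames can be written to an ffmpeg subprocess's stdin, and ffmpeg can apply an upscaling filter and mux in the audio in one pass. Frame size, fps, and the scale filter here are placeholder assumptions:

```python
# Sketch: pipe in-memory frames to ffmpeg for upscaling + muxing with the audio.
import subprocess
import numpy as np

w, h, fps = 1280, 720, 25
frames = [np.zeros((h, w, 3), np.uint8)] * fps  # stand-in for the enhanced frames from step 5

cmd = [
    "ffmpeg", "-y",
    "-f", "rawvideo", "-pix_fmt", "bgr24", "-s", f"{w}x{h}", "-r", str(fps),
    "-i", "-",                       # raw frames arrive on stdin
    "-i", "path/to/input_audio.wav",
    "-vf", "scale=iw*2:ih*2",        # example upscale; swap in any upscaler here
    "-c:v", "libx264", "-pix_fmt", "yuv420p",
    "-c:a", "aac", "-shortest",
    "path/to/output_lipsynced_and_enhanced.mp4",
]
proc = subprocess.Popen(cmd, stdin=subprocess.PIPE)
for frame in frames:
    proc.stdin.write(frame.tobytes())
proc.stdin.close()
proc.wait()
```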
I'm a newbie with Python and don't know how to work out the detailed solution, but this approach could be very practical.
Correct me if I'm wrong.
Does anyone have ideas on adapting components from [DeepFuze](https://github.com/SamKhoze/ComfyUI-DeepFuze) and [VideoHelperSuite](https://github.com/Kosinkadink/ComfyUI-VideoHelperSuite) to create a more efficient command-line lip-syncing solution with reduced disk I/O?
Thank you