Commit 35dd94f

Merge branch 'main' of github.com:Winfredy/SadTalker
2 parents 479a5ad + 8a99c8b commit 35dd94f

6 files changed: +337 -133 lines changed

README.md

Lines changed: 39 additions & 15 deletions
@@ -36,6 +36,11 @@
 </div>
 
 ## 🔥 Highlight
+
+- 🔥 The extension for [stable-diffusion-webui](https://github.com/AUTOMATIC1111/stable-diffusion-webui) is online. Just install it via `extensions -> install from URL -> https://github.com/Winfredy/SadTalker`; check out more details [here](#sd-webui-extension).
+
+https://user-images.githubusercontent.com/4397546/222513483-89161f58-83d0-40e4-8e41-96c32b47bd4e.mp4
+
 - 🔥 Beta version of the `full image mode` is online! Check out [here](https://github.com/Winfredy/SadTalker#beta-full-bodyimage-generation) for more details.
 
 | still | still + enhancer | [input image @bagbag1815](https://twitter.com/bagbag1815/status/1642754319094108161) |
@@ -49,6 +54,10 @@
 
 ## 📋 Changelog (Previous changelog can be found [here](docs/changlelog.md))
 
+- __[2023.04.06]__: stable-diffusion webui extension is released.
+
+- __[2023.04.03]__: Enable TTS in huggingface and gradio local demo.
+
 - __[2023.03.30]__: Launch beta version of the full body mode.
 
 - __[2023.03.30]__: Launch new feature: through using reference videos, our algorithm can generate videos with more natural eye blinking and some eyebrow movement.
@@ -82,16 +91,14 @@ the 3D-aware face render for final video generation.
 - [ ] training code of each components.
 - [ ] Audio-driven Anime Avatar.
 - [ ] interpolate ChatGPT for a conversation demo 🤔
-- [ ] integrate with stable-diffusion-web-ui. (stay tuned!)
+- [x] integrate with stable-diffusion-web-ui. (stay tuned!)
 
-https://user-images.githubusercontent.com/4397546/222513483-89161f58-83d0-40e4-8e41-96c32b47bd4e.mp4
 
 
-## ⚙️ Installation
 
-#### Dependence Installation
+## ⚙️ Installation
 
-<details><summary>CLICK ME For Manual Installation </summary>
+#### Installing SadTalker on Linux:
 
 ```bash
 git clone https://github.com/Winfredy/SadTalker.git
@@ -108,25 +115,39 @@ conda install ffmpeg
 
 pip install -r requirements.txt
 
+### tts is optional for gradio demo.
+### pip install TTS
+
 ```
 
-</details>
+More tips on installation on Windows, and the Docker file, can be found [here](docs/install.md).
+
+#### Sd-Webui-Extension:
+<details><summary>CLICK ME</summary>
 
-<details><summary>CLICK For Docker Installation </summary>
+Install the latest version of [stable-diffusion-webui](https://github.com/AUTOMATIC1111/stable-diffusion-webui) and install SadTalker via `extension`.
+<img width="726" alt="image" src="https://user-images.githubusercontent.com/4397546/230698519-267d1d1f-6e99-4dd4-81e1-7b889259efbd.png">
 
-A Dockerfile is also provided by [@thegenerativegeneration](https://github.com/thegenerativegeneration) on [Docker Hub](https://hub.docker.com/repository/docker/wawa9000/sadtalker), which can be used directly as:
+Then restart stable-diffusion-webui with the command-line args set as in the block below. The models will be downloaded automatically to the right place. Alternatively, you can set `SADTALKER_CHECKPOINTS` to the path of pre-downloaded SadTalker checkpoints in `webui_user.sh` (Linux) or `webui_user.bat` (Windows):
 
 ```bash
-docker run --gpus "all" --rm -v $(pwd):/host_dir wawa9000/sadtalker \
-    --driven_audio /host_dir/deyu.wav \
-    --source_image /host_dir/image.jpg \
-    --expression_scale 1.0 \
-    --still \
-    --result_dir /host_dir
+# windows (webui_user.bat)
+set COMMANDLINE_ARGS=--no-gradio-queue --disable-safe-unpickle
+set SADTALKER_CHECKPOINTS=D:\SadTalker\checkpoints
+
+# linux (webui_user.sh)
+export COMMANDLINE_ARGS="--no-gradio-queue --disable-safe-unpickle"
+export SADTALKER_CHECKPOINTS=/path/to/SadTalker/checkpoints
 ```
+
+After installation, SadTalker can be used in stable-diffusion-webui directly.
+
+<img width="726" alt="image" src="https://user-images.githubusercontent.com/4397546/230698614-58015182-2916-4240-b324-e69022ef75b3.png">
+
 </details>
 
 
+
 #### Download Trained Models
 <details><summary>CLICK ME</summary>

@@ -161,9 +182,12 @@ python inference.py --driven_audio <audio.wav> --source_image <video.mp4 or pict
 ```
 The results will be saved in `results/$SOME_TIMESTAMP/*.mp4`.
 
-Or a local gradio demo can be run by:
+Or a local gradio demo similar to our [hugging-face demo](https://huggingface.co/spaces/vinthony/SadTalker) can be run by:
 
 ```bash
+
+## You need to manually install TTS (https://github.com/coqui-ai/TTS) via `pip install tts` in advance.
+
 python app.py
 ```
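
Review note on the README change: the `SADTALKER_CHECKPOINTS` override it documents can be sketched in isolation as below. This is a minimal illustration, not the extension's actual code; the helper name `resolve_checkpoint_dir` and its `default_dir` / `env` parameters are hypothetical and exist only for this example.

```python
import os
from pathlib import Path


def resolve_checkpoint_dir(default_dir, env=None):
    """Return the checkpoint directory, preferring SADTALKER_CHECKPOINTS.

    `resolve_checkpoint_dir`, `default_dir` and `env` are hypothetical
    names used only for this illustration.
    """
    env = os.environ if env is None else env
    override = env.get("SADTALKER_CHECKPOINTS")
    # An unset or empty variable falls back to the default location.
    return Path(override) if override else Path(default_dir)


if __name__ == "__main__":
    print(resolve_checkpoint_dir("extensions/SadTalker/checkpoints"))
```

Passing the environment as a parameter keeps the lookup easy to exercise without touching the real process environment.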

docs/install.md

Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+
+
+
+### Windows Native
+
+- Make sure `ffmpeg` is on your `%PATH%`, as suggested in [#54](https://github.com/Winfredy/SadTalker/issues/54); follow [this guide](https://www.geeksforgeeks.org/how-to-install-ffmpeg-on-windows/) to install `ffmpeg`.
+
+
+### Windows WSL
+- Make sure the environment is set up: `export LD_LIBRARY_PATH=/usr/lib/wsl/lib:$LD_LIBRARY_PATH`
+
+
+### Docker installation
+
+A Dockerfile is also provided by [@thegenerativegeneration](https://github.com/thegenerativegeneration) on [Docker Hub](https://hub.docker.com/repository/docker/wawa9000/sadtalker), which can be used directly as:
+
+```bash
+docker run --gpus "all" --rm -v $(pwd):/host_dir wawa9000/sadtalker \
+    --driven_audio /host_dir/deyu.wav \
+    --source_image /host_dir/image.jpg \
+    --expression_scale 1.0 \
+    --still \
+    --result_dir /host_dir
+```
+
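
Review note on the `ffmpeg` requirement in this file: a quick way to confirm that `ffmpeg` is reachable before running anything is to look it up on the search path. The sketch below uses only the standard-library `shutil.which`; the helper name `find_executable` is hypothetical and is not part of SadTalker.

```python
import shutil


def find_executable(name="ffmpeg"):
    """Return the full path of `name` if it is on PATH, else None.

    `find_executable` is a hypothetical helper for this illustration.
    """
    return shutil.which(name)


if __name__ == "__main__":
    path = find_executable()
    if path is None:
        print("ffmpeg not found on PATH; install it before running SadTalker")
    else:
        print("ffmpeg found at", path)
```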

scripts/download_models.sh

Lines changed: 12 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -1,12 +1,13 @@
11
mkdir ./checkpoints
2-
wget https://github.com/Winfredy/SadTalker/releases/download/v0.0.1/auido2exp_00300-model.pth -O ./checkpoints/auido2exp_00300-model.pth
3-
wget https://github.com/Winfredy/SadTalker/releases/download/v0.0.1/auido2pose_00140-model.pth -O ./checkpoints/auido2pose_00140-model.pth
4-
wget https://github.com/Winfredy/SadTalker/releases/download/v0.0.1/epoch_20.pth -O ./checkpoints/epoch_20.pth
5-
wget https://github.com/Winfredy/SadTalker/releases/download/v0.0.1/facevid2vid_00189-model.pth.tar -O ./checkpoints/facevid2vid_00189-model.pth.tar
6-
wget https://github.com/Winfredy/SadTalker/releases/download/v0.0.1/shape_predictor_68_face_landmarks.dat -O ./checkpoints/shape_predictor_68_face_landmarks.dat
7-
wget https://github.com/Winfredy/SadTalker/releases/download/v0.0.1/wav2lip.pth -O ./checkpoints/wav2lip.pth
8-
wget https://github.com/Winfredy/SadTalker/releases/download/v0.0.1/mapping_00229-model.pth.tar -O ./checkpoints/mapping_00229-model.pth.tar
9-
wget https://github.com/Winfredy/SadTalker/releases/download/v0.0.1/BFM_Fitting.zip -O ./checkpoints/BFM_Fitting.zip
10-
wget https://github.com/Winfredy/SadTalker/releases/download/v0.0.1/hub.zip -O ./checkpoints/hub.zip
11-
unzip ./checkpoints/hub.zip -d ./checkpoints/
12-
unzip ./checkpoints/BFM_Fitting.zip -d ./checkpoints/
2+
wget -nc https://github.com/Winfredy/SadTalker/releases/download/v0.0.1/auido2exp_00300-model.pth -O ./checkpoints/auido2exp_00300-model.pth
3+
wget -nc https://github.com/Winfredy/SadTalker/releases/download/v0.0.1/auido2pose_00140-model.pth -O ./checkpoints/auido2pose_00140-model.pth
4+
wget -nc https://github.com/Winfredy/SadTalker/releases/download/v0.0.1/epoch_20.pth -O ./checkpoints/epoch_20.pth
5+
wget -nc https://github.com/Winfredy/SadTalker/releases/download/v0.0.1/facevid2vid_00189-model.pth.tar -O ./checkpoints/facevid2vid_00189-model.pth.tar
6+
wget -nc https://github.com/Winfredy/SadTalker/releases/download/v0.0.1/shape_predictor_68_face_landmarks.dat -O ./checkpoints/shape_predictor_68_face_landmarks.dat
7+
wget -nc https://github.com/Winfredy/SadTalker/releases/download/v0.0.1/wav2lip.pth -O ./checkpoints/wav2lip.pth
8+
wget -nc https://github.com/Winfredy/SadTalker/releases/download/v0.0.1/mapping_00229-model.pth.tar -O ./checkpoints/mapping_00229-model.pth.tar
9+
wget -nc https://github.com/Winfredy/SadTalker/releases/download/v0.0.1/BFM_Fitting.zip -O ./checkpoints/BFM_Fitting.zip
10+
wget -nc https://github.com/Winfredy/SadTalker/releases/download/v0.0.1/hub.zip -O ./checkpoints/hub.zip
11+
12+
unzip -n ./checkpoints/hub.zip -d ./checkpoints/
13+
unzip -n ./checkpoints/BFM_Fitting.zip -d ./checkpoints/
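
Review note on the `-nc` / `-n` flags added above: the intent appears to be that re-running the script skips files that are already present. A rough Python equivalent of that skip-if-present behaviour is sketched below. The function name `download_if_missing` and its injectable `fetch` parameter are hypothetical; like the shell version, it treats any existing file as complete, so a partially downloaded file would not be retried.

```python
import os
import urllib.request


def download_if_missing(url, dest, fetch=urllib.request.urlretrieve):
    """Download `url` to `dest` unless `dest` already exists.

    Returns True if a download happened, False if it was skipped.
    `download_if_missing` and `fetch` are hypothetical names for this sketch.
    """
    if os.path.exists(dest):
        # Existing file: skip it, as the script does when re-run.
        return False
    os.makedirs(os.path.dirname(dest) or ".", exist_ok=True)
    fetch(url, dest)
    return True
```

Making `fetch` a parameter lets the skip logic be checked without network access.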

scripts/extension.py

Lines changed: 133 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,133 @@
1+
import os, sys
2+
from pathlib import Path
3+
import tempfile
4+
import gradio as gr
5+
from modules.call_queue import wrap_gradio_gpu_call, wrap_queued_call
6+
from modules.shared import opts, OptionInfo
7+
from modules import shared, paths, script_callbacks
8+
import launch
9+
import glob
10+
11+
def get_source_image(image):
12+
return image
13+
14+
def get_img_from_txt2img(x):
15+
talker_path = Path(paths.script_path) / "outputs"
16+
imgs_from_txt_dir = str(talker_path / "txt2img-images/")
17+
imgs = glob.glob(imgs_from_txt_dir+'/*/*.png')
18+
imgs.sort(key=lambda x:os.path.getmtime(os.path.join(imgs_from_txt_dir, x)))
19+
img_from_txt_path = os.path.join(imgs_from_txt_dir, imgs[-1])
20+
return img_from_txt_path, img_from_txt_path
21+
22+
def get_img_from_img2img(x):
23+
talker_path = Path(paths.script_path) / "outputs"
24+
imgs_from_img_dir = str(talker_path / "img2img-images/")
25+
imgs = glob.glob(imgs_from_img_dir+'/*/*.png')
26+
imgs.sort(key=lambda x:os.path.getmtime(os.path.join(imgs_from_img_dir, x)))
27+
img_from_img_path = os.path.join(imgs_from_img_dir, imgs[-1])
28+
return img_from_img_path, img_from_img_path
29+
30+
def install():
31+
32+
kv = {
33+
"face-alignment": "face-alignment==1.3.5",
34+
"imageio": "imageio==2.19.3",
35+
"imageio-ffmpeg": "imageio-ffmpeg==0.4.7",
36+
"librosa":"librosa==0.8.0",
37+
"pydub":"pydub==0.25.1",
38+
"scipy":"scipy==1.8.1",
39+
"tqdm": "tqdm",
40+
"yacs":"yacs==0.1.8",
41+
"pyyaml": "pyyaml",
42+
"dlib": "dlib-bin",
43+
"gfpgan": "gfpgan",
44+
}
45+
46+
for k,v in kv.items():
47+
print(k, launch.is_installed(k))
48+
if not launch.is_installed(k):
49+
launch.run_pip("install "+ v, "requirements for SadTalker")
50+
51+
52+
if os.getenv('SADTALKER_CHECKPOINTS'):
53+
print('load Sadtalker Checkpoints from '+ os.getenv('SADTALKER_CHECKPOINTS'))
54+
else:
55+
### run the scripts to downlod models to correct localtion.
56+
print('download models for SadTalker')
57+
launch.run("cd " + paths.script_path+"/extensions/SadTalker && bash ./scripts/download_models.sh", live=True)
58+
print('SadTalker is successfully installed!')
59+
60+
61+
def on_ui_tabs():
62+
install()
63+
64+
sys.path.extend([paths.script_path+'/extensions/SadTalker'])
65+
66+
repo_dir = paths.script_path+'/extensions/SadTalker/'
67+
68+
result_dir = opts.sadtalker_result_dir
69+
os.makedirs(result_dir, exist_ok=True)
70+
71+
from src.gradio_demo import SadTalker
72+
73+
if os.getenv('SADTALKER_CHECKPOINTS'):
74+
checkpoint_path = os.getenv('SADTALKER_CHECKPOINTS')
75+
else:
76+
checkpoint_path = repo_dir+'checkpoints/'
77+
78+
sad_talker = SadTalker(checkpoint_path=checkpoint_path, config_path=repo_dir+'src/config', lazy_load=True)
79+
80+
with gr.Blocks(analytics_enabled=False) as audio_to_video:
81+
with gr.Row().style(equal_height=False):
82+
with gr.Column(variant='panel'):
83+
with gr.Tabs(elem_id="sadtalker_source_image"):
84+
with gr.TabItem('Upload image'):
85+
with gr.Row():
86+
input_image = gr.Image(label="Source image", source="upload", type="filepath").style(height=512,width=512)
87+
88+
with gr.Row():
89+
submit_image2 = gr.Button('load From txt2img', variant='primary')
90+
submit_image2.click(fn=get_img_from_txt2img, inputs=input_image, outputs=[input_image, input_image])
91+
92+
submit_image3 = gr.Button('load from img2img', variant='primary')
93+
submit_image3.click(fn=get_img_from_img2img, inputs=input_image, outputs=[input_image, input_image])
94+
95+
with gr.Tabs(elem_id="sadtalker_driven_audio"):
96+
with gr.TabItem('Upload'):
97+
with gr.Column(variant='panel'):
98+
99+
with gr.Row():
100+
driven_audio = gr.Audio(label="Input audio", source="upload", type="filepath")
101+
102+
103+
with gr.Column(variant='panel'):
104+
with gr.Tabs(elem_id="sadtalker_checkbox"):
105+
with gr.TabItem('Settings'):
106+
with gr.Column(variant='panel'):
107+
is_still_mode = gr.Checkbox(label="Still Mode (fewer head motion)").style(container=True)
108+
is_enhance_mode = gr.Checkbox(label="Enhance Mode (better face quality )").style(container=True)
109+
submit = gr.Button('Generate', elem_id="sadtalker_generate", variant='primary')
110+
111+
with gr.Tabs(elem_id="sadtalker_genearted"):
112+
gen_video = gr.Video(label="Generated video", format="mp4").style(width=256)
113+
114+
115+
### gradio gpu call will always return the html,
116+
submit.click(
117+
fn=wrap_queued_call(sad_talker.test),
118+
inputs=[input_image,
119+
driven_audio,
120+
is_still_mode,
121+
is_enhance_mode],
122+
outputs=[gen_video, ]
123+
)
124+
125+
return [(audio_to_video, "SadTalker", "extension")]
126+
127+
def on_ui_settings():
128+
talker_path = Path(paths.script_path) / "outputs"
129+
section = ('extension', "SadTalker")
130+
opts.add_option("sadtalker_result_dir", OptionInfo(str(talker_path / "SadTalker/"), "Path to save results of sadtalker", section=section))
131+
132+
script_callbacks.on_ui_settings(on_ui_settings)
133+
script_callbacks.on_ui_tabs(on_ui_tabs)
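
Review note on `get_img_from_txt2img` / `get_img_from_img2img`: both index `imgs[-1]`, which raises `IndexError` when the output directory contains no PNG yet. One possible shape for a shared helper is sketched below, assuming the same `<root>/<subdir>/*.png` layout the glob pattern in the diff implies. The name `newest_png` and its return-`None` convention are hypothetical and not part of the commit.

```python
import glob
import os


def newest_png(root):
    """Return the most recently modified PNG under root/*/, or None.

    `newest_png` is a hypothetical helper for this illustration.
    """
    # glob already returns paths that include `root`, so no re-joining is needed.
    candidates = glob.glob(os.path.join(root, "*", "*.png"))
    if not candidates:
        return None
    return max(candidates, key=os.path.getmtime)
```

A caller would then need to decide what to show in the UI when `None` comes back, for example leaving the image component unchanged.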
