
Commit 9db82dc

Merge pull request #9 from mowshon/Refactoring
Code refactoring - Python 3.12
2 parents: ea24550 + d8cd5b2


46 files changed: +2607 / -3250 lines

.gitignore

Lines changed: 12 additions & 1 deletion
```diff
@@ -127,4 +127,15 @@ dmypy.json
 
 # Pyre type checker
 .pyre/
-.idea/
+.idea/
+
+/*.mp3
+/*.mp4
+/*.wav
+/weights
+/*.png
+/cache
+/example.py
+/*.pth
+/test_*.py
+/source
```
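The new ignore rules all start with a leading `/`, which anchors each pattern to the repository root: `/*.mp3` ignores MP3 files only at the top level, not in subdirectories. A quick way to confirm this behavior (assuming `git` is installed; the file names here are illustrative, not from the commit):

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
printf '/*.mp3\n/cache\n' > .gitignore
# Exit status 0 from check-ignore means the path is ignored.
git check-ignore -q song.mp3 && echo "song.mp3 is ignored"
# The same file name in a subdirectory does NOT match a root-anchored pattern.
git check-ignore -q media/song.mp3 || echo "media/song.mp3 is not ignored"
```

Dropping the leading slash (`*.mp3`) would make the pattern match at every directory level.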

MANIFEST.in

Lines changed: 0 additions & 1 deletion
This file was deleted.

README.md

Lines changed: 68 additions & 36 deletions
````diff
@@ -1,50 +1,87 @@
-# LipSync
-Lips Synchronization (Wav2Lip).
+# lipsync
 
-## Install
-```
-git clone git@github.com:mowshon/lipsync.git
-cd lipsync
-python setup.py install
-```
+lipsync is a Python library that moves lips in a video (or image) to match a given audio file. It is based on [Wav2Lip](https://github.com/Rudrabha/Wav2Lip), but many unneeded files and libraries have been removed, and the code has been updated to work with the latest versions of Python.
 
-Download the weights
-----------
-| Model | Description | Link to the model |
-| :-------------: | :---------------: | :---------------: |
-| Wav2Lip | Highly accurate lip-sync | [Link](https://iiitaphyd-my.sharepoint.com/:u:/g/personal/radrabha_m_research_iiit_ac_in/Eb3LEzbfuKlJiR600lQWRxgBIY27JZg80f7V9jtMfbNDaQ?e=TBFBVW) |
-| Wav2Lip + GAN | Slightly inferior lip-sync, but better visual quality | [Link](https://iiitaphyd-my.sharepoint.com/:u:/g/personal/radrabha_m_research_iiit_ac_in/EdjI7bZlgApMqsVoEUUXpLsBxqXbn5z8VTmoxp55YNDcIA?e=n9ljGW) |
+---
 
-### Project structure
-```
-└── project-folder
-   ├── cache/
-   ├── main.py
-   ├── wav2lip.pth
-   ├── face.mp4
-   └── audio.wav
+## Features
+
+- **Video lip synchronization**
+  Synchronize lips in an existing video to match a new audio file.
+
+- **Image lip animation**
+  Provide a single image and an audio file to generate a talking video.
+
+- **Runs on CPU and CUDA**
+  You can choose whether to run on your CPU or a CUDA-enabled GPU for faster processing.
+
+- **Caching**
+  If you use the same video multiple times with different audio files, lipsync can cache frames and reuse them. This makes future runs much faster.
+
+---
+
+## Pre-Trained Weights
+
+lipsync works with two different pre-trained models:
+
+1. **Wav2Lip ([Download wav2lip.pth](https://drive.google.com/file/d/1qKU8HG8dR4nW4LvCqpEYmSy6LLpVkZ21/view?usp=sharing))**
+   - More accurate lip synchronization
+   - Lips in the result may appear somewhat blurred
+
+2. **Wav2Lip + GAN ([Download wav2lip_gan.pth](https://drive.google.com/file/d/13Ktexq-nZOsAxqrTdMh3Q0ntPB3yiBtv/view?usp=sharing))**
+   - Lips in the result are clearer
+   - Synchronization may be slightly less accurate
+
+---
+
+## Installation
+
+```bash
+pip install lipsync
 ```
 
-## Example
+---
+
+## Usage Example
+
+Below is a simple example in Python. This assumes you have the model weights (either `wav2lip.pth` or `wav2lip_gan.pth`) in a `weights/` folder.
+
 ```python
 from lipsync import LipSync
 
-
 lip = LipSync(
-    checkpoint_path='wav2lip.pth',  # Downloaded weights
+    model='wav2lip',
+    checkpoint_path='weights/wav2lip.pth',
     nosmooth=True,
-    cache_dir='cache'  # Cache directory
+    device='cuda',
+    cache_dir='cache',
+    img_size=96,
+    save_cache=True,
 )
 
 lip.sync(
-    'face.mp4',
-    'audio.wav',
-    'output-file.mp4'
+    'source/person.mp4',
+    'source/audio.wav',
+    'result.mp4',
 )
 ```
 
-License and Citation
-----------
+### Important Parameters
+- **model**: `'wav2lip'` or `'wav2lip_gan'`
+- **checkpoint_path**: Path to the model weights (e.g., `wav2lip.pth`, `wav2lip_gan.pth`)
+- **nosmooth**: Set `True` to disable smoothing
+- **device**: `'cpu'` or `'cuda'`
+- **cache_dir**: Directory for saving frames
+- **save_cache**: Set `True` to save frames to `cache_dir` for faster re-runs
+
+---
+
+### Ethical Use
+
+Please be mindful when using **lipsync**. This library can generate videos that look convincing, so it could be used to spread disinformation or harm someone’s reputation. We encourage using it only for **entertainment** or **scientific** purposes, and always with **respect and consent** from any people involved.
+
+### License and Citation
+
 The software can only be used for personal/research/non-commercial purposes. Please cite the following paper if you have use this code:
 ```
 @inproceedings{10.1145/3394171.3413532,
@@ -64,8 +101,3 @@ The software can only be used for personal/research/non-commercial purposes. Ple
 series = {MM '20}
 }
 ```
-
-
-Acknowledgements
-----------
-Parts of the code structure is inspired by this [TTS repository](https://github.com/r9y9/deepvoice3_pytorch). We thank the author for this wonderful code. The code for Face Detection has been taken from the [face_alignment](https://github.com/1adrianb/face-alignment) repository. We thank the authors for releasing their code and models.
````
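The new README advertises frame caching (`cache_dir`, `save_cache`): running the same video with different audio files skips re-decoding on later runs. lipsync's actual cache format is internal to the library, but the general technique can be sketched with the standard library alone. The function names below (`video_cache_key`, `load_or_extract`) are hypothetical, not part of the lipsync API:

```python
import hashlib
import pickle
from pathlib import Path
from typing import Callable, List

def video_cache_key(video_path: str) -> str:
    # Key the cache on the video file's bytes so editing the video invalidates it.
    return hashlib.sha256(Path(video_path).read_bytes()).hexdigest()[:16]

def load_or_extract(video_path: str,
                    extract: Callable[[str], List[bytes]],
                    cache_dir: str = 'cache') -> List[bytes]:
    # Return cached frames if present; otherwise run the (expensive) extractor
    # once and pickle its result into cache_dir for future runs.
    cache_file = Path(cache_dir) / (video_cache_key(video_path) + '.pkl')
    if cache_file.exists():
        return pickle.loads(cache_file.read_bytes())
    frames = extract(video_path)
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_bytes(pickle.dumps(frames))
    return frames
```

Because the key depends only on the video, swapping in a different audio file reuses the cached frames, which is exactly the scenario the README describes as "much faster" on repeat runs.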

lipsync/Wav2Lip/.gitignore

Lines changed: 0 additions & 16 deletions
This file was deleted.

lipsync/Wav2Lip/audio.py

Lines changed: 0 additions & 136 deletions
This file was deleted.

lipsync/Wav2Lip/checkpoints/README.md

Lines changed: 0 additions & 1 deletion
This file was deleted.

lipsync/Wav2Lip/evaluation/README.md

Lines changed: 0 additions & 59 deletions
This file was deleted.
