Commit 25cf23c

Authored by Rick-McCoy
Merge pull request #2 from Deepest-Project/master
Merging Deepest-Project master into swpark.
2 parents 53c5b3e + 3aaad48 commit 25cf23c

3 files changed: 43 additions, 1 deletion

README.md

Lines changed: 35 additions & 1 deletion
@@ -1 +1,35 @@
-# MelNet
+# MelNet (WIP)
+
+Implementation of [MelNet: A Generative Model for Audio in the Frequency Domain](<https://arxiv.org/abs/1906.01083>) (Work in progress)
+
+## Prerequisites
+
+- Tested with Python 3.6.8 and PyTorch 1.2.0.
+- `pip install -r requirements.txt`
+
+## How to train
+
+- Download the training data: you may use either Blizzard (22,050 Hz) or VoxCeleb2 (16,000 Hz). Both `m4a` and `wav` files can be used.
+- For `wav` files, you need to edit the hardcoded file extension at `datasets/wavloader.py#L38`. This will be fixed soon.
+- `python trainer.py -c config/voxceleb2.yaml -n [name of run] -t [tier number] -b [batch size]`
+- You may need to adjust the batch size for each tier. On a Tesla V100 (32 GB), b=4 for t=1 and b=8 for t=2 were tested.
+- We found that only the SGD optimizer with `lr=0.0001, momentum=0` works properly. Other optimizers such as RMSProp or Adam led to severe loss instability.
+
+![](./assets/tensorboard.png)
+
+## To-do
+
+- [x] Implement upsampling procedure
+- [x] GMM sampling + loss function
+- [ ] Unconditional audio generation
+- [ ] TTS synthesis (PR [#3](<https://github.com/Deepest-Project/MelNet/pull/3>) is in review)
+- [x] Tensorboard logging
+- [ ] Multi-GPU training
+
+## Implementation authors
+
+- [Seungwon Park](<https://github.com/seungwonpark>), [Joonyoung Lee](<https://github.com/Rick-McCoy>), [Yoonhyung Lee](<https://github.com/LEEYOONHYUNG>), [Joowhan Song](<https://github.com/Joovvhan>) @ Deepest Season 6
+
+## License
+
+MIT License
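
The training command added to the README takes a config file, a run name, a tier number, and a batch size. As a rough sketch only, the helper below assembles that command line and falls back to the batch sizes the README reports as tested on a Tesla V100 (32 GB). `build_train_command` and `TESTED_BATCH_SIZES` are hypothetical names introduced here for illustration. They are not part of the repository, and the flags are taken verbatim from the README line rather than from `trainer.py` itself.

```python
import shlex

# Batch sizes the README reports as tested on a Tesla V100 (32 GB), keyed by tier.
TESTED_BATCH_SIZES = {1: 4, 2: 8}


def build_train_command(run_name, tier, config="config/voxceleb2.yaml", batch_size=None):
    """Return the argv list for one trainer.py run (hypothetical helper)."""
    if batch_size is None:
        # Fall back to the README's tested value; other tiers need an explicit size.
        if tier not in TESTED_BATCH_SIZES:
            raise ValueError("no tested batch size for tier %d; pass batch_size" % tier)
        batch_size = TESTED_BATCH_SIZES[tier]
    return ["python", "trainer.py", "-c", config, "-n", run_name,
            "-t", str(tier), "-b", str(batch_size)]


if __name__ == "__main__":
    # Print one shell-quoted command per tested tier.
    for tier in sorted(TESTED_BATCH_SIZES):
        argv = build_train_command("demo_t%d" % tier, tier)
        print(" ".join(shlex.quote(arg) for arg in argv))
```

The returned list can be passed to `subprocess.run` unchanged. Tiers without a tested value raise an error instead of guessing a batch size.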

assets/tensorboard.png

52.8 KB
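
The README change in this commit reports that only SGD with `lr=0.0001, momentum=0` trained stably. With momentum set to zero, the usual momentum update collapses to a plain gradient step. The sketch below shows this in pure Python on lists of floats. `sgd_step` is an illustrative helper and is not code from this repository. In PyTorch the equivalent setting would presumably be `torch.optim.SGD(model.parameters(), lr=0.0001, momentum=0)`, where `model` is a placeholder.

```python
def sgd_step(params, grads, velocity, lr=0.0001, momentum=0.0):
    """One SGD update on plain lists of floats (illustrative only).

    Uses the common momentum form: v <- momentum * v + g, then p <- p - lr * v.
    """
    new_velocity = [momentum * v + g for v, g in zip(velocity, grads)]
    new_params = [p - lr * v for p, v in zip(params, new_velocity)]
    return new_params, new_velocity


if __name__ == "__main__":
    params = [1.0, -2.0]
    grads = [10.0, -10.0]
    velocity = [5.0, 5.0]  # stale velocity has no effect when momentum == 0
    params, velocity = sgd_step(params, grads, velocity)
    print(params, velocity)
```

Because the previous velocity is multiplied by zero, each step depends only on the current gradient, scaled by the small learning rate.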

requirements.txt

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+librosa
+matplotlib
+numpy
+pydub
+pyyaml
+tensorboard
+torch==1.2.0
+tqdm
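
Of the eight entries in the new `requirements.txt`, only `torch` carries an exact version pin. As a small, non-authoritative sketch, the snippet below separates pinned from unpinned entries. `split_pins` is a hypothetical helper that handles only bare names and `name==version` lines, which is all this file uses. It is not a general requirements parser.

```python
# Contents of requirements.txt as added in this commit.
REQUIREMENTS = """\
librosa
matplotlib
numpy
pydub
pyyaml
tensorboard
torch==1.2.0
tqdm
"""


def split_pins(text):
    """Split requirement lines into ({name: version}, [unpinned names])."""
    pinned, unpinned = {}, []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if "==" in line:
            name, version = line.split("==", 1)
            pinned[name.strip()] = version.strip()
        else:
            unpinned.append(line)
    return pinned, unpinned


if __name__ == "__main__":
    pinned, unpinned = split_pins(REQUIREMENTS)
    print("pinned:", pinned)
    print("unpinned:", unpinned)
```

Unpinned entries resolve to whatever version is current at install time, so an environment built later may differ from the one the authors tested.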
