Commit 31f6f5c

Merge branch 'openvpi:master' into master
2 parents beba3ad + d32e41d commit 31f6f5c

29 files changed: +1396 -160 lines changed

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -9,6 +9,8 @@ local_tools/
 infer_out/
 config.yaml
 *.onnx
+checkpoints/
+data/
 
 .vscode
 WPy64-38100

README.md

Lines changed: 12 additions & 10 deletions
@@ -9,18 +9,9 @@
 
 This repository is the official PyTorch implementation of our AAAI-2022 [paper](https://arxiv.org/abs/2105.02446), in which we propose DiffSinger (for Singing-Voice-Synthesis) and DiffSpeech (for Text-to-Speech).
 
-<table style="width:100%">
-  <tr>
-    <th>DiffSinger/DiffSpeech at training</th>
-    <th>DiffSinger/DiffSpeech at inference</th>
-  </tr>
-  <tr>
-    <td><img src="resources/model_a.png" alt="Training" height="300"></td>
-    <td><img src="resources/model_b.png" alt="Inference" height="300"></td>
-  </tr>
-</table>
 
 :tada: :tada: :tada: **Updates**:
+- Sep.11, 2022: :electric_plug: [DiffSinger-PN](docs/README-SVS-opencpop-pndm.md). Add plug-in [PNDM](https://arxiv.org/abs/2202.09778), ICLR 2022 in our laboratory, to accelerate DiffSinger freely.
 - Jul.27, 2022: Update documents for [SVS](docs/README-SVS.md). Add easy inference [A](docs/README-SVS-opencpop-cascade.md#4-inference-from-raw-inputs) & [B](docs/README-SVS-opencpop-e2e.md#4-inference-from-raw-inputs); Add Interactive SVS running on [HuggingFace🤗 SVS](https://huggingface.co/spaces/Silentlin/DiffSinger).
 - Mar.2, 2022: MIDI-B-version.
 - Mar.1, 2022: [NeuralSVB](https://github.com/MoonInTheRiver/NeuralSVB), for singing voice beautifying, has been released.
@@ -47,6 +38,17 @@ or pip install -r requirements_3090.txt (GPU 3090, CUDA 11.4)
 - [Run DiffSpeech (TTS version)](docs/README-TTS.md).
 - [Run DiffSinger (SVS version)](docs/README-SVS.md).
 
+## Overview
+| Mel Pipeline                                                                                | Dataset                                                  | Pitch Input       | F0 Prediction | Acceleration Method         | Vocoder                       |
+| ------------------------------------------------------------------------------------------- | ---------------------------------------------------------| ----------------- | ------------- | --------------------------- | ----------------------------- |
+| [DiffSpeech (Text->F0, Text+F0->Mel, Mel->Wav)](docs/README-TTS.md)                         | [Ljspeech](https://keithito.com/LJ-Speech-Dataset/)      | None              | Explicit      | Shallow Diffusion           | NSF-HiFiGAN                   |
+| [DiffSinger (Lyric+F0->Mel, Mel->Wav)](docs/README-SVS-popcs.md)                            | [PopCS](https://github.com/MoonInTheRiver/DiffSinger)    | Ground-Truth F0   | None          | Shallow Diffusion           | NSF-HiFiGAN                   |
+| [DiffSinger (Lyric+MIDI->F0, Lyric+F0->Mel, Mel->Wav)](docs/README-SVS-opencpop-cascade.md) | [OpenCpop](https://wenet.org.cn/opencpop/)               | MIDI              | Explicit      | Shallow Diffusion           | NSF-HiFiGAN                   |
+| [FFT-Singer (Lyric+MIDI->F0, Lyric+F0->Mel, Mel->Wav)](docs/README-SVS-opencpop-cascade.md) | [OpenCpop](https://wenet.org.cn/opencpop/)               | MIDI              | Explicit      | Invalid                     | NSF-HiFiGAN                   |
+| [DiffSinger (Lyric+MIDI->Mel, Mel->Wav)](docs/README-SVS-opencpop-e2e.md)                   | [OpenCpop](https://wenet.org.cn/opencpop/)               | MIDI              | Implicit      | None                        | Pitch-Extractor + NSF-HiFiGAN |
+| [DiffSinger+PNDM (Lyric+MIDI->Mel, Mel->Wav)](docs/README-SVS-opencpop-pndm.md)             | [OpenCpop](https://wenet.org.cn/opencpop/)               | MIDI              | Implicit      | PLMS                        | Pitch-Extractor + NSF-HiFiGAN |
+
+
 ## Tensorboard
 ```sh
 tensorboard --logdir_spec exp_name

acoustic/models/.gitignore

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+*.onnx

acoustic/onnx-instructions.md

Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
+# Simplified Synthesis
+
+Manually specifying phoneme durations is not supported yet.
+
+---
+
+## Environment setup
+
++ Create a new Conda environment
+````
+conda create -n dfs_onnx python=3.9
+conda activate dfs_onnx
+````
+
++ Install the dependencies
+````
+pip install librosa pypinyin tqdm six pyyaml numpy scipy
+````
+
++ Install ONNXRuntime
+````
+pip install onnxruntime-gpu
+````
+
++ Prepare the models
+    + Download `hifigan.onnx`, `singer_denoise.onnx`, `singer_fs.onnx`, and `xiaoma_pe.onnx`, then move them into the `acoustic/models` directory
+
+## Running
+
++ On CPU
+````
+python my_numpy.py xxx.ds
+````
+
++ On GPU (make sure the CUDA version is compatible with the ONNXRuntime version)
+````
+python my_numpy.py xxx.ds -d gpu
+````
+
+## Example
+
++ 小酒窝.ds
+
+````
+{
+    "text": "小酒窝长睫毛AP是你最美的记号",
+    "notes": "C#4/Db4 | F#4/Gb4 | G#4/Ab4 | A#4/Bb4 F#4/Gb4 | F#4/Gb4 C#4/Db4 | C#4/Db4 | rest | C#4/Db4 | A#4/Bb4 | G#4/Ab4 | A#4/Bb4 | G#4/Ab4 | F4 | C#4/Db4",
+    "notes_duration": "0.407140 | 0.376190 | 0.242180 | 0.509550 0.183420 | 0.315400 0.235020 | 0.361660 | 0.223070 | 0.377270 | 0.340550 | 0.299620 | 0.344510 | 0.283770 | 0.323390 | 0.360340",
+    "input_type": "word"
+}
+````
+
++ Run
+
+````
+> python my_numpy.py 小酒窝.ds
+...
+Pass word-notes check.
+29 29 29
+Pass word-notes check.
+[Status] Preprocess
+[Status] Run fs
+[Status] Run sample
+[Status] Sample step: 100%|███████████████████████████| 100/100 [00:09<00:00, 10.25it/s]
+[Status] Run pe
+[Status] Run vocoder
+[Status] Save audio: ./infer_out\小酒窝0.wav
+OK
+````
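
In the `.ds` example above, each `|`-separated group in `notes` lines up with a group in `notes_duration`, and a group may hold more than one note. The sketch below illustrates that alignment as a small consistency check. `check_notes_alignment` is a hypothetical helper written for this illustration; it is not the "word-notes check" that `my_numpy.py` prints, and it assumes the field layout shown in the example.

```python
def check_notes_alignment(ds: dict) -> tuple:
    """Return (group_count, note_count) when notes and durations line up.

    Hypothetical helper: assumes `notes` and `notes_duration` are strings
    whose groups are separated by '|' and whose items are space-separated.
    """
    note_groups = [group.split() for group in ds["notes"].split("|")]
    dur_groups = [group.split() for group in ds["notes_duration"].split("|")]

    if len(note_groups) != len(dur_groups):
        raise ValueError(
            f"{len(note_groups)} note groups but {len(dur_groups)} duration groups"
        )

    for index, (notes, durations) in enumerate(zip(note_groups, dur_groups)):
        if len(notes) != len(durations):
            raise ValueError(f"group {index}: {len(notes)} notes, {len(durations)} durations")
        for duration in durations:
            if float(duration) <= 0:
                raise ValueError(f"group {index}: non-positive duration {duration}")

    return len(note_groups), sum(len(notes) for notes in note_groups)


# The example data from the instructions above.
EXAMPLE = {
    "text": "小酒窝长睫毛AP是你最美的记号",
    "notes": "C#4/Db4 | F#4/Gb4 | G#4/Ab4 | A#4/Bb4 F#4/Gb4 | F#4/Gb4 C#4/Db4 | "
             "C#4/Db4 | rest | C#4/Db4 | A#4/Bb4 | G#4/Ab4 | A#4/Bb4 | G#4/Ab4 | F4 | C#4/Db4",
    "notes_duration": "0.407140 | 0.376190 | 0.242180 | 0.509550 0.183420 | "
                      "0.315400 0.235020 | 0.361660 | 0.223070 | 0.377270 | 0.340550 | "
                      "0.299620 | 0.344510 | 0.283770 | 0.323390 | 0.360340",
    "input_type": "word",
}

groups, notes = check_notes_alignment(EXAMPLE)
print(f"{groups} groups, {notes} notes")
```

Running a check like this before synthesis would surface a mismatched group as a readable error rather than a shape mismatch later in the pipeline.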

acoustic/onnx-instructions.pdf

39.9 KB
Binary file not shown.

acoustic/tmp_audio.py

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+import librosa
+import librosa.filters
+import numpy as np
+from scipy.io import wavfile
+
+
+def save_wav(wav, path, sr, norm=False):
+    if norm:
+        wav = wav / np.abs(wav).max()
+    wav *= 32767
+    # proposed by @dsmiller
+    wavfile.write(path, sr, wav.astype(np.int16))

acoustic/tmp_cuda.py

Lines changed: 90 additions & 0 deletions
@@ -0,0 +1,90 @@
+import ctypes
+import sys
+import os
+
+cuda_version = "11.3"
+
+if sys.platform == 'win32':
+    pfiles_path = os.getenv('ProgramFiles', 'C:\\Program Files')
+    py_dll_path = os.path.join(sys.exec_prefix, 'Library', 'bin')
+    th_dll_path = os.path.join(os.path.dirname(__file__), 'lib')
+
+    # When users create a virtualenv that inherits the base environment,
+    # we will need to add the corresponding library directory into
+    # DLL search directories. Otherwise, it will rely on `PATH` which
+    # is dependent on user settings.
+    if sys.exec_prefix != sys.base_exec_prefix:
+        base_py_dll_path = os.path.join(sys.base_exec_prefix, 'Library', 'bin')
+    else:
+        base_py_dll_path = ''
+
+    dll_paths = list(filter(os.path.exists, [th_dll_path, py_dll_path, base_py_dll_path]))
+
+    if all([not os.path.exists(os.path.join(p, 'nvToolsExt64_1.dll')) for p in dll_paths]):
+        nvtoolsext_dll_path = os.path.join(
+            os.getenv('NVTOOLSEXT_PATH', os.path.join(pfiles_path, 'NVIDIA Corporation', 'NvToolsExt')), 'bin', 'x64')
+    else:
+        nvtoolsext_dll_path = ''
+
+    import glob
+    if cuda_version and all([not glob.glob(os.path.join(p, 'cudart64*.dll')) for p in dll_paths]):
+        cuda_version_1 = cuda_version.replace('.', '_')
+        cuda_path_var = 'CUDA_PATH_V' + cuda_version_1
+        default_path = os.path.join(pfiles_path, 'NVIDIA GPU Computing Toolkit', 'CUDA', 'v' + cuda_version)
+        cuda_path = os.path.join(os.getenv(cuda_path_var, default_path), 'bin')
+    else:
+        cuda_path = ''
+
+    dll_paths.extend(filter(os.path.exists, [nvtoolsext_dll_path, cuda_path]))
+
+    kernel32 = ctypes.WinDLL('kernel32.dll', use_last_error=True)
+    with_load_library_flags = hasattr(kernel32, 'AddDllDirectory')
+    prev_error_mode = kernel32.SetErrorMode(0x0001)
+
+    kernel32.LoadLibraryW.restype = ctypes.c_void_p
+    if with_load_library_flags:
+        kernel32.AddDllDirectory.restype = ctypes.c_void_p
+        kernel32.LoadLibraryExW.restype = ctypes.c_void_p
+
+    for dll_path in dll_paths:
+        if sys.version_info >= (3, 8):
+            os.add_dll_directory(dll_path)
+        elif with_load_library_flags:
+            res = kernel32.AddDllDirectory(dll_path)
+            if res is None:
+                err = ctypes.WinError(ctypes.get_last_error())
+                err.strerror += f' Error adding "{dll_path}" to the DLL directories.'
+                raise err
+
+    try:
+        ctypes.CDLL('vcruntime140.dll')
+        ctypes.CDLL('msvcp140.dll')
+        ctypes.CDLL('vcruntime140_1.dll')
+    except OSError:
+        print('''Microsoft Visual C++ Redistributable is not installed, this may lead to the DLL load failure.
+                 It can be downloaded at https://aka.ms/vs/16/release/vc_redist.x64.exe''')
+
+    dlls = glob.glob(os.path.join(th_dll_path, '*.dll'))
+    path_patched = False
+    for dll in dlls:
+        is_loaded = False
+        if with_load_library_flags:
+            res = kernel32.LoadLibraryExW(dll, None, 0x00001100)
+            last_error = ctypes.get_last_error()
+            if res is None and last_error != 126:
+                err = ctypes.WinError(last_error)
+                err.strerror += f' Error loading "{dll}" or one of its dependencies.'
+                raise err
+            elif res is not None:
+                is_loaded = True
+        if not is_loaded:
+            if not path_patched:
+                os.environ['PATH'] = ';'.join(dll_paths + [os.environ['PATH']])
+                path_patched = True
+            res = kernel32.LoadLibraryW(dll)
+            if res is None:
+                err = ctypes.WinError(ctypes.get_last_error())
+                err.strerror += f' Error loading "{dll}" or one of its dependencies.'
+                raise err
+
+    kernel32.SetErrorMode(prev_error_mode)
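
The CUDA lookup in `tmp_cuda.py` derives an environment variable name from the hard-coded version (`"11.3"` becomes `CUDA_PATH_V11_3`) and falls back to the default toolkit location under `Program Files`. The sketch below isolates that one step as a pure function so the resolution order is easier to see. `resolve_cuda_bin` and its parameters are hypothetical names introduced here, and `ntpath` is used only so that Windows-style paths join the same way on any platform.

```python
import ntpath  # Windows path semantics regardless of the host OS


def resolve_cuda_bin(cuda_version: str, env: dict,
                     program_files: str = "C:\\Program Files") -> str:
    """Return the CUDA `bin` directory the lookup would try.

    Hypothetical helper mirroring the order used in the file above:
    1. the versioned variable, e.g. CUDA_PATH_V11_3, if it is set;
    2. otherwise the default toolkit directory under Program Files.
    """
    env_var = "CUDA_PATH_V" + cuda_version.replace(".", "_")
    default_root = ntpath.join(
        program_files, "NVIDIA GPU Computing Toolkit", "CUDA", "v" + cuda_version
    )
    return ntpath.join(env.get(env_var, default_root), "bin")


# No variable set: fall back to the default install location.
print(resolve_cuda_bin("11.3", {}))

# Variable set: it takes precedence over the default.
print(resolve_cuda_bin("11.3", {"CUDA_PATH_V11_3": "D:\\cuda"}))
```

Note that the file above only adds this directory to the DLL search path when it exists and when no `cudart64*.dll` was already found in the earlier candidate directories.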

acoustic/tmp_hparams.py

Lines changed: 126 additions & 0 deletions
@@ -0,0 +1,126 @@
+import argparse
+import os
+import yaml
+
+global_print_hparams = True
+hparams = {}
+
+
+class Args:
+    def __init__(self, **kwargs):
+        for k, v in kwargs.items():
+            self.__setattr__(k, v)
+
+
+def override_config(old_config: dict, new_config: dict):
+    for k, v in new_config.items():
+        if isinstance(v, dict) and k in old_config:
+            override_config(old_config[k], new_config[k])
+        else:
+            old_config[k] = v
+
+
+def set_hparams(config='', exp_name='', hparams_str='', print_hparams=True, global_hparams=True):
+    if config == '':
+        parser = argparse.ArgumentParser(description='neural music')
+        parser.add_argument('--config', type=str, default='',
+                            help='location of the data corpus')
+        parser.add_argument('--exp_name', type=str, default='', help='exp_name')
+        parser.add_argument('--hparams', type=str, default='',
+                            help='location of the data corpus')
+        parser.add_argument('--infer', action='store_true', help='infer')
+        parser.add_argument('--validate', action='store_true', help='validate')
+        parser.add_argument('--reset', action='store_true', help='reset hparams')
+        parser.add_argument('--debug', action='store_true', help='debug')
+        args, unknown = parser.parse_known_args()
+    else:
+        args = Args(config=config, exp_name=exp_name, hparams=hparams_str,
+                    infer=False, validate=False, reset=False, debug=False)
+    args_work_dir = ''
+    if args.exp_name != '':
+        args.work_dir = args.exp_name
+        args_work_dir = f'checkpoints/{args.work_dir}'
+
+    config_chains = []
+    loaded_config = set()
+
+    def load_config(config_fn):  # deep first
+        if(config_fn.startswith("/")):
+            config_fn_path=os.path.abspath(config_fn[1:])
+        else:
+            config_fn_path=config_fn
+        with open(config_fn_path, encoding='utf-8') as f:
+            hparams_ = yaml.safe_load(f)
+        loaded_config.add(config_fn)
+        if 'base_config' in hparams_:
+            ret_hparams = {}
+            if not isinstance(hparams_['base_config'], list):
+                hparams_['base_config'] = [hparams_['base_config']]
+            for c in hparams_['base_config']:
+                if c not in loaded_config:
+                    if c.startswith('.'):
+                        c = f'{os.path.dirname(config_fn)}/{c}'
+                        c = os.path.normpath(c)
+                    override_config(ret_hparams, load_config(c))
+            override_config(ret_hparams, hparams_)
+        else:
+            ret_hparams = hparams_
+        config_chains.append(config_fn)
+        return ret_hparams
+
+    global hparams
+    assert args.config != '' or args_work_dir != ''
+    saved_hparams = {}
+    if args_work_dir != 'checkpoints/':
+        ckpt_config_path = f'{args_work_dir}/config.yaml'
+        if os.path.exists(ckpt_config_path):
+            try:
+                with open(ckpt_config_path, encoding='utf-8') as f:
+                    saved_hparams.update(yaml.safe_load(f))
+            except:
+                pass
+        if args.config == '':
+            args.config = ckpt_config_path
+
+    hparams_ = {}
+
+    hparams_.update(load_config(args.config))
+
+    if not args.reset:
+        hparams_.update(saved_hparams)
+    hparams_['work_dir'] = args_work_dir
+
+    if args.hparams != "":
+        for new_hparam in args.hparams.split(","):
+            k, v = new_hparam.split("=")
+            if v in ['True', 'False'] or type(hparams_[k]) == bool:
+                hparams_[k] = eval(v)
+            else:
+                hparams_[k] = type(hparams_[k])(v)
+
+    if args_work_dir != '' and (not os.path.exists(ckpt_config_path) or args.reset) and not args.infer:
+        os.makedirs(hparams_['work_dir'], exist_ok=True)
+        with open(ckpt_config_path, 'w', encoding='utf-8') as f:
+            yaml.safe_dump(hparams_, f)
+
+    hparams_['infer'] = args.infer
+    hparams_['debug'] = args.debug
+    hparams_['validate'] = args.validate
+    global global_print_hparams
+    if global_hparams:
+        hparams.clear()
+        hparams.update(hparams_)
+
+    if print_hparams and global_print_hparams and global_hparams:
+        print('| Hparams chains: ', config_chains)
+        print('| Hparams: ')
+        for i, (k, v) in enumerate(sorted(hparams_.items())):
+            print(f"\033[;33;m{k}\033[0m: {v}, ", end="\n" if i % 5 == 4 else "")
+        print("")
+        global_print_hparams = False
+    # print(hparams_.keys())
+    if hparams.get('exp_name') is None:
+        hparams['exp_name'] = args.exp_name
+    if hparams_.get('exp_name') is None:
+        hparams_['exp_name'] = args.exp_name
+    return hparams_
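
Two behaviours of `tmp_hparams.py` are easier to follow with a small example: `override_config` merges nested dictionaries recursively, so a child config only needs to list the keys it changes, and the `--hparams` string (`key=value,key=value`) is coerced to the type of the existing value. The sketch below copies `override_config` from the file and pairs it with `apply_overrides`, a hypothetical stand-in for the override loop that compares against the literal `"True"` rather than calling `eval`. The config keys and values are made up for illustration.

```python
import copy


def override_config(old_config: dict, new_config: dict) -> None:
    """Recursively merge new_config into old_config (same logic as the file above)."""
    for k, v in new_config.items():
        if isinstance(v, dict) and k in old_config:
            override_config(old_config[k], v)
        else:
            old_config[k] = v


def apply_overrides(hparams: dict, overrides: str) -> None:
    """Hypothetical variant of the --hparams loop that avoids eval().

    Each value is cast to the type of the existing entry, so unknown keys
    raise KeyError just as they would in the original loop.
    """
    for item in overrides.split(","):
        key, value = item.split("=", 1)
        current = hparams[key]
        if isinstance(current, bool):
            hparams[key] = value == "True"
        else:
            hparams[key] = type(current)(value)


# Illustrative configs: the child only states what differs from the base.
base = {"lr": 0.001, "use_pe": False, "vocoder": {"name": "hifigan", "sr": 24000}}
child = {"lr": 0.0004, "vocoder": {"sr": 44100}}

# override_config stores nested dicts by reference, so copy the base first
# if it must stay unchanged.
merged = copy.deepcopy(base)
override_config(merged, child)
apply_overrides(merged, "use_pe=True,lr=0.01")
print(merged)
```

The `deepcopy` matters because `override_config` assigns nested dictionaries by reference on first insertion; a later override would otherwise modify the base dictionary as well.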
