xunmengshe
diff --git a/.gitignore b/.gitignore
Lines changed: 2 additions & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 12 additions & 10 deletions
diff --git a/acoustic/models/.gitignore b/acoustic/models/.gitignore
Lines changed: 1 addition & 0 deletions
diff --git a/acoustic/onnx-instructions.md b/acoustic/onnx-instructions.md
Lines changed: 69 additions & 0 deletions
diff --git a/acoustic/onnx-instructions.pdf b/acoustic/onnx-instructions.pdf
39.9 KB
diff --git a/acoustic/tmp_audio.py b/acoustic/tmp_audio.py
Lines changed: 12 additions & 0 deletions
diff --git a/acoustic/tmp_cuda.py b/acoustic/tmp_cuda.py
Lines changed: 90 additions & 0 deletions
diff --git a/acoustic/tmp_hparams.py b/acoustic/tmp_hparams.py
Lines changed: 126 additions & 0 deletions
@@ -9,6 +9,8 @@ local_tools/
 infer_out/
 config.yaml
 *.onnx
+checkpoints/
+data/
 
 .vscode
 WPy64-38100
 
@@ -9,18 +9,9 @@
 
 This repository is the official PyTorch implementation of our AAAI-2022 [paper](https://arxiv.org/abs/2105.02446), in which we propose DiffSinger (for Singing-Voice-Synthesis) and DiffSpeech (for Text-to-Speech).
 
-<table style="width:100%">
-  <tr>
-    <th>DiffSinger/DiffSpeech at training</th>
-    <th>DiffSinger/DiffSpeech at inference</th>
-  </tr>
-  <tr>
-    <td><img src="resources/model_a.png" alt="Training" height="300"></td>
-    <td><img src="resources/model_b.png" alt="Inference" height="300"></td>
-  </tr>
-</table>
 
 :tada: :tada: :tada: **Updates**:
+ - Sep.11, 2022: :electric_plug: [DiffSinger-PN](docs/README-SVS-opencpop-pndm.md). Add plug-in [PNDM](https://arxiv.org/abs/2202.09778) (ICLR 2022, from our laboratory) to accelerate DiffSinger freely.
  - Jul.27, 2022: Update documents for [SVS](docs/README-SVS.md). Add easy inference [A](docs/README-SVS-opencpop-cascade.md#4-inference-from-raw-inputs) & [B](docs/README-SVS-opencpop-e2e.md#4-inference-from-raw-inputs); Add Interactive SVS running on [HuggingFace🤗 SVS](https://huggingface.co/spaces/Silentlin/DiffSinger).
  - Mar.2, 2022: MIDI-B-version.
  - Mar.1, 2022: [NeuralSVB](https://github.com/MoonInTheRiver/NeuralSVB), for singing voice beautifying, has been released.
@@ -47,6 +38,17 @@ or pip install -r requirements_3090.txt   (GPU 3090, CUDA 11.4)
 - [Run DiffSpeech (TTS version)](docs/README-TTS.md).
 - [Run DiffSinger (SVS version)](docs/README-SVS.md).
 
+## Overview
+| Mel Pipeline                                                                                | Dataset                                                  | Pitch Input       | F0 Prediction |   Acceleration Method       | Vocoder                       |
+| ------------------------------------------------------------------------------------------- | ---------------------------------------------------------| ----------------- | ------------- | --------------------------- | ----------------------------- |
+| [DiffSpeech (Text->F0, Text+F0->Mel, Mel->Wav)](docs/README-TTS.md)                         | [Ljspeech](https://keithito.com/LJ-Speech-Dataset/)      | None              | Explicit      | Shallow Diffusion           | NSF-HiFiGAN                   |
+| [DiffSinger (Lyric+F0->Mel, Mel->Wav)](docs/README-SVS-popcs.md)                            | [PopCS](https://github.com/MoonInTheRiver/DiffSinger)    | Ground-Truth F0   | None          | Shallow Diffusion           | NSF-HiFiGAN                   |
+| [DiffSinger (Lyric+MIDI->F0, Lyric+F0->Mel, Mel->Wav)](docs/README-SVS-opencpop-cascade.md) | [OpenCpop](https://wenet.org.cn/opencpop/)               | MIDI              | Explicit      | Shallow Diffusion           | NSF-HiFiGAN                   |
+| [FFT-Singer (Lyric+MIDI->F0, Lyric+F0->Mel, Mel->Wav)](docs/README-SVS-opencpop-cascade.md) | [OpenCpop](https://wenet.org.cn/opencpop/)               | MIDI              | Explicit      | Invalid                     | NSF-HiFiGAN                   |
+| [DiffSinger (Lyric+MIDI->Mel, Mel->Wav)](docs/README-SVS-opencpop-e2e.md)                   | [OpenCpop](https://wenet.org.cn/opencpop/)               | MIDI              | Implicit      | None                        | Pitch-Extractor + NSF-HiFiGAN |
+| [DiffSinger+PNDM (Lyric+MIDI->Mel, Mel->Wav)](docs/README-SVS-opencpop-pndm.md)             | [OpenCpop](https://wenet.org.cn/opencpop/)               | MIDI              | Implicit      | PLMS                        | Pitch-Extractor + NSF-HiFiGAN |
+ 
+
 ## Tensorboard
 ```sh
 tensorboard --logdir_spec exp_name
 
@@ -0,0 +1 @@
+*.onnx
@@ -0,0 +1,69 @@
+# Simplified Synthesis
+
+Manually specifying phoneme durations is not yet supported.
+
+---
+
+## Environment Setup
+
++ Create a new Conda environment
+````
+conda create -n dfs_onnx python=3.9
+conda activate dfs_onnx
+````
+
++ Install dependencies
+````
+pip install librosa pypinyin tqdm six pyyaml numpy scipy
+````
+
++ Install ONNX Runtime
+````
+pip install onnxruntime-gpu
+````
+
++ Prepare the models
+    + Download `hifigan.onnx`, `singer_denoise.onnx`, `singer_fs.onnx`, and `xiaoma_pe.onnx`, and move them into the `acoustic/models` directory
+
+## Running
+
++ Using the CPU
+````
+python my_numpy.py xxx.ds
+````
+
++ Using the GPU (make sure the CUDA version is compatible with the ONNX Runtime version)
+````
+python my_numpy.py xxx.ds -d gpu
+````
+
+## Example
+
++ 小酒窝.ds
+
+````
+{
+    "text": "小酒窝长睫毛AP是你最美的记号",
+    "notes": "C#4/Db4 | F#4/Gb4 | G#4/Ab4 | A#4/Bb4 F#4/Gb4 | F#4/Gb4 C#4/Db4 | C#4/Db4 | rest | C#4/Db4 | A#4/Bb4 | G#4/Ab4 | A#4/Bb4 | G#4/Ab4 | F4 | C#4/Db4",
+    "notes_duration": "0.407140 | 0.376190 | 0.242180 | 0.509550 0.183420 | 0.315400 0.235020 | 0.361660 | 0.223070 | 0.377270 | 0.340550 | 0.299620 | 0.344510 | 0.283770 | 0.323390 | 0.360340",
+    "input_type": "word"
+}
+````
+
++ Run
+
+````
+> python my_numpy.py 小酒窝.ds
+...
+Pass word-notes check.
+29 29 29
+Pass word-notes check.
+[Status] Preprocess
+[Status] Run fs
+[Status] Run sample
+[Status] Sample step: 100%|███████████████████████████| 100/100 [00:09<00:00, 10.25it/s] 
+[Status] Run pe
+[Status] Run vocoder
+[Status] Save audio: ./infer_out\小酒窝0.wav
+OK
+````
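The `.ds` example above pairs each word of `text` with one `|`-separated group of notes and a matching group of durations. A minimal sketch of reading those fields, assuming the file is plain JSON as shown. `parse_ds` is a hypothetical helper, not a function from `my_numpy.py`, and the word-notes check performed by that script may differ:

```python
import json


def parse_ds(ds_text):
    """Split `notes` and `notes_duration` into per-word groups (hypothetical helper)."""
    ds = json.loads(ds_text)
    note_groups = [group.split() for group in ds["notes"].split("|")]
    dur_groups = [[float(d) for d in group.split()]
                  for group in ds["notes_duration"].split("|")]
    # Every word needs one group of notes and one group of durations.
    if len(note_groups) != len(dur_groups):
        raise ValueError("notes and notes_duration have different group counts")
    # Inside a group, each note needs exactly one duration.
    for i, (notes, durs) in enumerate(zip(note_groups, dur_groups)):
        if len(notes) != len(durs):
            raise ValueError(f"group {i}: {len(notes)} notes but {len(durs)} durations")
    return note_groups, dur_groups


# Shortened version of the example above (first four words only).
sample = json.dumps({
    "text": "小酒窝长",
    "notes": "C#4/Db4 | F#4/Gb4 | G#4/Ab4 | A#4/Bb4 F#4/Gb4",
    "notes_duration": "0.407140 | 0.376190 | 0.242180 | 0.509550 0.183420",
    "input_type": "word",
})
note_groups, dur_groups = parse_ds(sample)
print(len(note_groups), sum(len(g) for g in note_groups))
```

A word sung over several notes, such as the fourth one here, simply carries more than one entry in its group.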
@@ -0,0 +1,12 @@
+import librosa
+import librosa.filters
+import numpy as np
+from scipy.io import wavfile
+
+
+def save_wav(wav, path, sr, norm=False):
+    if norm:
+        wav = wav / np.abs(wav).max()
+    wav = wav * 32767  # scale a copy so the caller's array is not modified in place
+    # proposed by @dsmiller
+    wavfile.write(path, sr, wav.astype(np.int16))
@@ -0,0 +1,90 @@
+import ctypes
+import sys
+import os
+
+cuda_version = "11.3"
+
+if sys.platform == 'win32':
+    pfiles_path = os.getenv('ProgramFiles', 'C:\\Program Files')
+    py_dll_path = os.path.join(sys.exec_prefix, 'Library', 'bin')
+    th_dll_path = os.path.join(os.path.dirname(__file__), 'lib')
+
+    # When users create a virtualenv that inherits the base environment,
+    # we will need to add the corresponding library directory into
+    # DLL search directories. Otherwise, it will rely on `PATH` which
+    # is dependent on user settings.
+    if sys.exec_prefix != sys.base_exec_prefix:
+        base_py_dll_path = os.path.join(sys.base_exec_prefix, 'Library', 'bin')
+    else:
+        base_py_dll_path = ''
+
+    dll_paths = list(filter(os.path.exists, [th_dll_path, py_dll_path, base_py_dll_path]))
+
+    if all([not os.path.exists(os.path.join(p, 'nvToolsExt64_1.dll')) for p in dll_paths]):
+        nvtoolsext_dll_path = os.path.join(
+            os.getenv('NVTOOLSEXT_PATH', os.path.join(pfiles_path, 'NVIDIA Corporation', 'NvToolsExt')), 'bin', 'x64')
+    else:
+        nvtoolsext_dll_path = ''
+
+    import glob
+    if cuda_version and all([not glob.glob(os.path.join(p, 'cudart64*.dll')) for p in dll_paths]):
+        cuda_version_1 = cuda_version.replace('.', '_')
+        cuda_path_var = 'CUDA_PATH_V' + cuda_version_1
+        default_path = os.path.join(pfiles_path, 'NVIDIA GPU Computing Toolkit', 'CUDA', 'v' + cuda_version)
+        cuda_path = os.path.join(os.getenv(cuda_path_var, default_path), 'bin')
+    else:
+        cuda_path = ''
+
+    dll_paths.extend(filter(os.path.exists, [nvtoolsext_dll_path, cuda_path]))
+
+    kernel32 = ctypes.WinDLL('kernel32.dll', use_last_error=True)
+    with_load_library_flags = hasattr(kernel32, 'AddDllDirectory')
+    prev_error_mode = kernel32.SetErrorMode(0x0001)
+
+    kernel32.LoadLibraryW.restype = ctypes.c_void_p
+    if with_load_library_flags:
+        kernel32.AddDllDirectory.restype = ctypes.c_void_p
+        kernel32.LoadLibraryExW.restype = ctypes.c_void_p
+
+    for dll_path in dll_paths:
+        if sys.version_info >= (3, 8):
+            os.add_dll_directory(dll_path)
+        elif with_load_library_flags:
+            res = kernel32.AddDllDirectory(dll_path)
+            if res is None:
+                err = ctypes.WinError(ctypes.get_last_error())
+                err.strerror += f' Error adding "{dll_path}" to the DLL directories.'
+                raise err
+
+    try:
+        ctypes.CDLL('vcruntime140.dll')
+        ctypes.CDLL('msvcp140.dll')
+        ctypes.CDLL('vcruntime140_1.dll')
+    except OSError:
+        print('''Microsoft Visual C++ Redistributable is not installed, this may lead to the DLL load failure.
+                 It can be downloaded at https://aka.ms/vs/16/release/vc_redist.x64.exe''')
+
+    dlls = glob.glob(os.path.join(th_dll_path, '*.dll'))
+    path_patched = False
+    for dll in dlls:
+        is_loaded = False
+        if with_load_library_flags:
+            res = kernel32.LoadLibraryExW(dll, None, 0x00001100)
+            last_error = ctypes.get_last_error()
+            if res is None and last_error != 126:
+                err = ctypes.WinError(last_error)
+                err.strerror += f' Error loading "{dll}" or one of its dependencies.'
+                raise err
+            elif res is not None:
+                is_loaded = True
+        if not is_loaded:
+            if not path_patched:
+                os.environ['PATH'] = ';'.join(dll_paths + [os.environ['PATH']])
+                path_patched = True
+            res = kernel32.LoadLibraryW(dll)
+            if res is None:
+                err = ctypes.WinError(ctypes.get_last_error())
+                err.strerror += f' Error loading "{dll}" or one of its dependencies.'
+                raise err
+
+    kernel32.SetErrorMode(prev_error_mode)
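The CUDA lookup in `tmp_cuda.py` derives an environment variable name from `cuda_version` and falls back to the default toolkit directory under Program Files. A small sketch of just that step. `resolve_cuda_bin` is a hypothetical helper, and `ntpath` is used here only so that the Windows-style joins behave the same on any platform:

```python
import ntpath


def resolve_cuda_bin(cuda_version, env, program_files="C:\\Program Files"):
    """Return (env var name, CUDA bin directory), mirroring the lookup in the diff."""
    # "11.3" -> "CUDA_PATH_V11_3"
    var_name = "CUDA_PATH_V" + cuda_version.replace(".", "_")
    # Used only when the environment variable is not set.
    default_root = ntpath.join(program_files, "NVIDIA GPU Computing Toolkit",
                               "CUDA", "v" + cuda_version)
    return var_name, ntpath.join(env.get(var_name, default_root), "bin")


var_name, cuda_bin = resolve_cuda_bin("11.3", env={})
print(var_name)
print(cuda_bin)
```

Passing the environment as a plain dictionary keeps the sketch testable; the code in the diff reads it through `os.getenv` instead.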
@@ -0,0 +1,126 @@
+import argparse
+import os
+import yaml
+
+global_print_hparams = True
+hparams = {}
+
+
+class Args:
+    def __init__(self, **kwargs):
+        for k, v in kwargs.items():
+            self.__setattr__(k, v)
+
+
+def override_config(old_config: dict, new_config: dict):
+    for k, v in new_config.items():
+        if isinstance(v, dict) and k in old_config:
+            override_config(old_config[k], new_config[k])
+        else:
+            old_config[k] = v
+
+
+def set_hparams(config='', exp_name='', hparams_str='', print_hparams=True, global_hparams=True):
+    if config == '':
+        parser = argparse.ArgumentParser(description='neural music')
+        parser.add_argument('--config', type=str, default='',
+                            help='path to the config YAML file')
+        parser.add_argument('--exp_name', type=str, default='', help='exp_name')
+        parser.add_argument('--hparams', type=str, default='',
+                            help='comma-separated key=value hparams overrides')
+        parser.add_argument('--infer', action='store_true', help='infer')
+        parser.add_argument('--validate', action='store_true', help='validate')
+        parser.add_argument('--reset', action='store_true', help='reset hparams')
+        parser.add_argument('--debug', action='store_true', help='debug')
+        args, unknown = parser.parse_known_args()
+    else:
+        args = Args(config=config, exp_name=exp_name, hparams=hparams_str,
+                    infer=False, validate=False, reset=False, debug=False)
+    args_work_dir = ''
+    if args.exp_name != '':
+        args.work_dir = args.exp_name
+        args_work_dir = f'checkpoints/{args.work_dir}'
+
+    config_chains = []
+    loaded_config = set()
+
+    def load_config(config_fn):  # depth-first
+        if config_fn.startswith("/"):
+            config_fn_path = os.path.abspath(config_fn[1:])
+        else:
+            config_fn_path = config_fn
+        with open(config_fn_path, encoding='utf-8') as f:
+            hparams_ = yaml.safe_load(f)
+        loaded_config.add(config_fn)
+        if 'base_config' in hparams_:
+            ret_hparams = {}
+            if not isinstance(hparams_['base_config'], list):
+                hparams_['base_config'] = [hparams_['base_config']]
+            for c in hparams_['base_config']:
+                if c not in loaded_config:
+                    if c.startswith('.'):
+                        c = f'{os.path.dirname(config_fn)}/{c}'
+                        c = os.path.normpath(c)
+                    override_config(ret_hparams, load_config(c))
+            override_config(ret_hparams, hparams_)
+        else:
+            ret_hparams = hparams_
+        config_chains.append(config_fn)
+        return ret_hparams
+
+    global hparams
+    assert args.config != '' or args_work_dir != ''
+    saved_hparams = {}
+    if args_work_dir != 'checkpoints/':
+        ckpt_config_path = f'{args_work_dir}/config.yaml'
+        if os.path.exists(ckpt_config_path):
+            try:
+                with open(ckpt_config_path, encoding='utf-8') as f:
+                    saved_hparams.update(yaml.safe_load(f))
+            except Exception:  # ignore an unreadable saved config, but let KeyboardInterrupt through
+                pass
+        if args.config == '':
+            args.config = ckpt_config_path
+
+    hparams_ = {}
+
+    hparams_.update(load_config(args.config))
+    
+    if not args.reset:
+        hparams_.update(saved_hparams)
+    hparams_['work_dir'] = args_work_dir
+
+    if args.hparams != "":
+        for new_hparam in args.hparams.split(","):
+            k, v = new_hparam.split("=")
+            if v in ['True', 'False'] or type(hparams_[k]) == bool:
+                hparams_[k] = eval(v)
+            else:
+                hparams_[k] = type(hparams_[k])(v)
+
+    if args_work_dir != '' and (not os.path.exists(ckpt_config_path) or args.reset) and not args.infer:
+        os.makedirs(hparams_['work_dir'], exist_ok=True)
+        with open(ckpt_config_path, 'w', encoding='utf-8') as f:
+            yaml.safe_dump(hparams_, f)
+
+    hparams_['infer'] = args.infer
+    hparams_['debug'] = args.debug
+    hparams_['validate'] = args.validate
+    global global_print_hparams
+    if global_hparams:
+        hparams.clear()
+        hparams.update(hparams_)
+
+    if print_hparams and global_print_hparams and global_hparams:
+        print('| Hparams chains: ', config_chains)
+        print('| Hparams: ')
+        for i, (k, v) in enumerate(sorted(hparams_.items())):
+            print(f"\033[;33;m{k}\033[0m: {v}, ", end="\n" if i % 5 == 4 else "")
+        print("")
+        global_print_hparams = False
+    # print(hparams_.keys())
+    if hparams.get('exp_name') is None:
+        hparams['exp_name'] = args.exp_name
+    if hparams_.get('exp_name') is None:
+        hparams_['exp_name'] = args.exp_name
+    return hparams_
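`override_config` merges nested dictionaries key by key, which is what lets a child config override a single field of a `base_config` without restating the whole sub-dictionary. A minimal sketch of that behaviour. The function body is copied from the diff, while the keys and values below are made up for illustration:

```python
import copy


def override_config(old_config, new_config):
    """Recursively merge new_config into old_config (copied from the diff)."""
    for k, v in new_config.items():
        if isinstance(v, dict) and k in old_config:
            override_config(old_config[k], new_config[k])
        else:
            old_config[k] = v


# Hypothetical base and child configs, not taken from the repository.
base = {"audio": {"sr": 24000, "hop": 128}, "lr": 0.001}
child = {"audio": {"sr": 44100}, "exp": "demo"}

merged = {}
override_config(merged, copy.deepcopy(base))  # copy so `base` stays untouched
override_config(merged, child)
print(merged)
```

The `deepcopy` matters here: `override_config` stores nested dictionaries by reference, so merging `base` directly into an empty dict and then applying `child` would also change `base["audio"]`.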