Commit bc2613b

Merge pull request #2174 from lym0302/mix_test
[tts] add mix tts test
2 parents 21dc77f + e1f8695 commit bc2613b

5 files changed: +101 -0 lines changed

examples/zh_en_tts/tts3/README.md

Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+# Test
+We trained a Chinese-English mixed FastSpeech2 model. The training code is still being cleaned up, so for now this example only shows how to use the pretrained model.
+The sample rate of the synthesized audio is 22050 Hz.
+
+## Download pretrained models
+Put pretrained models in a directory named `models`.
+
+- [fastspeech2_csmscljspeech_add-zhen.zip](https://paddlespeech.bj.bcebos.com/t2s/chinse_english_mixed/models/fastspeech2_csmscljspeech_add-zhen.zip)
+- [hifigan_ljspeech_ckpt_0.2.0.zip](https://paddlespeech.bj.bcebos.com/Parakeet/released_models/hifigan/hifigan_ljspeech_ckpt_0.2.0.zip)
+
+```bash
+mkdir models
+cd models
+wget https://paddlespeech.bj.bcebos.com/t2s/chinse_english_mixed/models/fastspeech2_csmscljspeech_add-zhen.zip
+unzip fastspeech2_csmscljspeech_add-zhen.zip
+wget https://paddlespeech.bj.bcebos.com/Parakeet/released_models/hifigan/hifigan_ljspeech_ckpt_0.2.0.zip
+unzip hifigan_ljspeech_ckpt_0.2.0.zip
+cd ../
+```
+
+## Test
+You can set `--spk_id` to 0 or 1 in `local/synthesize_e2e.sh` to choose the speaker.
+
+```bash
+bash test.sh
+```
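
Before running `bash test.sh`, it can help to confirm that the unpacked archives contain the files that `local/synthesize_e2e.sh` reads. The following is a minimal sketch and not part of this commit: `missing_model_files` is a hypothetical helper, and it assumes each zip unpacks into a directory named after the archive, as the paths in `local/synthesize_e2e.sh` imply.

```bash
#!/bin/bash
# Hypothetical pre-flight helper (not in the commit).
# Prints every file that local/synthesize_e2e.sh expects under the given
# model directory but that does not exist yet. Prints nothing if all are present.
missing_model_files() {
    local model_dir=$1
    local am=fastspeech2_csmscljspeech_add-zhen
    local voc=hifigan_ljspeech_ckpt_0.2.0
    local f
    for f in \
        "${am}/default.yaml" \
        "${am}/snapshot_iter_94000.pdz" \
        "${am}/speech_stats.npy" \
        "${am}/phone_id_map.txt" \
        "${am}/speaker_id_map.txt" \
        "${voc}/default.yaml" \
        "${voc}/snapshot_iter_2500000.pdz" \
        "${voc}/feats_stats.npy"; do
        [ -f "${model_dir}/${f}" ] || echo "${f}"
    done
}
```

Calling `missing_model_files models` from `examples/zh_en_tts/tts3` lists whatever still needs to be downloaded or unzipped.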

examples/zh_en_tts/tts3/local/synthesize_e2e.sh

Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
+#!/bin/bash
+
+model_dir=$1
+output=$2
+am_name=fastspeech2_csmscljspeech_add-zhen
+am_model_dir=${model_dir}/${am_name}/
+
+stage=1
+stop_stage=1
+
+
+# hifigan
+if [ ${stage} -le 1 ] && [ ${stop_stage} -ge 1 ]; then
+    FLAGS_allocator_strategy=naive_best_fit \
+    FLAGS_fraction_of_gpu_memory_to_use=0.01 \
+    python3 ${BIN_DIR}/../synthesize_e2e.py \
+        --am=fastspeech2_mix \
+        --am_config=${am_model_dir}/default.yaml \
+        --am_ckpt=${am_model_dir}/snapshot_iter_94000.pdz \
+        --am_stat=${am_model_dir}/speech_stats.npy \
+        --voc=hifigan_ljspeech \
+        --voc_config=${model_dir}/hifigan_ljspeech_ckpt_0.2.0/default.yaml \
+        --voc_ckpt=${model_dir}/hifigan_ljspeech_ckpt_0.2.0/snapshot_iter_2500000.pdz \
+        --voc_stat=${model_dir}/hifigan_ljspeech_ckpt_0.2.0/feats_stats.npy \
+        --lang=mix \
+        --text=${BIN_DIR}/../sentences_mix.txt \
+        --output_dir=${output}/test_e2e \
+        --phones_dict=${am_model_dir}/phone_id_map.txt \
+        --speaker_dict=${am_model_dir}/speaker_id_map.txt \
+        --spk_id 0
+fi
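
The script hardcodes `--spk_id 0`, while the README says the value can be 0 or 1. One way to avoid editing the script for each run would be to validate a caller-supplied value with a default. The sketch below is an illustration only: `resolve_spk_id` is a hypothetical helper that does not exist in the commit, and the accepted range {0, 1} is taken from the README.

```bash
#!/bin/bash
# Hypothetical helper (not in the commit): echo a valid speaker id,
# defaulting to 0 when no argument is given, and fail for anything
# other than the two ids the README mentions.
resolve_spk_id() {
    local spk_id=${1:-0}
    case "${spk_id}" in
        0|1)
            echo "${spk_id}"
            ;;
        *)
            echo "spk_id must be 0 or 1, got '${spk_id}'" >&2
            return 1
            ;;
    esac
}
```

Under that assumption, a script could set `spk_id=$(resolve_spk_id "$3") || exit 1` and pass `--spk_id "${spk_id}"` to `synthesize_e2e.py`.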

examples/zh_en_tts/tts3/path.sh

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+#!/bin/bash
+export MAIN_ROOT=`realpath ${PWD}/../../../`
+
+export PATH=${MAIN_ROOT}:${MAIN_ROOT}/utils:${PATH}
+export LC_ALL=C
+
+export PYTHONDONTWRITEBYTECODE=1
+# Use UTF-8 in Python to avoid UnicodeDecodeError when LC_ALL=C
+export PYTHONIOENCODING=UTF-8
+export PYTHONPATH=${MAIN_ROOT}:${PYTHONPATH}
+
+MODEL=fastspeech2
+export BIN_DIR=${MAIN_ROOT}/paddlespeech/t2s/exps/${MODEL}
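
`BIN_DIR` points at the model-specific directory, but `local/synthesize_e2e.sh` reaches for `${BIN_DIR}/../synthesize_e2e.py` and `${BIN_DIR}/../sentences_mix.txt`, which sit one level up. A small sketch of how those paths resolve, using `/path/to/PaddleSpeech` as a placeholder for `MAIN_ROOT` (the real value depends on where the repository is checked out):

```bash
#!/bin/bash
# Placeholder checkout location; path.sh derives the real one with realpath.
MAIN_ROOT=/path/to/PaddleSpeech
MODEL=fastspeech2
BIN_DIR=${MAIN_ROOT}/paddlespeech/t2s/exps/${MODEL}

# "${BIN_DIR}/.." is the shared exps directory, one level above the model directory.
exps_dir=$(dirname "${BIN_DIR}")

echo "script:    ${exps_dir}/synthesize_e2e.py"
echo "sentences: ${exps_dir}/sentences_mix.txt"
```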

examples/zh_en_tts/tts3/test.sh

Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
+#!/bin/bash
+
+set -e
+source path.sh
+
+gpus=0,1
+stage=3
+stop_stage=100
+
+model_dir=models
+output_dir=output
+
+# with the following command, you can choose the stage range you want to run
+# such as `./test.sh --stage 3 --stop-stage 3`
+# these options cannot be mixed with positional arguments `$1`, `$2` ...
+source ${MAIN_ROOT}/utils/parse_options.sh || exit 1
+
+
+if [ ${stage} -le 3 ] && [ ${stop_stage} -ge 3 ]; then
+    # synthesize_e2e, vocoder is hifigan by default
+    CUDA_VISIBLE_DEVICES=${gpus} ./local/synthesize_e2e.sh ${model_dir} ${output_dir} || exit -1
+fi
+
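
The `stage` / `stop_stage` guard used here and in `local/synthesize_e2e.sh` runs a block only when its number lies inside the closed range [`stage`, `stop_stage`]. A minimal sketch of the same test as a function, where `should_run_stage` is a hypothetical name introduced only for illustration:

```bash
#!/bin/bash
# Hypothetical helper (not in the commit): succeed when stage number n
# falls inside the inclusive range [stage, stop_stage].
should_run_stage() {
    local stage=$1 stop_stage=$2 n=$3
    [ "${stage}" -le "${n}" ] && [ "${stop_stage}" -ge "${n}" ]
}

# With the defaults from test.sh (stage=3, stop_stage=100), stage 3 runs.
if should_run_stage 3 100 3; then
    echo "stage 3 would run"
fi
```

With `--stage 4`, the same check fails for stage 3, so the synthesis step is skipped.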

paddlespeech/t2s/exps/sentences_mix.txt

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+001 你好,欢迎使用 Paddle Speech 中英文混合 T T S 功能,开始你的合成之旅吧!
+002 我们的声学模型使用了 Fast Speech Two, 声码器使用了 Parallel Wave GAN and Hifi GAN.
+003 Paddle N L P 发布 ERNIE Tiny 全系列中文预训练小模型,快速提升预训练模型部署效率,通用信息抽取技术 U I E Tiny 系列模型全新升级,支持速度更快效果更好的 U I E 小模型。
+004 Paddle Speech 发布 P P A S R 流式语音识别系统、P P T T S 流式语音合成系统、P P V P R 全链路声纹识别系统。
+005 Paddle Bo Bo: 使用 Paddle Speech 的语音合成模块生成虚拟人的声音。
+006 热烈欢迎您在 Discussions 中提交问题,并在 Issues 中指出发现的 bug。此外,我们非常希望您参与到 Paddle Speech 的开发中!
+007 我喜欢 eat apple, 你喜欢 drink milk。
+008 我们要去云南 team building, 非常非常 happy.
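
Each line of the sentences file starts with an ID such as `001`, followed by the mixed Chinese-English text. A rough sketch of splitting a file in this layout, assuming the first whitespace-separated field is the utterance ID and the rest of the line is the text; `print_utterances` is a hypothetical helper, not something the commit or `synthesize_e2e.py` provides:

```bash
#!/bin/bash
# Hypothetical helper (not in the commit): print "<id><TAB><text>" for each
# non-empty line of a sentences file laid out as "<id> <text>".
print_utterances() {
    local file=$1 utt_id text
    # The "|| [ -n ... ]" part keeps a final line that lacks a trailing newline.
    while read -r utt_id text || [ -n "${utt_id}" ]; do
        [ -n "${utt_id}" ] || continue   # skip blank lines
        printf '%s\t%s\n' "${utt_id}" "${text}"
    done < "${file}"
}
```

Running `print_utterances` on a custom sentences file is a quick way to check that every line carries an ID before passing the file to `--text`.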
