Commit dfb09ee

add pad calc link, test=doc
1 parent 7bbd909 commit dfb09ee

6 files changed

+457
-499
lines changed

demos/speech_server/README.md

Lines changed: 29 additions & 29 deletions
Large diffs are not rendered by default.

demos/speech_server/README_cn.md

Lines changed: 31 additions & 40 deletions
Large diffs are not rendered by default.

demos/streaming_asr_server/README.md

Lines changed: 146 additions & 150 deletions
Large diffs are not rendered by default.

demos/streaming_asr_server/README_cn.md

Lines changed: 175 additions & 184 deletions
Large diffs are not rendered by default.

demos/streaming_tts_server/README.md

Lines changed: 41 additions & 56 deletions
Original file line number | Diff line number | Diff line change
@@ -9,7 +9,6 @@ For service interface definition, please check:
99
- [PaddleSpeech Server RESTful API](https://github.com/PaddlePaddle/PaddleSpeech/wiki/PaddleSpeech-Server-RESTful-API)
1010
- [PaddleSpeech Streaming Server WebSocket API](https://github.com/PaddlePaddle/PaddleSpeech/wiki/PaddleSpeech-Server-WebSocket-API)
1111

12-
1312
## Usage
1413
### 1. Installation
1514
See [installation](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install.md).
@@ -34,11 +33,10 @@ The configuration file can be found in `conf/tts_online_application.yaml`.
3433
- Both hifigan and mb_melgan support streaming voc inference.
3534
- When the voc model is mb_melgan and voc_pad=14, streaming inference produces audio consistent with non-streaming synthesis. voc_pad can be set as low as 7 without audible artifacts; below 7, the synthesized audio sounds abnormal.
3635
- When the voc model is hifigan and voc_pad=19, streaming inference produces audio consistent with non-streaming synthesis; with voc_pad=14, the synthesized audio has no audible artifacts.
36+
- How the pad size of the streaming vocoder is calculated in PaddleSpeech: [AIStudio tutorial](https://aistudio.baidu.com/aistudio/projectdetail/4151335)
3737
- Inference speed: mb_melgan > hifigan; Audio quality: mb_melgan < hifigan
3838
- **Note:** If the service starts normally inside a container but the client cannot reach its IP address, try replacing the `host` address in the configuration file with the local IP address.
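
To make the role of `voc_pad` more concrete, the following is a minimal sketch of how a sequence of mel frames could be split into blocks with extra context frames on each side. The function name `split_with_pad`, the block size of 36 frames, and the total of 100 frames are assumptions chosen for illustration; this is not PaddleSpeech's actual implementation, which is described in the tutorial linked above.

```python
from typing import List, Tuple


def split_with_pad(num_frames: int, block: int, pad: int) -> List[Tuple[int, int, int, int]]:
    """Return (start, end, left_trim, right_trim) for each chunk.

    start/end delimit the padded slice of frames passed to the vocoder.
    left_trim/right_trim count the context frames on each side whose
    synthesized output should be discarded before concatenation.
    """
    chunks = []
    for pos in range(0, num_frames, block):
        core_end = min(pos + block, num_frames)
        start = max(pos - pad, 0)               # extend left by up to `pad` frames
        end = min(core_end + pad, num_frames)   # extend right by up to `pad` frames
        chunks.append((start, end, pos - start, end - core_end))
    return chunks


if __name__ == "__main__":
    # Hypothetical numbers: 100 mel frames, blocks of 36, pad of 14.
    for chunk in split_with_pad(num_frames=100, block=36, pad=14):
        print(chunk)
```

In this sketch a larger pad gives each block more surrounding context at the cost of recomputing more frames per block, which is one way to picture the trade-off between audio consistency and speed described in the list above.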
3939

40-
41-
4240
### 3. Streaming speech synthesis server and client using http protocol
4341
#### 3.1 Server Usage
4442
- Command Line (Recommended)
@@ -58,7 +56,7 @@ The configuration file can be found in `conf/tts_online_application.yaml`.
5856
- `log_file`: log file. Default: ./log/paddlespeech.log
5957

6058
Output:
61-
```bash
59+
```text
6260
[2022-04-24 20:05:27,887] [ INFO] - The first response time of the 0 warm up: 1.0123658180236816 s
6361
[2022-04-24 20:05:28,038] [ INFO] - The first response time of the 1 warm up: 0.15108466148376465 s
6462
[2022-04-24 20:05:28,191] [ INFO] - The first response time of the 2 warm up: 0.15317344665527344 s
@@ -84,8 +82,8 @@ The configuration file can be found in `conf/tts_online_application.yaml`.
8482
log_file="./log/paddlespeech.log")
8583
```
8684

87-
Output:
88-
```bash
85+
Output:
86+
```text
8987
[2022-04-24 21:00:16,934] [ INFO] - The first response time of the 0 warm up: 1.268730878829956 s
9088
[2022-04-24 21:00:17,046] [ INFO] - The first response time of the 1 warm up: 0.11168622970581055 s
9189
[2022-04-24 21:00:17,151] [ INFO] - The first response time of the 2 warm up: 0.10413002967834473 s
@@ -98,8 +96,6 @@ The configuration file can be found in `conf/tts_online_application.yaml`.
9896
[2022-04-24 21:00:17] [INFO] [on.py:59] Application startup complete.
9997
INFO: Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
10098
[2022-04-24 21:00:17] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
101-
102-
10399
```
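
The log above shows the server listening on port 8092. A quick way to check from the client machine that this port is reachable is a plain TCP connection attempt. The sketch below uses only the Python standard library; the helper name `is_port_open` and the `127.0.0.1` address are placeholders, so substitute the `host` value from your own configuration.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 127.0.0.1 is a placeholder; 8092 is the port shown in the log above.
    print(is_port_open("127.0.0.1", 8092))
```

A `False` result here only says that no TCP connection could be made; it does not distinguish between a stopped server, a firewall, or a wrong `host` setting.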
104100

105101
#### 3.2 Streaming TTS client Usage
@@ -130,7 +126,7 @@ The configuration file can be found in `conf/tts_online_application.yaml`.
130126
- Currently, the code supports only single-speaker models, so `spk_id` has no effect. Streaming TTS does not support changing the sample rate, speed, or volume.
131127

132128
Output:
133-
```bash
129+
```text
134130
[2022-04-24 21:08:18,559] [ INFO] - tts http client start
135131
[2022-04-24 21:08:21,702] [ INFO] - 句子:您好,欢迎使用百度飞桨语音合成服务。
136132
[2022-04-24 21:08:21,703] [ INFO] - 首包响应:0.18863153457641602 s
@@ -159,7 +155,7 @@ The configuration file can be found in `conf/tts_online_application.yaml`.
159155
```
160156

161157
Output:
162-
```bash
158+
```text
163159
[2022-04-24 21:11:13,798] [ INFO] - tts http client start
164160
[2022-04-24 21:11:16,800] [ INFO] - 句子:您好,欢迎使用百度飞桨语音合成服务。
165161
[2022-04-24 21:11:16,801] [ INFO] - 首包响应:0.18234872817993164 s
@@ -169,7 +165,6 @@ The configuration file can be found in `conf/tts_online_application.yaml`.
169165
[2022-04-24 21:11:16,837] [ INFO] - 音频保存至:./output.wav
170166
```
171167

172-
173168
### 4. Streaming speech synthesis server and client using websocket protocol
174169
#### 4.1 Server Usage
175170
- Command Line (Recommended)
@@ -189,21 +184,19 @@ The configuration file can be found in `conf/tts_online_application.yaml`.
189184
- `log_file`: log file. Default: ./log/paddlespeech.log
190185

191186
Output:
192-
```bash
193-
[2022-04-27 10:18:09,107] [ INFO] - The first response time of the 0 warm up: 1.1551103591918945 s
194-
[2022-04-27 10:18:09,219] [ INFO] - The first response time of the 1 warm up: 0.11204338073730469 s
195-
[2022-04-27 10:18:09,324] [ INFO] - The first response time of the 2 warm up: 0.1051797866821289 s
196-
[2022-04-27 10:18:09,325] [ INFO] - **********************************************************************
197-
INFO: Started server process [17600]
198-
[2022-04-27 10:18:09] [INFO] [server.py:75] Started server process [17600]
199-
INFO: Waiting for application startup.
200-
[2022-04-27 10:18:09] [INFO] [on.py:45] Waiting for application startup.
201-
INFO: Application startup complete.
202-
[2022-04-27 10:18:09] [INFO] [on.py:59] Application startup complete.
203-
INFO: Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
204-
[2022-04-27 10:18:09] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
205-
206-
187+
```text
188+
[2022-04-27 10:18:09,107] [ INFO] - The first response time of the 0 warm up: 1.1551103591918945 s
189+
[2022-04-27 10:18:09,219] [ INFO] - The first response time of the 1 warm up: 0.11204338073730469 s
190+
[2022-04-27 10:18:09,324] [ INFO] - The first response time of the 2 warm up: 0.1051797866821289 s
191+
[2022-04-27 10:18:09,325] [ INFO] - **********************************************************************
192+
INFO: Started server process [17600]
193+
[2022-04-27 10:18:09] [INFO] [server.py:75] Started server process [17600]
194+
INFO: Waiting for application startup.
195+
[2022-04-27 10:18:09] [INFO] [on.py:45] Waiting for application startup.
196+
INFO: Application startup complete.
197+
[2022-04-27 10:18:09] [INFO] [on.py:59] Application startup complete.
198+
INFO: Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
199+
[2022-04-27 10:18:09] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
207200
```
208201

209202
- Python API
@@ -217,20 +210,19 @@ The configuration file can be found in `conf/tts_online_application.yaml`.
217210
```
218211

219212
Output:
220-
```bash
221-
[2022-04-27 10:20:16,660] [ INFO] - The first response time of the 0 warm up: 1.0945196151733398 s
222-
[2022-04-27 10:20:16,773] [ INFO] - The first response time of the 1 warm up: 0.11222052574157715 s
223-
[2022-04-27 10:20:16,878] [ INFO] - The first response time of the 2 warm up: 0.10494542121887207 s
224-
[2022-04-27 10:20:16,878] [ INFO] - **********************************************************************
225-
INFO: Started server process [23466]
226-
[2022-04-27 10:20:16] [INFO] [server.py:75] Started server process [23466]
227-
INFO: Waiting for application startup.
228-
[2022-04-27 10:20:16] [INFO] [on.py:45] Waiting for application startup.
229-
INFO: Application startup complete.
230-
[2022-04-27 10:20:16] [INFO] [on.py:59] Application startup complete.
231-
INFO: Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
232-
[2022-04-27 10:20:16] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
233-
213+
```text
214+
[2022-04-27 10:20:16,660] [ INFO] - The first response time of the 0 warm up: 1.0945196151733398 s
215+
[2022-04-27 10:20:16,773] [ INFO] - The first response time of the 1 warm up: 0.11222052574157715 s
216+
[2022-04-27 10:20:16,878] [ INFO] - The first response time of the 2 warm up: 0.10494542121887207 s
217+
[2022-04-27 10:20:16,878] [ INFO] - **********************************************************************
218+
INFO: Started server process [23466]
219+
[2022-04-27 10:20:16] [INFO] [server.py:75] Started server process [23466]
220+
INFO: Waiting for application startup.
221+
[2022-04-27 10:20:16] [INFO] [on.py:45] Waiting for application startup.
222+
INFO: Application startup complete.
223+
[2022-04-27 10:20:16] [INFO] [on.py:59] Application startup complete.
224+
INFO: Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
225+
[2022-04-27 10:20:16] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
234226
```
235227

236228
#### 4.2 Streaming TTS client Usage
@@ -263,15 +255,14 @@ The configuration file can be found in `conf/tts_online_application.yaml`.
263255

264256

265257
Output:
266-
```bash
258+
```text
267259
[2022-04-27 10:21:04,262] [ INFO] - tts websocket client start
268260
[2022-04-27 10:21:04,496] [ INFO] - 句子:您好,欢迎使用百度飞桨语音合成服务。
269261
[2022-04-27 10:21:04,496] [ INFO] - 首包响应:0.2124948501586914 s
270262
[2022-04-27 10:21:07,483] [ INFO] - 尾包响应:3.199106454849243 s
271263
[2022-04-27 10:21:07,484] [ INFO] - 音频时长:3.825 s
272264
[2022-04-27 10:21:07,484] [ INFO] - RTF: 0.8363677006141812
273265
[2022-04-27 10:21:07,516] [ INFO] - 音频保存至:output.wav
274-
275266
```
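
The RTF value in the log above appears to be the last-packet response time (尾包响应) divided by the audio duration (音频时长). That reading is inferred from the logged numbers rather than taken from the client source. A minimal sketch of the calculation, with `real_time_factor` as a hypothetical helper name:

```python
def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """Ratio of time spent synthesizing to the duration of the audio produced."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return synthesis_seconds / audio_seconds


if __name__ == "__main__":
    # Values copied from the log above: last-packet response and audio duration.
    rtf = real_time_factor(3.199106454849243, 3.825)
    print(f"RTF: {rtf:.4f}")
```

An RTF below 1 means the audio was synthesized faster than it takes to play back.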
276267

277268
- Python API
@@ -288,21 +279,15 @@ The configuration file can be found in `conf/tts_online_application.yaml`.
288279
spk_id=0,
289280
output="./output.wav",
290281
play=False)
291-
292282
```
293283

294284
Output:
295-
```bash
296-
[2022-04-27 10:22:48,852] [ INFO] - tts websocket client start
297-
[2022-04-27 10:22:49,080] [ INFO] - 句子:您好,欢迎使用百度飞桨语音合成服务。
298-
[2022-04-27 10:22:49,080] [ INFO] - 首包响应:0.21017956733703613 s
299-
[2022-04-27 10:22:52,100] [ INFO] - 尾包响应:3.2304444313049316 s
300-
[2022-04-27 10:22:52,101] [ INFO] - 音频时长:3.825 s
301-
[2022-04-27 10:22:52,101] [ INFO] - RTF: 0.8445606356352762
302-
[2022-04-27 10:22:52,134] [ INFO] - 音频保存至:./output.wav
303-
285+
```text
286+
[2022-04-27 10:22:48,852] [ INFO] - tts websocket client start
287+
[2022-04-27 10:22:49,080] [ INFO] - 句子:您好,欢迎使用百度飞桨语音合成服务。
288+
[2022-04-27 10:22:49,080] [ INFO] - 首包响应:0.21017956733703613 s
289+
[2022-04-27 10:22:52,100] [ INFO] - 尾包响应:3.2304444313049316 s
290+
[2022-04-27 10:22:52,101] [ INFO] - 音频时长:3.825 s
291+
[2022-04-27 10:22:52,101] [ INFO] - RTF: 0.8445606356352762
292+
[2022-04-27 10:22:52,134] [ INFO] - 音频保存至:./output.wav
304293
```
305-
306-
307-
308-

demos/streaming_tts_server/README_cn.md

Lines changed: 35 additions & 40 deletions
Original file line numberDiff line numberDiff line change
@@ -18,7 +18,6 @@
1818

1919
**If you install in easy mode, you need to prepare the yaml file yourself; you can refer to the yaml files in the conf directory.**
2020

21-
2221
### 2. Prepare the configuration file
2322
The configuration file can be found in `conf/tts_online_application.yaml`.
2423
- `protocol` indicates the network protocol used by the streaming TTS service. Currently **http and websocket** are supported.
@@ -33,6 +32,7 @@
3332
- Both hifigan and mb_melgan support streaming voc inference.
3433
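- When the voc model is mb_melgan and voc_pad=14, streaming inference produces audio consistent with non-streaming synthesis. voc_pad can be set as low as 7 without audible artifacts; below 7, the synthesized audio sounds abnormal.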
3534
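- When the voc model is hifigan and voc_pad=19, streaming inference produces audio consistent with non-streaming synthesis; with voc_pad=14, the synthesized audio has no audible artifacts.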
35+
- How the pad size of the streaming vocoder is calculated in PaddleSpeech: [AIStudio tutorial](https://aistudio.baidu.com/aistudio/projectdetail/4151335)
3636
- Inference speed: mb_melgan > hifigan; audio quality: mb_melgan < hifigan
3737
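- **Note:** If the service starts normally inside a container but the client cannot reach its IP address, try replacing the `host` address in the configuration file with the local IP address.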
3838

@@ -56,7 +56,7 @@
5656
- `log_file`: log file. Default: ./log/paddlespeech.log
5757

5858
Output:
59-
```bash
59+
```text
6060
[2022-04-24 20:05:27,887] [ INFO] - The first response time of the 0 warm up: 1.0123658180236816 s
6161
[2022-04-24 20:05:28,038] [ INFO] - The first response time of the 1 warm up: 0.15108466148376465 s
6262
[2022-04-24 20:05:28,191] [ INFO] - The first response time of the 2 warm up: 0.15317344665527344 s
@@ -81,8 +81,8 @@
8181
log_file="./log/paddlespeech.log")
8282
```
8383

84-
Output
85-
```bash
84+
Output:
85+
```text
8686
[2022-04-24 21:00:16,934] [ INFO] - The first response time of the 0 warm up: 1.268730878829956 s
8787
[2022-04-24 21:00:17,046] [ INFO] - The first response time of the 1 warm up: 0.11168622970581055 s
8888
[2022-04-24 21:00:17,151] [ INFO] - The first response time of the 2 warm up: 0.10413002967834473 s
@@ -126,7 +126,7 @@
126126

127127

128128
Output:
129-
```bash
129+
```text
130130
[2022-04-24 21:08:18,559] [ INFO] - tts http client start
131131
[2022-04-24 21:08:21,702] [ INFO] - 句子:您好,欢迎使用百度飞桨语音合成服务。
132132
[2022-04-24 21:08:21,703] [ INFO] - 首包响应:0.18863153457641602 s
@@ -184,20 +184,19 @@
184184
- `log_file`: log file. Default: ./log/paddlespeech.log
185185

186186
Output:
187-
```bash
188-
[2022-04-27 10:18:09,107] [ INFO] - The first response time of the 0 warm up: 1.1551103591918945 s
189-
[2022-04-27 10:18:09,219] [ INFO] - The first response time of the 1 warm up: 0.11204338073730469 s
190-
[2022-04-27 10:18:09,324] [ INFO] - The first response time of the 2 warm up: 0.1051797866821289 s
191-
[2022-04-27 10:18:09,325] [ INFO] - **********************************************************************
192-
INFO: Started server process [17600]
193-
[2022-04-27 10:18:09] [INFO] [server.py:75] Started server process [17600]
194-
INFO: Waiting for application startup.
195-
[2022-04-27 10:18:09] [INFO] [on.py:45] Waiting for application startup.
196-
INFO: Application startup complete.
197-
[2022-04-27 10:18:09] [INFO] [on.py:59] Application startup complete.
198-
INFO: Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
199-
[2022-04-27 10:18:09] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
200-
187+
```text
188+
[2022-04-27 10:18:09,107] [ INFO] - The first response time of the 0 warm up: 1.1551103591918945 s
189+
[2022-04-27 10:18:09,219] [ INFO] - The first response time of the 1 warm up: 0.11204338073730469 s
190+
[2022-04-27 10:18:09,324] [ INFO] - The first response time of the 2 warm up: 0.1051797866821289 s
191+
[2022-04-27 10:18:09,325] [ INFO] - **********************************************************************
192+
INFO: Started server process [17600]
193+
[2022-04-27 10:18:09] [INFO] [server.py:75] Started server process [17600]
194+
INFO: Waiting for application startup.
195+
[2022-04-27 10:18:09] [INFO] [on.py:45] Waiting for application startup.
196+
INFO: Application startup complete.
197+
[2022-04-27 10:18:09] [INFO] [on.py:59] Application startup complete.
198+
INFO: Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
199+
[2022-04-27 10:18:09] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
201200
```
202201

203202
- Python API
@@ -210,21 +209,20 @@
210209
log_file="./log/paddlespeech.log")
211210
```
212211

213-
Output:
214-
```bash
215-
[2022-04-27 10:20:16,660] [ INFO] - The first response time of the 0 warm up: 1.0945196151733398 s
216-
[2022-04-27 10:20:16,773] [ INFO] - The first response time of the 1 warm up: 0.11222052574157715 s
217-
[2022-04-27 10:20:16,878] [ INFO] - The first response time of the 2 warm up: 0.10494542121887207 s
218-
[2022-04-27 10:20:16,878] [ INFO] - **********************************************************************
219-
INFO: Started server process [23466]
220-
[2022-04-27 10:20:16] [INFO] [server.py:75] Started server process [23466]
221-
INFO: Waiting for application startup.
222-
[2022-04-27 10:20:16] [INFO] [on.py:45] Waiting for application startup.
223-
INFO: Application startup complete.
224-
[2022-04-27 10:20:16] [INFO] [on.py:59] Application startup complete.
225-
INFO: Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
226-
[2022-04-27 10:20:16] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
227-
212+
Output:
213+
```text
214+
[2022-04-27 10:20:16,660] [ INFO] - The first response time of the 0 warm up: 1.0945196151733398 s
215+
[2022-04-27 10:20:16,773] [ INFO] - The first response time of the 1 warm up: 0.11222052574157715 s
216+
[2022-04-27 10:20:16,878] [ INFO] - The first response time of the 2 warm up: 0.10494542121887207 s
217+
[2022-04-27 10:20:16,878] [ INFO] - **********************************************************************
218+
INFO: Started server process [23466]
219+
[2022-04-27 10:20:16] [INFO] [server.py:75] Started server process [23466]
220+
INFO: Waiting for application startup.
221+
[2022-04-27 10:20:16] [INFO] [on.py:45] Waiting for application startup.
222+
INFO: Application startup complete.
223+
[2022-04-27 10:20:16] [INFO] [on.py:59] Application startup complete.
224+
INFO: Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
225+
[2022-04-27 10:20:16] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
228226
```
229227

230228
#### 4.2 Client Usage
@@ -256,15 +254,14 @@
256254

257255

258256
Output:
259-
```bash
257+
```text
260258
[2022-04-27 10:21:04,262] [ INFO] - tts websocket client start
261259
[2022-04-27 10:21:04,496] [ INFO] - 句子:您好,欢迎使用百度飞桨语音合成服务。
262260
[2022-04-27 10:21:04,496] [ INFO] - 首包响应:0.2124948501586914 s
263261
[2022-04-27 10:21:07,483] [ INFO] - 尾包响应:3.199106454849243 s
264262
[2022-04-27 10:21:07,484] [ INFO] - 音频时长:3.825 s
265263
[2022-04-27 10:21:07,484] [ INFO] - RTF: 0.8363677006141812
266264
[2022-04-27 10:21:07,516] [ INFO] - 音频保存至:output.wav
267-
268265
```
269266

270267
- Python API
@@ -281,17 +278,15 @@
281278
spk_id=0,
282279
output="./output.wav",
283280
play=False)
284-
285281
```
286282

287283
Output:
288-
```bash
284+
```text
289285
[2022-04-27 10:22:48,852] [ INFO] - tts websocket client start
290286
[2022-04-27 10:22:49,080] [ INFO] - 句子:您好,欢迎使用百度飞桨语音合成服务。
291287
[2022-04-27 10:22:49,080] [ INFO] - 首包响应:0.21017956733703613 s
292288
[2022-04-27 10:22:52,100] [ INFO] - 尾包响应:3.2304444313049316 s
293289
[2022-04-27 10:22:52,101] [ INFO] - 音频时长:3.825 s
294290
[2022-04-27 10:22:52,101] [ INFO] - RTF: 0.8445606356352762
295291
[2022-04-27 10:22:52,134] [ INFO] - 音频保存至:./output.wav
296-
297-
```
292+
```
