Commit de9c3e6

[Example] Add piper example (#145)
* Add piper example
* update dependencies
* fix typo
* Add a GitHub workflow to ensure that the example can run successfully
* simplify layout
* Provide a simple explanation of the related dependency installation
* add config description
* use -DWASMEDGE_USE_LLVM=OFF to disable all AOT-related components
* ask users to download and install the onnx runtime
* add sudo for mv
* ldconfig for installing onnxruntime
* use the install script from the WasmEdge repo to install ONNX Runtime

Signed-off-by: PeterD1524 <[email protected]>
1 parent c6312e8 commit de9c3e6

8 files changed, with 429 additions and 0 deletions

.github/workflows/piper.yml

Lines changed: 68 additions & 0 deletions
@@ -0,0 +1,68 @@
+name: Piper Example
+
+on:
+  schedule:
+    - cron: "0 0 * * *"
+  push:
+    paths:
+      - ".github/workflows/piper.yml"
+      - "wasmedge-piper/**"
+  pull_request:
+    paths:
+      - ".github/workflows/piper.yml"
+      - "wasmedge-piper/**"
+
+jobs:
+  build:
+    runs-on: ubuntu-22.04
+    steps:
+      - name: Install Dependencies for building WasmEdge
+        run: |
+          sudo apt-get update
+          sudo apt-get install ninja-build
+
+      - name: Checkout WasmEdge
+        uses: actions/checkout@v4
+        with:
+          repository: WasmEdge/WasmEdge
+          path: WasmEdge
+
+      - name: Install ONNX Runtime
+        run: sudo bash utils/wasi-nn/install-onnxruntime.sh
+        working-directory: WasmEdge
+
+      - name: Build WasmEdge with WASI-NN Piper plugin
+        run: |
+          cmake -GNinja -Bbuild -DCMAKE_BUILD_TYPE=Release -DWASMEDGE_USE_LLVM=OFF -DWASMEDGE_PLUGIN_WASI_NN_BACKEND=Piper
+          cmake --build build
+        working-directory: WasmEdge
+
+      - name: Install Rust target for wasm
+        run: rustup target add wasm32-wasi
+
+      - name: Checkout WasmEdge-WASINN-examples
+        uses: actions/checkout@v4
+        with:
+          path: WasmEdge-WASINN-examples
+
+      - name: Build wasm
+        run: cargo build --target wasm32-wasi --release
+        working-directory: WasmEdge-WASINN-examples/wasmedge-piper
+
+      - name: Download model
+        run: curl -LO https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/lessac/medium/en_US-lessac-medium.onnx
+
+      - name: Download config
+        run: curl -LO https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/lessac/medium/en_US-lessac-medium.onnx.json
+
+      - name: Download espeak-ng-data
+        run: |
+          curl -LO https://github.com/rhasspy/piper/releases/download/2023.11.14-2/piper_linux_x86_64.tar.gz
+          tar -xzf piper_linux_x86_64.tar.gz piper/espeak-ng-data --strip-components=1
+          rm piper_linux_x86_64.tar.gz
+
+      - name: Execute
+        run: WASMEDGE_PLUGIN_PATH=WasmEdge/build/plugins/wasi_nn WasmEdge/build/tools/wasmedge/wasmedge --dir .:. WasmEdge-WASINN-examples/wasmedge-piper/target/wasm32-wasi/release/wasmedge-piper.wasm
+
+      - name: Verify output
+        run: test "$(file --brief welcome.wav)" == 'RIFF (little-endian) data, WAVE audio, Microsoft PCM, 16 bit, mono 22050 Hz'
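
The `Verify output` step compares the description string printed by the `file` utility. A minimal sketch of the same check with Python's standard `wave` module follows, assuming the output is an uncompressed PCM WAV file. The helper names `is_expected_wav` and `write_silence`, and the silent stand-in file, are illustrative and not part of this commit.

```python
import os
import tempfile
import wave


def is_expected_wav(path, channels=1, sample_width=2, rate=22050):
    """Return True if the WAV at `path` has the expected PCM layout.

    The defaults mirror the workflow's check: mono, 16 bit (2 bytes
    per sample), 22050 Hz.
    """
    with wave.open(path, "rb") as wav:
        return (
            wav.getnchannels() == channels
            and wav.getsampwidth() == sample_width
            and wav.getframerate() == rate
        )


def write_silence(path, seconds=0.1, rate=22050):
    """Write a short silent mono 16-bit WAV, used only as a stand-in."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = os.path.join(tmp, "welcome.wav")
        write_silence(sample)
        print(is_expected_wav(sample))
```

Unlike the string comparison, this check does not depend on the exact wording that a given version of `file` prints.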

wasmedge-piper/Cargo.toml

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+[package]
+name = "wasmedge-piper"
+version = "0.1.0"
+edition = "2021"
+
+# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
+
+[dependencies]
+serde_json = "1.0.120"
+wasmedge-wasi-nn = "0.8.0"

wasmedge-piper/README.md

Lines changed: 126 additions & 0 deletions
@@ -0,0 +1,126 @@
+# Text to speech example with WasmEdge WASI-NN Piper plugin
+
+This example demonstrates how to use the WasmEdge WASI-NN Piper plugin to perform text-to-speech (TTS).
+
+## Build WasmEdge with WASI-NN Piper plugin
+
+Overview of WASI-NN Piper plugin dependencies:
+
+![d2 --layout elk dependencies.d2 dependencies.svg](dependencies.svg)
+
+- [piper](https://github.com/rhasspy/piper): A fast, local neural text to speech system.
+- [piper-phonemize](https://github.com/rhasspy/piper-phonemize): C++ library for converting text to phonemes for Piper.
+- [espeak-ng](https://github.com/rhasspy/espeak-ng): An open source speech synthesizer that supports more than a hundred languages and accents. Piper uses it for text to phoneme translation.
+- [onnxruntime](https://github.com/microsoft/onnxruntime): A cross-platform inference and training machine-learning accelerator. [ONNX](https://onnx.ai/) is an open format built to represent machine learning models. Piper uses ONNX Runtime as an inference backend for its ONNX models to convert phoneme ids to WAV audio.
+
+The WasmEdge WASI-NN Piper plugin relies on the ONNX Runtime C++ API. For installation instructions, please refer to the installation table on the [official website](https://onnxruntime.ai/getting-started).
+
+Example of installing ONNX Runtime 1.14.1 on Ubuntu:
+
+```bash
+curl -LO https://github.com/microsoft/onnxruntime/releases/download/v1.14.1/onnxruntime-linux-x64-1.14.1.tgz
+tar zxf onnxruntime-linux-x64-1.14.1.tgz
+mv onnxruntime-linux-x64-1.14.1/include/* /usr/local/include/
+mv onnxruntime-linux-x64-1.14.1/lib/* /usr/local/lib/
+rm -rf onnxruntime-linux-x64-1.14.1.tgz onnxruntime-linux-x64-1.14.1
+ldconfig
+```
+
+For other dependencies, WasmEdge will download and build them automatically.
+
+Build WasmEdge from source:
+
+```bash
+cd /path/to/wasmedge/source/folder
+
+cmake -GNinja -Bbuild -DCMAKE_BUILD_TYPE=Release -DWASMEDGE_USE_LLVM=OFF -DWASMEDGE_PLUGIN_WASI_NN_BACKEND=Piper
+cmake --build build
+```
+
+Then you will have an executable `wasmedge` runtime at `build/tools/wasmedge/wasmedge` and the WASI-NN with Piper backend plug-in at `build/plugins/wasi_nn/libwasmedgePluginWasiNN.so`.
+
+## Model Download Link
+
+In this example, we will use the [en_US-lessac-medium](https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/lessac/medium) model.
+
+[MODEL CARD](https://huggingface.co/rhasspy/piper-voices/blob/main/en/en_US/lessac/medium/MODEL_CARD):
+
+```
+# Model card for lessac (medium)
+
+* Language: en_US (English, United States)
+* Speakers: 1
+* Quality: medium
+* Samplerate: 22,050Hz
+
+## Dataset
+
+* URL: https://www.cstr.ed.ac.uk/projects/blizzard/2013/lessac_blizzard2013/
+* License: https://www.cstr.ed.ac.uk/projects/blizzard/2013/lessac_blizzard2013/license.html
+
+## Training
+
+Trained from scratch.
+
+```
+
+It has a model file [en_US-lessac-medium.onnx](https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/lessac/medium/en_US-lessac-medium.onnx) and a config file [en_US-lessac-medium.onnx.json](https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/lessac/medium/en_US-lessac-medium.onnx.json).
+
+```bash
+# Download model
+curl -LO https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/lessac/medium/en_US-lessac-medium.onnx
+# Download config
+curl -LO https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/lessac/medium/en_US-lessac-medium.onnx.json
+```
+
+This model uses [eSpeak NG](https://github.com/rhasspy/espeak-ng) to convert text to phonemes, so we also need to download the required espeak-ng-data.
+
+This will download and extract the espeak-ng-data directory to the current working directory:
+
+```bash
+curl -LO https://github.com/rhasspy/piper/releases/download/2023.11.14-2/piper_linux_x86_64.tar.gz
+tar -xzf piper_linux_x86_64.tar.gz piper/espeak-ng-data --strip-components=1
+```
+
+## Build wasm
+
+Run the following command to build the WASM file. The output will be at `target/wasm32-wasi/release/`.
+
+```bash
+cargo build --target wasm32-wasi --release
+```
+
+## Execute
+
+Execute the WASM file with `wasmedge`.
+
+```bash
+WASMEDGE_PLUGIN_PATH=/path/to/parent/directory/of/libwasmedgePluginWasiNN.so /path/to/wasmedge --dir .:. /path/to/wasm
+```
+
+Example layout:
+
+```
+.
+├── en_US-lessac-medium.onnx
+├── en_US-lessac-medium.onnx.json
+├── espeak-ng-data/
+├── WasmEdge/build/
+│   ├── plugins/wasi_nn/libwasmedgePluginWasiNN.so
+│   └── tools/wasmedge/wasmedge
+└── WasmEdge-WASINN-examples/wasmedge-piper/target/wasm32-wasi/release/wasmedge-piper.wasm
+```
+
+Then the command will be:
+
+```bash
+WASMEDGE_PLUGIN_PATH=WasmEdge/build/plugins/wasi_nn WasmEdge/build/tools/wasmedge/wasmedge --dir .:. WasmEdge-WASINN-examples/wasmedge-piper/target/wasm32-wasi/release/wasmedge-piper.wasm
+```
+
+The output `welcome.wav` is the synthesized audio.
+
+## Config options
+
+The JSON config options passed to the WasmEdge WASI-NN Piper plugin via `bytes_array` in `wasmedge_wasi_nn::GraphBuilder::build_from_bytes` are similar to the Piper command-line program options.
+
+See [config.schema.json](config.schema.json) for available options and [json_input.schema.json](json_input.schema.json) for JSON input.
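
As a rough illustration of the README's `Config options` section, the sketch below assembles a config object and serializes it to UTF-8 JSON bytes, which is the kind of payload a guest program would hand to `build_from_bytes`. It is written in Python purely to show the payload shape. The file names follow the README's example layout, and the helper name `make_config_bytes` is hypothetical.

```python
import json


def make_config_bytes(model, config=None, espeak_data=None, **options):
    """Serialize Piper plugin options to UTF-8 JSON bytes.

    Only `model` is required by config.schema.json; other keys are
    included only when a value is given.
    """
    payload = {"model": model}
    if config is not None:
        payload["config"] = config
    if espeak_data is not None:
        payload["espeak_data"] = espeak_data
    # Any further schema keys (e.g. output_type) pass through unchanged.
    payload.update(options)
    return json.dumps(payload).encode("utf-8")


config_bytes = make_config_bytes(
    "en_US-lessac-medium.onnx",
    config="en_US-lessac-medium.onnx.json",
    espeak_data="espeak-ng-data",
    output_type="wav",
)
print(config_bytes.decode("utf-8"))
```

The paths inside the JSON are resolved by the guest, so they have to be reachable through the directory mapped with `--dir .:.`.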

wasmedge-piper/config.schema.json

Lines changed: 68 additions & 0 deletions
@@ -0,0 +1,68 @@
+{
+  "$schema": "http://json-schema.org/draft-07/schema#",
+  "properties": {
+    "model": {
+      "description": "Path to .onnx voice file",
+      "type": "string"
+    },
+    "config": {
+      "description": "Path to JSON voice config file, default is model path + .json",
+      "type": "string"
+    },
+    "output_type": {
+      "default": "wav",
+      "description": "Type of output to produce",
+      "enum": [
+        "raw",
+        "wav"
+      ]
+    },
+    "speaker": {
+      "default": 0,
+      "description": "Numerical id of the default speaker (multi-speaker voices)",
+      "type": "number"
+    },
+    "noise_scale": {
+      "default": 0.667,
+      "description": "Amount of noise to add during audio generation, default value can be overridden by the value in voice model config",
+      "type": "number"
+    },
+    "length_scale": {
+      "default": 1.0,
+      "description": "Speed of speaking (1 = normal, < 1 is faster, > 1 is slower), default value can be overridden by the value in voice model config",
+      "type": "number"
+    },
+    "noise_w": {
+      "default": 0.8,
+      "description": "Variation in phoneme lengths, default value can be overridden by the value in voice model config",
+      "type": "number"
+    },
+    "sentence_silence": {
+      "default": 0.2,
+      "description": "Seconds of silence to add after each sentence",
+      "type": "number"
+    },
+    "espeak_data": {
+      "description": "Path to espeak-ng data directory, required for espeak phonemes",
+      "type": "string"
+    },
+    "tashkeel_model": {
+      "description": "Path to libtashkeel ort model (https://github.com/mush42/libtashkeel), required for Arabic",
+      "type": "string"
+    },
+    "json_input": {
+      "default": false,
+      "description": "input is JSON instead of text",
+      "type": "boolean"
+    },
+    "phoneme_silence": {
+      "additionalProperties": {
+        "type": "number"
+      },
+      "description": "Seconds of extra silence to insert after a single phoneme, this is a mapping from single codepoints to seconds"
+    }
+  },
+  "required": [
+    "model"
+  ]
+}
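
To make the schema's `required`, `enum` and `default` keywords concrete, here is a minimal Python sketch that checks a user config and fills in the defaults listed above. How the plugin itself applies defaults is not shown in this commit, so treat the merge order as an assumption. The names `DEFAULTS`, `OUTPUT_TYPES` and `resolve_config` are hypothetical.

```python
# Defaults copied from config.schema.json. noise_scale, length_scale and
# noise_w are left out on purpose: the schema notes that the voice model
# config can override their defaults.
DEFAULTS = {
    "output_type": "wav",
    "speaker": 0,
    "sentence_silence": 0.2,
    "json_input": False,
}
OUTPUT_TYPES = ("raw", "wav")


def resolve_config(user_config):
    """Check the schema's `required` and `enum` rules, then fill defaults."""
    if "model" not in user_config:
        raise ValueError("'model' is required")
    if not isinstance(user_config["model"], str):
        raise ValueError("'model' must be a string")
    # User-supplied values take precedence over the schema defaults.
    resolved = {**DEFAULTS, **user_config}
    if resolved["output_type"] not in OUTPUT_TYPES:
        raise ValueError(f"output_type must be one of {OUTPUT_TYPES}")
    return resolved


if __name__ == "__main__":
    print(resolve_config({"model": "en_US-lessac-medium.onnx"}))
```

A full validator would also check the remaining `type` constraints; a JSON Schema library is the usual choice for that.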

wasmedge-piper/dependencies.d2

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+direction: right
+WasmEdge WASI-NN Piper plugin -> piper
+piper -> piper-phonemize
+piper -> espeak-ng
+piper -> onnxruntime
+piper-phonemize -> espeak-ng
+piper-phonemize -> onnxruntime
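
The edges in `dependencies.d2` form a small directed acyclic graph. As an illustration only, the sketch below feeds the same edges to Python's standard `graphlib` module to derive one valid dependency-first order. The order in which the WasmEdge build actually fetches and builds these components is not shown in this commit, and the name `DEPENDENCIES` is hypothetical.

```python
from graphlib import TopologicalSorter

# Each key depends on the names in its set (edges from dependencies.d2).
DEPENDENCIES = {
    "WasmEdge WASI-NN Piper plugin": {"piper"},
    "piper": {"piper-phonemize", "espeak-ng", "onnxruntime"},
    "piper-phonemize": {"espeak-ng", "onnxruntime"},
}

# static_order() yields every node after all of its dependencies.
build_order = list(TopologicalSorter(DEPENDENCIES).static_order())
print(build_order)
```

The relative order of `espeak-ng` and `onnxruntime` is not fixed by the graph, since neither depends on the other.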
