Commit ec15988

feat(vibevoice): add ASR support (#8222)
* feat(vibevoice): add ASR support
  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add tests
  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fixups
  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore(tests): download voice files
  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Small fixups
  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Small fixups
  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Try to run on bigger runner
  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Fixups
  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Fixups
  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Fixups
  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* debug
  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* CI can't hold vibevoice
  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
1 parent 93d7e5d commit ec15988

13 files changed, 574 additions and 117 deletions

.github/workflows/test-extra.yml

Lines changed: 24 additions & 5 deletions
@@ -238,7 +238,7 @@ jobs:
     - name: Dependencies
       run: |
         sudo apt-get update
-        sudo apt-get install build-essential ffmpeg
+        sudo apt-get install -y build-essential ffmpeg
         sudo apt-get install -y ca-certificates cmake curl patch espeak espeak-ng python3-pip
         # Install UV
         curl -LsSf https://astral.sh/uv/install.sh | sh
@@ -257,7 +257,7 @@ jobs:
     - name: Dependencies
       run: |
         sudo apt-get update
-        sudo apt-get install build-essential ffmpeg
+        sudo apt-get install -y build-essential ffmpeg
         sudo apt-get install -y ca-certificates cmake curl patch python3-pip
         # Install UV
         curl -LsSf https://astral.sh/uv/install.sh | sh
@@ -276,7 +276,7 @@ jobs:
     - name: Dependencies
       run: |
         sudo apt-get update
-        sudo apt-get install build-essential ffmpeg
+        sudo apt-get install -y build-essential ffmpeg
         sudo apt-get install -y ca-certificates cmake curl patch python3-pip
         # Install UV
         curl -LsSf https://astral.sh/uv/install.sh | sh
@@ -295,12 +295,31 @@ jobs:
     - name: Dependencies
       run: |
         sudo apt-get update
-        sudo apt-get install build-essential ffmpeg
+        sudo apt-get install -y build-essential ffmpeg
         sudo apt-get install -y ca-certificates cmake curl patch python3-pip
         # Install UV
         curl -LsSf https://astral.sh/uv/install.sh | sh
         pip install --user --no-cache-dir grpcio-tools==1.64.1
     - name: Test qwen-tts
       run: |
         make --jobs=5 --output-sync=target -C backend/python/qwen-tts
-        make --jobs=5 --output-sync=target -C backend/python/qwen-tts test
+        make --jobs=5 --output-sync=target -C backend/python/qwen-tts test
+  # tests-vibevoice:
+  #   runs-on: bigger-runner
+  #   steps:
+  #   - name: Clone
+  #     uses: actions/checkout@v6
+  #     with:
+  #       submodules: true
+  #   - name: Dependencies
+  #     run: |
+  #       sudo apt-get update
+  #       sudo apt-get install -y build-essential ffmpeg
+  #       sudo apt-get install -y ca-certificates cmake curl patch python3-pip wget
+  #       # Install UV
+  #       curl -LsSf https://astral.sh/uv/install.sh | sh
+  #       pip install --user --no-cache-dir --break-system-packages grpcio-tools==1.64.1
+  #   - name: Test vibevoice
+  #     run: |
+  #       make --jobs=5 --output-sync=target -C backend/python/vibevoice
+  #       make --jobs=5 --output-sync=target -C backend/python/vibevoice test

backend/python/vibevoice/Makefile

Lines changed: 38 additions & 1 deletion
@@ -2,14 +2,51 @@
 vibevoice:
 	bash install.sh
 
+.PHONY: download-voices
+download-voices:
+	@echo "Downloading voice preset files..."
+	@mkdir -p voices/streaming_model
+	@if command -v wget >/dev/null 2>&1; then \
+		wget -q -O voices/streaming_model/en-Frank_man.pt \
+			https://raw.githubusercontent.com/microsoft/VibeVoice/main/demo/voices/streaming_model/en-Frank_man.pt && \
+		wget -q -O voices/streaming_model/en-Grace_woman.pt \
+			https://raw.githubusercontent.com/microsoft/VibeVoice/main/demo/voices/streaming_model/en-Grace_woman.pt && \
+		wget -q -O voices/streaming_model/en-Mike_man.pt \
+			https://raw.githubusercontent.com/microsoft/VibeVoice/main/demo/voices/streaming_model/en-Mike_man.pt && \
+		wget -q -O voices/streaming_model/en-Emma_woman.pt \
+			https://raw.githubusercontent.com/microsoft/VibeVoice/main/demo/voices/streaming_model/en-Emma_woman.pt && \
+		wget -q -O voices/streaming_model/en-Carter_man.pt \
+			https://raw.githubusercontent.com/microsoft/VibeVoice/main/demo/voices/streaming_model/en-Carter_man.pt && \
+		wget -q -O voices/streaming_model/en-Davis_man.pt \
+			https://raw.githubusercontent.com/microsoft/VibeVoice/main/demo/voices/streaming_model/en-Davis_man.pt && \
+		echo "Voice files downloaded successfully"; \
+	elif command -v curl >/dev/null 2>&1; then \
+		curl -sL -o voices/streaming_model/en-Frank_man.pt \
+			https://raw.githubusercontent.com/microsoft/VibeVoice/main/demo/voices/streaming_model/en-Frank_man.pt && \
+		curl -sL -o voices/streaming_model/en-Grace_woman.pt \
+			https://raw.githubusercontent.com/microsoft/VibeVoice/main/demo/voices/streaming_model/en-Grace_woman.pt && \
+		curl -sL -o voices/streaming_model/en-Mike_man.pt \
+			https://raw.githubusercontent.com/microsoft/VibeVoice/main/demo/voices/streaming_model/en-Mike_man.pt && \
+		curl -sL -o voices/streaming_model/en-Emma_woman.pt \
+			https://raw.githubusercontent.com/microsoft/VibeVoice/main/demo/voices/streaming_model/en-Emma_woman.pt && \
+		curl -sL -o voices/streaming_model/en-Carter_man.pt \
+			https://raw.githubusercontent.com/microsoft/VibeVoice/main/demo/voices/streaming_model/en-Carter_man.pt && \
+		curl -sL -o voices/streaming_model/en-Davis_man.pt \
+			https://raw.githubusercontent.com/microsoft/VibeVoice/main/demo/voices/streaming_model/en-Davis_man.pt && \
+		echo "Voice files downloaded successfully"; \
+	else \
+		echo "Error: Neither wget nor curl found. Cannot download voice files."; \
+		exit 1; \
+	fi
+
 .PHONY: run
 run: vibevoice
 	@echo "Running vibevoice..."
 	bash run.sh
 	@echo "vibevoice run."
 
 .PHONY: test
-test: vibevoice
+test: vibevoice download-voices
 	@echo "Testing vibevoice..."
 	bash test.sh
 	@echo "vibevoice tested."
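
Note on the download-voices target: the recipe repeats the same URL pattern twelve times (six presets for wget, six for curl). A loop-based equivalent might look like the sketch below. It is not part of this commit. The names `BASE_URL`, `VOICES`, `voice_url`, `fetch`, and `download_voices` are hypothetical, and the curl branch adds `-f` (fail on HTTP errors), which the commit's `curl -sL` does not use.

```shell
#!/bin/sh
# Sketch only: loop-based equivalent of the download-voices recipe.
# BASE_URL and the preset names are copied from the Makefile in this commit;
# the function and variable names are hypothetical.
BASE_URL="https://raw.githubusercontent.com/microsoft/VibeVoice/main/demo/voices/streaming_model"
VOICES="en-Frank_man en-Grace_woman en-Mike_man en-Emma_woman en-Carter_man en-Davis_man"

# voice_url NAME: print the download URL for one voice preset.
voice_url() {
    printf '%s/%s.pt\n' "$BASE_URL" "$1"
}

# fetch URL DEST: download with wget if available, otherwise curl.
fetch() {
    if command -v wget >/dev/null 2>&1; then
        wget -q -O "$2" "$1"
    elif command -v curl >/dev/null 2>&1; then
        # -f is an addition to the commit's flags: it turns HTTP errors into a non-zero exit.
        curl -fsSL -o "$2" "$1"
    else
        echo "Error: Neither wget nor curl found. Cannot download voice files." >&2
        return 1
    fi
}

# download_voices DIR: fetch every preset into DIR, stopping at the first failure.
download_voices() {
    mkdir -p "$1" || return 1
    for voice in $VOICES; do
        fetch "$(voice_url "$voice")" "$1/$voice.pt" || return 1
    done
    echo "Voice files downloaded successfully"
}
```

Sourcing this file and calling `download_voices voices/streaming_model` would write the six `.pt` files to the same paths the Makefile target uses. Nothing is downloaded until that call is made.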
