[WIP] Add initial GPU support by edurenye · Pull Request #4 · rhasspy/wyoming-addons

edurenye · 2023-08-24T22:30:09Z

This is a work in progress.
I think it is working for whisper, but I'm not sure how to check it.
For piper it gives me the error "unrecognized arguments: --cuda", although I got the instructions from here: https://github.com/rhasspy/piper At the end it says that it should work by just installing onnxruntime-gpu and running piper with the --cuda argument.

What am I missing?

I guess this will conflict with those who just want to use the CPU. How can we handle that? By making different images?
Ex: piper and piper-gpu

edurenye · 2023-08-24T22:31:08Z

Closes #3

DBaker85 · 2023-09-12T12:41:04Z

Just wanted to leave my 2 cents here:
I tried your whisper changes locally, and they are working perfectly on my 1080 Ti with Docker.
VRAM is assigned and the container works as well. Home Assistant also recognised and used it perfectly.
Nice one!

(Did not try Piper)

edurenye · 2023-09-14T10:24:19Z

Piper does not work because of this: rhasspy/rhasspy3#49

wdunn001 · 2023-10-05T23:26:28Z

Whisper is still targeting Ubuntu 20.04. Is there a reason for that?

wdunn001 · 2023-10-05T23:27:32Z

This may need to be its own image, since the majority of users would not want the CUDA version.

wdunn001 · 2023-10-05T23:29:13Z

Could this be split into two tickets, one for whisper and one for piper, if piper is experiencing issues? The whisper portion is in reality the more useful of the two and benefits more from this feature.

edurenye · 2023-10-06T11:27:34Z

@wdunn001 The documentation at https://github.com/guillaumekln/faster-whisper/ says it requires cuDNN 8 for CUDA 11, and for those versions of CUDA and cuDNN the highest Ubuntu version available is 20.04. I had to look for it because, sadly, it was not working with the image I set for the other containers.
Updating to CUDA 12 is not planned in the very short term. See the explanation here: SYSTRAN/faster-whisper#47 (comment).

edurenye · 2023-10-06T11:32:42Z

Sorry, editing because I misunderstood your comment.
Yes, it makes sense to make it 2 different images, I can add that.

But I guess that, for better maintainability, the solution we add for one should be the same as for the others, so I think it is better to have the conversation in a single issue and PR.
If you need to use it right now, you can just add the changes to your local Dockerfile and build it.
Or, if you need to use CUDA 12, you could try the workarounds described here: SYSTRAN/faster-whisper#153 (comment)

edurenye · 2023-10-06T11:43:54Z

And I'll try to add porcupine1 too

wdunn001 · 2023-10-06T14:06:31Z

Awesome! I am happy to help if you need anything. Would we want to add the docker arguments for the CUDA image to the documentation here?

edurenye · 2023-10-06T15:44:07Z

I added the changes.
I have not tested the new porcupine1 container, since that software does not support my language yet.

And yes, ofc we should document this. I was also thinking: should we add a docker-compose.yml file?
It made sense to me, since I use Home Assistant and need the 3 services. But now that porcupine1 has been added I am not sure anymore since, as far as I know, porcupine1 and openwakeword do the same thing, which is quite confusing for me.

edurenye · 2023-10-06T15:47:06Z

But the README.md file right now only documents using it by pulling the images, not building them, so that will depend on the tags the maintainer might wanna use. Should we add building instructions to the README.md file?

wdunn001 · 2023-10-06T16:07:25Z

I think so, for sure. We can create a contributors section. I'll work on it. I will be building it for the first time this weekend, so I'll try to document the process.

edurenye · 2023-10-06T17:19:26Z

I will give you the docker-compose files and a starting point.

edurenye · 2023-10-06T18:09:33Z

I just added it, tell me how it works for you, you can create your own docker-compose.x.yml file for your use case.

I have not added porcupine1 to the docker compose because it uses the same port as openwakeword, so for that particular case it could be added in the custom extend file.

wdunn001 · 2023-10-08T16:43:42Z

ok so I am getting an error deploying this via compose or run

usage: main.py [-h] --model {tiny,tiny-int8,base,base-int8,small,small-int8,medium,medium-int8} --uri URI --data-dir DATA_DIR [--download-dir DOWNLOAD_DIR] [--device DEVICE] [--language LANGUAGE] [--compute-type COMPUTE_TYPE] [--beam-size BEAM_SIZE] [--debug]
main.py: error: the following arguments are required: --model, --uri, --data-dir
/run.sh: line 3: --uri: command not found
/run.sh: line 4: --data-dir: command not found
/run.sh: line 5: --download-dir: command not found

It needs additional params, in contrast with the other build.

These appear to be supplied by the run.sh file, and I see it's called in the Dockerfile.

I added commands to the GPU compose file identical to those in the NOGPU version, and they work fine, so I made a PR. It's only the ones in run.sh that seem not to work.

I am on Ubuntu 22.04 with the latest Docker, if that matters.
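
The log above is consistent with the line continuations in run.sh not taking effect: main.py starts with no arguments, and lines 3 to 5 are then executed as commands of their own. One possible cause (an assumption, the thread does not confirm it) is whitespace or a carriage return after a trailing backslash. A minimal Python sketch of a checker for that case follows; find_broken_continuations and EXAMPLE_SCRIPT are hypothetical names, and the script text is made up for illustration rather than copied from the real run.sh.

```python
import re
from typing import List

# Hypothetical helper, not part of wyoming-addons: flags lines where a
# backslash is followed by trailing whitespace (including "\r" from CRLF
# line endings), which stops the shell from joining the next line.
BROKEN_CONTINUATION = re.compile(r"\\[ \t\r]+$")


def find_broken_continuations(script_text: str) -> List[int]:
    """Return 1-based numbers of lines whose backslash is not the last character."""
    bad_lines = []
    for number, line in enumerate(script_text.split("\n"), start=1):
        if BROKEN_CONTINUATION.search(line):
            bad_lines.append(number)
    return bad_lines


# Made-up script for illustration; the real run.sh may differ.
EXAMPLE_SCRIPT = (
    "#!/usr/bin/env bash\n"
    "python3 main.py \\ \n"                 # backslash then a space: broken
    "    --uri 'tcp://0.0.0.0:10300' \\\n"  # backslash is last: fine
    "    --data-dir /data\n"
)

if __name__ == "__main__":
    print(find_broken_continuations(EXAMPLE_SCRIPT))
```

This only catches a backslash followed by stray whitespace. A continuation backslash that is missing altogether produces the same symptom and would need a manual look at the script.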

edurenye · 2023-10-09T10:37:48Z

This is weird. According to the documentation, the only things not extended should be volumes_from and depends_on. We can follow this discussion in the PR that you created: edurenye#1

AnkushMalaker · 2023-10-15T21:57:16Z

I needed to add --device cuda to actually load the whisper model onto my GPU. I second that we could split this into different branches to handle GPU for whisper, piper and wakeword. I made a branch for that, not sure if I should raise this as a PR.

In that branch I removed --cuda for piper, as that isn't working upstream yet, and changed the default data directories to /var/data to be consistent with some other docker compose files I saw.

New to contributing, happy to hear thoughts.

https://github.com/AnkushMalaker/wyoming-addons/tree/gpu

edurenye · 2023-10-16T14:51:20Z

I rebased onto the latest changes from master and fixed the typos in the README file.

I don't think we need to create another branch. In the meantime, you can just have an extend file where you use the GPU options for whisper and openwakeword and the non-GPU option for piper.

And regarding /var/data, I am generally against storing user data in a system folder. Also, passing the whole folder to the docker container might load a lot of data from other applications that is not needed.

wdunn001 · 2023-10-18T10:57:11Z

@edurenye agreed using cpu for piper seems to be more than sufficient. I am still experiencing issues with openwakeword but it may just be my environment. I'll pull down the changes here and try again. I'll push any fixes I find to the PR on your branch.

wdunn001 · 2023-10-18T10:59:18Z

piper/GPU.Dockerfile

@@ -0,0 +1,35 @@
+FROM nvidia/cuda:12.1.1-cudnn8-runtime-ubuntu22.04


Perhaps we remove this file in the interim to get rid of dead code?

I do not see it as dead code, when this issue gets fixed it should just work right away.

ok sounds good

wdunn001 · 2023-10-18T11:00:28Z

porcupine1/GPU.Dockerfile

@@ -0,0 +1,32 @@
+FROM nvidia/cuda:12.1.1-cudnn8-runtime-ubuntu22.04


Remove to get rid of dead code?

I do not see it as dead code either. People who want to use it can do so by extending the docker compose file, or use it directly with docker run as documented here, but adding the CUDA stuff: https://github.com/rhasspy/wyoming-porcupine1/blob/master/README.md

sounds good

wdunn001 · 2023-10-18T11:08:22Z

.gitignore

@@ -0,0 +1,12 @@
+# OpenWakeWord


perhaps we reference managed volumes instead to prevent this?

i.e.
volumes:
  openwakeword-data:
  whisper-data:
  piper-data:

This is what I did in my version.
We could also add -gpu for volumes connected to GPU-enabled instances in the GPU compose file, so that we can keep data separate between instance types.

Do you mean non-bind mounts? But then adding custom models (thinking mainly about OpenWakeWord here) is hard, whereas with bind mounts you can just move the model to that directory. Also, I don't think there will be a case where you want to move from GPU to NONGPU while changing models, but I am probably wrong there.

I think I agree with you here, probably the best way is to not bind them by default and then you can bind them extending the docker compose and point wherever you have the custom model.

Or maybe we could look at passing it as a parameter, haven't looked into it, I'm still fighting to generate the custom model actually.

wdunn001 · 2023-10-18T11:11:30Z

README.md

+docker compose down
+```
+
+### Run with GPU


Should we reference documentation on how to set up Docker for GPU? (I can of course add it in a separate PR.)

Yes, good idea!

Maxcodesthings · 2023-10-25T23:08:17Z

I have tried applying the contents of this PR to my local instance. I do not see the faster-whisper implementation use GPU over CPU.

I have conflated the dockerfiles as such and focused on only using GPU for whisper container:

  whisper:
    container_name: whisper
    build:
      context: /opt/wyoming-addons/whisper/
      dockerfile: GPU.Dockerfile
    # image: rhasspy/wyoming-whisper:latest
    restart: unless-stopped
    ports:
      - 10300:10300
    volumes:
      - /opt/homeassistant/whisper:/data
    command: 
      - --model
      - medium-int8
      - --language
      - en
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

I can tell my GPU is passed through because it appears in nvidia-smi in the container.

However, when watching the GPU while my speech is processed, the usage does not increase, and when watching the CPU, the usage clearly spikes, so it's the CPU processing my speech.

How have you all tested that this implementation of faster-whisper is working? I would like to do the same on my machine
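
One way to check this, sketched in Python under the assumption that nvidia-smi is on the PATH (on the host or inside the container): sample GPU utilization and memory once while idle and once during a transcription, then compare. The helper names parse_gpu_samples and read_gpu_samples are hypothetical; the query flags are standard nvidia-smi options.

```python
import subprocess
from typing import List, Tuple

# Ask nvidia-smi for bare numbers: one "utilization, memory" row per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=utilization.gpu,memory.used",
    "--format=csv,noheader,nounits",
]


def parse_gpu_samples(csv_text: str) -> List[Tuple[int, int]]:
    """Parse rows of 'utilization %, memory used MiB' into integer pairs."""
    samples = []
    for row in csv_text.strip().splitlines():
        utilization, memory_used = (field.strip() for field in row.split(","))
        samples.append((int(utilization), int(memory_used)))
    return samples


def read_gpu_samples() -> List[Tuple[int, int]]:
    """Run nvidia-smi and return the parsed rows."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_samples(result.stdout)


if __name__ == "__main__":
    try:
        for index, (utilization, memory_used) in enumerate(read_gpu_samples()):
            print(f"GPU {index}: {utilization}% busy, {memory_used} MiB in use")
    except (FileNotFoundError, subprocess.CalledProcessError) as error:
        print(f"Could not query nvidia-smi: {error}")
```

If memory use and utilization stay flat while a request is processed, the model is most likely running on the CPU. The parser assumes numeric fields, so it will raise on GPUs that report a value as not available.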

Edit:

Found the issue!

You are missing --device in your compose

    command:
      - --model
      - small
      - --language
      - en
      - --device
      - cuda
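
Why the extra flag matters can be sketched with a simplified stand-in for the service's argument parser. This is not the upstream code: build_parser and resolve_device are hypothetical names, only some of the flags from the usage message earlier in the thread are included, and the "cpu" default for --device is an assumption that should be checked against the upstream source.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Simplified stand-in for the real parser; only some flags are shown."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("--model", required=True)
    parser.add_argument("--uri", required=True)
    parser.add_argument("--data-dir", required=True)
    parser.add_argument("--language")
    # Assumption: the device falls back to "cpu" when the flag is omitted.
    parser.add_argument("--device", default="cpu")
    return parser


def resolve_device(command: List[str]) -> str:
    """Return the device the service would use for the given arguments."""
    return build_parser().parse_args(command).device


# Arguments similar to the compose command above, plus the required ones.
BASE_COMMAND = [
    "--model", "small",
    "--uri", "tcp://0.0.0.0:10300",
    "--data-dir", "/data",
    "--language", "en",
]

if __name__ == "__main__":
    print(resolve_device(BASE_COMMAND))                         # cpu
    print(resolve_device(BASE_COMMAND + ["--device", "cuda"]))  # cuda
```

Under that assumption, passing the GPU through to the container is not enough on its own: the device has to be requested explicitly on the command line.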

edurenye · 2023-10-27T04:54:38Z

Good finding! It was not documented, but that parameter exists in https://github.com/rhasspy/wyoming-faster-whisper/blob/master/wyoming_faster_whisper/__main__.py

mreilaender · 2023-11-10T16:54:51Z

Can you resolve the conflicts? I would love to see the improvements from using the GPU directly :)

mreilaender · 2023-11-12T13:40:46Z

Doesn't work with piper since wyoming-piper doesn't declare the --cuda argument. I created a PR

Rudd-O · 2025-08-04T21:58:18Z

I've pull requested my work on your branch. Feel free to update this PR based on that.

edurenye · 2025-08-05T16:42:36Z

Thanks for your contribution @Rudd-O!

I left you a comment in the PR, could you take a look at it please?

edurenye · 2025-08-05T16:45:46Z

For others following the thread, this is the PR: edurenye#4

Also, I removed runtime: nvidia because docker compose now supports NVIDIA GPUs out of the box, without "nvidia-docker", and in recent versions that runtime has been deprecated and removed. See https://github.com/NVIDIA/nvidia-docker#usage

When I have more time, I'll rebase onto the latest work from the base repository.

This was referenced Sep 14, 2023

[Question] Faster-Whisper Home Assistant GPU or Tensor (Coral) Suport? rhasspy/rhasspy3#37

Open

Add support for CUDA in piper rhasspy/rhasspy3#49

Open

edurenye force-pushed the gpu branch from 1a7f3e8 to 760312c Compare October 6, 2023 15:07

edurenye force-pushed the gpu branch from 9a99267 to 48f4a98 Compare October 16, 2023 14:44

wdunn001 reviewed Oct 18, 2023

edurenye force-pushed the gpu branch from 48f4a98 to fe4ddc4 Compare October 20, 2023 08:12

edurenye and others added 24 commits September 23, 2025 18:43

Add initial GPU support

6289a66

Add support for both GPU and NONGPU Dockerfiles

b3a639b

Add docker compose and instructions

d8e8a49

Make whisper use cuda

7f4077a

Update Dockerfiles and add Vosk

321f617

Create __main__.py

b13a5db

Adds --cuda arg support

Create process.py

3875b0a

Adds --cuda arg support

Update docker-compose.gpu.yml

1a076de

Adds bind mounts for __main__.py and process.py to add --cuda arg support to piper until the upstream project includes it.

Remove volumes, they can be added by extended docker compose files

556d6e5

Fixes from rebases

fce9141

Fixes from rebases

dc033d0

Fixes from rebases

730f3e8

Use YAML anchores, all commented services, and add whisper-cpp

8cc843c

Simplify everything using BASE image args

c382120

Simplify everything using BASE image args

8d9de09

Add default values and use env for runtime

e854a12

Go back to two files for piper and updated piper gpu

1ef3433

Add microwakeword option

c01cba3

chore: Add rhasspy-speech option

3288832

chore: Update images and whisper version

b60aa3f

Go back to openwakeword 1.8.2

5ba673c

Fix errors and warnings

da80154

Remove runtime nvidia

9a46639

Runtime NVIDIA was deprecated and removed from docker: https://github.com/NVIDIA/nvidia-docker#usage

Fixes from the last rebase, and add speech-to-phrase

f2a939e

edurenye force-pushed the gpu branch from 49b1a05 to f2a939e Compare September 23, 2025 18:07

Update CUDA image to Ubuntu 24.04

75bc496

synesthesiam removed their assignment Oct 28, 2025
