Hi Everyone,
I'm trying to run faster-whisper on a DGX Spark but running into issues. I've built CTranslate2 from source (see Dockerfile below), but every time I run it with faster-whisper I get the error message below. I'm not entirely sure this is a CTranslate2 issue, but I believe the error comes from it, as the only Google result for the message points back to this repository. Any ideas would be greatly appreciated.
Sysinfo:
- Platform: Nvidia DGX Spark
- GPU: GB10
- OS: DGX OS (Ubuntu 24.04.3)
- Arch: arm64
Error message:

```
whisper_cshore | Traceback (most recent call last):
whisper_cshore |   File "/usr/src/.venv/lib/python3.10/site-packages/wyoming/server.py", line 41, in run
whisper_cshore |     if not (await self.handle_event(event)):
whisper_cshore |   File "/usr/src/wyoming_faster_whisper/faster_whisper_handler.py", line 77, in handle_event
whisper_cshore |     text = " ".join(segment.text for segment in segments)
whisper_cshore |   File "/usr/src/wyoming_faster_whisper/faster_whisper_handler.py", line 77, in <genexpr>
whisper_cshore |     text = " ".join(segment.text for segment in segments)
whisper_cshore |   File "/usr/src/.venv/lib/python3.10/site-packages/faster_whisper/transcribe.py", line 1190, in generate_segments
whisper_cshore |     encoder_output = self.encode(segment)
whisper_cshore |   File "/usr/src/.venv/lib/python3.10/site-packages/faster_whisper/transcribe.py", line 1400, in encode
whisper_cshore |     return self.model.encode(features, to_cpu=to_cpu)
whisper_cshore | RuntimeError: Conv1D on GPU currently requires the cuDNN library which is not integrated in this build
```
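For context, CTranslate2 raises this error when the library was either compiled without cuDNN support or cannot load the cuDNN shared library at runtime. As a hedged diagnostic (not part of the project, just a sketch), one way to check whether a cuDNN library is even loadable from inside the container:

```python
import ctypes


def cudnn_loadable():
    """Try to dlopen a cuDNN shared library; return the first name that loads, or None."""
    for name in ("libcudnn.so.9", "libcudnn.so.8", "libcudnn.so"):
        try:
            ctypes.CDLL(name)
            return name
        except OSError:
            continue
    return None


print("loadable cuDNN library:", cudnn_loadable())
```

If this prints `None` even though `nvidia-cudnn-cu12` is pip-installed, the package's library directory (typically `site-packages/nvidia/cudnn/lib`) is probably not on `LD_LIBRARY_PATH`, so both the dlopen above and CTranslate2 itself would fail to find it.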
compose.yml

```yaml
services:
  whisper:
    container_name: whisper_cshore
    image: cshore/wyoming-whisper:latest
    build:
      dockerfile: Dockerfile
    environment:
      - TZ=Europe/London
    ports:
      - 10300:10300
    volumes:
      - ./data:/data
    restart: unless-stopped
    command: --model tiny-int8 --language en --device=cuda
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```
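As a quick sanity check that the `deploy` reservation above actually exposes the GPU inside the container (a hypothetical diagnostic, not something from the issue), one can list the NVIDIA device nodes visible to the process:

```python
import glob


def nvidia_device_nodes():
    """List /dev/nvidia* device nodes visible to this process."""
    return sorted(glob.glob("/dev/nvidia*"))


print(nvidia_device_nodes())
```

An empty list inside the container would point to a GPU-passthrough problem rather than a CTranslate2 build problem.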
Dockerfile

```dockerfile
FROM nvidia/cuda:12.6.3-cudnn-devel-ubuntu22.04
# TODO: try the runtime-only image

ARG TARGETARCH
ARG TARGETVARIANT

WORKDIR /usr/src

# Install faster-whisper
COPY ./pyproject.toml ./
RUN \
    apt update \
    && apt-get install -y --no-install-recommends \
        python3 \
        python3-pip \
        python3-venv \
    \
    && python3 -m venv .venv \
    && .venv/bin/pip3 install --no-cache-dir -U \
        setuptools \
        wheel \
    && .venv/bin/pip3 install --no-cache-dir \
        --extra-index-url 'https://download.pytorch.org/whl/cu126' \
        'torch==2.6.0' \
    \
    && .venv/bin/pip3 install --no-cache-dir \
        # --extra-index-url https://www.piwheels.org/simple \
        -e '.[transformers,sherpa,onnx-asr]'

RUN .venv/bin/pip3 install -U --no-cache-dir nvidia-cublas-cu12
RUN .venv/bin/pip3 install -U --no-cache-dir nvidia-cudnn-cu12

# Build CTranslate2 on arm64 with CUDA and cuDNN backends
RUN \
    apt update \
    && apt install -y --no-install-recommends \
        git \
        cmake \
        make \
        libomp-dev \
        python3-dev

RUN git clone --recursive https://github.com/OpenNMT/CTranslate2.git

RUN \
    cd CTranslate2 && \
    mkdir build && \
    cd build && \
    cmake .. -DWITH_CUDA=ON -DWITH_CUDNN=ON -DWITH_MKL=OFF && \
    make -j20 && \
    make install && \
    ldconfig

# Install the built CTranslate2 Python bindings
RUN \
    cd CTranslate2/python && \
    /usr/src/.venv/bin/pip3 install -r install_requirements.txt && \
    /usr/src/.venv/bin/python3 setup.py bdist_wheel && \
    /usr/src/.venv/bin/pip3 install dist/*.whl

RUN \
    apt remove -y \
        git \
        make \
        cmake

COPY ./ ./
EXPOSE 10300
ENTRYPOINT ["bash", "docker_run.sh"]
```
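To confirm whether the wheel built above actually integrated CUDA, a sketch using CTranslate2's public query helper `get_supported_compute_types` (guarded so it degrades gracefully in environments where `ctranslate2` isn't installed or no CUDA device is visible):

```python
def report_ct2_cuda():
    """Report the compute types CTranslate2 supports on CUDA, or why the query failed."""
    try:
        import ctranslate2
    except ImportError:
        return "ctranslate2 is not installed in this environment"
    try:
        return "cuda compute types: %s" % ctranslate2.get_supported_compute_types("cuda")
    except Exception as exc:  # e.g. no CUDA device, or a build without CUDA support
        return "cuda query failed: %s" % exc


print(report_ct2_cuda())
```

Running this inside the container should make it clear whether the failure is in the build configuration (CMake not finding cuDNN despite `-DWITH_CUDNN=ON`) or in the runtime environment.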