Win 10, Python 3.10.11, CUDA 12.3, cuDNN 8.9.7.29, Ampere (3060 Ti).

`pip list`:

```
accelerate 0.25.0 aiohttp 3.9.1 aiosignal 1.3.1 anyio 3.6.2 appdirs 1.4.4 argcomplete 3.2.1 argon2-cffi 21.3.0 argon2-cffi-bindings 21.2.0 arrow 1.2.3 asttokens 2.2.1
async-timeout 4.0.2 attrs 23.1.0 audioread 3.0.1 backcall 0.2.0 backoff 2.1.2 beautifulsoup4 4.12.2 bleach 6.0.0 boto3 1.26.120 botocore 1.29.120 certifi 2022.12.7
cffi 1.15.1 charset-normalizer 3.1.0 click 8.1.7 colorama 0.4.6 coloredlogs 15.0.1 comm 0.1.3 cuda-python 12.1.0 Cython 0.29.34 datasets 2.16.1 debugpy 1.6.7
decorator 5.1.1 defusedxml 0.7.1 dill 0.3.7 einops 0.6.1 encodec 0.1.1 executing 1.2.0 fastjsonschema 2.16.3 filelock 3.12.0 flash_attn 2.4.2 fqdn 1.5.1
frozenlist 1.4.1 fsspec 2023.10.0 funcy 2.0 githubrelease 1.5.9 huggingface-hub 0.20.2 humanfriendly 10.0 idna 3.4 ifaddr 0.2.0 ipykernel 6.22.0 ipython 8.12.0
ipython-genutils 0.2.0 ipywidgets 8.0.6 isoduration 20.11.0 jedi 0.18.2 Jinja2 3.1.2 jmespath 1.0.1 joblib 1.3.2 jsonpointer 2.3 jsonschema 4.17.3 jupyter 1.0.0
jupyter_client 8.2.0 jupyter-console 6.6.3 jupyter_core 5.3.0 jupyter-events 0.6.3 jupyter_server 2.5.0 jupyter_server_terminals 0.4.4 jupyterlab-pygments 0.2.2
jupyterlab-widgets 3.0.7 lazy_loader 0.3 librespot 0.0.9 librosa 0.10.1 LinkHeader 0.4.3 llvmlite 0.41.1 MarkupSafe 2.1.2 matplotlib-inline 0.1.6 mistune 2.0.5
more-itertools 10.1.0 mpmath 1.3.0 msgpack 1.0.7 multidict 6.0.4 multiprocess 0.70.15 music-tag 0.4.3 mutagen 1.46.0 nbclassic 0.5.5 nbclient 0.7.4 nbconvert 7.3.1
nbformat 5.8.0 nest-asyncio 1.5.6 networkx 3.1 ninja 1.11.1.1 notebook 6.5.4 notebook_shim 0.2.3 numba 0.58.1 numpy 1.24.3 nvidia-cuda-runtime-cu12 12.3.101
openai-whisper 20231117 optimum 1.16.1 packaging 23.2 pandas 2.1.4 pandocfilters 1.5.0 parso 0.8.3 pickleshare 0.7.5 Pillow 9.5.0 pip 23.3.2 pipx 1.4.1
platformdirs 3.4.0 pooch 1.8.0 prometheus-client 0.16.0 prompt-toolkit 3.0.38 protobuf 3.20.1 psutil 5.9.5 pure-eval 0.2.2 pyarrow 14.0.2 pyarrow-hotfix 0.6
pycparser 2.21 pycryptodomex 3.18.0 pydub 0.25.1 Pygments 2.15.1 PyOgg 0.6.14a1 pyreadline3 3.4.1 pyrsistent 0.19.3 python-dateutil 2.8.2 python-json-logger 2.0.7
pytz 2023.3.post1 pywin32 306 pywinpty 2.0.10 PyYAML 6.0 pyzmq 25.0.2 qtconsole 5.4.2 QtPy 2.3.1 regex 2023.3.23 requests 2.31.0 rfc3339-validator 0.1.4
rfc3986-validator 0.1.1 s3transfer 0.6.0 safetensors 0.4.1 scikit-learn 1.3.2 scipy 1.10.1 Send2Trash 1.8.0 sentencepiece 0.1.99 setuptools 69.0.3 six 1.16.0
sniffio 1.3.0 soundfile 0.12.1 soupsieve 2.4.1 soxr 0.3.7 stack-data 0.6.2 suno-bark 0.0.1a0 sympy 1.11.1 terminado 0.17.1 threadpoolctl 3.2.0 tiktoken 0.5.2
tinycss2 1.2.1 tokenizers 0.15.0 tomli 2.0.1 torch 2.1.2+cu121 torchaudio 2.1.2+cu121 torchvision 0.16.2+cu121 tornado 6.3.1 tqdm 4.65.0 traitlets 5.9.0
transformers 4.37.0.dev0 typing_extensions 4.5.0 tzdata 2023.4 uri-template 1.2.0 urllib3 1.26.15 userpath 1.9.1 wcwidth 0.2.6 webcolors 1.13 webencodings 0.5.1
websocket-client 1.5.2 wheel 0.42.0 widgetsnbextension 4.0.7 xxhash 3.4.1 yarl 1.9.4 zeroconf 0.64.0
```

I have the following code:

```python
import torch
from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor, pipeline
from datasets import load_dataset
import time

# Measure the start time
start_time = time.time()

# Use the GPU (and fp16) if available
device = "cuda:0" if torch.cuda.is_available() else "cpu"
torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32

# Load the model; low_cpu_mem_usage reduces peak RAM during loading
model_id = "openai/whisper-large-v3"
model = AutoModelForSpeechSeq2Seq.from_pretrained(
    model_id,
    torch_dtype=torch_dtype,
    low_cpu_mem_usage=True,
    use_safetensors=True,
    attn_implementation="flash_attention_2",
)

# Move the model to the GPU if available
model.to(device)

processor = AutoProcessor.from_pretrained(model_id)

pipe = pipeline(
    "automatic-speech-recognition",
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
    max_new_tokens=128,
    chunk_length_s=30,
    batch_size=16,
    return_timestamps=True,
    torch_dtype=torch_dtype,
    device=device,
)

# Note: this dataset sample is loaded but never used below; the pipeline
# runs on a local file ("1.ogg") instead.
dataset = load_dataset("distil-whisper/librispeech_long", "clean", split="validation")
sample = dataset[0]["audio"]

# Execute the pipeline on a local audio file
result = pipe("1.ogg")

# Print the result and execution time
print(result["text"])
end_time = time.time()
execution_time = end_time - start_time
print(f"Execution time: {execution_time} seconds")
```

I am getting this error:
😥
---
Load the model on the GPU.
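For example, a minimal sketch of what that could look like, using `device_map` to place the weights on the GPU at load time (my suggestion, not a confirmed fix; it needs `accelerate`, which is already in the pip list above):

```python
import torch
from transformers import AutoModelForSpeechSeq2Seq

model_id = "openai/whisper-large-v3"

# Load the fp16 weights straight onto the GPU instead of loading on the
# CPU and then calling model.to(device) afterwards.
model = AutoModelForSpeechSeq2Seq.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
    use_safetensors=True,
    attn_implementation="flash_attention_2",
    device_map="cuda:0",  # requires accelerate; places the model on GPU 0 at load time
)
```

If the model is loaded this way, drop the `device=device` argument from the `pipeline(...)` call, since transformers refuses to move a model that accelerate has already placed.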
---
I was wrong; speed gains are noticeable with a larger input.
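For instance, a rough way to measure that (a sketch reusing the `pipe` object from the code above; `short.wav` and `long.wav` are hypothetical local files):

```python
import time

# Warm up once so one-time model/kernel initialization does not skew the numbers.
pipe("short.wav")

# Time the same pipeline on a short and a long clip; throughput per second
# of audio should improve noticeably on the longer input.
for path in ("short.wav", "long.wav"):
    t0 = time.time()
    pipe(path)
    print(f"{path}: {time.time() - t0:.1f} s")
```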
I did this: