diff --git a/examples/windows/accuracy_benchmark/requirements.txt b/examples/windows/accuracy_benchmark/requirements.txt
index 7b2a568f2..ad4c91cac 100644
--- a/examples/windows/accuracy_benchmark/requirements.txt
+++ b/examples/windows/accuracy_benchmark/requirements.txt
@@ -8,4 +8,4 @@ peft>=0.5.0
 rwkv>=0.7.3
 tiktoken==0.7.0
 tqdm==4.66.5
-transformers==4.49.0
+transformers==4.57.3
diff --git a/examples/windows/onnx_ptq/genai_llm/requirements.txt b/examples/windows/onnx_ptq/genai_llm/requirements.txt
index dd9b80084..21b59ac03 100644
--- a/examples/windows/onnx_ptq/genai_llm/requirements.txt
+++ b/examples/windows/onnx_ptq/genai_llm/requirements.txt
@@ -1,4 +1,4 @@
 datasets>=2.14.5
 onnx==1.18.0
-torch==2.6.0
-transformers==4.49.0
+torch==2.9.0
+transformers==4.57.3
diff --git a/examples/windows/onnx_ptq/whisper/README.md b/examples/windows/onnx_ptq/whisper/README.md
index 2f8fd23d5..7d82c0dc4 100644
--- a/examples/windows/onnx_ptq/whisper/README.md
+++ b/examples/windows/onnx_ptq/whisper/README.md
@@ -41,6 +41,33 @@
 pip install -r requirements.txt
 ```
 
+### Optional: Installing the latest PyTorch and TorchAudio versions (>=2.8)
+
+If you need the latest versions of PyTorch and TorchAudio (>=2.8), follow these additional steps:
+
+1. **Install PyTorch and TorchAudio with CUDA support:**
+
+   ```bash
+   pip install --upgrade torch torchaudio --index-url https://download.pytorch.org/whl/cu128
+   ```
+
+2. **Install torchcodec:**
+
+   ```bash
+   pip install torchcodec
+   ```
+
+3. **Set up FFmpeg dependencies:**
+   - Download FFmpeg from:
+   - Extract the archive
+   - Copy all DLL files from the `bin` folder of the extracted FFmpeg directory and paste them into the torchcodec package folder (typically located at `/Lib/site-packages/torchcodec`)
+
+4. **Copy PyTorch DLL files:**
+   - Navigate to the `lib` folder in the torch package directory
+   - Copy all DLL files from this folder and paste them into the torchcodec package folder
+
+After completing these steps, torchaudio will be ready for use.
+
 ## Inference script
 
 The script `whisper_optimum_ort_inference.py` is for Optimum-ORT based inference of an ONNX Whisper model. It takes an audio file (.wav) as input and transcribes its content in english. This script also supports Word Error Rate (WER) accuracy measurement.
diff --git a/examples/windows/onnx_ptq/whisper/requirements.txt b/examples/windows/onnx_ptq/whisper/requirements.txt
index 4b85b8989..6bf00ee98 100644
--- a/examples/windows/onnx_ptq/whisper/requirements.txt
+++ b/examples/windows/onnx_ptq/whisper/requirements.txt
@@ -4,10 +4,10 @@ datasets==2.19.0
 evaluate
 jiwer
 librosa
-onnx==1.17.0
+onnx==1.19.0
 onnxruntime-gpu==1.20.1
 optimum==1.23.3
 soundfile
 torch==2.7.0+cu128
 torchaudio==2.7.0+cu128
-transformers==4.48.0
+transformers==4.57.3
diff --git a/examples/windows/onnx_ptq/whisper/whisper_optimum_ort_inference.py b/examples/windows/onnx_ptq/whisper/whisper_optimum_ort_inference.py
index 8866a94fc..112441df4 100644
--- a/examples/windows/onnx_ptq/whisper/whisper_optimum_ort_inference.py
+++ b/examples/windows/onnx_ptq/whisper/whisper_optimum_ort_inference.py
@@ -64,6 +64,10 @@ def main(args):
         use_merged=USE_MERGED,
     )
 
+    # Workaround for transformers >= 4.53: disable the new dynamic cache format,
+    # which optimum 1.23 does not support. This forces the legacy tuple format.
+    model._supports_default_dynamic_cache = lambda: False
+
     # print(model.encoder)
     # print(model.decoder)
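Steps 3 and 4 of the README addition both amount to "copy every DLL from one folder into the torchcodec package folder". That manual procedure can be sketched as a small helper; this is an illustrative sketch only, not part of the patch, and `copy_dlls` plus the example paths are hypothetical names, not anything shipped by torchcodec.

```python
import shutil
from pathlib import Path


def copy_dlls(src_dir: str, dest_dir: str) -> list[str]:
    """Copy all *.dll files from src_dir into dest_dir; return the copied names.

    src_dir would be e.g. FFmpeg's extracted `bin` folder or torch's `lib`
    folder; dest_dir the installed torchcodec package folder.
    """
    src, dest = Path(src_dir), Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for dll in sorted(src.glob("*.dll")):  # only DLLs, ignore other files
        shutil.copy2(dll, dest / dll.name)  # copy2 preserves timestamps
        copied.append(dll.name)
    return copied
```

Running it once per source folder (FFmpeg `bin`, then torch `lib`) reproduces the two copy steps.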
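The one-line change in `whisper_optimum_ort_inference.py` relies on a standard Python pattern: assigning a lambda to an instance attribute shadows the class method of the same name, so any code that calls the method on that instance gets the overridden answer while the class itself is untouched. A minimal sketch with a stand-in class (`DummyModel` is hypothetical, not the real optimum model):

```python
class DummyModel:
    """Stand-in for the ORT model; not the optimum API."""

    def _supports_default_dynamic_cache(self) -> bool:
        return True  # pretend the new dynamic cache format is supported


model = DummyModel()
# Shadow the method on this instance only, as the patch does on the real
# model, so cache-format checks take the legacy tuple path.
model._supports_default_dynamic_cache = lambda: False
```

Other instances of the class keep the original behavior, which is why the patch can apply the override right after loading the model without side effects elsewhere.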