Commit ac455ba

NeMo RTFx updates (#33)
* Remove Common Voice from evaluation, as discussed.
* Pin NeMo to a specific commit so that results are reproducible. In particular, the pinned commit includes NVIDIA-NeMo/NeMo#10054.
* Include the optional dependency cuda-python so that RNN-T and TDT models use CUDA-graph-accelerated decoder inference.
1 parent 61e9ffc commit ac455ba

6 files changed: 7 additions and 31 deletions

README.md

Lines changed: 3 additions & 2 deletions
@@ -9,8 +9,9 @@ Each library has its own set of requirements. We recommend using a clean conda e
 1) Clone this repository.
 2) Install PyTorch by following the instructions here: https://pytorch.org/get-started/locally/
 3) Install the common requirements for all library by running `pip install -r requirements/requirements.txt`.
-4) Install the requirements for each library you wish to evalaute by running `pip install -r requirements/requirements_<library_name>.txt`.
-5) Connect your Hugging Face account by running `huggingface-cli login`.
+4) If you wish to run NeMo, note that the benchmark currently needs CUDA 12.6 (`nvidia-smi` should output "CUDA Version: 12.6" or higher), to fix a problem in previous drivers for RNN-T inference with cooperative kernels inside of conditional nodes (see here: https://github.com/NVIDIA/NeMo/pull/9869)
+5) Install the requirements for each library you wish to evalaute by running `pip install -r requirements/requirements_<library_name>.txt`.
+6) Connect your Hugging Face account by running `huggingface-cli login`.
 
 # Evaluate a model
 
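The new step 4 asks the reader to confirm that `nvidia-smi` reports "CUDA Version: 12.6" or higher. That check could be scripted roughly as follows. This is a minimal sketch, not part of the commit: the helper names `parse_cuda_version` and `driver_supports` are hypothetical, and it assumes the `nvidia-smi` banner contains the text `CUDA Version: X.Y` as quoted in the README.

```python
import re
import subprocess


def parse_cuda_version(smi_output):
    """Return (major, minor) parsed from nvidia-smi banner text, or None if absent."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def driver_supports(smi_output, minimum=(12, 6)):
    """True if the reported CUDA version is at least `minimum` (12.6 per the README)."""
    version = parse_cuda_version(smi_output)
    return version is not None and version >= minimum


if __name__ == "__main__":
    try:
        # Plain `nvidia-smi` prints a banner that includes the CUDA version.
        banner = subprocess.run(
            ["nvidia-smi"], capture_output=True, text=True, check=True
        ).stdout
    except (FileNotFoundError, subprocess.CalledProcessError):
        banner = ""  # no driver or no GPU visible; treated as "not supported"
    print("CUDA >= 12.6:", driver_supports(banner))
```

Comparing `(major, minor)` tuples avoids the string-comparison pitfall where "12.10" would sort before "12.6".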

nemo_asr/run_canary.sh

Lines changed: 0 additions & 9 deletions
@@ -85,15 +85,6 @@ do
         --batch_size=${BATCH_SIZE} \
         --max_eval_samples=-1
 
-    python run_eval.py \
-        --model_id=${MODEL_ID} \
-        --dataset_path="hf-audio/esb-datasets-test-only-sorted" \
-        --dataset="common_voice" \
-        --split="test" \
-        --device=${DEVICE_ID} \
-        --batch_size=${BATCH_SIZE} \
-        --max_eval_samples=-1
-
     # Evaluate results
     RUNDIR=`pwd` && \
     cd ../normalizer && \

nemo_asr/run_eval.py

Lines changed: 1 addition & 1 deletion
@@ -101,7 +101,7 @@ def download_audio_files(batch):
     total_time = 0
     for _ in range(2): # warmup once and calculate rtf
         if _ == 0:
-            audio_files = all_data["audio_filepaths"][:256] # warmup with 4 batches
+            audio_files = all_data["audio_filepaths"][:args.batch_size * 4] # warmup with 4 batches
         else:
             audio_files = all_data["audio_filepaths"]
         start_time = time.time()
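The hunk above makes the warmup slice scale with the batch size, so that it covers four batches whatever `--batch_size` is, rather than a fixed 256 files. The warmup-then-measure pattern can be sketched in isolation as below. This is an illustrative sketch under assumptions, not the code in `run_eval.py`: `transcribe_fn` is a hypothetical stand-in for the model's transcription call, `timed_pass` and `rtfx` are made-up names, and `time.perf_counter()` is used here where the script uses `time.time()`. RTFx is taken to mean seconds of audio processed per second of wall-clock time.

```python
import time


def timed_pass(transcribe_fn, audio_files, batch_size, warmup_batches=4):
    """Warm up on a few batches, then time one pass over the full file list."""
    # Warmup pass: results are discarded; it only absorbs one-time costs
    # (such as lazy initialization) so they are not counted in the timing.
    transcribe_fn(audio_files[: batch_size * warmup_batches], batch_size)

    # Measured pass over every file.
    start = time.perf_counter()
    transcriptions = transcribe_fn(audio_files, batch_size)
    elapsed = time.perf_counter() - start
    return transcriptions, elapsed


def rtfx(total_audio_seconds, elapsed_seconds):
    """Inverse real-time factor: audio duration divided by processing time."""
    return total_audio_seconds / elapsed_seconds
```

With `batch_size=2` and ten files, for example, the warmup call sees eight files and the measured call sees all ten.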

nemo_asr/run_fast_conformer_ctc.sh

Lines changed: 0 additions & 9 deletions
@@ -86,15 +86,6 @@ do
         --batch_size=${BATCH_SIZE} \
         --max_eval_samples=-1
 
-    python run_eval.py \
-        --model_id=${MODEL_ID} \
-        --dataset_path="hf-audio/esb-datasets-test-only-sorted" \
-        --dataset="common_voice" \
-        --split="test" \
-        --device=${DEVICE_ID} \
-        --batch_size=${BATCH_SIZE} \
-        --max_eval_samples=-1
-
     # Evaluate results
     RUNDIR=`pwd` && \
     cd ../normalizer && \

nemo_asr/run_fast_conformer_rnnt.sh

Lines changed: 0 additions & 9 deletions
@@ -86,15 +86,6 @@ do
         --batch_size=${BATCH_SIZE} \
         --max_eval_samples=-1
 
-    python run_eval.py \
-        --model_id=${MODEL_ID} \
-        --dataset_path="hf-audio/esb-datasets-test-only-sorted" \
-        --dataset="common_voice" \
-        --split="test" \
-        --device=${DEVICE_ID} \
-        --batch_size=${BATCH_SIZE} \
-        --max_eval_samples=-1
-
     # Evaluate results
     RUNDIR=`pwd` && \
     cd ../normalizer && \

requirements/requirements_nemo.txt

Lines changed: 3 additions & 1 deletion
@@ -1,4 +1,6 @@
-git+https://github.com/NVIDIA/NeMo.git@r2.0.0rc1#egg=nemo_toolkit[all]
+git+https://github.com/NVIDIA/NeMo.git@d0efff087613ea2584e215969f289fed17414d8b#egg=nemo_toolkit[all] # This commit hash is a recent version of main at the time of testing.
 tqdm
 soundfile
 librosa
+IPython # Workaround for https://github.com/NVIDIA/NeMo/pull/9890#discussion_r1701028427
+cuda-python>=12.4 # Used for fast TDT and RNN-T inference
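Because `cuda-python` is an optional dependency, it may be worth confirming before a benchmark run that it is installed and meets the `>=12.4` pin. The sketch below does this with the standard library only. The helper names are hypothetical, and the version parsing is deliberately simple (numeric dot-separated components only) rather than a full implementation of Python's version rules.

```python
from importlib import metadata


def version_tuple(version):
    """Turn '12.6.0' into (12, 6, 0), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(version, minimum=(12, 4)):
    """True if `version` is at least the pinned minimum (12.4 in the requirements file)."""
    return version_tuple(version) >= minimum


def cuda_python_status():
    """Describe whether the cuda-python distribution is installed and new enough."""
    try:
        installed = metadata.version("cuda-python")
    except metadata.PackageNotFoundError:
        return "cuda-python is not installed"
    verdict = "ok" if meets_minimum(installed) else "older than 12.4"
    return f"cuda-python {installed} ({verdict})"


if __name__ == "__main__":
    print(cuda_python_status())
```

Running this in the environment used for the NeMo scripts gives a quick indication of whether the CUDA-graph decoding path mentioned in the commit message has its dependency available.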
