Commit 1cc7081

cp: check asr models (14989) into r2.5.0 (#15017)
* check asr models (#14989)

* check asr models
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

* Apply isort and black reformatting
Signed-off-by: nithinraok <nithinraok@users.noreply.github.com>

* update return
Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>

---------

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
Signed-off-by: nithinraok <nithinraok@users.noreply.github.com>
Co-authored-by: nithinraok <nithinraok@users.noreply.github.com>
Co-authored-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: NeMo Bot <nemo-bot@nvidia.com>

* Pass timeout when running speech functional tests (#15012)
Signed-off-by: Charlie Truong <chtruong@nvidia.com>

* Add timeout to speech test
Signed-off-by: Charlie Truong <chtruong@nvidia.com>

* Use cpu runner for asr cpu unit tests
Signed-off-by: Charlie Truong <chtruong@nvidia.com>

---------

Signed-off-by: nithinraok <nithinrao.koluguri@gmail.com>
Signed-off-by: nithinraok <nithinraok@users.noreply.github.com>
Signed-off-by: NeMo Bot <nemo-bot@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Co-authored-by: nithinraok <nithinraok@users.noreply.github.com>
1 parent 4bf0c84 commit 1cc7081

2 files changed: +20 -3 lines changed

.github/workflows/cicd-main-speech.yml

Lines changed: 3 additions & 1 deletion
@@ -39,7 +39,7 @@ jobs:
             runner: self-hosted-azure-gpus-1
             timeout: 30
           - script: L0_Unit_Tests_CPU_ASR
-            runner: self-hosted-azure-cpu
+            runner: azure-gpu-vm-runner1-cpu
             cpu-only: true
             timeout: 20
           - script: L0_Unit_Tests_GPU_TTS
@@ -180,6 +180,7 @@ jobs:
             script: SPEECHLM_HF_Training_DuplexS2SSpeechDecoder
           - runner: self-hosted-azure
             script: SPEECHLM_HF_Training_SALM
+            timeout: 20
     needs: [unit-tests]
     runs-on: ${{ matrix.runner }}
     name: ${{ matrix.is-optional && 'PLEASEFIXME_' || '' }}${{ matrix.script }}
@@ -195,4 +196,5 @@ jobs:
           script: ${{ matrix.script }}
           tests_to_run: ${{ inputs.test_to_run }}
           image: ${{ inputs.image-name }}
+          timeout: ${{ matrix.timeout || 10 }}
           is_optional: ${{ matrix.is-optional || false }}
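
The added `timeout: ${{ matrix.timeout || 10 }}` line relies on the `||` operator in GitHub Actions expressions, which yields its first truthy operand. A minimal Python sketch of that fallback follows, assuming a matrix entry can be modelled as a plain dict. `resolve_timeout` and the second script name are hypothetical and exist only for illustration; they are not part of the workflow.

```python
def resolve_timeout(matrix_entry, default=10):
    """Mimic `matrix.timeout || 10`: use the entry's timeout if truthy, else the default."""
    # Python's `or` also returns the first truthy operand, so a missing key
    # (None), an empty string, or 0 all fall through to the default.
    return matrix_entry.get("timeout") or default


entries = [
    {"script": "L0_Unit_Tests_CPU_ASR", "timeout": 20},
    {"script": "Hypothetical_Job_Without_Timeout"},  # hypothetical entry, no timeout key
]

resolved = {entry["script"]: resolve_timeout(entry) for entry in entries}
print(resolved)
```

One consequence of this pattern is that an explicit `timeout: 0` would also resolve to the default of 10, because 0 is falsy in both Python and GitHub Actions expressions.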

nemo/core/connectors/save_restore_connector.py

Lines changed: 17 additions & 2 deletions
@@ -754,8 +754,23 @@ def _save_state_dict_to_disk(state_dict, filepath):
         torch.save(state_dict, filepath)
 
     @staticmethod
-    def _load_state_dict_from_disk(model_weights, map_location=None):
-        return torch.load(model_weights, map_location='cpu', weights_only=False)
+    def _load_state_dict_from_disk(model_weights, map_location='cpu'):
+        """
+        Load model state dict from disk.
+
+        Args:
+            model_weights: Path to the checkpoint file
+            map_location: Device to map tensors to
+
+        Returns:
+            State dict loaded from checkpoint
+
+        """
+        try:
+            return torch.load(model_weights, map_location=map_location, weights_only=True)
+        except Exception as e:
+            logging.error(f"Failed to load checkpoint with weights_only=True: {e}")
+            raise e
 
     @property
     def model_config_yaml(self) -> str:
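
The switch from `weights_only=False` to `weights_only=True` restricts `torch.load` to tensors, primitive types, and a small allowlist of container types. The sketch below illustrates the practical effect under some assumptions: `load_state_dict` is a hypothetical stand-in for the patched method, the file names are made up, and `argparse.Namespace` is used as an example of an object that is not expected to be on the default allowlist. It is not NeMo code.

```python
import argparse
import os
import tempfile

import torch


def load_state_dict(path, map_location="cpu"):
    """Hypothetical stand-in for the patched loader: tensors and plain containers only."""
    return torch.load(path, map_location=map_location, weights_only=True)


with tempfile.TemporaryDirectory() as tmp:
    good_path = os.path.join(tmp, "weights.ckpt")
    bad_path = os.path.join(tmp, "with_namespace.ckpt")

    # A plain state dict of tensors should load under weights_only=True.
    torch.save({"linear.weight": torch.ones(2, 3)}, good_path)

    # A checkpoint carrying an arbitrary Python object should be rejected.
    torch.save({"hparams": argparse.Namespace(lr=0.1)}, bad_path)

    state_dict = load_state_dict(good_path)

    try:
        load_state_dict(bad_path)
        rejected = False
    except Exception as err:  # torch raises an unpickling error here
        rejected = True
        print(f"Rejected as expected: {type(err).__name__}")

print(list(state_dict), rejected)
```

Checkpoints that embed non-tensor objects would therefore fail to load after this change, which is presumably why the patch logs the failure before re-raising instead of silently falling back to `weights_only=False`.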
