Environment Information
NeMo Version: 2.5.3
Installation Method: pip install nemo_toolkit['all']
Platform: Kaggle
GPU/Accelerator: GPU (single device)
Precision: 16-bit
Model Information
Pretrained Model: nvidia/stt_ar_fastconformer_hybrid_large_pcd_v1.0
Model Type: EncDecHybridRNNTCTCModel
Training Dataset Size: 25 samples
Validation Dataset Size: 10 samples
Problem Description
After fine-tuning the Arabic FastConformer model, inference produces only <unk> tokens instead of actual transcriptions. Fine-tuning appears to complete successfully, but the resulting checkpoint does not generate meaningful output.
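One check that may help narrow this down, offered as a sketch and not as a confirmed diagnosis: compare the characters in the training transcripts against the characters that appear in the tokenizer's .vocab file. The example assumes the SentencePiece .vocab layout (one piece per line, followed by a tab and a score). The function names `load_vocab_chars` and `uncovered_chars` are hypothetical, and the vocab lines are invented for illustration. Treating a character as covered whenever it appears inside any piece is only an approximation of what the tokenizer can actually encode.

```python
import io

WORD_BOUNDARY = "\u2581"  # SentencePiece marks word starts with this symbol


def load_vocab_chars(vocab_file):
    """Collect every character that occurs in any non-special vocab piece."""
    chars = set()
    for line in vocab_file:
        piece = line.rstrip("\n").split("\t")[0]
        if piece.startswith("<") and piece.endswith(">"):
            continue  # assumption: pieces like <unk> are special tokens
        chars.update(piece.replace(WORD_BOUNDARY, ""))
    return chars


def uncovered_chars(texts, vocab_chars):
    """Return the sorted non-space characters in texts that no piece contains."""
    seen = set()
    for text in texts:
        seen.update(ch for ch in text if not ch.isspace())
    return sorted(seen - vocab_chars)


# Toy vocab and transcripts, invented for illustration only.
toy_vocab = io.StringIO("<unk>\t0\n\u2581ab\t-1.0\nc\t-2.0\n")
vocab_chars = load_vocab_chars(toy_vocab)
missing = uncovered_chars(["abc", "ab d!"], vocab_chars)
print(missing)
```

With a real setup, the .vocab file would be opened with `encoding="utf-8"` and the texts would come from the `text` field of each manifest entry. An empty result would suggest the cause lies elsewhere.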
Fine-tuning Code
Inference Code
Output
The model produces only <unk> tokens.
Observations
Fine-tuning appears to complete without errors
Validation WER is very high (135.0000)
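For context on the WER observation: word error rate is the word-level edit distance between hypothesis and reference, divided by the number of reference words, so it is not capped at 1.0 (or 100%) and grows when the hypothesis contains many inserted words. The post does not say whether 135.0000 is a ratio or a percentage. The following is a minimal stdlib sketch of the metric, not NeMo's own implementation; `word_error_rate` is a hypothetical helper name.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# Two substitutions plus one insertion against a two-word reference.
print(word_error_rate("a b", "x y z"))
```

A hypothesis that is longer than the reference and shares no words with it therefore scores above 1.0.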
Additional Context
Manifest entries contain audio_filepath, text, and duration fields.
Tokenizer .model and .vocab files are taken from the pretrained model.
Sample Train Manifest
{"audio_filepath": "/kaggle/input/nemo-ar-dataset-small/train_0.wav", "duration": 3.06, "text": "فِيهِنَّ خَيْرَاتٌ حِسَانٌ"}
{"audio_filepath": "/kaggle/input/nemo-ar-dataset-small/train_1.wav", "duration": 2.16, "text": "وَإِذَا النُّفوسُ زُوِّجَت"}
{"audio_filepath": "/kaggle/input/nemo-ar-dataset-small/train_2.wav", "duration": 4.176, "text": "مِن نُّطْفَةٍ خَلَقَهُ فَقَدَّرَهُ"}
{"audio_filepath": "/kaggle/input/nemo-ar-dataset-small/train_3.wav", "duration": 2.088, "text": "إستخدم رأسك!"}
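A manifest in this one-JSON-object-per-line layout can be sanity-checked before training. The sketch below uses only the standard library and the three field names shown above. `parse_manifest` is a hypothetical helper, and the third sample line is deliberately broken for illustration; it is not from the actual dataset.

```python
import json

REQUIRED_FIELDS = ("audio_filepath", "duration", "text")


def parse_manifest(lines):
    """Return (valid_entries, problems) for an iterable of manifest lines."""
    entries, problems = [], []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        try:
            entry = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append(f"line {number}: invalid JSON ({exc.msg})")
            continue
        if not isinstance(entry, dict):
            problems.append(f"line {number}: expected a JSON object")
            continue
        missing = [name for name in REQUIRED_FIELDS if name not in entry]
        if missing:
            problems.append(f"line {number}: missing {', '.join(missing)}")
            continue
        duration = entry["duration"]
        if not isinstance(duration, (int, float)) or duration <= 0:
            problems.append(f"line {number}: duration must be a positive number")
            continue
        if not str(entry["text"]).strip():
            problems.append(f"line {number}: empty text")
            continue
        entries.append(entry)
    return entries, problems


sample_lines = [
    '{"audio_filepath": "/kaggle/input/nemo-ar-dataset-small/train_1.wav", "duration": 2.16, "text": "وَإِذَا النُّفوسُ زُوِّجَت"}',
    '{"audio_filepath": "/kaggle/input/nemo-ar-dataset-small/train_3.wav", "duration": 2.088, "text": "إستخدم رأسك!"}',
    '{"audio_filepath": "hypothetical_broken.wav", "duration": 1.0}',  # invented bad entry
]
entries, problems = parse_manifest(sample_lines)
print(len(entries), problems)
```

With a real manifest file, the lines would come from `open(path, encoding="utf-8")`. Note that the sample entries mix fully diacritized text with undiacritized text and punctuation, so it may also be worth inspecting the `text` values of the valid entries.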