@SAMortazavi
PR Description

In the current implementation, .opus files are missing from the list of supported audio formats, even though soundfile (sf) can load them successfully. In addition, using pydub for segment extraction is inefficient for long audio files, because pydub decodes the entire file into memory before the requested segment is clipped.

This PR addresses both issues: it adds .opus to the supported formats and introduces librosa for audio loading to improve efficiency. If both librosa and soundfile fail to load a file, the loader falls back to pydub.

Relevant Code

View on GitHub

Summary of Changes

  • Added .opus to supported audio formats.
  • Added librosa for efficient audio loading to improve performance on long audio files.
  • Implemented a fallback mechanism: if librosa and soundfile fail, the loader falls back to pydub.
  • Updated segment extraction logic to handle librosa-loaded data efficiently.
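
For illustration, the fallback order described above could look roughly like the sketch below. This is not the code in this PR: the function names (load_with_fallback, load_with_librosa, load_with_soundfile, load_with_pydub) and the (path, offset, duration) loader signature are hypothetical, and the third-party imports are done lazily so that a missing package simply moves on to the next loader. Note that channel handling differs between the backends (librosa downmixes to mono by default), which a real implementation would need to reconcile.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical loader signature: (path, offset_seconds, duration_seconds_or_None)
# -> (samples, sample_rate). None of these names come from the PR itself.
Loader = Callable[[str, float, Optional[float]], Tuple[object, int]]


def load_with_librosa(path: str, offset: float = 0.0, duration: Optional[float] = None):
    import librosa  # lazy import: a missing package just triggers the fallback

    # sr=None keeps the file's native sample rate; offset/duration are in seconds,
    # so only the requested segment is returned.
    return librosa.load(path, sr=None, offset=offset, duration=duration)


def load_with_soundfile(path: str, offset: float = 0.0, duration: Optional[float] = None):
    import soundfile as sf

    with sf.SoundFile(path) as f:
        sr = f.samplerate
        f.seek(int(offset * sr))  # jump to the segment start, in frames
        frames = -1 if duration is None else int(duration * sr)
        return f.read(frames=frames, dtype="float32"), sr


def load_with_pydub(path: str, offset: float = 0.0, duration: Optional[float] = None):
    import numpy as np
    from pydub import AudioSegment

    audio = AudioSegment.from_file(path)  # decodes the whole file before slicing
    start_ms = int(offset * 1000)
    end_ms = None if duration is None else start_ms + int(duration * 1000)
    clip = audio[start_ms:end_ms]  # pydub slices in milliseconds
    samples = np.array(clip.get_array_of_samples(), dtype=np.float32)
    # Scale integer PCM to roughly [-1, 1) and de-interleave channels.
    samples /= float(1 << (8 * clip.sample_width - 1))
    return samples.reshape(-1, clip.channels), clip.frame_rate


DEFAULT_LOADERS: Sequence[Loader] = (load_with_librosa, load_with_soundfile, load_with_pydub)


def load_with_fallback(
    path: str,
    offset: float = 0.0,
    duration: Optional[float] = None,
    loaders: Sequence[Loader] = DEFAULT_LOADERS,
):
    """Try each loader in order and return the first successful result."""
    errors = []
    for loader in loaders:
        try:
            return loader(path, offset, duration)
        except Exception as exc:  # includes ImportError for missing backends
            errors.append(f"{loader.__name__}: {exc!r}")
    raise RuntimeError(f"All loaders failed for {path}: " + "; ".join(errors))
```

With this shape, only the pydub path pays the cost of decoding the full file, and it is reached only when the two segment-aware loaders have already failed.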

perf(audio): use librosa for faster and more efficient audio loading
Added librosa as an alternative audio loading method to improve performance and reduce load times compared to the current implementation.


Signed-off-by: S Abolfazl Mortazavi <[email protected]>
@SAMortazavi changed the title from "feat(audio): integrate librosa for faster and more efficient audio loading" to "integrate librosa for faster and more efficient audio loading" Oct 25, 2025
@github-actions github-actions bot added the ASR label Oct 25, 2025
@nithinraok nithinraok requested a review from pzelasko December 15, 2025 18:20
@pzelasko
Collaborator

pzelasko commented Jan 7, 2026

You can load every format by adding train_ds.use_lhotse=True, which defers under the hood to libsoundfile or torchaudio if installed.

We don't really maintain the non-lhotse dataloader anymore and this code may be deprecated and removed in future NeMo versions. If you still wish to proceed with the PR though, we'd need unit test coverage for these new features.
