threshold.Silence crashes if the batch dimension of the input tensor is > 1. This seems to be because loudness is calculated by converting the tensors to numpy and using librosa. The line that causes the actual crash is line 37 of loudness.py, which tries to squeeze the 0th dimension.
Librosa doesn't support batches, but since calculating the loudness only involves an STFT, logs and means, it doesn't seem hard to change it to use the torch versions of these functions.
Since the rest of the package works fine (or at least seems to) with batched inputs, this might be a worthwhile change.
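To illustrate the idea, here is a rough sketch of what a batched, torch-only loudness computation could look like. It is not the code in loudness.py: the function name `batched_loudness`, the parameters `n_fft`, `hop_length` and `min_db`, and their default values are all assumptions made for this example, and any perceptual weighting the current implementation may apply is left out. It only shows that the STFT, log and mean steps work on a (batch, samples) tensor without going through numpy.

```python
import torch


def batched_loudness(audio, n_fft=1024, hop_length=160, min_db=-100.0):
    """Hypothetical batched loudness: (batch, samples) -> (batch, frames) in dB.

    All parameter names and defaults are assumptions for this sketch,
    not values taken from the package.
    """
    # Window on the same device and dtype as the input
    window = torch.hann_window(n_fft, device=audio.device, dtype=audio.dtype)

    # torch.stft accepts a 2D (batch, samples) tensor directly
    spec = torch.stft(
        audio,
        n_fft=n_fft,
        hop_length=hop_length,
        window=window,
        center=True,
        return_complex=True,
    )  # shape: (batch, n_fft // 2 + 1, frames)

    # Magnitude in decibels, clamped to avoid log(0)
    magnitude = spec.abs()
    db = 20.0 * torch.log10(magnitude.clamp(min=1e-10))
    db = db.clamp(min=min_db)

    # Average over the frequency axis to get one value per frame
    return db.mean(dim=1)


if __name__ == "__main__":
    audio = torch.randn(3, 16000)  # batch of 3 one-second clips at 16 kHz
    loudness = batched_loudness(audio)
    print(loudness.shape)
```

Because nothing here squeezes the batch dimension, the output keeps one row per input item, which is what threshold.Silence would need in order to handle batches.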