Hi Yuan
We have another question: how long were the audio files you used? The paper states that they are 10 seconds long, but a 10-second clip yields a spectrogram of 998 frames from torchaudio.compliance.kaldi.fbank (with frame_shift set to 10 ms and frame_length set to 25 ms), so the dataloader zero-pads the remaining 26 frames when target_length is set to 1024.
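For reference, here is a minimal sketch of the frame-count arithmetic behind those numbers. It assumes a 16 kHz sample rate (not stated above) and Kaldi-style framing with snip_edges=True, which drops any trailing partial frame. The helper name kaldi_num_frames is hypothetical and only mirrors the formula; it does not call torchaudio.

```python
def kaldi_num_frames(num_samples, sample_rate=16000,
                     frame_length_ms=25.0, frame_shift_ms=10.0):
    """Frame count for Kaldi-style framing with snip_edges=True.

    Hypothetical helper for illustration; the 16 kHz default is an assumption.
    """
    window_size = int(sample_rate * frame_length_ms / 1000)   # samples per frame
    window_shift = int(sample_rate * frame_shift_ms / 1000)   # samples per hop
    if num_samples < window_size:
        return 0
    # Only frames that fit entirely inside the signal are kept.
    return 1 + (num_samples - window_size) // window_shift


sample_rate = 16000                      # assumed sample rate
target_length = 1024                     # value from the dataloader setting above
num_frames = kaldi_num_frames(10 * sample_rate, sample_rate)
padding = target_length - num_frames     # frames the dataloader would zero-pad
print(num_frames, padding)
```

Under the same assumptions, reaching 1024 frames without padding would take at least 400 + 1023 * 160 = 164,080 samples, which is roughly 10.26 seconds of audio.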
Best Regards,
Fabian
