An ONNX-exportable Mel Filterbank implementation in PyTorch, meticulously aligned with Kaldi's feature extraction behavior.
This project was developed with the assistance of Claude Opus 4.5 and Antigravity.
- Kaldi Consistency: Verified against `kaldifeat` and `kaldi-native-fbank` for numerical accuracy.
- ONNX Ready: Implements an ONNX-compatible signal processing chain (DFT-based FFT and gather-based framing) for seamless model export and deployment.
- Pure PyTorch: No custom C++ extensions required, ensuring high portability across platforms.
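To make "gather-based framing" and "DFT-based FFT" concrete, here is a minimal sketch of both ideas in plain PyTorch. It is not the code in `kaldi_filter_bank/`: the function names `frame_signal` and `dft_power_spectrum` are hypothetical, and the 400-sample window, 160-sample shift, and 512-point transform are assumed values chosen to match common Kaldi settings at 16 kHz. The sketch also leaves out dithering, DC removal, pre-emphasis, and windowing, which Kaldi applies before the transform.

```python
import math
import torch


def frame_signal(waveform: torch.Tensor, frame_length: int, frame_shift: int) -> torch.Tensor:
    """Slice [batch, samples] into [batch, frames, frame_length] with an index table.

    Assumes num_samples >= frame_length (no padding, as with snip_edges=True).
    """
    num_samples = waveform.shape[1]
    num_frames = 1 + (num_samples - frame_length) // frame_shift
    starts = torch.arange(num_frames).unsqueeze(1) * frame_shift  # [frames, 1]
    offsets = torch.arange(frame_length).unsqueeze(0)             # [1, frame_length]
    index = starts + offsets                                      # [frames, frame_length]
    # Indexing with an integer tensor gathers samples instead of using strided views.
    return waveform[:, index]


def dft_power_spectrum(frames: torch.Tensor, n_fft: int) -> torch.Tensor:
    """Power spectrum via two real matrix products instead of an FFT operator.

    Using only frame_length rows of the DFT basis is equivalent to zero-padding
    each frame to n_fft samples.
    """
    frame_length = frames.shape[-1]
    n = torch.arange(frame_length, dtype=torch.float64)
    k = torch.arange(n_fft // 2 + 1, dtype=torch.float64)
    angle = 2.0 * math.pi * torch.outer(n, k) / n_fft             # [frame_length, bins]
    cos_basis = torch.cos(angle).to(frames.dtype)
    sin_basis = torch.sin(angle).to(frames.dtype)
    real = frames @ cos_basis
    imag = frames @ sin_basis  # sign is irrelevant once squared
    return real ** 2 + imag ** 2


# Assumed settings: 25 ms window and 10 ms shift at 16 kHz, 512-point DFT.
waveform = torch.randn(1, 16000)
frames = frame_signal(waveform, frame_length=400, frame_shift=160)
power = dft_power_spectrum(frames, n_fft=512)
print(power.shape)  # torch.Size([1, 98, 257])
```

The point of this design is that indexing and matrix multiplication map to widely supported ONNX operators, so the exported graph does not depend on an FFT operator being available in the target runtime.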
For standard feature extraction in PyTorch:

    import torch
    from kaldi_filter_bank.filter_bank import Filterbank

    fbank = Filterbank()
    waveform = torch.randn(1, 16000)  # [batch, samples]
    features = fbank(waveform)  # [batch, frames, mel_bins]

To include the filterbank as a front-end layer in your ASR model for ONNX export:

    import torch
    from kaldi_filter_bank.filter_bank import Filterbank

    class AsrModel(torch.nn.Module):
        def __init__(self, model):
            super().__init__()
            # Filterbank detects the ONNX export environment automatically
            self.fbank = Filterbank()
            self.model = model

        def forward(self, waveforms):
            # waveforms shape: [batch, time]
            features = self.fbank(waveforms)
            logits = self.model(features)
            return logits

See `tests/test-fbank-onnx-export.py` for a full export example.
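The `frames` dimension in the shape comments above depends on the window length and shift. As a rough guide, the following sketch computes the frame count under Kaldi's usual defaults (25 ms window, 10 ms shift, `snip_edges=True`). Whether `Filterbank()` uses exactly these defaults is an assumption, and `num_frames_snip_edges` is a hypothetical helper, not part of the package.

```python
def num_frames_snip_edges(
    num_samples: int,
    sample_rate: int = 16000,
    frame_length_ms: float = 25.0,
    frame_shift_ms: float = 10.0,
) -> int:
    """Frame count when only windows that fit entirely in the signal are kept."""
    frame_length = int(sample_rate * frame_length_ms / 1000)  # 400 samples at 16 kHz
    frame_shift = int(sample_rate * frame_shift_ms / 1000)    # 160 samples at 16 kHz
    if num_samples < frame_length:
        return 0  # signal too short for a single window
    return 1 + (num_samples - frame_length) // frame_shift


print(num_frames_snip_edges(16000))  # 98  (1 second of audio)
print(num_frames_snip_edges(48000))  # 298 (3 seconds of audio)
```

Under these assumptions, the one-second waveform in the first example would yield features of shape `[1, 98, mel_bins]`.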
You can verify the numerical consistency with Kaldi using the provided test scripts:

    python tests/test-fbank-diff-knf.py

Example Output:

    [6] Comparison test with kaldi-native-fbank
    Our features shape: torch.Size([1, 298, 80])
    knf features shape: torch.Size([298, 80])
    All frames - Max diff: 0.000185, Mean diff: 0.000003
    ✓ kaldi-native-fbank comparison passed! (Max diff < 0.001)
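The output above reports a maximum and a mean absolute difference, and compares a batched tensor against an unbatched reference. A check of this kind could be sketched as follows. The helper `compare_features` is hypothetical and is not taken from the test script. The `1e-3` default mirrors the threshold shown in the output, and the inputs here are synthetic stand-ins rather than real features.

```python
import torch


def compare_features(ours: torch.Tensor, reference: torch.Tensor, tol: float = 1e-3):
    """Return (max_abs_diff, mean_abs_diff, passed) for two feature matrices."""
    if ours.dim() == 3:
        ours = ours.squeeze(0)  # [1, frames, bins] -> [frames, bins]
    if ours.shape != reference.shape:
        raise ValueError(f"shape mismatch: {tuple(ours.shape)} vs {tuple(reference.shape)}")
    diff = (ours - reference).abs()
    max_diff = diff.max().item()
    mean_diff = diff.mean().item()
    return max_diff, mean_diff, max_diff < tol


# Synthetic stand-ins for the two feature matrices (not real fbank output).
reference = torch.randn(298, 80)
ours = (reference + 1e-4 * torch.randn(298, 80)).unsqueeze(0)
max_diff, mean_diff, passed = compare_features(ours, reference)
print(f"Max diff: {max_diff:.6f}, Mean diff: {mean_diff:.6f}, passed: {passed}")
```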
- `kaldi_filter_bank/`: Core implementation of the `Filterbank` module.
- `tests/`: Comprehensive test suite for validation and comparison.
  - `test-fbank-diff-kaldifeat.py`: Numerical alignment with `kaldifeat`.
  - `test-fbank-diff-knf.py`: Numerical alignment with `kaldi-native-fbank`.
  - `test-fbank-diff-torchaudio.py`: Comparison with `torchaudio`.
  - `test-fbank-onnx-export.py`: ONNX export and verification script.
This implementation draws inspiration and verification from: