Spectral features are widely used in machine learning and deep learning research. This repository implements basic spectral features for voice/speech analysis.
- SpectralEntropy
- SpectralCentroid
- SpectralSpread
- SpectralSkewness
- SpectralKurtosis
- SpectralRolloffPoint
- SpectralCrest
- (WIP)
- SpectralFlux
- SpectralSlope
- SpectralFlatness
- (WIP)

Ma, Y., Nishihara, A. Efficient voice activity detection algorithm using long-term spectral flatness measure. J AUDIO SPEECH MUSIC PROC. 2013, 87 (2013).
The code closely follows the MATLAB tutorial and is implemented with PyTorch and some functions from SpeechBrain.
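To give a feel for what a few of these features measure, here is a minimal, dependency-free sketch that computes the spectral centroid, spread, and entropy of a single frame. It is not this repository's implementation (which uses PyTorch and SpeechBrain). The names `power_spectrum` and `spectral_stats` are hypothetical, and using the one-sided power spectrum as weights and normalizing the entropy by the log of the bin count are assumptions about one common convention.

```python
import cmath
import math


def power_spectrum(frame, sample_rate):
    """Return (freqs_hz, power) for the one-sided DFT of a single frame."""
    n = len(frame)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        # Naive O(n^2) DFT, kept for clarity; a real pipeline would use an FFT.
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(frame))
        freqs.append(k * sample_rate / n)
        power.append(abs(coeff) ** 2)
    return freqs, power


def spectral_stats(frame, sample_rate):
    """Return (centroid_hz, spread_hz, normalized_entropy) for one frame."""
    freqs, power = power_spectrum(frame, sample_rate)
    total = sum(power)
    if total == 0:
        raise ValueError("frame is silent; the spectral statistics are undefined")
    # Treat the normalized power spectrum as a probability distribution.
    probs = [p / total for p in power]
    # Centroid: the power-weighted mean frequency.
    centroid = sum(f * p for f, p in zip(freqs, probs))
    # Spread: the power-weighted standard deviation around the centroid.
    spread = math.sqrt(sum((f - centroid) ** 2 * p for f, p in zip(freqs, probs)))
    # Entropy: Shannon entropy of the distribution, scaled to [0, 1].
    entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(len(probs))
    return centroid, spread, entropy


if __name__ == "__main__":
    fs = 16000
    # A 1 kHz tone that falls exactly on a DFT bin for a 64-sample frame.
    tone = [math.sin(2 * math.pi * 1000 * i / fs) for i in range(64)]
    print(spectral_stats(tone, fs))
```

For the pure tone, the centroid should land near 1000 Hz with spread and entropy close to zero, while a flat spectrum (for example a single impulse) pushes the entropy toward 1.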
git clone https://github.com/BrownsugarZeer/SpectralFeatures.git
python -m venv venv
venv\Scripts\activate.bat
pip install -r requirements.txt
My waveform file is located at <path_to_matlab>\MATLAB\R2019b\toolbox\audio\samples\Counting-16-44p1-mono-15secs.wav and has been downsampled from 44100 Hz to 16000 Hz.
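As a quick sanity check on the downsampling step, the sketch below works out the rational resampling factors and the expected output length. The helper name `resample_plan` is hypothetical, rounding the length up is an assumption about one common convention, and the 15-second duration is inferred from the file name rather than measured.

```python
import math
from fractions import Fraction


def resample_plan(orig_rate, new_rate, num_samples):
    """Return (up, down, new_num_samples) for a rational-rate resampler."""
    # Fraction reduces the ratio automatically, e.g. 16000/44100 -> 160/441.
    ratio = Fraction(new_rate, orig_rate)
    up, down = ratio.numerator, ratio.denominator
    # Output length, rounded up (an assumed convention).
    new_num_samples = math.ceil(num_samples * ratio)
    return up, down, new_num_samples


if __name__ == "__main__":
    # Assumes the clip is exactly 15 seconds long at 44100 Hz.
    print(resample_plan(44100, 16000, 15 * 44100))
```

In other words, going from 44100 Hz to 16000 Hz corresponds to upsampling by 160 and downsampling by 441, and a 15-second clip ends up with 240000 samples.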
- Spectral Entropy
import torchaudio

x, fs = torchaudio.load("Counting-16-44p1-mono-15secs_16000.wav")
compute_feat = SpectralEntropy(sample_rate=fs)
spectr_feat = compute_feat(x)
plot_feature(x, fs, spectr_feat)

- Spectral Centroid
x, fs = torchaudio.load("Counting-16-44p1-mono-15secs_16000.wav")
compute_feat = SpectralCentroid(sample_rate=fs)
spectr_feat = compute_feat(x)
plot_feature(x, fs, spectr_feat)

- Spectral Spread
x, fs = torchaudio.load("Counting-16-44p1-mono-15secs_16000.wav")
compute_feat = SpectralSpread(sample_rate=fs)
spectr_feat = compute_feat(x)
plot_feature(x, fs, spectr_feat)

- Spectral Skewness
x, fs = torchaudio.load("Counting-16-44p1-mono-15secs_16000.wav")
compute_feat = SpectralSkewness(sample_rate=fs)
spectr_feat = compute_feat(x)
plot_feature(x, fs, spectr_feat)

- Spectral Kurtosis
x, fs = torchaudio.load("Counting-16-44p1-mono-15secs_16000.wav")
compute_feat = SpectralKurtosis(sample_rate=fs)
spectr_feat = compute_feat(x)
plot_feature(x, fs, spectr_feat)

- Spectral Rolloff Point
x, fs = torchaudio.load("Counting-16-44p1-mono-15secs_16000.wav")
compute_feat = SpectralRolloffPoint(sample_rate=fs)
spectr_feat = compute_feat(x)
plot_feature(x, fs, spectr_feat)

- Spectral Crest
x, fs = torchaudio.load("Counting-16-44p1-mono-15secs_16000.wav")
compute_feat = SpectralCrest(sample_rate=fs)
spectr_feat = compute_feat(x)
plot_feature(x, fs, spectr_feat)

- Spectral Flux
x, fs = torchaudio.load("Counting-16-44p1-mono-15secs_16000.wav")
compute_feat = SpectralFlux(sample_rate=fs)
spectr_feat = compute_feat(x)
plot_feature(x, fs, spectr_feat)

- Spectral Slope
x, fs = torchaudio.load("Counting-16-44p1-mono-15secs_16000.wav")
compute_feat = SpectralSlope(sample_rate=fs)
spectr_feat = compute_feat(x)
plot_feature(x, fs, spectr_feat)

- Spectral Flatness
x, fs = torchaudio.load("Counting-16-44p1-mono-15secs_16000.wav")
compute_feat = SpectralFlatness(sample_rate=fs)
spectr_feat = compute_feat(x)
plot_feature(x, fs, spectr_feat)

- Long-Term Spectral Flatness
x, fs = torchaudio.load("Counting-16-44p1-mono-15secs_16000.wav")
compute_feat = LongTermSpectralFlatness(sample_rate=fs)
spectr_feat = compute_feat(x)
plot_feature(x, fs, spectr_feat)

The code is written for readability, which sometimes leads to lower performance.










