This code contains routines to extract acoustic features from the audio files inside a given directory. I wrote the routines in R, using well-known packages, and created a wrapper to invoke them from Python. It works with the mp3, wav, and flac file formats. Versions of this R code have also been used in some of my research on natural sound patterns, such as [1], [2], and [3].
Available features and their references are grouped by their respective R packages.
- Acoustic Complexity Index (ACI) [ref]
- Acoustic Diversity Index (ADI) [ref]
- Acoustic Evenness Index (AEI) [ref]
- Bioacoustic Index (BIO) [ref]
- Normalized Difference Soundscape Index (NDSI) [ref]
- Acoustic Entropy Index (H), Temporal Entropy (Ht), and Frequency Entropy (Hf) [ref]
- Acoustic Richness (AR) and Median of Amplitude Envelope (M) [ref]
- Number of Peaks (NP) [ref]
- Roughness [ref]
- Rugosity [ref]
- Zero-crossing rate (ZCR) [ref]
- Mel-frequency Cepstrum Coefficients (MFCC) [ref]
- Background noise index (BGN) [ref]
- Power Spectral Density (PSD) [ref]
- Signal-to-noise Ratio (SNR) [ref]
- Mean Square Pressure (MSP) and Sound Pressure Level (SPL) [ref]
- Root Mean Square (RMS) [ref]
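To give a feel for two of the simpler features in the list, here is a minimal Python sketch of RMS and zero-crossing rate. It uses common textbook definitions, which are an assumption and may differ in detail (normalization, windowing) from what the R packages compute. The function names are illustrative and are not part of this repository.

```python
import math

def rms(samples):
    """Root mean square of a sequence of amplitude samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def zero_crossing_rate(samples):
    """Fraction of consecutive sample pairs whose signs differ.

    Assumption: zero counts as non-negative; other conventions exist.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)

# A 5 Hz sine wave sampled at 1000 Hz for one second (synthetic test signal).
sample_rate = 1000
signal = [math.sin(2 * math.pi * 5 * n / sample_rate) for n in range(sample_rate)]
print(rms(signal))
print(zero_crossing_rate(signal))
```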
The code also generates spectrogram images of the recordings.
Besides R and the packages above, you need to install doParallel, which is used for parallel processing to reduce run time. To use the Python wrapper, you must also install the rpy2 package.
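Before using the Python wrapper, it can help to check that the Python-side requirements are visible. The sketch below is not part of this repository. It assumes that `Rscript` (which ships with R) being on the `PATH` is a reasonable sign that R is installed. It does not check the R packages themselves, such as doParallel.

```python
import importlib.util
import shutil

def missing_requirements(modules=("rpy2",), executables=("Rscript",)):
    """Return the names of Python modules and executables that cannot be found."""
    # find_spec returns None when a top-level module is not importable.
    missing = [m for m in modules if importlib.util.find_spec(m) is None]
    # shutil.which returns None when the executable is not on the PATH.
    missing += [e for e in executables if shutil.which(e) is None]
    return missing

# An empty list means both rpy2 and Rscript were found.
print(missing_requirements())
```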
These functions are available in R to extract features and generate spectrograms of recordings.
#Takes a directory and processes all recordings inside it.
process.dir <- function(
  source.path, #directory with the recordings to be processed
  target.path, #directory where results will be saved (default = source.path)
  aquatic, #whether the audio files come from an underwater landscape (default = FALSE). Depending on the landscape, some routines use different parameters.
  generate, #output type: indices ("index") or spectrograms ("spec") (default = "index")
  slice.size, #length, in seconds, of the slices each recording is divided into (default = 1). A slice must not be longer than 90 s.
  batch.size, #save index results in batches of batch.size (default = 100)
  img.dim, #if generate = "spec", the image size (default = c(1366, 758))
  palette #if generate = "spec", the image color palette (default = spectro.colors)
)
#Processes a specific recording.
process.file <- function(
  path, #recording to be processed
  spec.path, #if generate = "spec", where the image will be saved (default = NULL, inside path)
  aquatic, #whether the audio file comes from an underwater landscape (default = FALSE). Depending on the landscape, some routines use different parameters.
  generate, #output type: indices ("index") or spectrograms ("spec") (default = "index")
  slice.size, #length, in seconds, of the slices the recording is divided into (default = 1). A slice must not be longer than 90 s.
  img.dim, #if generate = "spec", the image size (default = c(1366, 758))
  palette, #if generate = "spec", the image color palette (default = spectro.colors)
  start.parallel #if TRUE, process the file in parallel mode (default = FALSE)
)
The Python code provides the class AcousticIndices, whose methods have the same meaning as the R functions above:
process_dir(
  source_path,
  target_path = None,
  aquatic = False,
  slice_size = 1)
process_file(
  path,
  aquatic = False,
  slice_size = 1)
The functions generate a feature set with 35 values plus one identifier for each slice of length slice.size. Each index is represented by a single value; the functions also return the NDSI components (anthrophony and biophony), the mean and standard deviation of the PSD, and 12 MFCCs. For stereo recordings, the functions return the mean of the two channels.
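Since one feature row is produced per slice, the slice size determines how many rows a recording yields. The following sketch shows one way the slice boundaries could be computed. It is an illustration only: `slice_bounds` is a hypothetical helper, and keeping a shorter final slice is an assumption, since this README does not say how a trailing partial slice is handled.

```python
MAX_SLICE_SECONDS = 90  # limit stated in the slice.size description

def slice_bounds(duration, slice_size=1):
    """Return (start, end) times in seconds for each slice of a recording."""
    if slice_size <= 0 or slice_size > MAX_SLICE_SECONDS:
        raise ValueError("slice_size must be greater than 0 and at most 90 seconds")
    bounds = []
    start = 0
    while start < duration:
        # Assumption: the last slice is kept even if shorter than slice_size.
        bounds.append((start, min(start + slice_size, duration)))
        start += slice_size
    return bounds

# A 5-second recording cut into 2-second slices.
for start, end in slice_bounds(5, slice_size=2):
    print(start, end)
```

Under these assumptions, a 60-second recording with the default slice size of 1 would yield 60 feature rows.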
The functions return a matrix of features and write a CSV file inside target.path. Spectrograms are saved in a subdirectory of target.path.
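The CSV output can be loaded with Python's standard library. The sketch below is hedged: the file name, the presence of a header row, and the column names are all assumptions made for the demonstration, because this README does not specify the CSV layout. The demo writes a placeholder file first so the example is self-contained.

```python
import csv
import os
import tempfile

def load_features(csv_path):
    """Read a feature CSV into a list of dicts, one per slice.

    Assumption: the first row of the file is a header with column names.
    """
    with open(csv_path, newline="") as handle:
        return list(csv.DictReader(handle))

# Demo with a made-up file: name, columns, and values are placeholders,
# not the real output of process.dir.
with tempfile.TemporaryDirectory() as target_path:
    demo_csv = os.path.join(target_path, "features_demo.csv")
    with open(demo_csv, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["id", "ACI", "ADI"])
        writer.writerow(["file_0", "1.0", "2.0"])
        writer.writerow(["file_1", "3.0", "4.0"])
    rows = load_features(demo_csv)

print(len(rows), rows[0]["id"])
```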
The process.dir function writes log files to track its progress. If the index calculation stops for any reason, call the function again and it will use the log files to resume where it stopped.
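The general idea of resuming from a log can be sketched as follows. This is a generic pattern, not the actual implementation in indices.R: the log format (one finished file name per line) and the helper names are assumptions made for illustration.

```python
import os
import tempfile

def pending_files(all_files, log_path):
    """Return the files not yet listed in the log, preserving order."""
    done = set()
    if os.path.exists(log_path):
        with open(log_path) as log:
            done = {line.strip() for line in log if line.strip()}
    return [f for f in all_files if f not in done]

def process_with_resume(all_files, log_path, process):
    """Process pending files, appending each name to the log once it is done."""
    for name in pending_files(all_files, log_path):
        process(name)
        with open(log_path, "a") as log:
            log.write(name + "\n")

with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "progress.log")  # hypothetical log name
    recordings = ["a.wav", "b.wav", "c.wav"]
    processed = []
    # First run handles only the first file, as if the run had been interrupted.
    process_with_resume(recordings[:1], log_path, processed.append)
    # Second run skips a.wav and continues with the remaining files.
    process_with_resume(recordings, log_path, processed.append)

print(processed)
```

Writing the log entry only after a file finishes means an interrupted file is simply processed again on the next run.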
source("indices.R")
#Processes all recordings inside a directory and returns a matrix with the results
files <- process.dir("/home/user/recordings")
#Processes a specific recording and returns a matrix with the results
one_file <- process.file("/home/user/recordings/file.wav")
#Divides recordings into 2-second clips before calculating indices
process.dir("/home/user/recordings", slice.size = 2)
#Divides recordings into 2-second clips before generating spectrograms
process.dir("/home/user/recordings", slice.size = 2, generate = "spec")
from indices import AcousticIndices
extractor = AcousticIndices()
files = extractor.process_dir("/home/user/recordings")
one_file = extractor.process_file("/home/user/recordings/file.wav")
- Fábio Felix Dias - e-mail: f_diasfabio@usp.br